Levenshtein distance/Alignment: Difference between revisions

From Rosetta Code
Content added Content deleted
m (precision on using a library)
Line 8: Line 8:
PALACE</pre>
PALACE</pre>


For this task, write a function that shows the alignment of two strings for the corresponding levenshtein distance. As an example, use the words "rosettacode" and "raisethysword"
For this task, write a function that shows the alignment of two strings for the corresponding levenshtein distance. As an example, use the words "rosettacode" and "raisethysword".

You can either implement an algorithm, or use a dedicated library (thus showing us how it is named in your language).


=={{header|Perl}}==
=={{header|Perl}}==

Revision as of 11:24, 2 May 2013

Levenshtein distance/Alignment is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

The Levenshtein distance algorithm returns the number of atomic operations (insertion, deletion or edition) that must be performed on a string in order to obtain an other one, but it does not say anything about the actual operations used or their order.

An alignment is a notation used to describe the operations used to turn a string into an other. At some point in the strings, the minus character ('-') is placed in order to signify that a character must be added at this very place. For instance, an alignment between the words 'place' and 'palace' is:

P-LACE
PALACE

For this task, write a function that shows the alignment of two strings for the corresponding levenshtein distance. As an example, use the words "rosettacode" and "raisethysword".

You can either implement an algorithm, or use a dedicated library (thus showing us how it is named in your language).

Perl

<lang perl>use List::Util qw(min);

sub levenshtein_distance_alignment {

   my @s = ('^', split //, shift);
   my @t = ('^', split //, shift);
   my @d;
   $d[$_][0] = $_ for 0 .. @s-1;
   $d[0][$_] = $_ for 0 .. @t-1;
   my (@AS, @AT);
   $AS[$_][0] = join , @s[1 .. $_] for 0 .. @s-1;
   $AS[0][$_] = '-' x $_ for 0 .. @t-1;
   $AT[0][$_] = join , @t[1 .. $_] for 0 .. @t-1;
   $AT[$_][0] = '-' x $_ for 0 .. @s-1;
   for my $i (1 .. @s-1) {
       for my $j (1 .. @t-1) {
           if ($s[$i] eq $t[$j]) {
               $AS[$i][$j] = $AS[$i-1][$j-1] . $s[$i];
               $AT[$i][$j] = $AT[$i-1][$j-1] . $t[$j];
               $d[$i][$j] = $d[$i-1][$j-1];
               next;
           }
           $d[$i][$j] = 1 + (
               my $min = min $d[$i-1][$j], $d[$i][$j-1], $d[$i-1][$j-1]
           );
           if ($d[$i-1][$j] == $min) {
               $AS[$i][$j] = $AS[$i-1][$j] . $s[$i];
               $AT[$i][$j] = $AT[$i-1][$j] . '-';
           }
           elsif ($d[$i][$j-1] == $min) {
               $AS[$i][$j] = $AS[$i][$j-1] . '-';
               $AT[$i][$j] = $AT[$i][$j-1] . $t[$j];
           }
           else {
               $AS[$i][$j] = $AS[$i-1][$j-1] . $s[$i];
               $AT[$i][$j] = $AT[$i-1][$j-1] . $t[$j];
           }
       }
   }
   return $AS[-1][-1], $AT[-1][-1];

}

print join "\n", levenshtein_distance_alignment "rosettacode", "raisethysword"; </lang>

Output:
ro-settac-o-de
raisethysword-