Levenshtein distance: Difference between revisions

Content added Content deleted
(→‎{{header|Picat}}: Split into subsections. Added {{out}})
Line 4,102: Line 4,102:


=={{header|Picat}}==
=={{header|Picat}}==
==Iterative==
Here are three versions:
* iterative (algorithm from Wikipedia): levenshtein/2
Based on the iterative algorithm at Wikipedia. Picat is 1-based so some adjustments are needed.
<lang Picat>levenshtein(S,T) = Dist =>
* recursive with tabling: levenshtein_rec/2
* mode directive tabling (from the Prolog version): levenshtein_mode/2

<lang Picat>go =>
S = [
["kitten","sitting"],
["rosettacode","raisethysword"],
["levenshtein","levenstein"],
["saturday", "sunday"],
["stop", "tops"],
["saturday", "sunday"]
],
foreach([W1,W2] in S)
% clear the table cache
initialize_table,
println(iter=[W1,W2,levenshtein(W1,W2)]),
println(recu=[W1,W2,levenshtein_rec(W1,W2)]),
println(mode=[W1,W2,levenshtein_mode(W1,W2)]),
nl
end,
nl.

%
% Based on the iterative algorithm at Wikipedia.
% (Picat is 1-based so some adjustments are needed.)
%
levenshtein(S,T) = Dist =>
M = 1+S.length,
M = 1+S.length,
N = 1+T.length,
N = 1+T.length,
Line 4,156: Line 4,129:
end,
end,


Dist = D[M,N].
Dist = D[M,N].</lang>


===Tabled recursive version===

<lang Picat>table
%
% Tabled recursive version.
%
table
levenshtein_rec(S,T) = Dist =>
levenshtein_rec(S,T) = Dist =>
Dist1 = 0,
Dist1 = 0,
Line 4,179: Line 4,149:
Dist1 := A + 1
Dist1 := A + 1
end,
end,
Dist = Dist1.
Dist = Dist1.</lang>


===Mode-directed tabling===
%
{{trans|Prolog}}
% Mode directed tabling (from the Prolog version)
<lang Picat>
%
levenshtein_mode(S,T) = Dist =>
levenshtein_mode(S,T) = Dist =>
lev(S, T, Dist).
lev(S, T, Dist).
Line 4,194: Line 4,164:
lev(L, [_|R], D) :- lev(L, R, H), D is H+1.</lang>
lev(L, [_|R], D) :- lev(L, R, H), D is H+1.</lang>



Output:
===Test===
<lang Picat>go =>
S = [
["kitten","sitting"],
["rosettacode","raisethysword"],
["levenshtein","levenstein"],
["saturday", "sunday"],
["stop", "tops"],
["saturday", "sunday"]
],
foreach([W1,W2] in S)
% clear the table cache
initialize_table,
println(iter=[W1,W2,levenshtein(W1,W2)]),
println(recu=[W1,W2,levenshtein_rec(W1,W2)]),
println(mode=[W1,W2,levenshtein_mode(W1,W2)]),
nl
end,
nl.</lang>

{{out}}
<pre>goal = go
<pre>goal = go
iter = [kitten,sitting,3]
iter = [kitten,sitting,3]
Line 4,220: Line 4,212:
mode = [saturday,sunday,3]</pre>
mode = [saturday,sunday,3]</pre>



===Benchmark on larger strings===
Benchmarking the methods with larger strings of random lengths (between 1 and 2000).
Benchmarking the methods with larger strings of random lengths (between 1 and 2000).
<lang Picat>go2 =>
<lang Picat>go2 =>
Line 4,237: Line 4,231:
println(distances=[iter=L1,rec=L2,mode=L3]),
println(distances=[iter=L1,rec=L2,mode=L3]),
nl.
nl.



%
%
Line 4,247: Line 4,240:
S := [Alpha[random(1,Len)] : _ in 1..random(1,MaxLen)].</lang>
S := [Alpha[random(1,Len)] : _ in 1..random(1,MaxLen)].</lang>


Here is sample run. The mode directed version is clearly the fastest.
Here is sample run. The version using mode-directed tabling is clearly the fastest.
{{out}}
<pre>len_s1 = 1968
<pre>len_s1 = 1968
len_s2 = 1529
len_s2 = 1529
Line 4,260: Line 4,254:
CPU time 1.402 seconds.
CPU time 1.402 seconds.


distances=[iter = 1607,rec = 1607,mode = 1607]
distances=[iter = 1607,rec = 1607,mode = 1607]</pre>
</pre>


=={{header|PicoLisp}}==
=={{header|PicoLisp}}==