Longest common subsequence: Difference between revisions

← Older edit

Longest common subsequence (view source)

Revision as of 15:17, 28 April 2024

7,734 bytes added , 19 days ago

m

→‎{{header|EasyLang}}

Chkas

1,978

edits

Revision as of 00:50, 4 February 2022 (view source) CNHume (talk \| contribs) m (Improved discussion of M interpreted as an m*n boolean matrix. Simplified O(n) terms.) ← Older edit		Latest revision as of 15:17, 28 April 2024 (view source) Chkas (talk \| contribs) m (→‎{{header\|EasyLang}})
(27 intermediate revisions by 9 users not shown)
Line 2: '''Introduction''' Define a ''subsequence'' to be any output string obtained by deleting zero or more symbols from an input string. The [http://en.wikipedia.org/wiki/Longest_common_subsequence_problem '''Longest Common Subsequence'''] ('''LCS''') is a subsequence of maximum length common to two or more strings. Let ''A'' =&equiv; ''A''[0]… ''A''[m - 1] and ''B'' =&equiv; ''B''[0]… ''B''[n - 1], m &lelt; n be strings drawn from an alphabet Σ of size s, containing every distinct symbol in A + B. An ordered pair (i, j) will be ~~called~~referred to as a match if ''A''[i] == ''B''[j], where 0 ≤ i << m and 0 ≤ j << n. The set of matches '''M''' defines a relation over matches: '''M'''[i, j] ⇔ (i, j) ∈ '''M'''. Define the strict Cartesian [https://en.wikipedia.org/wiki/Product_order product-order] (<) over matches, such that (i1, j1) < (i2, j2) ↔ i1 < i2 and j1 < j2. Defining (>) similarly, we can write m2 < m1 as m1 > m2. Define a ''non-strict'' [https://en.wikipedia.org/wiki/Product_order product-order] (≤) over ordered pairs, such that (i1, j1) ≤ (i2, j2) ⇔ i1 ≤ i2 and j1 ≤ j2. We define (≥) similarly. ~~We write m1 <> m2 to mean that either m1 < m2 or m1 > m2 holds, ''i.e.'', that m1, m2 are ''comparable''.~~ We say ordered pairs p1 and p2 are ''comparable'' if either p1 ≤ p2 or p1 ≥ p2 holds. If i1 &lelt; i2 and j2 &lelt; j1 (or i2 &lelt; i1 and j1 &lelt; j2) then neither m1p1 <≤ m2p2 nor m1p1 >≥ m2p2 are possible;, and ~~m1,~~we say p1 and m2p2 are ''incomparable''. Define the ''strict'' product-order (<) over ordered pairs, such that (i1, j1) < (i2, j2) ⇔ i1 < i2 and j1 < j2. We define (>) similarly. Defining (#) to denote this case, we write m1 # m2. Because the underlying product-order is strict, m1 == m2 (''i.e.'', i1 == i2 and j1 == j2) implies m1 # m2. m1 <> m2 implies m1 ≠ m2, ''i.e.'', that the two tuples differ in some component. Thus, the (<>) operator is the inverse of (#). ~~Given a product-order over the set of matches '''M''', a~~A chain '''C''' is ~~any~~a subset of '''M''' consisting of at least one element m; and where either m1 <>< m2 or m1 > m2 for every pair of distinct elements m1 and m2 ~~of '''C'''~~. ~~Similarly, an~~An antichain '''D''' is any subset of '''M''' ~~where~~in ~~m1 # m2 for~~which every pair of distinct elements m1 and m2 ofare ~~'''D'''~~incomparable. ~~Finding~~A ~~an LCS~~chain can ~~then~~ be ~~restated~~visualized as ~~the~~a ~~problem~~strictly ofincreasing ~~finding~~curve athat ~~chain~~passes ofthrough ~~maximum~~matches ~~cardinality~~(i, pj) ~~over~~in the ~~set~~mn coordinate space of ~~matches~~ '''M'''[i, j]. Every Common Sequence of length ''q'' corresponds to a chain of cardinality ''q'', over the set of matches '''M'''. Thus, finding an LCS can be restated as the problem of finding a chain of maximum cardinality ''p''. According to [Dilworth 1950], this cardinality p equals the minimum number of disjoint antichains into which '''M''' can be decomposed. Note that such a decomposition into the minimal number p of disjoint antichains may not be unique. According to [Dilworth 1950], this cardinality ''p'' equals the minimum number of disjoint antichains into which '''M''' can be decomposed. Note that such a decomposition into the minimal number p of disjoint antichains may not be unique. The set of matches '''M''' can be interpreted as an mn boolean matrix such that '''M'''[i, j] ↔ (i, j) ∈ '''M'''. Then a chain '''C''' can be visualized as a strictly increasing curve through those coordinate pairs corresponding to matches. '''Background''' Where the number of symbols appearing in matches is small relative to the length of the input strings, reuse of the symbols increases; and the number of matches will tend towards ~~quadratic,~~ O(''mn'') quadratic growth. This occurs, for example, in the Bioinformatics application of nucleotide and protein sequencing. The "divide -and -conquer" approach of [Hirschberg 1975] limits the space required to O(''n''). However, this approach requires O(''mn'') time even in the best case. This quadratic time dependency may become prohibitive, given very long input strings. Thus, heuristics are often favored over optimal Dynamic Programming solutions. Line 46: A, B are input strings of lengths m, n respectively p is the length of the LCS M is the set of ~~match pairs~~matches (i, j) such that xA[i] == yB[j] r is the magnitude of M s is the magnitude of the alphabet Σ of distinct symbols in xA + yB '''References''' Line 97: {{trans\|Python}} <~~lang~~syntaxhighlight lang="11l">F lcs(a, b) V lengths = [[0] * (b.len+1)] * (a.len+1) L(x) a Line 116: print(lcs(‘1234’, ‘1224533324’)) print(lcs(‘thisisatest’, ‘testing123testing’))</~~lang~~syntaxhighlight> {{out}} Line 126: =={{header\|Ada}}== Using recursion: <~~lang~~syntaxhighlight lang="ada">with Ada.Text_IO; use Ada.Text_IO; procedure Test_LCS is Line 150: begin Put_Line (LCS ("thisisatest", "testing123testing")); end Test_LCS;</~~lang~~syntaxhighlight> {{out}} <pre> Line 156: </pre> Non-recursive solution: <~~lang~~syntaxhighlight lang="ada">with Ada.Text_IO; use Ada.Text_IO; procedure Test_LCS is Line 200: begin Put_Line (LCS ("thisisatest", "testing123testing")); end Test_LCS;</~~lang~~syntaxhighlight> {{out}} <pre> Line 211: {{works with\|ALGOL 68G\|Any - tested with release mk15-0.8b.fc9.i386}} {{works with\|ELLA ALGOL 68\|Any (with appropriate job cards) - tested with release 1.8.8d.fc9.i386}} <~~lang~~syntaxhighlight lang="algol68">main:( PROC lcs = (STRING a, b)STRING: BEGIN Line 225: END # lcs #; print((lcs ("thisisatest", "testing123testing"), new line)) )</~~lang~~syntaxhighlight> {{out}} <pre> Line 233: =={{header\|APL}}== {{works with\|Dyalog APL}} <~~lang~~syntaxhighlight ~~APL~~lang="apl">lcs←{ ⎕IO←0 betterof←{⊃(</+/¨⍺ ⍵)⌽⍺ ⍵} ⍝ better of 2 selections Line 253: this ∇ 1↓[1]⍵ ⍝ keep looking }sstt }</~~lang~~syntaxhighlight> =={{header\|Arturo}}== {{trans\|Python}} <syntaxhighlight lang="rebol">lcs: function [a,b][ ls: new array.of: @[inc size a, inc size b] 0 loop.with:'i a 'x [ loop.with:'j b 'y [ ls\[i+1]\[j+1]: (x=y)? -> ls\[i]\[j] + 1 -> max @[ls\[i+1]\[j], ls\[i]\[j+1]] ] ] [result, x, y]: @[new "", size a, size b] while [and? [x > 0][y > 0]][ if? ls\[x]\[y] = ls\[x-1]\[y] -> x: x-1 else [ if? ls\[x]\[y] = ls\[x]\[y-1] -> y: y-1 else [ result: a\[x-1] ++ result x: x-1 y: y-1 ] ] ] return result ] print lcs "1234" "1224533324" print lcs "thisisatest" "testing123testing"</syntaxhighlight> {{out}} <pre>1234 tsitest</pre> =={{header\|AutoHotkey}}== {{trans\|Java}} using dynamic programming<br> ahk forum: [http://www.autohotkey.com/forum/viewtopic.php?t=44657&start=135 discussion] <~~lang~~syntaxhighlight ~~AutoHotkey~~lang="autohotkey">lcs(a,b) { ; Longest Common Subsequence of strings, using Dynamic Programming Loop % StrLen(a)+2 { ; Initialize i := A_Index-1 Line 285 ⟶ 320: } Return t }</~~lang~~syntaxhighlight> =={{header\|BASIC}}== ==={{header\|QuickBASIC}}=== {{works with\|QuickBasic\|4.5}} {{trans\|Java}} <~~lang~~syntaxhighlight lang="qbasic">FUNCTION lcs$ (a$, b$) IF LEN(a$) = 0 OR LEN(b$) = 0 THEN lcs$ = "" Line 304 ⟶ 340: END IF END IF END FUNCTION</~~lang~~syntaxhighlight> =={{header\|BASIC256}}== {{trans\|FreeBASIC}} <~~lang~~syntaxhighlight ~~BASIC256~~lang="basic256">function LCS(a, b) if length(a) = 0 or length(b) = 0 then return "" if right(a, 1) = right(b, 1) then Line 322 ⟶ 357: print LCS("1234", "1224533324") print LCS("thisisatest", "testing123testing") end</~~lang~~syntaxhighlight> =={{header\|BBC BASIC}}== This makes heavy use of BBC BASIC's shortcut '''LEFT$(a$)''' and '''RIGHT$(a$)''' functions. <~~lang~~syntaxhighlight lang="bbcbasic"> PRINT FNlcs("1234", "1224533324") PRINT FNlcs("thisisatest", "testing123testing") END Line 338 ⟶ 373: y$ = FNlcs(LEFT$(a$), b$) IF LEN(y$) > LEN(x$) SWAP x$,y$ = x$</~~lang~~syntaxhighlight> '''Output:''' <pre> Line 347 ⟶ 382: =={{header\|BQN}}== It's easier and faster to get only the length of the longest common subsequence, using <code>LcsLen ← ¯1 ⊑ 0¨∘⊢ {𝕩⌈⌈`𝕨+»𝕩}˝ =⌜⟜⌽</code>. This function can be expanded by changing <code>⌈</code> to <code>⊣⍟(>○≠)</code> (choosing from two arguments one that has the greatest length), and replacing the empty length 0 with the empty string <code>""</code> in the right places. <~~lang~~syntaxhighlight lang="bqn">LCS ← ¯1 ⊑ "" <⊸∾ ""¨∘⊢ ⊣⍟(>○≠){𝕩𝔽¨𝔽`𝕨∾¨""<⊸»𝕩}˝ (=⌜⥊¨⊣)⟜⌽</~~lang~~syntaxhighlight> Output: <~~lang~~syntaxhighlight lang="bqn"> "1234" LCS "1224533324" "1234" "thisisatest" LCS "testing123testing" "tsitest"</~~lang~~syntaxhighlight> =={{header\|Bracmat}}== <~~lang~~syntaxhighlight lang="bracmat"> ( LCS = A a ta B b tb prefix . !arg:(?prefix.@(?A:%?a ?ta).@(?B:%?b ?tb)) Line 367 ⟶ 402: & :?lcs & LCS$(.thisisatest.testing123testing) & out$(max !max lcs !lcs);</~~lang~~syntaxhighlight> {{out}} <pre>max 7 lcs t s i t e s t</pre> =={{header\|C}}== <~~lang~~syntaxhighlight lang="c">#include <stdio.h> #include <stdlib.h> Line 408 ⟶ 443: return t; } </syntaxhighlight> ~~</lang>~~ Testing <~~lang~~syntaxhighlight lang="c">int main () { char a[] = "thisisatest"; char b[] = "testing123testing"; Line 419 ⟶ 454: printf("%.s\n", t, s); // tsitest return 0; }</~~lang~~syntaxhighlight> =={{header\|C sharp\|C#}}== ===With recursion=== <~~lang~~syntaxhighlight lang="csharp">using System; namespace LCS Line 455 ⟶ 490: } } }</~~lang~~syntaxhighlight> =={{header\|C++}}== '''Hunt and Szymanski algorithm''' <syntaxhighlight lang="cpp"> ~~<lang cpp>#include <stdint.h>~~ #include <stdint.h> #include <string> #include <memory> // for shared_ptr<> #include <iostream> #include <deque> #include <~~map~~unordered_map> //[C++11] #include <algorithm> // for lower_bound() #include <iterator> // for next() and prev() Line 472 ⟶ 508: class LCS { protected: // ~~This~~Instances of the Pair linked list class isare used to ~~trace~~recover the LCS ~~candidates~~: class Pair { public: Line 492 ⟶ 528: typedef deque<shared_ptr<Pair>> PAIRS; ~~typedef deque<uint32_t> THRESHOLD;~~ typedef deque<uint32_t> INDEXES; typedef ~~map~~unordered_map<char, INDEXES> ~~CHAR2INDEXES~~CHAR_TO_INDEXES_MAP; typedef deque<INDEXES> MATCHES; static uint32_t FindLCS( ~~// The following implements the Hunt and Szymanski algorithm:~~ ~~uint32_t~~ ~~Pairs(~~ MATCHES& ~~matches~~indexesOf2MatchedByIndex1, shared_ptr<Pair>* pairs) { auto ~~trace~~traceLCS = pairs != nullptr; PAIRS ~~traces~~chains; ~~THRESHOLD~~INDEXES ~~threshold~~prefixEnd; // //[Assert]After each index1 iteration ~~threshold~~prefixEnd[index3] is the least index2 // such that the LCS of s1[0:index1] and s2[0:index2] has length index3 + 1 // uint32_t index1 = 0; for (const auto& it1 : ~~matches~~indexesOf2MatchedByIndex1) { ifauto ~~(!it1->empty())~~dq2 {= it1; auto ~~dq2~~limit = it1prefixEnd.end(); for (auto ~~limit~~it2 = ~~threshold~~dq2.~~end~~rbegin(); it2 != dq2.rend(); it2++) { // Each index1, index2 pair corresponds to a match ~~for (auto it2 = dq2.rbegin(); it2 != dq2.rend(); it2++) {~~ ~~// Each of the index1,~~auto index2 ~~pairs considered here correspond to a~~= ~~match~~it2; ~~auto index2 = it2;~~ // // Note: The reverse iterator it2 visits index2 values in descending order, // allowing ~~thresholds~~in-place toupdate beof ~~updated in-place~~prefixEnd[]. std::lower_bound() is used to // to perform a binary search. // limit = lower_bound(~~threshold~~prefixEnd.begin(), limit, index2); ~~auto index3 = distance(threshold.begin(), limit);~~ // // Look ahead to the next index2 value to optimize ~~space~~Pairs used inby the Hunt // and Szymanski algorithm. If the next index2 is also an improvement on // the value currently held in ~~threshold~~prefixEnd[index3], a new Pair will only be // superseded on the next index2 iteration. // // ~~Depending~~Verify that a onnext ~~match~~index2 ~~redundancy,~~value ~~the~~exists; ~~number~~and ofthat ~~Pair~~this ~~constructions~~value ~~may~~is begreater // ~~divided~~than bythe ~~factors~~final ~~ranging~~index2 ~~from~~value 2of upthe toLCS 10prefix orat ~~more.~~prev(limit): // auto ~~skip~~preferNextIndex2 = next(it2) != dq2.rend() && (limit == ~~threshold~~prefixEnd.begin() \|\| prev(limit) < next(it2)); ~~if (skip) continue;~~ // ~~if (limit == threshold.end()) {~~ // Depending on match redundancy, this optimization may reduce the number ~~// insert case~~ // of Pair allocations by factors ranging from 2 up to 10 or more. ~~threshold.push_back(index2);~~ // ~~Refresh limit iterator:~~ if (preferNextIndex2) continue; ~~limit = prev(threshold.end());~~ ~~if (trace) {~~ auto ~~prefix~~index3 = ~~index3 > 0 ? traces[index3 - 1] :~~distance(prefixEnd.begin(), ~~nullptr~~limit); ~~auto last = make_shared<Pair>(index1, index2, prefix);~~ if (limit == ~~traces~~prefixEnd.~~push_back~~end(~~last~~);) { // Insert }Case prefixEnd.push_back(index2); // Refresh limit iterator: limit = prev(prefixEnd.end()); if (traceLCS) { chains.push_back(pushPair(chains, index3, index1, index2)); } ~~else if (index2 < limit) {~~} else if (index2 < //limit) ~~replacement case~~{ // Update limit = index2;Case // Update iflimit ~~(trace) {~~value: limit = index2; ~~auto prefix = index3 > 0 ? traces[index3 - 1] : nullptr;~~ if (traceLCS) { ~~auto last = make_shared<Pair>(index1, index2, prefix);~~ ~~traces~~chains[index3] = ~~last~~pushPair(chains, index3, index1, index2); } } } ~~} // next index2~~ } // next index2 } index1++; } // next index1 if (~~trace~~traceLCS) { // Return the LCS as a linked list of matched index pairs: auto last = ~~traces~~chains.~~size~~empty() >? 0nullptr ?: ~~traces~~chains.back() ~~: nullptr~~; // Reverse longest ~~back-trace~~chain pairs = Pair::Reverse(last); } auto length = ~~threshold~~prefixEnd.size(); return length; } private: static shared_ptr<Pair> pushPair( PAIRS& chains, const ptrdiff_t& index3, uint32_t& index1, uint32_t& index2) { auto prefix = index3 > 0 ? chains[index3 - 1] : nullptr; return make_shared<Pair>(index1, index2, prefix); } protected: // // Match() avoids ~~incurring~~ mn comparisons by using ~~the~~CHAR_TO_INDEXES_MAP ~~associative~~to // ~~memory implemented by CHAR2INDEXES to~~ achieve O(m+n) performance, where m and n are the input lengths. ~~// where m and n are the input lengths.~~ // // The lookup time can be assumed constant in the case of characters. Line 583 ⟶ 626: // time will be O(log(m+n)), at most. // static void Match(~~CHAR2INDEXES& indexes, MATCHES& matches,~~ CHAR_TO_INDEXES_MAP& indexesOf2MatchedByChar, MATCHES& indexesOf2MatchedByIndex1, const string& s1, const string& s2) { uint32_t index = 0; for (const auto& it : s2) ~~indexes~~indexesOf2MatchedByChar[it].push_back(index++); for (const auto& it : s1) { auto& dq2 = ~~indexes~~indexesOf2MatchedByChar[it]; ~~matches~~indexesOf2MatchedByIndex1.push_back(&dq2); } } static string Select(shared_ptr<Pair> pairs, uint32_t length, bool right, const string& s1, const string& s2) { string buffer; Line 607 ⟶ 651: public: static string Correspondence(const string& s1, const string& s2) { CHAR_TO_INDEXES_MAP indexesOf2MatchedByChar; ~~CHAR2INDEXES indexes;~~ MATCHES ~~matches~~indexesOf2MatchedByIndex1; // holds references into ~~indexes~~indexesOf2MatchedByChar Match(~~indexes~~indexesOf2MatchedByChar, ~~matches~~indexesOf2MatchedByIndex1, s1, s2); shared_ptr<Pair> pairs; // obtain the LCS as index pairs auto length = ~~Pairs~~FindLCS(~~matches~~indexesOf2MatchedByIndex1, &pairs); return Select(pairs, length, false, s1, s2); } };</~~lang~~syntaxhighlight> '''Example''': <syntaxhighlight lang ="cpp"> ~~LCS lcs;~~ auto s = ~~lcs.~~LCS::Correspondence(s1, s2); cout << s << endl;</~~lang~~syntaxhighlight> More fully featured examples are available at [https://github.com/CNHume/Samples/tree/master/C%2B%2B/LCS Samples/C++/LCS]. =={{header\|Clojure}}== Based on algorithm from Wikipedia. <~~lang~~syntaxhighlight ~~Clojure~~lang="clojure">(defn longest [xs ys] (if (> (count xs) (count ys)) xs ys)) (def lcs Line 632 ⟶ 678: (= x y) (cons x (lcs xs ys)) :else (longest (lcs (cons x xs) ys) (lcs xs (cons y ys)))))))</~~lang~~syntaxhighlight> =={{header\|CoffeeScript}}== <~~lang~~syntaxhighlight lang="coffeescript"> lcs = (s1, s2) -> len1 = s1.length Line 666 ⟶ 712: s1 = "thisisatest" s2 = "testing123testing" console.log lcs(s1, s2)</~~lang~~syntaxhighlight> =={{header\|Common Lisp}}== Here's a memoizing/dynamic-programming solution that uses an <var>n × m</var> array where <var>n</var> and <var>m</var> are the lengths of the input arrays. The first return value is a sequence (of the same type as array1) which is the longest common subsequence. The second return value is the length of the longest common subsequence. <~~lang~~syntaxhighlight lang="lisp">(defun longest-common-subsequence (array1 array2) (let* ((l1 (length array1)) (l2 (length array2)) Line 697 ⟶ 743: b))))))))) (destructuring-bind (seq len) (lcs 0 0) (values (coerce seq (type-of array1)) len)))))</~~lang~~syntaxhighlight> For example, <~~lang~~syntaxhighlight lang="lisp">(longest-common-subsequence "123456" "1a2b3c")</~~lang~~syntaxhighlight> produces the two values <~~lang~~syntaxhighlight lang="lisp">"123" 3</~~lang~~syntaxhighlight> ===An alternative adopted from Clojure=== Line 712 ⟶ 758: Here is another version with its own memoization macro: <~~lang~~syntaxhighlight lang="lisp">(defmacro mem-defun (name args body) (let ((hash-name (gensym))) `(let ((,hash-name (make-hash-table :test 'equal))) Line 725 ⟶ 771: ((equal (car xs) (car ys)) (cons (car xs) (lcs (cdr xs) (cdr ys)))) (t (longer (lcs (cdr xs) ys) (lcs xs (cdr ys)))))))</~~lang~~syntaxhighlight> When we test it, we get: <~~lang~~syntaxhighlight lang="lisp">(coerce (lcs (coerce "thisisatest" 'list) (coerce "testing123testing" 'list)) 'string)))) "tsitest"</~~lang~~syntaxhighlight> =={{header\|D}}== Line 737 ⟶ 783: ===Recursive version=== <~~lang~~syntaxhighlight lang="d">import std.stdio, std.array; T[] lcs(T)(in T[] a, in T[] b) pure nothrow @safe { Line 749 ⟶ 795: void main() { lcs("thisisatest", "testing123testing").writeln; }</~~lang~~syntaxhighlight> {{out}} <pre>tsitest</pre> Line 755 ⟶ 801: ===Faster dynamic programming version=== The output is the same. <~~lang~~syntaxhighlight lang="d">import std.stdio, std.algorithm, std.traits; T[] lcs(T)(in T[] a, in T[] b) pure /nothrow/ { Line 784 ⟶ 830: void main() { lcs("thisisatest", "testing123testing").writeln; }</~~lang~~syntaxhighlight> ===Hirschberg algorithm version=== Line 793 ⟶ 839: Adapted from Python code: http://wordaligned.org/articles/longest-common-subsequence <~~lang~~syntaxhighlight lang="d">import std.stdio, std.algorithm, std.range, std.array, std.string, std.typecons; uint[] lensLCS(R)(R xs, R ys) pure nothrow @safe { Line 853 ⟶ 899: void main() { lcsString("thisisatest", "testing123testing").writeln; }</~~lang~~syntaxhighlight> =={{header\|Dart}}== <~~lang~~syntaxhighlight lang="dart">import 'dart:math'; String lcsRecursion(String a, String b) { Line 923 ⟶ 969: print("lcsRecursion('x', 'x') = ${lcsRecursion('x', 'x')}"); } </syntaxhighlight> ~~</lang>~~ {{out}} <pre>lcsDynamic('1234', '1224533324') = 1234 Line 934 ⟶ 980: lcsRecursion('', 'x') = lcsRecursion('x', 'x') = x</pre> =={{header\|EasyLang}}== {{trans\|BASIC256}} <syntaxhighlight> func$ right a$ n . return substr a$ (len a$ - n + 1) n . func$ left a$ n . if n < 0 n = len a$ + n . return substr a$ 1 n . func$ lcs a$ b$ . if len a$ = 0 or len b$ = 0 return "" . if right a$ 1 = right b$ 1 return lcs left a$ -1 left b$ -1 & right a$ 1 . x$ = lcs a$ left b$ -1 y$ = lcs left a$ -1 b$ if len x$ > len y$ return x$ else return y$ . . print lcs "1234" "1224533324" print lcs "thisisatest" "testing123testing" </syntaxhighlight> {{out}} <pre> 1234 tsitest </pre> =={{header\|Egison}}== <~~lang~~syntaxhighlight lang="egison"> (define $common-seqs (lambda [$xs $ys] Line 946 ⟶ 1,029: (define $lcs (compose common-seqs rac)) </syntaxhighlight> ~~</lang>~~ '''Output:''' <~~lang~~syntaxhighlight lang="egison"> > (lcs "thisisatest" "testing123testing")) "tsitest" </syntaxhighlight> ~~</lang>~~ =={{header\|Elixir}}== Line 958 ⟶ 1,041: This solution is Brute force. It is slow {{trans\|Ruby}} <~~lang~~syntaxhighlight lang="elixir">defmodule LCS do def lcs(a, b) do lcs(to_charlist(a), to_charlist(b), []) \|> to_string Line 971 ⟶ 1,054: IO.puts LCS.lcs("thisisatest", "testing123testing") IO.puts LCS.lcs('1234','1224533324')</~~lang~~syntaxhighlight> ===Dynamic Programming=== {{trans\|Erlang}} <~~lang~~syntaxhighlight lang="elixir">defmodule LCS do def lcs_length(s,t), do: lcs_length(s,t,Map.new) \|> elem(0) Line 1,014 ⟶ 1,097: IO.puts LCS.lcs("thisisatest","testing123testing") IO.puts LCS.lcs("1234","1224533324")</~~lang~~syntaxhighlight> {{out}} Line 1,025 ⟶ 1,108: =={{header\|Erlang}}== This implementation also includes the ability to calculate the length of the longest common subsequence. In calculating that length, we generate a cache which can be traversed to generate the longest common subsequence. <~~lang~~syntaxhighlight lang="erlang"> module(lcs). -compile(export_all). Line 1,067 ⟶ 1,150: lcs(ST,T,Cache,Acc) end. </syntaxhighlight> ~~</lang>~~ '''Output:''' <~~lang~~syntaxhighlight lang="erlang"> 77> lcs:lcs("thisisatest","testing123testing"). "tsitest" 78> lcs:lcs("1234","1224533324"). "1234" </syntaxhighlight> ~~</lang>~~ We can also use the process dictionary to memoize the recursive implementation: <~~lang~~syntaxhighlight lang="erlang"> lcs(Xs0, Ys0) -> CacheKey = {lcs_cache, Xs0, Ys0}, Line 1,102 ⟶ 1,185: Result end. </syntaxhighlight> ~~</lang>~~ Similar to the above, but without using the process dictionary: <~~lang~~syntaxhighlight lang="erlang"> -module(lcs). Line 1,145 ⟶ 1,228: end, {LCS, CacheB}. </syntaxhighlight> ~~</lang>~~ '''Output:''' <~~lang~~syntaxhighlight lang="erlang"> 48> lcs:lcs("thisisatest", "testing123testing"). "tsitest" </syntaxhighlight> ~~</lang>~~ =={{header\|F Sharp\|F#}}== Copied and slightly adapted from OCaml (direct recursion) <~~lang~~syntaxhighlight lang="fsharp">open System let longest xs ys = if List.length xs > List.length ys then xs else ys Line 1,175 ⟶ 1,258: (lcs (split "thisisatest") (split "testing123testing")))) 0 </syntaxhighlight> ~~</lang>~~ =={{header\|Factor}}== <~~lang~~syntaxhighlight lang="factor">USE: lcs "thisisatest" "testing123testing" lcs print</~~lang~~syntaxhighlight> {{out}} <pre> Line 1,189 ⟶ 1,272: Using the <tt>iso_varying_string</tt> module which can be found [http://www.fortran.com/iso_varying_string.f95 here] (or equivalent module conforming to the ISO/IEC 1539-2:2000 API or to a subset according to the need of this code: <code>char</code>, <code>len</code>, <code>//</code>, <code>extract</code>, <code>==</code>, <code>=</code>) <~~lang~~syntaxhighlight lang="fortran">program lcstest use iso_varying_string implicit none Line 1,226 ⟶ 1,309: end function lcs end program lcstest</~~lang~~syntaxhighlight> =={{header\|FreeBASIC}}== <~~lang~~syntaxhighlight lang="freebasic">Function LCS(a As String, b As String) As String Dim As String x, y If Len(a) = 0 Or Len(b) = 0 Then Line 1,245 ⟶ 1,328: Print LCS("1234", "1224533324") Print LCS("thisisatest", "testing123testing") Sleep</~~lang~~syntaxhighlight> Line 1,252 ⟶ 1,335: ===Recursion=== Brute force <~~lang~~syntaxhighlight lang="go">func lcs(a, b string) string { aLen := len(a) bLen := len(b) Line 1,266 ⟶ 1,349: } return y }</~~lang~~syntaxhighlight> ===Dynamic Programming=== <~~lang~~syntaxhighlight lang="go">func lcs(a, b string) string { arunes := []rune(a) brunes := []rune(b) Line 1,310 ⟶ 1,393: } return string(s) }</~~lang~~syntaxhighlight> =={{header\|Groovy}}== Recursive solution: <~~lang~~syntaxhighlight lang="groovy">def lcs(xstr, ystr) { if (xstr == "" \|\| ystr == "") { return ""; Line 1,336 ⟶ 1,419: println(lcs("1234", "1224533324")); println(lcs("thisisatest", "testing123testing"));</~~lang~~syntaxhighlight> {{out}} <pre>1234 Line 1,345 ⟶ 1,428: The [[wp:Longest_common_subsequence#Solution_for_two_sequences\|Wikipedia solution]] translates directly into Haskell, with the only difference that equal characters are added in front: <~~lang~~syntaxhighlight lang="haskell">longest xs ys = if length xs > length ys then xs else ys lcs [] _ = [] Line 1,351 ⟶ 1,434: lcs (x:xs) (y:ys) \| x == y = x : lcs xs ys \| otherwise = longest (lcs (x:xs) ys) (lcs xs (y:ys))</~~lang~~syntaxhighlight> A Memoized version of the naive algorithm. <~~lang~~syntaxhighlight lang="haskell">import qualified Data.MemoCombinators as M lcs = memoize lcsm Line 1,367 ⟶ 1,450: maxl x y = if length x > length y then x else y memoize = M.memo2 mString mString mString = M.list M.char -- Chars, but you can specify any type you need for the memo</~~lang~~syntaxhighlight> Memoization (aka dynamic programming) of that uses ''zip'' to make both the index and the character available: <~~lang~~syntaxhighlight lang="haskell">import Data.Array lcs xs ys = a!(0,0) where Line 1,382 ⟶ 1,465: f x y i j \| x == y = x : a!(i+1,j+1) \| otherwise = longest (a!(i,j+1)) (a!(i+1,j))</~~lang~~syntaxhighlight> All 3 solutions work of course not only with strings, but also with any other list. Example: <~~lang~~syntaxhighlight lang="haskell">Main> lcs "thisisatest" "testing123testing" "tsitest"</~~lang~~syntaxhighlight> The dynamic programming version without using arrays: <~~lang~~syntaxhighlight lang="haskell">import Data.List longest xs ys = if length xs > length ys then xs else ys Line 1,396 ⟶ 1,479: f [a,b] [c,d] \| null a = longest b c: [b] \| otherwise = (a++d):[b]</~~lang~~syntaxhighlight> Simple and slow solution: <~~lang~~syntaxhighlight lang="haskell">import Data.Ord import Data.List Line 1,407 ⟶ 1,490: lcs xs ys = maximumBy (comparing length) $ intersect (subsequences xs) (subsequences ys) main = print $ lcs "thisisatest" "testing123testing"</~~lang~~syntaxhighlight> {{out}} <pre>"tsitest"</pre> Line 1,416 ⟶ 1,499: {{libheader\|Icon Programming Library}} [http://www.cs.arizona.edu/icon/library/src/procs/strings.icn Uses deletec from strings] <~~lang~~syntaxhighlight ~~Icon~~lang="icon">procedure main() LCSTEST("thisisatest","testing123testing") LCSTEST("","x") Line 1,451 ⟶ 1,534: return if (x := lcs(a,b[1:-1])) > (y := lcs(a[1:-1],b)) then x else y # divide, discard, and keep longest end</~~lang~~syntaxhighlight> {{out}} <pre>lcs( "thisisatest", "testing123testing" ) = "tsitest" Line 1,459 ⟶ 1,542: =={{header\|J}}== <~~lang~~syntaxhighlight lang="j">lcs=: dyad define \|.x{~ 0{"1 cullOne^:_ (\: +/"1)(\:{."1) 4$.$. x =/ y ) cullOne=: ({~[: <@<@< [: (i. 0:)1,[: ./[: \|: 2>/\]) :: ]</~~lang~~syntaxhighlight> Here's [[Longest_common_subsequence/J\|another approach]]: <~~lang~~syntaxhighlight Jlang="j">mergeSq=: ;@}: ~.@, {.@;@{. ,&.> 3 {:: 4&{. common=: 2 2 <@mergeSq@,;.3^:_ [: (<@#&.> i.@$) =/ lcs=: [ {~ 0 {"1 ,&$ #: 0 ({:: (#~ [: (= >./) #@>)) 0 ({:: ,) common</~~lang~~syntaxhighlight> Example use (works with either definition of lcs): <~~lang~~syntaxhighlight Jlang="j"> 'thisisatest' lcs 'testing123testing' tsitest</~~lang~~syntaxhighlight> '''Dynamic programming version''' <~~lang~~syntaxhighlight lang="j">longest=: ]`[@.(>&#) upd=:{:@[,~ ({.@[ ,&.> {:@])`({:@[ longest&.> {.@])@.(0 = #&>@{.@[) lcs=: 0{:: [: ([: {.&> [: upd&.>/\.<"1@:,.)/ a:,.~a:,~=/{"1 a:,.<"0@[</~~lang~~syntaxhighlight> '''Output:''' <~~lang~~syntaxhighlight lang="j"> '1234' lcs '1224533324' 1234 'thisisatest' lcs 'testing123testing' tsitest</~~lang~~syntaxhighlight> '''Recursion''' <~~lang~~syntaxhighlight lang="j">lcs=:;(($:}.) longest }.@[ $: ])`({.@[,$:&}.)@.(=&{.)`((i.0)"_)@.(+.&(0=#))&((e.#[)&>/) ;~</~~lang~~syntaxhighlight> =={{header\|Java}}== ===Recursion=== This is not a particularly fast algorithm, but it gets the job done eventually. The speed is a result of many recursive function calls. <~~lang~~syntaxhighlight lang="java">public static String lcs(String a, String b){ int aLen = a.length(); int bLen = b.length(); Line 1,506 ⟶ 1,589: return (x.length() > y.length()) ? x : y; } }</~~lang~~syntaxhighlight> ===Dynamic Programming=== <~~lang~~syntaxhighlight lang="java">public static String lcs(String a, String b) { int[][] lengths = new int[a.length()+1][b.length()+1]; Line 1,539 ⟶ 1,622: return sb.reverse().toString(); }</~~lang~~syntaxhighlight> =={{header\|JavaScript}}== Line 1,545 ⟶ 1,628: {{trans\|Java}} This is more or less a translation of the recursive Java version above. <~~lang~~syntaxhighlight lang="javascript">function lcs(a, b) { var aSub = a.substr(0, a.length - 1); var bSub = b.substr(0, b.length - 1); Line 1,558 ⟶ 1,641: return (x.length > y.length) ? x : y; } }</~~lang~~syntaxhighlight> ES6 recursive implementation <~~lang~~syntaxhighlight lang="javascript"> const longest = (xs, ys) => (xs.length > ys.length) ? xs : ys; Line 1,572 ⟶ 1,655: return (x === y) ? (x + lcs(xs, ys)) : longest(lcs(xx, ys), lcs(xs, yy)); };</~~lang~~syntaxhighlight> ===Dynamic Programming=== This version runs in O(mn) time and consumes O(mn) space. Factoring out loop edge cases could get a small constant time improvement, and it's fairly trivial to edit the final loop to produce a full diff in addition to the lcs. <~~lang~~syntaxhighlight lang="javascript">function lcs(x,y){ var s,i,j,m,n, lcs=[],row=[],c=[], Line 1,611 ⟶ 1,694: } return lcs.join(''); }</~~lang~~syntaxhighlight> '''BUG note: In line 6, m and n are not yet initialized, and so x and y are never swapped.''' Line 1,617 ⟶ 1,700: The final loop can be modified to concatenate maximal common substrings rather than individual characters: <~~lang~~syntaxhighlight lang="javascript"> var t=i; while(i>-1&&j>-1){ switch(c[i][j]){ Line 1,631 ⟶ 1,714: } } if(t!==i){lcs.unshift(x.substring(i+1,t+1));}</~~lang~~syntaxhighlight> ===Greedy Algorithm=== This is an heuristic algorithm that won't always return the correct answer, but is significantly faster and less memory intensive than the dynamic programming version, in exchange for giving up the ability to re-use the table to find alternate solutions and greater complexity in generating diffs. Note that this implementation uses a binary buffer for additional efficiency gains, but it's simple to transform to use string or array concatenation; <~~lang~~syntaxhighlight lang="javascript">function lcs_greedy(x,y){ var p1, i, idx, symbols = {}, Line 1,676 ⟶ 1,759: return pos; } }</~~lang~~syntaxhighlight> Note that it won't return the correct answer for all inputs. For example: <~~lang~~syntaxhighlight lang="javascript">lcs_greedy('bcaaaade', 'deaaaabc'); // 'bc' instead of 'aaaa'</~~lang~~syntaxhighlight> =={{header\|jq}}== Naive recursive version: <~~lang~~syntaxhighlight lang="jq">def lcs(xstr; ystr): if (xstr == "" or ystr == "") then "" else Line 1,694 ⟶ 1,777: \| if ($one\|length) > ($two\|length) then $one else $two end end end;</~~lang~~syntaxhighlight> Example: <~~lang~~syntaxhighlight lang="jq">lcs("1234"; "1224533324"), lcs("thisisatest"; "testing123testing")</~~lang~~syntaxhighlight> Output:<~~lang~~syntaxhighlight lang="sh"> # jq -n -f lcs-recursive.jq "1234" "tsitest"</~~lang~~syntaxhighlight> =={{header\|Julia}}== {{works with\|Julia\|0.6}} <~~lang~~syntaxhighlight lang="julia">longest(a::String, b::String) = length(a) ≥ length(b) ? a : b """ Line 1,762 ⟶ 1,845: @time lcsrecursive("thisisatest", "testing123testing") @show lcsdynamic("thisisatest", "testing123testing") @time lcsdynamic("thisisatest", "testing123testing")</~~lang~~syntaxhighlight> {{out}} Line 1,771 ⟶ 1,854: =={{header\|Kotlin}}== <~~lang~~syntaxhighlight lang="scala">// version 1.1.2 fun lcs(x: String, y: String): String { Line 1,787 ⟶ 1,870: val y = "testing123testing" println(lcs(x, y)) }</~~lang~~syntaxhighlight> {{out}} Line 1,795 ⟶ 1,878: =={{header\|Liberty BASIC}}== <syntaxhighlight lang="lb"> ~~<lang lb>~~ 'variation of BASIC example w$="aebdef" Line 1,827 ⟶ 1,910: END IF END FUNCTION </syntaxhighlight> ~~</lang>~~ =={{header\|Logo}}== This implementation works on both words and lists. <~~lang~~syntaxhighlight lang="logo">to longest :s :t output ifelse greater? count :s count :t [:s] [:t] end Line 1,839 ⟶ 1,922: if equal? first :s first :t [output combine first :s lcs bf :s bf :t] output longest lcs :s bf :t lcs bf :s :t end</~~lang~~syntaxhighlight> =={{header\|Lua}}== <~~lang~~syntaxhighlight lang="lua">function LCS( a, b ) if #a == 0 or #b == 0 then return "" Line 1,859 ⟶ 1,942: end print( LCS( "thisisatest", "testing123testing" ) )</~~lang~~syntaxhighlight> =={{header\|M4}}== <~~lang~~syntaxhighlight M4lang="m4">define(`set2d',`define(`$1[$2][$3]',`$4')') define(`get2d',`defn($1[$2][$3])') define(`tryboth', Line 1,882 ⟶ 1,965: lcs(`1234',`1224533324') lcs(`thisisatest',`testing123testing')</~~lang~~syntaxhighlight> Note: the caching (set2d/get2d) obscures the code even more than usual, but is necessary in order to get the second test to run in a reasonable amount of time. =={{header\|Maple}}== <syntaxhighlight lang="maple"> ~~<lang Maple>~~ > StringTools:-LongestCommonSubSequence( "thisisatest", "testing123testing" ); "tsitest" </syntaxhighlight> ~~</lang>~~ =={{header\|Mathematica}}/{{header\|Wolfram Language}}== A built-in function can do this for us: <~~lang~~syntaxhighlight ~~Mathematica~~lang="mathematica">a = "thisisatest"; b = "testing123testing"; LongestCommonSequence[a, b]</~~lang~~syntaxhighlight> gives: <syntaxhighlight lang ~~Mathematica~~="mathematica">tsitest</~~lang~~syntaxhighlight> Note that Mathematica also has a built-in function called LongestCommonSubsequence[a,b]: Line 1,913 ⟶ 1,996: ===Recursion=== {{trans\|Python}} <~~lang~~syntaxhighlight lang="nim">proc lcs(x, y: string): string = if x == "" or y == "": return "" Line 1,925 ⟶ 2,008: echo lcs("1234", "1224533324") echo lcs("thisisatest", "testing123testing")</~~lang~~syntaxhighlight> This recursive version is not efficient but the execution time can be greatly improved by using memoization. Line 1,931 ⟶ 2,014: ===Dynamic Programming=== {{trans\|Python}} <~~lang~~syntaxhighlight lang="nim">proc lcs(a, b: string): string = var ls = newSeq[seq[int]](a.len+1) for i in 0 .. a.len: Line 1,958 ⟶ 2,041: echo lcs("1234", "1224533324") echo lcs("thisisatest", "testing123testing")</~~lang~~syntaxhighlight> =={{header\|OCaml}}== ===Recursion=== from Haskell <~~lang~~syntaxhighlight lang="ocaml">let longest xs ys = if List.length xs > List.length ys then xs else ys let rec lcs a b = match a, b with Line 1,972 ⟶ 2,055: x :: lcs xs ys else longest (lcs a ys) (lcs xs b)</~~lang~~syntaxhighlight> ===Memoized recursion=== <~~lang~~syntaxhighlight lang="ocaml"> let lcs xs ys = let cache = Hashtbl.create 16 in Line 1,995 ⟶ 2,078: result in lcs xs ys</~~lang~~syntaxhighlight> ===Dynamic programming=== <~~lang~~syntaxhighlight lang="ocaml">let lcs xs' ys' = let xs = Array.of_list xs' and ys = Array.of_list ys' in Line 2,012 ⟶ 2,095: done done; a.(0).(0)</~~lang~~syntaxhighlight> Because both solutions only work with lists, here are some functions to convert to and from strings: <~~lang~~syntaxhighlight lang="ocaml">let list_of_string str = let result = ref [] in String.iter (fun x -> result := x :: !result) Line 2,024 ⟶ 2,107: let result = String.create (List.length lst) in ignore (List.fold_left (fun i x -> result.[i] <- x; i+1) 0 lst); result</~~lang~~syntaxhighlight> Both solutions work. Example: Line 2,037 ⟶ 2,120: Recursive solution: <~~lang~~syntaxhighlight lang="oz">declare fun {LCS Xs Ys} case [Xs Ys] Line 2,051 ⟶ 2,134: end in {System.showInfo {LCS "thisisatest" "testing123testing"}}</~~lang~~syntaxhighlight> =={{header\|Pascal}}== {{trans\|Fortran}} <~~lang~~syntaxhighlight lang="pascal">Program LongestCommonSubsequence(output); function lcs(a, b: string): string; Line 2,088 ⟶ 2,171: s2 := '1224533324'; writeln (lcs(s1, s2)); end.</~~lang~~syntaxhighlight> {{out}} <pre>:> ./LongestCommonSequence Line 2,096 ⟶ 2,179: =={{header\|Perl}}== <~~lang~~syntaxhighlight lang="perl">sub lcs { my ($a, $b) = @_; if (!length($a) \|\| !length($b)) { Line 2,109 ⟶ 2,192: } print lcs("thisisatest", "testing123testing") . "\n";</~~lang~~syntaxhighlight> ===Alternate letting regex do all the work=== <syntaxhighlight lang="perl">use strict; ~~<lang perl>#!/usr/bin/perl~~ ~~use strict; # https://rosettacode.org/wiki/Longest_common_subsequence~~ use warnings; use feature 'bitwise'; print "lcs is ", lcs('thisisatest', 'testing123testing'), "\n"; Line 2,122 ⟶ 2,204: { my ($c, $d) = @_; for my $len ( reverse 1 .. length($c &. $d) ) { "$c\n$d" =~ join '.', ('(.)') x $len, "\n", map "\\$_", 1 .. $len and Line 2,128 ⟶ 2,210: } return ''; }</~~lang~~syntaxhighlight> {{out}} <pre>lcs is ~~tsitest~~tastiest</pre> =={{header\|Phix}}== If you want this to work with (utf8) Unicode text, just chuck the inputs through utf8_to_utf32() first (and the output through utf32_to_utf8()). <!--<~~lang~~syntaxhighlight ~~Phix~~lang="phix">(phixonline)--> <span style="color: #008080;">with</span> <span style="color: #008080;">javascript_semantics</span> <span style="color: #008080;">function</span> <span style="color: #000000;">lcs</span><span style="color: #0000FF;">(</span><span style="color: #004080;">sequence</span> <span style="color: #000000;">a</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">b</span><span style="color: #0000FF;">)</span> Line 2,156 ⟶ 2,238: <span style="color: #0000FF;">?</span><span style="color: #000000;">lcs</span><span style="color: #0000FF;">(</span><span style="color: #000000;">a</span><span style="color: #0000FF;">,</span><span style="color: #000000;">b</span><span style="color: #0000FF;">)</span> <span style="color: #008080;">end</span> <span style="color: #008080;">for</span> <!--</~~lang~~syntaxhighlight>--> {{out}} <pre> Line 2,164 ⟶ 2,246: ===Alternate version=== same output <!--<~~lang~~syntaxhighlight ~~Phix~~lang="phix">(phixonline)--> <span style="color: #008080;">with</span> <span style="color: #008080;">javascript_semantics</span> <span style="color: #008080;">function</span> <span style="color: #000000;">LCSLength</span><span style="color: #0000FF;">(</span><span style="color: #004080;">sequence</span> <span style="color: #000000;">X</span><span style="color: #0000FF;">,</span> <span style="color: #004080;">sequence</span> <span style="color: #000000;">Y</span><span style="color: #0000FF;">)</span> Line 2,204 ⟶ 2,286: <span style="color: #0000FF;">?</span><span style="color: #000000;">lcs</span><span style="color: #0000FF;">(</span><span style="color: #000000;">a</span><span style="color: #0000FF;">,</span><span style="color: #000000;">b</span><span style="color: #0000FF;">)</span> <span style="color: #008080;">end</span> <span style="color: #008080;">for</span> <!--</~~lang~~syntaxhighlight>--> =={{header\|Picat}}== ===Wikipedia algorithm=== With some added trickery for a 1-based language. <syntaxhighlight lang="picat">lcs_wiki(X,Y) = V => [C, _Len] = lcs_length(X,Y), V = backTrace(C,X,Y,X.length+1,Y.length+1). lcs_length(X, Y) = V=> M = X.length, N = Y.length, C = [[0 : J in 1..N+1] : I in 1..N+1], foreach(I in 2..M+1,J in 2..N+1) if X[I-1] == Y[J-1] then C[I,J] := C[I-1,J-1] + 1 else C[I,J] := max([C[I,J-1], C[I-1,J]]) end end, V = [C, C[M+1,N+1]]. backTrace(C, X, Y, I, J) = V => if I == 1; J == 1 then V = "" elseif X[I-1] == Y[J-1] then V = backTrace(C, X, Y, I-1, J-1) ++ [X[I-1]] else if C[I,J-1] > C[I-1,J] then V = backTrace(C, X, Y, I, J-1) else V = backTrace(C, X, Y, I-1, J) end end.</syntaxhighlight> ===Rule-based=== {{trans\|SETL}} <syntaxhighlight lang="picat">table lcs_rule(A, B) = "", (A == ""; B == "") => true. lcs_rule(A, B) = [A[1]] ++ lcs_rule(butfirst(A), butfirst(B)), A[1] == B[1] => true. lcs_rule(A, B) = longest(lcs_rule(butfirst(A), B), lcs_rule(A, butfirst(B))) => true. % Return the longest string of A and B longest(A, B) = cond(A.length > B.length, A, B). % butfirst (everything except first element) butfirst(A) = [A[I] : I in 2..A.length].</syntaxhighlight> ===Test=== <syntaxhighlight lang="picat">go => Tests = [["thisisatest","testing123testing"], ["XMJYAUZ", "MZJAWXU"], ["1234", "1224533324"], ["beginning-middle-ending","beginning-diddle-dum-ending"] ], Funs = [lcs_wiki,lcs_rule], foreach(Fun in Funs) println(fun=Fun), foreach(Test in Tests) printf("%w : %w\n", Test, apply(Fun,Test[1],Test[2])) end, nl end, nl.</syntaxhighlight> {{out}} <pre>fun = lcs_wiki [thisisatest,testing123testing] : tsitest [XMJYAUZ,MZJAWXU] : MJAU [1234,1224533324] : 1234 [beginning-middle-ending,beginning-diddle-dum-ending] : beginning-iddle-ending fun = lcs_rule [thisisatest,testing123testing] : tsitest [XMJYAUZ,MZJAWXU] : MJAU [1234,1224533324] : 1234 [beginning-middle-ending,beginning-diddle-dum-ending] : beginning-iddle-ending</pre> =={{header\|PicoLisp}}== <~~lang~~syntaxhighlight ~~PicoLisp~~lang="picolisp">(de commonSequences (A B) (when A (conc Line 2,218 ⟶ 2,378: (commonSequences (chop "thisisatest") (chop "testing123testing") ) )</~~lang~~syntaxhighlight> {{out}} <pre>-> ("t" "s" "i" "t" "e" "s" "t")</pre> Line 2,224 ⟶ 2,384: =={{header\|PowerShell}}== Returns a sequence (array) of a type: <syntaxhighlight lang="powershell"> ~~<lang PowerShell>~~ function Get-Lcs ($ReferenceObject, $DifferenceObject) { Line 2,272 ⟶ 2,432: $longestCommonSubsequence } </syntaxhighlight> ~~</lang>~~ Returns the character array as a string: <syntaxhighlight lang="powershell"> ~~<lang PowerShell>~~ (Get-Lcs -ReferenceObject "thisisatest" -DifferenceObject "testing123testing") -join "" </syntaxhighlight> ~~</lang>~~ {{Out}} <pre> Line 2,282 ⟶ 2,442: </pre> Returns an array of integers: <syntaxhighlight lang="powershell"> ~~<lang PowerShell>~~ Get-Lcs -ReferenceObject @(1,2,3,4) -DifferenceObject @(1,2,2,4,5,3,3,3,2,4) </syntaxhighlight> ~~</lang>~~ {{Out}} <pre> Line 2,293 ⟶ 2,453: </pre> Given two lists of objects, return the LCS of the ID property: <syntaxhighlight lang="powershell"> ~~<lang PowerShell>~~ $list1 Line 2,319 ⟶ 2,479: Get-Lcs -ReferenceObject $list1.ID -DifferenceObject $list2.ID </syntaxhighlight> ~~</lang>~~ {{Out}} <pre> Line 2,332 ⟶ 2,492: ===Recursive Version=== First version: <~~lang~~syntaxhighlight ~~Prolog~~lang="prolog">test :- time(lcs("thisisatest", "testing123testing", Lcs)), writef('%s',[Lcs]). Line 2,351 ⟶ 2,511: length(L1,Length1), length(L2,Length2), ((Length1 > Length2) -> Longest = L1; Longest = L2).</~~lang~~syntaxhighlight> Second version, with memoization: <~~lang~~syntaxhighlight ~~Prolog~~lang="prolog">%declare that we will add lcs_db facts during runtime :- dynamic lcs_db/3. Line 2,381 ⟶ 2,541: length(L1,Length1), length(L2,Length2), ((Length1 > Length2) -> Longest = L1; Longest = L2).</~~lang~~syntaxhighlight> {{out\|Demonstrating}} Example for "beginning-middle-ending" and "beginning-diddle-dum-ending" <BR> First version : <~~lang~~syntaxhighlight ~~Prolog~~lang="prolog">?- time(lcs("beginning-middle-ending","beginning-diddle-dum-ending", Lcs)),writef('%s', [Lcs]). % 10,875,184 inferences, 1.840 CPU in 1.996 seconds (92% CPU, 5910426 Lips) beginning-iddle-ending</~~lang~~syntaxhighlight> Second version which is much faster : <~~lang~~syntaxhighlight ~~Prolog~~lang="prolog">?- time(lcs("beginning-middle-ending","beginning-diddle-dum-ending", Lcs)),writef('%s', [Lcs]). % 2,376 inferences, 0.010 CPU in 0.003 seconds (300% CPU, 237600 Lips) beginning-iddle-ending</~~lang~~syntaxhighlight> =={{header\|PureBasic}}== {{trans\|Basic}} <~~lang~~syntaxhighlight ~~PureBasic~~lang="purebasic">Procedure.s lcs(a$, b$) Protected x$ , lcs$ If Len(a$) = 0 Or Len(b$) = 0 Line 2,414 ⟶ 2,574: OpenConsole() PrintN( lcs("thisisatest", "testing123testing")) PrintN("Press any key to exit"): Repeat: Until Inkey() <> ""</~~lang~~syntaxhighlight> =={{header\|Python}}== Line 2,421 ⟶ 2,581: ===Recursion=== This solution is similar to the Haskell one. It is slow. <~~lang~~syntaxhighlight lang="python">def lcs(xstr, ystr): """ >>> lcs('thisisatest', 'testing123testing') Line 2,432 ⟶ 2,592: return str(lcs(xs, ys)) + x else: return max(lcs(xstr, ys), lcs(xs, ystr), key=len)</~~lang~~syntaxhighlight> Test it: <~~lang~~syntaxhighlight lang="python">if __name__=="__main__": import doctest; doctest.testmod()</~~lang~~syntaxhighlight> ===Dynamic Programming=== <~~lang~~syntaxhighlight lang="python">def lcs(a, b): # generate matrix of length of longest common subsequence for substrings of both words lengths = [[0] (len(b)+1) for _ in range(len(a)+1)] Line 2,455 ⟶ 2,615: result += a[i-1] return result</~~lang~~syntaxhighlight> =={{header\|Racket}}== <~~lang~~syntaxhighlight lang="racket">#lang racket (define (longest xs ys) (if (> (length xs) (length ys)) Line 2,483 ⟶ 2,643: (list->string (lcs/list (string->list sx) (string->list sy)))) (lcs "thisisatest" "testing123testing")</~~lang~~syntaxhighlight> {{out}} <pre>"tsitest"></pre> Line 2,492 ⟶ 2,652: {{works with\|rakudo\|2015-09-16}} This solution is similar to the Haskell one. It is slow. <syntaxhighlight lang="raku" ~~perl6~~line>say lcs("thisisatest", "testing123testing");sub lcs(Str $xstr, Str $ystr) { return "" unless $xstr && $ystr; Line 2,501 ⟶ 2,661: } say lcs("thisisatest", "testing123testing");</~~lang~~syntaxhighlight> ===Dynamic Programming=== {{trans\|Java}} <syntaxhighlight lang="raku" line> ~~<lang perl6>~~ sub lcs(Str $xstr, Str $ystr) { my ($xlen, $ylen) = ($xstr, $ystr)>>.chars; Line 2,536 ⟶ 2,696: } say lcs("thisisatest", "testing123testing");</~~lang~~syntaxhighlight> ===Bit Vector=== Bit parallel dynamic programming with nearly linear complexity O(n). It is fast. <syntaxhighlight lang="raku" ~~perl6~~line>sub lcs(Str $xstr, Str $ystr) { my ($@a,$ @b) := ([$xstr~~.comb]~~,[ $ystr)».comb]); my (%positions, @Vs, $lcs); myfor @a.kv -> $i, $x { %positions;{$x} +\|= 1 +< ($i % @a) } ~~for $a.kv -> $i,$x { $positions{$x} +\|= 1 +< $i };~~ my $S = +^ 0; myfor ~~$Vs~~(0 =..^ @b) -> $j ~~[];~~{ my ~~($y,~~$u = $S +& (%positions{@b[$j]} // 0); ~~for~~ @Vs[$j] = $S = (~~0..~~$S + $~~b-1~~u) ->+\| ($jS {- $u) ~~$y = $positions{$b[$j]} // 0;~~ ~~$u = $S +& $y;~~ ~~$S = ($S + $u) +\| ($S - $u);~~ ~~$Vs[$j] = $S;~~ } my ($i, $j) = ~~(+$~~@a-1, +$@b-1); while ($i ≥ 0 and $j ≥ 0) { ~~my $result = "";~~ ~~while~~ ~~($i~~ >= 0 &&unless (@Vs[$j] >=+& 0(1 +< $i)) { if ( $lcs [R~]= @a[$i] unless $j and ^@Vs[$j-1] +& (1 +< $i)~~) { $i-- }~~; ~~else~~ { $j-- ~~unless ($j && +^$Vs[$j-1] +& (1 +< $i)) {~~ ~~$result = $a[$i] ~ $result;~~ ~~$i--;~~ } ~~$j--;~~ } $i-- } ~~return~~ $~~result;~~lcs } say lcs("thisisatest", "testing123testing");</~~lang~~syntaxhighlight> =={{header\|ReasonML}}== <~~lang~~syntaxhighlight lang="ocaml"> let longest = (xs, ys) => if (List.length(xs) > List.length(ys)) { Line 2,594 ⟶ 2,746: } }; </syntaxhighlight> ~~</lang>~~ =={{header\|REXX}}== <~~lang~~syntaxhighlight lang="rexx">/REXX program tests the LCS (Longest Common Subsequence) subroutine. / parse arg aaa bbb . /obtain optional arguments from the CL/ say 'string A =' aaa /echo the string A to the screen. / Line 2,622 ⟶ 2,774: y= LCS( substr(a, 1, j - 1), b, 9) if length(x)>length(y) then return x return y</~~lang~~syntaxhighlight> {{out\|output\|text=  when using the input of:     <tt> 1234   1224533324 </tt>}} <pre> Line 2,637 ⟶ 2,789: =={{header\|Ring}}== <~~lang~~syntaxhighlight lang="ring"> see longest("1267834", "1224533324") + nl Line 2,654 ⟶ 2,806: lcs = x2 return lcs ok ok </syntaxhighlight> ~~</lang>~~ Output: <pre> Line 2,664 ⟶ 2,816: This solution is similar to the Haskell one. It is slow (The time complexity is exponential.) {{works with\|Ruby\|1.9}} <~~lang~~syntaxhighlight lang="ruby">=begin irb(main):001:0> lcs('thisisatest', 'testing123testing') => "tsitest" Line 2,677 ⟶ 2,829: [lcs(xstr, ys), lcs(xs, ystr)].max_by {\|x\| x.size} end end</~~lang~~syntaxhighlight> ===Dynamic programming=== Line 2,684 ⟶ 2,836: Walker class for the LCS matrix: <~~lang~~syntaxhighlight lang="ruby">class LCS SELF, LEFT, UP, DIAG = [0,0], [0,-1], [-1,0], [-1,-1] Line 2,750 ⟶ 2,902: puts lcs('thisisatest', 'testing123testing') puts lcs("rosettacode", "raisethysword") end</~~lang~~syntaxhighlight> {{out}} Line 2,760 ⟶ 2,912: =={{header\|Run BASIC}}== <~~lang~~syntaxhighlight lang="runbasic">a$ = "aebdaef" b$ = "cacbac" print lcs$(a$,b$) Line 2,786 ⟶ 2,938: END IF [ext] END FUNCTION</~~lang~~syntaxhighlight><pre>aba</pre> =={{header\|Rust}}== Dynamic programming version: <~~lang~~syntaxhighlight lang="rust"> use std::cmp; Line 2,846 ⟶ 2,998: let res = lcs("AGGTAB".to_string(), "GXTXAYB".to_string()); assert_eq!((4 as usize, "GTAB".to_string()), res); }</~~lang~~syntaxhighlight> =={{header\|Scala}}== {{works with\|Scala 2.13}} Using lazily evaluated lists: <~~lang~~syntaxhighlight lang="scala"> def lcsLazy[T](u: IndexedSeq[T], v: IndexedSeq[T]): IndexedSeq[T] = { def su = subsets(u).to(LazyList) def sv = subsets(v).to(LazyList) Line 2,862 ⟶ 3,014: def subsets[T](u: IndexedSeq[T]): Iterator[IndexedSeq[T]] = { u.indices.reverseIterator.flatMap{n => u.indices.combinations(n + 1).map(_.map(u))} }</~~lang~~syntaxhighlight> Using recursion: <~~lang~~syntaxhighlight lang="scala"> def lcsRec[T]: (IndexedSeq[T], IndexedSeq[T]) => IndexedSeq[T] = { case (a +: as, b +: bs) if a == b => a +: lcsRec(as, bs) case (as, bs) if as.isEmpty \|\| bs.isEmpty => IndexedSeq[T]() Line 2,871 ⟶ 3,023: val (s1, s2) = (lcsRec(a +: as, bs), lcsRec(as, b +: bs)) if(s1.length > s2.length) s1 else s2 }</~~lang~~syntaxhighlight> {{out}} Line 2,881 ⟶ 3,033: {{works with\|Scala 2.9.3}} Recursive version: <~~lang~~syntaxhighlight lang="scala"> def lcs[T]: (List[T], List[T]) => List[T] = { case (_, Nil) => Nil case (Nil, _) => Nil Line 2,891 ⟶ 3,043: } } }</~~lang~~syntaxhighlight> The dynamic programming version: <~~lang~~syntaxhighlight lang="scala"> case class Memoized[A1, A2, B](f: (A1, A2) => B) extends ((A1, A2) => B) { val cache = scala.collection.mutable.Map.empty[(A1, A2), B] def apply(x: A1, y: A2) = cache.getOrElseUpdate((x, y), f(x, y)) Line 2,910 ⟶ 3,062: } } }</~~lang~~syntaxhighlight> {{out}} Line 2,920 ⟶ 3,072: Port from Clojure. <~~lang~~syntaxhighlight lang="scheme"> ;; using srfi-69 (define (memoize proc) Line 2,950 ⟶ 3,102: '())) '())))) </syntaxhighlight> ~~</lang>~~ Testing: <~~lang~~syntaxhighlight lang="scheme"> (test-group Line 2,967 ⟶ 3,119: (lcs '(a b d e f g h i j) '(A b c d e f F a g h j)))) </syntaxhighlight> ~~</lang>~~ =={{header\|Seed7}}== <~~lang~~syntaxhighlight lang="seed7">$ include "seed7_05.s7i"; const func string: lcs (in string: a, in string: b) is func Line 2,998 ⟶ 3,150: writeln(lcs("thisisatest", "testing123testing")); writeln(lcs("1234", "1224533324")); end func;</~~lang~~syntaxhighlight> Output: Line 3,011 ⟶ 3,163: It is interesting to note that x and y are computed in parallel, dividing work across threads repeatedly down through the recursion. <~~lang~~syntaxhighlight lang="sequencel">import <Utilities/Sequence.sl>; lcsBack(a(1), b(1)) := Line 3,032 ⟶ 3,184: lcsBack(args[1], args[2]) when size(args) >=2 else lcsBack("thisisatest", "testing123testing");</~~lang~~syntaxhighlight> {{out}} Line 3,041 ⟶ 3,193: =={{header\|SETL}}== Recursive; Also works on tuples (vectors) <~~lang~~syntaxhighlight lang="setl"> op .longest(a, b); return if #a > #b then a else b end; end .longest; Line 3,053 ⟶ 3,205: return lcs(a(2..), b) .longest lcs(a, b(2..)); end; end lcs;</~~lang~~syntaxhighlight> =={{header\|Sidef}}== <~~lang~~syntaxhighlight lang="ruby">func lcs(xstr, ystr) is cached { xstr.is_empty && return xstr; ystr.is_empty && return ystr; var(x, xs, y, ys) = (xstr.ftfirst(~~0,0~~1), xstr.ftslice(1), ystr.ftfirst(~~0,0~~1), ystr.ftslice(1)); if (x == y) { x + lcs(xs, ys) } else { [lcs(xstr, ys), lcs(xs, ystr)].max_by { .len }; } } say lcs("thisisatest", "testing123testing");</~~lang~~syntaxhighlight> {{out}} <pre> Line 3,081 ⟶ 3,233: ===Recursion=== {{trans\|Java}} <~~lang~~syntaxhighlight lang="slate">s1@(Sequence traits) longestCommonSubsequenceWith: s2@(Sequence traits) [ s1 isEmpty \/ s2 isEmpty ifTrue: [^ {}]. Line 3,090 ⟶ 3,242: y: (s1 allButLast longestCommonSubsequenceWith: s2). x length > y length ifTrue: [x] ifFalse: [y]] ].</~~lang~~syntaxhighlight> ===Dynamic Programming=== {{trans\|Ruby}} <~~lang~~syntaxhighlight lang="slate">s1@(Sequence traits) longestCommonSubsequenceWith: s2@(Sequence traits) [\| lengths \| lengths: (ArrayMD newWithDimensions: {s1 length `cache. s2 length `cache} defaultElement: 0). Line 3,116 ⟶ 3,268: index2: index2 - 1]] ] writingAs: s1) reverse ].</~~lang~~syntaxhighlight> =={{header\|Swift}}== Swift 5.1 ===Recursion=== <~~lang~~syntaxhighlight ~~Swift~~lang="swift">rlcs(_ s1: String, _ s2: String) -> String { if s1.count == 0 \|\| s2.count == 0 { return "" Line 3,133 ⟶ 3,285: return str1.count > str2.count ? str1 : str2 } }</~~lang~~syntaxhighlight> ===Dynamic Programming=== <~~lang~~syntaxhighlight ~~Swift~~lang="swift">func lcs(_ s1: String, _ s2: String) -> String { var lens = Array( repeating:Array(repeating: 0, count: s2.count + 1), Line 3,168 ⟶ 3,320: return String(returnStr.reversed()) }</~~lang~~syntaxhighlight> =={{header\|Tcl}}== ===Recursive=== {{trans\|Java}} <~~lang~~syntaxhighlight lang="tcl">proc r_lcs {a b} { if {$a eq "" \|\| $b eq ""} {return ""} set a_ [string range $a 1 end] Line 3,184 ⟶ 3,336: return [expr {[string length $x] > [string length $y] ? $x :$y}] } }</~~lang~~syntaxhighlight> ===Dynamic=== {{trans\|Java}} {{works with\|Tcl\|8.5}} <~~lang~~syntaxhighlight lang="tcl">package require Tcl 8.5 namespace import ::tcl::mathop::+ namespace import ::tcl::mathop::- Line 3,226 ⟶ 3,378: } return [string reverse $result] }</~~lang~~syntaxhighlight> ===Performance Comparison=== <~~lang~~syntaxhighlight lang="tcl">% time {d_lcs thisisatest testing123testing} 10 637.5 microseconds per iteration % time {r_lcs thisisatest testing123testing} 10 1275566.8 microseconds per iteration</~~lang~~syntaxhighlight> =={{header\|Ursala}}== This uses the same recursive algorithm as in the Haskell example, and works on lists of any type. <~~lang~~syntaxhighlight ~~Ursala~~lang="ursala">#import std lcs = ~&alrB^& ~&E?abh/~&alh2fabt2RC @faltPrXlrtPXXPW leql?/~&r ~&l</~~lang~~syntaxhighlight> test program: <~~lang~~syntaxhighlight ~~Ursala~~lang="ursala">#cast %s example = lcs('thisisatest','testing123testing')</~~lang~~syntaxhighlight> {{out}} <pre>'tsitest'</pre> Line 3,249 ⟶ 3,401: =={{header\|Wren}}== {{trans\|Kotlin}} <~~lang~~syntaxhighlight ~~ecmascript~~lang="wren">var lcs // recursive lcs = Fn.new { \|x, y\| if (x.count == 0 \|\| y.count == 0) return "" Line 3,262 ⟶ 3,414: var x = "thisisatest" var y = "testing123testing" System.print(lcs.call(x, y))</~~lang~~syntaxhighlight> {{out}} Line 3,272 ⟶ 3,424: This is quite vile in terms of [time] efficiency, another algorithm should be used for real work. {{trans\|D}} <~~lang~~syntaxhighlight lang="zkl">fcn lcs(a,b){ if(not a or not b) return(""); if (a[0]==b[0]) return(a[0] + self.fcn(a[1,],b[1,])); return(fcn(x,y){if(x.len()>y.len())x else y}(lcs(a,b[1,]),lcs(a[1,],b))) }</~~lang~~syntaxhighlight> The last line looks strange but it is just return(lambda longest(lcs.lcs)) {{out}}