Suffix tree

From Rosetta Code
Revision as of 17:31, 10 May 2013 by Grondilu (talk | contribs) (Created page with "{{draft task}} A wp:suffix tree is a data structure commonly used in wp:string algorithms. Basically, for any string, its suffix tree is a wp:rooted tree where...")
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Suffix tree is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

A suffix tree is a data structure commonly used in string algorithms. Basically, for any string, its suffix tree is a rooted tree where each edge is labelled, and where the concatenation of all the labels from the root to a leaf uniquely identifies a suffix of the string.


For this task, build the suffix tree of the string "rosettacode$", and show that its edges are:

$ $ acode$ acode$ code$ de$ de$ e o rosettacode$ settacode$ settacode$ t tacode$ ttacode$

Perl 6

<lang Perl 6>sub suffixes(Str $str) { map &flip, [\~] $str.flip.comb } sub suffix-tree(@a) {

   @a == 0 ?? [] !!
   @a == 1 ?? hash @a[0] => [] !!
   hash gather for @a.classify(*.substr(0, 1)) {
       my $subtree = suffix-tree(grep *.chars, map *.substr(1), .value[]);
       if $subtree.elems == 1 {
           my $pair = $subtree.pick;
           take .key ~ $pair.key => $pair.value;
       } else {
           take .key => $subtree;
       }
   }

}

sub edges($tree) {

   gather for $tree[] {
       take .key;
       .take for edges .value;
   } if $tree;

}

say sort edges suffix-tree suffixes 'rosettacode$';</lang>

Output matches the one in the task description.