Anadromes

From Rosetta Code
Revision as of 05:42, 30 July 2022 by rosettacode>Globules (Initial Haskell version.)
Anadromes is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.


An anadrome is similar to a palindrome except, rather than spelling the same word or phrase when reversed, it spells a different word or phrase. An anadrome is a special case of an anagram.

Anadrome is a portmanteau of the words anagram and palindrome.

For instance, regal and lager are anadromes.


Task

Using the words.txt file from https://github.com/dwyl/english-words, find and display all of the anadrome pairs with more than 6 characters.

Each word pair should only show up one time in the list.

Alternate download location.


ALGOL 68

Reads the words from standard input, stopping when a word = ZZZ is found (which is the last word in words.txt).
Unfortunately, Algol 68G doesn't like an array of STRINGs with more than 300 000 elements, even though it allows INT arrays to have millions - at least under Windows
(I haven't tried it with Linux).
So you will need to use another compiler under Windows.
As in the Wren sample, the words are quicksorted so binary searching can be used to find the reversed words. <lang algol68>BEGIN # find some anadromes: words that whwn reversed are also words #

   # in-place quick sort an array of strings                                #
   PROC s quicksort = ( REF[]STRING a, INT lb, ub )VOID:
        IF ub > lb
        THEN
           # more than one element, so must sort                            #
           INT  left  := lb;
           INT  right := ub;
           # choosing the middle element of the array as the pivot          #
           STRING pivot := a[ left + ( ( right + 1 ) - left ) OVER 2 ];
           WHILE
               WHILE IF left  <= ub THEN a[ left  ] < pivot ELSE FALSE FI
               DO
                   left  +:= 1
               OD;
               WHILE IF right >= lb THEN a[ right ] > pivot ELSE FALSE FI
               DO
                   right -:= 1
               OD;
               left <= right
           DO
               STRING t   := a[ left  ];
               a[ left  ] := a[ right ];
               a[ right ] := t;
               left      +:= 1;
               right     -:= 1
           OD;
           s quicksort( a, lb,   right );
           s quicksort( a, left, ub    )
        FI # s quicksort # ;
   # returns TRUE if item is in list, FALSE otherwise                       #
   # - based on the iterative routine in the binary search task             #
   PROC contains = ( []STRING list, STRING item, INT lb, ub )BOOL:
        BEGIN
           INT  low  := lb,
           INT  high := ub;
           WHILE low < high DO
               INT mid =  ( low + high ) OVER 2;
               IF   list[ mid ] > item THEN high := mid - 1
               ELIF list[ mid ] < item THEN low  := mid + 1
               ELSE low := high := mid
               FI
           OD;
           list[ low ] = item
        END # contains # ;
   [ 1 : 500 000 ]STRING words;
   INT   t count    := 0;
   INT   w count    := 0;
   INT   max length := 0;
   BOOL  at eof     := FALSE;
   WHILE NOT at eof
   DO
       STRING word;
       read( ( word, newline ) );
       at eof   := word = "ZZZ";
       t count +:= 1;
       INT w length := 1 + ( UPB word - LWB word );
       IF w length > 6 THEN
           w count +:= 1;
           words[ w count ] := word;
           IF w length > max length THEN max length := w length FI
       FI
   OD;
   print( ( "read ", whole( t count, 0 ), " words, "
          , "the longest is ", whole( max length, 0 ), " characters"
          , newline
          , "     ", whole( w count, 0 ), " words are longer than 6 characters"
          , newline, newline
          )
        );
   s quicksort( words, 1, w count ); # sort the words for binary search      #
   print( ( "The following anadromes are present:", newline, newline ) );
   INT a count := 0;
   FOR i TO w count DO
       STRING word = words[ i ];
       STRING reverse word := "";
       FOR w pos FROM LWB word TO UPB word DO word[ w pos ] +=: reverse word OD;
       IF word < reverse word THEN
           IF contains( words, reverse word, 1, w count ) THEN
               # have an anadromic pair                                      #
               INT w length = 1 + ( UPB words[ i ] - LWB words[ i ] );
               FOR c TO 10 - w length DO print( ( " " ) ) OD;
               print( ( words[ i ], " :: ", reverse word, newline ) );
               a count +:= 1
           FI
       FI
   OD;
   print( ( newline, "Found ", whole( a count, 0 ), " anadromes", newline ) )

END</lang>

Output:
read 466551 words, the longest is 45 characters
     387537 words are longer than 6 characters

The following anadromes are present:

   amaroid :: diorama
   degener :: reneged
   deifier :: reified
   deliver :: reviled
   dessert :: tressed
  desserts :: stressed
   deviler :: relived
  dioramas :: samaroid
   gateman :: nametag
   leveler :: relevel
   pat-pat :: tap-tap
  redrawer :: rewarder
   reknits :: stinker
   relever :: reveler
   reliver :: reviler
   revotes :: setover
   sallets :: stellas

Found 17 anadromes

AppleScript

<lang applescript>use AppleScript version "2.4" -- OS X 10.10 (Yosemite) or later use framework "Foundation" use scripting additions

on Anadromes(textFile, minLength)

   set |⌘| to current application
   -- Read the text from the file.
   set theText to |⌘|'s class "NSString"'s stringWithContentsOfFile:(textFile's POSIX path) ¬
       usedEncoding:(missing value) |error|:(missing value)
   -- Lose paragraphs (one word per paragraph) which have fewer than minLength characters.
   set theText to theText's stringByReplacingOccurrencesOfString:("(?m)^.{0," & minLength - 1 & "}\\R") withString:("") ¬
       options:(|⌘|'s NSRegularExpressionSearch) range:({0, theText's |length|()})
   -- Get the remaining paragraphs as an array.
   set wordArray to (theText's componentsSeparatedByCharactersInSet:(|⌘|'s class "NSCharacterSet"'s controlCharacterSet()))
   
   -- Derive a list of reversed words.
   script o
       property wordList : (wordArray) as list
   end script
   set astid to AppleScript's text item delimiters
   set AppleScript's text item delimiters to ""
   repeat with i from 1 to (count o's wordList)
       set o's wordList's item i to (o's wordList's item i's characters's reverse) as text
   end repeat
   
   -- Get a list of words that are in both the original and reversed groups.
   set reversedWordSet to |⌘|'s class "NSSet"'s setWithArray:(o's wordList)
   set filter to |⌘|'s class "NSPredicate"'s predicateWithFormat_("self IN %@", reversedWordSet)
   set o's wordList to (wordArray's filteredArrayUsingPredicate:(filter)) as list
   
   -- Build the output line by line, omitting palindromes and already matched word pairs.
   set output to {}
   repeat with i from 1 to (count o's wordList)
       set thisWord to o's wordList's item i
       set o's wordList's item i to missing value
       set reversedWord to (thisWord's characters's reverse) as text
       if ({reversedWord} is in o's wordList) then set output's end to thisWord & " <--> " & reversedWord
   end repeat
   set AppleScript's text item delimiters to linefeed
   set output to output as text
   set AppleScript's text item delimiters to astid
   
   return output

end Anadromes

return Anadromes(((path to desktop as text) & "www.rosettacode.org:words.txt") as «class furl», 7)</lang>

Output:

<lang applescript>"amaroid <--> diorama degener <--> reneged deifier <--> reified deliver <--> reviled dessert <--> tressed desserts <--> stressed deviler <--> relived dioramas <--> samaroid gateman <--> nametag leveler <--> relevel pat-pat <--> tap-tap redrawer <--> rewarder reknits <--> stinker relever <--> reveler reliver <--> reviler revotes <--> setover sallets <--> stellas"</lang>

AWK

<lang AWK>

  1. syntax: GAWK -f ANADROMES.AWK WORDS.TXT
  2. sorting:
  3. PROCINFO["sorted_in"] is used by GAWK
  4. SORTTYPE is used by Thompson Automation's TAWK

BEGIN {

   width = 6

} { if (length($0) > width) {

     arr[$0]++
   }

} END {

   PROCINFO["sorted_in"] = "@ind_str_asc" ; SORTTYPE = 1
   for (i in arr) {
     tmp = reverse(i)
     if (tmp in arr) {
       if (i == tmp) { continue }
       if (tmp in shown_arr) { continue }
       printf("%11s %11s\n",i,tmp)
       shown_arr[i] = ""
     }
   }
   printf("%d words, %d > %d characters, %d anadromes\n",NR,length(arr),width,length(shown_arr))
   exit(0)

} function reverse(str, i,rts) {

   for (i=length(str); i>=1; i--) {
     rts = rts substr(str,i,1)
   }
   return(rts)

} </lang>

Output:
    amaroid     diorama
    degener     reneged
    deifier     reified
    deliver     reviled
    dessert     tressed
   desserts    stressed
    deviler     relived
   dioramas    samaroid
    gateman     nametag
    leveler     relevel
    pat-pat     tap-tap
   redrawer    rewarder
    reknits     stinker
    relever     reveler
    reliver     reviler
    revotes     setover
    sallets     stellas
466551 words, 387537 > 6 characters, 17 anadromes

C++

<lang cpp>#include <algorithm>

  1. include <cstdlib>
  2. include <fstream>
  3. include <iomanip>
  4. include <iostream>
  5. include <string>
  6. include <vector>

int main() {

   std::vector<std::string> words;
   std::ifstream in("words.txt");
   if (!in) {
       std::cerr << "Cannot open file words.txt.\n";
       return EXIT_FAILURE;
   }
   std::string line;
   while (getline(in, line)) {
       if (line.size() > 6)
           words.push_back(line);
   }
   sort(words.begin(), words.end());
   for (const std::string& word : words) {
       std::string reversed(word.rbegin(), word.rend());
       if (reversed > word &&
           binary_search(words.begin(), words.end(), reversed)) {
           std::cout << std::setw(8) << word << " <-> " << reversed << '\n';
       }
   }
   return EXIT_SUCCESS;

}</lang>

Output:
 amaroid <-> diorama
 degener <-> reneged
 deifier <-> reified
 deliver <-> reviled
 dessert <-> tressed
desserts <-> stressed
 deviler <-> relived
dioramas <-> samaroid
 gateman <-> nametag
 leveler <-> relevel
 pat-pat <-> tap-tap
redrawer <-> rewarder
 reknits <-> stinker
 relever <-> reveler
 reliver <-> reviler
 revotes <-> setover
 sallets <-> stellas

F#

<lang fsharp> // Anadromes. Nigel Galloway: June 26th., 2022 let words=seq{use n=System.IO.File.OpenText("words.txt") in while not n.EndOfStream do yield n.ReadLine()}|>Seq.filter(fun n->6<(Seq.length n))|>Seq.map(fun n->n.ToCharArray())|>Set.ofSeq Set.intersect words (words|>Set.map(Array.rev))|>Set.iter(fun n->if n<Array.rev n then printfn "%s" (System.String n)) </lang>

Factor

Works with: Factor version 0.99 2022-04-03

<lang factor>USING: assocs grouping hash-sets io.encodings.ascii io.files kernel math prettyprint sequences sets sets.extras ;

"words.txt" ascii file-lines [ length 6 > ] filter dup >hash-set '[ reverse _ in? ] filter [ reverse ] zip-with [ all-equal? ] reject [ fast-set ] unique-by .</lang>

Output:
{
    { "amaroid" "diorama" }
    { "degener" "reneged" }
    { "deifier" "reified" }
    { "deliver" "reviled" }
    { "dessert" "tressed" }
    { "desserts" "stressed" }
    { "deviler" "relived" }
    { "dioramas" "samaroid" }
    { "gateman" "nametag" }
    { "leveler" "relevel" }
    { "pat-pat" "tap-tap" }
    { "redrawer" "rewarder" }
    { "reknits" "stinker" }
    { "relever" "reveler" }
    { "reliver" "reviler" }
    { "revotes" "setover" }
    { "sallets" "stellas" }
}

Haskell

<lang haskell>import qualified Data.Set as S import qualified Data.Text as T import qualified Data.Text.IO as TIO

-- The anadromes of a list of words. We convert all words to lower case. anadromes :: [T.Text] -> [(T.Text, T.Text)] anadromes ws = let set = S.fromList $ map T.toLower ws

              in S.foldr (step set) [] set
 where step set w ps = let rev = T.reverse w
                       in if w < rev && S.member rev set
                          then (w, rev) : ps
                          else ps

main :: IO () main = TIO.interact (T.unlines . map anaShow . anadromes . longEnough . T.lines)

 where longEnough = filter ((> 6) . T.length)
       anaShow (x, y) = T.unwords [x, y]</lang>
Output:
amaroid diorama
anacara aracana
annabal labanna
artamus sumatra
colbert trebloc
degener reneged
deifier reified
delbert trebled
delevan naveled
deliver reviled
dessert tressed
desserts stressed
deviler relived
dioramas samaroid
eimmart trammie
emmeram maremme
gateman nametag
latimer remital
lattimer remittal
lessees seessel
leveler relevel
nicolaus sualocin
pat-pat tap-tap
redrawer rewarder
reknits stinker
relever reveler
reliver reviler
revotes setover
rotanev venator
roygbiv vibgyor
sallets stellas
sennits stinnes

J

Inspecting other entries here, it seems clear that we cannot ignore case. Otherwise, 'trebled' would be an anadrome.

Anyways, the basic approach here is to identify a canonical key for each word, look for paired keys and organize the words based on those keys:

<lang J>words=: cutLF fread 'words.txt' canon=: {.@/:~@(,:|.) each words akeys=: (~. #~ 2 = #/.~) canon tkeys=: (#~ 6 < #@>) akeys order=: /: canon pairs=: _2]\ (order{canon e. tkeys) # order { words</lang>

This gives us:

<lang J> pairs ┌────────┬────────┐ │amaroid │diorama │ ├────────┼────────┤ │degener │reneged │ ├────────┼────────┤ │deifier │reified │ ├────────┼────────┤ │deliver │reviled │ ├────────┼────────┤ │dessert │tressed │ ├────────┼────────┤ │desserts│stressed│ ├────────┼────────┤ │deviler │relived │ ├────────┼────────┤ │dioramas│samaroid│ ├────────┼────────┤ │gateman │nametag │ ├────────┼────────┤ │leveler │relevel │ ├────────┼────────┤ │pat-pat │tap-tap │ ├────────┼────────┤ │redrawer│rewarder│ ├────────┼────────┤ │reknits │stinker │ ├────────┼────────┤ │relever │reveler │ ├────────┼────────┤ │reliver │reviler │ ├────────┼────────┤ │revotes │setover │ ├────────┼────────┤ │sallets │stellas │ └────────┴────────┘</lang>

Julia

<lang ruby>function anadromes(minsize, csense = true, fname = "words.txt")

   words = Set(filter(w -> length(w) >= minsize, split((csense ? identity : lowercase)(read(fname, String)), r"\s+")))
   found = [(w, reverse(w)) for w in words if (r = reverse(w)) in words && w < r]
   println("Total $(length(found)) case $(csense ? "" : in)sensitive anadrome pairs found.")
   foreach(a -> println(a[1], " <=> ", a[2]), sort!(found))

end

anadromes(7) anadromes(7, false)

</lang>

Output:
Total 17 case sensitive anadrome pairs found.
amaroid <=> diorama
degener <=> reneged
deifier <=> reified
deliver <=> reviled
dessert <=> tressed
desserts <=> stressed
deviler <=> relived
dioramas <=> samaroid
gateman <=> nametag
leveler <=> relevel
pat-pat <=> tap-tap
redrawer <=> rewarder
reknits <=> stinker
relever <=> reveler
reliver <=> reviler
revotes <=> setover
sallets <=> stellas
Total 32 case insensitive anadrome pairs found.
amaroid <=> diorama
anacara <=> aracana
annabal <=> labanna
artamus <=> sumatra
colbert <=> trebloc
degener <=> reneged
deifier <=> reified
delbert <=> trebled
delevan <=> naveled
deliver <=> reviled
dessert <=> tressed
desserts <=> stressed
deviler <=> relived
dioramas <=> samaroid
eimmart <=> trammie
emmeram <=> maremme
gateman <=> nametag
latimer <=> remital
lattimer <=> remittal
lessees <=> seessel
leveler <=> relevel
nicolaus <=> sualocin
pat-pat <=> tap-tap
redrawer <=> rewarder
reknits <=> stinker
relever <=> reveler
reliver <=> reviler
revotes <=> setover
rotanev <=> venator
roygbiv <=> vibgyor
sallets <=> stellas
sennits <=> stinnes

Perl

<lang perl>use strict; use warnings;

my(%W,%A); for my $w ( grep { /[A-z\-]{7,}/ } split "\n", do { local( @ARGV, $/ ) = ( 'words.txt' ); <> } ) {

   my $r = reverse $w;
   if ($W{$r}) { $A{$r} = sprintf "%10s ↔ %s\n", $r, $w }
   else        { $W{$w} = $w                            }

}

print $A{$_} for sort keys %A;</lang>

Output:
   amaroid ↔ diorama
   degener ↔ reneged
   deifier ↔ reified
   deliver ↔ reviled
   dessert ↔ tressed
  desserts ↔ stressed
   deviler ↔ relived
  dioramas ↔ samaroid
   gateman ↔ nametag
   leveler ↔ relevel
   pat-pat ↔ tap-tap
  redrawer ↔ rewarder
   reknits ↔ stinker
   relever ↔ reveler
   reliver ↔ reviler
   revotes ↔ setover
   sallets ↔ stellas

Phix

with javascript_semantics
integer m = iff(platform()=JS?5:7)
sequence words = unique(unix_dict(m,`words.txt`))
function anadrome(string w)
    return w<reverse(w) and binary_search(reverse(w),words)>0
end function
sequence r = sort(filter(words,anadrome)),
         s = columnize({r,apply(r,reverse)}),
         t = join(s,"\n",fmt:="%8s <=> %-8s")
printf(1,"Found %d anadromes:\n%s\n",{length(r),t})
Output:

On the desktop:

Found 17 anadromes:
 amaroid <=> diorama
 degener <=> reneged
 deifier <=> reified
 deliver <=> reviled
 dessert <=> tressed
desserts <=> stressed
 deviler <=> relived
dioramas <=> samaroid
 gateman <=> nametag
 leveler <=> relevel
 pat-pat <=> tap-tap
redrawer <=> rewarder
 reknits <=> stinker
 relever <=> reveler
 reliver <=> reviler
 revotes <=> setover
 sallets <=> stellas

Under p2js, aka in a browser, from where it cannot realistically read disk files, unix_dict() ignores the filename parameter and uses the smaller unixdict.txt, as by now quite some time ago converted into a dirty great big Phix sequence constant and then transpiled into a JavaScript Array, for which we also drop the minimum length to 5, and get:

Found 5 anadromes:
   damon <=> nomad   
  kramer <=> remark  
   lager <=> regal   
   leper <=> repel   
   lever <=> revel   

Raku

<lang perl6>my @words = 'words.txt'.IO.slurp.words.grep: *.chars > 6;

my %words = @words.pairs.invert;

put join "\n", @words.map: { %words{$_}:delete and sprintf "%10s ↔ %s", $_, .flip if ($_ ne .flip) && %words{.flip} }</lang>

Output:
   amaroid ↔ diorama
   degener ↔ reneged
   deifier ↔ reified
   deliver ↔ reviled
   dessert ↔ tressed
  desserts ↔ stressed
   deviler ↔ relived
  dioramas ↔ samaroid
   gateman ↔ nametag
   leveler ↔ relevel
   pat-pat ↔ tap-tap
  redrawer ↔ rewarder
   reknits ↔ stinker
   relever ↔ reveler
   reliver ↔ reviler
   revotes ↔ setover
   sallets ↔ stellas

Seed7

<lang seed7>$ include "seed7_05.s7i";

const func boolean: binarySearch (in array string: haystack, in string: needle) is func

 result
   var boolean: result is FALSE;
 local
   var integer: low is 1;
   var integer: high is -1;
   var integer: middle is -1;
 begin
   high := length(haystack);
   while result = FALSE and low <= high do
     middle := low + (high - low) div 2;
     if needle < haystack[middle] then
       high := pred(middle);
     elsif needle > haystack[middle] then
       low := succ(middle);
     else
       result := TRUE;
     end if;
   end while;
 end func;

const proc: main is func

 local
   var file: dictionary is STD_NULL;
   var string: word is "";
   var string: reversed is "";
   var array string: words is (array string).value;
 begin
   dictionary := open("words.txt", "r");
   while not eof(dictionary) do
     readln(dictionary, word);
     words &:= word;
   end while;
   close(dictionary);
   words := sort(words);
   for word range words do
     reversed := reverse(word);
     if length(word) > 6 and word < reversed and binarySearch(words, reversed) then
       writeln(word <& " <-> " <& reversed);
     end if;
   end for;
 end func;</lang>
Output:
amaroid <-> diorama
degener <-> reneged
deifier <-> reified
deliver <-> reviled
dessert <-> tressed
desserts <-> stressed
deviler <-> relived
dioramas <-> samaroid
gateman <-> nametag
leveler <-> relevel
pat-pat <-> tap-tap
redrawer <-> rewarder
reknits <-> stinker
relever <-> reveler
reliver <-> reviler
revotes <-> setover
sallets <-> stellas

Sidef

<lang ruby>var S = Hash()

File('words.txt').open_r.each {|word|

   word.len > 6 || next
   if (S.has(word.flip)) {
       say "#{word.flip} <=> #{word}"
   }
   else {
       S{word} = true
   }

}</lang>

Output:
amaroid <=> diorama
gateman <=> nametag
deifier <=> reified
leveler <=> relevel
deviler <=> relived
degener <=> reneged
relever <=> reveler
deliver <=> reviled
reliver <=> reviler
redrawer <=> rewarder
dioramas <=> samaroid
revotes <=> setover
sallets <=> stellas
reknits <=> stinker
desserts <=> stressed
pat-pat <=> tap-tap
dessert <=> tressed

Wren

Library: Wren-sort
Library: Wren-fmt

<lang ecmascript>import "io" for File import "./sort" for Sort, Find import "./fmt" for Fmt

var wordList = "words.txt" // local copy var words = File.read(wordList)

               .trimEnd()
               .split("\n")
               .where { |word| word.count > 6 }
               .toList

Sort.quick(words) // need strict lexicographical order to use binary search var anadromes = [] for (word in words) {

   var word2 = word[-1..0]
   if (word != word2 && !anadromes.contains(word2) && Find.first(words, word2) >= 0) {
       anadromes.add(word)
   }

} System.print("The anadrome pairs with more than 6 letters are:") for (ana in anadromes) Fmt.print("$8s <-> $8s", ana, ana[-1..0])</lang>

Output:
The anadrome pairs with more than 6 letters are:
 amaroid <->  diorama
 degener <->  reneged
 deifier <->  reified
 deliver <->  reviled
 dessert <->  tressed
desserts <-> stressed
 deviler <->  relived
dioramas <-> samaroid
 gateman <->  nametag
 leveler <->  relevel
 pat-pat <->  tap-tap
redrawer <-> rewarder
 reknits <->  stinker
 relever <->  reveler
 reliver <->  reviler
 revotes <->  setover
 sallets <->  stellas