Anagrams/Deranged anagrams: Difference between revisions

From Rosetta Code
Content added Content deleted
(Drop the draft status.)
Line 1: Line 1:
{{draft task}}
{{task}}
Two or more words are said to be [[Anagrams|anagrams]] if they have the same characters, but in a different order. By analogy with [[Permutations/Derangements|derangements]] we define a deranged anagram as two words with the same characters, but in which the same character does ''not'' appear in the same position in both words.
Two or more words are said to be [[Anagrams|anagrams]] if they have the same characters, but in a different order. By analogy with [[Permutations/Derangements|derangements]] we define a deranged anagram as two words with the same characters, but in which the same character does ''not'' appear in the same position in both words.



Revision as of 14:46, 25 May 2011

Task
Anagrams/Deranged anagrams
You are encouraged to solve this task according to the task description, using any language you may know.

Two or more words are said to be anagrams if they have the same characters, but in a different order. By analogy with derangements we define a deranged anagram as two words with the same characters, but in which the same character does not appear in the same position in both words.

The task is to use the word list at http://www.puzzlers.org/pub/wordlists/unixdict.txt to find and show the longest deranged anagram.

Cf.

D

<lang d>import std.stdio, std.file, std.string, std.algorithm,

    std.typecons, std.range;

auto findAnagrams(string[] words) {

 string[][string] anagrams;
 foreach (w; words)
   anagrams[w.sort] ~= w.idup;
 return array(filter!q{a.length > 1}(anagrams.values));

}

/// returns pairs of words that have no character in the same position Tuple!(string,string)[] findDeranged(string[] words) {

 typeof(return) result;
 foreach (i, w1; words)
   foreach (w2; words[i+1..$])
     if (count!q{a[0] != a[1]}(zip(w1, w2)) == w1.length)
       result ~= tuple(w1, w2);
 return result;

}

Tuple!(string,string)[] largestDerangedAna(string[][] anagrams) {

 sort!q{a[0].length > b[0].length}(anagrams);
 foreach (words; anagrams) {
   auto derangedPairs = findDeranged(words);
   if (derangedPairs)
     return derangedPairs;
 }
 return [];

}

void main() {

 auto anags = findAnagrams(split(cast(string)read("unixdict.txt")));
 writeln("Longest anagrams with no characters in the same position:");
 foreach (pairs; largestDerangedAna(anags))
   writefln("  %s, %s", pairs.tupleof);

}</lang> Output:

Longest anagrams with no characters in the same position:
  excitation, intoxicate

J

This assumes that unixdict.txt has been saved in the current directory.

<lang j> #words=: <;._2 ] 1!:1 <'unixdict.txt' 25104

  #anagrams=: (#~ 1 < #@>) (</.~ /:~&>) words

1303

  #maybederanged=: (#~ (1 -.@e. #@~."1)@|:@:>&>) anagrams

432

  #longest=: (#~ [: (= >./) #@>@{.@>) maybederanged

1

  longest

┌───────────────────────┐ │┌──────────┬──────────┐│ ││excitation│intoxicate││ │└──────────┴──────────┘│ └───────────────────────┘</lang>

Note that anagram sets with more than two members might, hypothetically, have made things more complicated. By lucky coincidence, this was not an issue. We could have taken advantage of that coincidence to achieve slight further simplifications. Perhaps maybederanged=: (#~ (-: ~."1)@|:@:>&>) anagrams

Python

<lang python>import urllib.request from collections import defaultdict from itertools import combinations

def getwords(url='http://www.puzzlers.org/pub/wordlists/unixdict.txt'):

   return list(set(urllib.request.urlopen(url).read().decode().split()))

def find_anagrams(words):

   anagram = defaultdict(list) # map sorted chars to anagrams
   for word in words:
       anagram[tuple(sorted(word))].append( word )
   return dict((key, words) for key, words in anagram.items()
               if len(words) > 1)

def is_deranged(words):

   'returns pairs of words that have no character in the same position'
   return [ (word1, word2)
            for word1,word2 in combinations(words, 2)
            if all(ch1 != ch2 for ch1, ch2 in zip(word1, word2)) ]

def largest_deranged_ana(anagrams):

   ordered_anagrams = sorted(anagrams.items(),
                             key=lambda x:(-len(x[0]), x[0]))
   for _, words in ordered_anagrams:
       deranged_pairs = is_deranged(words)
       if deranged_pairs:
           return deranged_pairs
   return []

if __name__ == '__main__':

   words = getwords('http://www.puzzlers.org/pub/wordlists/unixdict.txt')
   print("Word count:", len(words))
   anagrams = find_anagrams(words)
   print("Anagram count:", len(anagrams),"\n")
   print("Longest anagrams with no characters in the same position:")
   print('  ' + '\n  '.join(', '.join(pairs)
                            for pairs in largest_deranged_ana(anagrams)))</lang>
Sample output
Word count: 25104
Anagram count: 1303 

Longest anagrams with no characters in the same position:
  excitation, intoxicate

Tcl

<lang tcl>package require Tcl 8.5 package require http

  1. Fetch the words

set t [http::geturl "http://www.puzzlers.org/pub/wordlists/unixdict.txt"] set wordlist [split [http::data $t] \n] http::cleanup $t

  1. Group by characters in word

foreach word $wordlist {

   dict lappend w [lsort [split $word ""]] [split $word ""]

}

  1. Deranged test

proc deranged? {l1 l2} {

   foreach c1 $l1 c2 $l2 {

if {$c1 eq $c2} {return 0}

   }
   return 1

}

  1. Get a deranged pair from an anagram set, if one exists

proc getDeranged {words} {

   foreach l1 [lrange $words 0 end-1] {

foreach l2 [lrange $words 1 end] { if {[deranged? $l1 $l2]} { return [list $l1 $l2 1] } }

   }
   return {{} {} 0}

}

  1. Find the max-length deranged anagram

set count 0 set candidates {} set max 0 dict for {k words} $w {

   incr count [expr {[llength $words] > 1}]
   if {[llength $k] > $max && [lassign [getDeranged $words] l1 l2]} {

set max [llength $l1] lappend candidates [join $l1 ""],[join $l2 ""]

   }

}

  1. Print out what we found

puts "[llength $wordlist] words" puts "[dict size $w] potential anagram-groups" puts "$count real anagram-groups" foreach pair $candidates {

   puts "considered candidate pairing: $pair"

} puts "MAXIMAL DERANGED ANAGRAM: LENGTH $max\n\t[lindex $candidates end]"</lang> Output:

25105 words
23567 potential anagram-groups
1303 real anagram-groups
considered candidate pairing: abc,cab
considered candidate pairing: abed,bade
considered candidate pairing: abort,bator
considered candidate pairing: afresh,shafer
considered candidate pairing: alberto,latrobe
considered candidate pairing: american,cinerama
considered candidate pairing: ancestral,lancaster
considered candidate pairing: excitation,intoxicate
MAXIMAL DERANGED ANAGRAM: LENGTH 10
	excitation,intoxicate