Anagrams/Deranged anagrams

From Rosetta Code
Revision as of 11:32, 14 May 2011 by rosettacode>Paddy3118 (New task and Python solution.)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Anagrams/Deranged anagrams is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

Two or more words are said to be anagrams if they have the same characters, but in a different order. By analogy with derangements we define a deranged anagram as two words with the same characters, but in which the same character does not appear in the same position in both words.

The task is to use the word list at http://www.puzzlers.org/pub/wordlists/unixdict.txt to find and show the longest deranged anagram.

Python

<lang python>import urllib.request from collections import defaultdict from itertools import combinations

def getwords(url='http://www.puzzlers.org/pub/wordlists/unixdict.txt'):

   return list(set(urllib.request.urlopen(url).read().decode().split()))

def find_anagrams(words):

   anagram = defaultdict(list) # map sorted chars to anagrams
   for word in words:
       anagram[tuple(sorted(word))].append( word )
   return dict((key, words) for key, words in anagram.items()
               if len(words) > 1)

def is_deranged(words):

   'returns pairs of words that have no character in the same position'
   return [ (word1, word2)
            for word1,word2 in combinations(words, 2)
            if all(ch1 != ch2 for ch1, ch2 in zip(word1, word2)) ]

def largest_deranged_ana(anagrams):

   ordered_anagrams = sorted(anagrams.items(),
                             key=lambda x:(-len(x[0]), x[0]))
   for _, words in ordered_anagrams:
       deranged_pairs = is_deranged(words)
       if deranged_pairs:
           return deranged_pairs
   return []

if __name__ == '__main__':

   words = getwords('http://www.puzzlers.org/pub/wordlists/unixdict.txt')
   print("Word count:", len(words))
   anagrams = find_anagrams(words)
   print("Anagram count:", len(anagrams),"\n")
   print("Longest anagrams with no characters in the same position:")
   print('  ' + '\n  '.join(', '.join(pairs)
                            for pairs in largest_deranged_ana(anagrams)))</lang>
Sample output
Word count: 25104
Anagram count: 1303 

Longest anagrams with no characters in the same position:
  excitation, intoxicate