Anagrams/Deranged anagrams: Difference between revisions
m (→{{header|J}}) |
(Drop the draft status.) |
||
Line 1: | Line 1: | ||
{{ |
{{task}} |
||
Two or more words are said to be [[Anagrams|anagrams]] if they have the same characters, but in a different order. By analogy with [[Permutations/Derangements|derangements]] we define a deranged anagram as two words with the same characters, but in which the same character does ''not'' appear in the same position in both words. |
Two or more words are said to be [[Anagrams|anagrams]] if they have the same characters, but in a different order. By analogy with [[Permutations/Derangements|derangements]] we define a deranged anagram as two words with the same characters, but in which the same character does ''not'' appear in the same position in both words. |
||
Revision as of 14:46, 25 May 2011
You are encouraged to solve this task according to the task description, using any language you may know.
Two or more words are said to be anagrams if they have the same characters, but in a different order. By analogy with derangements we define a deranged anagram as two words with the same characters, but in which the same character does not appear in the same position in both words.
The task is to use the word list at http://www.puzzlers.org/pub/wordlists/unixdict.txt to find and show the longest deranged anagram.
- Cf.
D
<lang d>import std.stdio, std.file, std.string, std.algorithm,
std.typecons, std.range;
auto findAnagrams(string[] words) {
string[][string] anagrams; foreach (w; words) anagrams[w.sort] ~= w.idup; return array(filter!q{a.length > 1}(anagrams.values));
}
/// returns pairs of words that have no character in the same position Tuple!(string,string)[] findDeranged(string[] words) {
typeof(return) result; foreach (i, w1; words) foreach (w2; words[i+1..$]) if (count!q{a[0] != a[1]}(zip(w1, w2)) == w1.length) result ~= tuple(w1, w2); return result;
}
Tuple!(string,string)[] largestDerangedAna(string[][] anagrams) {
sort!q{a[0].length > b[0].length}(anagrams); foreach (words; anagrams) { auto derangedPairs = findDeranged(words); if (derangedPairs) return derangedPairs; } return [];
}
void main() {
auto anags = findAnagrams(split(cast(string)read("unixdict.txt"))); writeln("Longest anagrams with no characters in the same position:"); foreach (pairs; largestDerangedAna(anags)) writefln(" %s, %s", pairs.tupleof);
}</lang> Output:
Longest anagrams with no characters in the same position: excitation, intoxicate
J
This assumes that unixdict.txt has been saved in the current directory.
<lang j> #words=: <;._2 ] 1!:1 <'unixdict.txt' 25104
#anagrams=: (#~ 1 < #@>) (</.~ /:~&>) words
1303
#maybederanged=: (#~ (1 -.@e. #@~."1)@|:@:>&>) anagrams
432
#longest=: (#~ [: (= >./) #@>@{.@>) maybederanged
1
longest
┌───────────────────────┐ │┌──────────┬──────────┐│ ││excitation│intoxicate││ │└──────────┴──────────┘│ └───────────────────────┘</lang>
Note that anagram sets with more than two members might, hypothetically, have made things more complicated. By lucky coincidence, this was not an issue. We could have taken advantage of that coincidence to achieve slight further simplifications. Perhaps maybederanged=: (#~ (-: ~."1)@|:@:>&>) anagrams
Python
<lang python>import urllib.request from collections import defaultdict from itertools import combinations
def getwords(url='http://www.puzzlers.org/pub/wordlists/unixdict.txt'):
return list(set(urllib.request.urlopen(url).read().decode().split()))
def find_anagrams(words):
anagram = defaultdict(list) # map sorted chars to anagrams for word in words: anagram[tuple(sorted(word))].append( word ) return dict((key, words) for key, words in anagram.items() if len(words) > 1)
def is_deranged(words):
'returns pairs of words that have no character in the same position' return [ (word1, word2) for word1,word2 in combinations(words, 2) if all(ch1 != ch2 for ch1, ch2 in zip(word1, word2)) ]
def largest_deranged_ana(anagrams):
ordered_anagrams = sorted(anagrams.items(), key=lambda x:(-len(x[0]), x[0])) for _, words in ordered_anagrams: deranged_pairs = is_deranged(words) if deranged_pairs: return deranged_pairs return []
if __name__ == '__main__':
words = getwords('http://www.puzzlers.org/pub/wordlists/unixdict.txt') print("Word count:", len(words))
anagrams = find_anagrams(words) print("Anagram count:", len(anagrams),"\n")
print("Longest anagrams with no characters in the same position:") print(' ' + '\n '.join(', '.join(pairs) for pairs in largest_deranged_ana(anagrams)))</lang>
- Sample output
Word count: 25104 Anagram count: 1303 Longest anagrams with no characters in the same position: excitation, intoxicate
Tcl
<lang tcl>package require Tcl 8.5 package require http
- Fetch the words
set t [http::geturl "http://www.puzzlers.org/pub/wordlists/unixdict.txt"] set wordlist [split [http::data $t] \n] http::cleanup $t
- Group by characters in word
foreach word $wordlist {
dict lappend w [lsort [split $word ""]] [split $word ""]
}
- Deranged test
proc deranged? {l1 l2} {
foreach c1 $l1 c2 $l2 {
if {$c1 eq $c2} {return 0}
} return 1
}
- Get a deranged pair from an anagram set, if one exists
proc getDeranged {words} {
foreach l1 [lrange $words 0 end-1] {
foreach l2 [lrange $words 1 end] { if {[deranged? $l1 $l2]} { return [list $l1 $l2 1] } }
} return {{} {} 0}
}
- Find the max-length deranged anagram
set count 0 set candidates {} set max 0 dict for {k words} $w {
incr count [expr {[llength $words] > 1}] if {[llength $k] > $max && [lassign [getDeranged $words] l1 l2]} {
set max [llength $l1] lappend candidates [join $l1 ""],[join $l2 ""]
}
}
- Print out what we found
puts "[llength $wordlist] words" puts "[dict size $w] potential anagram-groups" puts "$count real anagram-groups" foreach pair $candidates {
puts "considered candidate pairing: $pair"
} puts "MAXIMAL DERANGED ANAGRAM: LENGTH $max\n\t[lindex $candidates end]"</lang> Output:
25105 words 23567 potential anagram-groups 1303 real anagram-groups considered candidate pairing: abc,cab considered candidate pairing: abed,bade considered candidate pairing: abort,bator considered candidate pairing: afresh,shafer considered candidate pairing: alberto,latrobe considered candidate pairing: american,cinerama considered candidate pairing: ancestral,lancaster considered candidate pairing: excitation,intoxicate MAXIMAL DERANGED ANAGRAM: LENGTH 10 excitation,intoxicate