Isograms and heterograms: Difference between revisions

Revision as of 12:02, 11 December 2023

19,446 bytes added , 5 months ago

m

→‎{{header|Wren}}: Changed to Wren S/H

PureFox

Revision as of 18:46, 30 August 2022: Wherrera, m (→{{header|Julia}})
Latest revision as of 12:02, 11 December 2023: PureFox, m (→{{header|Wren}}: Changed to Wren S/H)
(12 intermediate revisions by 9 users not shown)
Line 1:
{{task}}
[[Category: String manipulation]]
[[Category:Strings]]

Line 53:
    # file opened OK #
    BOOL at eof := FALSE;
    # set the EOF handler for the file - notes eof has been reached and #
    # returns TRUE so processing can continue #
    on logical file end( input file, ( REF FILE f )BOOL: at eof := TRUE );

    # in-place quick sort an array of strings #

Line 129 ⟶ 124:
    [ 1 : 2 000 ]STRING words;
    INT w count := 0;
    WHILE
        STRING word;
        get( input file, ( word, newline ) );
        NOT at eof
    DO
        # have another word #
        INT order = ORDER word;
        IF order > 0 THEN
            INT w length = LENGTH word;
            IF ( order = 1 AND w length > 10 ) OR order > 1 THEN
                # a long heterogram or an isogram #
                # store the word prefixed by the max abs char complement of #
                # the order and the length so when sorted, the words are #
                # ordered as required by the task #
                STRING s word = REPR ( max abs char - order )
                              + REPR ( max abs char - w length )
                              + word;
                words[ w count +:= 1 ] := s word
            FI
        FI

Line 167 ⟶ 161:
    print( ( newline, "heterograms longer than 10 characters" ) )
    ELSE
    print( ( newline, "isograms of order ", whole( order, 0 ) ) ) )
    FI;
    prev order := order;

Line 205 ⟶ 198:
sulfonamide switchblade switchboard switzerland thunderclap valedictory voluntarism
</pre>

=={{header|AppleScript}}==
<syntaxhighlight lang="applescript">use AppleScript version "2.3.1" -- Mac OS X 10.9 (Mavericks) or later.
use sorter : script ¬
    "Custom Iterative Ternary Merge Sort" -- <https://www.macscripter.net/t/timsort-and-nigsort/71383/3>
use scripting additions

-- Return the n number of an n-isogram or 0 for a non-isogram.
on isogramicity(wrd)
    set chrCount to (count wrd)
    if (chrCount < 2) then return chrCount
    set chrs to wrd's characters
    tell sorter to sort(chrs, 1, chrCount, {})
    set i to 1
    set currentChr to chrs's beginning
    repeat with j from 2 to chrCount
        set testChr to chrs's item j
        if (testChr ≠ currentChr) then
            if (i = 1) then
                set n to j - i -- First character's instance count.
            else if (j - i ≠ n) then
                return 0 -- Instance count mismatch.
            end if
            set i to j
            set currentChr to testChr
        end if
    end repeat
    if (i = 1) then return chrCount -- All characters the same.
    if (chrCount - i + 1 ≠ n) then return 0 -- Mismatch with last character.
    return n
end isogramicity

on task()
    script o
        property wrds : paragraphs of ¬
            (read file ((path to desktop as text) & "unixdict.txt") as «class utf8»)
        property isograms : {{}, {}, {}, {}, {}} -- Allow for up to 5-isograms.

        -- Sort customisation handler to order the words as required.
        on isGreater(a, b)
            set ca to (count a)
            set cb to (count b)
            if (ca = cb) then return (a > b)
            return (ca < cb)
        end isGreater
    end script

    ignoring case -- A mere formality. It's the default and unixdict.txt is single-cased anyway!
        repeat with i from 1 to (count o's wrds)
            set thisWord to o's wrds's item i
            set n to isogramicity(thisWord)
            if (n > 0) then set end of o's isograms's item n to thisWord
        end repeat
        repeat with thisList in o's isograms
            tell sorter to sort(thisList, 1, -1, {comparer:o})
        end repeat
    end ignoring

    set output to {"N-isograms where n > 1:"}
    set n_isograms to {}
    repeat with i from (count o's isograms) to 2 by -1
        set n_isograms to n_isograms & o's isograms's item i
    end repeat
    set wpl to 6 -- Words per line.
    repeat with i from 1 to (count n_isograms)
        set n_isograms's item i to text 1 thru 10 of ((n_isograms's item i) & "          ")
        set wtg to i mod wpl -- Words to go to in this line.
        if (wtg = 0) then set end of output to join(n_isograms's items (i - wpl + 1) thru i, "")
    end repeat
    if (wtg > 0) then set end of output to join(n_isograms's items -wtg thru i, "")

    set end of output to linefeed & "Heterograms with more than 10 characters:"
    set n_isograms to o's isograms's beginning
    set wpl to 4
    repeat with i from 1 to (count n_isograms)
        set thisWord to n_isograms's item i
        if ((count thisWord) < 11) then exit repeat
        set n_isograms's item i to text 1 thru 15 of (thisWord & "          ")
        set wtg to i mod wpl
        if (wtg = 0) then set end of output to join(n_isograms's items (i - wpl + 1) thru i, "")
    end repeat
    if (wtg > 0) then set end of output to join(n_isograms's items (i - wtg) thru (i - 1), "")

    return join(output, linefeed)
end task

on join(lst, delim)
    set astid to AppleScript's text item delimiters
    set AppleScript's text item delimiters to delim
    set txt to lst as text
    set AppleScript's text item delimiters to astid
    return txt
end join

task()</syntaxhighlight>

{{output}}
<syntaxhighlight lang="applescript">"N-isograms where n > 1:
aaa iii beriberi bilabial caucasus couscous teammate appall emmett hannah murmur tartar testes anna coco dada deed dodo gogo isis juju lulu mimi noon otto papa peep poop teet tete toot tutu ii

Heterograms with more than 10 characters:
ambidextrous bluestocking exclusionary incomputable lexicography loudspeaking malnourished atmospheric blameworthy centrifugal christendom consumptive countervail countryside countrywide disturbance documentary earthmoving exculpatory geophysical inscrutable misanthrope problematic selfadjoint stenography sulfonamide switchblade switchboard switzerland thunderclap valedictory voluntarism "</syntaxhighlight>

=={{Header|AutoHotkey}}==
<syntaxhighlight lang="autohotkey">
LenOrder(lista)
{
    loop,parse,lista,%A_Space%
        if (StrLen(A_LoopField) > MaxLen)
            MaxLen := StrLen(A_LoopField)
    loop % MaxLen-1
    {
        loop,parse,lista,%A_Space%
            if (StrLen(A_LoopField) = MaxLen)
                devolve .= A_LoopField . " "
        MaxLen -= 1
    }
    return devolve
}

loop,read,unixdict.txt
{
    encounters := 0, started := false
    loop % StrLen(A_LoopReadLine)
    {
        target := strreplace(A_LoopReadLine,SubStr(A_LoopReadLine,a_index,1),,xt)
        if !started
        {
            started := true
            encounters := xt
        }
        if (xt<>encounters)
        {
            encounters := 0
            continue
        }
        target := A_LoopReadLine
    }
    if (encounters = 1) and (StrLen(target) > 10)
        heterograms .= target " "
    else if (encounters > 1)
        isograms%encounters% .= target " "
}
Loop
{
    if (A_Index = 1)
        continue
    if !isograms%A_Index%
        break
    isograms := LenOrder(isograms%A_Index%) . isograms
}
msgbox % isograms
msgbox % LenOrder(heterograms)
ExitApp
return

~Esc::
    ExitApp
</syntaxhighlight>
{{Out}}
<pre>---------------------------
Isograms and Heterograms.ahk
---------------------------
aaa iii beriberi bilabial caucasus couscous teammate appall emmett hannah murmur tartar testes anna coco dada deed dodo gogo isis juju lulu mimi noon otto papa peep poop teet tete toot tutu ii
---------------------------
ambidextrous bluestocking exclusionary incomputable lexicography loudspeaking malnourished atmospheric blameworthy centrifugal christendom consumptive countervail countryside countrywide disturbance documentary earthmoving exculpatory geophysical inscrutable misanthrope problematic selfadjoint stenography sulfonamide switchblade switchboard switzerland thunderclap valedictory voluntarism
---------------------------</pre>

=={{Header|C++}}==
<syntaxhighlight lang="c++">
#include <algorithm>
#include <cctype>
#include <cstdint>
#include <fstream>
#include <iostream>
#include <set>
#include <string>
#include <unordered_map>

struct Isogram_pair {
    std::string word;
    int32_t value;
};

std::string to_lower_case(const std::string& text) {
    std::string result = text;
    std::transform(result.begin(), result.end(), result.begin(),
        [](char ch){ return std::tolower(ch); });
    return result;
}

int32_t isogram_value(const std::string& word) {
    std::unordered_map<char, int32_t> char_counts;
    for ( const char& ch : word ) {
        if ( char_counts.find(ch) == char_counts.end() ) {
            char_counts.emplace(ch, 1);
        } else {
            char_counts[ch]++;
        }
    }

    const int32_t count = char_counts[word[0]];
    const bool identical = std::all_of(char_counts.begin(), char_counts.end(),
        [count](const std::pair<char, int32_t> pair){ return pair.second == count; });

    return identical ? count : 0;
}

int main() {
    auto compare = [](Isogram_pair a, Isogram_pair b) {
        return ( a.value == b.value ) ?
            ( ( a.word.length() == b.word.length() ) ? a.word < b.word : a.word.length() > b.word.length() )
            : a.value > b.value;
    };
    std::set<Isogram_pair, decltype(compare)> isograms;

    std::fstream file_stream;
    file_stream.open("../unixdict.txt");
    std::string word;
    while ( file_stream >> word ) {
        const int32_t value = isogram_value(to_lower_case(word));
        if ( value > 1 || ( word.length() > 10 && value == 1 ) ) {
            isograms.insert(Isogram_pair(word, value));
        }
    }

    std::cout << "n-isograms with n > 1:" << std::endl;
    for ( const Isogram_pair& isogram_pair : isograms ) {
        if ( isogram_pair.value > 1 ) {
            std::cout << isogram_pair.word << std::endl;
        }
    }

    std::cout << "\n" << "Heterograms with more than 10 letters:" << std::endl;
    for ( const Isogram_pair& isogram_pair : isograms ) {
        if ( isogram_pair.value == 1 ) {
            std::cout << isogram_pair.word << std::endl;
        }
    }
}
</syntaxhighlight>
{{ out }}
<pre>
n-isograms with n > 1:
aaa iii beriberi bilabial caucasus couscous teammate appall emmett hannah murmur tartar testes anna coco dada deed dodo gogo isis juju lulu mimi noon otto papa peep poop teet tete toot tutu ii

Heterograms with more than 10 letters:
ambidextrous bluestocking exclusionary incomputable lexicography loudspeaking malnourished atmospheric blameworthy centrifugal christendom consumptive countervail countryside countrywide disturbance documentary earthmoving exculpatory geophysical inscrutable misanthrope problematic selfadjoint stenography sulfonamide switchblade switchboard switzerland thunderclap valedictory voluntarism
</pre>

Line 384 ⟶ 702:
valedictory voluntarism
</syntaxhighlight>

=={{header|Java}}==
<syntaxhighlight lang="java">
import java.io.IOException;
import java.nio.file.Path;
import java.util.AbstractSet;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;
import java.util.TreeSet;

public final class IsogramsAndHeterograms {

    public static void main(String[] aArgs) throws IOException {
        AbstractSet<IsogramPair> isograms = new TreeSet<IsogramPair>(comparatorIsogram);

        Scanner scanner = new Scanner(Path.of("unixdict.txt"));
        while ( scanner.hasNext() ) {
            String word = scanner.next().toLowerCase();
            final int value = isogramValue(word);
            if ( value > 1 || ( word.length() > 10 && value == 1 ) ) {
                isograms.add( new IsogramPair(word, value) );
            }
        }
        scanner.close();

        System.out.println("n-isograms with n > 1:");
        isograms.stream().filter( pair -> pair.aValue > 1 ).map( pair -> pair.aWord ).forEach(System.out::println);

        System.out.println(System.lineSeparator() + "Heterograms with more than 10 letters:");
        isograms.stream().filter( pair -> pair.aValue == 1 ).map( pair -> pair.aWord ).forEach(System.out::println);
    }

    private static int isogramValue(String aWord) {
        Map<Character, Integer> charCounts = new HashMap<Character, Integer>();
        for ( char ch : aWord.toCharArray() ) {
            charCounts.merge(ch, 1, Integer::sum);
        }

        final int count = charCounts.get(aWord.charAt(0));
        final boolean identical = charCounts.values().stream().allMatch( i -> i == count );

        return identical ? count : 0;
    }

    private static Comparator<IsogramPair> comparatorIsogram =
        Comparator.comparing(IsogramPair::aValue, Comparator.reverseOrder())
        .thenComparing(IsogramPair::getWordLength, Comparator.reverseOrder())
        .thenComparing(IsogramPair::aWord, Comparator.naturalOrder());

    private record IsogramPair(String aWord, int aValue) {

        private int getWordLength() {
            return aWord.length();
        }

    };

}
</syntaxhighlight>
{{ out }}
<pre>
n-isograms with n > 1:
aaa iii beriberi bilabial caucasus couscous teammate appall emmett hannah murmur tartar testes anna coco dada deed dodo gogo isis juju lulu mimi noon otto papa peep poop teet tete toot tutu ii

Heterograms with more than 10 letters:
ambidextrous bluestocking exclusionary incomputable lexicography loudspeaking malnourished atmospheric blameworthy centrifugal christendom consumptive countervail countryside countrywide disturbance documentary earthmoving exculpatory geophysical inscrutable misanthrope problematic selfadjoint stenography sulfonamide switchblade switchboard switzerland thunderclap valedictory voluntarism
</pre>

=={{header|jq}}==
This entry assumes that the external file of words does not contain duplicates.
<syntaxhighlight lang=jq>
# bag of words
def bow(stream):
  reduce stream as $word ({}; .[($word|tostring)] += 1);

# If the input string is an n-isogram then return n, otherwise 0:
def isogram:
  bow(ascii_downcase|explode[]|[.]|implode)
  | .[keys_unsorted[0]] as $n
  | if all(.[]; . == $n) then $n else 0 end ;

# Read the word list (inputs) and record the n-isogram value.
# Output: an array of [word, n] values
def words:
  [inputs
   | select(test("^[A-Za-z]+$"))
   | sub("^ +";"") | sub(" +$";"")
   | [., isogram] ];

# Input: an array of [word, n] values
# Sort by decreasing order of n;
# Then by decreasing order of word length;
# Then by ascending lexicographic order
def isograms:
  map( select( .[1] > 1) )
  | sort_by( .[0])
  | sort_by( - (.[0]|length))
  | sort_by( - .[1]);

# Input: an array of [word, n] values
# Sort as for isograms
def heterograms($minlength):
  map(select (.[1] == 1 and (.[0]|length) >= $minlength))
  | sort_by( .[0])
  | sort_by( - (.[0]|length));

words
| (isograms
   | "List of the \(length) n-isograms for which n > 1:",
     foreach .[] as [$word, $n] ({};
        .header = if $n != .group then "\nisograms of order \($n)" else null end
        | .group = $n;
        (.header | select(.)), $word ) ) ,
  (heterograms(11)
   | "\nList of the \(length) heterograms with length > 10:", .[][0])
</syntaxhighlight>

'''Invocation'''
<pre>
< unixdict.txt jq -Rrn -f isograms-and-heterograms.jq
</pre>
{{output}}
<pre>
List of the 33 n-isograms for which n > 1:

isograms of order 3
aaa iii

isograms of order 2
beriberi bilabial caucasus couscous teammate appall emmett hannah murmur tartar testes anna coco dada deed dodo gogo isis juju lulu mimi noon otto papa peep poop teet tete toot tutu ii

List of the 32 heterograms with length > 10:
ambidextrous bluestocking exclusionary incomputable lexicography loudspeaking malnourished atmospheric blameworthy centrifugal christendom consumptive countervail countryside countrywide disturbance documentary earthmoving exculpatory geophysical inscrutable misanthrope problematic selfadjoint stenography sulfonamide switchblade switchboard switzerland thunderclap valedictory voluntarism
</pre>

=={{header|Julia}}==

Line 478 ⟶ 1,057:
valedictory 11
voluntarism 11
</pre>

=={{header|Nim}}==
<syntaxhighlight lang="Nim">import std/[algorithm, strutils, tables]

type Item = tuple[word: string; n: int]

func isogramCount(word: string): Natural =
  ## Check if the word is an isogram and return the number
  ## of times each character is present. Return 1 for
  ## heterograms. Return 0 if the word is neither an isogram
  ## nor a heterogram.
  let counts = word.toCountTable
  result = 0
  for count in counts.values:
    if result == 0:
      result = count
    elif count != result:
      return 0

proc cmp1(item1, item2: Item): int =
  ## Comparison function for part 1.
  result = cmp(item2.n, item1.n)
  if result == 0:
    result = cmp(item2.word.len, item1.word.len)
    if result == 0:
      result = cmp(item1.word, item2.word)

proc cmp2(item1, item2: Item): int =
  ## Comparison function for part 2.
  result = cmp(item1.n, item2.n)
  if result == 0:
    result = cmp(item2.word.len, item1.word.len)
    if result == 0:
      result = cmp(item1.word, item2.word)

var isograms: seq[Item]

for line in lines("unixdict.txt"):
  let word = line.toLower
  let count = word.isogramCount
  if count != 0:
    isograms.add (word, count)

echo "N-isograms where N > 1:"
isograms.sort(cmp1)
var idx = 0
for item in isograms:
  if item.n == 1: break
  inc idx
  stdout.write item.word.alignLeft(12)
  if idx mod 6 == 0: stdout.write '\n'
echo()

echo "\nHeterograms with more than 10 characters:"
isograms.sort(cmp2)
idx = 0
for item in isograms:
  if item.n != 1: break
  if item.word.len > 10:
    inc idx
    stdout.write item.word.alignLeft(16)
    if idx mod 4 == 0: stdout.write '\n'
echo()
</syntaxhighlight>
{{out}}
<pre>N-isograms where N > 1:
aaa iii beriberi bilabial caucasus couscous teammate appall emmett hannah murmur tartar testes anna coco dada deed dodo gogo isis juju lulu mimi noon otto papa peep poop teet tete toot tutu ii

Heterograms with more than 10 characters:
ambidextrous bluestocking exclusionary incomputable lexicography loudspeaking malnourished atmospheric blameworthy centrifugal christendom consumptive countervail countryside countrywide disturbance documentary earthmoving exculpatory geophysical inscrutable misanthrope problematic selfadjoint stenography sulfonamide switchblade switchboard switzerland thunderclap valedictory voluntarism
</pre>

Line 666 ⟶ 1,330:
valedictory 1 11
voluntarism 1 11
</pre>

=={{header|Quackery}}==
<syntaxhighlight lang="Quackery">  [ [] ]'[ rot
    witheach
      [ dup nested
        unrot over do
        iff [ dip join ]
        else nip ]
    drop ]                        is filter  ( [ --> [ )

  [ 0 127 of
    swap witheach
      [ upper 2dup peek
        1+ unrot poke ]
    [] swap witheach
      [ dup iff join else drop ]
    dup [] = iff
      [ drop 0 ] done
    behead swap witheach
      [ over != if
          [ drop 0 conclude ] ] ] is isogram ( [ --> n )

  $ "rosetta/unixdict.txt" sharefile drop nest$
  dup
  filter [ isogram 1 > ]
  sort$
  sortwith [ size dip size < ]
  sortwith [ isogram dip isogram < ]
  60 wrap$
  cr
  filter [ size 10 > ]
  filter [ isogram 1 = ]
  sort$
  sortwith [ size dip size < ]
  60 wrap$
  cr</syntaxhighlight>
{{out}}
<pre>aaa iii beriberi bilabial caucasus couscous teammate appall emmett hannah murmur tartar testes anna coco dada deed dodo gogo isis juju lulu mimi noon otto papa peep poop teet tete toot tutu ii

ambidextrous bluestocking exclusionary incomputable lexicography loudspeaking malnourished atmospheric blameworthy centrifugal christendom consumptive countervail countryside countrywide disturbance documentary earthmoving exculpatory geophysical inscrutable misanthrope problematic selfadjoint stenography sulfonamide switchblade switchboard switzerland thunderclap valedictory voluntarism
</pre>

Line 754 ⟶ 1,471:
valedictory voluntarism</pre>

=={{header|Ruby}}==
Blameworthy exclusionary lexicography causes unixdict.txt to make it incomputable if the word isogram is itself an isogram.
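The wordplay in the sentence above can be checked directly. The following is only a minimal sketch: `isogram_order` is a hypothetical helper introduced here for illustration (it is not part of the entry's solution), and it assumes Ruby 2.7 or later for `Enumerable#tally`.

```ruby
# Hypothetical helper, for illustration only: returns n when every
# character of the word occurs exactly n times, otherwise 0.
def isogram_order(word)
  counts = word.downcase.chars.tally.values.uniq
  counts.size == 1 ? counts.first : 0
end

# The heterograms used in the sentence above, plus the word "isogram".
%w[blameworthy exclusionary lexicography incomputable isogram].each do |w|
  puts "#{w}: #{isogram_order(w)}"
end
```

Each of the five words should report order 1, meaning that "isogram" is itself a heterogram, although at seven letters it is too short for the task's list.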
<syntaxhighlight lang="ruby">words = File.readlines("unixdict.txt", chomp: true)

isograms = words.group_by do |word|
  char_counts = word.downcase.chars.tally.values
  char_counts.first if char_counts.uniq.size == 1
end
isograms.delete(nil)
isograms.transform_values!{|ar| ar.sort_by{|word| [-word.size, word]} }

keys = isograms.keys.sort.reverse
keys.each{|k| puts "(#{isograms[k].size}) #{k}-isograms: #{isograms[k]} " if k > 1 }

min_chars = 10
large_heterograms = isograms[1].select{|word| word.size > min_chars }
puts "" , "(#{large_heterograms.size}) heterograms with more than #{min_chars} chars:"
puts large_heterograms
</syntaxhighlight>
{{out}}
<pre>(2) 3-isograms: ["aaa", "iii"]
(31) 2-isograms: ["beriberi", "bilabial", "caucasus", "couscous", "teammate", "appall", "emmett", "hannah", "murmur", "tartar", "testes", "anna", "coco", "dada", "deed", "dodo", "gogo", "isis", "juju", "lulu", "mimi", "noon", "otto", "papa", "peep", "poop", "teet", "tete", "toot", "tutu", "ii"]

(32) heterograms with more than 10 chars:
ambidextrous bluestocking exclusionary incomputable lexicography loudspeaking malnourished atmospheric blameworthy centrifugal christendom consumptive countervail countryside countrywide disturbance documentary earthmoving exculpatory geophysical inscrutable misanthrope problematic selfadjoint stenography sulfonamide switchblade switchboard switzerland thunderclap valedictory voluntarism
</pre>

=={{header|Wren}}==
{{libheader|Wren-str}}
<syntaxhighlight lang="wren">import "io" for File
import "./str" for Str