Change e letters to i in words: Difference between revisions

From Rosetta Code
Revision as of 09:15, 13 February 2021

Change e letters to i in words is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.
Task

Use the dictionary unixdict.txt.

Change every letter e to i in each word.

If the changed word is in the dictionary, show it here on this page.

Any word shown should have a length > 5.


Other tasks related to string operations:
Metrics
Counting
Remove/replace
Anagrams/Derangements/shuffling
Find/Search/Determine
Formatting
Song lyrics/poems/Mad Libs/phrases
Tokenize
Sequences



Factor

<lang factor>USING: assocs binary-search formatting io.encodings.ascii
io.files kernel literals math sequences splitting ;

CONSTANT: words $[ "unixdict.txt" ascii file-lines ]

words
[ length 5 > ] filter
[ CHAR: e swap member? ] filter
[ dup "e" "i" replace ] map>alist
[ nip words sorted-member? ] assoc-filter  ! binary search
[ "%-9s -> %s\n" printf ] assoc-each</lang>

Output:
analyses  -> analysis
atlantes  -> atlantis
bellow    -> billow
breton    -> briton
clench    -> clinch
convect   -> convict
crises    -> crisis
diagnoses -> diagnosis
enfant    -> infant
enquiry   -> inquiry
frances   -> francis
galatea   -> galatia
harden    -> hardin
heckman   -> hickman
inequity  -> iniquity
inflect   -> inflict
jacobean  -> jacobian
marten    -> martin
module    -> moduli
pegging   -> pigging
psychoses -> psychosis
rabbet    -> rabbit
sterling  -> stirling
synopses  -> synopsis
vector    -> victor
welles    -> willis

Go

<lang go>package main

import (
    "bytes"
    "fmt"
    "io/ioutil"
    "log"
    "sort"
    "strings"
    "unicode/utf8"
)

func main() {
    wordList := "unixdict.txt"
    b, err := ioutil.ReadFile(wordList)
    if err != nil {
        log.Fatal("Error reading file: ", err)
    }
    // Keep only the words longer than 5 characters.
    bwords := bytes.Fields(b)
    var words []string
    for _, bword := range bwords {
        s := string(bword)
        if utf8.RuneCountInString(s) > 5 {
            words = append(words, s)
        }
    }
    count := 0
    le := len(words)
    for _, word := range words {
        if strings.ContainsRune(word, 'e') {
            repl := strings.ReplaceAll(word, "e", "i")
            // Binary search: relies on the word list being sorted.
            ix := sort.SearchStrings(words, repl)
            if ix < le && words[ix] == repl {
                count++
                fmt.Printf("%2d: %-9s -> %s\n", count, word, repl)
            }
        }
    }
}</lang>

Output:
 1: analyses  -> analysis
 2: atlantes  -> atlantis
 3: bellow    -> billow
 4: breton    -> briton
 5: clench    -> clinch
 6: convect   -> convict
 7: crises    -> crisis
 8: diagnoses -> diagnosis
 9: enfant    -> infant
10: enquiry   -> inquiry
11: frances   -> francis
12: galatea   -> galatia
13: harden    -> hardin
14: heckman   -> hickman
15: inequity  -> iniquity
16: inflect   -> inflict
17: jacobean  -> jacobian
18: marten    -> martin
19: module    -> moduli
20: pegging   -> pigging
21: psychoses -> psychosis
22: rabbet    -> rabbit
23: sterling  -> stirling
24: synopses  -> synopsis
25: vector    -> victor
26: welles    -> willis
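
The binary search used in the Go entry only works when the word list is sorted. As a minimal sketch of an alternative, assuming nothing about the order of the input, the lookup can be done with a map used as a set. The function name eToIPairs and the small in-memory sample are hypothetical and are used here only for illustration; they are not part of the task or of the entry above.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// eToIPairs (hypothetical helper) returns every (word, changed) pair where
// word has at least minLen characters, contains an 'e', and the result of
// replacing each 'e' with 'i' is also present in words.
// Membership is tested with a map, so words does not need to be sorted.
func eToIPairs(words []string, minLen int) [][2]string {
	set := make(map[string]struct{}, len(words))
	for _, w := range words {
		set[w] = struct{}{}
	}
	var pairs [][2]string
	for _, w := range words {
		if utf8.RuneCountInString(w) < minLen || !strings.ContainsRune(w, 'e') {
			continue
		}
		repl := strings.ReplaceAll(w, "e", "i")
		if _, ok := set[repl]; ok {
			pairs = append(pairs, [2]string{w, repl})
		}
	}
	return pairs
}

func main() {
	// Hypothetical, deliberately unsorted sample standing in for unixdict.txt.
	sample := []string{"victor", "vector", "billow", "bellow", "melt", "milt", "people"}
	for _, p := range eToIPairs(sample, 6) {
		fmt.Printf("%-9s -> %s\n", p[0], p[1])
	}
}
```

With this sample, only the vector/victor and bellow/billow pairs are reported: melt is shorter than six characters, and changing people does not give a word in the list. The map costs extra memory compared with searching the sorted slice in place, which is the trade-off for not depending on the file's order.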

Julia

See Alternade_words for the foreachword function.
<lang julia>e2i(w, d) = (if 'e' in w s = replace(w, "e" => "i"); haskey(d, s) && return "$w => $s" end; "")
foreachword("unixdict.txt", e2i, minlen=6, colwidth=23, numcols=4)
</lang>

Output:
Word source: unixdict.txt

analyses => analysis   atlantes => atlantis   bellow => billow       breton => briton
clench => clinch       convect => convict     crises => crisis       diagnoses => diagnosis
enfant => infant       enquiry => inquiry     frances => francis     galatea => galatia
harden => hardin       heckman => hickman     inequity => iniquity   inflect => inflict
jacobean => jacobian   marten => martin       module => moduli       pegging => pigging     
psychoses => psychosis rabbet => rabbit       sterling => stirling   synopses => synopsis
vector => victor       welles => willis

Phix

Please make it stop.
<lang Phix>sequence words = get_text("demo/unixdict.txt",GT_LF_STRIPPED)
function chei(string word) return substitute(word,"e","i") end function
function cheti(string word) return length(word)>5 and find('e',word) and find(chei(word),words) end function
sequence chetie = filter(words,cheti),
         chetei = columnize({chetie,apply(chetie,chei)})
printf(1,"%d words: %v\n",{length(chetei),shorten(chetei,"",2)})</lang>

Output:
26 words: {{"analyses","analysis"},{"atlantes","atlantis"},"...",{"vector","victor"},{"welles","willis"}}

Raku

<lang perl6>my %ei = 'unixdict.txt'.IO.words.grep({ .chars > 5 and /<[ie]>/ }).map: { $_ => .subst('e', 'i', :g) };
put %ei.grep( *.key.contains: 'e' ).grep({ %ei{.value}:exists }).sort.batch(4)».gist».fmt('%-22s').join: "\n";</lang>

Output:
analyses => analysis   atlantes => atlantis   bellow => billow       breton => briton      
clench => clinch       convect => convict     crises => crisis       diagnoses => diagnosis
enfant => infant       enquiry => inquiry     frances => francis     galatea => galatia    
harden => hardin       heckman => hickman     inequity => iniquity   inflect => inflict    
jacobean => jacobian   marten => martin       module => moduli       pegging => pigging    
psychoses => psychosis rabbet => rabbit       sterling => stirling   synopses => synopsis  
vector => victor       welles => willis

REXX

This REXX version doesn't care what order the words in the dictionary are in, nor what case (lower/upper/mixed) they are in; the search for words is caseless.

It also allows the minimum length to be specified on the command line (CL), as well as the old character (the one to be changed), the new character (the one it is changed into), and the dictionary file identifier.
<lang rexx>/*REXX pgm finds words with changed letter  E──►I  and is a word  (in a specified dict).*/
parse arg minL oldC newC iFID .                  /*obtain optional arguments from the CL*/
if minL=='' | minL=="," then minL= 6             /*Not specified?  Then use the default.*/
if oldC=='' | oldC=="," then oldC= 'e'           /* "      "         "   "   "     "    */
if newC=='' | newC=="," then newC= 'i'           /* "      "         "   "   "     "    */
if iFID=='' | iFID=="," then iFID='unixdict.txt' /* "      "         "   "   "     "    */
upper oldC newC                                  /*get uppercase versions of OLDC & NEWC*/
@.=                                              /*default value of any dictionary word.*/
          do #=1  while lines(iFID)\==0          /*read each word in the file  (word=X).*/
          x= strip( linein( iFID) )              /*pick off a word from the input line. */
          $.#= x;       upper x;     @.x= $.#    /*save: original case and the old word.*/
          end   /*#*/                            /*Note: the old word case is left as─is*/
#= # - 1                                         /*adjust word count because of DO loop.*/
finds= 0                                         /*count of changed words found (so far)*/
say copies('─', 30)     #     "words in the dictionary file: "       iFID
say
      do j=1  for #;           L= length($.j)    /*process all the words that were found*/
      if L<minL  then iterate                    /*Is word too short?   Then ignore it. */
      y = $.j;                 upper y           /*uppercase the dictionary word.       */
      if pos(oldC, y)==0  then iterate           /*Have the required character? No, skip*/
      new= translate(y, newC, oldC)              /*obtain a changed (translated) word.  */
      if @.new==''  then iterate                 /*New word in the dict.?   No, skip it.*/
      finds= finds + 1                           /*bump the count of found changed words*/
      say right(left($.j, 20), 40) '──►' @.new   /*indent a bit, display the old & new. */
      end        /*j*/
say                                              /*stick a fork in it,  we're all done. */
say copies('─',30)  finds  " words found that were changed with "  oldC  '──►' ,
                     newC",  and with a minimum length of "     minL</lang>
output   when using the default inputs:
────────────────────────────── 25104 words in the dictionary file:  unixdict.txt

                    analyses             ──► analysis
                    atlantes             ──► atlantis
                    bellow               ──► billow
                    breton               ──► briton
                    clench               ──► clinch
                    convect              ──► convict
                    crises               ──► crisis
                    diagnoses            ──► diagnosis
                    enfant               ──► infant
                    enquiry              ──► inquiry
                    frances              ──► francis
                    galatea              ──► galatia
                    harden               ──► hardin
                    heckman              ──► hickman
                    inequity             ──► iniquity
                    inflect              ──► inflict
                    jacobean             ──► jacobian
                    marten               ──► martin
                    module               ──► moduli
                    pegging              ──► pigging
                    psychoses            ──► psychosis
                    rabbet               ──► rabbit
                    sterling             ──► stirling
                    synopses             ──► synopsis
                    vector               ──► victor
                    welles               ──► willis

────────────────────────────── 26  words found that were changed with  E ──► I,  and with a minimum length of  6

Ring

<lang ring>
load "stdlib.ring"

cStr = read("unixdict.txt")
wordList = str2list(cStr)
num = 0

see "working..." + nl
see "Words are:" + nl

ln = len(wordList)
for n = ln to 1 step -1
   if len(wordList[n]) < 6
      del(wordList,n)
   ok
next

for n = 1 to len(wordList)
   ind = substr(wordList[n],"e")
   if ind > 0
      str = substr(wordList[n],"e","i")
      indstr = find(wordList,str)
      if indstr > 0
         num = num + 1
         see "" + num + ". " + wordList[n] + " => " + str + nl
      ok
   ok
next

see "done..." + nl
</lang>

Output:
working...
Words are:
1. analyses => analysis
2. atlantes => atlantis
3. bellow => billow
4. breton => briton
5. clench => clinch
6. convect => convict
7. crises => crisis
8. diagnoses => diagnosis
9. enfant => infant
10. enquiry => inquiry
11. frances => francis
12. galatea => galatia
13. harden => hardin
14. heckman => hickman
15. inequity => iniquity
16. inflect => inflict
17. jacobean => jacobian
18. marten => martin
19. module => moduli
20. pegging => pigging
21. psychoses => psychosis
22. rabbet => rabbit
23. sterling => stirling
24. synopses => synopsis
25. vector => victor
26. welles => willis
done...

Wren

Library: Wren-sort
Library: Wren-fmt

<lang ecmascript>import "io" for File
import "/sort" for Find
import "/fmt" for Fmt

var wordList = "unixdict.txt" // local copy
var count = 0
var words = File.read(wordList).trimEnd().split("\n").
    where { |w| w.count > 5 }.toList
for (word in words) {
    if (word.contains("e")) {
        var repl = word.replace("e", "i")
        if (Find.first(words, repl) >= 0) {  // binary search
            count = count + 1
            Fmt.print("$2d: $-9s -> $s", count, word, repl)
        }
    }
}</lang>

Output:
 1: analyses  -> analysis
 2: atlantes  -> atlantis
 3: bellow    -> billow
 4: breton    -> briton
 5: clench    -> clinch
 6: convect   -> convict
 7: crises    -> crisis
 8: diagnoses -> diagnosis
 9: enfant    -> infant
10: enquiry   -> inquiry
11: frances   -> francis
12: galatea   -> galatia
13: harden    -> hardin
14: heckman   -> hickman
15: inequity  -> iniquity
16: inflect   -> inflict
17: jacobean  -> jacobian
18: marten    -> martin
19: module    -> moduli
20: pegging   -> pigging
21: psychoses -> psychosis
22: rabbet    -> rabbit
23: sterling  -> stirling
24: synopses  -> synopsis
25: vector    -> victor
26: welles    -> willis