Odd words

From Rosetta Code
Odd words is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

Using the words in the dictionary unixdict.txt, take the letters at the odd indices of each word (counting from 1: the 1st, 3rd, 5th, ... letters). If the resulting string is itself a word in the dictionary, display the original word and its odd word on this page.

The length of the odd word should be > 4.
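
As a rough illustration of the task, here is a minimal Python sketch. The names `odd_letters` and `odd_words` and the five-word `sample` list are assumptions made up for this example; a real run would read unixdict.txt instead.

```python
def odd_letters(word):
    """Return the letters at 1-based odd positions (1st, 3rd, 5th, ...)."""
    return word[::2]


def odd_words(words, min_len=5):
    """Yield (word, odd word) pairs whose odd word is itself in `words`."""
    known = set(words)  # set membership keeps each lookup cheap
    for word in words:
        candidate = odd_letters(word)
        if len(candidate) >= min_len and candidate in known:
            yield word, candidate


# Tiny stand-in for unixdict.txt (hypothetical sample, not the real file).
sample = ["barbarian", "brain", "cider", "childbear", "table"]
for word, odd in odd_words(sample):
    print(f"{word:<15}{odd}")
```

With this sample only barbarian and childbear qualify, since the odd words of the shorter entries have fewer than five letters.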



Factor

This is basically the same program as https://rosettacode.org/wiki/Alternade_words#Factor. <evens> is a virtual sequence representing the (zero-based) even indices of the input sequence, which this task calls the odd indices.
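
The indexing remark above can be checked with a short Python sketch (not part of the Factor entry; the variable names are illustrative only). It builds the same string twice, once from zero-based even indices and once from one-based odd positions.

```python
word = "headdress"

# Zero-based even indices 0, 2, 4, ... (what <evens> selects in the Factor entry).
zero_based_even = "".join(word[i] for i in range(0, len(word), 2))

# One-based odd positions 1, 3, 5, ... (the task's "odd indices").
one_based_odd = "".join(ch for pos, ch in enumerate(word, start=1) if pos % 2 == 1)

print(zero_based_even, one_based_odd)
```

Both expressions pick the same letters, so the two conventions describe the same odd word.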

Works with: Factor version 0.99 2020-08-14

<lang factor>USING: formatting hash-sets io io.encodings.ascii io.files kernel literals math sequences sequences.extras sets strings ;

<< CONSTANT: words $[ "unixdict.txt" ascii file-lines ] >>

CONSTANT: wordset $[ words >hash-set ]

: odd ( str -- newstr ) <evens> >string ;

"Odd words > 4:" print
words
[ length 8 > ] filter
[ odd wordset in? ] filter
[ dup odd "%-15s %s\n" printf ] each</lang>

Output:
Odd words > 4:
barbarian       brain
childbear       cider
corrigenda      cried
gargantuan      grata
headdress       hades
palladian       plain
propionate      point
salvation       slain
siltation       slain
slingshot       sight
statuette       saute
supersede       spree
supervene       spree
terminable      trial

FreeBASIC

<lang freebasic>#define NULL 0

type node

   word as string*32   'enough space to store any word in the dictionary
   nxt as node ptr

end type

function addword( tail as node ptr, word as string ) as node ptr

   'allocates memory for a new node, links the previous tail to it,
   'and returns the address of the new node
   dim as node ptr newnode = allocate(sizeof(node))
   tail->nxt = newnode
   newnode->nxt = NULL
   newnode->word = word
   return newnode

end function

function crunch( word as string ) as string

   dim as string ret = ""
   for i as uinteger = 1 to len(word) step 2
       ret += mid(word,i,1)
   next i
   return ret

end function

function length( word as string ) as uinteger

   'necessary replacement for the built-in len function, which in this
   'case would always return 32
   for i as uinteger = 1 to 32
       if asc(mid(word,i,1)) = 0 then return i-1
   next i
   return 999

end function

dim as string word
dim as node ptr tail = allocate( sizeof(node) )
dim as node ptr head = tail, curr = head, currj
tail->nxt = NULL
tail->word = "XXXXHEADER"

open "unixdict.txt" for input as #1
while true
   line input #1, word
   if word = "" then exit while
   tail = addword( tail, word )
wend
close #1

while curr->nxt <> NULL

   if length(curr->word) > 8 then word = crunch( curr->word ) else goto nextword
   currj = head
   while currj->nxt <> NULL
       if word = currj->word then print left(curr->word,length(curr->word));"   --->   ";word
       currj = currj->nxt
   wend
   nextword:
   curr = curr->nxt

wend</lang>

Output:
barbarian   --->   brain
childbear   --->   cider
corrigenda   --->   cried
gargantuan   --->   grata
headdress   --->   hades
palladian   --->   plain
propionate   --->   point
salvation   --->   slain
siltation   --->   slain
slingshot   --->   sight
statuette   --->   saute
supersede   --->   spree
supervene   --->   spree
terminable   --->   trial

To discourage the creation of a whole new task for the even words, here they are. Producing them requires only two changes: the 1 to a 2 in line 20 (the starting index of the loop in crunch), and the 8 to a 9 in line 50 (the minimum length test).

cannonball   --->   annal
importation   --->   motto
psychopomp   --->   scoop
starvation   --->   train
upholstery   --->   posey
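
The two changes can be sketched in Python as follows. This is an illustration under assumptions, not a translation of the FreeBASIC entry: `every_other`, `min_source_length`, and the `sample` list are hypothetical names and data chosen for the example.

```python
def every_other(word, start):
    """Return every second letter of `word`, beginning at 0-based index `start`."""
    return word[start::2]


def min_source_length(result_len, start):
    """Shortest word whose every-other-letter string has `result_len` letters."""
    # Starting at index 0 a word of length n yields ceil(n/2) letters,
    # starting at index 1 it yields floor(n/2).
    return 2 * result_len - 1 if start == 0 else 2 * result_len


# Hypothetical sample standing in for unixdict.txt.
sample = ["cannonball", "annal", "starvation", "train", "brain"]
known = set(sample)
for word in sample:
    even = every_other(word, 1)
    if len(word) >= min_source_length(5, 1) and even in known:
        print(f"{word}   --->   {even}")
```

A five-letter odd word needs a source word of at least 9 letters (length > 8), while a five-letter even word needs at least 10 (length > 9), which is why the threshold moves along with the starting index.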

Go

<lang go>package main

import (

   "bytes"
   "fmt"
   "io/ioutil"
   "log"
   "sort"
   "strings"

)

func main() {

   wordList := "unixdict.txt"
   b, err := ioutil.ReadFile(wordList)
   if err != nil {
       log.Fatal("Error reading file")
   }
   bwords := bytes.Fields(b)
   words := make([]string, len(bwords))
   for i, bword := range bwords {
       words[i] = string(bword)
   }
   count := 0
   fmt.Println("The odd words with length > 4 in", wordList, "are:")
   for _, word := range words {
       rword := []rune(word) // in case any non-ASCII
       if len(rword) > 8 {
           var sb strings.Builder
           for i := 0; i < len(rword); i += 2 {
               sb.WriteRune(rword[i])
           }
           s := sb.String()
           idx := sort.SearchStrings(words, s)      // binary search
           if idx < len(words) && words[idx] == s { // check not just an insertion point
               count = count + 1
               fmt.Printf("%2d: %-12s -> %s\n", count, word, s)
           }
       }
   }

}</lang>

Output:
The odd words with length > 4 in unixdict.txt are:
 1: barbarian    -> brain
 2: childbear    -> cider
 3: corrigenda   -> cried
 4: gargantuan   -> grata
 5: headdress    -> hades
 6: palladian    -> plain
 7: propionate   -> point
 8: salvation    -> slain
 9: siltation    -> slain
10: slingshot    -> sight
11: statuette    -> saute
12: supersede    -> spree
13: supervene    -> spree
14: terminable   -> trial
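
The Go entry looks up each candidate with a binary search and then checks that the returned index is a real match rather than just an insertion point. A minimal Python sketch of the same idea follows, using the standard `bisect` module; `contains_sorted` is a hypothetical helper name, and the approach assumes the word list is already sorted.

```python
from bisect import bisect_left


def contains_sorted(sorted_words, target):
    """Binary-search `sorted_words` (assumed sorted ascending) for `target`."""
    idx = bisect_left(sorted_words, target)
    # bisect_left returns an insertion point, so confirm an actual match.
    return idx < len(sorted_words) and sorted_words[idx] == target


words = sorted(["brain", "barbarian", "cider", "childbear"])
print(contains_sorted(words, "brain"), contains_sorted(words, "brawn"))
```

If the input order cannot be relied on, a hash set (as in the Factor entry) avoids the sortedness assumption.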

Perl

<lang perl>#!/usr/bin/perl

@ARGV = 'unixdict.txt';
chomp( my @words = <> );
my %dict;
@dict{ grep length > 4, @words } = ();
for ( @words )
  {
  my $oddword = s/(.).?/$1/gr;
  exists $dict{$oddword} and print " $_ $oddword\n";
  }</lang>
Output:
 barbarian brain
 childbear cider
 corrigenda cried
 gargantuan grata
 headdress hades
 palladian plain
 propionate point
 salvation slain
 siltation slain
 slingshot sight
 statuette saute
 supersede spree
 supervene spree
 terminable trial

Phix

<lang Phix>sequence words = split_any(get_text("demo/unixdict.txt")," \r\n")
function odd(integer /*ch*/, idx) return remainder(idx,2)=1 end function
function oddch(string word) return filter(word,odd) end function
function over4(string word) return length(word)>4 end function
words = filter(filter(apply(words,oddch),over4),"in",words)
printf(1,"%d odd words found: %s\n",{length(words),join(shorten(words,"",3),", ")})</lang>

Output:
14 odd words found: brain, cider, cried, ..., spree, spree, trial

Alternative

Slightly more traditional, same output.

<lang Phix>sequence words = split_any(get_text("demo/unixdict.txt")," \r\n"),
         res = {}
for i=1 to length(words) do
    string word = words[i], wodd = ""
    for oddchar=1 to length(word) by 2 do
        wodd &= word[oddchar]
    end for
    if length(wodd)>4
    and find(wodd,words) then
        res = append(res,wodd)
    end if
end for
printf(1,"%d odd words found: %s\n",{length(res),join(shorten(res,"",3),", ")})</lang>

Raku

<lang perl6>my %words = 'unixdict.txt'.IO.slurp.words.map: * => 1;

my (@odds, @evens);

for %words {

   next if .key.chars < 9;
   my $odd  = .key.comb[0,2 … *].join;
   @odds.push(.key => $odd) if %words{$odd} and $odd.chars > 4;
   my $even = .key.comb[1,3 … *].join;
   @evens.push(.key => $even) if %words{$even} and $even.chars > 4;

}

.put for flat 'Odd words > 4:', @odds.sort;

.put for flat "\nEven words > 4:", @evens.sort;</lang>

Output:
Odd words > 4:
barbarian	brain
childbear	cider
corrigenda	cried
gargantuan	grata
headdress	hades
palladian	plain
propionate	point
salvation	slain
siltation	slain
slingshot	sight
statuette	saute
supersede	spree
supervene	spree
terminable	trial

Even words > 4:
cannonball	annal
importation	motto
psychopomp	scoop
starvation	train
upholstery	posey

REXX

version 1

<lang rexx>/* REXX */
fid='d:\unix.txt'
ww.=0   /* ww.*    the words to be analyzed  */
w.=0    /* w.word = 1 if word is in unix.txt */
Do While lines(fid)>0

 l=linein(fid)     /* a word                 */
 ll=length(l)
 w.l=1             /*  word is in unix.txt   */
 If ll>=9 Then Do  /* worth to be analyzed   */
   z=ww.0+1        /* add it to the list     */
   ww.z=l
   ww.0=z
   End
 End

n=0
Do i=1 To ww.0

 wodd=wodd(ww.i)
 If w.wodd Then Do
   n=n+1
   Say format(n,3) left(ww.i,10) wodd
   End
 End

Exit
wodd: Procedure   /* use odd indexed letters */

 Parse Arg w
 wo=
 Do i=1 To length(w)
   If i//2=1 Then
     wo=wo||substr(w,i,1)
   End
 Return wo</lang>
Output:
  1 barbarian  brain
  2 childbear  cider
  3 corrigenda cried
  4 gargantuan grata
  5 headdress  hades
  6 palladian  plain
  7 propionate point
  8 salvation  slain
  9 siltation  slain
 10 slingshot  sight
 11 statuette  saute
 12 supersede  spree
 13 supervene  spree
 14 terminable trial

version 2, caseless

This REXX version doesn't care what order the words in the dictionary are in,   nor does it care what
case  (lower/upper/mixed)  the words are in,   the search for alternades is   caseless.

It also allows the minimum length to be specified on the command line (CL) as well as the dictionary file identifier.

<lang rexx>/*REXX program finds all the  caseless  "odd words"  (within an identified dictionary). */
parse arg minL iFID .                            /*obtain optional arguments from the CL*/
if minL=='' | minL==","  then minL= 5            /*Not specified?  Then use the default.*/
if iFID=='' | iFID==","  then iFID='unixdict.txt' /* "      "        "   "   "     "    */
@.=                                              /*default value of any dictionary word.*/

       do #=1  while lines(iFID)\==0            /*read each word in the file  (word=X).*/
       x= strip( linein( iFID) )                /*pick off a word from the input line. */
       $.#= x;       upper x;          @.x= .   /*save: original case and the semaphore*/
       end   /*#*/                              /* [↑]   semaphore name is uppercased. */

minW= minL * 2 - 1                               /*minimum width of a word to be usable.*/
say copies('─', 30)     #     "words in the dictionary file: "       iFID
ows= 0                                           /*count of the  "odd words"  found.    */

       do j=1  for #-1;       L= length($.j)    /*process all the words that were found*/
       if L<minW  then iterate                  /*Is word too short?   Then ignore it. */
       ow=                                      /*initialize the  "odd word".          */
                  do k=1  by 2  to  L           /*only use odd indexed letters in word.*/
                  ow= ow  ||  substr($.j, k, 1) /*construct the  "odd word".           */
                  end   /*k*/
       owU= ow;               upper owU         /*uppercase the odd word to be caseless*/
        if @.owU==''  then iterate               /*if not extant,  then skip this word. */
       ows= ows + 1                             /*bump the count of "odd words" found. */
       say right(left($.j, 20), 24) left(ow, 9) /*indent original word for readability.*/
       end        /*j*/

say copies('─', 30) ows ' "odd words" found with a minimum length of ' minL</lang>

output   when using the default input:
────────────────────────────── 25105 words in the dictionary file:  unixdict.txt
    barbarian            brain
    childbear            cider
    corrigenda           cried
    gargantuan           grata
    headdress            hades
    palladian            plain
    propionate           point
    salvation            slain
    siltation            slain
    slingshot            sight
    statuette            saute
    supersede            spree
    supervene            spree
    terminable           trial
────────────────────────────── 14  "odd words" found with a minimum length of  5
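
The caseless lookup and the minimum-width shortcut used by version 2 can be sketched in Python as below. This is an illustration under assumptions rather than a port of the REXX program: `caseless_odd_words` and the mixed-case `sample` list are hypothetical.

```python
def caseless_odd_words(words, min_len=5):
    """Yield (word, odd word) pairs, comparing against the dictionary caselessly."""
    known = {w.upper() for w in words}   # uppercased keys act as the lookup table
    min_width = 2 * min_len - 1          # shorter words cannot yield min_len letters
    for word in words:
        if len(word) < min_width:
            continue
        odd = word[::2]
        if odd.upper() in known:
            yield word, odd              # original case is kept for display


# Hypothetical mixed-case sample; unixdict.txt itself is lowercase.
sample = ["Barbarian", "BRAIN", "slingshot", "Sight", "sit"]
for word, odd in caseless_odd_words(sample):
    print(f"{word:<20} {odd}")
```

Storing only the uppercased form in the lookup set while printing the original spelling mirrors the split between the semaphore array and the saved words in the REXX code.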

Ring

<lang ring>
cStr = read("unixdict.txt")
wordList = str2list(cStr)
num = 0

see "Odd words are:" + nl

for n = 1 to len(wordList)

   strWord = ""
   len = len(wordList[n])
   for m = 1 to len step 2
       strWord = strWord + wordList[n][m]
   next
   ind = find(wordList,strWord) 
   if ind > 0  and len(strWord) > 4
      num = num + 1
      see "" + num + ". " + wordList[n] + " >> " + strWord + nl
   ok

next
</lang>

Output:

Odd words are:
1. barbarian >> brain
2. childbear >> cider
3. corrigenda >> cried
4. gargantuan >> grata
5. headdress >> hades
6. palladian >> plain
7. propionate >> point
8. salvation >> slain
9. siltation >> slain
10. slingshot >> sight
11. statuette >> saute
12. supersede >> spree
13. supervene >> spree
14. terminable >> trial

Wren

Library: Wren-fmt
Library: Wren-sort
Library: Wren-trait

<lang ecmascript>import "io" for File
import "/fmt" for Fmt
import "/sort" for Find
import "/trait" for Stepped

var wordList = "unixdict.txt" // local copy
var words = File.read(wordList).trimEnd().split("\n")
var count = 0
System.print("The odd words with length > 4 in %(wordList) are:")
for (word in words) {

   if (word.count > 8) {
       var s = ""
       var chars = word.toList // in case any non-ASCII
       for (i in Stepped.new(0...chars.count, 2)) s = s + chars[i]
       if (Find.first(words, s) >= 0) { // binary search
           count = count + 1
           Fmt.print("$2d: $-12s -> $s", count, word, s)
       }
   }

}</lang>

Output:
The odd words with length > 4 in unixdict.txt are:
 1: barbarian    -> brain
 2: childbear    -> cider
 3: corrigenda   -> cried
 4: gargantuan   -> grata
 5: headdress    -> hades
 6: palladian    -> plain
 7: propionate   -> point
 8: salvation    -> slain
 9: siltation    -> slain
10: slingshot    -> sight
11: statuette    -> saute
12: supersede    -> spree
13: supervene    -> spree
14: terminable   -> trial