Unique characters in each string: Difference between revisions

Content added Content deleted

Inline

Revision as of 23:57, 12 October 2021

Task

Given a list of strings, find the characters appearing exactly once in each string.

The result should be in alphabetical order.

Use the following list for this task:

        ["1a3c52debeffd", "2b6178c97a938stf", "3ycxdb1fgxa2yz"]

For this list, the result would be: 1 2 3 a b c

Other tasks related to string operations:

Metrics

Counting

Remove/replace

Anagrams/Derangements/shuffling

Find/Search/Determine

Formatting

Song lyrics/poems/Mad Libs/phrases

Tokenize

Sequences

Ada

<lang Ada>with Ada.Text_Io;

procedure Unique_Characters is

  type Occurence_Count is array (Character) of Natural;
  type Occurence_List  is array (Positive range <>) of Occurence_Count;

  function Occurences (Item : String) return Occurence_Count is
     Count : Occurence_Count := (others => 0);
  begin
     for C of Item loop
        Count (C) := Count (C) + 1;
     end loop;
     return Count;
  end Occurences;

  procedure Put_Unique (List : Occurence_List) is
     use Ada.Text_Io;
  begin
     for C in List (List'First)'Range loop
        if (for all I in List'Range => List (I) (C) = 1) then
           Put (C);
           Put (' ');
        end if;
     end loop;
  end Put_Unique;

begin

  Put_Unique ((1 => Occurences ("1a3c52debeffd"),
               2 => Occurences ("2b6178c97a938stf"),
               3 => Occurences ("3ycxdb1fgxa2yz")));

end Unique_Characters;</lang>

Output:

1 2 3 a b c

AppleScript

AppleScriptObjC

The filtering here is case sensitive, the sorting dependent on locale.

<lang applescript>use AppleScript version "2.4" -- OS X 10.10 (Yosemite) or later use framework "Foundation"

on uniqueCharactersInEachString(listOfstrings)

   set astid to AppleScript's text item delimiters
   set AppleScript's text item delimiters to ""
   set countedSet to current application's class "NSCountedSet"'s setWithArray:(characters of (listOfstrings as text))
   set AppleScript's text item delimiters to astid
   set mutableSet to current application's class "NSMutableSet"'s setWithSet:(countedSet)
   repeat with thisString in listOfstrings
       tell mutableSet to intersectSet:(current application's class "NSSet"'s setWithArray:(thisString's characters))
       tell countedSet to minusSet:(mutableSet)
   end repeat
   tell mutableSet to minusSet:(countedSet)
   set sortDescriptor to current application's class "NSSortDescriptor"'s sortDescriptorWithKey:("self") ¬
       ascending:(true) selector:("localizedStandardCompare:")
   
   return (mutableSet's sortedArrayUsingDescriptors:({sortDescriptor})) as list

end uniqueCharactersInEachString

uniqueCharactersInEachString({"1a3c52debeffd", "2b6178c97a938stf", "3ycxdb1fgxa2yz"})</lang>

Output:

Core language only

This can be case-insensitive if required. (Just leave out the 'considering case' statement round the call to the handler). The requirement for AppleScript 2.3.1 is only for the 'use' command which loads the "Heap Sort" script. If "Heap Sort"'s instead loaded with the older 'load script' command or copied into the code, this will work on systems as far back as Mac OS X 10.5 (Leopard) and possibly earlier. Same output as above.

<lang applescript>use AppleScript version "2.3.1" -- OS X 10.9 (Mavericks) or later use sorter : script "Heap Sort" -- <https://www.rosettacode.org/wiki/Sorting_algorithms/Heapsort#AppleScript>

on uniqueCharactersInEachString(listOfStrings)

   script o
       property allCharacters : {}
       property uniques : {}
       
       on isInAllStrings(thisCharacter)
           repeat with thisString in listOfStrings
               if (thisCharacter is not in thisString) then return false
           end repeat
           
           return true
       end isInAllStrings
   end script
   
   set astid to AppleScript's text item delimiters
   set AppleScript's text item delimiters to ""
   set o's allCharacters to text items of (listOfStrings as text)
   set AppleScript's text item delimiters to astid
   
   set characterCount to (count o's allCharacters)
   tell sorter to sort(o's allCharacters, 1, characterCount)
   
   set i to 1
   set stringCount to (count listOfStrings)
   set currentCharacter to beginning of o's allCharacters
   repeat with j from 2 to characterCount
       set thisCharacter to item j of o's allCharacters
       if (thisCharacter is not currentCharacter) then
           if ((j - i = stringCount) and (o's isInAllStrings(currentCharacter))) then ¬
               set end of o's uniques to currentCharacter
           set i to j
           set currentCharacter to thisCharacter
       end if
   end repeat
   if ((j + 1 - i = stringCount) and (o's isInAllStrings(currentCharacter))) then ¬
       set end of o's uniques to currentCharacter
   
   return o's uniques

end uniqueCharactersInEachString

considering case

   uniqueCharactersInEachString({"1a3c52debeffd", "2b6178c97a938stf", "3ycxdb1fgxa2yz"})

end considering</lang>

Arturo

<lang rebol>arr: ["1a3c52debeffd" "2b6178c97a938stf" "3ycxdb1fgxa2yz"] uniques: split first arr

loop arr 'str [

   uniques: intersection uniques 
       select split str 'x -> 
           1 = size match str x

]

print sort uniques</lang>

Output:

1 2 3 a b c

AWK

syntax: GAWK -f UNIQUE_CHARACTERS_IN_EACH_STRING.AWK
sorting:
PROCINFO["sorted_in"] is used by GAWK
SORTTYPE is used by Thompson Automation's TAWK

BEGIN {

   PROCINFO["sorted_in"] = "@ind_str_asc" ; SORTTYPE = 1
   n = split("1a3c52debeffd,2b6178c97a938stf,3ycxdb1fgxa2yz",arr1,",")
   for (i=1; i<=n; i++) {
     str = arr1[i]
     printf("%s\n",str)
     total_c += leng = length(str)
     for (j=1; j<=leng; j++) {
       arr2[substr(str,j,1)][i]++
     }
   }
   for (c in arr2) {
     flag = 0
     for (i=1; i<=n; i++) {
       if (arr2[c][i] != 1) {
         flag = 1
       }
     }
     if (flag == 0) {
       rec = sprintf("%s%s",rec,c)
     }
   }
   printf("%d strings, %d characters, %d different, %d unique: %s\n",n,total_c,length(arr2),length(rec),rec)
   exit(0)

} </lang>

Output:

1a3c52debeffd
2b6178c97a938stf
3ycxdb1fgxa2yz
3 strings, 43 characters, 20 different, 6 unique: 123abc

F#

Output:

123abc

Factor

Works with: Factor version 0.99 2021-02-05

<lang factor>USING: io kernel sequences.interleaved sets sorting ;

{ "1a3c52debeffd" "2b6178c97a938sf" "3ycxdb1fgxa2yz" } [ intersect-all ] [ [ duplicates ] gather without ] bi natural-sort CHAR: space <interleaved> print

! How it works: ! intersect-all obtain elements present in every string -> "1a3c2bf" ! [ duplicates ] gather obtain elements that repeat within a single string -> "efd798xy" ! without from the first string, remove elements that are in the second -> "1a3c2b"</lang>

Output:

1 2 3 a b c

Go

<lang go>package main

import (

   "fmt"
   "sort"

)

func main() {

   strings := []string{"1a3c52debeffd", "2b6178c97a938stf", "3ycxdb1fgxa2yz"}
   u := make(map[rune]int)
   for _, s := range strings {
       m := make(map[rune]int)
       for _, c := range s {
           m[c]++
       }
       for k, v := range m {
           if v == 1 {
               u[k]++
           }
       }
   }
   var chars []rune
   for k, v := range u {
       if v == 3 {
           chars = append(chars, k)
       }
   }
   sort.Slice(chars, func(i, j int) bool { return chars[i] < chars[j] })
   fmt.Println(string(chars))

}</lang>

Output:

123abc

Haskell

<lang haskell>import qualified Data.Map.Strict as M import Data.Maybe (fromJust) import qualified Data.Set as S

onceInEach :: [String] -> String onceInEach [] = [] onceInEach ws@(x : xs) =

 let freq = charCounts (concat ws)
     wordCount = length ws
  in filter
       ((wordCount ==) . fromJust . flip M.lookup freq)
       ( S.elems $
           foldr
             (S.intersection . S.fromList)
             (S.fromList x)
             xs
       )

TEST -------------------------

main :: IO () main =

 (putStrLn . onceInEach)
   [ "1a3c52debeffd",
     "2b6178c97a938stf",
     "3ycxdb1fgxa2yz"
   ]

GENERIC ------------------------

charCounts :: String -> M.Map Char Int charCounts =

 foldr
   (flip (M.insertWith (+)) 1)
   M.empty</lang>

Output:

123abc

jq

Works with: jq

Works with gojq, the Go implementation of jq

Helper functions <lang jq># bag of words def bow(stream):

 reduce stream as $word ({}; .[($word|tostring)] += 1);

input: an array of arrays that represent sets
output: a stream of the items in all the input arrays

def intersections:

 # intersection of (two) sets
 # If a and b are sorted lists, and if all the elements respectively of a and b are distinct,
 # then [a,b] | ios will emit the stream of elements in the set-intersection of a and b.
 def ios:
   .[0] as $a | .[1] as $b
   | if 0 == ($a|length) or 0 == ($b|length) then empty
     elif $a[0] == $b[0] then $a[0], ([$a[1:], $b[1:]] | ios)
     elif $a[0]  < $b[0] then [$a[1:], $b] | ios
     else [$a, $b[1:]] | ios
     end;

  if length == 0 then empty
  elif length == 1 then .[0][]
  elif length == 2 then ios
  elif (.[0]|length) == 0 then empty
  else [.[0], [ .[1:] | intersections]] | ios
  end;</lang>

The task <lang jq>def once_in_each_string:

 # convert each string to an array of the constituent characters
   map((explode | map([.]|implode)))
 # identify the singleton characters in each string; `keys` sorts the keys
   | map( bow(.[]) | with_entries(select(.value==1)) | keys)
   | intersections ;

["1a3c52debeffd", "2b6178c97a938stf", "3ycxdb1fgxa2yz"] | [once_in_each_string] </lang>

Output:

["1","2","3","a","b","c"]

Julia

<lang julia>list = ["1a3c52debeffd", "2b6178c97a938stf", "3ycxdb1fgxa2yz"]

onceineachstring(list) = filter(c -> all(w -> count(x -> x == c, w) == 1, list), (sort ∘ unique ∘ prod)(list))

println(onceineachstring(list))

</lang>

Output:


['1', '2', '3', 'a', 'b', 'c']

Nim

<lang Nim>import strutils, tables

var result = AllChars for str in ["1a3c52debeffd", "2b6178c97a938stf", "3ycxdb1fgxa2yz"]:

 let charCount = str.toCountTable      # Mapping char -> count.
 var uniqueChars: set[char]            # Set of unique chars.
 for ch, count in charCount.pairs:
   if count == 1: uniqueChars.incl ch
 result = result * uniqueChars         # Intersection.

echo result</lang>

Output:

{'1', '2', '3', 'a', 'b', 'c'}

Perl

<lang perl>#!/usr/bin/perl

use strict; # https://rosettacode.org/wiki/Unique_characters_in_each_string use warnings;

my @strings = ("1a3c52debeffd", "2b6178c97a938stf", "3ycxdb1fgxa2yz"); my $chars = join "\n", @strings; print "@{[ sort grep

 $chars !~ /$_.*$_/ &&           # the 'only once in each string' test
 @strings == $chars =~ s/$_//g,  # the 'in every string' test
 $chars =~ /./g ]}\n";</lang>

Output:

1 2 3 a b c

Phix

include builtins\sets.e

function once(integer ch, i, string s)
    integer l = length(s)
    return (i=1 or ch!=s[i-1])
       and (i=l or ch!=s[i+1])
end function

sequence set = {"1a3c52debeffd", "2b6178c97a938stf", "3ycxdb1fgxa2yz"},
         res = intersection(apply(true,filter,{apply(set,sort),once}))
printf(1,"found %d unique common characters: %s\n",{length(res),res})

Output:

found 6 unique common characters: 123abc

PicoLisp

<lang PicoLisp>(de acc (V K N)

  (if (assoc K (val V))
     (inc (nth (cadr @) N))
     (push V (list K (list 1 0 0))) ) )

(de un (Lst)

  (let (Len (length Lst)  D)
     (for (I . Lst) (mapcar chop Lst)
        (for L Lst
           (acc 'D L I) ) )
     (mapcar
        car
        (by
           car
           sort
           (filter '((L) (fully =1 (cadr L))) D) ) ) ) )

(println

  (un
     (quote
        "1a3c52debeffd"
        "2b6178c97a938stf"
        "3ycxdb1fgxa2yz" ) ) )</lang>

Output:

("1" "2" "3" "a" "b" "c")

Python

<lang python>LIST = ["1a3c52debeffd", "2b6178c97a938stf", "3ycxdb1fgxa2yz"]

print(sorted([ch for ch in set([c for c in .join(LIST)]) if all(w.count(ch) == 1 for w in LIST)]))

</lang>

Output:

['1', '2', '3', 'a', 'b', 'c']

Raku

<lang perl6>my $strings = <1a3c52debeffd 2b6178c97a938stf 3ycxdb1fgxa2yz>;

put sort keys [∩] $strings.map: *.comb.Bag.grep: *.value == 1</lang>

Output:

1 2 3 a b c

REXX

This REXX program doesn't assume ASCII (or any other) order. This example was run on an ASCII machine.

If this REXX program is run on an ASCII machine, it will use the ASCII order of characters, in this case,
decimal digits, uppercase Latin letters, and then lowercase Latin letters, with other characters interspersed.

On an EBCDIC machine, the order would be lowercase Latin letters, uppercase Latin letters, and then the
decimal digits, with other characters interspersed.

On an EBCDIC machine, the lowercase letters and the uppercase letters aren't contiguous. <lang rexx>/*REXX pgm finds and shows characters that are unique in each string and once only. */ parse arg $ /*obtain optional arguments from the CL*/ if $= | $="," then $= '1a3c52debeffd' "2b6178c97a938stf" '3ycxdb1fgxa2yz' if $= then do; say "***error*** no lists were specified."; exit 13; end

= words($); $$= /*#: # words in $; $$: $ with no blanks*/

             do i=1  for #;    !.i= word($, i)  /*for speed, build a list of words in $*/
             $$= $$  ||  !.i                    /*build a list of all the strings.     */
             end   /*i*/

@= /*will be a list of all unique chars. */

  do j=0  for 256;        x= d2c(j)             /*process all the possible characters. */
  if pos(x, $$)==0               then iterate   /*Char not found in any string in  $ ? */
          do k=1  for #;  _= pos(x, !.k)        /*examine each string in the  $  list. */
          if _==0                then iterate j /*Character not found?   Then skip it. */
          if pos(x, !.k, _+1)>0  then iterate j /*    "     is a dup?      "    "   "  */
          end   /*k*/
  @= @ x                                        /*append a character, append it to list*/
  end   /*j*/                                   /*stick a fork in it,  we're all done. */

@@= space(@, 0); L= length(@@) /*elided superfluous blanks; get length*/ if @@== then @= " (none)" /*if none were found, pretty up message*/ if L==0 then L= "no" /*do the same thing for the # of chars.*/ say 'unique characters are: ' @ /*display the unique characters found. */ say say 'Found ' L " unique characters." /*display the # of unique chars found. */</lang>

output when using the default input:

unique characters are:   1 2 3 a b c

Found  6  unique characters.

Ring

<lang ring> see "working..." + nl see "Unique characters in each string are:" + nl row = 0 str = "" cList = [] uniqueChars = ["1a3c52debeffd", "2b6178c97a938stf", "3ycxdb1fgxa2yz"] lenChars = len(uniqueChars)

for n = 1 to lenChars

   str = str + uniqueChars[n]

   flag = 1
   for m = 1 to lenChars
       cnt = count(uniqueChars[m],str[n])
       if cnt != 1
          flag = 0
          exit
       ok
   next
   if flag = 1
      ind = find(cList,str[n])
      if ind = 0
         add(cList,str[n])
      ok
   ok

next cList = sort(cList) for n = 1 to len(cList)

   row = row + 1
   see "" + cList[n] + " "

next see nl

see "Found " + row + " unique characters in each string" + nl see "done..." + nl

func count(cString,dString)

    sum = 0
    while substr(cString,dString) > 0
          sum++
          cString = substr(cString,substr(cString,dString)+len(string(sum)))
    end
    return sum

</lang>

Output:

working...
Unique characters in each string are:
1 2 3 a b c 
Found 6 unique characters in each string
done...

Ruby

<lang ruby>arr = ["1a3c52debeffd", "2b6178c97a938stf", "3ycxdb1fgxa2yz"]

uniqs_in_str = arr.map{|str| str.chars.tally.filter_map{|char, count| char if count == 1} } puts uniqs_in_str.inject(&:intersection).sort.join(" ") </lang>

Output:

1 2 3 a b c

Wren

Library: Wren-seq

Library: Wren-sort

<lang ecmascript>import "/seq" for Lst import "/sort" for Sort

var strings = ["1a3c52debeffd", "2b6178c97a938stf", "3ycxdb1fgxa2yz"] var uniqueChars = [] for (s in strings) {

   var u = Lst.individuals(s.toList).where { |l| l[1] == 1 }.map { |l| l[0] }
   uniqueChars.addAll(u)

} var n = strings.count uniqueChars = Lst.individuals(uniqueChars).where { |l| l[1] == n }.map { |l| l[0] }.toList Sort.insertion(uniqueChars) System.print("Found %(uniqueChars.count) unique character(s) common to each string, namely:") System.print(uniqueChars.join(" "))</lang>

Output:

Found 6 unique character(s) common to each string, namely:
1 2 3 a b c

Revision as of 23:45, 12 October 2021 (view source) Hout (talk \| contribs) (→‎{{header\|Haskell}}: Added a Haskell version) ← Older edit		Revision as of 23:57, 12 October 2021 (view source) Hout (talk \| contribs) m (→‎{{header\|Haskell}}: Pruned unused imports) Newer edit →
Line 256: =={{header\|Haskell}}== <lang haskell>import qualified Data.~~List~~Map.Strict ~~(group,~~as ~~sort)~~M ~~import qualified Data.Map.Strict as M~~ import Data.Maybe (fromJust) import qualified Data.Set as S