Rep-string: Difference between revisions

From Rosetta Code
Content added Content deleted
No edit summary
Line 73: Line 73:
0 0
0 0
1 (no repeat)</pre>
1 (no repeat)</pre>

=={{header|Icon}} and {{header|Unicon}}==

The following works in both languages.

<lang unicon>procedure main(A)
every write(s := !A,": ",(repString(s) | "Not a rep string!")\1)
end

procedure repString(s)
rs := s[1+:*s/2]
while (*rs > 0) & (s ~== lrepl(rs,*s,rs)) do rs := rs[1:-1]
return (*rs > 0, rs)
end

procedure lrepl(s1,n,s2) # The standard left() procedure won't work.
while *s1 < n do s1 ||:= s2
return s1[1+:n]
end</lang>

Sample output:
<pre>
->rs 1110111011 0010010010 1010101010 1111111111 0100101101 0100100 101 11 1
1110111011: 1110
0010010010: 001
1010101010: 1010
1111111111: 11111
0100101101: Not a rep string!
0100100: 010
101: Not a rep string!
11: 1
1: Not a rep string!
->
<pre>


=={{header|J}}==
=={{header|J}}==

Revision as of 15:49, 12 July 2013

Rep-string is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

Given a series of ones and zeroes in a string, define a repeated string or rep-string as a string which is created by repeating a substring of the first N characters of the string truncated on the right to the length of the input string, and in which the substring appears repeated at least twice in the original.

For example, the string '10011001100' is a rep-string as the leftmost four characters of '1001' are repeated three times and truncated on the right to give the original string.

Note that the requirement for having the repeat occur two or more times means that the repeating unit is never longer than half the length of the input string.

The task is to:

  • Write a function/subroutine/method/... that takes a string and returns an indication of if it is a rep-string and the repeated string. (Either the string that is repeated, or the number of repeated characters would suffice).
  • There may be multiple sub-strings that make a string a rep-string - in that case an indication of all, or the longest, or the shortest would suffice.
  • Use the function to indicate the repeating substring if any, in the following:
'1001110011'
'1110111011'
'0010010010'
'1010101010'
'1111111111'
'0100101101'
'0100100'
'101'
'11'
'00'
'1'
  • Show your output on this page.

D

Two different algorithms. The second is from the Perl 6 entry. <lang d>import std.stdio, std.string, std.conv, std.range, std.string,

      std.algorithm;

size_t repString1(in string s) pure /*nothrow*/ {

   foreach_reverse (immutable n; 1 .. s.length / 2 + 1)
       if (s.take(n).cycle.take(s.length).equal(s))
           return n;
   return 0;

}

size_t repString2(in string s) /*pure nothrow*/ {

   immutable bits = s.to!ulong(2);
   foreach_reverse (immutable left; 1 .. s.length / 2 + 1) {
       immutable right = s.length - left;
       if ((bits ^ (bits >> left)) == ((bits >> right) << right))
           return left;
   }
   return 0;

}

void main() {

   immutable words = "1001110011 1110111011 0010010010 1010101010
                      1111111111 0100101101 0100100 101 11 00 1";
   foreach (w; words.split) {
       immutable r1 = w.repString1;
       assert(r1 == w.repString2);
       if (r1)
           // w.chunks(r).join(" ").writeln;
           std.range.chunks(w.dtext, r1).join(" ").writeln;
       else
           writeln(w, " (no repeat)");
   }

}</lang>

Output:
10011 10011
1110 1110 11
001 001 001 0
1010 1010 10
11111 11111
0100101101 (no repeat)
010 010 0
101 (no repeat)
1 1
0 0
1 (no repeat)

Icon and Unicon

The following works in both languages.

<lang unicon>procedure main(A)

  every write(s := !A,": ",(repString(s) | "Not a rep string!")\1)

end

procedure repString(s)

   rs := s[1+:*s/2]
   while (*rs > 0) & (s ~== lrepl(rs,*s,rs)) do rs := rs[1:-1]
   return (*rs > 0, rs)

end

procedure lrepl(s1,n,s2) # The standard left() procedure won't work.

   while *s1 < n do s1 ||:= s2
   return s1[1+:n]

end</lang>

Sample output:

->rs 1110111011 0010010010 1010101010 1111111111 0100101101 0100100 101 11 1
1110111011: 1110
0010010010: 001
1010101010: 1010
1111111111: 11111
0100101101: Not a rep string!
0100100: 010
101: Not a rep string!
11: 1
1: Not a rep string!
->
<pre>

=={{header|J}}==
Here's a test:

<lang j>isRepStr=: +./@((] -: $@] $ $)"0 1~1+[:i.<.@-:@#)</lang>

Example use:

<lang j>   isRepStr'1001110011'
1
   isRepStr'1110111011'
1
   isRepStr'0010010010'
1
   isRepStr'1010101010'
1
   isRepStr'1111111111'
1
   isRepStr'0100101101'
0
   isRepStr'0100100'
1
   isRepStr'101'
0
   isRepStr'11'
1
   isRepStr'00'
1
   isRepStr'1'
0</lang>

We could also report the lengths of the repeated prefix, though this seems more arbitrary:

<lang j>nRepStr=: 0 -.~ (([ * ] -: $@] $ $)"0 1~1+[:i.<.@-:@#)</lang>

With the above examples:

<lang j>   nRepStr'1001110011'
5
   nRepStr'1110111011'
4
   nRepStr'0010010010'
3
   nRepStr'1010101010'
2 4
   nRepStr'1111111111'
1 2 3 4 5
   nRepStr'0100101101'

   nRepStr'0100100'
3
   nRepStr'101'

   nRepStr'11'
1
   nRepStr'00'
1
   nRepStr'1'

</lang>

Here, the "non-str-rep" cases are indicated by an empty list of prefix lengths.

=={{header|NetRexx}}==
{{trans|REXX}}
<lang NetRexx>/* NetRexx */
options replace format comments java crossref symbols nobinary

/* REXX ***************************************************************
* 11.05.2013 Walter Pachl
**********************************************************************/
runSample(arg)
return

/**
 * Test for rep-strings
 * @param s_str a string to check for rep-strings
 * @return Rexx string: boolean indication of reps, length, repeated value
 */
method repstring(s_str) public static
  s_str_n = s_str.length()
  rep_str = ''
  Loop lx = s_str.length() % 2 to 1 By -1
    If s_str.substr(lx + 1, lx) = s_str.left(lx) Then Leave lx
    End lx
  If lx > 0 Then Do label reps
    rep_str = s_str.left(lx)
    Loop ix = 1 By 1
      If s_str.substr(ix * lx + 1, lx) <> rep_str Then
        Leave ix
      End ix
    If rep_str.copies(s_str_n).left(s_str.length()) <> s_str Then
      rep_str = ''
    End reps
  Return (rep_str.length() > 0) rep_str.length() rep_str

-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
method runSample(arg) public static
  parse arg samples
  if samples = '' then -
    samples = -
      '1001110011' -
      '1110111011' -
      '0010010010' -
      '1010101010' -
      '1111111111' -
      '0100101101' -
      '0100100' -
      '101' -
      '11' -
      '00' -
      '1'

  loop w_ = 1 to samples.words()
    in_str = samples.word(w_)
    parse repstring(in_str) is_rep_str rep_str_len rep_str

    sq = ''''in_str''''
    tstrlen = sq.length().max(20)
    sq=sq.right(tstrlen)
    if is_rep_str then
      Say sq 'has a repetition length of' rep_str_len "i.e. '"rep_str"'"
    else
      Say sq 'is not a repeated string'
    end w_
  return
</lang>
{{out}}
<pre>
        '1001110011' has a repetition length of 5 i.e. '10011'
        '1110111011' has a repetition length of 4 i.e. '1110'
        '0010010010' has a repetition length of 3 i.e. '001'
        '1010101010' has a repetition length of 4 i.e. '1010'
        '1111111111' has a repetition length of 5 i.e. '11111'
        '0100101101' is not a repeated string
           '0100100' has a repetition length of 3 i.e. '010'
               '101' is not a repeated string
                '11' has a repetition length of 1 i.e. '1'
                '00' has a repetition length of 1 i.e. '0'
                 '1' is not a repeated string

Perl

<lang perl>foreach (qw(1001110011 1110111011 0010010010 1010101010 1111111111 0100101101 0100100 101 11 00 1)) {

   print "$_\n";
   if (/^(.+)\1+(.*$)(?(?{ substr($1, 0, length $2) eq $2 })|(?!))/) {
       print ' ' x length $1, "$1\n\n";
   } else {
       print " (no repeat)\n\n";
   }

}</lang>

Output:
1001110011
     10011

1110111011
    1110

0010010010
   001

1010101010
    1010

1111111111
     11111

0100101101
 (no repeat)

0100100
   010

101
 (no repeat)

11
 1

00
 0

1
 (no repeat)

Perl 6

<lang perl6>for <1001110011 1110111011 0010010010 1010101010 1111111111 0100101101 0100100 101 11 00 1> {

   if /^ (.+) $0+: (.*$) <?{ $0.substr(0,$1.chars) eq $1 }> / {

my $rep = $0.chars; say .substr(0,$rep), .substr($rep,$rep).trans('01' => 'πŸ˜πŸ™'), .substr($rep*2);

   }
   else {

say "$_ (no repeat)";

   }

}</lang>

Output:
10011πŸ™πŸ˜πŸ˜πŸ™πŸ™
1110πŸ™πŸ™πŸ™πŸ˜11
001πŸ˜πŸ˜πŸ™0010
1010πŸ™πŸ˜πŸ™πŸ˜10
11111πŸ™πŸ™πŸ™πŸ™πŸ™
0100101101 (no repeat)
010πŸ˜πŸ™πŸ˜0
101 (no repeat)
1πŸ™
0𝟘
1 (no repeat)

Here's a technique that relies on the fact that XORing the shifted binary number should set all the lower bits to 0 if there are repeats. (The cool thing is that shift will automatically throw away the bits on the right that you want thrown away.) This produces the same output as above. <lang perl6>sub repstr(Str $s) {

   my $bits = :2($s);
   for reverse 1 .. $s.chars div 2 -> $left {

my $right = $s.chars - $left; return $left if $bits +^ ($bits +> $left) == $bits +> $right +< $right;

   }

}


for '1001110011 1110111011 0010010010 1010101010 1111111111 0100101101 0100100 101 11 00 1'.words {

   if repstr $_ -> $rep {

say .substr(0,$rep), .substr($rep,$rep).trans('01' => 'πŸ˜πŸ™'), .substr($rep*2);

   }
   else {

say "$_ (no repeat)";

   }

}</lang>

Python

Python: Procedural

<lang python>def is_repeated(text):

   'check if the first part of the string is repeated throughout the string'
   for x in range(len(text)//2, 0, -1):
       if text.startswith(text[x:]): return x
   return 0

matchstr = """\ 1001110011 1110111011 0010010010 1010101010 1111111111 0100101101 0100100 101 11 00 1 """ for line in matchstr.split():

   ln = is_repeated(line)
   print('%r has a repetition length of %i i.e. %s' 
          % (line, ln, repr(line[:ln]) if ln else '*not* a rep-string'))</lang>
Output:
'1001110011' has a repetition length of 5 i.e. '10011'
'1110111011' has a repetition length of 4 i.e. '1110'
'0010010010' has a repetition length of 3 i.e. '001'
'1010101010' has a repetition length of 4 i.e. '1010'
'1111111111' has a repetition length of 5 i.e. '11111'
'0100101101' has a repetition length of 0 i.e. *not* a rep-string
'0100100' has a repetition length of 3 i.e. '010'
'101' has a repetition length of 0 i.e. *not* a rep-string
'11' has a repetition length of 1 i.e. '1'
'00' has a repetition length of 1 i.e. '0'
'1' has a repetition length of 0 i.e. *not* a rep-string

Python: Functional

This returns all the possible repeated substrings <lang python>>>> def reps(text):

   return [text[:x] for x in range(1, 1 + len(text) // 2)
           if text.startswith(text[x:])]

>>> matchstr = """\ 1001110011 1110111011 0010010010 1010101010 1111111111 0100101101 0100100 101 11 00 1 """ >>> print('\n'.join('%r has reps %r' % (line, reps(line)) for line in matchstr.split())) '1001110011' has reps ['10011'] '1110111011' has reps ['1110'] '0010010010' has reps ['001'] '1010101010' has reps ['10', '1010'] '1111111111' has reps ['1', '11', '111', '1111', '11111'] '0100101101' has reps [] '0100100' has reps ['010'] '101' has reps [] '11' has reps ['1'] '00' has reps ['0'] '1' has reps [] >>> </lang>

Python: Regexp

This version, inspired by the Perl 6 entry uses the regexp substitute where what the match is substituted with is returned by a function. <lang python>import re

matchstr = """\ 1001110011 1110111011 0010010010 1010101010 1111111111 0100101101 0100100 101 11 00 1"""

def _checker(matchobj):

   g0, (g1, g2, g3, g4) = matchobj.group(0), matchobj.groups()
   if not g4 and g1 and g1.startswith(g3):
       return '%r repeats %r' % (g0, g1)
   return '%r is not a rep-string' % (g0,)

def checkit(txt):

   print(re.sub(r'(.+)(\1+)(.*)|(.*)', _checker, txt))

checkit(matchstr)</lang>

Output:
'1001110011' repeats '10011'
'1110111011' repeats '1110'
'0010010010' repeats '001'
'1010101010' repeats '1010'
'1111111111' repeats '11111'
'0100101101' is not a rep-string
'0100100' repeats '010'
'101' is not a rep-string
'11' repeats '1'
'00' repeats '0'
'1' is not a rep-string

Racket

<lang Racket>

  1. lang racket

(define (rep-string str)

 (define len (string-length str))
 (for/or ([n (in-range 1 len)])
   (and (let loop ([from n])
          (or (>= from len)
              (let ([m (min (- len from) n)])
                (and (equal? (substring str from (+ from m))
                             (substring str 0 m))
                     (loop (+ n from))))))
        (substring str 0 n))))

(for ([str '("1001110011"

            "1110111011"
            "0010010010"
            "1010101010"
            "1111111111"
            "0100101101"
            "0100100"
            "101"
            "11"
            "00"
            "1")])
 (printf "~a => ~a\n" str (or (rep-string str) "no repetition")))

</lang>

Output:

1001110011 => 10011
1110111011 => 1110
0010010010 => 001
1010101010 => 10
1111111111 => 1
0100101101 => 01001011
0100100 => 010
101 => 10
11 => 1
00 => 0
1 => no repetition

REXX

version 1

<lang rexx>/* REXX ***************************************************************

  • 11.05.2013 Walter Pachl
  • 14.05.2013 Walter Pachl extend to show additional rep-strings
                                                                                                                                            • /

Call repstring '1001110011' Call repstring '1110111011' Call repstring '0010010010' Call repstring '1010101010' Call repstring '1111111111' Call repstring '0100101101' Call repstring '0100100' Call repstring '101' Call repstring '11' Call repstring '00' Call repstring '1' Exit

repstring: Parse Arg s sq='s' n=length(s) Do l=length(s)%2 to 1 By -1

 If substr(s,l+1,l)=left(s,l) Then Leave
 End

If l>0 Then Do

 rep_str=left(s,l)
 Do i=1 By 1
   If substr(s,i*l+1,l)<>rep_str Then
     Leave
   End
 If left(copies(rep_str,n),length(s))=s Then Do
   Call show_rep rep_str              /* show result                */
   Do i=length(rep_str)-1 To 1 By -1  /* look for shorter rep_str-s */
     rep_str=left(s,i)
     If left(copies(rep_str,n),length(s))=s Then
       Call show_rep rep_str
     End
   End
 Else
   Call show_norep
 End

Else

 Call show_norep

Return

show_rep:

 Parse Arg rs
 Say right(sq,12) 'has a repetition length of' length(rs) 'i.e.' 'rs'
 Return

show_norep:

 Say right(sq,12) 'is not a repeated string'
 Return</lang>

Output:

'1001110011' has a repetition length of 5 i.e. '10011'
'1110111011' has a repetition length of 4 i.e. '1110'
'0010010010' has a repetition length of 3 i.e. '001'
'1010101010' has a repetition length of 4 i.e. '1010'
'1010101010' has a repetition length of 2 i.e. '10'
'1111111111' has a repetition length of 5 i.e. '11111'
'1111111111' has a repetition length of 4 i.e. '1111'
'1111111111' has a repetition length of 3 i.e. '111'
'1111111111' has a repetition length of 2 i.e. '11'
'1111111111' has a repetition length of 1 i.e. '1'
'0100101101' is not a repeated string
   '0100100' has a repetition length of 3 i.e. '010'
       '101' is not a repeated string
        '11' has a repetition length of 1 i.e. '1'
        '00' has a repetition length of 1 i.e. '0'
         '1' is not a repeated string

version 2

No checking is done to validate if the strings are binary strings (an easy-peasy-lemon-squeezy check)).
This version can handle any characters in the string except blanks (which could be used if a stemmed array would be used). <lang rexx>/*REXX pgm determines if a string is a repString, returns min len repStr*/ s=1001110011 1110111011 0010010010 1010101010 1111111111 0100101101 0100100 101 11 00 1

    do k=1  for words(s); _=word(s,k) /*process all the binary strings.*/
    say  right(_,20)  rep_string(_)   /*show the original & the result.*/
    end   /*k*/

exit /*stick a fork in it, we're done.*/ /*───────────────────────────────────REP_STRING subroutine──────────────*/ rep_string: procedure; parse arg x; L=length(x); r_=' rep string='

  do j=1  for L-1;    p=left(x,j);    if j>L%2  then iterate
  if left(copies(p,L*L),L)==x  then   return r_ left(p,15) '(length' j")"
  end   /*j*/

return ' (no repetitions)' /*(sigh)βˆ™βˆ™βˆ™ a failure to find rep*/</lang> output

          1001110011   rep string= 10011           (length 5)
          1110111011   rep string= 1110            (length 4)
          0010010010   rep string= 001             (length 3)
          1010101010   rep string= 10              (length 2)
          1111111111   rep string= 1               (length 1)
          0100101101     (no repetitions)
             0100100   rep string= 010             (length 3)
                 101     (no repetitions)
                  11   rep string= 1               (length 1)
                  00   rep string= 0               (length 1)
                   1     (no repetitions)

Run BASIC

<lang runbasic>a$ = "1001110011 1110111011 0010010010 1010101010 1111111111 0100101101 0100100 101 11 00 1 " a = 1 while word$(a$,a," ") <> ""

 a1$   = word$(a$,a," ")
 n     = len(a1$)
 n2    = max(1,int((n + .5) / 2))
 noRep = 0
 for i = 1 to n2
   a2$ = a1$
   r$  = left$(a1$,i)
   rep = 0
   for j = 1 to n
     if r$ = left$(a2$,i) then 
         a2$ = mid$(a2$,i+1)
         rep = rep + 1
      end if
   next j
   if rep > 1 then 
     print a;chr$(9);a1$;chr$(9);r$;chr$(9);" Length: ";len(r$);" Repeated ";rep;" Times"
     noRep = 1
   end if
 next i
 if noRep = 0 then print a;chr$(9);a1$;" No Repeat"
 a = a + 1

wend</lang>

1	1001110011	10011	 Length: 5 Repeated 2 Times
2	1110111011	1	 Length: 1 Repeated 3 Times
2	1110111011	1110	 Length: 4 Repeated 2 Times
3	0010010010	0	 Length: 1 Repeated 2 Times
3	0010010010	001	 Length: 3 Repeated 3 Times
4	1010101010	10	 Length: 2 Repeated 5 Times
4	1010101010	1010	 Length: 4 Repeated 2 Times
5	1111111111	1	 Length: 1 Repeated 10 Times
5	1111111111	11	 Length: 2 Repeated 5 Times
5	1111111111	111	 Length: 3 Repeated 3 Times
5	1111111111	1111	 Length: 4 Repeated 2 Times
5	1111111111	11111	 Length: 5 Repeated 2 Times
6	0100101101	010	 Length: 3 Repeated 2 Times
7	0100100	        010	 Length: 3 Repeated 2 Times
8	101 No Repeat
9	11	1	 Length: 1 Repeated 2 Times
10	00	0	 Length: 1 Repeated 2 Times
11	1 No Repeat

Tcl

<lang tcl>proc repstring {text} {

   set len [string length $text]
   for {set i [expr {$len/2}]} {$i > 0} {incr i -1} {

set sub [string range $text 0 [expr {$i-1}]] set eq [string repeat $sub [expr {int(ceil($len/double($i)))}]] if {[string equal -length $len $text $eq]} { return $sub }

   }
   error "no repetition"

}</lang> Demonstrating: <lang tcl>foreach sample {

   "1001110011" "1110111011" "0010010010" "1010101010" "1111111111"
   "0100101101" "0100100" "101" "11" "00" "1"

} {

   if {[catch {

set rep [repstring $sample] puts [format "\"%s\" has repetition (length: %d) of \"%s\"" \ $sample [string length $rep] $rep]

   }]} {

puts [format "\"%s\" is not a repeated string" $sample]

   }

}</lang>

Output:
"1001110011" has repetition (length: 5) of "10011"
"1110111011" has repetition (length: 4) of "1110"
"0010010010" has repetition (length: 3) of "001"
"1010101010" has repetition (length: 4) of "1010"
"1111111111" has repetition (length: 5) of "11111"
"0100101101" is not a repeated string
"0100100" has repetition (length: 3) of "010"
"101" is not a repeated string
"11" has repetition (length: 1) of "1"
"00" has repetition (length: 1) of "0"
"1" is not a repeated string