De Bruijn sequences

From Rosetta Code
Revision as of 10:05, 1 September 2019 by rosettacode>Gerard Schildberger (added a new (draft) task, also added the REXX computer programming language for this task.)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
De Bruijn sequences is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

The sequences are named after the Dutch mathematician   Nicolaas Govert de Bruijn.


A note on Dutch capitalization:   Nicolaas' last name is   de Bruijn,   the   de   isn't normally capitalized unless it's the first word in a sentence.   Rosetta Code (more or less by default or by fiat) requires the first word in the task name to be capitalized.


In combinatorial mathematics,   a   de Bruijn sequence   of order   n   on a   size-k   alphabet (computer science)   A   is a cyclic sequence in which every possible   length-n   string (computer science, formal theory)   on   A   occurs exactly once as a contiguous substring.

Such a sequence is denoted by   B(k, n)   and has length   kn,   which is also the number of distinct substrings of length   n   on   A;    
de Bruijn sequences are therefore optimally short.

There are:

                         (k!)k(n-1)   ÷   kn

distinct de Bruijn sequences   B(k, n).


Task

For this Rosetta Code task,   a de Bruijn sequence is to be generated that can be used to shorten a brute-force attack on a   PIN-like   code lock that does not have an "enter" key and accepts the last   n   digits entered.


Note:   automated tell machines (ATMs)   used to work like this,   but their software has been updated to not allow a brute-force attack.


Example

A   digital door lock   with a 4-digit code would have B (10, 4) solutions,   with a length of   10,000   (digits).

Therefore, only at most     10,000 + 3     (as the solutions are cyclic or wrap-around)   presses are needed to open the lock.

Trying all 4-digit codes separately would require   4 × 10,000   or   40,000   presses.


Task requirements
  •   Generate a de Bruijn sequence for a 4-digit (decimal) PIN code.
  •   Show the length of the generated de Bruijn sequence.
  •   (There are many possible de Bruijn sequences that solve this task,   one solution is shown on the discussion page).
  •   Show the first and last   130   digits of the de Bruijn sequence.
  •   Verify that all four-digit (decimal)   1,000   PIN codes are contained within the de Bruijn sequence.
  •   0000, 0001, 0002, 0003,   ...   9996, 9997, 9998, 9999   (note the leading zeros).
  •   Reverse the de Bruijn sequence.
  •   Again, perform the (above) verification test.
  •   Extract the 4,444th digit from the original de Bruijn sequence.
  •   Add   1   (unity) to that digit,   and take the right-most digit of that sum.
  •   Replace the original digit with the above digit   (in the de Bruijn sequence).
  •   Perform the verification test (again).   There should be several PIN codes missing.


(The last requirement is to ensure that the verification tests performs correctly.   The verification processes should list any and all missing PIN codes.)

Show all output here, on this page.


References



REXX

The   de Bruijn   sequence generated by this REXX program is identical to the sequence shown on the   discussion   page   (1st topic). <lang rexx>/*REXX pgm calculates the de Bruijn sequence for all pin numbers (4 digit decimals). */ $= /*initialize the de Bruijn sequence. */

  1. =10; lastNode= (#-2)(#-2)(#-1)(#-2) /*this number is formed when this # ···*/
                                                /*  ··· is skipped near the cycle end. */
 do j=0  for 10;  $= $ || j;  jj= j || j        /*compose the left half of the numbers.*/
                                                /* [↓]     "  right  "   "  "     "    */
                               do k=jj+1  to 99;      z= jj || right(k, 2, 0)
                               if z==lastNode  then iterate    /*the last node skipped.*/
                               if pos(z, $)\==0  then iterate  /*# in sequence? Skip it*/
                               $= $ || z        /* ◄─────────────────────────────────┐ */
                               end   /*k*/      /*append a number to the sequence──◄─┘ */
    do r= jj  to (j || 9);  b= right(r, 2, 0)   /*compose the left half of the numbers.*/
    if b==jj  then iterate
    $= $ || right(b, 2, 0)                      /* [↓]     "  right  "   "  "     "    */
                               do k= b+1  to 99;      z= right(b, 2, 0) || right(k, 2, 0)
                               if pos(z, $)\==0  then iterate  /*# in sequence? Skip it*/
                               $= $ || z        /* ◄─────────────────────────────────┐ */
                               end   /*k*/      /*append a number to the sequence──◄─┘ */
    end   /*r*/
 end      /*j*/
                     @deB= 'de Bruijn sequence' /*literal used in some SAY instructions*/

$= $ || left($, 3) /*append 000*/ /*simulate "wrap-around" de Bruijn seq.*/ say 'length of the' @deB length($) /*display the length of de Bruijn seq.*/ say say 'first 100 digits of the' @deB":" say left($, 100) say say ' last 100 digits of the' @deB":" say right($, 100) say call val $ /*call the CHECK sub for verification*/

              @deB= 'reversed'   @deB           /*next,  we'll check on a reversed seq.*/

$$= reverse($) /*do what a mirror does, reversify it.*/ call val $$ /*call the CHECK sub for verification*/ good= substr($, 4444, 1) /*obtain the 4,444th digit in the seq. */

bad=  right(good+1, 1)                          /*add 1 to it,  obtain rightmost digit.*/

$= overlay(bad,$, 4444) /*put that digit into the 4,444th place*/

              @deB= 'overlaid' subword(@deB, 2) /* [↑] this'll cause a validateion barf*/

call val $ /*call the CHECK sub for verification*/ exit /*stick a fork in it, we're all done. */ /*──────────────────────────────────────────────────────────────────────────────────────*/ val: parse arg $$$; e= 0; _= copies('─',8) /*count of errors (missing PINs) so far*/

    say;      say _ 'validating the'    @deB"." /*display what's happening in the pgm. */
        do pin=0  for 1e4; pin4= right(pin,4,0) /* [↓]  maybe add leading zeros to pin.*/
        if pos(pin4, $$$)\==0  then iterate     /*Was number found?  Just as expected. */
        say 'PIN number '      pin       " wasn't found in"         @deb'.'
        e= e + 1                                /*bump the counter for number of errors*/
        end   /*pin*/                           /* [↑]  validate all 10,000 pin numbers*/
    if e==0  then e= 'No'                       /*Gooder English (sic) the error count.*/
    say _   e   'errors found.'                 /*display the number of errors found.  */
    return</lang>
output:
length of the de Bruijn sequence 10003

first 100 digits of the de Bruijn sequence:
0000100020003000400050006000700080009001100120013001400150016001700180019002100220023002400250026002

 last 100 digits of the de Bruijn sequence:
6989699769986999777787779778877897798779978787978887889789878997979887989799879998888988998989999000


──────── validating the de Bruijn sequence.
──────── No errors found.

──────── validating the reversed de Bruijn sequence.
──────── No errors found.

──────── validating the overlaid de Bruijn sequence.
PIN number  1459  wasn't found in overlaid de Bruijn sequence.
PIN number  4591  wasn't found in overlaid de Bruijn sequence.
PIN number  5814  wasn't found in overlaid de Bruijn sequence.
PIN number  8145  wasn't found in overlaid de Bruijn sequence.
──────── 4 errors found.