Pseudo-random numbers/Splitmix64

From Rosetta Code
Revision as of 10:09, 16 August 2020 by Chunes (talk | contribs) (→‎{{header|Factor}}: use 2^ instead of ^)
Pseudo-random numbers/Splitmix64 is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

Shiftmix64 is the default pseudo-random number generator algorithm in Java and is included / available in many other languages. It uses a fairly simple algorithm that, though it is considered to be poor for cryptographic purposes, is very fast to calculate, and is "good enough" for many random number needs. It passes several fairly rigorous PRNG "fitness" tests that some more complex algorithms fail.

Shiftmix64 is not recommended for demanding random number requirements, but is often used to calculate initial states for other more complex pseudo-random number generators.

The "standard" shiftmix64 maintains one 64 bit state variable and returns 64 bits of random data with each call.

Basic pseudocode algorithm:

    uint64 state                                  /* The state can be seeded with any (upto) 64 bit integer value. */

    next_int() {
        state += 0x9e3779b97f4a7c15               /* increment the state variable */
        uint64 z = state                          /* copy the state to a working variable */
        z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9  /* xor the variable with the variable right bit shifted 30 then multiply by a constant */
        z = (z ^ (z >> 27)) * 0x94d049bb133111eb  /* xor the variable with the variable right bit shifted 27 then multiply by a constant */
        return z ^ (z >> 31)                      /* return the variable xored with itself right bit shifted 31 */
    }

    next_float() {
        return next_int() / (1 << 64)             /* divide by 2^64 to return a value between 0 and 1 */
    }

The returned value should hold 64 bits of numeric data. If your language does not support unsigned 64 bit integers directly you may need to apply appropriate bitmasks during bitwise operations.

In keeping with the general layout of several recent pseudo-random number tasks:


Task
  • Write a class or set of functions that generates pseudo-random numbers using shiftmix64.
  • Show the first five integers generated using the seed 1234567.
    9086773575395155592
   12075168200337577354
    5360881288835356517
   10975871416297420776
    2494458855142162015    
  • Show that for an initial seed of 987654321, the counts of 100_000 repetitions of floor next_float() * 5 is as follows:
   0: 19929, 1: 19817, 2: 20085, 3: 20122, 4: 20047
  • Show your output here, on this page.


See also


Related tasks

C

Code copied from the reference C implementation used by Java. The results seem to differ from what is listed, however. I am using GNU GCC v7.1.1. <lang c>/* Written in 2015 by Sebastiano Vigna (vigna@acm.org)

To the extent possible under law, the author has dedicated all copyright and related and neighboring rights to this software to the public domain worldwide. This software is distributed without any warranty.

See <http://creativecommons.org/publicdomain/zero/1.0/>. */

  1. include <stdint.h>
  2. include <stdio.h>
  3. include <math.h>

/* This is a fixed-increment version of Java 8's SplittableRandom generator

  See http://dx.doi.org/10.1145/2714064.2660195 and 
  http://docs.oracle.com/javase/8/docs/api/java/util/SplittableRandom.html
  It is a very fast generator passing BigCrush, and it can be useful if
  for some reason you absolutely want 64 bits of state. */

static uint64_t x; /* The state can be seeded with any value. */

uint64_t next() { uint64_t z = (x += 0x9e3779b97f4a7c15); z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9; z = (z ^ (z >> 27)) * 0x94d049bb133111eb; return z ^ (z >> 31); }

double next_float() {

   return next() / pow(2.0, 64);

}

int main() {

   int i, j;
   x = 1234567;
   for(i = 0; i < 5; ++i)
       printf("%llu\n", next()); /* needed to use %lu verb for GCC 7.5.0-3 */
   x = 987654321;
   int vec5[5] = {0, 0, 0, 0, 0};
   for(i = 0; i < 100000; ++i) {
       j = next_float() * 5.0;
       vec5[j] += 1;
   }
   for(i = 0; i < 5; ++i)
       printf("%d: %d  ", i, vec5[i]);

}

</lang>

Output:
6457827717110365317
3203168211198807973
9817491932198370423
4593380528125082431
16408922859458223821
0: 20027  1: 19892  2: 20073  3: 19978  4: 20030  

Factor

<lang factor>USING: io kernel math math.bitwise math.functions math.statistics namespaces prettyprint sequences ;

SYMBOL: state

seed ( n -- ) 64 bits state set ;
next-int ( -- n )
   0x9e3779b97f4a7c15 state [ + 64 bits ] change
   state get -30 0xbf58476d1ce4e5b9 -27 0x94d049bb133111eb -31 1
   [ [ dupd shift bitxor ] dip * 64 bits ] 2tri@ ;
next-float ( -- x ) next-int 64 2^ /f ;

! Test next-int "Seed: 1234567; first five integer values" print 1234567 seed 5 [ next-int . ] times nl

! Test next-float "Seed: 987654321; first 100,000 float values histogram" print 987654321 seed 100,000 [ next-float 5 * >integer ] replicate histogram .</lang>

Output:
Seed: 1234567; first five integer values
6457827717110365317
3203168211198807973
9817491932198370423
4593380528125082431
16408922859458223821

Seed: 987654321; first 100,000 float values histogram
H{ { 0 20027 } { 1 19892 } { 2 20073 } { 3 19978 } { 4 20030 } }

Go

Same answers as C and Wren. <lang Go>package main

import (

   "fmt"
   "math"

)

type Splitmix64 struct{ state uint64 }

func Splitmix64New(state uint64) *Splitmix64 { return &Splitmix64{state} }

func (sm64 *Splitmix64) nextInt() uint64 {

   sm64.state += 0x9e3779b97f4a7c15
   z := sm64.state
   z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9
   z = (z ^ (z >> 27)) * 0x94d049bb133111eb
   return z ^ (z >> 31)

}

func (sm64 *Splitmix64) nextFloat() float64 {

   return float64(sm64.nextInt()) / (1 << 64)

}

func main() {

   randomGen := Splitmix64New(1234567)
   for i := 0; i < 5; i++ {
       fmt.Println(randomGen.nextInt())
   }
   var counts [5]int
   randomGen = Splitmix64New(987654321)
   for i := 0; i < 1e5; i++ {
       j := int(math.Floor(randomGen.nextFloat() * 5))
       counts[j]++
   }
   fmt.Println("\nThe counts for 100,000 repetitions are:")
   for i := 0; i < 5; i++ {
       fmt.Printf("  %d : %d\n", i, counts[i])
   }

}</lang>

Output:
6457827717110365317
3203168211198807973
9817491932198370423
4593380528125082431
16408922859458223821

The counts for 100,000 repetitions are:
  0 : 20027
  1 : 19892
  2 : 20073
  3 : 19978
  4 : 20030

Raku

Works with: Rakudo version 2020.07

<lang perl6>class splitmix64 {

   has $!state;
   submethod BUILD ( Int :$seed where * >= 0 = 1 ) { $!state = $seed }
   method next-int {
       my $next = $!state += 0x9e3779b97f4a7c15;
       $next  +^= ($next +> 30) * 0xbf58476d1ce4e5b9;
       $next  +^= ($next +> 27) * 0x94d049bb133111eb;
       ($next +^  ($next +> 31)) +& (2⁶⁴ - 1);
   }
   method next-rat { self.next-int / 2⁶⁴ }

}

  1. Test next-int

say 'Seed: 1234567; first five Int values'; my $rng = splitmix64.new( :seed(1234567) ); .say for $rng.next-int xx 5;


  1. Test next-rat (since these are rational numbers by default)

say "\nSeed: 987654321; first 1e5 Rat values histogram"; $rng = splitmix64.new( :seed(987654321) ); say ( ($rng.next-rat * 5).floor xx 100_000 ).Bag;</lang>

Output:
Seed: 1234567; first five Int values
9086773575395155592
12075168200337577354
5360881288835356517
10975871416297420776
2494458855142162015

Seed: 987654321; first 1e5 Rat values histogram
Bag(0(19929) 1(19817) 2(20085) 3(20122) 4(20047)

Wren

Library: Wren-big

No 64 bit integers so we use BigInt with a mask.

Same answers as C reference implementation above. Can't see any essential difference between that and the pseudo-code. Raku's next-int method may need intermediate masks. <lang ecmascript>import "/big" for BigInt

var Const1 = BigInt.fromBaseString("9e3779b97f4a7c15", 16) var Const2 = BigInt.fromBaseString("bf58476d1ce4e5b9", 16) var Const3 = BigInt.fromBaseString("94d049bb133111eb", 16) var Mask64 = (BigInt.one << 64) - BigInt.one

class Splitmix64 {

   construct new(state) {
       _state  = state
   }
   nextInt {
       _state = (_state + Const1) & Mask64
       var z = _state
       z = ((z ^ (z >> 30)) * Const2) & Mask64
       z = ((z ^ (z >> 27)) * Const3) & Mask64
       return (z ^ (z >> 31)) & Mask64
   }
   nextFloat { nextInt.toNum / 2.pow(64) }

}

var randomGen = Splitmix64.new(BigInt.new(1234567)) for (i in 0..4) System.print(randomGen.nextInt)

var counts = List.filled(5, 0) randomGen = Splitmix64.new(BigInt.new(987654321)) for (i in 1..1e5) {

   var i = (randomGen.nextFloat * 5).floor
   counts[i] = counts[i] + 1

} System.print("\nThe counts for 100,000 repetitions are:") for (i in 0..4) System.print("  %(i) : %(counts[i])")</lang>

Output:
6457827717110365317
3203168211198807973
9817491932198370423
4593380528125082431
16408922859458223821

The counts for 100,000 repetitions are:
  0 : 20027
  1 : 19892
  2 : 20073
  3 : 19978
  4 : 20030