Benford's law: Difference between revisions

Content added Content deleted

Inline

Revision as of 20:15, 21 May 2013

This page uses content from Wikipedia. The original article was at Benford's_law. The list of authors can be seen in the page history. As with Rosetta Code, the text of Wikipedia is available under the GNU FDL. (See links for details on variance)

Benford's law, also called the first-digit law, refers to the frequency distribution of digits in many (but not all) real-life sources of data. In this distribution, the number 1 occurs as the first digit about 30% of the time, while larger numbers occur in that position less frequently: 9 as the first digit less than 5% of the time. This distribution of first digits is the same as the widths of gridlines on a logarithmic scale. Benford's law also concerns the expected distribution for digits beyond the first, which approach a uniform distribution.

This result has been found to apply to a wide variety of data sets, including electricity bills, street addresses, stock prices, population numbers, death rates, lengths of rivers, physical and mathematical constants, and processes described by power laws (which are very common in nature). It tends to be most accurate when values are distributed across multiple orders of magnitude.

A set of numbers is said to satisfy Benford's law if the leading digit $d$ ( $d\in \{1,\ldots ,9\}$ ) occurs with probability

P(d)=\log _{10}(d+1)-\log _{10}(d)=\log _{10}\left(1+{\frac {1}{d}}\right)

For this task, write (a) routine(s) to calculate the distribution of first significant (non-zero) digits in a collection of numbers, then display the actual vs. expected distribution in the way most convenient for your language (table / graph / histogram / whatever).

Use the first 1000 numbers from the Fibonacci sequence as your data set. No need to show how the Fibonacci numbers are obtained. You can generate them or load them from a file; whichever is easiest. Display your actual vs expected distribution.

For extra credit: Show the distribution for one other set of numbers from a page on Wikipedia. State which Wikipedia page it can be obtained from and what the set enumerates. Again, no need to display the actual list of numbers or the code to load them.

See also:

numberphile.com.

C++

<lang C++>//to cope with the big numbers , I used the Class Library for Numbers( CLN ) //if used prepackaged you can compile writing "g++ -lcln yourprogram.cpp -o yourprogram"

include <cln/integer.h>
include <cln/integer_io.h>
include <iostream>
include <algorithm>
include <vector>
include <iomanip>
include <sstream>
include <string>
include <cstdlib>
include <cmath>

using namespace cln ;

class NextNum { public :

  NextNum ( cl_I & a , cl_I & b ) : first( a ) , second ( b ) { }
  cl_I operator( )( ) {
     cl_I result = first + second ;
     first = second ;
     second = result ;
     return result ;
  }

private :

  cl_I first ;
  cl_I second ;

} ;

double findPercentage( const std::vector<cl_I> & fibos , int number ) {

  int sum = 0 ;
  for ( std::vector<cl_I>::const_iterator vlci = fibos.begin( ) ;

vlci != fibos.end( ) ; vlci++ ) {

     std::ostringstream os ;
     fprintdecimal ( os , *vlci ) ;//from header file cln/integer_io.h
     std::string numberstring ( os.str( ) ) ;
     if ( std::atoi( numberstring.substr( 0 , 1 ).c_str( )) == number )

sum++ ;

  }
  double found = static_cast<double>( sum ) / 1000 ;
  return found ;

}

int main( ) {

  std::vector<cl_I> fibonaccis( 1000 ) ;
  fibonaccis[ 0 ] = 0 ;
  fibonaccis[ 1 ] = 1 ;
  cl_I a = 0 ;
  cl_I b = 1 ;
  //since a and b are passed as references to the generator's constructor
  //they are constantly changed !
  std::generate_n( fibonaccis.begin( ) + 2 , 998 , NextNum( a , b ) ) ;
  std::cout << std::endl ;
  std::cout << "                found                    expected\n" ;
  for ( int i = 1 ; i < 10 ; i++ ) {
     double found = findPercentage( fibonaccis , i ) ;
     double expected = std::log10( 1 + 1 / static_cast<double>( i )) ;
     std::cout << i << " :" << std::setw( 16 ) << std::right << found * 100 << " %" ;
     std::cout.precision( 3 ) ;
     std::cout << std::setw( 26 ) << std::right << expected * 100 << " %\n" ;
  }
  return 0 ;

}</lang>

Output:

                found                    expected
1 :            30.1 %                      30.1 %
2 :            17.7 %                      17.6 %
3 :            12.5 %                      12.5 %
4 :             9.5 %                      9.69 %
5 :               8 %                      7.92 %
6 :             6.7 %                      6.69 %
7 :             5.6 %                       5.8 %
8 :             5.3 %                      5.12 %
9 :             4.5 %                      4.58 %

D

Translation of: Scala

<lang d>import std.stdio, std.range, std.math, std.conv, std.bigint;

double[2][9] benford(R)(R seq) if (isForwardRange!R && !isInfinite!R) {

   typeof(return) freqs = 0;
   uint seqLen = 0;
   foreach (d; seq)
       if (d != 0) {
           freqs[d.text[0] - '1'][1]++;
           seqLen++;
       }

   foreach (immutable i, ref p; freqs)
       p = [log10(1.0 + 1.0 / (i + 1)), p[1] / seqLen];
   return freqs;

}

void main() {

   auto fibs = recurrence!q{a[n - 1] + a[n - 2]}(1.BigInt, 1.BigInt);

   writefln("%9s %9s %9s", "Actual", "Expected", "Deviation");
   foreach (immutable i, immutable p; fibs.take(1000).benford)
       writefln("%d: %5.2f%% | %5.2f%% | %5.4f%%",
                i+1, p[1] * 100, p[0] * 100, abs(p[1] - p[0]) * 100);

}</lang>

Output:

   Actual  Expected Deviation
1: 30.10% | 30.10% | 0.0030%
2: 17.70% | 17.61% | 0.0908%
3: 12.50% | 12.49% | 0.0061%
4:  9.60% |  9.69% | 0.0910%
5:  8.00% |  7.92% | 0.0818%
6:  6.70% |  6.69% | 0.0053%
7:  5.60% |  5.80% | 0.1992%
8:  5.30% |  5.12% | 0.1847%
9:  4.50% |  4.58% | 0.0757%

Alternative Version

The output is the same. <lang d>import std.stdio, std.range, std.math, std.conv, std.bigint,

      std.algorithm, std.array;

auto benford(R)(R seq) if (isForwardRange!R && !isInfinite!R) {

   return seq.filter!q{a != 0}.map!q{a.text[0]-'1'}.array.sort().group;

}

void main() {

   auto fibs = recurrence!q{a[n - 1] + a[n - 2]}(1.BigInt, 1.BigInt);
   auto expected = iota(1, 10).map!(d => log10(1.0 + 1.0 / d));

   enum N = 1_000;
   writefln("%9s %9s %9s", "Actual", "Expected", "Deviation");
   foreach (immutable i, immutable f; fibs.take(N).benford)
       writefln("%d: %5.2f%% | %5.2f%% | %5.4f%%", i + 1,
                f * 100.0 / N, expected[i] * 100,
                abs((f / cast(double)N) - expected[i]) * 100);

}</lang>

Erlang

<lang Erlang> -module( benfords_law ). -export( [actual_distribution/1, distribution/1, task/0] ).

actual_distribution( Ns ) -> lists:foldl( fun first_digit_count/2, dict:new(), Ns ).

distribution( N ) -> math:log10( 1 + (1 / N) ).

task() -> Total = 1000, Fibonaccis = fib( Total ), Actual_dict = actual_distribution( Fibonaccis ), Keys = lists:sort( dict:fetch_keys( Actual_dict) ), io:fwrite( "Digit Actual Benfords expected~n" ), lists:foreach( fun( N ) -> io:fwrite( "~p ~p ~p~n", [N, dict:fetch(N, Actual_dict) / Total, distribution(N)] ) end, Keys ).

fib( N ) -> fib( N, 0, 1, [] ). fib( 0, Current, _, Acc ) -> lists:reverse( [Current | Acc] ); fib( N, Current, Next, Acc ) -> fib( N-1, Next, Current+Next, [Current | Acc] ).

first_digit_count( 0, Dict ) -> Dict; first_digit_count( N, Dict ) -> [Key | _] = erlang:integer_to_list( N ), dict:update_counter( Key - 48, 1, Dict ). </lang>

Output:

7> benfords_law:task().
Digit   Actual  Benfords expected
1       0.301   0.3010299956639812
2       0.177   0.17609125905568124
3       0.125   0.12493873660829993
4       0.096   0.09691001300805642
5       0.08    0.07918124604762482
6       0.067   0.06694678963061322
7       0.056   0.05799194697768673
8       0.053   0.05115252244738129
9       0.045   0.04575749056067514

Fortran

FORTRAN 90. Compilation and output of this program using emacs compile command and a fairly obvious Makefile entry: <lang fortran>-*- mode: compilation; default-directory: "/tmp/" -*- Compilation started at Sat May 18 01:13:00

a=./f && make $a && $a f95 -Wall -ffree-form f.F -o f

 0.301030010      0.176091254      0.124938756       9.69100147E-02   7.91812614E-02   6.69467747E-02   5.79919666E-02   5.11525236E-02   4.57575098E-02 THE LAW
 0.300999999      0.177000001      0.125000000       9.60000008E-02   7.99999982E-02   6.70000017E-02   5.70000000E-02   5.29999994E-02   4.50000018E-02 LEADING FIBONACCI DIGIT

Compilation finished at Sat May 18 01:13:00</lang>

<lang fortran>subroutine fibber(a,b,c,d)

 ! compute most significant digits, Fibonacci like.
 implicit none
 integer (kind=8), intent(in) :: a,b
 integer (kind=8), intent(out) :: c,d
 d = a + b
 if (15 .lt. log10(float(d))) then
   c = b/10
   d = d/10
 else
   c = b
 endif

end subroutine fibber

integer function leadingDigit(a)

 implicit none
 integer (kind=8), intent(in) :: a
 integer (kind=8) :: b
 b = a
 do while (9 .lt. b)
   b = b/10
 end do
 leadingDigit = transfer(b,leadingDigit)

end function leadingDigit

real function benfordsLaw(a)

 implicit none
 integer, intent(in) :: a
 benfordsLaw = log10(1.0 + 1.0 / a)

end function benfordsLaw

program benford

 implicit none

 interface

   subroutine fibber(a,b,c,d)
     implicit none
     integer (kind=8), intent(in) :: a,b
     integer (kind=8), intent(out) :: c,d
   end subroutine fibber

   integer function leadingDigit(a)
     implicit none
     integer (kind=8), intent(in) :: a
   end function leadingDigit

   real function benfordsLaw(a)
     implicit none
     integer, intent(in) :: a
   end function benfordsLaw

 end interface

 integer (kind=8) :: a, b, c, d
 integer :: i, count(10)
 data count/10*0/
 a = 1
 b = 1
 do i = 1, 1001
   count(leadingDigit(a)) = count(leadingDigit(a)) + 1
   call fibber(a,b,c,d)
   a = c
   b = d
 end do
 write(6,*) (benfordsLaw(i),i=1,9),'THE LAW'
 write(6,*) (count(i)/1000.0 ,i=1,9),'LEADING FIBONACCI DIGIT'

end program benford</lang>

Haskell

<lang haskell>import qualified Data.Map as M import Data.Char (digitToInt)

fstdigit :: Integer -> Int fstdigit = digitToInt . head . show

n = 1000::Int fibs = 1:1:zipWith (+) fibs (tail fibs) fibdata = map fstdigit $ take n fibs freqs = M.fromListWith (+) $ zip fibdata (repeat 1)

tab :: [(Int, Double, Double)] tab = [(d,

      (fromIntegral (M.findWithDefault 0 d freqs) /(fromIntegral n) ),
       logBase 10.0 $ 1 + 1/(fromIntegral d) ) | d<-[1..9]]

main = print tab</lang>

Output:

[(1,0.301,0.301029995663981),
(2,0.177,0.176091259055681),
(3,0.125,0.1249387366083),
(4,0.096,0.0969100130080564),
(5,0.08,0.0791812460476248),
(6,0.067,0.0669467896306132),
(7,0.056,0.0579919469776867),
(8,0.053,0.0511525224473813),
(9,0.045,0.0457574905606751)]

J

We show the correlation coefficient of Benford's law with the leading digits of the first 1000 Fibonacci numbers is almost unity. <lang J>log10 =: 10&^. benford =: log10@:(1+%) assert '0.30 0.18 0.12 0.10 0.08 0.07 0.06 0.05 0.05' -: 5j2 ": benford >: i. 9

append_next_fib =: , +/@:(_2&{.) assert 5 8 13 -: append_next_fib 5 8

leading_digits =: {.@":&> assert '581' -: leading_digits 5 8 13x

count =: #/.~ /: ~. assert 2 1 3 4 -: count 'XCXBAXACXC' NB. 2 A's, 1 B, 3 C's, and some X's.

normalize =: % +/ assert 1r3 2r3 -: normalize 1 2x

FIB =: append_next_fib ^: (1000-#) 1 1 LDF =: leading_digits FIB

TALLY_BY_KEY =: count LDF assert 9 -: # TALLY_BY_KEY NB. If all of [1-9] are present then we know what the digits are.

mean =: +/ % # center=: - mean mp =: $:~ :(+/ .*) num =: mp&:center den =: %:@:(*&:(+/@:(*:@:center))) r =: num % den NB. r is the LibreOffice correl function assert '_0.982' -: 6j3 ": 1 2 3 r 6 5 3 NB. confirmed using LibreOffice correl function

assert '0.9999' -: 6j4 ": (normalize TALLY_BY_KEY) r benford >: i.9

assert '0.9999' -: 6j4 ": TALLY_BY_KEY r benford >: i.9 NB. Of course we don't need normalization</lang>

Mathematica

<lang mathematica>fibdata = Array[First@IntegerDigits@Fibonacci@# &, 1000]; Table[{d, N@Count[fibdata, d]/Length@fibdata, Log10[1. + 1/d]}, {d, 1,

   9}] // Grid</lang>

Output:

1	0.301	0.30103
2	0.177	0.176091
3	0.125	0.124939
4	0.096	0.09691
5	0.08	0.0791812
6	0.067	0.0669468
7	0.056	0.0579919
8	0.053	0.0511525
9	0.045	0.0457575

Perl

<lang Perl>#!/usr/bin/perl use strict ; use warnings ; use POSIX qw( log10 ) ;

my @fibonacci = ( 0 , 1 ) ; while ( @fibonacci != 1000 ) {

  push @fibonacci , $fibonacci[ -1 ] + $fibonacci[ -2 ] ;

} my @actuals ; my @expected ; for my $i( 1..9 ) {

  my $sum = 0 ;
  map { $sum++ if $_ =~ /\A$i/ } @fibonacci ;
  push @actuals , $sum / 1000  ;
  push @expected , log10( 1 + 1/$i ) ;

} print " Observed Expected\n" ; for my $i( 1..9 ) {

  print "$i : " ;
  my $result = sprintf ( "%.2f" , 100 * $actuals[ $i - 1 ] ) ;
  printf "%11s %%" , $result ;
  $result = sprintf ( "%.2f" , 100 * $expected[ $i - 1 ] ) ;
  printf "%15s %%\n" , $result ;

}</lang>

Output:

 
         Observed         Expected
1 :       30.10 %          30.10 %
2 :       17.70 %          17.61 %
3 :       12.50 %          12.49 %
4 :        9.50 %           9.69 %
5 :        8.00 %           7.92 %
6 :        6.70 %           6.69 %
7 :        5.60 %           5.80 %
8 :        5.30 %           5.12 %
9 :        4.50 %           4.58 %

Perl 6

<lang perl6>sub benford (@a) { @a.grep(/<[1..9]>/)».match(/<[1..9]>/).bag }

sub dump (%distribution, $base = 10) {

   printf "%9s %9s  %s\n", <Actual Expected Deviation>;
   for 1 .. 9 -> $digit {
       my $actual = %distribution{$digit} * 100 / [+] %distribution.values;
       my $expected = (1 + 1 / $digit).log($base) * 100;
       printf "%d: %5.2f%% | %5.2f%% | %.2f%%\n",
         $digit, $actual, $expected, abs($expected - $actual);
   }

}

( 1, 1, 2, *+* ... * )[^1000].&benford.&dump;</lang>

Output: First 1000 Fibonaccis

   Actual  Expected  Deviation
1: 30.10% | 30.10% | 0.00%
2: 17.70% | 17.61% | 0.09%
3: 12.50% | 12.49% | 0.01%
4:  9.60% |  9.69% | 0.09%
5:  8.00% |  7.92% | 0.08%
6:  6.70% |  6.69% | 0.01%
7:  5.60% |  5.80% | 0.20%
8:  5.30% |  5.12% | 0.18%
9:  4.50% |  4.58% | 0.08%

Extra credit: Square Kilometers of land under cultivation, by country / territory. First column from Wikipedia: Land use statistics by country.

   Actual  Expected  Deviation
1: 33.33% | 30.10% | 3.23%
2: 18.31% | 17.61% | 0.70%
3: 13.15% | 12.49% | 0.65%
4:  8.45% |  9.69% | 1.24%
5:  9.39% |  7.92% | 1.47%
6:  5.63% |  6.69% | 1.06%
7:  4.69% |  5.80% | 1.10%
8:  5.16% |  5.12% | 0.05%
9:  1.88% |  4.58% | 2.70%

PL/pgSQL

<lang SQL> WITH recursive constant(val) AS ( select 1000. ) , fib(a,b) AS ( SELECT CAST(0 AS numeric), CAST(1 AS numeric) UNION ALL SELECT b,a+b FROM fib ) , benford(first_digit, probability_real, probability_theoretical) AS ( SELECT *, CAST(log(1. + 1./CAST(first_digit AS INT)) AS NUMERIC(5,4)) probability_theoretical FROM ( SELECT first_digit, CAST(COUNT(1)/(select val from constant) AS NUMERIC(5,4)) probability_real FROM ( SELECT SUBSTRING(CAST(a AS VARCHAR(100)),1,1) first_digit FROM fib WHERE SUBSTRING(CAST(a AS VARCHAR(100)),1,1) <> '0' LIMIT (select val from constant) ) t GROUP BY first_digit ) f ORDER BY first_digit ASC ) select * from benford cross join

    (select cast(corr(probability_theoretical,probability_real) as numeric(5,4)) correlation
     from benford) c

</lang>

Prolog

<lang Prolog>%_________________________________________________________________ % Does the Fibonacci sequence follow Benford's law? %~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ % Fibonacci sequence generator fib(C, [P,S], C, N) :- N is P + S. fib(C, [P,S], Cv, V) :- succ(C, Cn), N is P + S, !, fib(Cn, [S,N], Cv, V).

fib(0, 0). fib(1, 1). fib(C, N) :- fib(2, [0,1], C, N). % Generate from 3rd sequence on

% The benford law calculated benford(D, Val) :- Val is log10(1+1/D).

% Retrieves the first characters of the first 1000 fibonacci numbers % (excluding zero) firstchar(V) :- fib(C,N), N =\= 0, atom_chars(N, [Ch|_]), number_chars(V, [Ch]), (C>999-> !; true).

% Increment the n'th list item (1 based), result -> third argument. incNth(1, [Dh|Dt], [Ch|Dt]) :- !, succ(Dh, Ch). incNth(H, [Dh|Dt], [Dh|Ct]) :- succ(Hn, H), !, incNth(Hn, Dt, Ct).

% Calculate the frequency of the all the list items freq([], D, D). freq([H|T], D, C) :- incNth(H, D, L), !, freq(T, L, C).

% Output the results writeHdr :- format('~t~w~15| - ~t~w\n', ['Benford', 'Measured']). writeData(Benford, Freq) :- format('~t~2f%~15| - ~t~2f%\n', [Benford*100, Freq*100]).

go :- % main goal findall(B, (between(1,9,N), benford(N,B)), Benford), findall(C, firstchar(C), Fc), freq(Fc, Freq), writeHdr, maplist(writeData, Benford, Freq).</lang>

Output:

?- go.
        Benford - Measured
         30.10% - 30.10%
         17.61% - 17.70%
         12.49% - 12.50%
          9.69% - 9.60%
          7.92% - 8.00%
          6.69% - 6.70%
          5.80% - 5.60%
          5.12% - 5.30%
          4.58% - 4.50%

Python

Works with Python 3.X & 2.7 <lang python>from __future__ import division from itertools import islice, count from collections import Counter from math import log10 from random import randint

expected = [log10(1+1/d) for d in range(1,10)]

def fib():

   a,b = 1,1
   while True:
       yield a
       a,b = b,a+b

powers of 3 as a test sequence

def power_of_threes():

   return (3**k for k in count(0))

def heads(s):

   for a in s: yield int(str(a)[0])

def show_dist(title, s):

   c = Counter(s)
   size = sum(c.values())
   res = [c[d]/size for d in range(1,10)]

   print("\n%s Benfords deviation" % title)
   for r, e in zip(res, expected):
       print("%5.1f%% %5.1f%%  %5.1f%%" % (r*100., e*100., abs(r - e)*100.))

def rand1000():

   while True: yield randint(1,9999)

if __name__ == '__main__':

   show_dist("fibbed", islice(heads(fib()), 1000))
   show_dist("threes", islice(heads(power_of_threes()), 1000))

   # just to show that not all kind-of-random sets behave like that
   show_dist("random", islice(heads(rand1000()), 10000))</lang>

Output:

fibbed Benfords deviation
 30.1%  30.1%    0.0%
 17.7%  17.6%    0.1%
 12.5%  12.5%    0.0%
  9.6%   9.7%    0.1%
  8.0%   7.9%    0.1%
  6.7%   6.7%    0.0%
  5.6%   5.8%    0.2%
  5.3%   5.1%    0.2%
  4.5%   4.6%    0.1%

threes Benfords deviation
 30.0%  30.1%    0.1%
 17.7%  17.6%    0.1%
 12.3%  12.5%    0.2%
  9.8%   9.7%    0.1%
  7.9%   7.9%    0.0%
  6.6%   6.7%    0.1%
  5.9%   5.8%    0.1%
  5.2%   5.1%    0.1%
  4.6%   4.6%    0.0%

random Benfords deviation
 11.2%  30.1%   18.9%
 10.9%  17.6%    6.7%
 11.6%  12.5%    0.9%
 11.1%   9.7%    1.4%
 11.6%   7.9%    3.7%
 11.4%   6.7%    4.7%
 10.3%   5.8%    4.5%
 11.0%   5.1%    5.9%
 10.9%   4.6%    6.3%

Racket

<lang Racket>#lang racket

(define (log10 n) (/ (log n) (log 10)))

(define (first-digit n)

 (quotient n (expt 10 (inexact->exact (floor (log10 n))))))

(define N 10000)

(define fibs

 (let loop ([n N] [a 0] [b 1])
   (if (zero? n) '() (cons b (loop (sub1 n) b (+ a b))))))

(define v (make-vector 10 0)) (for ([n fibs])

 (define f (first-digit n))
 (vector-set! v f (add1 (vector-ref v f))))

(printf "N OBS EXP\n") (define (pct n) (~r (* n 100.0) #:precision 1 #:min-width 4)) (for ([i (in-range 1 10)])

 (printf "~a: ~a% ~a%\n" i
         (pct (/ (vector-ref v i) N))
         (pct (log10 (+ 1 (/ i))))))

Output


N OBS EXP
1: 30.1% 30.1%
2: 17.6% 17.6%
3: 12.5% 12.5%
4: 9.7% 9.7%
5: 7.9% 7.9%
6: 6.7% 6.7%
7: 5.8% 5.8%
8: 5.1% 5.1%
9: 4.6% 4.6%</lang>

REXX

The REXX language practically hasn't any high math functions, so the E, LN, and LOG functions were included herein. Note that the E and LN10 functions return a limited amount of accuracy, and they should be greater than 50 digits (in this case).

Note that prime numbers don't lend themselves to Benford's law. Apologies to those people's eyes looking at the prime number generator. Uf-ta!

For the extra credit stuff, I choose to generate the primes and factorials rather than find a web-page with them listed, as each list is very easy to generate. <lang rexx>/*REXX program demonstrates some common trig functions (30 digits shown)*/ numeric digits 50 /*use only 50 digits for LN, LOG.*/ parse arg N .; if N== then N=1000 /*allow sample size specification*/

              /*══════════════apply Benford's law to Fibonacci numbers.*/

@.=1; do j=3 to N; jm1=j-1; jm2=j-2; @.j=@.jm2+@.jm1; end /*j*/ call show_results "Benford's law applied to" N 'Fibonacci numbers'

              /*══════════════apply Benford's law to prime numbers.    */

p=0; do j=2 until p==N; if \isPrime(j) then iterate; p=p+1; @.p=j;end call show_results "Benford's law applied to" N 'prime numbers'

              /*══════════════apply Benford's law to factorials.       */
        do j=1  for N;      @.j=!(j);  end  /*j*/

call show_results "Benford's law applied to" N 'factorial products' exit /*stick a fork in it, we're done.*/ /*──────────────────────────────────SHOW_RESULTS subroutine─────────────*/ show_results: w1=max(length('observed'),length(N-2)) ; say pad=' '; w2=max(length('expected' ),length(N )) say pad 'digit' pad center('observed',w1) pad center('expected',w2) say pad '─────' pad center(,w1,'─') pad center(,w2,'─') pad arg(1) !.=0; do j=1 for N; _=left(@.j,1); !._=!._+1; end /*get 1st digits.*/

       do k=1  for 9                  /*show results for Fibonacci nums*/
       say pad center(k,5) pad center(format(!.k/N,,length(N-2)),w1),
                           pad center(format(log(1+1/k),,length(N)+2),w2)
       end   /*k*/

return /*──────────────────────────────────one─line subroutines───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────*/ !: procedure; parse arg x; !=1; do j=2 to x; !=!*j; end /*j*/; return ! e: return 2.7182818284590452353602874713526624977572470936999595749669676277240766303535 isprime: procedure; parse arg x; if wordpos(x,'2 3 5 7')\==0 then return 1; if x//2==0 then return 0; if x//3==0 then return 0; do j=5 by 6 until j*j>x; if x//j==0 then return 0; if x//(j+2)==0 then return 0; end; return 1 ln10:return 2.30258509299404568401799145468436420760110148862877297603332790096757260967735248023599720508959829834196778404228624863340952546508280675666628736909878168948290720832555468084379989482623319852839350530896538 ln:procedure expose $.;parse arg x,f;if x==10 then do;_=ln10();xx=format(_);if xx\==_ then return xx;end;call e;ig=x>1.5;is=1-2*(ig\==1);ii=0;xx=x;return .ln_comp() .ln_comp:do while ig&xx>1.5|\ig&xx<.5;_=e();do k=-1;iz=xx*_**-is;if k>=0&(ig&iz<1|\ig&iz>.5) then leave;_=_*_;izz=iz;end;xx=izz;ii=ii+is*2**k;end;x=x*e()**-ii-1;z=0;_=-1;p=z;do k=1;_=-_*x;z=z+_/k;if z=p then leave;p=z;end;return z+ii log:return ln(arg(1))/ln(10)</lang> output when using the default input:

    digit     observed      expected
    ─────     ────────      ────────     Benford's law applied to 1000 Fibonacci numbers
      1         0.301       0.301030
      2         0.177       0.176091
      3         0.125       0.124939
      4         0.096       0.096910
      5         0.080       0.079181
      6         0.067       0.066947
      7         0.056       0.057992
      8         0.053       0.051153
      9         0.045       0.045757

    digit     observed      expected
    ─────     ────────      ────────     Benford's law applied to 1000 prime numbers
      1         0.160       0.301030
      2         0.146       0.176091
      3         0.139       0.124939
      4         0.139       0.096910
      5         0.131       0.079181
      6         0.135       0.066947
      7         0.118       0.057992
      8         0.017       0.051153
      9         0.015       0.045757

    digit     observed      expected
    ─────     ────────      ────────     Benford's law applied to 1000 factorial products
      1         0.293       0.301030
      2         0.176       0.176091
      3         0.124       0.124939
      4         0.102       0.096910
      5         0.069       0.079181
      6         0.087       0.066947
      7         0.051       0.057992
      8         0.051       0.051153
      9         0.047       0.045757

output when using 100,000 primes:

    digit     observed      expected
    ─────     ────────      ────────     Benford's law applied to 100000 prime numbers
      1        0.31087      .3010300
      2        0.09142      .1760912
      3        0.08960      .1249387
      4        0.08747      .0969100
      5        0.08615      .0791812
      6        0.08458      .0669467
      7        0.08435      .0579919
      8        0.08326      .0511525
      9        0.08230      .0457574

Scala

<lang scala>// Fibonacci Sequence (begining with 1,1): 1 1 2 3 5 8 13 21 34 55 ... val fibs : Stream[BigInt] = { def series(i:BigInt,j:BigInt):Stream[BigInt] = i #:: series(j, i+j); series(1,0).tail.tail }

/**

* Given a numeric sequence, return the distribution of the most-signicant-digit 
* as expected by Benford's Law and then by actual distribution.
*/

def benford[N:Numeric]( data:Seq[N] ) : Map[Int,(Double,Double)] = {

 import scala.math._
 
 val maxSize = 10000000  // An arbitrary size to avoid problems with endless streams
 
 val size = (data.take(maxSize)).size.toDouble
 
 val distribution = data.take(maxSize).groupBy(_.toString.head.toString.toInt).map{ case (d,l) => (d -> l.size) }
 
 (for( i <- (1 to 9) ) yield { (i -> (log10(1D + 1D / i), (distribution(i) / size))) }).toMap

}

{

 println( "Fibonacci Sequence (size=1000): 1 1 2 3 5 8 13 21 34 55 ...\n" )
 println( "%9s %9s %9s".format( "Actual", "Expected", "Deviation" ) )

 benford( fibs.take(1000) ).toList.sorted foreach { 
   case (k, v) => println( "%d: %5.2f%% | %5.2f%% | %5.4f%%".format(k,v._2*100,v._1*100,math.abs(v._2-v._1)*100) ) 
 }

}</lang>

Output:

Fibonacci Sequence (size=1000): 1 1 2 3 5 8 13 21 34 55 ...

   Actual  Expected Deviation
1: 30.10% | 30.10% | 0.0030%
2: 17.70% | 17.61% | 0.0909%
3: 12.50% | 12.49% | 0.0061%
4:  9.60% |  9.69% | 0.0910%
5:  8.00% |  7.92% | 0.0819%
6:  6.70% |  6.69% | 0.0053%
7:  5.60% |  5.80% | 0.1992%
8:  5.30% |  5.12% | 0.1847%
9:  4.50% |  4.58% | 0.0757%

Tcl

<lang tcl>proc benfordTest {numbers} {

   # Count the leading digits (RE matches first digit in each number,
   # even if negative)
   set accum {1 0 2 0 3 0 4 0 5 0 6 0 7 0 8 0 9 0}
   foreach n $numbers {

if {[regexp {[1-9]} $n digit]} { dict incr accum $digit }

   # Print the report
   puts " digit | measured | theory"
   puts "-------+----------+--------"
   dict for {digit count} $accum {

puts [format "%6d | %7.2f%% | %5.2f%%" $digit \ [expr {$count * 100.0 / [llength $numbers]}] \ [expr {log(1+1./$digit)/log(10)*100.0}]]

}</lang> Demonstrating with Fibonacci numbers: <lang tcl>proc fibs n {

   for {set a 1;set b [set i 0]} {$i < $n} {incr i} {

lappend result [set b [expr {$a + [set a $b]}]]

   }
   return $result

} benfordTest [fibs 1000]</lang>

Output:

 digit | measured | theory
-------+----------+--------
     1 |   30.10% | 30.10%
     2 |   17.70% | 17.61%
     3 |   12.50% | 12.49%
     4 |    9.60% |  9.69%
     5 |    8.00% |  7.92%
     6 |    6.70% |  6.69%
     7 |    5.60% |  5.80%
     8 |    5.30% |  5.12%
     9 |    4.50% |  4.58%