Entropy: Difference between revisions

From Rosetta Code
Content added Content deleted
No edit summary
imported>Thebeez
 
(141 intermediate revisions by 70 users not shown)
Line 1: Line 1:
{{task|Mathematics}}
{{task|Mathematics}}
Calculate the [[wp:Entropy (information theory)|information entropy]] (Shannon entropy) of a given input string.


;Task:
Entropy is the [[wp:Expected value|expected value]] of the measure of [[wp:Self-information|information]] content in a system.
In general, the Shannon entropy of a variable <math>X</math> is defined as:
Calculate the Shannon entropy &nbsp; H &nbsp; of a given input string.
:<math>H(X) = \sum_{x\in\Omega} P(x) I(x)</math>
where the information content <math>I(x) = -\log_{b} P(x)</math>.
If the base of the logarithm <math>b = 2</math>, the result is expressed in ''bits'', a [[wp:Units of information|unit of information]].
Therefore, given a string <math>S</math> of length <math>n</math> where <math>P(s_i)</math> is the relative frequency of each character, the entropy of a string in bits is:
:<math>H(S) = -\sum_{i=0}^n P(s_i) \log_2 (P(s_i))</math>
For this task, use "<tt>1223334444</tt>" as an example.
The result should be around 1.84644 bits.


Given the discrete random variable <math>X</math> that is a string of <math>N</math> "symbols" (total characters) consisting of <math>n</math> different characters (n=2 for binary), the Shannon entropy of X in '''bits/symbol''' is :
Related Tasks:
:<math>H_2(X) = -\sum_{i=1}^n \frac{count_i}{N} \log_2 \left(\frac{count_i}{N}\right)</math>


where <math>count_i</math> is the count of character <math>n_i</math>.
:::* [[Fibonacci_word]]
:::* [[Entropy/Narcissist]]


For this task, use X="<tt>1223334444</tt>" as an example. The result should be 1.84644... bits/symbol. This assumes X was a random variable, which may not be the case, or it may depend on the observer.

This coding problem calculates the "specific" or "[[wp:Intensive_and_extensive_properties|intensive]]" entropy that finds its parallel in physics with "specific entropy" S<sup>0</sup> which is entropy per kg or per mole, not like physical entropy S and therefore not the "information" content of a file. It comes from Boltzmann's H-theorem where <math>S=k_B N H</math> where N=number of molecules. Boltzmann's H is the same equation as Shannon's H, and it gives the specific entropy H on a "per molecule" basis.

The "total", "absolute", or "[[wp:Intensive_and_extensive_properties|extensive]]" information entropy is
:<math>S=H_2 N</math> bits
This is not the entropy being coded here, but it is the closest to physical entropy and a measure of the information content of a string. But it does not look for any patterns that might be available for compression, so it is a very restricted, basic, and certain measure of "information". Every binary file with an equal number of 1's and 0's will have S=N bits. All hex files with equal symbol frequencies will have <math>S=N \log_2(16)</math> bits of entropy. The total entropy in bits of the example above is S= 10*18.4644 = 18.4644 bits.

The H function does not look for any patterns in data or check if X was a random variable. For example, X=000000111111 gives the same calculated entropy in all senses as Y=010011100101. For most purposes it is usually more relevant to divide the gzip length by the length of the original data to get an informal measure of how much "order" was in the data.

Two other "entropies" are useful:

Normalized specific entropy:
:<math>H_n=\frac{H_2 * \log(2)}{\log(n)}</math>
which varies from 0 to 1 and it has units of "entropy/symbol" or just 1/symbol. For this example, H<sub>n<\sub>= 0.923.

Normalized total (extensive) entropy:
:<math>S_n = \frac{H_2 N * \log(2)}{\log(n)}</math>
which varies from 0 to N and does not have units. It is simply the "entropy", but it needs to be called "total normalized extensive entropy" so that it is not confused with Shannon's (specific) entropy or physical entropy. For this example, S<sub>n<\sub>= 9.23.

Shannon himself is the reason his "entropy/symbol" H function is very confusingly called "entropy". That's like calling a function that returns a speed a "meter". See section 1.7 of his classic [http://worrydream.com/refs/Shannon%20-%20A%20Mathematical%20Theory%20of%20Communication.pdf A Mathematical Theory of Communication] and search on "per symbol" and "units" to see he always stated his entropy H has units of "bits/symbol" or "entropy/symbol" or "information/symbol". So it is legitimate to say entropy NH is "information".

In keeping with Landauer's limit, the physics entropy generated from erasing N bits is <math>S = H_2 N k_B \ln(2)</math> if the bit storage device is perfectly efficient. This can be solved for H<sub>2</sub>*N to (arguably) get the number of bits of information that a physical entropy represents.

;Related tasks:
:* [[Fibonacci_word]]
:* [[Entropy/Narcissist]]
<br><br>

=={{header|11l}}==
<syntaxhighlight lang="11l">F entropy(source)
DefaultDict[Char, Int] hist
L(c) source
hist[c]++
V r = 0.0
L(v) hist.values()
V c = Float(v) / source.len
r -= c * log2(c)
R r

print(entropy(‘1223334444’))</syntaxhighlight>
{{out}}
<pre>
1.84644
</pre>


=={{header|Ada}}==
=={{header|Ada}}==
Uses Ada 2012.
Uses Ada 2012.
<lang Ada>with Ada.Text_IO, Ada.Float_Text_IO, Ada.Numerics.Elementary_Functions;
<syntaxhighlight lang="ada">with Ada.Text_IO, Ada.Float_Text_IO, Ada.Numerics.Elementary_Functions;


procedure Count_Entropy is
procedure Count_Entropy is
Line 47: Line 84:
Put(Result, Fore => 1, Aft => 5, Exp => 0);
Put(Result, Fore => 1, Aft => 5, Exp => 0);
end;
end;
end Count_Entropy;</lang>
end Count_Entropy;</syntaxhighlight>


=={{header|Aime}}==
=={{header|Aime}}==
<lang aime>integer i, l;
<syntaxhighlight lang="aime">integer c;
real h, v;
record r;
real h, x;
index x;
text s;
data s;


s = argv(1);
for (, c in (s = argv(1))) {
x[c] += 1r;
l = length(s);

i = l;
while (i) {
i -= 1;
rn_a_integer(r, cut(s, i, 1), 1);
}
}


h = 0;
h = 0;
if (r_first(r, s)) {
for (, v in x) {
do {
v /= ~s;
x = r_q_integer(r, s);
h -= v * log2(v);
x /= l;
h -= x * log2(x);
} while (r_greater(r, s, s));
}
}


o_form("/d6/\n", h);</syntaxhighlight>
o_real(6, h);
o_newline();</lang>
Examples:
Examples:
<pre>$ aime -a tmp/entr 1223334444
<pre>$ aime -a tmp/entr 1223334444
Line 84: Line 112:


=={{header|ALGOL 68}}==
=={{header|ALGOL 68}}==
<syntaxhighlight lang="algol68">BEGIN
{{works with|ALGOL 68G|Any - tested with release 2.8.win32}}
<lang algol68># calculate the shannon entropy of a string #
# calculate the shannon entropy of a string #
PROC shannon entropy = ( STRING s )REAL:

PROC shannon entropy = ( STRING s )REAL:
BEGIN
BEGIN

INT string length = ( UPB s - LWB s ) + 1;
INT string length = ( UPB s - LWB s ) + 1;
# count the occurences of each character #

# count the occurances of each character #

[ 0 : max abs char ]INT char count;
[ 0 : max abs char ]INT char count;

FOR char pos FROM LWB char count TO UPB char count DO
FOR char pos FROM LWB char count TO UPB char count DO
char count[ char pos ] := 0
char count[ char pos ] := 0
OD;
OD;

FOR char pos FROM LWB s TO UPB s DO
FOR char pos FROM LWB s TO UPB s DO
char count[ ABS s[ char pos ] ] +:= 1
char count[ ABS s[ char pos ] ] +:= 1
OD;
OD;

# calculate the entropy, we use log base 10 and then convert #
# calculate the entropy, we use log base 10 and then convert #
# to log base 2 after calculating the sum #
# to log base 2 after calculating the sum #

REAL entropy := 0;
REAL entropy := 0;

FOR char pos FROM LWB char count TO UPB char count DO
FOR char pos FROM LWB char count TO UPB char count DO
IF char count[ char pos ] /= 0
IF char count[ char pos ] /= 0
Line 117: Line 136:
FI
FI
OD;
OD;

entropy / log( 2 )
entropy / log( 2 )
END; # shannon entropy #
END; # shannon entropy #




main:
(
# test the shannon entropy routine #
# test the shannon entropy routine #
print( ( shannon entropy( "1223334444" ), newline ) )
print( ( shannon entropy( "1223334444" ), newline ) )

)
END</syntaxhighlight>
</lang>
{{out}}
{{out}}
<pre>
<pre>
Line 136: Line 150:
=={{header|ALGOL W}}==
=={{header|ALGOL W}}==
{{trans|ALGOL 68}}
{{trans|ALGOL 68}}
<lang algolw>begin
<syntaxhighlight lang="algolw">begin
% calculates the shannon entropy of a string %
% calculates the shannon entropy of a string %
% strings are fixed length in algol W and the length is part of the %
% strings are fixed length in algol W and the length is part of the %
Line 189: Line 203:
write( shannon_entropy( "1223334444", 10 ) )
write( shannon_entropy( "1223334444", 10 ) )


end.</lang>
end.</syntaxhighlight>
{{out}}
{{out}}
<pre>
<pre>
1.846439
1.846439
</pre>
</pre>

=={{header|APL}}==
<syntaxhighlight lang="apl">
ENTROPY←{-+/R×2⍟R←(+⌿⍵∘.=∪⍵)÷⍴⍵}

⍝ How it works:
⎕←UNIQUE←∪X←'1223334444'
1234
⎕←TABLE_OF_OCCURENCES←X∘.=UNIQUE
1 0 0 0
0 1 0 0
0 1 0 0
0 0 1 0
0 0 1 0
0 0 1 0
0 0 0 1
0 0 0 1
0 0 0 1
0 0 0 1
⎕←COUNT←+⌿TABLE_OF_OCCURENCES
1 2 3 4
⎕←N←⍴X
10
⎕←RATIO←COUNT÷N
0.1 0.2 0.3 0.4
-+/RATIO×2⍟RATIO
1.846439345
</syntaxhighlight>
{{out}}
<pre>
ENTROPY X
1.846439345
</pre>

=={{header|Arturo}}==
<syntaxhighlight lang="rebol">entropy: function [s][
t: #[]
loop s 'c [
unless key? t c -> t\[c]: 0
t\[c]: t\[c] + 1
]
result: new 0
loop values t 'x ->
'result - (x//(size s)) * log x//(size s) 2

return result
]

print entropy "1223334444"</syntaxhighlight>

{{out}}

<pre>1.846439344671015</pre>


=={{header|AutoHotkey}}==
=={{header|AutoHotkey}}==
<lang AutoHotkey>MsgBox, % Entropy(1223334444)
<syntaxhighlight lang="autohotkey">MsgBox, % Entropy(1223334444)


Entropy(n)
Entropy(n)
Line 213: Line 280:
}
}
return, e
return, e
}</lang>
}</syntaxhighlight>
{{out}}
{{out}}
<pre>1.846440</pre>
<pre>1.846440</pre>


=={{header|AWK}}==
=={{header|AWK}}==
<lang awk>#!/usr/bin/awk -f
<syntaxhighlight lang="awk">#!/usr/bin/awk -f
{
{
N = length
for (i=1; i<= length($0); i++) {
for (i = 1; i <= N; ++i)
H[substr($0,i,1)]++;
++H[substr($0, i, 1)]
N++;
}
}
}


END {
END {
for (i in H) {
for (i in H)
p = H[i]/N;
S += H[i] * log(H[i])
E -= p * log(p);
print (log(N) - S / N) / log(2)
}</syntaxhighlight>
}
print E/log(2);
}</lang>
{{out|Usage}}
{{out|Usage}}
<lang bash> echo 1223334444 |./entropy.awk
<syntaxhighlight lang="sh"> echo 1223334444 |./entropy.awk
1.84644 </lang>
1.84644</syntaxhighlight>

=={{header|BASIC}}==
Works with older (unstructured) Microsoft-style BASIC.
<syntaxhighlight lang="basic">10 DEF FN L(X)=LOG(X)/LOG(2)
20 S$="1223334444"
30 U$=""
40 FOR I=1 TO LEN(S$)
50 K=0
60 FOR J=1 TO LEN(U$)
70 IF MID$(U$,J,1)=MID$(S$,I,1) THEN K=1
80 NEXT J
90 IF K=0 THEN U$=U$+MID$(S$,I,1)
100 NEXT I
110 DIM R(LEN(U$)-1)
120 FOR I=1 TO LEN(U$)
130 C=0
140 FOR J=1 TO LEN(S$)
150 IF MID$(U$,I,1)=MID$(S$,J,1) THEN C=C+1
160 NEXT J
170 R(I-1)=(C/LEN(S$))*FN L(C/LEN(S$))
180 NEXT I
190 E=0
200 FOR I=0 TO LEN(U$)-1
210 E=E-R(I)
220 NEXT I
230 PRINT E</syntaxhighlight>
{{out}}
<pre>1.84643935</pre>

==={{header|QBasic}}===
<syntaxhighlight lang="qbasic">FUNCTION L (X)
L = LOG(X) / LOG(2)
END FUNCTION

S$ = "1223334444"
U$ = ""
FOR I = 1 TO LEN(S$)
K = 0
FOR J = 1 TO LEN(U$)
IF MID$(U$, J, 1) = MID$(S$, I, 1) THEN K = 1
NEXT J
IF K = 0 THEN U$ = U$ + MID$(S$, I, 1)
NEXT I
DIM R(LEN(U$) - 1)
FOR I = 1 TO LEN(U$)
C = 0
FOR J = 1 TO LEN(S$)
IF MID$(U$, I, 1) = MID$(S$, J, 1) THEN C = C + 1
NEXT J
R(I - 1) = (C / LEN(S$)) * L(C / LEN(S$))
NEXT I
E = 0
FOR I = 0 TO LEN(U$) - 1
E = E - R(I)
NEXT I
PRINT E
END</syntaxhighlight>

==={{header|Sinclair ZX81 BASIC}}===
Works with 1k of RAM.
<syntaxhighlight lang="basic"> 10 LET X$="1223334444"
20 LET U$=""
30 FOR I=1 TO LEN X$
40 LET K=0
50 FOR J=1 TO LEN U$
60 IF U$(J)=X$(I) THEN LET K=K+1
70 NEXT J
80 IF K=0 THEN LET U$=U$+X$(I)
90 NEXT I
100 DIM R(LEN U$)
110 FOR I=1 TO LEN U$
120 LET C=0
130 FOR J=1 TO LEN X$
140 IF U$(I)=X$(J) THEN LET C=C+1
150 NEXT J
160 LET R(I)=C/LEN X$*LN (C/LEN X$)/LN 2
170 NEXT I
180 LET E=0
190 FOR I=1 TO LEN U$
200 LET E=E-R(I)
210 NEXT I
220 PRINT E</syntaxhighlight>
{{out}}
<pre>1.8464393</pre>
==={{header|uBasic/4tH}}===
{{Trans|QBasic}}
uBasic/4tH is an integer BASIC only. So, fixed point arithmetic is required go fulfill this task. Some loss of precision is unavoidable.
<syntaxhighlight lang="basic">If Info("wordsize") < 64 Then Print "This program requires a 64-bit uBasic" : End

s := "1223334444"
u := ""
x := FUNC(_Fln(FUNC(_Ntof(2)))) ' calculate LN(2)

For i = 0 TO Len(s)-1
k = 0
For j = 0 TO Len(u)-1
If Peek(u, j) = Peek(s, i) Then k = 1
Next
If k = 0 THEN u = Join(u, Char (Peek (s, i)))
Next

Dim @r(Len(u)-1)

For i = 0 TO Len(u)-1
c = 0
For J = 0 TO Len(s)-1
If Peek(u, i) = Peek (s, j) Then c = c + 1
Next
q = FUNC(_Fdiv(c, Len(s)))
@r(i) = FUNC(_Fmul(q, FUNC(_Fdiv(FUNC(_Fln(q)), x))))
Next

e = 0
For i = 0 To Len(u) - 1
e = e - @r(i)
Next

Print Using "+?.####"; FUNC(_Ftoi(e))

End

_Fln Param (1) : Return (FUNC(_Ln(a@*4))/4)
_Fmul Param (2) : Return ((a@*b@)/16384)
_Fdiv Param (2) : Return ((a@*16384)/b@)
_Ntof Param (1) : Return (a@*16384)
_Ftoi Param (1) : Return ((10000*a@)/16384)

_Ln
Param (1)
Local (2)

c@=681391
If (a@<32768) Then a@=SHL(a@, 16) : c@=c@-726817
If (a@<8388608) Then a@=SHL(a@, 8) : c@=c@-363409
If (a@<134217728) Then a@=SHL(a@, 4) : c@=c@-181704
If (a@<536870912) Then a@=SHL(a@, 2) : c@=c@-90852
If (a@<1073741824) Then a@=SHL(a@, 1) : c@=c@-45426
b@=a@+SHL(a@, -1) : If (AND(b@, 2147483648)) = 0 Then a@=b@ : c@=c@-26573
b@=a@+SHL(a@, -2) : If (AND(b@, 2147483648)) = 0 Then a@=b@ : c@=c@-14624
b@=a@+SHL(a@, -3) : If (AND(b@, 2147483648)) = 0 Then a@=b@ : c@=c@-7719
b@=a@+SHL(a@, -4) : If (AND(b@, 2147483648)) = 0 Then a@=b@ : c@=c@-3973
b@=a@+SHL(a@, -5) : If (AND(b@, 2147483648)) = 0 Then a@=b@ : c@=c@-2017
b@=a@+SHL(a@, -6) : If (AND(b@, 2147483648)) = 0 Then a@=b@ : c@=c@-1016
b@=a@+SHL(a@, -7) : If (AND(b@, 2147483648)) = 0 Then a@=b@ : c@=c@-510
a@=2147483648-a@;
c@=c@-SHL(a@, -15)
Return (c@)</syntaxhighlight>
{{Out}}
<pre>1.8461

0 OK, 0:638</pre>

=={{header|BBC BASIC}}==
{{trans|APL}}
<syntaxhighlight lang="bbcbasic">REM >entropy
PRINT FNentropy("1223334444")
END
:
DEF FNentropy(x$)
LOCAL unique$, count%, n%, ratio(), u%, i%, j%
unique$ = ""
n% = LEN x$
FOR i% = 1 TO n%
IF INSTR(unique$, MID$(x$, i%, 1)) = 0 THEN unique$ += MID$(x$, i%, 1)
NEXT
u% = LEN unique$
DIM ratio(u% - 1)
FOR i% = 1 TO u%
count% = 0
FOR j% = 1 TO n%
IF MID$(unique$, i%, 1) = MID$(x$, j%, 1) THEN count% += 1
NEXT
ratio(i% - 1) = (count% / n%) * FNlogtwo(count% / n%)
NEXT
= -SUM(ratio())
:
DEF FNlogtwo(n)
= LN n / LN 2</syntaxhighlight>
{{out}}
<pre>1.84643934</pre>

=={{header|BQN}}==
<syntaxhighlight lang="bqn">H ← -∘(+´⊢×2⋆⁼⊢)∘((+˝⊢=⌜⍷)÷≠)

H "1223334444"</syntaxhighlight>
{{out}}
<pre>1.8464393446710154</pre>


=={{header|Burlesque}}==
=={{header|Burlesque}}==
<lang burlesque>blsq ) "1223334444"F:u[vv^^{1\/?/2\/LG}m[?*++
<syntaxhighlight lang="burlesque">blsq ) "1223334444"F:u[vv^^{1\/?/2\/LG}m[?*++
1.8464393446710157</lang>
1.8464393446710157</syntaxhighlight>


=={{header|C}}==
=={{header|C}}==
<syntaxhighlight lang="c">#include <stdio.h>

<lang c>#include <stdio.h>
#include <stdlib.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stdbool.h>
#include <string.h>
#include <string.h>
#include <math.h>
#include <math.h>

#define MAXLEN 100 //maximum string length
#define MAXLEN 100 //maximum string length

int makehist(char *S,int *hist,int len){
int makehist(unsigned char *S,int *hist,int len){
int wherechar[256];
int wherechar[256];
int i,histlen;
int i,histlen;
Line 265: Line 515:
return histlen;
return histlen;
}
}

double entropy(int *hist,int histlen,int len){
double entropy(int *hist,int histlen,int len){
int i;
int i;
Line 275: Line 525:
return H;
return H;
}
}

int main(void){
int main(void){
char S[MAXLEN];
unsigned char S[MAXLEN];
int len,*hist,histlen;
int len,*hist,histlen;
double H;
double H;
Line 288: Line 538:
printf("%lf\n",H);
printf("%lf\n",H);
return 0;
return 0;
}</lang>
}</syntaxhighlight>
Examples:
Examples:
<lang>$ ./entropy
<syntaxhighlight lang="text">$ ./entropy
1223334444
1223334444
1.846439
1.846439
$ ./entropy
$ ./entropy
Rosetta Code is the best site in the world!
Rosetta Code is the best site in the world!
3.646513</lang>
3.646513</syntaxhighlight>


=={{header|C++}}==
=={{header|C sharp|C#}}==
<lang cpp>#include <string>
#include <map>
#include <iostream>
#include <algorithm>
#include <cmath>

double log2( double number ) {
return log( number ) / log( 2 ) ;
}

int main( int argc , char *argv[ ] ) {
std::string teststring( argv[ 1 ] ) ;
std::map<char , int> frequencies ;
for ( char c : teststring )
frequencies[ c ] ++ ;
int numlen = teststring.length( ) ;
double infocontent = 0 ;
for ( std::pair<char , int> p : frequencies ) {
double freq = static_cast<double>( p.second ) / numlen ;
infocontent += freq * log2( freq ) ;
}
infocontent *= -1 ;
std::cout << "The information content of " << teststring
<< " is " << infocontent << " !\n" ;
return 0 ;
}</lang>
{{out}}
<pre>The information content of 1223334444 is 1.84644 !</pre>

=={{header|Clojure}}==
<lang Clojure>(defn entropy [s]
(let [len (count s), log-2 (Math/log 2)]
(->> (frequencies s)
(map (fn [[_ v]]
(let [rf (/ v len)]
(-> (Math/log rf) (/ log-2) (* rf) Math/abs))))
(reduce +))))</lang>
{{out}}
<lang Clojure>(entropy "1223334444")
1.8464393446710154</lang>
=={{header|C sharp}}==
Translation of C++.
Translation of C++.
<lang csharp>
<syntaxhighlight lang="csharp">
using System;
using System;
using System.Collections.Generic;
using System.Collections.Generic;
Line 380: Line 589:
}
}
}
}
</syntaxhighlight>
</lang>
{{out}}
{{out}}
<pre>The Entropy of 1223334444 is 1.84643934467102</pre>
<pre>The Entropy of 1223334444 is 1.84643934467102</pre>
Without using Hashtables or Dictionaries:
Without using Hashtables or Dictionaries:
<lang csharp>using System;
<syntaxhighlight lang="csharp">using System;
namespace Entropy
namespace Entropy
{
{
Line 426: Line 635:
}
}
}
}
}</lang>
}</syntaxhighlight>

=={{header|C++}}==
<syntaxhighlight lang="cpp">#include <string>
#include <map>
#include <iostream>
#include <algorithm>
#include <cmath>

double log2( double number ) {
return log( number ) / log( 2 ) ;
}

int main( int argc , char *argv[ ] ) {
std::string teststring( argv[ 1 ] ) ;
std::map<char , int> frequencies ;
for ( char c : teststring )
frequencies[ c ] ++ ;
int numlen = teststring.length( ) ;
double infocontent = 0 ;
for ( std::pair<char , int> p : frequencies ) {
double freq = static_cast<double>( p.second ) / numlen ;
infocontent -= freq * log2( freq ) ;
}
std::cout << "The information content of " << teststring
<< " is " << infocontent << std::endl ;
return 0 ;
}</syntaxhighlight>
{{out}}
<pre>(entropy "1223334444")
The information content of 1223334444 is 1.84644</pre>

=={{header|Clojure}}==
<syntaxhighlight lang="clojure">(defn entropy [s]
(let [len (count s), log-2 (Math/log 2)]
(->> (frequencies s)
(map (fn [[_ v]]
(let [rf (/ v len)]
(-> (Math/log rf) (/ log-2) (* rf) Math/abs))))
(reduce +))))</syntaxhighlight>
{{out}}
<syntaxhighlight lang="clojure">(entropy "1223334444")
1.8464393446710154</syntaxhighlight>

=={{header|CLU}}==
<syntaxhighlight lang="clu">% NOTE: when compiling with Portable CLU,
% this program needs to be merged with 'useful.lib' to get log()
%
% pclu -merge $CLUHOME/lib/useful.lib -compile entropy.clu

shannon = proc (s: string) returns (real)
% find the frequency of each character
freq: array[int] := array[int]$fill(0, 256, 0)
for c: char in string$chars(s) do
i: int := char$c2i(c)
freq[i] := freq[i] + 1
end
% calculate the component for each character
h: real := 0.0
rlen: real := real$i2r(string$size(s))
for i: int in array[int]$indexes(freq) do
if freq[i] ~= 0 then
f: real := real$i2r(freq[i]) / rlen
h := h - f * log(f) / log(2.0)
end
end
return (h)
end shannon

start_up = proc ()
po: stream := stream$primary_output()
stream$putl(po, f_form(shannon("1223334444"), 1, 6))
end start_up </syntaxhighlight>
{{out}}
<pre>1.846439</pre>


=={{header|CoffeeScript}}==
=={{header|CoffeeScript}}==
<lang coffeescript>entropy = (s) ->
<syntaxhighlight lang="coffeescript">entropy = (s) ->
freq = (s) ->
freq = (s) ->
result = {}
result = {}
Line 441: Line 726:
((frq[f]/n for f of frq).reduce ((e, p) -> e - p * Math.log(p)), 0) * Math.LOG2E
((frq[f]/n for f of frq).reduce ((e, p) -> e - p * Math.log(p)), 0) * Math.LOG2E


console.log "The entropy of the string '1223334444' is #{entropy '1223334444'}"</lang>
console.log "The entropy of the string '1223334444' is #{entropy '1223334444'}"</syntaxhighlight>
{{out}}
{{out}}
<pre>The entropy of the string '1223334444' is 1.8464393446710157</pre>
<pre>The entropy of the string '1223334444' is 1.8464393446710157</pre>


=={{header|Common Lisp}}==
=={{header|Common Lisp}}==
Not very Common Lisp-y version:
<lang lisp>(defun entropy (string)
<syntaxhighlight lang="lisp">(defun entropy (string)
(let ((table (make-hash-table :test 'equal))
(let ((table (make-hash-table :test 'equal))
(entropy 0))
(entropy 0))
(mapc (lambda (c) (setf (gethash c table) (+ (gethash c table 0) 1)))
(mapc (lambda (c) (setf (gethash c table) (+ (gethash c table 0) 1)))
(coerce string 'list))
(coerce string 'list))
(maphash (lambda (k v) (decf entropy (* (/ v (length input-string)) (log (/ v (length input-string)) 2))))
(maphash (lambda (k v)
(decf entropy (* (/ v (length input-string))
(log (/ v (length input-string)) 2))))
table)
table)
entropy))
entropy))</syntaxhighlight>

</lang>
More like Common Lisp version:

<syntaxhighlight lang="lisp">(defun entropy (string &aux (length (length string)))
(declare (type string string))
(let ((table (make-hash-table)))
(loop for char across string
do (incf (gethash char table 0)))
(- (loop for freq being each hash-value in table
for freq/length = (/ freq length)
sum (* freq/length (log freq/length 2))))))</syntaxhighlight>

=={{header|Crystal}}==
<syntaxhighlight lang="ruby"># Method to calculate sum of Float64 array
def sum(array : Array(Float64))
res = 0
array.each do |n|
res += n
end
res
end

# Method to calculate which char appears how often
def histogram(source : String)
hist = {} of Char => Int32
l = 0
source.each_char do |e|
if !hist.has_key? e
hist[e] = 0
end
hist[e] += 1
end
return Tuple.new(source.size, hist)
end

# Method to calculate entropy from histogram
def entropy(hist : Hash(Char, Int32), l : Int32)
elist = [] of Float64
hist.each do |el|
v = el[1]
c = v / l
elist << (-c * Math.log(c, 2))
end
return sum elist
end

source = "1223334444"
hist_res = histogram source
l = hist_res[0]
h = hist_res[1]
puts ".[Results]."
puts "Length: " + l.to_s
puts "Entropy: " + (entropy h, l).to_s</syntaxhighlight>


=={{header|D}}==
=={{header|D}}==
<lang d>import std.stdio, std.algorithm, std.math;
<syntaxhighlight lang="d">import std.stdio, std.algorithm, std.math;


double entropy(T)(T[] s)
double entropy(T)(T[] s)
Line 472: Line 812:
void main() {
void main() {
"1223334444"d.dup.entropy.writeln;
"1223334444"d.dup.entropy.writeln;
}</lang>
}</syntaxhighlight>
{{out}}
{{out}}
<pre>1.84644</pre>
<pre>1.84644</pre>

=={{header|Delphi}}==
{{libheader| StrUtils}}
{{libheader| Math}}
{{Trans|Pascal}}
Just fix Pascal code to run in Delphi.
<syntaxhighlight lang="delphi">
program Entropytest;

uses
StrUtils,
Math;

type
FArray = array of CARDINAL;

var
strng: string = '1223334444';

// list unique characters in a string
function uniquechars(str: string): string;
var
n: CARDINAL;
begin
Result := '';
for n := 1 to length(str) do
if (PosEx(str[n], str, n) > 0) and (PosEx(str[n], Result, 1) = 0) then
Result := Result + str[n];
end;

// obtain a list of character-frequencies for a string
// given a string containing its unique characters
function frequencies(str, ustr: string): FArray;
var
u, s, p, o: CARDINAL;
begin
SetLength(Result, Length(ustr) + 1);
p := 0;
for u := 1 to length(ustr) do
for s := 1 to length(str) do
begin
o := p;
p := PosEx(ustr[u], str, s);
if (p > o) then
INC(Result[u]);
end;
end;

// Obtain the Shannon entropy of a string
function entropy(s: string): EXTENDED;
var
pf: FArray;
us: string;
i, l: CARDINAL;
begin
us := uniquechars(s);
pf := frequencies(s, us);
l := length(s);
Result := 0.0;
for i := 1 to length(us) do
Result := Result - pf[i] / l * log2(pf[i] / l);
end;

begin
Writeln('Entropy of "', strng, '" is ', entropy(strng): 2: 5, ' bits.');
readln;
end.</syntaxhighlight>

=={{header|EasyLang}}==
<syntaxhighlight>
func entropy s$ .
len d[] 255
for c$ in strchars s$
d[strcode c$] += 1
.
for cnt in d[]
if cnt > 0
prop = cnt / len s$
entr -= (prop * log10 prop / log10 2)
.
.
return entr
.
print entropy "1223334444"
</syntaxhighlight>

=={{header|EchoLisp}}==
<syntaxhighlight lang="scheme">
(lib 'hash)
;; counter: hash-table[key]++
(define (count++ ht k )
(hash-set ht k (1+ (hash-ref! ht k 0))))

(define (hi count n )
(define pi (// count n))
(- (* pi (log2 pi))))
;; (H [string|list]) → entropy (bits)
(define (H info)
(define S (if(string? info) (string->list info) info))
(define ht (make-hash))
(define n (length S))
(for ((s S)) (count++ ht s))
(for/sum ((s (make-set S))) (hi (hash-ref ht s) n)))
</syntaxhighlight>
{{out}}
<syntaxhighlight lang="scheme">
;; by increasing entropy

(H "🔴") → 0
(H "🔵🔴") → 1
(H "1223334444") → 1.8464393446710154
(H "♖♘♗♕♔♗♘♖♙♙♙♙♙♙♙♙♙") → 2.05632607578088
(H "EchoLisp") → 3
(H "Longtemps je me suis couché de bonne heure") → 3.860828877124944
(H "azertyuiopmlkjhgfdsqwxcvbn") → 4.700439718141092
(H (for/list ((i 1000)) (random 1000))) → 9.13772704467521
(H (for/list ((i 100_000)) (random 100_000))) → 15.777516877140766
(H (for/list ((i 1000_000)) (random 1000_000))) → 19.104028424596976

</syntaxhighlight>

=={{header|Elena}}==
{{trans|C#}}
ELENA 6.x :
<syntaxhighlight lang="elena">import system'math;
import system'collections;
import system'routines;
import extensions;
extension op
{
logTwo()
= self.ln() / 2.ln();
}
public program()
{
var input := console.readLine();
var infoC := 0.0r;
var table := Dictionary.new();
input.forEach::(ch)
{
var n := table[ch];
if (nil == n)
{
table[ch] := 1
}
else
{
table[ch] := n + 1
}
};
var freq := 0;
table.forEach::(letter)
{
freq := letter.toInt().realDiv(input.Length);
infoC += (freq * freq.logTwo())
};
infoC *= -1;
console.printLine("The Entropy of ", input, " is ", infoC)
}</syntaxhighlight>
{{out}}
<pre>
The Entropy of 1223334444 is 1.846439344671
</pre>


=={{header|Elixir}}==
=={{header|Elixir}}==
{{works with|Erlang/OTP|18}}
{{works with|Erlang/OTP|18}}
<code>:math.log2</code> was added in OTP 18.
<code>:math.log2</code> was added in OTP 18.
<lang elixir>defmodule RC do
<syntaxhighlight lang="elixir">defmodule RC do
def entropy(str) do
def entropy(str) do
leng = String.length(str)
leng = String.length(str)
String.split(str, "", trim: true)
String.graphemes(str)
|> Enum.group_by(&(&1))
|> Enum.group_by(&(&1))
|> Enum.map(fn{_,value} -> length(value) end)
|> Enum.map(fn{_,value} -> length(value) end)
Line 492: Line 1,005:
end
end


IO.inspect RC.entropy("1223334444")</lang>
IO.inspect RC.entropy("1223334444")</syntaxhighlight>


{{out}}
{{out}}
Line 500: Line 1,013:


=={{header|Emacs Lisp}}==
=={{header|Emacs Lisp}}==
<lang lisp>(defun shannon-entropy (input)
<syntaxhighlight lang="lisp">(defun shannon-entropy (input)
(let ((freq-table (make-hash-table))
(let ((freq-table (make-hash-table))
(entropy 0)
(entropy 0)
Line 514: Line 1,027:
(log (/ v length) 2)))))
(log (/ v length) 2)))))
freq-table)
freq-table)
(- entropy)))</lang>
(- entropy)))</syntaxhighlight>


{{out}}
{{out}}
Line 521: Line 1,034:
as shown below (type ctrl-j at the end of the first line
as shown below (type ctrl-j at the end of the first line
and the output will be placed by emacs on the second line).
and the output will be placed by emacs on the second line).
<lang lisp>(shannon-entropy "1223334444")
<syntaxhighlight lang="lisp">(shannon-entropy "1223334444")
1.8464393446710154</lang>
1.8464393446710154</syntaxhighlight>


=={{header|Erlang}}==
=={{header|Erlang}}==
<syntaxhighlight lang="erlang">
<lang Erlang>
-module( entropy ).
-module( entropy ).


Line 545: Line 1,058:
Frequency = How_many / String_length,
Frequency = How_many / String_length,
{String_length, Acc - (Frequency * math:log(Frequency))}.
{String_length, Acc - (Frequency * math:log(Frequency))}.
</syntaxhighlight>
</lang>


{{out}}
{{out}}
Line 554: Line 1,067:


=={{header|Euler Math Toolbox}}==
=={{header|Euler Math Toolbox}}==
<lang EulerMathToolbox>>function entropy (s) ...
<syntaxhighlight lang="eulermathtoolbox">>function entropy (s) ...
$ v=strtochar(s);
$ v=strtochar(s);
$ m=getmultiplicities(unique(v),v);
$ m=getmultiplicities(unique(v),v);
Line 561: Line 1,074:
$endfunction
$endfunction
>entropy("1223334444")
>entropy("1223334444")
1.84643934467</lang>
1.84643934467</syntaxhighlight>

=={{header|Excel}}==
This solution uses the <code>LAMBDA</code>, <code>LET</code>, and <code>MAP</code> functions introduced into the Microsoft 365 version of Excel in 2021. The <code>LET</code> function is able to use functions as first class citizens. Taking advantage of this makes the solution much simpler. The solution below looks for the string in cell <code>A1</code>.
<syntaxhighlight lang="excel">
=LET(
_MainS,A1,
_N,LEN(_MainS),
_Chars,UNIQUE(MID(_MainS,SEQUENCE(LEN(_MainS),1,1,1),1)),
calcH,LAMBDA(_c,(_c/_N)*LOG(_c/_N,2)),
getCount,LAMBDA(_i,LEN(_MainS)-LEN(SUBSTITUTE(_MainS,_i,""))),
_CharMap,MAP(_Chars,LAMBDA(a, calcH(getCount(a)))),
-SUM(_CharMap)
)
</syntaxhighlight>
_Chars uses the <code>SEQUENCE</code> function to split the text into an array. The <code>UNIQUE</code> function then returns a list of unique characters in the string.

<code>calcH</code> applies the calculation described at the top of the page that will then be summed for each character

<code>getCount</code> uses the <code>SUBSTITUTE</code> method to count the occurrences of a character within the string.

If you needed to re-use this calculation then you could wrap it in a <code>LAMBDA</code> function within the name manager, changing <code>A1</code> to a variable name (e.g. <code>String</code>):
<syntaxhighlight lang="excel">
ShannonEntropyH2=LAMBDA(String,LET(_MainS,String,_N,LEN(_MainS),_Chars,UNIQUE(MID(_MainS,SEQUENCE(LEN(_MainS),1,1,1),1)),calcH,LAMBDA(_c,(_c/_N)*LOG(_c/_N,2)),getCount,LAMBDA(_i,LEN(_MainS)-LEN(SUBSTITUTE(_MainS,_i,""))),_CharMap,MAP(_Chars,LAMBDA(a, calcH(getCount(a)))),-SUM(_CharMap)))
</syntaxhighlight>
Then you can just use the named lambda. E.g. If A1 = 1223334444 then:
<syntaxhighlight lang="excel">
=ShannonEntropyH2(A1)
</syntaxhighlight>
Returns 1.846439345





=={{header|F_Sharp|F#}}==
=={{header|F_Sharp|F#}}==
<lang fsharp>open System
<syntaxhighlight lang="fsharp">open System


let ld x = Math.Log x / Math.Log 2.
let ld x = Math.Log x / Math.Log 2.
Line 575: Line 1,118:
|> Seq.fold (fun e p -> e - p * ld p) 0.
|> Seq.fold (fun e p -> e - p * ld p) 0.


printfn "%f" (entropy "1223334444")</lang>
printfn "%f" (entropy "1223334444")</syntaxhighlight>
{{out}}
{{out}}
<pre>1.846439</pre>
<pre>1.846439</pre>

=={{header|Factor}}==
<syntaxhighlight lang="factor">USING: assocs kernel math math.functions math.statistics
prettyprint sequences ;
IN: rosetta-code.entropy

: shannon-entropy ( str -- entropy )
[ length ] [ histogram >alist [ second ] map ] bi
[ swap / ] with map
[ dup log 2 log / * ] map-sum neg ;
"1223334444" shannon-entropy .
"Factor is my favorite programming language." shannon-entropy .</syntaxhighlight>
{{out}}
<pre>
1.846439344671015
4.04291723248433
</pre>


=={{header|Forth}}==
=={{header|Forth}}==
<lang forth>: flog2 ( f -- f ) fln 2e fln f/ ;
<syntaxhighlight lang="forth">: flog2 ( f -- f ) fln 2e fln f/ ;


create freq 256 cells allot
create freq 256 cells allot
Line 601: Line 1,162:


s" 1223334444" entropy f. \ 1.84643934467102 ok
s" 1223334444" entropy f. \ 1.84643934467102 ok
</syntaxhighlight>
</lang>


=={{header|Fortran}}==
=={{header|Fortran}}==

Please find the GNU/linux compilation instructions along with sample run among the comments at the start of the FORTRAN 2008 source. This program acquires input from the command line argument, thereby demonstrating the fairly new get_command_argument intrinsic subroutine. The expression of the algorithm is a rough translated of the j solution. Thank you.
Please find the GNU/linux compilation instructions along with sample run among the comments at the start of the FORTRAN 2008 source. This program acquires input from the command line argument, thereby demonstrating the fairly new get_command_argument intrinsic subroutine. The expression of the algorithm is a rough translated of the j solution. Thank you.
<syntaxhighlight lang="fortran">
<lang FORTRAN>
!-*- mode: compilation; default-directory: "/tmp/" -*-
!-*- mode: compilation; default-directory: "/tmp/" -*-
!Compilation started at Tue May 21 21:43:12
!Compilation started at Tue May 21 21:43:12
Line 665: Line 1,225:


end program shannonEntropy
end program shannonEntropy
</syntaxhighlight>
</lang>

=={{header|FreeBASIC}}==
=={{header|FreeBASIC}}==
<lang FreeBASIC>' version 25-06-2015
<syntaxhighlight lang="freebasic">' version 25-06-2015
' compile with: fbc -s console
' compile with: fbc -s console


Line 697: Line 1,258:


' empty keyboard buffer
' empty keyboard buffer
While InKey <> "" : Var _key_ = InKey : Wend
While InKey <> "" : Wend
Print : Print "hit any key to end program"
Print : Print "hit any key to end program"
Sleep
Sleep
End</lang>
End</syntaxhighlight>
{{out}}
{{out}}
<pre>Char count
<pre>Char count
Line 709: Line 1,270:


The Entropy of "1223334444" is 1.846439344671015</pre>
The Entropy of "1223334444" is 1.846439344671015</pre>

=={{header|friendly interactive shell}}==
=={{header|friendly interactive shell}}==
Sort of hacky, but friendly interactive shell isn't really optimized for mathematic tasks (in fact, it doesn't even have associative arrays).
Sort of hacky, but friendly interactive shell isn't really optimized for mathematic tasks (in fact, it doesn't even have associative arrays).


<lang fishshell>function entropy
<syntaxhighlight lang="fishshell">function entropy
for arg in $argv
for arg in $argv
set name count_$arg
set name count_$arg
Line 732: Line 1,294:
echo "$entropy / l(2)" | bc -l
echo "$entropy / l(2)" | bc -l
end
end
entropy (echo 1223334444 | fold -w1)</lang>
entropy (echo 1223334444 | fold -w1)</syntaxhighlight>
{{out}}
{{out}}
<pre>1.84643934467101549345</pre>
<pre>1.84643934467101549345</pre>

=={{header|Fōrmulæ}}==

{{FormulaeEntry|page=https://formulae.org/?script=examples/Entropy}}

'''Solution'''

[[File:Fōrmulæ - Entropy 01.png]]

'''Test case'''

[[File:Fōrmulæ - Entropy 02.png]]

[[File:Fōrmulæ - Entropy 03.png]]

[[File:Fōrmulæ - Entropy 04.png]]

[[File:Fōrmulæ - Entropy 05.png]]

=={{header|Go}}==
=={{header|Go}}==
===Go: Slice version===
<lang go>package main
<syntaxhighlight lang="go">package main


import (
import (
"fmt"
"fmt"
"math"
"math"
"strings"
)
)


func main(){
const s = "1223334444"
fmt.Println(H("1223334444"))
}

// for ASCII strings
func H(data string) (entropy float64) {
if data == "" {
return 0
}
for i := 0; i < 256; i++ {
px := float64(strings.Count(data, string(byte(i)))) / float64(len(data))
if px > 0 {
entropy += -px * math.Log2(px)
}
}
return entropy
}</syntaxhighlight>
{{out}}
<pre>
1.8464393446710154
</pre>

===Go: Map version===
<syntaxhighlight lang="go">package main

import (
"fmt"
"math"
)


func main() {
func main() {
const s = "1223334444"

l := float64(0)
m := map[rune]float64{}
m := map[rune]float64{}
for _, r := range s {
for _, r := range s {
m[r]++
m[r]++
l++
}
}
hm := 0.
var hm float64
for _, c := range m {
for _, c := range m {
hm += c * math.Log2(c)
hm += c * math.Log2(c)
}
}
const l = float64(len(s))
fmt.Println(math.Log2(l) - hm/l)
fmt.Println(math.Log2(l) - hm/l)
}</lang>
}</syntaxhighlight>
{{out}}
{{out}}
<pre>
<pre>
Line 763: Line 1,377:


=={{header|Groovy}}==
=={{header|Groovy}}==
<lang groovy>String.metaClass.getShannonEntrophy = {
<syntaxhighlight lang="groovy">String.metaClass.getShannonEntrophy = {
-delegate.inject([:]) { map, v -> map[v] = (map[v] ?: 0) + 1; map }.values().inject(0.0) { sum, v ->
-delegate.inject([:]) { map, v -> map[v] = (map[v] ?: 0) + 1; map }.values().inject(0.0) { sum, v ->
def p = (BigDecimal)v / delegate.size()
def p = (BigDecimal)v / delegate.size()
sum + p * Math.log(p) / Math.log(2)
sum + p * Math.log(p) / Math.log(2)
}
}
}</lang>
}</syntaxhighlight>
Testing
Testing
<lang groovy>[ '1223334444': '1.846439344671',
<syntaxhighlight lang="groovy">[ '1223334444': '1.846439344671',
'1223334444555555555': '1.969811065121',
'1223334444555555555': '1.969811065121',
'122333': '1.459147917061',
'122333': '1.459147917061',
Line 780: Line 1,394:
println "Checking $s has a shannon entrophy of $expected"
println "Checking $s has a shannon entrophy of $expected"
assert sprintf('%.12f', s.shannonEntrophy) == expected
assert sprintf('%.12f', s.shannonEntrophy) == expected
}</lang>
}</syntaxhighlight>
{{out}}
{{out}}
<pre>Checking 1223334444 has a shannon entrophy of 1.846439344671
<pre>Checking 1223334444 has a shannon entrophy of 1.846439344671
Line 791: Line 1,405:


=={{header|Haskell}}==
=={{header|Haskell}}==
<lang haskell>import Data.List
<syntaxhighlight lang="haskell">import Data.List


main = print $ entropy "1223334444"
main = print $ entropy "1223334444"


entropy s =
entropy :: (Ord a, Floating c) => [a] -> c
sum . map lg' . fq' . map (fromIntegral.length) . group . sort $ s
entropy = sum . map lg . fq . map genericLength . group . sort
where lg' c = (c * ) . logBase 2 $ 1.0 / c
where lg c = -c * logBase 2 c
fq' c = let sc = sum c in map (/ sc) c </lang>
fq c = let sc = sum c in map (/ sc) c</syntaxhighlight>


=={{header|Icon}} and {{header|Unicon}}==


Or, inlining with an applicative expression (turns out to be fractionally faster):

<syntaxhighlight lang="haskell">import Data.List (genericLength, group, sort)

entropy
:: (Ord a, Floating c)
=> [a] -> c
entropy =
sum .
map (negate . ((*) <*> logBase 2)) .
(map =<< flip (/) . sum) . map genericLength . group . sort

main :: IO ()
main = print $ entropy "1223334444"</syntaxhighlight>

{{out}}
<pre>1.8464393446710154</pre>

=={{header|Icon}} and {{header|Unicon}}==
Hmmm, the 2nd equation sums across the length of the string (for the
Hmmm, the 2nd equation sums across the length of the string (for the
example, that would be the sum of 10 terms). However, the answer cited
example, that would be the sum of 10 terms). However, the answer cited
Line 809: Line 1,441:
description.
description.


<lang unicon>procedure main(a)
<syntaxhighlight lang="unicon">procedure main(a)
s := !a | "1223334444"
s := !a | "1223334444"
write(H(s))
write(H(s))
Line 819: Line 1,451:
every (h := 0.0) -:= P[c := key(P)] * log(P[c],2)
every (h := 0.0) -:= P[c := key(P)] * log(P[c],2)
return h
return h
end</lang>
end</syntaxhighlight>


{{out}}
{{out}}
Line 829: Line 1,461:


=={{header|J}}==
=={{header|J}}==
'''Solution''':<lang j> entropy=: +/@(-@* 2&^.)@(#/.~ % #)</lang>
'''Solution''':<syntaxhighlight lang="j"> entropy=: +/@(-@* 2&^.)@(#/.~ % #)</syntaxhighlight>
{{out|Example}}
{{out|Example}}
<lang j> entropy '1223334444'
<syntaxhighlight lang="j"> entropy '1223334444'
1.84644
1.84644
entropy i.256
entropy i.256
Line 840: Line 1,472:
1
1
entropy 256$0 1 2 3
entropy 256$0 1 2 3
2</lang>
2</syntaxhighlight>


So it looks like entropy is roughly the number of bits which would be needed to ''distinguish between'' each item in the argument (for example, with perfect compression). Note that in some contexts this might not be the same thing as information because the choice of the items themselves might matter. But it's good enough in contexts with a fixed set of symbols.
So it looks like entropy is roughly the number of bits which would be needed to ''distinguish between'' each item in the argument (for example, with perfect compression). Note that in some contexts this might not be the same thing as information because the choice of the items themselves might matter. But it's good enough in contexts with a fixed set of symbols.
Line 848: Line 1,480:
{{trans|REXX}}
{{trans|REXX}}
{{works with|Java|7+}}
{{works with|Java|7+}}
<lang java5>import java.lang.Math;
<syntaxhighlight lang="java5">import java.lang.Math;
import java.util.Map;
import java.util.Map;
import java.util.HashMap;
import java.util.HashMap;
Line 898: Line 1,530:
return;
return;
}
}
}</lang>
}</syntaxhighlight>
{{out}}
{{out}}
<pre>
<pre>
Line 911: Line 1,543:


=={{header|JavaScript}}==
=={{header|JavaScript}}==
{{works with|ECMA-262 (5.1)}}
{{works with|ECMAScript 2015}}
Calculate the entropy of a string by determining the frequency of each character, then summing each character's probability multiplied by the log base 2 of that same probability, taking the negative of the sum.
The proces function builds a histogram of character frequencies then iterates over it.
<syntaxhighlight lang="javascript">// Shannon entropy in bits per symbol.
function entropy(str) {
const len = str.length


// Build a frequency map from the string.
The entropy function calls into process and evaluates the frequencies as they're passed back.
const frequencies = Array.from(str)
<lang JavaScript>(function(shannon) {
.reduce((freq, c) => (freq[c] = (freq[c] || 0) + 1) && freq, {})
// Create a dictionary of character frequencies and iterate over it.

function process(s, evaluator) {
// Sum the frequency of each character.
var h = Object.create(null), k;
return Object.values(frequencies)
s.split('').forEach(function(c) {
.reduce((sum, f) => sum - f/len * Math.log2(f/len), 0)
h[c] && h[c]++ || (h[c] = 1); });
if (evaluator) for (k in h) evaluator(k, h[k]);
return h;
};
// Measure the entropy of a string in bits per symbol.
shannon.entropy = function(s) {
var sum = 0,len = s.length;
process(s, function(k, f) {
var p = f/len;
sum -= p * Math.log(p) / Math.log(2);
});
return sum;
};
})(window.shannon = window.shannon || {});
// Log the Shannon entropy of a string.
function logEntropy(s) {
console.log('Entropy of "' + s + '" in bits per symbol:', shannon.entropy(s));
}
}

logEntropy('1223334444');
console.log(entropy('1223334444')) // 1.8464393446710154
console.log(entropy('0')) // 0
logEntropy('0');
console.log(entropy('01')) // 1
logEntropy('01');
console.log(entropy('0123')) // 2
logEntropy('0123');
logEntropy('01234567');
console.log(entropy('01234567')) // 3
logEntropy('0123456789abcdef');</lang>
console.log(entropy('0123456789abcdef')) // 4</syntaxhighlight>
{{out}}
{{out}}
<pre>Entropy of "1223334444" in bits per symbol: 1.8464393446710154
<pre>1.8464393446710154
0
Entropy of "0" in bits per symbol: 0
1
Entropy of "01" in bits per symbol: 1
2
Entropy of "0123" in bits per symbol: 2
3
Entropy of "01234567" in bits per symbol: 3
4</pre>
Entropy of "0123456789abcdef" in bits per symbol: 4</pre>
;Another variant
<syntaxhighlight lang="javascript">const entropy = (s) => {
const split = s.split('');
const counter = {};
split.forEach(ch => {
if (!counter[ch]) counter[ch] = 1;
else counter[ch]++;
});


const lengthf = s.length * 1.0;
const counts = Object.values(counter);
return -1 * counts
.map(count => count / lengthf * Math.log2(count / lengthf))
.reduce((a, b) => a + b);
};</syntaxhighlight>
{{out}}
<pre>console.log(entropy("1223334444")); // 1.8464393446710154</pre>


=={{header|jq}}==
=={{header|jq}}==
Line 960: Line 1,596:
The helper function, ''counter'', could be defined as an inner function of ''entropy'', but for the sake of clarity and because it is independently useful,
The helper function, ''counter'', could be defined as an inner function of ''entropy'', but for the sake of clarity and because it is independently useful,
it is defined separately.
it is defined separately.
<lang jq># Input: an array of strings.
<syntaxhighlight lang="jq"># Input: an array of strings.
# Output: an object with the strings as keys, the values of which are the corresponding frequencies.
# Output: an object with the strings as keys, the values of which are the corresponding frequencies.
def counter:
def counter:
Line 969: Line 1,605:
(explode | map( [.] | implode ) | counter
(explode | map( [.] | implode ) | counter
| [ .[] | . * log ] | add) as $sum
| [ .[] | . * log ] | add) as $sum
| ((length|log) - ($sum / length)) / (2|log) ;</lang>
| ((length|log) - ($sum / length)) / (2|log) ;</syntaxhighlight>


{{out|Example}}
{{out|Example}}
<lang jq>"1223334444" | entropy # => 1.8464393446710154</lang>
<syntaxhighlight lang="jq">"1223334444" | entropy # => 1.8464393446710154</syntaxhighlight>

=={{header|Jsish}}==
From Javascript entry.
<syntaxhighlight lang="javascript">/* Shannon entropy, in Jsish */

function values(obj:object):array {
var vals = [];
for (var key in obj) vals.push(obj[key]);
return vals;
}

function entropy(s) {
var split = s.split('');
var counter = {};
split.forEach(function(ch) {
if (!counter[ch]) counter[ch] = 1;
else counter[ch]++;
});

var lengthf = s.length * 1.0;
var counts = values(counter);
return -1 * counts.map(function(count) {
return count / lengthf * (Math.log(count / lengthf) / Math.log(2));
})
.reduce(function(a, b) { return a + b; }
);
};

if (Interp.conf('unitTest')) {
; entropy('1223334444');
; entropy('Rosetta Code');
; entropy('password');
}</syntaxhighlight>

{{out}}
<pre>prompt$ jsish --U entropy.jsi
entropy('1223334444') ==> 1.84643934467102
entropy('Rosetta Code') ==> 3.08496250072116
entropy('password') ==> 2.75</pre>


=={{header|Julia}}==
=={{header|Julia}}==
{{works with|Julia|0.6}}
A oneliner, probably not efficient on very long strings.
<lang Julia>entropy(s)=-sum(x->x*log(2,x), [count(x->x==c,s)/length(s) for c in unique(s)])</lang>
<syntaxhighlight lang="julia">entropy(s) = -sum(x -> x * log(2, x), count(x -> x == c, s) / length(s) for c in unique(s))
@show entropy("1223334444")
{{Out}}
@show entropy([1, 2, 3, 1, 2, 1, 2, 3, 1, 2, 3, 4, 5])</syntaxhighlight>
<pre>julia> entropy("1223334444")

1.8464393446710154</pre>
{{out}}
<pre>entropy("1223334444") = 1.8464393446710154
entropy([1, 2, 3, 1, 2, 1, 2, 3, 1, 2, 3, 4, 5]) = 2.103909910282364</pre>

=={{header|K}}==
{{works with|ngn/k}}
<syntaxhighlight lang="k">entropy: {(`ln[#x]-(+/{x*`ln@x}@+/{x=\:?x}x)%#x)%`ln@2}

entropy "1223334444"</syntaxhighlight>
{{out}}
<pre>1.8464393446710161</pre>

=={{header|Kotlin}}==
<syntaxhighlight lang="kotlin">// version 1.0.6

fun log2(d: Double) = Math.log(d) / Math.log(2.0)

fun shannon(s: String): Double {
val counters = mutableMapOf<Char, Int>()
for (c in s) {
if (counters.containsKey(c)) counters[c] = counters[c]!! + 1
else counters.put(c, 1)
}
val nn = s.length.toDouble()
var sum = 0.0
for (key in counters.keys) {
val term = counters[key]!! / nn
sum += term * log2(term)
}
return -sum
}

fun main(args: Array<String>) {
val samples = arrayOf(
"1223334444",
"1223334444555555555",
"122333",
"1227774444",
"aaBBcccDDDD",
"1234567890abcdefghijklmnopqrstuvwxyz",
"Rosetta Code"
)
println(" String Entropy")
println("------------------------------------ ------------------")
for (sample in samples) println("${sample.padEnd(36)} -> ${"%18.16f".format(shannon(sample))}")
}</syntaxhighlight>
{{out}}
<pre>
String Entropy
------------------------------------ ------------------
1223334444 -> 1.8464393446710154
1223334444555555555 -> 1.9698110652780971
122333 -> 1.4591479170272448
1227774444 -> 1.8464393446710154
aaBBcccDDDD -> 1.9362600275315274
1234567890abcdefghijklmnopqrstuvwxyz -> 5.1699250014423095
Rosetta Code -> 3.0849625007211556
</pre>

=={{header|Ksh}}==
{{works with|ksh93}}
<syntaxhighlight lang="ksh">function entropy {
typeset -i i len=${#1}
typeset -X13 r=0
typeset -Ai counts

for ((i = 0; i < len; ++i))
do
counts[${1:i:1}]+=1
done
for i in "${counts[@]}"
do
r+='i * log2(i)'
done
r='log2(len) - r / len'
print -r -- "$r"
}

printf '%g\n' "$(entropy '1223334444')"</syntaxhighlight>
{{out}}
<pre>1.84644</pre>

=={{header|Lambdatalk}}==
<syntaxhighlight lang="scheme">
{def entropy

{def entropy.count
{lambda {:s :c :i}
{let { {:c {/ {A.get :i :c} {A.length :s}}}
} {* :c {log2 :c}}}}}

{def entropy.sum
{lambda {:s :c}
{- {+ {S.map {entropy.count :s :c}
{S.serie 0 {- {A.length :c} 1}}}}}}}

{lambda {:s}
{entropy.sum {A.split :s} {cdr {W.frequency :s}}}}}
-> entropy

The W.frequency function is explained in rosettacode.org/wiki/Letter_frequency#Lambdatalk

{def txt 1223334444}
-> txt
{def F {W.frequency {txt}}}
-> F
characters: {car {F}} -> [1,2,3,4]
frequencies: {cdr {F}} -> [1,2,3,4]
{entropy {txt}}
-> 1.8464393446710154

{entropy 0}
-> 0
{entropy 00000000000000}
-> 0
{entropy 11111111111111}
-> 0
{entropy 01}
-> 1
{entropy Lambdatalk}
-> 2.8464393446710154
{entropy entropy}
-> 2.807354922057604
{entropy abcdefgh}
-> 3
{entropy Rosetta Code}
-> 3.084962500721156
{entropy Longtemps je me suis couché de bonne heure}
-> 3.8608288771249444
{entropy abcdefghijklmnopqrstuvwxyz}
-> 4.70043971814109
{entropy abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz}
-> 4.70043971814109

</syntaxhighlight>

=={{header|Lang5}}==
<syntaxhighlight lang="lang5">: -rot rot rot ; [] '__A set : dip swap __A swap 1 compress append '__A
set execute __A -1 extract nip ; : nip swap drop ; : sum '+ reduce ;
: 2array 2 compress ; : comb "" split ; : lensize length nip ;
: <group> #( a -- 'a )
grade subscript dup 's dress distinct strip
length 1 2array reshape swap
'A set
: `filter(*) A in A swap select ;
'`filter apply
;

: elements(*) lensize ;
: entropy #( s -- n )
length "<group> 'elements apply" dip /
dup neg swap log * 2 log / sum ;

"1223334444" comb entropy . # 1.84643934467102</syntaxhighlight>


=={{header|Liberty BASIC}}==
=={{header|Liberty BASIC}}==
<syntaxhighlight lang="lb">
<lang lb>
dim countOfChar( 255) ' all possible one-byte ASCII chars
dim countOfChar( 255) ' all possible one-byte ASCII chars


Line 1,015: Line 1,844:
logBase =log( x) /log( 2)
logBase =log( x) /log( 2)
end function
end function
</syntaxhighlight>
</lang>
{{Out}}
{{Out}}
<pre> Characters used and the number of occurrences of each
<pre> Characters used and the number of occurrences of each
Line 1,025: Line 1,854:
The result should be around 1.84644 bits.</pre>
The result should be around 1.84644 bits.</pre>


=={{header|Lua}}==
<syntaxhighlight lang="lua">function log2 (x) return math.log(x) / math.log(2) end


function entropy (X)
local N, count, sum, i = X:len(), {}, 0
for char = 1, N do
i = X:sub(char, char)
if count[i] then
count[i] = count[i] + 1
else
count[i] = 1
end
end
for n_i, count_i in pairs(count) do
sum = sum + count_i / N * log2(count_i / N)
end
return -sum
end


print(entropy("1223334444"))</syntaxhighlight>
=={{header|Lang5}}==
<lang lang5>: -rot rot rot ; [] '__A set : dip swap __A swap 1 compress append '__A
set execute __A -1 extract nip ; : nip swap drop ; : sum '+ reduce ;
: 2array 2 compress ; : comb "" split ; : lensize length nip ;
: <group> #( a -- 'a )
grade subscript dup 's dress distinct strip
length 1 2array reshape swap
'A set
: `filter(*) A in A swap select ;
'`filter apply
;

: elements(*) lensize ;
: entropy #( s -- n )
length "<group> 'elements apply" dip /
dup neg swap log * 2 log / sum ;

"1223334444" comb entropy . # 1.84643934467102</lang>


=={{header|Mathematica}}==
=={{header|Mathematica}} / {{header|Wolfram Language}}==
<lang Mathematica>shE[s_String] := -Plus @@ ((# Log[2., #]) & /@ ((Length /@ Gather[#])/
<syntaxhighlight lang="mathematica">shE[s_String] := -Plus @@ ((# Log[2., #]) & /@ ((Length /@ Gather[#])/
Length[#]) &[Characters[s]])</lang>
Length[#]) &[Characters[s]])</syntaxhighlight>
{{out|Example}}<lang Mathematica> shE["1223334444"]
{{out|Example}}<syntaxhighlight lang="mathematica"> shE["1223334444"]
1.84644
1.84644
shE["Rosetta Code"]
shE["Rosetta Code"]
3.08496</lang>
3.08496</syntaxhighlight>


=={{header|MATLAB}} / {{header|Octave}}==
=={{header|MATLAB}} / {{header|Octave}}==
This version allows for any input vectors,
This version allows for any input vectors,
including strings, floats, negative integers, etc.
including strings, floats, negative integers, etc.
<lang MATLAB>function E = entropy(d)
<syntaxhighlight lang="matlab">function E = entropy(d)
if ischar(d), d=abs(d); end;
if ischar(d), d=abs(d); end;
[Y,I,J] = unique(d);
[Y,I,J] = unique(d);
Line 1,063: Line 1,892:
p = full(H(H>0))/length(d);
p = full(H(H>0))/length(d);
E = -sum(p.*log2(p));
E = -sum(p.*log2(p));
end; </lang>
end; </syntaxhighlight>
{{out|Usage}}
{{out|Usage}}
<lang MATLAB>> entropy('1223334444')
<syntaxhighlight lang="matlab">> entropy('1223334444')
ans = 1.8464</lang>
ans = 1.8464</syntaxhighlight>

=={{header|MiniScript}}==
<syntaxhighlight lang="miniscript">entropy = function(s)
count = {}
for c in s
if count.hasIndex(c) then count[c] = count[c]+1 else count[c] = 1
end for
sum = 0
for x in count.values
countOverN = x / s.len
sum = sum + countOverN * log(countOverN, 2)
end for
return -sum
end function

print entropy("1223334444")</syntaxhighlight>

{{out}}
<pre>1.846439</pre>

=={{header|Modula-2}}==
<syntaxhighlight lang="modula2">MODULE Entropy;
FROM InOut IMPORT WriteString, WriteLn;
FROM RealInOut IMPORT WriteReal;
FROM Strings IMPORT Length;
FROM MathLib IMPORT ln;

PROCEDURE entropy(s: ARRAY OF CHAR): REAL;
VAR freq: ARRAY [0..255] OF CARDINAL;
i, length: CARDINAL;
h, f: REAL;
BEGIN
(* the entropy of the empty string is zero *)
length := Length(s);
IF length = 0 THEN RETURN 0.0; END;
(* find the frequency of each character *)
FOR i := 0 TO 255 DO freq[i] := 0; END;
FOR i := 0 TO length-1 DO
INC(freq[ORD(s[i])]);
END;
(* calculate the component for each character *)
h := 0.0;
FOR i := 0 TO 255 DO
IF freq[i] # 0 THEN
f := FLOAT(freq[i]) / FLOAT(length);
h := h - f * (ln(f) / ln(2.0));
END;
END;
RETURN h;
END entropy;

BEGIN
WriteReal(entropy("1223334444"), 14);
WriteLn;
END Entropy.</syntaxhighlight>
{{out}}
<pre> 1.8464394E+00</pre>


=={{header|NetRexx}}==
=={{header|NetRexx}}==
{{trans|REXX}}
{{trans|REXX}}
<lang NetRexx>/* NetRexx */
<syntaxhighlight lang="netrexx">/* NetRexx */
options replace format comments java crossref savelog symbols
options replace format comments java crossref savelog symbols


Line 1,143: Line 2,031:
end report
end report
return
return
</syntaxhighlight>
</lang>
{{out}}
{{out}}
<pre>
<pre>
Line 1,159: Line 2,047:


=={{header|Nim}}==
=={{header|Nim}}==
<lang nim>import tables, math
<syntaxhighlight lang="nim">import tables, math


proc entropy(s): float =
proc entropy(s: string): float =
var t = initCountTable[char]()
var t = initCountTable[char]()
for c in s: t.inc(c)
for c in s: t.inc(c)
for x in t.values: result -= x/s.len * log2(x/s.len)
for x in t.values: result -= x/s.len * log2(x/s.len)


echo entropy("1223334444")</lang>
echo entropy("1223334444")</syntaxhighlight>




=={{header|Objeck}}==
<syntaxhighlight lang="objeck">use Collection;


class Entropy {
function : native : GetShannonEntropy(result : String) ~ Float {
frequencies := IntMap->New();


each(i : result) {
c := result->Get(i);


if(frequencies->Has(c)) {
count := frequencies->Find(c)->As(IntHolder);
count->Set(count->Get() + 1);
}
else {
frequencies->Insert(c, IntHolder->New(1));
};
};


length := result->Size();
entropy := 0.0;


counts := frequencies->GetValues();
each(i : counts) {
count := counts->Get(i)->As(IntHolder)->Get();
freq := count->As(Float) / length;
entropy += freq * (freq->Log() / 2.0->Log());
};


return -1 * entropy;
}


function : Main(args : String[]) ~ Nil {
inputs := [
"1223334444",
"1223334444555555555",
"122333",
"1227774444",
"aaBBcccDDDD",
"1234567890abcdefghijklmnopqrstuvwxyz",
"Rosetta Code"];


each(i : inputs) {
input := inputs[i];
"Shannon entropy of '{$input}': "->Print();
GetShannonEntropy(inputs[i])->PrintLine();
};
}
}</syntaxhighlight>


Output:
<pre>
Shannon entropy of '1223334444': 1.84644
Shannon entropy of '1223334444555555555': 1.96981
Shannon entropy of '122333': 1.45915
Shannon entropy of '1227774444': 1.84644
Shannon entropy of 'aaBBcccDDDD': 1.93626
Shannon entropy of '1234567890abcdefghijklmnopqrstuvwxyz': 5.16993
Shannon entropy of 'Rosetta Code': 3.08496
</pre>


=={{header|OCaml}}==
;By using a map, purely functional
<syntaxhighlight lang="ocaml">module CharMap = Map.Make(Char)

let entropy s =
  let count map c =
    CharMap.update c (function Some n -> Some (n +. 1.) | None -> Some 1.) map
  and calc _ n sum =
    sum +. n *. Float.log2 n
  in
  let sum = CharMap.fold calc (String.fold_left count CharMap.empty s) 0.
  and len = float (String.length s) in
  Float.log2 len -. sum /. len

let () =
  entropy "1223334444" |> string_of_float |> print_endline</syntaxhighlight>
;By using a mutable Hashtbl
<syntaxhighlight lang="ocaml">
(* pre-bake & return an inner-loop function to bin & assemble a character frequency map *)
let get_fproc (m: (char, int) Hashtbl.t) :(char -> unit) =

  -1.0 *. List.fold_left (fun b x -> b +. calc x) 0.0 relative_probs
</syntaxhighlight>
{{out}}
<pre>1.84643934467</pre>


=={{header|Oforth}}==
<syntaxhighlight lang="oforth">: entropy(s) -- f
   | freq sz |
   s size dup ifZero: [ return ] asFloat ->sz
   ListBuffer initValue(255, 0) ->freq
   s apply( #[ dup freq at 1+ freq put ] )
   0.0 freq applyIf( #[ 0 <> ], #[ sz / dup ln * - ] ) Ln2 / ;

entropy("1223334444") .</syntaxhighlight>
{{out}}
<pre>1.84643934467102</pre>


=={{header|ooRexx}}==
{{trans|REXX}}
<syntaxhighlight lang="oorexx">/* REXX */
Numeric Digits 16
Parse Arg s
If s='' Then
  s="1223334444"
occ.=0
chars=''
n=0
cn=0
Do i=1 To length(s)
  c=substr(s,i,1)
  If pos(c,chars)=0 Then Do
    cn=cn+1
    chars=chars||c
    End
  occ.c=occ.c+1
  n=n+1
  End
do ci=1 To cn
  c=substr(chars,ci,1)
  p.c=occ.c/n
  /* say c p.c */
  End
e=0
Do ci=1 To cn
  c=substr(chars,ci,1)
  e=e+p.c*rxcalclog(p.c)/rxcalclog(2)
  End
Say s 'Entropy' format(-e,,12)
Exit

::requires 'rxmath' LIBRARY</syntaxhighlight>
{{out}}
<pre>1223334444 Entropy 1.846439344671</pre>


=={{header|PARI/GP}}==
<syntaxhighlight lang="parigp">entropy(s)=s=Vec(s);my(v=vecsort(s,,8));-sum(i=1,#v,(x->x*log(x))(sum(j=1,#s,v[i]==s[j])/#s))/log(2)</syntaxhighlight>
<pre>>entropy("1223334444")
%1 = 1.8464393446710154934341977463050452232</pre>

=={{header|Pascal}}==
Free Pascal (http://freepascal.org).
<syntaxhighlight lang="pascal">
PROGRAM entropytest;

  Writeln('Entropy of "',strng,'" is ',entropy(strng):2:5, ' bits.');
END.
</syntaxhighlight>

{{out}}
<pre>
Entropy of "1223334444" is 1.84644 bits.
</pre>

=={{header|Perl}}==
<syntaxhighlight lang="perl">sub entropy {
    my %count; $count{$_}++ for @_;
    my $entropy = 0;

}

print entropy split //, "1223334444";</syntaxhighlight>


=={{header|Phix}}==
<!--<syntaxhighlight lang="phix">(phixonline)-->
<span style="color: #008080;">with</span> <span style="color: #008080;">javascript_semantics</span>
<span style="color: #008080;">function</span> <span style="color: #000000;">entropy</span><span style="color: #0000FF;">(</span><span style="color: #004080;">sequence</span> <span style="color: #000000;">s</span><span style="color: #0000FF;">)</span>
<span style="color: #004080;">sequence</span> <span style="color: #000000;">symbols</span> <span style="color: #0000FF;">=</span> <span style="color: #0000FF;">{},</span>
<span style="color: #000000;">counts</span> <span style="color: #0000FF;">=</span> <span style="color: #0000FF;">{}</span>
<span style="color: #004080;">integer</span> <span style="color: #000000;">N</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">s</span><span style="color: #0000FF;">)</span>
<span style="color: #008080;">for</span> <span style="color: #000000;">i</span><span style="color: #0000FF;">=</span><span style="color: #000000;">1</span> <span style="color: #008080;">to</span> <span style="color: #000000;">N</span> <span style="color: #008080;">do</span>
<span style="color: #004080;">object</span> <span style="color: #000000;">si</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">s</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">]</span>
<span style="color: #004080;">integer</span> <span style="color: #000000;">k</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">find</span><span style="color: #0000FF;">(</span><span style="color: #000000;">si</span><span style="color: #0000FF;">,</span><span style="color: #000000;">symbols</span><span style="color: #0000FF;">)</span>
<span style="color: #008080;">if</span> <span style="color: #000000;">k</span><span style="color: #0000FF;">=</span><span style="color: #000000;">0</span> <span style="color: #008080;">then</span>
<span style="color: #000000;">symbols</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">append</span><span style="color: #0000FF;">(</span><span style="color: #000000;">symbols</span><span style="color: #0000FF;">,</span><span style="color: #000000;">si</span><span style="color: #0000FF;">)</span>
<span style="color: #000000;">counts</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">append</span><span style="color: #0000FF;">(</span><span style="color: #000000;">counts</span><span style="color: #0000FF;">,</span><span style="color: #000000;">1</span><span style="color: #0000FF;">)</span>
<span style="color: #008080;">else</span>
<span style="color: #000000;">counts</span><span style="color: #0000FF;">[</span><span style="color: #000000;">k</span><span style="color: #0000FF;">]</span> <span style="color: #0000FF;">+=</span> <span style="color: #000000;">1</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">for</span>
<span style="color: #004080;">atom</span> <span style="color: #000000;">H</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">0</span>
<span style="color: #004080;">integer</span> <span style="color: #000000;">n</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">counts</span><span style="color: #0000FF;">)</span>
<span style="color: #008080;">for</span> <span style="color: #000000;">i</span><span style="color: #0000FF;">=</span><span style="color: #000000;">1</span> <span style="color: #008080;">to</span> <span style="color: #000000;">n</span> <span style="color: #008080;">do</span>
<span style="color: #004080;">atom</span> <span style="color: #000000;">ci</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">counts</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">]/</span><span style="color: #000000;">N</span>
<span style="color: #000000;">H</span> <span style="color: #0000FF;">-=</span> <span style="color: #000000;">ci</span><span style="color: #0000FF;">*</span><span style="color: #7060A8;">log2</span><span style="color: #0000FF;">(</span><span style="color: #000000;">ci</span><span style="color: #0000FF;">)</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">for</span>
<span style="color: #008080;">return</span> <span style="color: #000000;">H</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">function</span>
<span style="color: #0000FF;">?</span><span style="color: #000000;">entropy</span><span style="color: #0000FF;">(</span><span style="color: #008000;">"1223334444"</span><span style="color: #0000FF;">)</span>
<!--</syntaxhighlight>-->
{{out}}
<pre>
1.846439345
</pre>

=={{header|PHP}}==
<syntaxhighlight lang="php"><?php

function shannonEntropy($string) {
    $h = 0.0;
    $len = strlen($string);
    foreach (count_chars($string, 1) as $count) {
        $h -= (double) ($count / $len) * log((double) ($count / $len), 2);
    }
    return $h;
}

$strings = array(
    '1223334444',
    '1225554444',
    'aaBBcccDDDD',
    '122333444455555',
    'Rosetta Code',
    '1234567890abcdefghijklmnopqrstuvwxyz',
);

foreach ($strings AS $string) {
    printf(
        '%36s : %s' . PHP_EOL,
        $string,
        number_format(shannonEntropy($string), 6)
    );
}</syntaxhighlight>

{{out}}
<pre>                          1223334444 : 1.846439
                          1225554444 : 1.846439
                         aaBBcccDDDD : 1.936260
                     122333444455555 : 2.149255
                        Rosetta Code : 3.084963
1234567890abcdefghijklmnopqrstuvwxyz : 5.169925</pre>

=={{header|Picat}}==
<syntaxhighlight lang="picat">go =>
["1223334444",
"Rosetta Code is the best site in the world!",
"1234567890abcdefghijklmnopqrstuvwxyz",
"Picat is fun"].map(entropy).println(),
nl.

% probabilities of each element/character in L
entropy(L) = Entropy =>
Len = L.length,
Occ = new_map(), % # of occurrences
foreach(E in L)
Occ.put(E, Occ.get(E,0) + 1)
end,
Entropy = -sum([P2*log2(P2) : _C=P in Occ, P2 = P/Len]).</syntaxhighlight>

{{out}}
<pre>[1.846439344671016,3.646513010214172,5.169925001442309,3.251629167387823]</pre>

=={{header|PicoLisp}}==
PicoLisp only supports fixed point arithmetic, but it does have the ability to call libc transcendental functions (for log).
<syntaxhighlight lang="picolisp">
(scl 8)
(load "@lib/math.l")

(setq LN2 0.693147180559945309417)

(de tabulate-chars (Str)
   (let Map NIL
      (for Ch (chop Str)
         (if (assoc Ch Map)
            (con @ (inc (cdr @)))
            (setq Map (cons (cons Ch 1) Map))))
      Map))

(de entropy (Str)
   (let (
         Sz (length Str)
         Hist (tabulate-chars Str)
         )
      (*/
         (sum
            '((Pair)
               (let R (*/ (cdr Pair) 1. Sz)
                  (- (*/ R (log R) 1.))))
            Hist)
         1. LN2)))
</syntaxhighlight>
{{Out}}
<pre>
: (format (entropy "1223334444") *Scl)
-> "1.84643934"
</pre>


=={{header|PL/I}}==
<syntaxhighlight lang="pli">*process source xref attributes or(!);
/*--------------------------------------------------------------------
* 08.08.2014 Walter Pachl translated from REXX version 1

End;
Put Edit('s='''!!s!!''' Entropy=',-e)(Skip,a,f(15,12));
End;</syntaxhighlight>
{{out}}
<pre>s='1223334444' Entropy= 1.846439344671</pre>


=={{header|PowerShell}}==
<syntaxhighlight lang="powershell">
function entropy ($string) {
    $n = $string.Length

}
entropy "1223334444"
</syntaxhighlight>
<b>Output:</b>
<pre>
1.84643934467102
</pre>

=={{header|Prolog}}==
{{works with|Swi-Prolog|7.3.3}}
This solution calculates the run-length encoding of the input string to get the relative frequencies of its characters.
<syntaxhighlight lang="prolog">:-module(shannon_entropy, [shannon_entropy/2]).

%! shannon_entropy(+String, -Entropy) is det.
%
% Calculate the Shannon Entropy of String.
%
% Example query:
% ==
% ?- shannon_entropy(1223334444, H).
% H = 1.8464393446710154.
% ==
%
shannon_entropy(String, Entropy):-
atom_chars(String, Cs)
,relative_frequencies(Cs, Frequencies)
,findall(CI
,(member(_C-F, Frequencies)
,log2(F, L)
,CI is F * L
)
,CIs)
,foldl(sum, CIs, 0, E)
,Entropy is -E.

%! frequencies(+Characters,-Frequencies) is det.
%
% Calculates the relative frequencies of elements in the list of
% Characters.
%
% Frequencies is a key-value list with elements of the form:
% C-F, where C a character in the list and F its relative
% frequency in the list.
%
% Example query:
% ==
% ?- relative_frequencies([a,a,a,b,b,b,b,b,b,c,c,c,a,a,f], Fs).
% Fs = [a-0.3333333333333333, b-0.4, c-0.2,f-0.06666666666666667].
% ==
%
relative_frequencies(List, Frequencies):-
run_length_encoding(List, Rle)
% Sort Run-length encoded list and aggregate lengths by element
,keysort(Rle, Sorted_Rle)
,group_pairs_by_key(Sorted_Rle, Elements_Run_lengths)
,length(List, Elements_in_list)
,findall(E-Frequency_of_E
,(member(E-RLs, Elements_Run_lengths)
% Sum the list of lengths of runs of E
,foldl(plus, RLs, 0, Occurences_of_E)
,Frequency_of_E is Occurences_of_E / Elements_in_list
)
,Frequencies).


%! run_length_encoding(+List, -Run_length_encoding) is det.
%
% Converts a list to its run-length encoded form where each "run"
% of contiguous repeats of the same element is replaced by that
% element and the length of the run.
%
% Run_length_encoding is a key-value list, where each element is a
% term:
%
% Element:term-Repetitions:number.
%
% Example query:
% ==
% ?- run_length_encoding([a,a,a,b,b,b,b,b,b,c,c,c,a,a,f], RLE).
% RLE = [a-3, b-6, c-3, a-2, f-1].
% ==
%
run_length_encoding([], []-0):-
!. % No more results needed.

run_length_encoding([Head|List], Run_length_encoded_list):-
run_length_encoding(List, [Head-1], Reversed_list)
% The resulting list is in reverse order due to the head-to-tail processing
,reverse(Reversed_list, Run_length_encoded_list).

%! run_length_encoding(+List,+Initialiser,-Accumulator) is det.
%
% Business end of run_length_encoding/3. Calculates the run-length
% encoded form of a list and binds the result to the Accumulator.
% Initialiser is a list [H-1] where H is the first element of the
% input list.
%
run_length_encoding([], Fs, Fs).

% Run of F consecutive occurrences of C
run_length_encoding([C|Cs],[C-F|Fs], Acc):-
% Backtracking would produce successive counts
% of runs of C at different indices in the list.
!
,F_ is F + 1
,run_length_encoding(Cs, [C-F_| Fs], Acc).

% End of a run of consecutive identical elements.
run_length_encoding([C|Cs], Fs, Acc):-
run_length_encoding(Cs,[C-1|Fs], Acc).


/* Arithmetic helper predicates */

%! log2(N, L2_N) is det.
%
% L2_N is the logarithm with base 2 of N.
%
log2(N, L2_N):-
L_10 is log10(N)
,L_2 is log10(2)
,L2_N is L_10 / L_2.

%! sum(+A,+B,?Sum) is det.
%
% True when Sum is the sum of numbers A and B.
%
% Helper predicate to allow foldl/4 to do addition. The following
% call will raise an error (because there is no predicate +/3):
% ==
% foldl(+, [1,2,3], 0, Result).
% ==
%
% This will not raise an error:
% ==
% foldl(sum, [1,2,3], 0, Result).
% ==
%
sum(A, B, Sum):-
must_be(number, A)
,must_be(number, B)
,Sum is A + B.
</syntaxhighlight>

Example query:

<pre>
?- shannon_entropy(1223334444, H).
H = 1.8464393446710154.
</pre>


=={{header|PureBasic}}==
<syntaxhighlight lang="purebasic">#TESTSTR="1223334444"
NewMap uchar.i() : Define.d e

OpenConsole()
Print("Entropy of ["+#TESTSTR+"] = "+StrD(e,15))
Input()</syntaxhighlight>
{{out}}
<pre>Entropy of [1223334444] = 1.846439344671015</pre>
Line 1,419: Line 2,647:
=={{header|Python}}==
===Python: Longer version===
<syntaxhighlight lang="python">from __future__ import division
import math

print 'Length',l
print 'Entropy:', entropy(h, l)
printHist(h)</syntaxhighlight>

{{out}}

===Python: More succinct version===

The <tt>Counter</tt> class is only available in Python >= 2.7.
<syntaxhighlight lang="python">from math import log2
from collections import Counter


def entropy(s):
    p, lns = Counter(s), float(len(s))
    return log2(lns) - sum(count * log2(count) for count in p.values()) / lns


print(entropy("1223334444"))</syntaxhighlight>
{{out}}
<pre>1.8464393446710154</pre>
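This succinct version relies on the identity <math>-\sum_{i=1}^n \frac{count_i}{N} \log_2 \left(\frac{count_i}{N}\right) = \log_2 N - \frac{1}{N}\sum_{i=1}^n count_i \log_2 (count_i)</math>, i.e. the task formula with the division by <math>N</math> expanded out of the logarithm, so the sum can run over the raw counts and the division by the string length happens only once.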

===Uses Python 2===
<syntaxhighlight lang="python">def Entropy(text):
    import math
    log2=lambda x:math.log(x)/math.log(2)

while True:
    print Entropy(raw_input('>>>'))</syntaxhighlight>


=={{header|R}}==
<syntaxhighlight lang="rsplus">
entropy <- function(str) {
  vec <- strsplit(str, "")[[1]]
  N <- length(vec)
  p_xi <- table(vec) / N
  -sum(p_xi * log(p_xi, 2))
}
</syntaxhighlight>

{{out}}
<pre>
> entropy("1223334444")
[1] 1.846439
</pre>


=={{header|Racket}}==
<syntaxhighlight lang="racket">#lang racket
(require math)
(provide entropy hash-entropy list-entropy digital-entropy)

(check-= (entropy "xggooopppp") 1.8464393446710154 1E-8))

(module+ main (entropy "1223334444"))</syntaxhighlight>
{{out}}
<pre> 1.8464393446710154</pre>

=={{header|Raku}}==
(formerly Perl 6)
{{works with|rakudo|2015-09-09}}
<syntaxhighlight lang="raku" line>sub entropy(@a) {
[+] map -> \p { p * -log p }, bag(@a).values »/» +@a;
}

say log(2) R/ entropy '1223334444'.comb;</syntaxhighlight>
{{out}}
<pre>1.84643934467102</pre>

In case we would like to add this function to Raku's core, here is one way it could be done:

<syntaxhighlight lang="raku" line>use MONKEY-TYPING;
augment class Bag {
method entropy {
[+] map -> \p { - p * log p },
self.values »/» +self;
}
}

say '1223334444'.comb.Bag.entropy / log 2;</syntaxhighlight>


=={{header|REXX}}==
===version 1===

<syntaxhighlight lang="rexx">/* REXX ***************************************************************
* 28.02.2013 Walter Pachl
* 12.03.2013 Walter Pachl typo in log corrected. thanx for testing

Numeric Digits (prec)
r=r+0
Return r</syntaxhighlight>

<!-- these types of comparisons are not part of this Rosetta Code task, and
since the results are identical, why post them?

<syntaxhighlight lang="rexx">/* REXX ***************************************************************
* Test program to compare Versions 1 and 2
* (the latter tweaked to be acceptable by my (oo)Rexx

Say ' '
Return
</syntaxhighlight>
{{out}}
<pre>Version 1: 1223334444 Entropy 1.846439344671

Version 1: 1234567890abcdefghijklmnopqrstuvwxyz Entropy 5.169925001442
Version 2: 1234567890abcdefghijklmnopqrstuvwxyz Entropy 5.169925001442</pre> !-->


===version 2===

The &nbsp; '''LOG2''' &nbsp; subroutine is only included here for functionality, not to document how to calculate &nbsp; LOG<sub>2</sub> &nbsp; using REXX.
<syntaxhighlight lang="rexx">/*REXX program calculates the information entropy for a specified character string. */
numeric digits length( e() ) % 2 - length(.)  /*use 1/2 of the decimal digits of E.  */
parse arg $;  if $=''  then $= 1223334444     /*obtain the optional input from the CL*/
#=0;  @.= 0;  L= length($)                    /*define handy-dandy REXX variables.   */
$$=                                           /*initialize the $$ list.              */

  do j=1  for L;  _= substr($, j, 1)          /*process each character in $ string.  */
  if @._==0  then do;  #= # + 1               /*Unique?  Yes, bump character counter.*/
                       $$= $$ || _            /*add this character to the $$ list.   */
                  end
  @._= @._ + 1                                /*keep track of this character's count.*/
  end   /*j*/

sum= 0                                        /*calculate info entropy for each char.*/
  do i=1  for #;  _= substr($$, i, 1)         /*obtain a character from unique list. */
  sum= sum - @._/L * log2(@._/L)              /*add (negatively) the char entropies. */
  end   /*i*/

say ' input string: '  $
say 'string length: '  L
say ' unique chars: '  #;  say
say 'the information entropy of the string ──► '  format(sum,,12)  " bits."
exit                                          /*stick a fork in it, we're all done. */
/*──────────────────────────────────────────────────────────────────────────────────────*/
e:    e= 2.718281828459045235360287471352662497757247093699959574966967627724076630;  return e
/*──────────────────────────────────────────────────────────────────────────────────────*/
log2: procedure;  parse arg x 1 ox;  ig= x>1.5;  ii= 0;  is= 1 - 2 * (ig\==1)
      numeric digits digits()+5;  call e      /*the precision of E must be ≥ digits()*/
        do while ig & ox>1.5 | \ig&ox<.5;  _= e;  do j=-1;  iz= ox * _ ** -is
        if j>=0 & (ig & iz<1 | \ig&iz>.5)  then leave;  _= _ * _;  izz= iz;  end  /*j*/
        ox= izz;  ii= ii + is*2**j;  end /*while*/;  x= x * e** -ii -1;  z= 0;  _= -1;  p= z
          do k=1;  _= -_ * x;  z= z + _/k;  if z=p  then leave;  p= z;  end  /*k*/
      r= z + ii;  if arg()==2  then return r;  return r / log2(2, .)</syntaxhighlight>
{{out|output|text=&nbsp; when using the default input of: &nbsp; &nbsp; <tt> 1223334444 </tt>}}
<pre>
 input string:  1223334444

the information entropy of the string ──►  1.846439344671  bits.
</pre>
{{out|output|text=&nbsp; when using the input of: &nbsp; &nbsp; <tt> Rosetta Code </tt>}}
<pre>
 input string:  Rosetta Code

</pre>
</pre>


=={{header|Ring}}==
<syntaxhighlight lang="ring">
decimals(8)
entropy = 0
countOfChar = list(255)

source = "1223334444"
charCount = len(source)
usedChar = ""

for i = 1 to len(source)
    ch = substr(source, i, 1)
    if not(substr(usedChar, ch)) usedChar = usedChar + ch ok
    j = substr(usedChar, ch)
    countOfChar[j] = countOfChar[j] + 1
next

l = len(usedChar)
for i = 1 to l
    probability = countOfChar[i] / charCount
    entropy = entropy - (probability * logBase(probability, 2))
next

see "Characters used and the number of occurrences of each " + nl
for i = 1 to l
    see "'" + substr(usedChar, i, 1) + "' " + countOfChar[i] + nl
next
see " Entropy of " + source + " is " + entropy + " bits." + nl
see " The result should be around 1.84644 bits." + nl

func logBase (x, b)
    logBase = log(x) / log(b)
    return logBase
</syntaxhighlight>
Output:
<pre>
Characters used and the number of occurrences of each
'1' 1
'2' 2
'3' 3
'4' 4
Entropy of 1223334444 is 1.84643934 bits.
The result should be around 1.84644 bits.
</pre>

=={{header|RPL}}==
{{works with|Halcyon Calc|4.2.7}}
{| class="wikitable"
! Code
! Comments
|-
|
DUP SIZE 2 LN → str len log2
≪ { 255 } 0 CON
1 len '''FOR''' j
str j DUP SUB
NUM DUP2 GET 1 + PUT
'''NEXT'''
0 1 255 '''FOR''' j
'''IF''' OVER j GET
'''THEN''' LAST len / DUP LN log2 / * + '''END'''
'''NEXT'''
NEG SWAP DROP
≫ ≫ '<span style="color:blue">NTROP</span>' STO
|
<span style="color:blue">NTROP</span> ''( "string" -- entropy )''
Initialize local variables
Initialize a vector with 255 counters
For each character in the string...
... increase the counter according to ASCII code
For each non-zero counter
calculate term
Change sign and forget the vector
|}
The following line of code delivers what is required:
"1223334444" <span style="color:blue">NTROP</span>
{{out}}
<pre>
1: 1.84643934467
</pre>

=={{header|Ruby}}==
<syntaxhighlight lang="ruby">def entropy(s)
  counts = s.chars.tally
  leng = s.length.to_f
  counts.values.reduce(0) do |entropy, count|
    freq = count / leng
    entropy - freq * Math.log2(freq)
  end
end

p entropy("1223334444")</syntaxhighlight>
{{out}}
<pre>
1.8464393446710154
</pre>

=={{header|Run BASIC}}==
<syntaxhighlight lang="runbasic">dim chrCnt( 255)  ' possible ASCII chars

source$ = "1223334444"
numChar = len(source$)

for i = 1 to len(source$)  ' count which chars are used in source
    ch$ = mid$(source$,i,1)
    if not( instr(chrUsed$, ch$)) then chrUsed$ = chrUsed$ + ch$
    j = instr(chrUsed$, ch$)
    chrCnt(j) = chrCnt(j) + 1
next i

lc = len(chrUsed$)
for i = 1 to lc
    odds = chrCnt(i) / numChar
    entropy = entropy - (odds * (log(odds) / log(2)))
next i

print " Characters used and times used of each "
for i = 1 to lc
    print " '"; mid$(chrUsed$,i,1); "'"; chr$(9); chrCnt(i)
next i

print " Entropy of '"; source$; "' is "; entropy; " bits."

end</syntaxhighlight>
<pre>
Characters used and times used of each
'1' 1
'2' 2
'3' 3
'4' 4
Entropy of '1223334444' is 1.84643939 bits.
</pre>


=={{header|Rust}}==
<syntaxhighlight lang="rust">fn entropy(s: &[u8]) -> f32 {
    let mut histogram = [0u64; 256];

    for &b in s {
        histogram[b as usize] += 1;
    }

    histogram
        .iter()
        .cloned()
        .filter(|&h| h != 0)
        .map(|h| h as f32 / s.len() as f32)
        .map(|ratio| -ratio * ratio.log2())
        .sum()
}

fn main() {
    let arg = std::env::args().nth(1).expect("Need a string.");
    println!("Entropy of {} is {}.", arg, entropy(arg.as_bytes()));
}</syntaxhighlight>
{{out}}
<pre>$ ./entropy 1223334444
Entropy of 1223334444 is 1.8464394.
</pre>


=={{header|Scala}}==
<syntaxhighlight lang="scala">import scala.math._

def entropy( v:String ) = { v

// Confirm that "1223334444" has an entropy of about 1.84644
assert( math.round( entropy("1223334444") * 100000 ) * 0.00001 == 1.84644 )</syntaxhighlight>


=={{header|scheme}}==
A version capable of calculating multidimensional entropy.
<syntaxhighlight lang="scheme">
(define (entropy input)
  (define (close? a b)

(entropy (list 1 2 2 3 3 3 4 4 4 4))
(entropy (list (list 1 1) (list 1.1 1.1) (list 1.2 1.2) (list 1.3 1.3) (list 1.5 1.5) (list 1.6 1.6)))
</syntaxhighlight>

{{out}}
<pre>
1.4591479170272448 bits
</pre>
</pre>

=={{header|Scilab}}==
<syntaxhighlight lang="text">function E = entropy(d)
d=strsplit(d);
n=unique(string(d));
N=size(d,'r');
count=zeros(n);
n_size = size(n,'r');
for i = 1:n_size
count(i) = sum ( d == n(i) );
end
E=0;
for i=1:length(count)
E = E - count(i)/N * log(count(i)/N) / log(2);
end
endfunction

word ='1223334444';
E = entropy(word);
disp('The entropy of '+word+' is '+string(E)+'.');</syntaxhighlight>

{{out}}
<pre> The entropy of 1223334444 is 1.8464393.</pre>


=={{header|Seed7}}==
=={{header|Seed7}}==
<lang seed7>$ include "seed7_05.s7i";
<syntaxhighlight lang="seed7">$ include "seed7_05.s7i";
include "float.s7i";
include "float.s7i";
include "math.s7i";
include "math.s7i";
Line 1,869: Line 3,282:
begin
begin
writeln(entropy("1223334444") digits 5);
writeln(entropy("1223334444") digits 5);
end func;</lang>
end func;</syntaxhighlight>


{{out}}
{{out}}
Line 1,875: Line 3,288:
1.84644
1.84644
</pre>
</pre>

=={{header|SETL}}==
<syntaxhighlight lang="setl">program shannon_entropy;
print(entropy "1223334444");

op entropy(symbols);
hist := {};
loop for symbol in symbols do
hist(symbol) +:= 1;
end loop;
h := 0.0;
loop for count = hist(symbol) do
f := count / #symbols;
h -:= f * log f / log 2;
end loop;
return h;
end op;
end program; </syntaxhighlight>
{{out}}
<pre>1.84643934467102</pre>


=={{header|Sidef}}==
=={{header|Sidef}}==
<lang ruby>func entropy(s) {
<syntaxhighlight lang="ruby">func entropy(s) {
var counts = Hash.new();
var counts = Hash.new;
s.each { |c| counts{c} := 0 ++ };
counts.default(0);
s.each { |c| counts[c]++ };
var len = s.len;
var len = s.len;
[0, counts.values.map {|count|
[0, counts.values.map {|count|
var freq = count/len; freq * freq.log2 }...
var freq = count/len; freq * freq.log2 }...
]«-»;
]«-»;
}
}
 

say entropy("1223334444");</lang>
say entropy("1223334444");</syntaxhighlight>
{{out}}
{{out}}
<pre>1.846439344671015493434197746305045223237</pre>
<pre>1.846439344671015493434197746305045223237</pre>

=={{header|Standard ML}}==
<syntaxhighlight lang="standard ml">val Entropy = fn input =>
let
val N = Real.fromInt (String.size input) ;
val term = fn a => Math.ln (a/N) * a / ( Math.ln 2.0 * N ) ;
val v0 = Vector.tabulate (255,fn i=>0) ;
val freq = Vector.map Real.fromInt (* List.foldr: count occurrences *)
(List.foldr (fn (i,v) => Vector.update( v, ord i, Vector.sub(v,ord i) + 1) ) v0 (explode input) )
in
~ (Vector.foldr (fn (a,s) => if a > 0.0 then term a + s else s) 0.0 freq )

end ;</syntaxhighlight>
Entropy "1223334444" ;
val it = 1.846439345: real

=={{header|Swift}}==
<syntaxhighlight lang="swift">import Foundation

func entropy(of x: String) -> Double {
return x
.reduce(into: [String: Int](), {cur, char in
cur[String(char), default: 0] += 1
})
.values
.map({i in Double(i) / Double(x.count) } as (Int) -> Double)
.map({p in -p * log2(p) } as (Double) -> Double)
.reduce(0.0, +)
}

print(entropy(of: "1223334444"))</syntaxhighlight>

{{out}}
<pre>1.8464393446710154</pre>


=={{header|Tcl}}==
=={{header|Tcl}}==
<lang tcl>proc entropy {str} {
<syntaxhighlight lang="tcl">proc entropy {str} {
set log2 [expr log(2)]
set log2 [expr log(2)]
foreach char [split $str ""] {dict incr counts $char}
foreach char [split $str ""] {dict incr counts $char}
Line 1,901: Line 3,369:
}
}
return $entropy
return $entropy
}</lang>
}</syntaxhighlight>
Demonstration:
Demonstration:
<lang tcl>puts [format "entropy = %.5f" [entropy "1223334444"]]
<syntaxhighlight lang="tcl">puts [format "entropy = %.5f" [entropy "1223334444"]]
puts [format "entropy = %.5f" [entropy "Rosetta Code"]]</lang>
puts [format "entropy = %.5f" [entropy "Rosetta Code"]]</syntaxhighlight>
{{out}}
{{out}}
<pre>
<pre>
entropy = 1.84644
entropy = 1.84644
entropy = 3.08496
entropy = 3.08496
</pre>

=={{header|V (Vlang)}}==
===Vlang: Map version===
<syntaxhighlight lang="v (vlang)">import math
import arrays

fn hist(source string) map[string]int {
mut hist := map[string]int{}
for e in source.split('') {
if e !in hist {
hist[e] = 0
}
hist[e]+=1
}
return hist
}

fn entropy(hist map[string]int, l int) f64 {
mut elist := []f64{}
for _,v in hist {
c := f64(v) / f64(l)
elist << -c * math.log2(c)
}
return arrays.sum<f64>(elist) or {-1}
}

fn main(){
input := "1223334444"
h := hist(input)
e := entropy(h, input.len)
println(e)
}</syntaxhighlight>
{{out}}
<pre>
1.8464393446710152
</pre>

=={{header|Wren}}==
{{trans|Go}}
<syntaxhighlight lang="wren">var s = "1223334444"
var m = {}
for (c in s) {
var d = m[c]
m[c] = (d) ? d + 1 : 1
}
var hm = 0
for (k in m.keys) {
var c = m[k]
hm = hm + c * c.log2
}
var l = s.count
System.print(l.log2 - hm/l)</syntaxhighlight>

{{out}}
<pre>
1.846439344671
</pre>
</pre>


=={{header|XPL0}}==
<syntaxhighlight lang="xpl0">code real RlOut=48, Ln=54; \intrinsic routines
string 0; \use zero-terminated strings

];

RlOut(0, Entropy("1223334444"))</syntaxhighlight>
{{out}}
<pre>
1.84644
</pre>
=={{header|Zig}}==
<syntaxhighlight lang="zig">
const std = @import("std");
const math = std.math;

pub fn main() !void {
const stdout = std.io.getStdOut().outStream();
try stdout.print("{d:.12}\n", .{H("1223334444")});
}

fn H(s: []const u8) f64 {
var counts = [_]u16{0} ** 256;
for (s) |ch|
counts[ch] += 1;

var h: f64 = 0;
for (counts) |c|
if (c != 0) {
const p = @intToFloat(f64, c) / @intToFloat(f64, s.len);
h -= p * math.log2(p);
};

return h;
}
</syntaxhighlight>
{{Out}}
<pre>
1.846439344671
</pre>
</pre>


=={{header|zkl}}==
{{trans|D}}
<syntaxhighlight lang="zkl">fcn entropy(text){
   text.pump(Void,fcn(c,freq){ c=c.toAsc(); freq[c]+=1; freq }
       .fp1( (0).pump(256,List,0.0).copy() )) // array[256] of 0.0
}

entropy("1223334444").println(" bits");</syntaxhighlight>
{{out}}
<pre>
1.84644 bits
</pre>

=={{header|ZX Spectrum Basic}}==
{{trans|FreeBASIC}}
<syntaxhighlight lang="zxbasic">10 LET s$="1223334444": LET base=2: LET entropy=0
20 LET sourcelen=LEN s$
30 DIM t(255)
40 FOR i=1 TO sourcelen
50 LET number= CODE s$(i)
60 LET t(number)=t(number)+1
70 NEXT i
80 PRINT "Char";TAB (6);"Count"
90 FOR i=1 TO 255
100 IF t(i)<>0 THEN PRINT CHR$ i;TAB (6);t(i): LET prop=t(i)/sourcelen: LET entropy=entropy-(prop*(LN prop)/(LN base))
110 NEXT i
120 PRINT '"The Entropy of """;s$;""" is ";entropy</syntaxhighlight>

Latest revision as of 18:03, 17 February 2024

Task
Entropy
You are encouraged to solve this task according to the task description, using any language you may know.
Task

Calculate the Shannon entropy   H   of a given input string.

Given the discrete random variable that is a string of "symbols" (total characters) consisting of different characters (n=2 for binary), the Shannon entropy of X in bits/symbol is :

where is the count of character .

For this task, use X="1223334444" as an example. The result should be 1.84644... bits/symbol. This assumes X was a random variable, which may not be the case, or it may depend on the observer.

This coding problem calculates the "specific" or "intensive" entropy that finds its parallel in physics with "specific entropy" S0 which is entropy per kg or per mole, not like physical entropy S and therefore not the "information" content of a file. It comes from Boltzmann's H-theorem where where N=number of molecules. Boltzmann's H is the same equation as Shannon's H, and it gives the specific entropy H on a "per molecule" basis.

The "total", "absolute", or "extensive" information entropy is

bits

This is not the entropy being coded here, but it is the closest to physical entropy and a measure of the information content of a string. But it does not look for any patterns that might be available for compression, so it is a very restricted, basic, and certain measure of "information". Every binary file with an equal number of 1's and 0's will have S=N bits. All hex files with equal symbol frequencies will have bits of entropy. The total entropy in bits of the example above is S= 10*18.4644 = 18.4644 bits.

The H function does not look for any patterns in data or check if X was a random variable. For example, X=000000111111 gives the same calculated entropy in all senses as Y=010011100101. For most purposes it is usually more relevant to divide the gzip length by the length of the original data to get an informal measure of how much "order" was in the data.

Two other "entropies" are useful:

Normalized specific entropy:

which varies from 0 to 1 and it has units of "entropy/symbol" or just 1/symbol. For this example, Hn<\sub>= 0.923.

Normalized total (extensive) entropy:

which varies from 0 to N and does not have units. It is simply the "entropy", but it needs to be called "total normalized extensive entropy" so that it is not confused with Shannon's (specific) entropy or physical entropy. For this example, Sn<\sub>= 9.23.

Shannon himself is the reason his "entropy/symbol" H function is very confusingly called "entropy". That's like calling a function that returns a speed a "meter". See section 1.7 of his classic A Mathematical Theory of Communication and search on "per symbol" and "units" to see he always stated his entropy H has units of "bits/symbol" or "entropy/symbol" or "information/symbol". So it is legitimate to say entropy NH is "information".

In keeping with Landauer's limit, the physics entropy generated from erasing N bits is if the bit storage device is perfectly efficient. This can be solved for H2*N to (arguably) get the number of bits of information that a physical entropy represents.

Related tasks



11l

F entropy(source)
   DefaultDict[Char, Int] hist
   L(c) source
      hist[c]++
   V r = 0.0
   L(v) hist.values()
      V c = Float(v) / source.len
      r -= c * log2(c)
   R r

print(entropy(‘1223334444’))
Output:
1.84644

Ada

Uses Ada 2012.

with Ada.Text_IO, Ada.Float_Text_IO, Ada.Numerics.Elementary_Functions;

procedure Count_Entropy is

   package TIO renames Ada.Text_IO;

   Count: array(Character) of Natural := (others => 0);
   Sum:   Natural := 0;
   Line: String := "1223334444";

begin
   for I in Line'Range loop   -- count the characters
      Count(Line(I)) := Count(Line(I))+1;
      Sum := Sum + 1;
   end loop;

   declare   -- compute the entropy and print it
      function P(C: Character) return Float is (Float(Count(C)) / Float(Sum));
      use Ada.Numerics.Elementary_Functions, Ada.Float_Text_IO;
      Result: Float := 0.0;
   begin
      for Ch in Character loop
         Result := Result -
          (if P(Ch)=0.0 then 0.0 else P(Ch) * Log(P(Ch), Base => 2.0));
      end loop;
      Put(Result, Fore => 1, Aft => 5, Exp => 0);
   end;
end Count_Entropy;

Aime

integer c;
real h, v;
index x;
data s;

for (, c in (s = argv(1))) {
    x[c] += 1r;
}

h = 0;
for (, v in x) {
    v /= ~s;
    h -= v * log2(v);
}

o_form("/d6/\n", h);

Examples:

$ aime -a tmp/entr 1223334444
1.846439
$ aime -a tmp/entr 'Rosetta Code is the best site in the world!'
3.646513
$ aime -a tmp/entr 1234567890abcdefghijklmnopqrstuvwxyz
5.169925

ALGOL 68

BEGIN
    # calculate the shannon entropy of a string                                #
    PROC shannon entropy = ( STRING s )REAL:
    BEGIN
        INT string length = ( UPB s - LWB s ) + 1;
        # count the occurences of each character #
        [ 0 : max abs char ]INT char count;
        FOR char pos FROM LWB char count TO UPB char count DO
            char count[ char pos ] := 0
        OD;
        FOR char pos FROM LWB s TO UPB s DO
            char count[ ABS s[ char pos ] ] +:= 1
        OD;
        # calculate the entropy, we use log base 10 and then convert #
        # to log base 2 after calculating the sum                    #
        REAL entropy := 0;
        FOR char pos FROM LWB char count TO UPB char count DO
            IF char count[ char pos ] /= 0
            THEN
                # have a character that occurs in the string #
                REAL probability = char count[ char pos ] / string length;
                entropy -:= probability * log( probability )
            FI
        OD;
        entropy / log( 2 )
    END; # shannon entropy #

    # test the shannon entropy routine #
    print( ( shannon entropy( "1223334444" ), newline ) )

END
Output:
+1.84643934467102e  +0

ALGOL W

Translation of: ALGOL 68
begin
    % calculates the shannon entropy of a string                          %
    % strings are fixed length in algol W and the length is part of the   %
    % type, so we declare the string parameter to be the longest possible %
    % string length (256 characters) and have a second parameter to       %
    % specify how much is actually used                                   %
    real procedure shannon_entropy ( string(256) value s
                                   ; integer     value stringLength
                                   );
    begin

        real    probability, entropy;

        % algol W assumes there are 256 possible characters %
        integer MAX_CHAR;
                MAX_CHAR := 256;

        % declarations must preceed statements, so we start a new         %
        % block here so we can use MAX_CHAR as an array bound             %
        begin

            % increment an integer variable                               %
            procedure incI ( integer value result a ) ; a := a + 1;

            integer array charCount( 1 :: MAX_CHAR );

            % count the occurances of each character in s                 %
            for charPos := 1 until MAX_CHAR do charCount( charPos ) := 0;
            for sPos := 0 until stringLength - 1 do incI( charCount( decode( s( sPos | 1 ) ) ) );

            % calculate the entropy, we use log base 10 and then convert  %
            % to log base 2 after calculating the sum                     %

            entropy := 0.0;
            for charPos := 1 until MAX_CHAR do
            begin
                if charCount( charPos ) not = 0
                then begin
                    % have a character that occurs in the string          %
                    probability := charCount( charPos ) / stringLength;
                    entropy     := entropy - ( probability * log( probability ) )
                end 
            end charPos

        end;

        entropy / log( 2 )
    end shannon_entropy ;

    % test the shannon entropy routine                                    %
    r_format := "A"; r_w := 12; r_d := 6; % set output to fixed format    %
    write( shannon_entropy( "1223334444", 10 ) )

end.
Output:
    1.846439

APL

      ENTROPY{-+/R×2R(+∘.=∪)÷⍴}

      ⍝ How it works:
      UNIQUEX'1223334444'
1234
      TABLE_OF_OCCURENCESX∘.=UNIQUE
1 0 0 0
0 1 0 0
0 1 0 0
0 0 1 0
0 0 1 0
0 0 1 0
0 0 0 1
0 0 0 1
0 0 0 1
0 0 0 1
      COUNT+TABLE_OF_OCCURENCES
1 2 3 4
      NX
10
      RATIOCOUNT÷N
0.1 0.2 0.3 0.4
      -+/RATIO×2RATIO
1.846439345
Output:
      ENTROPY X
1.846439345

Arturo

entropy: function [s][
    t: #[]
    loop s 'c [
        unless key? t c -> t\[c]: 0
        t\[c]: t\[c] + 1
    ]
    result: new 0
    loop values t 'x ->
        'result - (x//(size s)) * log x//(size s) 2

    return result
]

print entropy "1223334444"
Output:
1.846439344671015

AutoHotkey

MsgBox, % Entropy(1223334444)

Entropy(n)
{
    a := [], len := StrLen(n), m := n
    while StrLen(m)
    {
        s := SubStr(m, 1, 1)
        m := RegExReplace(m, s, "", c)
        a[s] := c
    }
    for key, val in a
    {
        m := Log(p := val / len)
        e -= p * m / Log(2)
    }
    return, e
}
Output:
1.846440

AWK

#!/usr/bin/awk -f
{
    N = length
    for (i = 1; i <= N; ++i)
        ++H[substr($0, i, 1)]
}

END {
    for (i in H)
        S += H[i] * log(H[i])
    print (log(N) - S / N) / log(2)
}
Usage:
 echo 1223334444 |./entropy.awk
1.84644

BASIC

Works with older (unstructured) Microsoft-style BASIC.

10 DEF FN L(X)=LOG(X)/LOG(2)
20 S$="1223334444"
30 U$=""
40 FOR I=1 TO LEN(S$)
50 K=0
60 FOR J=1 TO LEN(U$)
70 IF MID$(U$,J,1)=MID$(S$,I,1) THEN K=1
80 NEXT J
90 IF K=0 THEN U$=U$+MID$(S$,I,1)
100 NEXT I
110 DIM R(LEN(U$)-1)
120 FOR I=1 TO LEN(U$)
130 C=0
140 FOR J=1 TO LEN(S$)
150 IF MID$(U$,I,1)=MID$(S$,J,1) THEN C=C+1
160 NEXT J
170 R(I-1)=(C/LEN(S$))*FN L(C/LEN(S$))
180 NEXT I
190 E=0
200 FOR I=0 TO LEN(U$)-1
210 E=E-R(I)
220 NEXT I
230 PRINT E
Output:
1.84643935

QBasic

FUNCTION L (X)
    L = LOG(X) / LOG(2)
END FUNCTION

S$ = "1223334444"
U$ = ""
FOR I = 1 TO LEN(S$)
    K = 0
    FOR J = 1 TO LEN(U$)
        IF MID$(U$, J, 1) = MID$(S$, I, 1) THEN K = 1
    NEXT J
    IF K = 0 THEN U$ = U$ + MID$(S$, I, 1)
NEXT I
DIM R(LEN(U$) - 1)
FOR I = 1 TO LEN(U$)
    C = 0
    FOR J = 1 TO LEN(S$)
        IF MID$(U$, I, 1) = MID$(S$, J, 1) THEN C = C + 1
    NEXT J
    R(I - 1) = (C / LEN(S$)) * L(C / LEN(S$))
NEXT I
E = 0
FOR I = 0 TO LEN(U$) - 1
    E = E - R(I)
NEXT I
PRINT E
END

Sinclair ZX81 BASIC

Works with 1k of RAM.

 10 LET X$="1223334444"
 20 LET U$=""
 30 FOR I=1 TO LEN X$
 40 LET K=0
 50 FOR J=1 TO LEN U$
 60 IF U$(J)=X$(I) THEN LET K=K+1
 70 NEXT J
 80 IF K=0 THEN LET U$=U$+X$(I)
 90 NEXT I
100 DIM R(LEN U$)
110 FOR I=1 TO LEN U$
120 LET C=0
130 FOR J=1 TO LEN X$
140 IF U$(I)=X$(J) THEN LET C=C+1
150 NEXT J
160 LET R(I)=C/LEN X$*LN (C/LEN X$)/LN 2
170 NEXT I
180 LET E=0
190 FOR I=1 TO LEN U$
200 LET E=E-R(I)
210 NEXT I
220 PRINT E
Output:
1.8464393

uBasic/4tH

Translation of: QBasic

uBasic/4tH is an integer BASIC only. So, fixed point arithmetic is required go fulfill this task. Some loss of precision is unavoidable.

If Info("wordsize") < 64 Then Print "This program requires a 64-bit uBasic" : End

s := "1223334444"
u := ""
x := FUNC(_Fln(FUNC(_Ntof(2))))        ' calculate LN(2)

For i = 0 TO Len(s)-1
  k = 0
  For j = 0 TO Len(u)-1
    If Peek(u, j) = Peek(s, i) Then k = 1
  Next
  If k = 0 THEN u = Join(u, Char (Peek (s, i)))
Next

Dim @r(Len(u)-1)

For i = 0 TO Len(u)-1
  c = 0
  For J = 0 TO Len(s)-1
    If Peek(u, i) = Peek (s, j) Then c = c + 1
  Next
  q = FUNC(_Fdiv(c, Len(s)))
  @r(i) = FUNC(_Fmul(q, FUNC(_Fdiv(FUNC(_Fln(q)), x))))
Next

e = 0
For i = 0 To Len(u) - 1
    e = e - @r(i)
Next

Print Using "+?.####"; FUNC(_Ftoi(e))

End

_Fln Param (1) : Return (FUNC(_Ln(a@*4))/4)
_Fmul Param (2) : Return ((a@*b@)/16384)
_Fdiv Param (2) : Return ((a@*16384)/b@)
_Ntof Param (1) : Return (a@*16384)
_Ftoi Param (1) : Return ((10000*a@)/16384)

_Ln
  Param (1)
  Local (2)

  c@=681391
  If (a@<32768)      Then a@=SHL(a@, 16) : c@=c@-726817
  If (a@<8388608)    Then a@=SHL(a@, 8)  : c@=c@-363409
  If (a@<134217728)  Then a@=SHL(a@, 4)  : c@=c@-181704
  If (a@<536870912)  Then a@=SHL(a@, 2)  : c@=c@-90852
  If (a@<1073741824) Then a@=SHL(a@, 1)  : c@=c@-45426
  b@=a@+SHL(a@, -1) : If (AND(b@, 2147483648)) = 0 Then a@=b@ : c@=c@-26573
  b@=a@+SHL(a@, -2) : If (AND(b@, 2147483648)) = 0 Then a@=b@ : c@=c@-14624
  b@=a@+SHL(a@, -3) : If (AND(b@, 2147483648)) = 0 Then a@=b@ : c@=c@-7719
  b@=a@+SHL(a@, -4) : If (AND(b@, 2147483648)) = 0 Then a@=b@ : c@=c@-3973
  b@=a@+SHL(a@, -5) : If (AND(b@, 2147483648)) = 0 Then a@=b@ : c@=c@-2017
  b@=a@+SHL(a@, -6) : If (AND(b@, 2147483648)) = 0 Then a@=b@ : c@=c@-1016
  b@=a@+SHL(a@, -7) : If (AND(b@, 2147483648)) = 0 Then a@=b@ : c@=c@-510
  a@=2147483648-a@;
  c@=c@-SHL(a@, -15)
Return (c@)
Output:
1.8461

0 OK, 0:638

BBC BASIC

Translation of: APL
REM >entropy
PRINT FNentropy("1223334444")
END
:
DEF FNentropy(x$)
LOCAL unique$, count%, n%, ratio(), u%, i%, j%
unique$ = ""
n% = LEN x$
FOR i% = 1 TO n%
  IF INSTR(unique$, MID$(x$, i%, 1)) = 0 THEN unique$ += MID$(x$, i%, 1)
NEXT
u% = LEN unique$
DIM ratio(u% - 1)
FOR i% = 1 TO u%
  count% = 0
  FOR j% = 1 TO n%
    IF MID$(unique$, i%, 1) = MID$(x$, j%, 1) THEN count% += 1
  NEXT
  ratio(i% - 1) = (count% / n%) * FNlogtwo(count% / n%)
NEXT
= -SUM(ratio())
:
DEF FNlogtwo(n)
= LN n / LN 2
Output:
1.84643934

BQN

H  -(+´⊢×2)((+˝⊢=)÷≠)

H "1223334444"
Output:
1.8464393446710154

Burlesque

blsq ) "1223334444"F:u[vv^^{1\/?/2\/LG}m[?*++
1.8464393446710157

C

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <string.h>
#include <math.h>
 
#define MAXLEN 100 //maximum string length
 
int makehist(unsigned char *S,int *hist,int len){
	int wherechar[256];
	int i,histlen;
	histlen=0;
	for(i=0;i<256;i++)wherechar[i]=-1;
	for(i=0;i<len;i++){
		if(wherechar[(int)S[i]]==-1){
			wherechar[(int)S[i]]=histlen;
			histlen++;
		}
		hist[wherechar[(int)S[i]]]++;
	}
	return histlen;
}
 
double entropy(int *hist,int histlen,int len){
	int i;
	double H;
	H=0;
	for(i=0;i<histlen;i++){
		H-=(double)hist[i]/len*log2((double)hist[i]/len);
	}
	return H;
}
 
int main(void){
	unsigned char S[MAXLEN];
	int len,*hist,histlen;
	double H;
	scanf("%[^\n]",S);
	len=strlen(S);
	hist=(int*)calloc(len,sizeof(int));
	histlen=makehist(S,hist,len);
	//hist now has no order (known to the program) but that doesn't matter
	H=entropy(hist,histlen,len);
	printf("%lf\n",H);
	return 0;
}

Examples:

$ ./entropy
1223334444
1.846439
$ ./entropy
Rosetta Code is the best site in the world!
3.646513

C#

Translation of C++.

using System;
using System.Collections.Generic;
namespace Entropy
{
	class Program
	{
		public static double logtwo(double num)
		{
			return Math.Log(num)/Math.Log(2);
		}
		public static void Main(string[] args)
		{
		label1:
			string input = Console.ReadLine();
			double infoC=0;
			Dictionary<char,double> table = new Dictionary<char, double>();

			
			foreach (char c in input)
			{
				if (table.ContainsKey(c))
					table[c]++;
				    else
				    	table.Add(c,1);
	
			}
			double freq;
			foreach (KeyValuePair<char,double> letter in table)
			{
				freq=letter.Value/input.Length;
				infoC+=freq*logtwo(freq);
			}
			infoC*=-1;
			Console.WriteLine("The Entropy of {0} is {1}",input,infoC);
			goto label1;
		
		}
	}
}
Output:
The Entropy of 1223334444 is 1.84643934467102

Without using Hashtables or Dictionaries:

using System;
namespace Entropy
{
	 class Program
	{
		public static double logtwo(double num)
		{
			return Math.Log(num)/Math.Log(2);
		}
		static double Contain(string x,char k)
		{
			double count=0;
			foreach (char Y in x)
			{
				if(Y.Equals(k))
					count++;
			}
			return count;
		}
		public static void Main(string[] args)
		{
		label1:
			string input = Console.ReadLine();
			double infoC=0;
			double freq;
			string k="";
			foreach (char c1 in input)
			{
				if (!(k.Contains(c1.ToString())))
					k+=c1;
			}
			foreach (char c in k)
			{
				freq=Contain(input,c)/(double)input.Length;
				infoC+=freq*logtwo(freq);
			}
			infoC/=-1;
			Console.WriteLine("The Entropy of {0} is {1}",input,infoC);
			goto label1;
 
		}
	}
}

C++

#include <string>
#include <map>
#include <iostream>
#include <algorithm>
#include <cmath>

double log2( double number ) {
   return log( number ) / log( 2 ) ;
}

int main( int argc , char *argv[ ] ) {
   std::string teststring( argv[ 1 ] ) ;
   std::map<char , int> frequencies ;
   for ( char c : teststring )
     frequencies[ c ] ++ ;
   int numlen = teststring.length( ) ;
   double infocontent = 0 ;
   for ( std::pair<char , int> p : frequencies ) {
      double freq = static_cast<double>( p.second ) / numlen ;
      infocontent -= freq * log2( freq ) ;
   }
  
   std::cout << "The information content of " << teststring 
      << " is " << infocontent << std::endl ;
   return 0 ;
}
Output:
(entropy "1223334444")
The information content of 1223334444 is 1.84644

Clojure

(defn entropy [s]
  (let [len (count s), log-2 (Math/log 2)]
    (->> (frequencies s)
         (map (fn [[_ v]]
                (let [rf (/ v len)]
                  (-> (Math/log rf) (/ log-2) (* rf) Math/abs))))
         (reduce +))))
Output:
(entropy "1223334444")
1.8464393446710154

CLU

% NOTE: when compiling with Portable CLU,
% this program needs to be merged with 'useful.lib' to get log()
%
% pclu -merge $CLUHOME/lib/useful.lib -compile entropy.clu

shannon = proc (s: string) returns (real)
    % find the frequency of each character
    freq: array[int] := array[int]$fill(0, 256, 0)
    for c: char in string$chars(s) do
        i: int := char$c2i(c)
        freq[i] := freq[i] + 1
    end
    
    % calculate the component for each character
    h: real := 0.0
    rlen: real := real$i2r(string$size(s))
    for i: int in array[int]$indexes(freq) do
        if freq[i] ~= 0 then
            f: real := real$i2r(freq[i]) / rlen
            h := h - f * log(f) / log(2.0)
        end
    end
    return (h)
end shannon

start_up = proc ()
    po: stream := stream$primary_output()
    stream$putl(po, f_form(shannon("1223334444"), 1, 6))
end start_up
Output:
1.846439

CoffeeScript

entropy = (s) ->
    freq = (s) ->
        result = {}
        for ch in s.split ""
            result[ch] ?= 0
            result[ch]++
        return result

    frq = freq s
    n = s.length
    ((frq[f]/n for f of frq).reduce ((e, p) -> e - p * Math.log(p)), 0) * Math.LOG2E

console.log "The entropy of the string '1223334444' is #{entropy '1223334444'}"
Output:
The entropy of the string '1223334444' is 1.8464393446710157

Common Lisp

Not very Common Lisp-y version:

(defun entropy (string)
  (let ((table (make-hash-table :test 'equal))
        (entropy 0))
    (mapc (lambda (c) (setf (gethash c table) (+ (gethash c table 0) 1)))
          (coerce string 'list))
    (maphash (lambda (k v)
               (decf entropy (* (/ v (length input-string))
                                (log (/ v (length input-string)) 2))))
             table)
    entropy))

More like Common Lisp version:

(defun entropy (string &aux (length (length string)))
  (declare (type string string))
  (let ((table (make-hash-table)))
    (loop for char across string
          do (incf (gethash char table 0)))
    (- (loop for freq being each hash-value in table
             for freq/length = (/ freq length)
             sum (* freq/length (log freq/length 2))))))

Crystal

# Method to calculate sum of Float64 array
def sum(array : Array(Float64))
  res = 0
  array.each do |n|
    res += n
  end
  res
end

# Method to count how often each character appears
def histogram(source : String)
  hist = {} of Char => Int32
  source.each_char do |e|
    if !hist.has_key? e
      hist[e] = 0
    end
    hist[e] += 1
  end
  return Tuple.new(source.size, hist)
end

# Method to calculate entropy from histogram
def entropy(hist : Hash(Char, Int32), l : Int32)
  elist = [] of Float64
  hist.each do |el|
    v = el[1]
    c = v / l
    elist << (-c * Math.log(c, 2))
  end
  return sum elist
end

source = "1223334444"
hist_res = histogram source
l = hist_res[0]
h = hist_res[1]
puts ".[Results]."
puts "Length: " + l.to_s
puts "Entropy: " + (entropy h, l).to_s

D

import std.stdio, std.algorithm, std.math;

double entropy(T)(T[] s)
pure nothrow if (__traits(compiles, s.sort())) {
    immutable sLen = s.length;
    return s
           .sort()
           .group
           .map!(g => g[1] / double(sLen))
           .map!(p => -p * p.log2)
           .sum;
}

void main() {
    "1223334444"d.dup.entropy.writeln;
}
Output:
1.84644

Delphi

Library: StrUtils
Library: Math
Translation of: Pascal

This simply fixes the Pascal code to run in Delphi.

program Entropytest;

uses
  StrUtils,
  Math;

type
  FArray = array of CARDINAL;

var
  strng: string = '1223334444';

// list unique characters in a string
function uniquechars(str: string): string;
var
  n: CARDINAL;
begin
  Result := '';
  for n := 1 to length(str) do
    if (PosEx(str[n], str, n) > 0) and (PosEx(str[n], Result, 1) = 0) then
      Result := Result + str[n];
end;

// obtain a list of character-frequencies for a string
//  given a string containing its unique characters
function frequencies(str, ustr: string): FArray;
var
  u, s, p, o: CARDINAL;
begin
  SetLength(Result, Length(ustr) + 1);
  p := 0;
  for u := 1 to length(ustr) do
    for s := 1 to length(str) do
    begin
      o := p;
      p := PosEx(ustr[u], str, s);
      if (p > o) then
        INC(Result[u]);
    end;
end;

// Obtain the Shannon entropy of a string
function entropy(s: string): EXTENDED;
var
  pf: FArray;
  us: string;
  i, l: CARDINAL;
begin
  us := uniquechars(s);
  pf := frequencies(s, us);
  l := length(s);
  Result := 0.0;
  for i := 1 to length(us) do
    Result := Result - pf[i] / l * log2(pf[i] / l);
end;

begin
  Writeln('Entropy of "', strng, '" is ', entropy(strng): 2: 5, ' bits.');
  readln;
end.

EasyLang

func entropy s$ .
   len d[] 255
   for c$ in strchars s$
      d[strcode c$] += 1
   .
   for cnt in d[]
      if cnt > 0
         prop = cnt / len s$
         entr -= (prop * log10 prop / log10 2)
      .
   .
   return entr
.
print entropy "1223334444"

EchoLisp

(lib 'hash)
;; counter: hash-table[key]++
(define (count++ ht k )
(hash-set ht k (1+ (hash-ref! ht k 0))))

(define (hi count n )
	(define pi (// count n))
	(- (* pi (log2 pi))))
	
;; (H [string|list]) → entropy (bits)
(define (H info) 
	(define S (if(string? info) (string->list info) info))
	(define ht (make-hash))
	(define n (length S))
	
	(for ((s S)) (count++ ht s))
	(for/sum ((s (make-set S)))  (hi (hash-ref ht s) n)))
Output:
;; by increasing entropy

(H "🔴")   0
(H "🔵🔴")  1
(H "1223334444")  1.8464393446710154
(H "♖♘♗♕♔♗♘♖♙♙♙♙♙♙♙♙♙")  2.05632607578088
(H "EchoLisp")   3
(H "Longtemps je me suis couché de bonne heure")  3.860828877124944
(H "azertyuiopmlkjhgfdsqwxcvbn")  4.700439718141092
(H (for/list ((i 1000)) (random 1000)))   9.13772704467521
(H (for/list ((i 100_000)) (random 100_000)))  15.777516877140766
(H (for/list ((i 1000_000)) (random 1000_000)))  19.104028424596976

Elena

Translation of: C#

ELENA 6.x :

import system'math;
import system'collections;
import system'routines;
import extensions;
 
extension op
{
    logTwo()
        = self.ln() / 2.ln();
}
 
public program()
{
    var input := console.readLine();
    var infoC := 0.0r;
    var table := Dictionary.new();
 
    input.forEach::(ch)
    {
        var n := table[ch];
        if (nil == n)
        {
            table[ch] := 1
        }
        else
        {
            table[ch] := n + 1
        }
    };
 
    var freq := 0;
    table.forEach::(letter)
    {
        freq := letter.toInt().realDiv(input.Length);
 
        infoC += (freq * freq.logTwo())
    };
 
    infoC *= -1;
 
    console.printLine("The Entropy of ", input, " is ", infoC)
}
Output:
The Entropy of 1223334444 is 1.846439344671

Elixir

Works with: Erlang/OTP version 18

:math.log2 was added in OTP 18.

defmodule RC do
  def entropy(str) do
    leng = String.length(str)
    String.graphemes(str)
    |> Enum.group_by(&(&1))
    |> Enum.map(fn{_,value} -> length(value) end)
    |> Enum.reduce(0, fn count, entropy ->
         freq = count / leng
         entropy - freq * :math.log2(freq)
       end)
  end
end

IO.inspect RC.entropy("1223334444")
Output:
1.8464393446710154

Emacs Lisp

(defun shannon-entropy (input)
  (let ((freq-table (make-hash-table))
	(entropy 0)
	(length (+ (length input) 0.0)))
    (mapcar (lambda (x)
	      (puthash x
		       (+ 1 (gethash x freq-table 0))
		       freq-table))
	    input)
    (maphash (lambda (k v)
	       (set 'entropy (+ entropy
			     (* (/ v length)
				(log (/ v length) 2)))))
	     freq-table)
  (- entropy)))
Output:

After evaluating the above in the Emacs runtime, you can run the function interactively in the *scratch* buffer as shown below (type C-j at the end of the first line and Emacs will insert the output on the second line).

(shannon-entropy "1223334444")
1.8464393446710154

Erlang

-module( entropy ).

-export( [shannon/1, task/0] ).

shannon( String ) -> shannon_information_content( lists:foldl(fun count/2, dict:new(), String), erlang:length(String) ).

task() -> shannon( "1223334444" ).



count( Character, Dict ) -> dict:update_counter( Character, 1, Dict ).

shannon_information_content( Dict, String_length ) ->
	{_String_length, Acc} = dict:fold( fun shannon_information_content/3, {String_length, 0.0}, Dict ),
	Acc / math:log( 2 ).

shannon_information_content( _Character, How_many, {String_length, Acc} ) ->
        Frequency = How_many / String_length,
	{String_length, Acc - (Frequency * math:log(Frequency))}.
Output:
24> entropy:task().
1.8464393446710157

Euler Math Toolbox

>function entropy (s) ...
$  v=strtochar(s);
$  m=getmultiplicities(unique(v),v);
$  m=m/sum(m);
$  return sum(-m*logbase(m,2))
$endfunction
>entropy("1223334444")
 1.84643934467

Excel

This solution uses the LAMBDA, LET, and MAP functions introduced into the Microsoft 365 version of Excel in 2021. LET can bind LAMBDA functions to names, treating them as first-class values; taking advantage of this makes the solution much simpler. The solution below looks for the string in cell A1.

=LET(
_MainS,A1,
_N,LEN(_MainS),
_Chars,UNIQUE(MID(_MainS,SEQUENCE(LEN(_MainS),1,1,1),1)),
calcH,LAMBDA(_c,(_c/_N)*LOG(_c/_N,2)),
getCount,LAMBDA(_i,LEN(_MainS)-LEN(SUBSTITUTE(_MainS,_i,""))),
_CharMap,MAP(_Chars,LAMBDA(a, calcH(getCount(a)))),
-SUM(_CharMap)
)

_Chars uses the SEQUENCE function to split the text into an array; the UNIQUE function then returns the list of distinct characters in the string.

calcH applies the per-character calculation described at the top of the page; its results are then summed over all characters.

getCount uses the SUBSTITUTE function to count the occurrences of a character within the string, as the difference in length before and after removing that character.

If you needed to re-use this calculation then you could wrap it in a LAMBDA function within the name manager, changing A1 to a variable name (e.g. String):

ShannonEntropyH2=LAMBDA(String,LET(_MainS,String,_N,LEN(_MainS),_Chars,UNIQUE(MID(_MainS,SEQUENCE(LEN(_MainS),1,1,1),1)),calcH,LAMBDA(_c,(_c/_N)*LOG(_c/_N,2)),getCount,LAMBDA(_i,LEN(_MainS)-LEN(SUBSTITUTE(_MainS,_i,""))),_CharMap,MAP(_Chars,LAMBDA(a, calcH(getCount(a)))),-SUM(_CharMap)))

Then you can just use the named lambda. E.g. If A1 = 1223334444 then:

=ShannonEntropyH2(A1)

Returns 1.846439345
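
For readers who want to sanity-check the SUBSTITUTE-based counting trick outside of Excel, here is a minimal Python sketch of the same idea (the names get_count and entropy are illustrative, not part of the Excel solution):

from math import log2

def get_count(s: str, ch: str) -> int:
    # Mirror of getCount: occurrences of ch, measured as the length
    # lost when every ch is substituted away.
    return len(s) - len(s.replace(ch, ""))

def entropy(s: str) -> float:
    n = len(s)
    # set(s) plays the role of UNIQUE(...); each term is (c/n) * log2(c/n).
    return -sum(get_count(s, ch) / n * log2(get_count(s, ch) / n)
                for ch in set(s))

print(entropy("1223334444"))  # ~1.8464393446710154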


F#

open System

let ld x = Math.Log x / Math.Log 2.

let entropy (s : string) =
    let n = float s.Length
    Seq.groupBy id s
    |> Seq.map (fun (_, vals) -> float (Seq.length vals) / n)
    |> Seq.fold (fun e p -> e - p * ld p) 0. 

printfn "%f" (entropy "1223334444")
Output:
1.846439

Factor

USING: assocs kernel math math.functions math.statistics
prettyprint sequences ;
IN: rosetta-code.entropy

: shannon-entropy ( str -- entropy )
    [ length ] [ histogram >alist [ second ] map ] bi
    [ swap / ] with map
    [ dup log 2 log / * ] map-sum neg ;
    
"1223334444" shannon-entropy .
"Factor is my favorite programming language." shannon-entropy .
Output:
1.846439344671015
4.04291723248433

Forth

: flog2 ( f -- f ) fln 2e fln f/ ;

create freq 256 cells allot

: entropy ( str len -- f )
  freq 256 cells erase
  tuck
  bounds do
    i c@ cells freq +
    1 swap +!
  loop
  0e
  256 0 do
    i cells freq + @ ?dup if
      s>f dup s>f f/
      fdup flog2 f* f-
    then
  loop
  drop ;

s" 1223334444" entropy f.     \ 1.84643934467102  ok

Fortran

Please find the GNU/Linux compilation instructions along with a sample run among the comments at the start of the FORTRAN 2008 source. This program acquires input from the command line argument, thereby demonstrating the fairly new get_command_argument intrinsic subroutine. The expression of the algorithm is a rough translation of the J solution. Thank you.

!-*- mode: compilation; default-directory: "/tmp/" -*-
!Compilation started at Tue May 21 21:43:12
!
!a=./f && make $a && OMP_NUM_THREADS=2 $a 1223334444
!gfortran -std=f2008 -Wall -ffree-form -fall-intrinsics f.f08 -o f
! Shannon entropy of 1223334444 is    1.84643936    
!
!Compilation finished at Tue May 21 21:43:12

program shannonEntropy
  implicit none
  integer :: num, L, status
  character(len=2048) :: s
  num = 1
  call get_command_argument(num, s, L, status)
  if ((0 /= status) .or. (L .eq. 0)) then
    write(0,*)'Expected a command line argument with some length.'
  else
    write(6,*)'Shannon entropy of '//(s(1:L))//' is ', se(s(1:L))
  endif

contains
  !     algebra
  !
  ! 2**x = y
  ! x*log(2) = log(y)
  ! x = log(y)/log(2)

  !   NB. The j solution
  !   entropy=:  +/@:-@(* 2&^.)@(#/.~ % #)
  !   entropy '1223334444'
  !1.84644
  
  real function se(s)
    implicit none
    character(len=*), intent(in) :: s
    integer, dimension(256) :: tallies
    real, dimension(256) :: norm
    tallies = 0
    call TallyKey(s, tallies)
    ! J's #/. works with the set of items in the input.
    ! TallyKey is sufficiently close that, with the merge, gets the correct result.
    norm = tallies / real(len(s))
    se = sum(-(norm*log(merge(1.0, norm, norm .eq. 0))/log(2.0)))
  end function se

  subroutine TallyKey(s, counts)
    character(len=*), intent(in) :: s
    integer, dimension(256), intent(out) :: counts
    integer :: i, j
    counts = 0
    do i=1,len(s)
      j = iachar(s(i:i))
      counts(j) = counts(j) + 1
    end do
  end subroutine TallyKey

end program shannonEntropy

FreeBASIC

' version 25-06-2015
' compile with: fbc -s console

Sub calc_entropy(source As String, base_ As Integer)

    Dim As Integer i, sourcelen = Len(source), totalchar(255)
    Dim As Double prop, entropy

    For i = 0 To sourcelen -1
        totalchar(source[i]) += 1
    Next

    Print "Char    count"
    For i = 0 To 255
        If totalchar(i) = 0 Then Continue For
        Print "   "; Chr(i); Using "   ######"; totalchar(i)
        prop = totalchar(i) / sourcelen
        entropy = entropy - (prop * Log (prop) / Log(base_))
    Next

    Print : Print "The Entropy of "; Chr(34); source; Chr(34); " is"; entropy

End Sub

' ------=< MAIN >=------

calc_entropy("1223334444", 2)
Print

' empty keyboard buffer
While InKey <> "" : Wend
Print : Print "hit any key to end program"
Sleep
End
Output:
Char    count
   1        1
   2        2
   3        3
   4        4

The Entropy of "1223334444" is 1.846439344671015

friendly interactive shell

Sort of hacky, but friendly interactive shell isn't really optimized for mathematical tasks (in fact, it doesn't even have associative arrays).

function entropy
    for arg in $argv
        set name count_$arg
        if not count $$name > /dev/null
            set $name 0
            set values $values $arg
        end
        set $name (math $$name + 1)
    end
    set entropy 0
    for value in $values
        set name count_$value
        set entropy (echo "
            scale = 50
            p = "$$name" / "(count $argv)"
            $entropy - p * l(p)
        " | bc -l)
    end
    echo "$entropy / l(2)" | bc -l
end
entropy (echo 1223334444 | fold -w1)
Output:
1.84643934467101549345

Fōrmulæ

Fōrmulæ programs are not textual; programs are visualized and edited by manipulating structures rather than text. Moreover, there can be multiple visual representations of the same program. Even though textual representations exist (e.g. XML, JSON), they are intended for storage and transfer purposes rather than for visualization and editing.

Programs in Fōrmulæ are created and edited online on its website.

On this page you can see and run the program(s) related to this task and their results. You can also change either the programs or the parameters they are called with, for experimentation, but remember that these programs were created with the main purpose of showing a clear solution of the task, and they generally lack any kind of validation.

Solution and test case: shown graphically on the Fōrmulæ website; the images cannot be reproduced as text here.

Go

Go: Slice version

package main

import (
    "fmt"
    "math"
    "strings"
)

func main(){
    fmt.Println(H("1223334444"))
}

// for ASCII strings
func H(data string) (entropy float64) {
    if data == "" {
        return 0
    }
    for i := 0; i < 256; i++ {
        px := float64(strings.Count(data, string(byte(i)))) / float64(len(data))
        if px > 0 {
	    entropy += -px * math.Log2(px)
	}
    }
    return entropy
}
Output:
1.8464393446710154

Go: Map version

package main

import (
    "fmt"
    "math"
)

func main() {
    const s = "1223334444"

    l := float64(0)
    m := map[rune]float64{}
    for _, r := range s {
        m[r]++
        l++
    }
    var hm float64 
    for _, c := range m {
        hm += c * math.Log2(c)
    }
    fmt.Println(math.Log2(l) - hm/l)
}
Output:
1.8464393446710152
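
The map version never forms the individual probabilities c/N inside the loop; it relies on the rearrangement

:<math>-\sum_{i} \frac{c_i}{N} \log_2 \frac{c_i}{N} = \log_2 N - \frac{1}{N} \sum_{i} c_i \log_2 c_i</math>

which follows from <math>\log_2 (c_i/N) = \log_2 c_i - \log_2 N</math> and <math>\sum_i c_i = N</math>. Several solutions below (jq, K, Ksh, OCaml, the succinct Python version) use the same identity.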

Groovy

String.metaClass.getShannonEntropy = {
    -delegate.inject([:]) { map, v -> map[v] = (map[v] ?: 0) + 1; map }.values().inject(0.0) { sum, v ->
        def p = (BigDecimal)v / delegate.size()
        sum + p * Math.log(p) / Math.log(2)
    }
}

Testing

[ '1223334444': '1.846439344671',
  '1223334444555555555': '1.969811065121',
  '122333': '1.459147917061',
  '1227774444': '1.846439344671',
  aaBBcccDDDD: '1.936260027482',
  '1234567890abcdefghijklmnopqrstuvwxyz': '5.169925004424',
  'Rosetta Code': '3.084962500407' ].each { s, expected ->

    println "Checking $s has a shannon entrophy of $expected"
    assert sprintf('%.12f', s.shannonEntrophy) == expected
}
Output:
Checking 1223334444 has a Shannon entropy of 1.846439344671
Checking 1223334444555555555 has a Shannon entropy of 1.969811065121
Checking 122333 has a Shannon entropy of 1.459147917061
Checking 1227774444 has a Shannon entropy of 1.846439344671
Checking aaBBcccDDDD has a Shannon entropy of 1.936260027482
Checking 1234567890abcdefghijklmnopqrstuvwxyz has a Shannon entropy of 5.169925004424
Checking Rosetta Code has a Shannon entropy of 3.084962500407

Haskell

import Data.List

main = print $ entropy "1223334444"

entropy :: (Ord a, Floating c) => [a] -> c  
entropy = sum . map lg . fq . map genericLength . group . sort 
  where lg c = -c * logBase 2 c
        fq c = let sc = sum c in map (/ sc) c


Or, inlining with an applicative expression (turns out to be fractionally faster):

import Data.List (genericLength, group, sort)

entropy
  :: (Ord a, Floating c)
  => [a] -> c
entropy =
  sum .
  map (negate . ((*) <*> logBase 2)) .
  (map =<< flip (/) . sum) . map genericLength . group . sort

main :: IO ()
main = print $ entropy "1223334444"
Output:
1.8464393446710154

Icon and Unicon

Hmmm, the 2nd equation sums across the length of the string (for the example, that would be a sum of 10 terms). However, the answer cited comes from summing across the different characters in the string (a sum of 4 terms), as the worked sum below shows. The code here assumes the latter and works in both Icon and Unicon. This assumption is consistent with the Wikipedia description.
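
For the example string, the distinct characters 1, 2, 3 and 4 occur 1, 2, 3 and 4 times among 10 symbols, so the 4-term sum is

:<math>H = -\left(\tfrac{1}{10}\log_2\tfrac{1}{10} + \tfrac{2}{10}\log_2\tfrac{2}{10} + \tfrac{3}{10}\log_2\tfrac{3}{10} + \tfrac{4}{10}\log_2\tfrac{4}{10}\right) \approx 1.84644</math>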

procedure main(a)
    s := !a | "1223334444"
    write(H(s))
end

procedure H(s)
    P := table(0.0)
    every P[!s] +:= 1.0/*s
    every (h := 0.0) -:= P[c := key(P)] * log(P[c],2)
    return h
end
Output:
->en
1.846439344671015
->

J

Solution:

   entropy=:  +/@(-@* 2&^.)@(#/.~ % #)
Example:
   entropy '1223334444'
1.84644
   entropy i.256
8
   entropy 256$9
0
   entropy 256$0 1
1
   entropy 256$0 1 2 3
2

So it looks like entropy is roughly the number of bits which would be needed to distinguish between each item in the argument (for example, with perfect compression). Note that in some contexts this might not be the same thing as information because the choice of the items themselves might matter. But it's good enough in contexts with a fixed set of symbols.
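
The same sanity checks are easy to reproduce in other languages; here is a minimal Python sketch (the entropy helper is illustrative, not part of the J solution):

from collections import Counter
from math import log2

def entropy(seq) -> float:
    n = len(seq)
    return sum(-c / n * log2(c / n) for c in Counter(seq).values())

print(entropy("1223334444"))   # ~1.84644
print(entropy(range(256)))     # 8.0 -- 256 equally frequent symbols
print(entropy([9] * 256))      # 0.0 -- a single repeated symbol
print(entropy([0, 1] * 128))   # 1.0 -- two equally frequent symbols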

Java

Translation of: NetRexx
Translation of: REXX
Works with: Java version 7+
import java.lang.Math;
import java.util.Map;
import java.util.HashMap;

public class REntropy {

  @SuppressWarnings("boxing")
  public static double getShannonEntropy(String s) {
    int n = 0;
    Map<Character, Integer> occ = new HashMap<>();

    for (int c_ = 0; c_ < s.length(); ++c_) {
      char cx = s.charAt(c_);
      if (occ.containsKey(cx)) {
        occ.put(cx, occ.get(cx) + 1);
      } else {
        occ.put(cx, 1);
      }
      ++n;
    }

    double e = 0.0;
    for (Map.Entry<Character, Integer> entry : occ.entrySet()) {
      char cx = entry.getKey();
      double p = (double) entry.getValue() / n;
      e += p * log2(p);
    }
    return -e;
  }

  private static double log2(double a) {
    return Math.log(a) / Math.log(2);
  }
  public static void main(String[] args) {
    String[] sstr = {
      "1223334444",
      "1223334444555555555", 
      "122333", 
      "1227774444",
      "aaBBcccDDDD",
      "1234567890abcdefghijklmnopqrstuvwxyz",
      "Rosetta Code",
    };

    for (String ss : sstr) {
      double entropy = REntropy.getShannonEntropy(ss);
      System.out.printf("Shannon entropy of %40s: %.12f%n", "\"" + ss + "\"", entropy);
    }
    return;
  }
}
Output:
Shannon entropy of                             "1223334444": 1.846439344671
Shannon entropy of                    "1223334444555555555": 1.969811065278
Shannon entropy of                                 "122333": 1.459147917027
Shannon entropy of                             "1227774444": 1.846439344671
Shannon entropy of                            "aaBBcccDDDD": 1.936260027532
Shannon entropy of   "1234567890abcdefghijklmnopqrstuvwxyz": 5.169925001442
Shannon entropy of                           "Rosetta Code": 3.084962500721

JavaScript

Works with: ECMAScript 2015

Calculate the entropy of a string by determining the frequency of each character, then summing each character's probability multiplied by the log base 2 of that same probability, taking the negative of the sum.

// Shannon entropy in bits per symbol.
function entropy(str) {
  const len = str.length

  // Build a frequency map from the string.
  const frequencies = Array.from(str)
    .reduce((freq, c) => (freq[c] = (freq[c] || 0) + 1) && freq, {})

  // Sum each character's entropy contribution, -p * log2(p).
  return Object.values(frequencies)
    .reduce((sum, f) => sum - f/len * Math.log2(f/len), 0)
}

console.log(entropy('1223334444'))        // 1.8464393446710154
console.log(entropy('0'))                 // 0
console.log(entropy('01'))                // 1
console.log(entropy('0123'))              // 2
console.log(entropy('01234567'))          // 3
console.log(entropy('0123456789abcdef'))  // 4
Output:
1.8464393446710154
0
1
2
3
4
Another variant
const entropy = (s) => {
  const split = s.split('');
  const counter = {};
  split.forEach(ch => {
    if (!counter[ch]) counter[ch] = 1;
    else counter[ch]++;
  });


  const lengthf = s.length * 1.0;
  const counts = Object.values(counter);
  return -1 * counts
    .map(count => count / lengthf * Math.log2(count / lengthf))
    .reduce((a, b) => a + b);
};
Output:
console.log(entropy("1223334444")); // 1.8464393446710154

jq

For efficiency with long strings, we use a hash (a JSON object) to compute the frequencies.

The helper function, counter, could be defined as an inner function of entropy, but for the sake of clarity and because it is independently useful, it is defined separately.

# Input: an array of strings.
# Output: an object with the strings as keys, the values of which are the corresponding frequencies.
def counter:
  reduce .[] as $item ( {}; .[$item] += 1 ) ;

# entropy in bits of the input string
def entropy:
  (explode | map( [.] | implode ) | counter
    | [ .[] | . * log ] | add) as $sum
  | ((length|log) - ($sum / length)) / (2|log) ;
Example:
"1223334444" | entropy # => 1.8464393446710154

Jsish

From Javascript entry.

/* Shannon entropy, in Jsish */

function values(obj:object):array {
    var vals = [];
        for (var key in obj) vals.push(obj[key]);
    return vals;
}

function entropy(s) {
    var split = s.split('');
    var counter = {};
    split.forEach(function(ch) {
        if (!counter[ch]) counter[ch] = 1;
        else counter[ch]++;
    });

    var lengthf = s.length * 1.0;
    var counts = values(counter);
    return -1 * counts.map(function(count) {
        return count / lengthf * (Math.log(count / lengthf) / Math.log(2));
        })
        .reduce(function(a, b) { return a + b; }
    );
};

if (Interp.conf('unitTest')) {
;    entropy('1223334444');
;    entropy('Rosetta Code');
;    entropy('password');
}
Output:
prompt$ jsish --U entropy.jsi
entropy('1223334444') ==> 1.84643934467102
entropy('Rosetta Code') ==> 3.08496250072116
entropy('password') ==> 2.75

Julia

Works with: Julia version 0.6
entropy(s) = -sum(x -> x * log(2, x), count(x -> x == c, s) / length(s) for c in unique(s))
@show entropy("1223334444")
@show entropy([1, 2, 3, 1, 2, 1, 2, 3, 1, 2, 3, 4, 5])
Output:
entropy("1223334444") = 1.8464393446710154
entropy([1, 2, 3, 1, 2, 1, 2, 3, 1, 2, 3, 4, 5]) = 2.103909910282364

K

Works with: ngn/k
entropy: {(`ln[#x]-(+/{x*`ln@x}@+/{x=\:?x}x)%#x)%`ln@2}

entropy "1223334444"
Output:
1.8464393446710161

Kotlin

// version 1.0.6

fun log2(d: Double) = Math.log(d) / Math.log(2.0)

fun shannon(s: String): Double {
    val counters = mutableMapOf<Char, Int>() 
    for (c in s) {
        if (counters.containsKey(c)) counters[c] = counters[c]!! + 1
        else counters.put(c, 1)
    }
    val nn = s.length.toDouble()
    var sum = 0.0
    for (key in counters.keys) {      
       val term = counters[key]!! / nn
       sum += term * log2(term)
    }
    return -sum
}

fun main(args: Array<String>) {
    val samples = arrayOf(
        "1223334444",
        "1223334444555555555", 
        "122333", 
        "1227774444",
        "aaBBcccDDDD",
        "1234567890abcdefghijklmnopqrstuvwxyz",
        "Rosetta Code"
    )
    println("            String                             Entropy")
    println("------------------------------------      ------------------")
    for (sample in samples) println("${sample.padEnd(36)}  ->  ${"%18.16f".format(shannon(sample))}")
}
Output:
            String                             Entropy
------------------------------------      ------------------
1223334444                            ->  1.8464393446710154
1223334444555555555                   ->  1.9698110652780971
122333                                ->  1.4591479170272448
1227774444                            ->  1.8464393446710154
aaBBcccDDDD                           ->  1.9362600275315274
1234567890abcdefghijklmnopqrstuvwxyz  ->  5.1699250014423095
Rosetta Code                          ->  3.0849625007211556

Ksh

Works with: ksh93
function entropy {
    typeset -i i len=${#1}
    typeset -X13 r=0
    typeset -Ai counts

    for ((i = 0; i < len; ++i))
    do
        counts[${1:i:1}]+=1
    done
    for i in "${counts[@]}"
    do
        r+='i * log2(i)'
    done
    r='log2(len) - r / len'
    print -r -- "$r"
}

printf '%g\n' "$(entropy '1223334444')"
Output:
1.84644

Lambdatalk

{def entropy

 {def entropy.count
  {lambda {:s :c :i}
   {let { {:c {/ {A.get :i :c} {A.length :s}}}
        } {* :c {log2 :c}}}}}

 {def entropy.sum
  {lambda {:s :c} 
   {- {+ {S.map {entropy.count :s :c}
                {S.serie 0 {- {A.length :c} 1}}}}}}}

 {lambda {:s}
         {entropy.sum {A.split :s} {cdr {W.frequency :s}}}}} 
-> entropy

The W.frequency function is explained in rosettacode.org/wiki/Letter_frequency#Lambdatalk

{def txt 1223334444} 
-> txt
{def F {W.frequency {txt}}} 
-> F 
characters:  {car {F}} -> [1,2,3,4]
frequencies: {cdr {F}} -> [1,2,3,4]
{entropy {txt}}
-> 1.8464393446710154

{entropy 0}
-> 0
{entropy 00000000000000}
-> 0
{entropy 11111111111111}
-> 0
{entropy 01}
-> 1 
{entropy Lambdatalk}
-> 2.8464393446710154
{entropy entropy}
-> 2.807354922057604
{entropy abcdefgh}
-> 3
{entropy Rosetta Code}
-> 3.084962500721156
{entropy Longtemps je me suis couché de bonne heure}
-> 3.8608288771249444
{entropy abcdefghijklmnopqrstuvwxyz}
-> 4.70043971814109 
{entropy abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz}
-> 4.70043971814109

Lang5

: -rot rot rot ; [] '__A set : dip swap __A swap 1 compress append '__A
set execute __A -1 extract nip ; : nip swap drop ; : sum '+ reduce ;
: 2array 2 compress ; : comb "" split ; : lensize length nip ;
: <group> #( a -- 'a )
    grade subscript dup 's dress distinct strip
    length 1 2array reshape swap
    'A set
    : `filter(*)  A in A swap select ;
    '`filter apply
    ;

: elements(*)  lensize ;
: entropy #( s -- n )
    length "<group> 'elements apply" dip /
    dup neg swap log * 2 log / sum ;

"1223334444" comb entropy . # 1.84643934467102

Liberty BASIC

dim countOfChar( 255) ' all possible one-byte ASCII chars

    source$    ="1223334444"
    charCount  =len( source$)
    usedChar$  =""

    for i =1 to len( source$)   '   count which chars are used in source
        ch$             =mid$( source$, i, 1)
        if not( instr( usedChar$, ch$)) then usedChar$ =usedChar$ +ch$
        j               =instr( usedChar$, ch$)
        countOfChar( j) =countOfChar( j) +1
    next i

    l =len( usedChar$)
    for i =1 to l
        probability =countOfChar( i) /charCount
        entropy     =entropy -( probability *logBase( probability, 2))
    next i

    print " Characters used and the number of occurrences of each "
    for i =1 to l
        print " '"; mid$( usedChar$, i, 1); "'", countOfChar( i)
    next i

    print " Entropy of '"; source$; "' is  "; entropy; " bits."
    print " The result should be around 1.84644 bits."

    end
    function logBase( x, b) '   in LB log() is base 'e'.
        logBase =log( x) /log( b)
    end function
Output:
 Characters used and the number of occurrences of each
 '1'          1
 '2'          2
 '3'          3
 '4'          4
 Entropy of '1223334444' is  1.84643934 bits.
 The result should be around 1.84644 bits.

Lua

function log2 (x) return math.log(x) / math.log(2) end

function entropy (X)
    local N, count, sum, i = X:len(), {}, 0
    for char = 1, N do
        i = X:sub(char, char)
        if count[i] then
            count[i] = count[i] + 1
        else
            count[i] = 1
        end
    end
    for n_i, count_i in pairs(count) do
        sum = sum + count_i / N * log2(count_i / N)
    end
    return -sum
end

print(entropy("1223334444"))

Mathematica / Wolfram Language

shE[s_String] := -Plus @@ ((# Log[2., #]) & /@ ((Length /@ Gather[#])/
         Length[#]) &[Characters[s]])
Example:
 shE["1223334444"]
1.84644
shE["Rosetta Code"]
3.08496

MATLAB / Octave

This version allows for any input vectors, including strings, floats, negative integers, etc.

function E = entropy(d)
	if ischar(d), d=abs(d); end;
        [Y,I,J] = unique(d); 	
	H = sparse(J,1,1);
	p = full(H(H>0))/length(d);
	E = -sum(p.*log2(p));
end;
Usage:
> entropy('1223334444')
ans =  1.8464

MiniScript

entropy = function(s)
    count = {}
    for c in s
        if count.hasIndex(c) then count[c] = count[c]+1 else count[c] = 1
    end for
    sum = 0
    for x in count.values
        countOverN = x / s.len
        sum = sum + countOverN * log(countOverN, 2)
    end for
    return -sum
end function

print entropy("1223334444")
Output:
1.846439

Modula-2

MODULE Entropy;
FROM InOut IMPORT WriteString, WriteLn;
FROM RealInOut IMPORT WriteReal;
FROM Strings IMPORT Length;
FROM MathLib IMPORT ln;

PROCEDURE entropy(s: ARRAY OF CHAR): REAL;
    VAR freq: ARRAY [0..255] OF CARDINAL;
        i, length: CARDINAL;
        h, f: REAL;       
BEGIN
    (* the entropy of the empty string is zero *)
    length := Length(s);
    IF length = 0 THEN RETURN 0.0; END;
    
    (* find the frequency of each character *)
    FOR i := 0 TO 255 DO freq[i] := 0; END;
    FOR i := 0 TO length-1 DO
        INC(freq[ORD(s[i])]);
    END;
    
    (* calculate the component for each character *)
    h := 0.0;
    FOR i := 0 TO 255 DO
        IF freq[i] # 0 THEN
            f := FLOAT(freq[i]) / FLOAT(length);
            h := h - f * (ln(f) / ln(2.0));
        END;
    END;
    RETURN h;
END entropy;

BEGIN
    WriteReal(entropy("1223334444"), 14);
    WriteLn;
END Entropy.
Output:
 1.8464394E+00

NetRexx

Translation of: REXX
/* NetRexx */
options replace format comments java crossref savelog symbols

runSample(Arg)
return

-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/* REXX ***************************************************************
 * 28.02.2013 Walter Pachl
 **********************************************************************/
method getShannonEntropy(s = "1223334444") public static
--trace var occ c chars n cn i e p pl
  Numeric Digits 30
  occ = 0
  chars = ''
  n = 0
  cn = 0
  Loop i = 1 To s.length()
    c = s.substr(i, 1)
    If chars.pos(c) = 0 Then Do
      cn = cn + 1
      chars = chars || c
      End
    occ[c] = occ[c] + 1
    n = n + 1
    End i
  p = ''
  Loop ci = 1 To cn
    c = chars.substr(ci, 1)
    p[c] = occ[c] / n
    End ci
  e = 0
  Loop ci = 1 To cn
    c = chars.substr(ci, 1)
    pl = log2(p[c])
    e = e + p[c] * pl
    End ci
  Return -e

-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
method log2(a = double) public static binary returns double
  return Math.log(a) / Math.log(2)

-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
method runSample(Arg) public static
  parse Arg sstr
  if sstr = '' then
    sstr = '1223334444' -
           '1223334444555555555' -
           '122333' -
           '1227774444' -
           'aaBBcccDDDD' -
           '1234567890abcdefghijklmnopqrstuvwxyz' -
           'Rosetta_Code'
  say 'Calculating Shannon''s entropy for the following list:'
  say '['(sstr.space(1, ',')).changestr(',', ', ')']'
  say
  entropies = 0
  ssMax = 0
  -- This crude sample substitutes a '_' character for a space in the input strings
  loop w_ = 1 to sstr.words()
    ss = sstr.word(w_)
    ssMax = ssMax.max(ss.length())
    ss_ = ss.changestr('_', ' ')
    entropy = getShannonEntropy(ss_)
    entropies[ss] = entropy
    end w_
  loop report = 1 to sstr.words()
    ss = sstr.word(report)
    ss_ = ss.changestr('_', ' ')
    Say 'Shannon entropy of' ('"'ss_'"').right(ssMax + 2)':' entropies[ss].format(null, 12)
    end report
  return
Output:
Calculating Shannon's entropy for the following list:
[1223334444, 1223334444555555555, 122333, 1227774444, aaBBcccDDDD, 1234567890abcdefghijklmnopqrstuvwxyz, Rosetta_Code]

Shannon entropy of                           "1223334444": 1.846439344671
Shannon entropy of                  "1223334444555555555": 1.969811065278
Shannon entropy of                               "122333": 1.459147917027
Shannon entropy of                           "1227774444": 1.846439344671
Shannon entropy of                          "aaBBcccDDDD": 1.936260027532
Shannon entropy of "1234567890abcdefghijklmnopqrstuvwxyz": 5.169925001442
Shannon entropy of                         "Rosetta Code": 3.084962500721

Nim

import tables, math

proc entropy(s: string): float =
  var t = initCountTable[char]()
  for c in s: t.inc(c)
  for x in t.values: result -= x/s.len * log2(x/s.len)

echo entropy("1223334444")

Objeck

use Collection;

class Entropy {
  function : native : GetShannonEntropy(result : String) ~ Float {
    frequencies := IntMap->New();

    each(i : result) {
      c := result->Get(i);

      if(frequencies->Has(c)) {
        count := frequencies->Find(c)->As(IntHolder);
        count->Set(count->Get() + 1);
      }
      else {
        frequencies->Insert(c, IntHolder->New(1));
      };
    };

    length := result->Size();
    entropy := 0.0;

    counts := frequencies->GetValues(); 
    each(i : counts) {
      count := counts->Get(i)->As(IntHolder)->Get();
      freq := count->As(Float) / length;
      entropy += freq * (freq->Log() / 2.0->Log());
    };

    return -1 * entropy;
  }

  function : Main(args : String[]) ~ Nil {
    inputs := [
      "1223334444",
      "1223334444555555555", 
      "122333", 
      "1227774444",
      "aaBBcccDDDD",
      "1234567890abcdefghijklmnopqrstuvwxyz",
      "Rosetta Code"];

    each(i : inputs) {
      input := inputs[i];
      "Shannon entropy of '{$input}': "->Print();
      GetShannonEntropy(inputs[i])->PrintLine();
    };
  }  
}

Output:

Shannon entropy of '1223334444': 1.84644
Shannon entropy of '1223334444555555555': 1.96981
Shannon entropy of '122333': 1.45915
Shannon entropy of '1227774444': 1.84644
Shannon entropy of 'aaBBcccDDDD': 1.93626
Shannon entropy of '1234567890abcdefghijklmnopqrstuvwxyz': 5.16993
Shannon entropy of 'Rosetta Code': 3.08496

OCaml

By using a map, purely functional
module CharMap = Map.Make(Char)

let entropy s =
  let count map c =
    CharMap.update c (function Some n -> Some (n +. 1.) | None -> Some 1.) map
  and calc _ n sum =
    sum +. n *. Float.log2 n
  in
  let sum = CharMap.fold calc (String.fold_left count CharMap.empty s) 0.
  and len = float (String.length s) in
  Float.log2 len -. sum /. len

let () =
  entropy "1223334444" |> string_of_float |> print_endline
By using a mutable Hashtbl
(* pre-bake & return an inner-loop function to bin & assemble a character frequency map *)
let get_fproc (m: (char, int) Hashtbl.t) :(char -> unit)  =
  (fun (c:char) -> try
                     Hashtbl.replace m c ( (Hashtbl.find m c) + 1) 
                   with Not_found -> Hashtbl.add m c 1)


(* pre-bake and return an inner-loop function to do the actual entropy calculation *)
let get_calc (slen:int) :(float -> float) = 
  let slen_float = float_of_int slen in
  let log_2 = log 2.0 in

  (fun v -> let pt = v /. slen_float in
                pt *. ((log pt) /. log_2) )


(* main function, given a string argument it:
       builds a (mutable) frequency map (initial alphabet size of 255, but it's auto-expanding), 
       extracts the relative probability values into a list, 
       folds-in the basic entropy calculation and returns the result. *)
let shannon (s:string) :float  = 
  let freq_hash = Hashtbl.create 255 in
  String.iter (get_fproc freq_hash) s;

  let relative_probs = Hashtbl.fold (fun k v b -> (float v)::b) freq_hash [] in
  let calc = get_calc (String.length s) in

   -1.0 *. List.fold_left (fun b x -> b +. calc x) 0.0 relative_probs
Output:
1.84643934467

Oforth

: entropy(s) -- f
| freq sz |
   s size dup ifZero: [ return ] asFloat ->sz
   ListBuffer initValue(255, 0) ->freq
   s apply( #[ dup freq at 1+ freq put ] )
   0.0 freq applyIf( #[ 0 <> ], #[ sz / dup ln * - ] ) Ln2 / ;
 
entropy("1223334444") .
Output:
1.84643934467102

ooRexx

Translation of: REXX
/* REXX */
Numeric Digits 16
Parse Arg s
If s='' Then
  s="1223334444"
occ.=0
chars=''
n=0
cn=0
Do i=1 To length(s)
  c=substr(s,i,1)
  If pos(c,chars)=0 Then Do
    cn=cn+1
    chars=chars||c
    End
  occ.c=occ.c+1
  n=n+1
  End
do ci=1 To cn
  c=substr(chars,ci,1)
  p.c=occ.c/n
  /* say c p.c */
  End
e=0
Do ci=1 To cn
  c=substr(chars,ci,1)
  e=e+p.c*rxcalclog(p.c)/rxcalclog(2)
  End
Say s 'Entropy' format(-e,,12)
Exit

::requires 'rxmath' LIBRARY
Output:
1223334444 Entropy 1.846439344671

PARI/GP

entropy(s)=s=Vec(s);my(v=vecsort(s,,8));-sum(i=1,#v,(x->x*log(x))(sum(j=1,#s,v[i]==s[j])/#s))/log(2)
>entropy("1223334444")
%1 = 1.8464393446710154934341977463050452232

Pascal

Free Pascal (http://freepascal.org).

PROGRAM entropytest;

USES StrUtils, Math;

TYPE FArray = ARRAY of CARDINAL;

VAR	 strng: STRING = '1223334444';
	
// list unique characters in a string
FUNCTION uniquechars(str: STRING): STRING;
	VAR n: CARDINAL;
	BEGIN
		uniquechars := '';
		FOR n := 1 TO length(str) DO
			IF (PosEx(str[n],str,n)>0) 
				AND (PosEx(str[n],uniquechars,1)=0) 
					THEN uniquechars += str[n];
	END;
	
// obtain a list of character-frequencies for a string
//  given a string containing its unique characters
FUNCTION frequencies(str,ustr: STRING): FArray;
	VAR u,s,p,o: CARDINAL;
	BEGIN
		SetLength(frequencies, Length(ustr)+1);
		p := 0;
		FOR u := 1 TO length(ustr) DO
			FOR s := 1 TO length(str) DO BEGIN
				o := p;	p := PosEx(ustr[u],str,s);
				IF (p>o) THEN INC(frequencies[u]);
			END;
	END;

// Obtain the Shannon entropy of a string
FUNCTION entropy(s: STRING): EXTENDED;
	VAR pf : FArray;
		us : STRING;
		i,l: CARDINAL;
	BEGIN
		us := uniquechars(s);
		pf := frequencies(s,us);
		l  := length(s);
		entropy := 0.0;
		FOR i := 1 TO length(us) DO
			entropy -= pf[i]/l * log2(pf[i]/l);
	END;

BEGIN
	Writeln('Entropy of "',strng,'" is ',entropy(strng):2:5, ' bits.');
END.
Output:
Entropy of "1223334444" is 1.84644 bits.

Perl

sub entropy {
    my %count; $count{$_}++ for @_;
    my $entropy = 0;
    for (values %count) {
        my $p = $_/@_;
        $entropy -= $p * log $p;
    }
    $entropy / log 2
}
 
print entropy split //, "1223334444";

Phix

with javascript_semantics
function entropy(sequence s)
    sequence symbols = {},
             counts = {}
    integer N = length(s)
    for i=1 to N do
        object si = s[i]
        integer k = find(si,symbols)
        if k=0 then
            symbols = append(symbols,si)
            counts = append(counts,1)
        else
            counts[k] += 1
        end if
    end for
    atom H = 0
    integer n = length(counts)
    for i=1 to n do
        atom ci = counts[i]/N
        H -= ci*log2(ci)
    end for
    return H
end function
 
?entropy("1223334444")
Output:
1.846439345

PHP

<?php

function shannonEntropy($string) {
    $h = 0.0;
    $len = strlen($string);
    foreach (count_chars($string, 1) as $count) {
        $h -= (double) ($count / $len) * log((double) ($count / $len), 2);
    }
    return $h;
}

$strings = array(
    '1223334444',
    '1225554444',
    'aaBBcccDDDD',
    '122333444455555',
    'Rosetta Code',
    '1234567890abcdefghijklmnopqrstuvwxyz',
);

foreach ($strings AS $string) {
    printf(
        '%36s : %s' . PHP_EOL,
        $string,
        number_format(shannonEntropy($string), 6)
    );
}
Output:
                          1223334444 : 1.846439
                          1225554444 : 1.846439
                         aaBBcccDDDD : 1.936260
                     122333444455555 : 2.149255
                        Rosetta Code : 3.084963
1234567890abcdefghijklmnopqrstuvwxyz : 5.169925

Picat

go =>
  ["1223334444",
   "Rosetta Code is the best site in the world!",
   "1234567890abcdefghijklmnopqrstuvwxyz",
   "Picat is fun"].map(entropy).println(),
  nl.

% probabilities of each element/character in L
entropy(L) = Entropy => 
  Len = L.length,
  Occ = new_map(), % # of occurrences
  foreach(E in L)
    Occ.put(E, Occ.get(E,0) + 1)
  end,
  Entropy = -sum([P2*log2(P2) : _C=P in Occ, P2 = P/Len]).
Output:
[1.846439344671016,3.646513010214172,5.169925001442309,3.251629167387823]

PicoLisp

PicoLisp only supports fixed point arithmetic, but it does have the ability to call libc transcendental functions (for log)

(scl 8)
(load "@lib/math.l")

(setq LN2 0.693147180559945309417)

(de tabulate-chars (Str)
   (let Map NIL
      (for Ch (chop Str)
         (if (assoc Ch Map)
            (con @ (inc (cdr @)))
            (setq Map (cons (cons Ch 1) Map))))
   Map))

(de entropy (Str)
   (let (
      Sz    (length Str)
      Hist  (tabulate-chars Str)
   )
   (*/
      (sum
         '((Pair)
            (let R (*/ (cdr Pair) 1. Sz)
               (- (*/ R (log R) 1.))))
         Hist)
      1. LN2)))
Output:
: (format (entropy "1223334444") *Scl)
-> "1.84643934"

PL/I

*process source xref attributes or(!);
 /*--------------------------------------------------------------------
 * 08.08.2014 Walter Pachl  translated from REXX version 1
 *-------------------------------------------------------------------*/
 ent: Proc Options(main);
 Dcl (index,length,log2,substr) Builtin;
 Dcl sysprint Print;
 Dcl occ(100) Bin fixed(31) Init((100)0);
 Dcl (n,cn,ci,i,pos) Bin fixed(31) Init(0);
 Dcl chars Char(100) Var Init('');
 Dcl s Char(100) Var Init('1223334444');
 Dcl c Char(1);
 Dcl (occf,p(100)) Dec Float(18);
 Dcl e Dec Float(18) Init(0);
 Do i=1 To length(s);
   c=substr(s,i,1);
   pos=index(chars,c);
   If pos=0 Then Do;
     pos=length(chars)+1;
     cn+=1;
     chars=chars!!c;
     End;
   occ(pos)+=1;
   n+=1;
   End;
  do ci=1 To cn;
    occf=occ(ci);
    p(ci)=occf/n;
    End;
  Do ci=1 To cn;
    e=e+p(ci)*log2(p(ci));
    End;
  Put Edit('s='''!!s!!''' Entropy=',-e)(Skip,a,f(15,12));
  End;
Output:
s='1223334444' Entropy= 1.846439344671

PowerShell

function entropy ($string) {
    $n = $string.Length
    $string.ToCharArray() | group | foreach{
        $p = $_.Count/$n
        $i = [Math]::Log($p,2)
        -$p*$i
    } | measure -Sum | foreach Sum
}
entropy "1223334444"

Output:

1.84643934467102

Prolog

Works with: Swi-Prolog version 7.3.3

This solution calculates the run-length encoding of the input string to get the relative frequencies of its characters.

:-module(shannon_entropy, [shannon_entropy/2]).

%!	shannon_entropy(+String, -Entropy) is det.
%
%	Calculate the Shannon Entropy of String.
%
%	Example query:
%	==
%	?- shannon_entropy(1223334444, H).
%	H = 1.8464393446710154.
%	==
%
shannon_entropy(String, Entropy):-
	atom_chars(String, Cs)
	,relative_frequencies(Cs, Frequencies)
	,findall(CI
		,(member(_C-F, Frequencies)
		 ,log2(F, L)
		 ,CI is F * L
		 )
		,CIs)
	,foldl(sum, CIs, 0, E)
	,Entropy is -E.

%!	relative_frequencies(+Characters,-Frequencies) is det.
%
%	Calculates the relative frequencies of elements in the list of
%	Characters.
%
%	Frequencies is a key-value list with elements of the form:
%	C-F, where C a character in the list and F its relative
%	frequency in the list.
%
%	Example query:
%	==
%	?- relative_frequencies([a,a,a,b,b,b,b,b,b,c,c,c,a,a,f], Fs).
%	Fs = [a-0.3333333333333333, b-0.4, c-0.2,f-0.06666666666666667].
%	==
%
relative_frequencies(List, Frequencies):-
	run_length_encoding(List, Rle)
        % Sort Run-length encoded list and aggregate lengths by element
	,keysort(Rle, Sorted_Rle)
	,group_pairs_by_key(Sorted_Rle, Elements_Run_lengths)
	,length(List, Elements_in_list)
	,findall(E-Frequency_of_E
		,(member(E-RLs, Elements_Run_lengths)
                 % Sum the list of lengths of runs of E
		 ,foldl(plus, RLs, 0, Occurences_of_E)
		 ,Frequency_of_E is Occurences_of_E / Elements_in_list
		 )
		,Frequencies).


%!	run_length_encoding(+List, -Run_length_encoding) is det.
%
%	Converts a list to its run-length encoded form where each "run"
%	of contiguous repeats of the same element is replaced by that
%	element and the length of the run.
%
%	Run_length_encoding is a key-value list, where each element is a
%	term:
%
%	Element:term-Repetitions:number.
%
%	Example query:
%	==
%       ?- run_length_encoding([a,a,a,b,b,b,b,b,b,c,c,c,a,a,f], RLE).
%	RLE = [a-3, b-6, c-3, a-2, f-1].
%	==
%
run_length_encoding([], []):-
	!. % No more results needed.

run_length_encoding([Head|List], Run_length_encoded_list):-
	run_length_encoding(List, [Head-1], Reversed_list)
	% The resulting list is in reverse order due to the head-to-tail processing
	,reverse(Reversed_list, Run_length_encoded_list).

%!	run_length_encoding(+List,+Initialiser,-Accumulator) is det.
%
%	Business end of run_length_encoding/3. Calculates the run-length
%	encoded form of a list and binds the result to the Accumulator.
%	Initialiser is a list [H-1] where H is the first element of the
%	input list.
%
run_length_encoding([], Fs, Fs).

% Run of F consecutive occurrences of C
run_length_encoding([C|Cs],[C-F|Fs], Acc):-
        % Backtracking would produce successive counts
	% of runs of C at different indices in the list.
	!
	,F_ is F + 1
	,run_length_encoding(Cs, [C-F_| Fs], Acc).

% End of a run of consecutive identical elements.
run_length_encoding([C|Cs], Fs, Acc):-
	run_length_encoding(Cs,[C-1|Fs], Acc).


/* Arithmetic helper predicates */

%!	log2(N, L2_N) is det.
%
%	L2_N is the logarithm with base 2 of N.
%
log2(N, L2_N):-
	L_10 is log10(N)
	,L_2 is log10(2)
	,L2_N is L_10 / L_2.

%!	sum(+A,+B,?Sum) is det.
%
%	True when Sum is the sum of numbers A and B.
%
%	Helper predicate to allow foldl/4 to do addition. The following
%	call will raise an error (because there is no predicate +/3):
%	==
%	foldl(+, [1,2,3], 0, Result).
%	==
%
%	This will not raise an error:
%	==
%	foldl(sum, [1,2,3], 0, Result).
%	==
%
sum(A, B, Sum):-
	must_be(number, A)
	,must_be(number, B)
	,Sum is A + B.

Example query:

?- shannon_entropy(1223334444, H).
H = 1.8464393446710154.
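
As a rough cross-check of the run-length route in Python: itertools.groupby plays the part of run_length_encoding, and run lengths are aggregated per character before the entropy sum (all names here are illustrative, not part of the Prolog solution):

from itertools import groupby
from math import log2

def entropy_via_rle(s: str) -> float:
    # Run-length encode: "1223334444" -> [('1', 1), ('2', 2), ('3', 3), ('4', 4)]
    rle = [(ch, sum(1 for _ in run)) for ch, run in groupby(s)]
    # Aggregate run lengths per character: a character may occur in several runs.
    counts = {}
    for ch, length in rle:
        counts[ch] = counts.get(ch, 0) + length
    n = len(s)
    return sum(-c / n * log2(c / n) for c in counts.values())

print(entropy_via_rle("1223334444"))  # 1.8464393446710154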

PureBasic

#TESTSTR="1223334444"
NewMap uchar.i() : Define.d e

Procedure.d nlog2(x.d) : ProcedureReturn Log(x)/Log(2) : EndProcedure

Procedure countchar(s$, Map uchar())
  If Len(s$)
    uchar(Left(s$,1))=CountString(s$,Left(s$,1))
    s$=RemoveString(s$,Left(s$,1))
    ProcedureReturn countchar(s$, uchar())
  EndIf
EndProcedure

countchar(#TESTSTR,uchar())

ForEach uchar()  
  e-uchar()/Len(#TESTSTR)*nlog2(uchar()/Len(#TESTSTR))  
Next

OpenConsole()
Print("Entropy of ["+#TESTSTR+"] = "+StrD(e,15))
Input()
Output:
Entropy of [1223334444] = 1.846439344671015

Python

Python: Longer version (Python 2)

from __future__ import division
import math

def hist(source):
    hist = {}; l = 0;
    for e in source:
        l += 1
        if e not in hist:
            hist[e] = 0
        hist[e] += 1
    return (l,hist)

def entropy(hist,l):
    elist = []
    for v in hist.values():
        c = v / l
        elist.append(-c * math.log(c ,2))
    return sum(elist)

def printHist(h):
    flip = lambda (k,v) : (v,k)
    h = sorted(h.iteritems(), key = flip)
    print 'Sym\thi\tfi\tInf'
    for (k,v) in h:
        print '%s\t%f\t%f\t%f'%(k,v,v/l,-math.log(v/l, 2))
    
    

source = "1223334444"
(l,h) = hist(source);
print '.[Results].'
print 'Length',l
print 'Entropy:', entropy(h, l)
printHist(h)
Output:
.[Results].
Length 10
Entropy: 1.84643934467
Sym	hi	fi	Inf
1	1.000000	0.100000	3.321928
2	2.000000	0.200000	2.321928
3	3.000000	0.300000	1.736966
4	4.000000	0.400000	1.321928

Python: More succinct version

The Counter class in the collections module requires Python >= 2.7; math.log2, used below, requires Python >= 3.3.

from math import log2
from collections import Counter

def entropy(s):
    p, lns = Counter(s), float(len(s))
    return log2(lns) - sum(count * log2(count) for count in p.values()) / lns

print(entropy("1223334444"))
Output:
1.8464393446710154

Uses Python 2

def Entropy(text):
    import math
    log2=lambda x:math.log(x)/math.log(2)
    exr={}
    infoc=0
    for each in text:
        try:
            exr[each]+=1
        except:
            exr[each]=1
    textlen=len(text)
    for k,v in exr.items():
        freq  =  1.0*v/textlen
        infoc+=freq*log2(freq)
    infoc*=-1
    return infoc

while True:
    print Entropy(raw_input('>>>'))

R

entropy <- function(str) {
  vec   <- strsplit(str, "")[[1]]
  N     <- length(vec)
  p_xi  <- table(vec) / N
  
  -sum(p_xi * log(p_xi, 2))
}
Output:
> entropy("1223334444")
[1] 1.846439

Racket

#lang racket
(require math)
(provide entropy hash-entropy list-entropy digital-entropy)

(define (hash-entropy h)
  (define (log2 x) (/ (log x) (log 2)))
  (define n (for/sum [(c (in-hash-values h))] c))
  (- (for/sum ([c (in-hash-values h)] #:unless (zero? c))
       (* (/ c n) (log2 (/ c n))))))

(define (list-entropy x) (hash-entropy (samples->hash x)))

(define entropy         (compose list-entropy string->list))
(define digital-entropy (compose entropy number->string))

(module+ test
  (require rackunit)
  (check-= (entropy "1223334444") 1.8464393446710154 1E-8)
  (check-= (digital-entropy 1223334444) (entropy "1223334444") 1E-8)
  (check-= (digital-entropy 1223334444) 1.8464393446710154 1E-8)
  (check-= (entropy "xggooopppp") 1.8464393446710154 1E-8))

(module+ main (entropy "1223334444"))
Output:
 1.8464393446710154

Raku

(formerly Perl 6)

Works with: rakudo version 2015-09-09
sub entropy(@a) {
    [+] map -> \p { p * -log p }, bag(@a).values »/» +@a;
}

say log(2) R/ entropy '1223334444'.comb;
Output:
1.84643934467102

In case we would like to add this function to Raku's core, here is one way it could be done:

use MONKEY-TYPING;
augment class Bag {
    method entropy {
	[+] map -> \p { - p * log p },
	self.values »/» +self;
    }
}

say '1223334444'.comb.Bag.entropy / log 2;

REXX

version 1

/* REXX ***************************************************************
* 28.02.2013 Walter Pachl
* 12.03.2013 Walter Pachl  typo in log corrected. thanx for testing
* 22.05.2013 -"- extended the logic to accept other strings
* 25.05.2013 -"- 'my' log routine is apparently incorrect
* 25.05.2013 -"- problem identified & corrected
**********************************************************************/
Numeric Digits 30
Parse Arg s
If s='' Then
  s="1223334444"
occ.=0
chars=''
n=0
cn=0
Do i=1 To length(s)
  c=substr(s,i,1)
  If pos(c,chars)=0 Then Do
    cn=cn+1
    chars=chars||c
    End
  occ.c=occ.c+1
  n=n+1
  End
do ci=1 To cn
  c=substr(chars,ci,1)
  p.c=occ.c/n
  /* say c p.c */
  End
e=0
Do ci=1 To cn
  c=substr(chars,ci,1)
  e=e+p.c*log(p.c,30,2)
  End
Say 'Version 1:' s 'Entropy' format(-e,,12)
Exit

log: Procedure
/***********************************************************************
* Return log(x) -- with specified precision and a specified base
* Three different series are used for the ranges  0 to 0.5
*                                                 0.5 to 1.5
*                                                 1.5 to infinity
* 03.09.1992 Walter Pachl
* 25.05.2013 -"- 'my' log routine is apparently incorrect
* 25.05.2013 -"- problem identified & corrected
***********************************************************************/
  Parse Arg x,prec,b
  If prec='' Then prec=9
  Numeric Digits (2*prec)
  Numeric Fuzz   3
  Select
    When x<=0 Then r='*** invalid argument ***'
    When x<0.5 Then Do
      z=(x-1)/(x+1)
      o=z
      r=z
      k=1
      Do i=3 By 2
        ra=r
        k=k+1
        o=o*z*z
        r=r+o/i
        If r=ra Then Leave
        End
      r=2*r
      End
    When x<1.5 Then Do
      z=(x-1)
      o=z
      r=z
      k=1
      Do i=2 By 1
        ra=r
        k=k+1
        o=-o*z
        r=r+o/i
        If r=ra Then Leave
        End
      End
    Otherwise /* 1.5<=x */ Do
      z=(x+1)/(x-1)
      o=1/z
      r=o
      k=1
      Do i=3 By 2
        ra=r
        k=k+1
        o=o/(z*z)
        r=r+o/i
        If r=ra Then Leave
        End
      r=2*r
      End
    End
  If b<>'' Then
    r=r/log(b,prec)
  Numeric Digits (prec)
  r=r+0
  Return r


version 2

REXX doesn't have a BIF for   LOG   or   LN,   so the subroutine (function)   LOG2   is included herein.

The   LOG2   subroutine is only included here for functionality, not to document how to calculate   LOG2   using REXX.

/*REXX program calculates the   information entropy   for a specified character string. */
numeric digits length( e() ) % 2   -  length(.)  /*use 1/2 of the decimal digits of  E. */
parse arg $;   if $=''  then $= 1223334444       /*obtain the optional input from the CL*/
#=0;           @.= 0;        L= length($)        /*define handy-dandy REXX variables.   */
$$=                                              /*initialize the   $$   list.          */
       do j=1  for L;        _= substr($, j, 1)  /*process each character in  $  string.*/
       if @._==0  then do;   #= # + 1            /*Unique?  Yes, bump character counter.*/
                             $$= $$ || _         /*add this character to the  $$  list. */
                       end
       @._= @._ + 1                              /*keep track of this character's count.*/
       end   /*j*/
sum= 0                                           /*calculate info entropy for each char.*/
       do i=1  for #;        _= substr($$, i, 1) /*obtain a character from unique list. */
       sum= sum  -   @._/L * log2(@._/L)         /*add (negatively) the char entropies. */
       end   /*i*/
say ' input string: '   $
say 'string length: '   L
say ' unique chars: '   #;          say
say 'the information entropy of the string ──► '         format(sum,,12)          " bits."
exit                                             /*stick a fork in it,  we're all done. */
/*──────────────────────────────────────────────────────────────────────────────────────*/
e: e= 2.718281828459045235360287471352662497757247093699959574966967627724076630; return e
/*──────────────────────────────────────────────────────────────────────────────────────*/
log2: procedure;  parse arg x 1 ox;     ig= x>1.5;     ii= 0;         is= 1 - 2 * (ig\==1)
      numeric digits digits()+5;        call e   /*the precision of E must be≥digits(). */
        do  while  ig & ox>1.5 | \ig&ox<.5;       _= e;       do j=-1;   iz= ox * _ ** -is
        if j>=0 & (ig & iz<1 | \ig&iz>.5)  then leave;    _= _ * _;    izz= iz;  end /*j*/
        ox=izz;  ii=ii+is*2**j;  end /*while*/;   x= x * e** -ii -1;   z= 0;  _= -1;  p= z
          do k=1;   _= -_ * x;   z= z+_/k;        if z=p  then leave;  p= z;    end  /*k*/
        r= z + ii;               if arg()==2  then return r;         return r / log2(2, .)
output   when using the default input of:     1223334444
 input string:  1223334444
string length:  10
 unique chars:  4

the information entropy of the string ──►  1.846439344671  bits.
output   when using the input of:     Rosetta Code
 input string:  Rosetta Code
string length:  12
 unique chars:  9

the information entropy of the string ──►  3.084962500721  bits.
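
The series used by the LOG routines above can be sketched in a few lines of Python; this is only an illustration of the idea (ln x = 2·atanh((x-1)/(x+1)), evaluated term by term), not a port of the REXX code:

def ln_series(x: float, terms: int = 60) -> float:
    # ln x = 2 * (t + t^3/3 + t^5/5 + ...) with t = (x-1)/(x+1); valid for x > 0.
    t = (x - 1) / (x + 1)
    total, power = 0.0, t
    for k in range(1, 2 * terms, 2):
        total += power / k
        power *= t * t
    return 2 * total

def log2_series(x: float) -> float:
    return ln_series(x) / ln_series(2)

print(log2_series(0.4))  # ~ -1.3219280948873623, i.e. log2 of the frequency of '4'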

Ring

decimals(8)
entropy = 0
countOfChar = list(255) 
 
source="1223334444"
charCount  =len( source)
usedChar  =""
 
for i =1 to len( source)  
     ch =substr(source, i, 1)
     if not(substr( usedChar, ch)) usedChar =usedChar +ch ok
     j  =substr( usedChar, ch)
    countOfChar[j] =countOfChar[j] +1
next 
 
l =len(usedChar)
for i =1 to l
     probability =countOfChar[i] /charCount
     entropy =entropy - (probability *logBase(probability, 2))
next 
 
see "Characters used and the number of occurrences of each " + nl
for i =1 to l
      see "'" + substr(usedChar, i, 1) + "' " + countOfChar[i] + nl
next 
 
see " Entropy of " + source + " is  " + entropy + " bits." + nl
see " The result should be around 1.84644 bits." + nl
 
func logBase (x, b) 
        logBase = log(x) / log(b)
        return logBase

Output:

Characters used and the number of occurrences of each 
'1' 1
'2' 2
'3' 3
'4' 4
Entropy of 1223334444 is  1.84643934 bits.
The result should be around 1.84644 bits.

RPL

Works with: Halcyon Calc version 4.2.7
≪                                              @ NTROP ( "string" -- entropy )
   DUP SIZE 2 LN → str len log2                @ initialize local variables
   ≪ { 255 } 0 CON                             @ initialize a vector with 255 counters
      1 len FOR j                              @ for each character in the string...
         str j DUP SUB
         NUM DUP2 GET 1 + PUT                  @ ...increase the counter matching its ASCII code
      NEXT
      0 1 255 FOR j
         IF OVER j GET                         @ for each non-zero counter
         THEN LAST len / DUP LN log2 / * + END @ ...calculate and accumulate its term
      NEXT
      NEG SWAP DROP                            @ change sign and forget the vector
≫ ≫ 'NTROP' STO
The following line of code delivers what is required:

"1223334444" NTROP
Output:
1:       1.84643934467

Ruby

def entropy(s)
  counts = s.chars.tally
  leng = s.length.to_f
	
  counts.values.reduce(0) do |entropy, count|
    freq = count / leng
    entropy - freq * Math.log2(freq)
  end
end

p entropy("1223334444")
Output:
1.8464393446710154

Run BASIC

dim chrCnt( 255) 			' possible ASCII chars

source$		= "1223334444"
numChar		= len(source$)

for i = 1 to len(source$)   		' count which chars are used in source
	ch$	= mid$(source$,i,1)
	if not( instr(chrUsed$, ch$)) then chrUsed$ = chrUsed$ + ch$
	j	= instr(chrUsed$, ch$)
	chrCnt(j) =chrCnt(j) +1
next i

lc	= len(chrUsed$)
for i = 1 to lc
	odds	= chrCnt(i) /numChar
	entropy	= entropy - (odds * (log(odds) / log(2)))
next i

print " Characters used and times used of each "
for i = 1 to lc
	print " '"; mid$(chrUsed$,i,1); "'";chr$(9);chrCnt(i)
next i

print " Entropy of '"; source$; "' is  "; entropy; " bits."

end
Output:

Characters used and times used of each 
 '1'	1
 '2'	2
 '3'	3
 '4'	4
 Entropy of '1223334444' is  1.84643939 bits.

Rust

fn entropy(s: &[u8]) -> f32 {
    let mut histogram = [0u64; 256];

    for &b in s {
        histogram[b as usize] += 1;
    }

    histogram
        .iter()
        .cloned()
        .filter(|&h| h != 0)
        .map(|h| h as f32 / s.len() as f32)
        .map(|ratio| -ratio * ratio.log2())
        .sum()
}

fn main() {
    let arg = std::env::args().nth(1).expect("Need a string.");
    println!("Entropy of {} is {}.", arg, entropy(arg.as_bytes()));
}
Output:
$ ./entropy 1223334444
Entropy of 1223334444 is 1.8464394.
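The f32 accumulator limits the printed result to about seven significant digits. A sketch of the same function using f64 (the name entropy64 is purely illustrative) matches the longer value printed by most other entries:

fn entropy64(s: &[u8]) -> f64 {
    // count occurrences of each possible byte value
    let mut histogram = [0u64; 256];
    for &b in s {
        histogram[b as usize] += 1;
    }
    let len = s.len() as f64;
    histogram
        .iter()
        .filter(|&&h| h != 0)          // skip symbols that never occur
        .map(|&h| h as f64 / len)      // relative frequency p
        .map(|p| -p * p.log2())        // each symbol contributes -p*log2(p)
        .sum()
}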

Scala

import scala.math._

def entropy( v:String ) = { v
  .groupBy (a => a)
  .values
  .map( i => i.length.toDouble / v.length )
  .map( p => -p * log10(p) / log10(2))
  .sum
}

// Confirm that "1223334444" has an entropy of about 1.84644
assert( math.round( entropy("1223334444") * 100000 ) * 0.00001 == 1.84644 )

Scheme

A version capable of calculating multidimensional entropy. Instead of exact equality, symbols are grouped by proximity: two numbers (or vectors, compared with the infinity norm) count as the same symbol when they differ by less than epsilon = 0.2.

(define (entropy input) 
  (define (close? a b)
    (define (norm x y)
      (define (infinite_norm m n)
        (define (absminus p q)
             (cond ((null? p) '())
                (else (cons (abs (- (car p) (car q))) (absminus (cdr p) (cdr q))))))
        (define (mm l)
             (cond ((null? (cdr l)) (car l))
                   ((> (car l) (cadr l)) (mm (cons (car l) (cddr l))))
                   (else (mm (cdr l)))))
        (mm (absminus m n)))
      (if (pair? x) (infinite_norm x y) (abs (- x y))))
    (let ((epsilon 0.2))
      (< (norm a b) epsilon)))
  (define (freq-list x)
    (define (f x)
      (define (count a b)
        (cond ((null? b) 1)
              (else (+ (if (close? a (car b)) 1 0) (count a (cdr b))))))
      (let ((t (car x)) (tt (cdr x)))
        (count t tt)))
    (define (g x)
      (define (filter a b)
        (cond ((null? b) '())
              ((close? a (car b)) (filter a (cdr b)))
              (else (cons (car b) (filter a (cdr b))))))
      (let ((t (car x)) (tt (cdr x)))
        (filter t tt)))  
    (cond ((null? x) '())
          (else (cons (f x) (freq-list (g x))))))
  (define (scale x)
    (define (sum x)
      (if (null? x) 0.0 (+ (car x) (sum (cdr x)))))
    (let ((z (sum x)))
      (map (lambda(m) (/ m z)) x)))
  (define (cal x)
    (if (null? x) 0 (+ (* (car x) (/ (log (car x)) (log 2))) (cal (cdr x)))))
  (- (cal (scale (freq-list input)))))

(entropy (list 1 2 2 3 3 3 4 4 4 4)) 
(entropy (list (list 1 1) (list 1.1 1.1) (list 1.2 1.2) (list 1.3 1.3) (list 1.5 1.5) (list 1.6 1.6)))
Output:
1.8464393446710154 bits

1.4591479170272448 bits
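For the second example, the close? test with epsilon = 0.2 collapses the six points into groups of sizes 3, 2 and 1 (differences such as 1.2 - 1.0 evaluate to just under 0.2 in floating point, so those points land in the same group), giving -(3/6)*log2(3/6) - (2/6)*log2(2/6) - (1/6)*log2(1/6) = 1.45915 bits.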

Scilab

function E = entropy(d)
    d=strsplit(d);
    n=unique(string(d));
    N=size(d,'r');
    
    count=zeros(n);
    n_size = size(n,'r');
    for i = 1:n_size
       count(i) = sum ( d == n(i) );
    end
    
    E=0;
    for i=1:length(count)
        E = E - count(i)/N * log(count(i)/N) / log(2);
    end
endfunction

word ='1223334444';
E = entropy(word);
disp('The entropy of '+word+' is '+string(E)+'.');
Output:
 The entropy of 1223334444 is 1.8464393.

Seed7

$ include "seed7_05.s7i";
  include "float.s7i";
  include "math.s7i";

const func float: entropy (in string: stri) is func
  result
    var float: entropy is 0.0;
  local
    var hash [char] integer: count is (hash [char] integer).value;
    var char: ch is ' ';
    var float: p is 0.0;
  begin
    for ch range stri do
      if ch in count then
        incr(count[ch]);
      else
        count @:= [ch] 1;
      end if;
    end for;
    for key ch range count do
      p := flt(count[ch]) / flt(length(stri));
      entropy -:= p * log(p) / log(2.0);
    end for;
  end func;

const proc: main is func
  begin
    writeln(entropy("1223334444") digits 5);
  end func;
Output:
1.84644

SETL

program shannon_entropy;
    print(entropy "1223334444");

    op entropy(symbols);
        hist := {};
        loop for symbol in symbols do
            hist(symbol) +:= 1;
        end loop;
        h := 0.0;
        loop for count = hist(symbol) do
            f := count / #symbols;
            h -:= f * log f / log 2;
        end loop;
        return h;
    end op;
end program;
Output:
1.84643934467102

Sidef

func entropy(s) {
  var counts = Hash.new;
  s.each { |c| counts{c} := 0 ++ };
  var len = s.len;
  [0, counts.values.map {|count|
    var freq = count/len; freq * freq.log2 }...
  ]«-»;
}
 
say entropy("1223334444");
Output:
1.846439344671015493434197746305045223237

Standard ML

val Entropy = fn input =>
 let
   val   N     = Real.fromInt (String.size input) ;
   val  term   = fn a => Math.ln (a/N) * a  /  ( Math.ln 2.0 * N ) ;
   val   v0    = Vector.tabulate (256,fn i=>0)   ;                               (* one counter per possible character code *)
   val  freq   = Vector.map Real.fromInt                                         (* List.foldr:  count occurrences  *)
                   (List.foldr   (fn (i,v) => Vector.update( v, ord i, Vector.sub(v,ord i) + 1) ) v0 (explode input) )
 in
  
      ~ (Vector.foldr  (fn (a,s) => if a > 0.0 then term a  + s else s)  0.0  freq )

end ;
Entropy "1223334444" ;
val it = 1.846439345: real

Swift

import Foundation

func entropy(of x: String) -> Double {
  return x
    .reduce(into: [String: Int](), {cur, char in
      cur[String(char), default: 0] += 1
    })
    .values
    .map({i in Double(i) / Double(x.count) } as (Int) -> Double)
    .map({p in -p * log2(p) } as (Double) -> Double)
    .reduce(0.0, +)
}

print(entropy(of: "1223334444"))
Output:
1.8464393446710154

Tcl

proc entropy {str} {
    set log2 [expr {log(2)}]
    foreach char [split $str ""] {dict incr counts $char}
    set entropy 0.0
    foreach count [dict values $counts] {
	set freq [expr {$count / double([string length $str])}]
	set entropy [expr {$entropy - $freq * log($freq)/$log2}]
    }
    return $entropy
}

Demonstration:

puts [format "entropy = %.5f" [entropy "1223334444"]]
puts [format "entropy = %.5f" [entropy "Rosetta Code"]]
Output:
entropy = 1.84644
entropy = 3.08496

V (Vlang)

Vlang: Map version

import math
import arrays

fn hist(source string) map[string]int {
    mut hist := map[string]int{}
    for e in source.split('') {
        if e !in hist {
            hist[e] = 0
        }
        hist[e]+=1
    }
    return hist
}

fn entropy(hist map[string]int, l int) f64 {
    mut elist := []f64{}
    for _,v in hist {
        c := f64(v) / f64(l)
        elist << -c * math.log2(c)
    }
    return arrays.sum<f64>(elist) or {-1}
}

fn main(){
    input := "1223334444"
    h := hist(input)
    e := entropy(h, input.len)
    println(e)
}
Output:
1.8464393446710152

Wren

Translation of: Go
var s = "1223334444"
var m = {}
for (c in s) {
    var d = m[c]
    m[c] = (d) ? d + 1 : 1
}
var hm = 0
for (k in m.keys) {
    var c = m[k]
    hm = hm + c * c.log2
}
var l = s.count
System.print(l.log2 - hm/l)
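This uses the rearranged form H = log2(N) - (1/N) * Σ count*log2(count): expanding -Σ (count/N)*log2(count/N) as -Σ (count/N)*(log2(count) - log2(N)) and using the fact that the counts sum to N gives exactly the expression printed on the last line.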
Output:
1.846439344671

XPL0

code real RlOut=48, Ln=54;      \intrinsic routines
string 0;                       \use zero-terminated strings

func StrLen(A);                 \Return number of characters in an ASCIIZ string
char A;
int  I;
for I:= 0, -1>>1-1 do
    if A(I) = 0 then return I;

func real Entropy(Str);         \Return Shannon entropy of string
char Str;
int  Len, I, Count(128);
real Sum, Prob;
[Len:= StrLen(Str);
for I:= 0 to 127 do Count(I):= 0;
for I:= 0 to Len-1 do           \count number of each character in string
    Count(Str(I)):= Count(Str(I)) + 1;
Sum:= 0.0;
for I:= 0 to 127 do
    if Count(I) # 0 then        \(avoid Ln(0.0) error)
        [Prob:= float(Count(I)) / float(Len);   \probability of char in string
        Sum:= Sum + Prob*Ln(Prob);
        ];
return -Sum/Ln(2.0);
];

RlOut(0, Entropy("1223334444"))
Output:
    1.84644

Zig

const std = @import("std");
const math = std.math;

pub fn main() !void {
    const stdout = std.io.getStdOut().outStream();
    try stdout.print("{d:.12}\n", .{H("1223334444")});
}

fn H(s: []const u8) f64 {
    var counts = [_]u16{0} ** 256;
    for (s) |ch|
        counts[ch] += 1;

    var h: f64 = 0;
    for (counts) |c|
        if (c != 0) {
            const p = @intToFloat(f64, c) / @intToFloat(f64, s.len);
            h -= p * math.log2(p);
        };

    return h;
}
Output:
1.846439344671
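This entry targets an older Zig release: outStream() has since been renamed writer(), and the @intToFloat builtin was replaced by @floatFromInt in Zig 0.11, so newer compilers need those substitutions.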

zkl

Translation of: D
fcn entropy(text){
   text.pump(Void,fcn(c,freq){ c=c.toAsc(); freq[c]+=1; freq }
       .fp1( (0).pump(256,List,0.0).copy() )) // array[256] of 0.0
   .filter()		      // remove all zero entries from array
   .apply('/(text.len()))     // (num of char)/len
   .apply(fcn(p){-p*p.log()}) // -p*ln(p)
   .sum(0.0)/(2.0).log();     // divide by ln(2) to convert to log2
}

entropy("1223334444").println(" bits");
Output:
1.84644 bits

ZX Spectrum Basic

Translation of: FreeBASIC
10 LET s$="1223334444": LET base=2: LET entropy=0
20 LET sourcelen=LEN s$
30 DIM t(255)
40 FOR i=1 TO sourcelen
50 LET number= CODE s$(i)
60 LET t(number)=t(number)+1
70 NEXT i
80 PRINT "Char";TAB (6);"Count"
90 FOR i=1 TO 255
100 IF t(i)<>0 THEN PRINT CHR$ i;TAB (6);t(i): LET prop=t(i)/sourcelen: LET entropy=entropy-(prop*(LN prop)/(LN base))
110 NEXT i
120 PRINT '"The Entropy of """;s$;""" is ";entropy