Arithmetic coding/As a generalized change of radix: Difference between revisions
Arithmetic coding/As a generalized change of radix (view source)
Revision as of 00:37, 31 January 2016
, 8 years agoJ: show several example implementations of the wikipedia spec, and discuss how they differ
m (→{{header|Perl}}: Optimized the decoding process) |
(J: show several example implementations of the wikipedia spec, and discuss how they differ) |
||
Line 215:
1224972022395235072 15</lang>
(See talk page.)
If, however, we ignore the wikipedia suggestion that letters have predefined values (e.g. C = 2), but instead use cumulative frequency (in order from least common to most common and leaving in order of appearance for letters with the same frequencies, we get this:
<lang J>aekc=:3 :0
b=. x:#y
i=. (~.i.])y
n=. #/.~y
f=. i{n
c=. i{(+/\0,}:/:~n)/:/:n
L=. b #. c**/\1,}:f
p=. */f
e=. x:<.10^.p
e,~<.(L+p)%10^e
)</lang>
With encoded values:
<lang J> aekc 'DABDDB'
251 2
aekc 'DABDDBBDDBA'
167351 6
aekc 'ABRACADABRA'
19022571 4
aekc 'TOBEORNOTTOBEORTOBEORNOT'
807796941974181890 15</lang>
If we instead use cumulative frequencies, cumulated strictly in alphabetic order, we get this:
<lang J>aeka=:3 :0
b=. x:#y
i=. (~.i.])y
n=. #/.~y
o=. /:~.y
f=. i{n
c=. i{(+/\0,}:o{n)/:o
L=. b #. c**/\1,}:f
p=. */f
e=. x:<.10^.p
e,~<.(L+p)%10^e
)</lang>
<lang J> aeka 'DABDDB'
251 2
aeka 'DABDDBBDDBA'
167351 6
aeka 'ABRACADABRA'
7954170 4
aeka 'TOBEORNOTTOBEORTOBEORNOT'
1150764267498783364 15</lang>
Note that this last variant appears to match some other implementations of this task, though the algorithm itself conflicts with the specification currently visible at [[wp:Arithmetic_coding#Arithmetic_coding_as_a_generalized_change_of_radix|wikipedia]] in several respects:
# The wikipedia description currently specifies "The cumulative frequency is the total of all frequencies below it in a frequency distribution (a running total of frequencies)" but that is not alphabetic order,
# The wikipedia description currently specifies for the "DABDDB" example the encoding "A = 0, B = 1, C = 2, D = 3" but that encoding is clearly not the specified cumulative frequencies.
=={{header|Perl}}==
|