Talk:Arithmetic coding/As a generalized change of radix: Difference between revisions

no edit summary
No edit summary
No edit summary
Line 42:
:::::::: After all, the Wikipedia page seems to be contradictory, as it mentions both mapping the string to digits (not saying which digits), and bellow referring to <math>\scriptstyle C_i</math> as the cumulative frequency of the current symbol. I agree, it's a mess...
:::::::: (Off-topic: to enable math markup, go to [[Special:Preferences#mw-prefsection-rendering]] and select ''MathML with SVG or PNG fallback'' at the bottom of the page) --[[User:Trizen|Trizen]] ([[User talk:Trizen|talk]]) 21:04, 28 January 2016 (UTC)
::::::::: Thank you, (and actually someone else adjusted that setting on my behalf), that worked around the problem with PNG rendering. --[[User:Rdm|Rdm]] ([[User talk:Rdm|talk]]) 18:45, 30 January 2016 (UTC)
 
:::::::: I don't understand what your confusion is. 0, 1, and 3 are the cumulative frequencies for A, B, and D respectively. The cumulative frequency = the sum of the frequencies of the letters accumulated before it. So if we choose to accumulate in the order A -> B -> D (in order of increasing character code), then for A, the cumulative frequency is 0 (since there's nothing before it); for B, it's the sum of the frequencies for A = 1; for D, it's the sum of the frequencies for A and B = 1 + 2 = 3. If there were another one after that, say Q, its cumulative frequency would be the sum of the frequencies for A, B, and D = 1 + 2 + 3 = 6. That is cumulative frequency. The letters themselves are irrelevant; it's just the frequencies of each and order that we accumulate them that is relevant. It has ''nothing'' to do with "letter encoding values".
:::::::: The purpose of all this is to divide the interval from 0 to 1 (or in this case, the interval from 0 to (length of string) since they want to use all integers) fully into intervals for each of the characters, where the length of the interval for each character is proportional to its frequency. So we want an interval of length 1 for A, length 2 for B, and length 3 for D. The implementations here accomplish that by assigning them from 0 upwards to the characters in ascending character order. So A gets the interval 0-1, B gets the interval 1-3, and D gets the interval 3-6. So each character gets an interval represented by the start being its "cumulative frequency", and the length of the interval being its frequency. --[[User:Spoon!|Spoon!]] ([[User talk:Spoon!|talk]]) 01:15, 29 January 2016 (UTC)
::::::::: I do not think you read what I wrote.
::::::::: That said, using cumulative frequencies in place of letter values for your encoding means words like THE, ONE, TWO, DOG, BAD, etc. etc. all encode to the same number. --[[User:Rdm|Rdm]] ([[User talk:Rdm|talk]]) 18:45, 30 January 2016 (UTC)
:::::::::: Yup, they SHOULD all encode to the same number. That's the point, because they have the same set of frequencies. It's the dictionary that allows you to decode them. In Huffman Coding, all of those should also similarly encode to the same binary code, and it's the dictionary that allows you to decode. --[[User:Spoon!|Spoon!]] ([[User talk:Spoon!|talk]]) 21:35, 30 January 2016 (UTC)
:::::::: The purpose of all this is to divide the interval from 0 to 1 (or in this case, the interval from 0 to (length of string) since they want to use all integers) fully into intervals for each of the characters, where the length of the interval for each character is proportional to its frequency. So we want an interval of length 1 for A, length 2 for B, and length 3 for D. The implementations here accomplish that by assigning them from 0 upwards to the characters in ascending character order. So A gets the interval 0-1, B gets the interval 1-3, and D gets the interval 3-6. So each character gets an interval represented by the start being its "cumulative frequency", and the length of the interval being its frequency. --[[User:Spoon!|Spoon!]] ([[User talk:Spoon!|talk]]) 01:15, 29 January 2016 (UTC)
::::::::: Thank you, (and actually someone else adjusted that setting on my behalf), that worked around the problem with PNG rendering. --[[User:Rdm|Rdm]] ([[User talk:Rdm|talk]]) 18:45, 30 January 2016 (UTC)
Anonymous user