Talk:Entropy

Naïve implementation?

I wonder if there is a smarter way of calculating the entropy. Repetitive patterns should reduce the entropy, for instance. Basically, the entropy should be the number of bits returned by the best possible compression program; even better, the size of the smallest program that outputs the sequence.--Grondilu 21:11, 21 February 2013 (UTC)
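
That "smallest program" idea is Kolmogorov complexity, which is uncomputable in general, but the output size of a real compressor gives a computable upper bound. A minimal sketch in Python, with zlib standing in for "the best possible compression program" (the helper compressed_bits is made up here for illustration):

<lang python>import random
import zlib

def compressed_bits(s: str) -> int:
    # Bits in the zlib-compressed form of s: a crude, computable
    # proxy for "output size of the best possible compressor".
    return 8 * len(zlib.compress(s.encode("utf-8"), 9))

pattern = "ab" * 500                # highly repetitive
scrambled = list(pattern)
random.shuffle(scrambled)           # same symbol counts, pattern destroyed
scrambled = "".join(scrambled)

print(compressed_bits(pattern))     # far fewer bits...
print(compressed_bits(scrambled))   # ...than the scrambled version</lang>

Unlike the plain Shannon measure, this proxy does reward repetition, at the cost of compressor overhead on very short strings.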

The entropy is calculated from the probability of each symbol occurring, so the entropy of aab is the same as that of aba. The higher the probability of a symbol, the less information it carries, and therefore the fewer bits it needs to encode; the entropy of aaab is less than the entropy of aab. The entropy states how many bits per symbol are required. A code whose average code length is exactly the entropy is considered a perfect code (no lossless code can do better). The entropy of a is zero, and the entropy of aa is also zero (because a has probability p = 1). -- Mroman 21:43, 21 February 2013 (UTC)
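
For reference, a minimal Python sketch of the order-0 Shannon entropy described above, H = -Σ p(x)·log2 p(x); the printed values reproduce the claims about aab, aba, aaab and aa:

<lang python>from collections import Counter
from math import log2

def shannon_entropy(s: str) -> float:
    # Order-0 Shannon entropy in bits per symbol:
    # H = -sum(p * log2(p)) over the symbol probabilities p.
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * log2(c / n) for c in counts.values())

print(shannon_entropy("aab"))   # ~0.918
print(shannon_entropy("aba"))   # identical to "aab": order is ignored
print(shannon_entropy("aaab"))  # ~0.811, less than "aab"
print(shannon_entropy("aa"))    # zero (prints -0.0): p = 1, no information</lang>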
Sure, but the entropy of "aaabbb" should be the same as that of "ab", since we could rename the symbol "aaa" to "a" and "bbb" to "b". Also, "aaabbb" should not have the same entropy as, say, "aababb".--Grondilu 22:22, 21 February 2013 (UTC)
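
To make the disagreement concrete: order-0 entropy rates "aaabbb" and "aababb" identically, while measuring entropy over adjacent pairs (digrams) separates them. The digram variant below is only one possible order-sensitive refinement, shown for illustration:

<lang python>from collections import Counter
from math import log2

def entropy(seq) -> float:
    # Shannon entropy (bits/symbol) over any sequence of symbols.
    seq = list(seq)
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * log2(c / n) for c in counts.values())

# Both strings contain three a's and three b's, so order-0 entropy
# cannot tell them apart: exactly 1 bit per symbol each.
print(entropy("aaabbb"))  # 1.0
print(entropy("aababb"))  # 1.0

def digrams(s: str):
    # Overlapping pairs make the measure sensitive to ordering.
    return [s[i:i + 2] for i in range(len(s) - 1)]

print(entropy(digrams("aaabbb")))  # ~1.52
print(entropy(digrams("aababb")))  # ~1.92</lang>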