Talk:Bitwise IO

From Rosetta Code
Revision as of 01:07, 20 December 2008 by rosettacode>ShinTakezou (→‎Real intention: no endianness issues I can see, + explanation)

Real intention

The real intention of this task and its code was to provide a set of functions for testing the LZW compressor / decompressor on real binary output (instead of an array or similar output); this way, compression ratio statistics can be gathered for the LZW code of the task LZW compression. --ShinTakezou 16:57, 19 December 2008 (UTC)

Hmm, strictly speaking it is not bit-oriented. It is rather a bit-stream interface to a byte-oriented I/O. Exactly because it is not bit I/O (as for instance a serial I/O is) you have an endianness issue here. So in your task you should specify whether it is big or little endian encoding of bits into bytes. Your code looks like big endian. Was it your intention? --Dmitry-kazakov 18:28, 19 December 2008 (UTC)
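To make the question concrete, here is a minimal sketch in C of the two possible bit orders within a byte. The function names pack_msb_first and pack_lsb_first are hypothetical and are not taken from the task code; each element of bits[] is assumed to hold a single bit (0 or 1).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: pack up to 8 bits into one byte so that the first
   bit of the stream lands in the most significant position
   ("big endian" bit order). Unused low positions stay 0. */
uint8_t pack_msb_first(const uint8_t *bits, size_t nbits)
{
    assert(nbits <= 8);
    uint8_t byte = 0;
    for (size_t i = 0; i < nbits; i++)
        byte |= (uint8_t)((bits[i] & 1u) << (7 - i));
    return byte;
}

/* Hypothetical helper: same input, but the first bit of the stream lands
   in the least significant position ("little endian" bit order). */
uint8_t pack_lsb_first(const uint8_t *bits, size_t nbits)
{
    assert(nbits <= 8);
    uint8_t byte = 0;
    for (size_t i = 0; i < nbits; i++)
        byte |= (uint8_t)((bits[i] & 1u) << i);
    return byte;
}
```

For the bit sequence 1, 0, 1 the first function yields 0xA0 and the second 0x05, so a reader and a writer must agree on the convention even though only whole bytes ever reach the file.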
It is bit-oriented in the way common software I/O can be (it is not my intention to drive serial hardware this way!): you think you are writing one or more bits, but they are actually sent in groups of eight, since we can only write whole bytes (and no more than one byte at a time, or we would surely have endianness issues). Endianness during computation could be a point, but I can't see how it applies here, since endianness issues are about how bytes are stored in memory. Take a 0xFF held in a 16-bit word. On a little-endian arch it is written to memory as 0xFF 0x00. But when you load it, you can consider it logically (I love big endian!) as 0x00FF, and you can be sure that if you shift it left by one bit you will obtain 0x01FE. If you write that to memory, you again have 0xFE 0x01 on an LE arch. If instead you shift 0x00FF left 9 times, you will obtain 0xFE00 with a carry (or whatever the status flag for a bit shifted out on the left is called); in memory that is 0x00 0xFE on an LE arch. Only if you write it to memory and then read it back as sequential bytes instead of as a word do you get an issue. Luckily, even LE processors handle data in registers in a more logical way!
You can object that a variable such as d is stored in memory somewhere, so it is LE encoded. But all operations in a processor are rather endianness-logical (!)! So when I left-shift a 0x00FF that is stored in memory as 0xFF 0x00, the processor first loads the datum, so that you can now think of it as 0x00FF, then performs the left shift (by 8 bits, say), obtaining 0xFF00, then stores the result to memory as 0x00 0xFF. If I were reading from memory byte by byte, loading and shifting to build a datum longer than one byte, I would have to consider endianness seriously. But that is also why the read and write operations are performed byte by byte rather than by accumulating into a 16- or 32-bit word.
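The argument can be checked with a small sketch in C. The helper names (shift_left, host_is_little_endian, store_msb_first) are made up for illustration and do not come from the task code; the probe assumes the host is either little or big endian.

```c
#include <stdint.h>
#include <string.h>

/* Shift a 16-bit value left; the result depends only on the value,
   not on how the host lays the word out in memory. count must be < 16. */
uint16_t shift_left(uint16_t value, unsigned count)
{
    return (uint16_t)((uint32_t)value << count);
}

/* Look at the raw memory image of 0x00FF: a little-endian host stores
   the low byte (0xFF) first. Returns 1 for little endian, 0 otherwise. */
int host_is_little_endian(void)
{
    uint16_t probe = 0x00FF;
    uint8_t bytes[2];
    memcpy(bytes, &probe, sizeof probe);
    return bytes[0] == 0xFF;
}

/* Emit the word one byte at a time using arithmetic only, most
   significant byte first; the output is the same on every host. */
void store_msb_first(uint16_t value, uint8_t out[2])
{
    out[0] = (uint8_t)(value >> 8);
    out[1] = (uint8_t)(value & 0xFF);
}
```

shift_left(0x00FF, 1) is 0x01FE and shift_left(0x00FF, 9) is 0xFE00 on any host; only a raw memcpy of the word exposes the host byte order, while store_msb_first always produces 0x01 0xFE for 0x01FE.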
To put it briefly, I can't see any endianness issues. I am not interested in how the processor stores the unsigned int I take the bits from; the fact is that when I do datum << (32-4) for a 4-bit datum of 0xF, what I obtain is (luckily) 0xF0000000 (stored as 0x00 0x00 0x00 0xF0 on an LE arch). When I shift left by 1 bit again, I expect to get 0xE0000000, not 0xE0010000 (stored as 0x00 0x00 0x01 0xE0). I am making it sound harder than it is. It's late for me... hopefully tomorrow I will find better words. --ShinTakezou 01:07, 20 December 2008 (UTC)
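The left-align-and-shift step described above might look like the following in C. This is only a sketch under stated assumptions, not the task's actual code: left_align and take_top_bit are hypothetical names, and nbits is assumed to be between 1 and 32.

```c
#include <stdint.h>

/* Move the low nbits of datum to the top of a 32-bit word, so that the
   first bit to be written sits in bit 31. Requires 1 <= nbits <= 32
   (a shift by 32 would be undefined behaviour in C). */
uint32_t left_align(uint32_t datum, unsigned nbits)
{
    return datum << (32u - nbits);
}

/* Return the current top bit (0 or 1) and shift the remaining bits up
   by one position; zeros enter from the right. */
unsigned take_top_bit(uint32_t *word)
{
    unsigned bit = (unsigned)(*word >> 31);
    *word <<= 1;
    return bit;
}
```

With datum 0xF and nbits 4, left_align gives 0xF0000000; after one call to take_top_bit the word is 0xE0000000 and the returned bit is 1, whatever the byte order of the host.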