Jump to content

Talk:Bitwise IO: Difference between revisions

bits and task specification issues (blah blah! :D)
(Bits and bytes)
(bits and task specification issues (blah blah! :D))
Line 55:
 
: But you don't need all this philosophy in order to unambiguously specify the task... (:-)) Merry Christmas! --[[User:Dmitry-kazakov|Dmitry-kazakov]] 09:51, 23 December 2008 (UTC)
 
 
==After task rewriting; but still on the task sense dialog==
 
I still can't get the point and how it is (was) related to the way the task is (was) specified. But '''to me''' it sounds a little bit confusing what you are saying. In common computerish world, a byte is a minimal addressable storage unit; nonetheless, it is '''meaningful''' to talk about bits ordering in it, and in fact many processors allow to handle single bits into a byte (or bigger "units"), once it is loaded into a processor register. E.g. motorola 680x0 family has bset, bclr and btst to set, clear or test a single bit into a register (and by convention the bit "labelled" as "0" is the LSB; it is not my way of saying it, it was the one of the ''engineers'' who wrote the manual —shipped from Motorola— where I've read about 680x0 instructions set). The same applies if you want to use logical bitwise operations that all the processors I know have; for instance ''and'', ''or'', ''exclusive or'', ''not''. To keep all the stuff coherent, you must "suppose" (define an already defined) bit ordering.
 
A bit ordering is always defined, or a lot of things would be meaningless when computers are used. The following is an excerpt from RFC 793.
 
<pre>
TCP Header Format
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source Port | Destination Port |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Acknowledgment Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data | |U|A|P|R|S|F| |
| Offset| Reserved |R|C|S|S|Y|I| Window |
| | |G|K|H|T|N|N| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Checksum | Urgent Pointer |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Options | Padding |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| data |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
</pre>
 
Numbers on the top define a bit ordering of 32 bits (4 bytes). In order to give "universal" (shareable and portable) meaning to "numbers" that span more than a single byte, the big endian convention is used, and this must be stated, and in fact it is, somewhere (but again, I would refer to the big-endian convention as the "natural" one, since we are used to write numbers from left to right, from the most significant digit to the less significant digit &mdash;the unity). In order to check, get or set the flags URG, ACK, PSH, RST, SYN and FIN the bit ordering is essential; You may need to translate from a convention to another (e.g. if I would have loaded the 16 bit of the Source Port into a register of 680x0, which is a big endian processor, I would had the right "number" of the Source Port, since endianness matchs, but to refer to bit here labelled as 15, I should use 0 instead). Despite the bit ordering here given, I can access URG flags by an ''and'' mask, which I can build without considering that labelling: the mask would be simply 0000 0000 0010 0000 0000 0000 0000 0000, which a can write shortly as hex number 00200000, i.e. simply hex 200000, which I could also express (not conveniently but keeping the "meaning") as 2097152 (base 10). Big endianness apart (since the data are stored somewhere by computers that can have different conventions), no more specification is needed to build the mask.
 
The preferred ordering is the one in use on a particular system, and '''could''' be different. But it is not: since bytes are handled atomically, bit ordering into a byte, for all the ways common people are able to access them (i.e. using machine code or maybe assembly), is always the same. So that I can write the byte 01000001 (in hex 41), and be sure that it will be always the same, from "point" to "point" (i.e. transmission in the middle can change the thing, but some layer, at hw level or/and sw level, must "adjust" everything so that an "application" can read 0x41, and interpret it, e.g. as an ASCII "A").
 
Another example where nobody felt necessary to specify more than what it is obvious by exposing the matter itself, could be the MIPS instructions set you can read at [http://www.mrc.uidaho.edu/mrc/people/jff/digital/MIPSir.html MIPS reference]. In order to program an assembler for that MIPS, you don't need more information than the one that is already given in the page. Using "mine" implementation of ''bit'' ''oriented'' stream read/write functions, I could code the Add instruction in the following way:
 
<pre>
bits_write(0, 6, stdout);
bits_write(source, 5, stdout);
bits_write(target, 5, stdout);
bits_write(destreg, 5, stdout);
bits_write(0x20, 11, stdout);
</pre>
 
Where source, target and destreg are integers (only the first 5 bits are taken, so that the integer range is from 0 to 31). One could do it with ''and''s, ''or''s and ''shift''s, but then when writing the final 32 bit integer into memory, s/he must pay attention to endiannes, so again a bunch of and+shift could be used to "select" single bytes of the 32 bit integer and write the integer into memory in an endiannes-free way (a way which works on every processor, despite of its endianness). (Here I am also supposing that the processor in use is able to handle 32bit integers, i.e. is a 32 bit processor).
 
As the MIPS reference given above, I don't need to define how the bits define the bytes in the sequence. It is straightforward and "natural", as the MIPS bit encoding is. If I want to write the bit sequence 101010010101001001010101010, I've just to group it 8 by 8:
 
<pre>
1010 1001 0101 0010 0101 0101 010
\_______/ \_______/ \_______/ \____
</pre>
 
and "pad" the last bit "creating" fake 0s until we have a number of bits multiple of 8:
 
<pre>
1010 1001 0101 0010 0101 0101 010X XXXX
\_______/ \_______/ \_______/ \_______/
</pre>
 
So, the output bytes will be (in hex): A9 52 55 40. Also, to write that sequence as a whole, I must write <tt>bits_write(0x54A92AA, 27, stdout)</tt>, which could be a little bit confusing, but it is not, since if you write the hex number in base 2 you find exactly the sequence I wanted to write, right aligned (and only this one is a convention I decided in my code, and that could be different for different implementations, but this is also the most logical convention not to create code depending on the maximum size of a register: if the left-aligning would be left to the user of the function, he should code always it so that it takes into account the size of an "int" in that arch; in my implementation, this "count" is left to the functions, which in fact left align the bits taking into account the "real" size of the container &mdash;a #define also should allow to use the code on archs that have bytes made of a different number of bits, e.g. 9)
 
Another way of writing the same sequence, and obtain the same bytes as output, could be:
<pre>
bits_write(1, 1, stdout); bits_write(0, 1, stdout);
bits_write(1, 1, stdout); bits_write(0, 1, stdout);
</pre>
 
And so on. If the task is not well explained in these details, examples should clarify it. But maybe I am wrong. --[[User:ShinTakezou|ShinTakezou]] 01:14, 28 December 2008 (UTC)
Cookies help us deliver our services. By using our services, you agree to our use of cookies.