I do not understand some details in sections 9 and 10 of "A Painless Guide to CRC Error Detection Algorithms"

The guide mentioned in the title (downloaded from http://ross.net/crc/download/crc_v3.txt) has helped me perfectly grasp the principle, and the manual implementation, of a CRC algorithm. In sections 9 and 10, though, I hit what are probably minor snags.
The detail I do not understand in section 9, "A Table-Driven Implementation" (where message bytes are shifted into a register from the right), is in this part (emphasis in quotations is mine):
• The top byte of the register now doesn't matter. No matter how many times and at what offset the poly is XORed to the top 8 bits, they will all be shifted out the right hand side during the next 8 iterations anyway.
• The remaining bits will be shifted left one position and the rightmost byte of the register will be shifted in the next byte
I would think "right hand side" should be "left hand side" in the first bullet point?
In section 10, "A Slightly Mangled Table-Driven Implementation", which deals with the issue of appending zero bytes to the end of the message to be checked, I am confused by the following:
The trouble, you see, is that this loop operates upon the AUGMENTED message and in order to use this code, you have to append W/8 zero bytes to the end of the message before pointing p at it.
That is perfectly clear (the polynomial has width 32), but in the code that follows, and further down in the section, the author uses W/4:
for (i=0; i<W/4; i++) r = (r << 8) ^ t[(r >> 24) & 0xFF];
I would think that this should also have W/8?

There are typos in that text file. The top byte does matter, since each non-zero bit in it results in the polynomial being XORed into the register, and the bits at that point are being shifted to the left, so yes, "right hand side" should be "left hand side". Likewise, W/4 should be W/8, but note that the constant is not used in the later example code, which XORs each data byte with the top byte of the register to index the table, decrements len (a count of bytes), and increments p (a pointer to a byte):
r=0; while (len--) r = (r<<8) ^ t[(r >> 24) ^ *p++];
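For concreteness, here is a self-contained sketch of that final, unaugmented form. It is a sketch of the guide's plain algorithm, not of any standard CRC: I assume W=32, the polynomial 0x04C11DB7, a zero initial register, and no reflection or final XOR.

#include <stdint.h>
#include <stddef.h>

static uint32_t t[256];

/* Build the table: entry b is the net effect of eight shift/XOR
   steps with b sitting in the register's top byte. */
static void make_table(void)
{
    for (int b = 0; b < 256; b++) {
        uint32_t r = (uint32_t)b << 24;
        for (int k = 0; k < 8; k++)
            r = (r & 0x80000000u) ? (r << 1) ^ 0x04C11DB7u : (r << 1);
        t[b] = r;
    }
}

/* The unaugmented loop from section 10: each message byte is XORed
   with the register's top byte to index the table. */
static uint32_t crc(const unsigned char *p, size_t len)
{
    uint32_t r = 0;
    while (len--)
        r = (r << 8) ^ t[((r >> 24) ^ *p++) & 0xFFu];
    return r;
}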

Related

Explain Bit Test macro in C++

I'm trying to figure out how this code works, but I can't manage to get a single answer.
#define testbit(x, y) ( ( ((const char*) & (x))[(y)>>3] & 0x80 >> ((y)&0x07)) >> (7-((y)&0x07) ) )
I'm new to pointers, so if you can figure out a way to explain this in simplified English, I would really appreciate it.
It belongs to a segment of code for an X-Plane plug-in found at https://code.google.com/p/xplugins/source/browse/trunk/Xsaitekpanels/SwitchPanel.cpp?r=38, line 19.
The macro tests the value of the y-th bit in x. You can't directly address bits, so the code starts by treating x as an array of bytes (the const char* cast).
It then looks up the byte where the bit lives. There are 8 bits in a byte, so it divides by 8. Chasing performance, instead of simply dividing by 8, the code uses the binary trick of shifting right 3 places. In general, for unsigned x and y, x >> y = x/2^y, and x << y = x*2^y.
At this point you need to test the bit within the byte, so you take the remainder of y/8. Yet another bit trick: using y & 7 instead of the clearer y % 8.
With this information you can make a mask: a single set bit, 0x80, shifted into position to test the (y % 8)-th bit. The mask is ANDed against the byte; a non-zero result means the bit was 1, otherwise 0.
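For reference, here is the same logic written as a function (the names are mine); the != 0 at the end plays the role of the macro's final shift, so it returns exactly 0 or 1:

static int testbit_fn(const void *x, unsigned y)
{
    const unsigned char *bytes = (const unsigned char *)x;
    unsigned char byte = bytes[y >> 3];        /* y / 8: which byte */
    unsigned char mask = 0x80u >> (y & 7u);    /* y % 8: which bit  */
    return (byte & mask) != 0;                 /* 1 if set, else 0  */
}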
Completing @RhythmicFistman's answer
@RhythmicFistman's answer is missing one small part: the last step in the shifts.
The >> (7-((y)&0x07)) step ensures that you only ever get a result of 1 or 0. With this code it is safe to do comparisons like:
if (testbit(variable, 6) == 1) {
// do something
}
Without that step, testbit would instead return a bit mask in which the 6th bit is either 1 or 0 and all the other bits are always 0. That is the intent, but it is not implemented in what is considered a portable way; see Warning 3 below.
Possible issues with using this code
Now to add something to the other answers. They have not mentioned two keywords that matter here: strict aliasing and arithmetic shift right. My elaboration comes in the form of the warnings below.
Warning 1: Endianness
This code assumes a big-endian architecture, or that you only wish to get the correct bit from an array of chars.
The reason is that if you reinterpret an int as an array of chars (bytes), you will get different results on a big-endian machine than on a little-endian machine.
Warning 2: Strict Aliasing
The macro uses the cast (const char*) &(x), which is designed to change the type of (x), a.k.a. alias it, so that it is easier to get at the correct bits.
This is dangerous, and the reason why is explained beautifully in this SO answer. The short version is that if you compile this code with optimisations, strange things can happen.
The wikipedia pages on Aliasing and Pointer Aliasing are also useful and should be read.
Warning 3: Shift Arithmetic Right
In addition, there is a potential issue with the way this code uses the right-shift operator >>. That operator behaves differently depending on whether the value it operates on is signed or unsigned. As long as you never use negative numbers you will be safe, but this code will not protect you against that mistake. I suspect, though, that you are less likely to make such a mistake anyway, so it should be OK to use.
Also worth mentioning: the macro reads the data through signed char and shifts it right. Though this works, I would prefer unsigned char, which improves portability because it cannot produce an arithmetic right shift even when char and int are the same width (which is almost never the case in practice, granted). This works because char is promoted to int for the shift; see this SO answer for an explanation.
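A tiny standalone sketch of the pitfall (the values are mine; the result of right-shifting a negative value is implementation-defined, which is exactly the portability problem):

#include <stdio.h>

int main(void)
{
    signed char   s = (signed char)0x80;  /* -128 on two's complement */
    unsigned char u = 0x80;

    /* Both are promoted to int before the shift. */
    printf("%d\n", s >> 7);  /* implementation-defined; commonly -1 */
    printf("%d\n", u >> 7);  /* always 1                            */
    return 0;
}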
What you see is a macro that does the following, in order:
Shifts y right by 3 bits to get a byte index
Takes the address of x and picks the char at that index (treating x as an array of chars)
ANDs the selected char with a mask, 0x80 shifted right by (y & 0x07)
Shifts that result right by 7 - (y & 0x07)
Well, does this help you? I don't think so!
This macro is clearly improper, and kind of tricky.
Bit masks, bitwise operations, bit shifts...
If you can explain more precisely what you want to understand here, maybe I can be helpful.

Bit Shifting in Cache Simulation

What is the formula for calculating the index and tag bits in
Direct Mapped Cache
Associative Cache
Set Associative Cache
I am currently using this formula for Direct Mapped:
#define BLOCK_SHIFT 5
#define CACHE_SIZE 4096
int index = (address >> BLOCK_SHIFT) & (CACHE_SIZE-1);
/* in the line above we want the "middle bits" that say where the block goes */
long tag = address >> BLOCK_SHIFT; /* the high order bits are the tag */
Please tell me how many bits are shifted for an associative and a set-associative cache.
So, I think the concrete answer to your question is "zero", but that's simply because you are asking the wrong question.
Right, so a cache with a given size X, that is directly mapped, will simply use the lower part [or some other part(s)] of the address to form the index into the cache. So the index is a value between 0 and (cache_size - 1); in other words, "address modulo size". Since cache sizes are nearly always 2^n, we make use of the fact that this can be computed with a simple bitwise "and" with (size - 1) instead of a divide.
In your code, each cache entry (cache line) holds a "BLOCK" of 32 bytes, so the address should be divided (shifted) down by the block size: 2^5 = 32. This shift remains constant for a constant cache-line size. Since there is no other shift in your example code, I presume you are misunderstanding what you should do.
In a set-associative cache, there are multiple sets of cache lines that can be used for the same index. So instead of simply taking the lower part of the address as an index, we take a SMALLER part of the lower address. So index = address_of_block & (CACHE_SIZE-1) should become index = address_of_block & ((CACHE_SIZE-1) / ways). Since we are dealing with a 2^n number again, we can use the old "shift instead of divide" trick: x / y, where y is 2^n, can be done as x >> n.
So, now you just have to figure out what n is for your number of ways.
And of course, figure out how you determine which of the ways to use when replacing something in the cache, but that is certainly a completely different question.
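For concreteness, here is a sketch with made-up sizes (a 4096-byte cache with 32-byte lines and 4 ways, hence 4096 / 32 / 4 = 32 sets; every name and constant below is illustrative, not from the question):

#include <stdint.h>

#define BLOCK_SHIFT 5                 /* log2 of the 32-byte line size */
#define SET_BITS    5                 /* log2 of the 32 sets           */
#define NUM_SETS    (1u << SET_BITS)

static uint32_t cache_index(uint32_t address)
{
    /* Drop the block-offset bits, keep just enough bits to pick a set. */
    return (address >> BLOCK_SHIFT) & (NUM_SETS - 1);
}

static uint32_t cache_tag(uint32_t address)
{
    /* Everything above the offset and index bits is the tag. */
    return address >> (BLOCK_SHIFT + SET_BITS);
}

In a fully associative cache the index disappears entirely (any line can hold any block), which is why the direct answer above is "zero": the tag is simply address >> BLOCK_SHIFT.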

Writing binary data in c++

I am in the process of building an assembler for a rather unusual machine that a few other people and I are building. This machine takes 18-bit instructions, and I am writing the assembler in C++.
I have collected all of the instructions into a vector of 32-bit unsigned integers, none of which is any larger than what can be represented with an 18-bit unsigned number.
However, there does not appear to be any way (as far as I can tell) to output such an unusual number of bits to a binary file in C++. Can anyone help me with this?
(I would also be willing to use C's stdio and FILE structures; however, there still does not appear to be any way to output such an arbitrary number of bits.)
Thank you for your help.
Edit: It looks like I didn't specify how the instructions will be stored in memory well enough.
Instructions are contiguous in memory. Say the instructions start at location 0 in memory:
The first instruction will be at 0, the second at 18, the third at 36, and so on.
There are no gaps and no padding in the instructions. There can be a few superfluous 0s at the end of the program if needed.
The machine uses big-endian instructions, so an instruction stored as 3 should map to: 000000000000000011
Keep an eight-bit accumulator.
Shift bits from the current instruction into the accumulator until either:
The accumulator is full; or
No bits remain of the current instruction.
Whenever the accumulator is full:
Write its contents to the file and clear it.
Whenever no bits remain of the current instruction:
Move to the next instruction.
When no instructions remain:
Shift zeros into the accumulator until it is full.
Write its contents.
End.
For n instructions, this will leave (8 - 18n mod 8) zero bits after the last instruction (and none at all when 18n is already a multiple of 8).
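A direct transcription of that procedure in C might look like this sketch (the function name is mine, and I assume the 18 bits sit in the low bits of each 32-bit word and are emitted MSB-first, per the big-endian requirement above):

#include <stdint.h>
#include <stdio.h>

static void write_instructions(FILE *out, const uint32_t *ins, size_t n)
{
    unsigned acc = 0;   /* the eight-bit accumulator      */
    int nbits = 0;      /* how many bits it holds so far  */

    for (size_t i = 0; i < n; i++) {
        for (int b = 17; b >= 0; b--) {         /* MSB of the 18 bits first */
            acc = (acc << 1) | ((ins[i] >> b) & 1u);
            if (++nbits == 8) {                 /* accumulator full: flush  */
                fputc((int)acc, out);
                acc = 0;
                nbits = 0;
            }
        }
    }
    if (nbits > 0)                              /* pad the last byte with 0s */
        fputc((int)(acc << (8 - nbits)), out);
}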
There are a lot of ways you can achieve the same end result (I am assuming the end result is a tight packing of these 18 bits).
A simple method would be to create a bit-packer class that accepts the 32-bit words, and generates a buffer that packs the 18-bit words from each entry. The class would need to do some bit shifting, but I don't expect it to be particularly difficult. The last byte can have a few zero bits at the end if the original vector length is not a multiple of 4. Once you give all your words to this class, you can get a packed data buffer, and write it to a file.
You could maybe represent your data in a bitset and then write the bitset to a file.
That wouldn't work with fstream's write function, but there is a way, described here...
The short answer: Your C++ program should output the 18-bit values in the format expected by your unusual machine.
We need more information: specifically, the format that your "unusual machine" expects, or more precisely, the format that your assembler should be outputting. Once you understand what the format of the output you're generating is, the answer should be straightforward.
One possible format — I'm making things up here — is that we could take two of your 18-bit instructions:
instruction 1 instruction 2 ...
MSB LSB MSB LSB ...
bits → ABCDEFGHIJKLMNOPQR abcdefghijklmnopqr ...
...and write them in an 8-bits/byte file thus:
KLMNOPQR CDEFGHIJ 000000AB klmnopqr cdefghij 000000ab ...
...this is basically arranging the values in "little-endian" form, with 6 zero bits padding the 18-bit values out to 24 bits.
But I'm assuming all of it: the padding, the little-endianness, the number of bits per byte, etc. Without more information, it's hard to say if this answer is even remotely near correct, or if it is exactly what you want.
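If that guessed-at padded format happened to be what the machine wants, the writer is only a few lines (again a sketch; none of this is confirmed by the question):

#include <stdint.h>
#include <stdio.h>

/* Write one 18-bit instruction as three bytes, least significant
   byte first, with the top 6 bits of the third byte zero. */
static void write_padded(FILE *out, uint32_t ins)
{
    fputc((int)( ins        & 0xFF), out);   /* KLMNOPQR */
    fputc((int)((ins >> 8)  & 0xFF), out);   /* CDEFGHIJ */
    fputc((int)((ins >> 16) & 0x03), out);   /* 000000AB */
}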
Another possibility is a tight packing:
ABCDEFGH IJKLMNOP QRabcdef ghijklmn opqr0000
or
ABCDEFGH IJKLMNOP abcdefQR ghijklmn 0000opqr
...but I've made assumptions about where the corner cases go here.
Just output them to the file as 32-bit unsigned integers, just as you have them in memory, with the endianness that you prefer.
Then, when the loader / EEPROM writer / JTAG or whatever method you use sends the code to the machine, for each 32-bit word that is read, just omit the 14 most significant bits and send the real 18 bits to the target.
Unless, of course, you have written a FAT driver for your machine...

Defining a byte in C++

In http://www.parashift.com/c++-faq-lite/intrinsic-types.html#faq-26.6, it is written that
"Another valid approach would be to define a "byte" as 9 bits, and simulate a char* by two words of memory: the first could point to the 36-bit word, the second could be a bit-offset within that word. In that case, the C++ compiler would need to add extra instructions when compiling code using char* pointers."
I couldn't understand what it meant by "simulating char* by two words" and further quote.
Could somebody please explain it by giving an example ?
I think this is what they were describing:
The PDP-10 referenced in the second paragraph had 36-bit words and was unable to address anything inside of those words. The following text is a description of one way that this problem could have been solved while fitting within the restrictions of the C++ language spec (that are included in the first paragraph).
Let's assume that you want to make 9-bit-long bytes (for some reason). By the spec, a char* must be able to address individual bytes. The PDP-10 can't do this, because it can't address anything smaller than a 36-bit word.
One way around the PDP-10's limitations would be to simulate a char* using two words of memory. The first word would be a pointer to the 36-bit word containing the char (this is normally as precise as the PDP-10's pointers allow). The second word would indicate an offset (in bits) within that word. Now, the char* can access any byte in the system and complies with the C++ spec's limitations.
ASCII-art visual aid:
| Byte 1 | Byte 2 | Byte 3 | Byte 4 | Byte 5 | Byte 6 | Byte 7 | Byte 8 |
-------------------------------------------------------------------------
| Word 1 | Word 2 |
| (Address) | (Offset) |
-------------------------------------------------------------------------
Say you had a char* with word1 = 0x0100 and word2 = 0x12. This would point to the 18th bit (the start of the third byte) of the 256th word of memory.
If this technique were really used to produce a conforming C++ implementation on the PDP-10, then the C++ compiler would have to do some extra work juggling the extra bits required by this rather funky internal format.
The whole point of that article is to illustrate that a char isn't always 8 bits. It is at least 8 bits, but there is no defined maximum. The internal representation of data types is dependent on the platform architecture and may be different than what you expect.
Since the C++ spec says that a char* must be able to point to individual bytes, and the PDP-6/10 does not allow addressing individual bytes in a word, you have a problem with char* (which is a byte pointer) on the PDP-6/10.
So one workaround is: define a byte as 9 bits; then you essentially have 4 bytes in a word (4 * 9 = 36 bits = 1 word).
You still can't have char* point to individual bytes on the PDP-6/10, so instead make char* out of two 36-bit words. The lower word would be the actual address, and the upper word would be some byte-mask magic that the C++ compiler could use to pick out the right 9 bits at that address.
In this case,
sizeof(int*) (36 bits) is different from sizeof(char*) (72 bits).
It's just a contrived example that shows how the spec doesn't constrain primitives to specific bit/byte sizes.
data: [char1|char2|char3|char4]
To access char1:
ptrToChar = &data
index = 0
To access char2:
ptrToChar = &data
index = 9
To access char3:
ptrToChar = &data
index = 18
...
then to access a char, you would:
(*ptrToChar >> index) & 0x001ff
but ptrToChar and index would be saved in some sort of structure that the compiler creates so they would be associated with each other.
Actually, the PDP-10 can address (load, store) 'bytes' smaller than a (36-bit) word with a single-word pointer. On the -10, a byte pointer includes the address of the word containing the 'byte', the width (in bits) of the 'byte', and the position (in bits from the right) of the 'byte' within the word. Incrementing the pointer (with an explicit increment, or an increment-and-load/deposit instruction) increments the position part by the size part and handles overflow to the next word address. (No decrementing, though.) A byte pointer can, e.g., address individual bits, but 6, 8, 9, and 18(!) bits were probably common, as there were specially formatted versions of byte pointers (global byte pointers) that made their use somewhat easier.
Suppose a PDP-10 implementation wanted to get as close to 8-bit bytes as possible. The most reasonable way to split up a 36-bit word (the smallest unit of memory that the machine's assembly language can address) is to divide the word into four 9-bit bytes. To access a particular 9-bit byte, you need to know which word it's in (you'd use the machine's native addressing mode for that, with a pointer that takes up one word), and you'd need extra data to indicate which of the 4 bytes inside the word is the one you're interested in. This extra data would be stored in a second machine word. The compiler would generate lots of extra instructions to pull the right byte out of the word, using the extra data stored in that second word.
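Here is a sketch in ordinary C, with invented names, of such a two-word "pointer" and its byte-load and increment operations (memory stands in for the machine's word-addressed RAM, and byte 1 is taken to be the leftmost 9 bits of the word, as in the diagram above):

#include <stdint.h>

typedef struct {
    uint64_t word_addr;  /* first word: which 36-bit word          */
    unsigned bit_off;    /* second word: 0, 9, 18 or 27 within it  */
} sim_char_ptr;

/* Fetch the 9-bit byte the simulated pointer refers to. */
static unsigned load_byte(const uint64_t *memory, sim_char_ptr p)
{
    uint64_t word = memory[p.word_addr] & 0xFFFFFFFFFULL;  /* 36 bits */
    return (unsigned)((word >> (27 - p.bit_off)) & 0x1FF);
}

/* Advance to the next byte, carrying into the next word. */
static sim_char_ptr next_byte(sim_char_ptr p)
{
    p.bit_off += 9;
    if (p.bit_off == 36) {
        p.bit_off = 0;
        p.word_addr++;
    }
    return p;
}

This is the "extra work" mentioned above: every char dereference costs a masked shift, and every pointer increment costs a compare and a possible carry.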

Weird usage of "&" for a novice C++ programmer

I have some code here, and don't really understand the ">>" and the "&". Can someone clarify?
buttons[0] = indata[byteindex]&1;
buttons[1] = (indata[byteindex]>>1)&1;
rawaxes[7] = (indata[byteindex]>>4)&0xf;
These are bitwise operators, meaning they operate on the binary bits that make up a value. See Bitwise operation on Wikipedia for more detail.
& is for AND
If indata[byteindex] is the number 4, then in binary it would look like 00000100. ANDing this number with 1 gives 0, because bit 0 (the lowest bit) is not set:
00000100 AND 00000001 = 0
If the value is 5 however, then you will get this:
00000101 AND 00000001 = 1
Any bit matched with the mask is allowed through.
>> is for right-shifting
Right-shifting shifts bits along to the right!
00010000 >> 4 = 00000001
One of the standard patterns for extracting a bit field is (reg >> offset) & mask, where reg is the register (or other memory location) you're reading, offset is how many least-significant bits you skip over, and mask is the set of bits that matter. The >> offset step can be omitted if offset is 0. mask is usually equal to 2^width - 1, or (1 << width) - 1 in C, where width is the number of bits in the field.
So, looking at what you have:
buttons[0] = indata[byteindex]&1;
Here, offset is 0 (it was omitted) and mask is 1. So this gets just the least-significant bit in indata[byteindex]:
bit number -> 7 6 5 4 3 2 1 0
+-+-+-+-+-+-+-+-+
indata[byteindex] | | | | | | | |*|
+-+-+-+-+-+-+-+-+
|
\----> buttons[0]
Next:
buttons[1] = (indata[byteindex]>>1)&1;
Here, offset is 1 and width is 1...
bit number -> 7 6 5 4 3 2 1 0
+-+-+-+-+-+-+-+-+
indata[byteindex] | | | | | | |*| |
+-+-+-+-+-+-+-+-+
|
\------> buttons[1]
And, finally:
rawaxes[7] = (indata[byteindex]>>4)&0xf;
Here, offset is 4 and width is 4 (2^4 - 1 = 16 - 1 = 15 = 0xf):
bit number -> 7 6 5 4 3 2 1 0
+-+-+-+-+-+-+-+-+
indata[byteindex] |*|*|*|*| | | | |
+-+-+-+-+-+-+-+-+
| | | |
\--v--/
|
\---------------> rawaxes[7]
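All three lines are instances of that single pattern, so they can be wrapped in a helper. A standalone sketch (the macro name and test value are mine, not from the question):

#include <stdio.h>

/* (reg >> offset) & mask, with the mask built from the field width. */
#define EXTRACT(reg, offset, width) \
    (((reg) >> (offset)) & ((1u << (width)) - 1u))

int main(void)
{
    unsigned char byte = 0xB5;            /* 1011 0101                */
    printf("%u\n", EXTRACT(byte, 0, 1));  /* bit 0     -> 1           */
    printf("%u\n", EXTRACT(byte, 1, 1));  /* bit 1     -> 0           */
    printf("%u\n", EXTRACT(byte, 4, 4));  /* bits 4..7 -> 11 (0xB)    */
    return 0;
}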
EDIT...
but I don't understand what the point of it is...
Mike pulls up a rocking chair and sits down.
Back in the old days of 8-bit CPUs, a computer typically had 64K (65 536 bytes) of address space. Now we wanted to do as much as we could with our fancy whiz-bang machines, so we would do things like buy 64K of RAM and map everything to RAM. Shazam, 64K of RAM and bragging rights all around.
But a computer that can only access RAM isn't much good. It needs some ROM for an OS (or at least a BIOS), and some addresses for I/O. (You in the back--siddown. I know Intel chips had separate address space for I/O, but it doesn't help here because the I/O space was much, much smaller than the memory space, so you ran into the same constraints.)
Address space used for ROM and I/O was space that wasn't accessible as RAM, so you wanted to minimize how much space wasn't used for RAM. So, for example, when your I/O peripheral had five different things whose status amounted to a single bit each, rather than give each one of those bits its own byte (and, hence, address), they got the brilliant idea of packing all five of those bits into one byte, leaving three bits that did nothing. Voila, the Interrupt Status Register was born.
The hardware designers were also impressed with how fewer addresses resulted in fewer address bits (since address bits is ceiling of log-base-2 of number of addresses), meaning fewer address pins on the chip, freeing pins for other purposes. (These were the days when 48-pin chips were considered large, and 64-pins huge, and grid array packages were out of the question because multi-layer circuit boards were prohibitively expensive. These were also the days before multiplexing the address and data on the same pins became commonplace.)
So the chips were taped out and fabricated, and hardware was built, and then it fell to the programmers to make the hardware work. And lo, the programmers said, "WTF? I just want to know if there is a byte to read in the bloody serial port, but there are all these other bits like "receiver overrun" in the way." And the hardware guys considered this, and said, "tough cookies, deal with it."
So the programmers went to the Guru, the guy who hadn't forgotten his Boolean algebra and was happy not to be writing COBOL. And the Guru said, "use the Bit AND operation to force those bits you don't care about to 0. If you need a number, and not just a zero-or-nonzero, use a logical shift right (LSR) on the result." And they tried it. It worked, and there was much rejoicing, though the wiser ones started wondering about things like race conditions in a read-modify-write cycle, but that's a story for another time.
And so the technique of packing loosely or completely unrelated bits into registers became commonplace. People developing protocols, which always want to use fewer bits, jumped on these techniques as well. And so, even today, with our gigabytes of RAM and gigabits of bandwidth, we still pack and unpack bitfields with expressions whose legibility borders on keyboard head banging.
(Yes, I know bit fields probably go back to the ENIAC, and maybe even the Difference Engine if Lady Ada needed to stuff two data elements into one register, but I haven't been alive that long, okay? I'm sticking with what I know.)
(Note to hardware designers out there: There really isn't much justification anymore for packing things like status flags and control bits that a driver writer will want to use independently. I've done several designs with one bit per 32-bit register in many cases. No bit shifting or masking, no races, driver code is simpler to write and understand, and the address decode logic is trivially more complex. If the driver software is complex, simplifying flag and bitfield handling can save you a lot of ROM and CPU cycles.)
(More random trivia: The Atmel AVR architecture (used in the Arduino, among many other places) has some specialized bit-set and bit-clear instructions. The avr-libc library used to provide macros for these instructions, but now the gcc compiler is smart enough to recognize that reg |= (1 << bitNum); is a bit set and reg &= ~(1 << bitNum); is a bit clear, and puts in the proper instruction. I'm sure other architectures have similar optimizations.)
These are bitwise operators.
& ANDs its two arguments bit by bit.
>> shifts the bits of its first argument to the right by the number given as the second argument.
<< does the opposite. | is bitwise OR and ^ is bitwise XOR, just as & is bitwise AND.
In English, the first line grabs only the lowest bit (bit 0) of indata[byteindex] and stores it in buttons[0]. Basically, if the value is odd it will be 1; if even, it will be 0.
The second line grabs the next bit (bit 1). If that bit is set, it stores 1, else 0. It could have also been written as
buttons[1] = (indata[byteindex]&2)>>1;
and it would have done the same thing.
The last (3rd) line grabs the 5th through 8th bits (bits 4-7). Basically, it will be a number from 0 to 15 when it is complete. It also could have been written as
rawaxes[7] = (indata[byteindex]&0xf0) >> 4;
and done the same thing. I'd also guess from context that these arrays are unsigned char arrays. Just a guess though.
The '&' (in this case) is the bitwise AND operator and '>>' is the bit-shift operator (so x>>y yields x shifted right by y bits).
So, they're taking the least significant bit of indata[byteindex] and putting it into buttons[0]. They're taking the next least significant bit and putting it into buttons[1].
The last one probably needs to be looked at in binary to make a lot of sense. 0xf is 1111 in binary, so they're taking the input, shifting it right 4 bits, then retaining the 4 least significant bits of that result.