How to bit shift when only select bits are affected - bit-manipulation

I am really confused by a bit-shifting problem. The idea is that I have 8 bits; the first 3 bits must stay constant no matter what, while all the wrapping around and bit shifting has to happen only in the remaining 5 bits. How do I go about implementing such a mapping using only bitwise and bit-shift operators?
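A minimal sketch of one way to do it, assuming the constant bits are the top 3 (bits 7..5) and the rotation happens within the lower 5 bits; the function name rotl5 and that bit assignment are my assumptions, not from the question:

#include <stdint.h>
#include <stdio.h>

/* Rotate only the lower 5 bits of an 8-bit value left by n positions,
   leaving the upper 3 bits untouched. Swap the masks if the constant
   bits are the bottom 3 instead. */
static uint8_t rotl5(uint8_t value, unsigned n)
{
    uint8_t high = value & 0xE0;   /* keep bits 7..5 as they are       */
    uint8_t low  = value & 0x1F;   /* isolate bits 4..0                */
    n %= 5;                        /* rotation is within a 5-bit field */
    uint8_t rotated = (uint8_t)(((low << n) | (low >> (5 - n))) & 0x1F);
    return (uint8_t)(high | rotated);
}

int main(void)
{
    uint8_t x = 0xB3;                    /* top 3 bits 101, lower 5 bits 10011 */
    printf("0x%02X\n", rotl5(x, 2));     /* lower 5 bits become 01110 -> 0xAE  */
    return 0;
}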

Related

bit-reflected constant of PCLMULQDQ fast-crc

I am confused about how to calculate the bit-reflected constants in the white paper "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ Instruction".
In the post Fast CRC with PCLMULQDQ NOT reflected and How the bit-reflect constant is calculated when we use CLMUL in CRC32, @rcgldr mentioned that "...are adjusted to compensate for the shift, so instead of x^(a) mod poly, it's (x^(a-32) mod poly)<<32...", but I do not understand what this means.
For example, the constant k1=(x^(4*128+64)%P(x))=0x8833794c (on page 16) vs. k1'=(x^(4*128+64-32)%P(x)<<32)'=(0x154442db4>>1) (on page 22): I can't see any reflection relationship between those two figures (10001000_00110011_01111001_01001100 vs. 10101010_00100010_00010110_11011010).
I guess my question is: why does the exponent need to be reduced by 32 to compensate for the 32-bit left shift, and why are k1 and (k1)' not reflections of each other?
Could you please help interpret it? Thanks.
I had carefully searched for the answer to this question on the internet, especially on Stack Overflow, and I tried to understand the related posts, but I need some experts to explain further.
I modified what were originally some Intel examples to work with Visual Studio | Windows, non-reflected and reflected, for 16, 32, and 64 bit CRC, in this GitHub repository.
https://github.com/jeffareid/crc
I added some missing comments and also added a program to generate the constants used in the assembly code for each of the 6 cases.
instead of x^(a) mod poly, it's (x^(a-32) mod poly)<<32
This is done for non-reflected CRC. The CRC is kept in the upper 32 bits, as well as the constants, so that the result of PCLMULQDQ ends up in the upper 64 bits, and then right shifted. Shifting the constants left 32 bits is the same as multiplying by 2^32, or with polynomial notation, x^32.
For reflected CRC, the CRC is kept in the lower 32 bits, which are logically the upper 32 bits of a reflected number. The issue is that PCLMULQDQ effectively multiplies the product by 2, right shifting the product by 1 bit, leaving bit 127 == 0 and the 127-bit product in bits 126 to 0. To compensate for that, the constants are (x^(a) mod poly) << 1 (a left shift of a reflected number is == divide by 2).
The example code at that github site includes crc32rg.cpp, which is the program to generate the constants used by crc32ra.asm.
Another issue occurs when doing 64 bit CRC. For non-reflected CRC, sometimes the constant is 65 bits (for example, if the divisor is 7), but only the lower 64 bits are stored, and the 2^64 bit handled with a few more instructions. For reflected 64 bit CRC, since the constants can't be shifted left, (x^(a-1) mod poly) is used instead.
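To make the exponent arithmetic concrete, here is a rough sketch (not from the repository above, and the function name xpow_mod is my own) that computes x^n mod P(x) bit by bit for the non-reflected CRC-32 polynomial 0x04C11DB7, and prints both the plain constant x^a mod P and the shifted form (x^(a-32) mod P) << 32 that the folding code actually stores:

#include <stdint.h>
#include <stdio.h>

/* Compute x^n mod P(x) over GF(2), where P is a degree-32 polynomial and
   poly holds its low 32 coefficients (0x04C11DB7 for non-reflected CRC-32). */
static uint32_t xpow_mod(uint64_t n, uint32_t poly)
{
    uint32_t r = 1;                           /* start at x^0            */
    for (uint64_t i = 0; i < n; i++) {
        uint32_t carry = r & 0x80000000u;     /* coefficient of x^31     */
        r <<= 1;                              /* multiply by x           */
        if (carry)
            r ^= poly;                        /* reduce the x^32 term    */
    }
    return r;
}

int main(void)
{
    const uint32_t poly = 0x04C11DB7u;
    const uint64_t a = 4 * 128 + 64;          /* exponent used for k1    */

    printf("x^%llu mod P          = 0x%08X\n",
           (unsigned long long)a, xpow_mod(a, poly));
    printf("(x^%llu mod P) << 32  = 0x%08X00000000\n",
           (unsigned long long)(a - 32), xpow_mod(a - 32, poly));
    return 0;
}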
@rcgldr I think I didn't catch your point, to be honest... probably I didn't make my question clear...
If my understanding of the code (reverse CRC32) is correct, then, taking the simplest scenario as an example, the procedure for folding one 32-byte block is shown here. I don't understand why the exponents used in the constants are not 128 and 192 (= 128 + 64), respectively.

Is the MSB of a hex representation of binary on the left side or the right side?

In an answer to a bit-padding related question here, the respondent made the following statement:
uint32 value of 69 is an integer value; if you want to pad it to 32 bytes
(the EVM-native length), then you have to consider the "endianness" (Read more here):
* Big-endian: 0x0000.....0045 (32 bytes)
* Little-endian: 0x4500.....00000 (32 bytes)
Apologies if this is a trite question, but what determines that
in the case of big-endian the most significant bit is on the left side, hence the padding goes on the left, but
in the case of little-endian the most significant bit is on the right side, hence the padding goes on the right?
Because my understanding is that with big-endian the most significant bits get read first into the least significant memory address, but with little-endian the least significant bits get read first into the least significant memory address.
The above padding scheme seems to suggest that, when represented in hex, the left side gets read in first with big-endian, but with little-endian the right side gets read in first; again, the most significant bit is on the left side, hence the padding goes on the left when dealing with big-endian, but on the right when dealing with little-endian.
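To separate hex notation from memory layout, here is a small sketch (not tied to the EVM) that prints a uint32_t both ways on whatever machine it runs on; the variable names are mine:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    uint32_t value = 69;                  /* 0x00000045 in hex notation  */
    unsigned char bytes[sizeof value];

    memcpy(bytes, &value, sizeof value);  /* view the value as raw bytes */

    /* Hex notation always writes the most significant digit on the left,
       regardless of the machine; endianness only changes memory order. */
    printf("hex notation: 0x%08X\n", value);
    printf("memory order:");
    for (size_t i = 0; i < sizeof bytes; i++)
        printf(" %02X", bytes[i]);        /* 45 00 00 00 on little-endian,
                                             00 00 00 45 on big-endian   */
    printf("\n");
    return 0;
}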

Why are CRC Polynomials given as Normal, Reversed, etc.?

I'm learning about CRCs, and search engines and SO turn up nothing on this....
Why do we have "Normal" and "Reversed" and "Reciprocal" Polynomials? Does one favor Big Endian, Little Endian, or something else?
The classic definition of a CRC would use a non-reflected polynomial, which shifts the CRC left. If the word size being used for the calculation is larger than the CRC, then you would need an operation at the end to clear the high bits that the CRC was shifted into (e.g. & 0xffff for a 16-bit CRC).
You can flip the whole thing, use a reflected polynomial, and shift right instead of left. That gives the same CRC properties, but the bits from the message are effectively operated on from least to most significant bit, instead of most to least significant bit. Since you are shifting right, the extraneous bits get dropped off the bottom into oblivion, and there is no need for the additional operation. This may have been one of the early motivations to use a very slightly faster and more compact implementation.
Sometimes the specification from the original hardware is that the bits are processed from least to most significant, so then you have to use the reflected version.
No, none of this favors little or big endian. Either kind of CRC can be computed just as easily in little-endian or big-endian architectures.
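As a sketch of the two conventions described above, here are bit-at-a-time CRC-16 routines, one non-reflected (left shift, polynomial 0x1021) and one reflected (right shift, polynomial 0x8408, the reflection of 0x1021); the zero initial value and polynomial choice are assumptions for illustration:

#include <stdint.h>
#include <stddef.h>

/* Non-reflected ("normal") CRC-16: shift left, process bits MSB first. */
uint16_t crc16_normal(const uint8_t *data, size_t len)
{
    uint16_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)(data[i] << 8);         /* byte enters at the top of the CRC */
        for (int b = 0; b < 8; b++)
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                 : (uint16_t)(crc << 1);
    }
    return crc;   /* the uint16_t type already discards bits shifted out the top */
}

/* Reflected ("reversed") CRC-16: shift right, process bits LSB first. */
uint16_t crc16_reflected(const uint8_t *data, size_t len)
{
    uint16_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];                          /* byte enters at the bottom of the CRC */
        for (int b = 0; b < 8; b++)
            crc = (crc & 1) ? (uint16_t)((crc >> 1) ^ 0x8408)
                            : (uint16_t)(crc >> 1);
    }
    return crc;   /* bits dropped off the bottom need no extra masking */
}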

Bitstream parsing and Endianness

I am trying to parse a bitstream, and I am having trouble getting my head around endianness. I have a byte buffer, and I need to be able to read bitfields out which are of varying lengths, anywhere from 1 bit to 8 bits mostly.
My problem comes with the endianness of the bytes. When I step through with a debugger, the bottom 4 bits appear to be in the top portion of the byte. That is, I am expecting the first two bits to be 10 (they must be 10); however, the first byte in the bitstream is 0xA3, or 1010 0011, when checking with the debugger. Meaning that, assuming the bits are in the "correct" order, the first two bits are in fact 11 (reading right to left).
It would seem, however, that if the bits were not in the right order and should instead be 0x3A, or 0011 1010, then I would have 10 as my expected first two bits.
This confuses me, because it doesn't seem to be a matter of bit order, MSb to LSb/LSb to MSb, but rather nibble order. How does this happen? That seems to just be the way it came out of the file. There is a possibility this is an invalid bitstream, but I have seen this kind of thing before when reading files in Hex Editors, nibbles seemingly in the "wrong" order.
I am just confused and would like some help understanding what's going on. I don't often deal with things at this level.
You don't need to worry about the bit order, because in C/C++ there is no way for you to iterate through the bits using pointer arithmetic. You can only manipulate the bits using bitwise operators, which are independent of the bit order of the local machine. What you mentioned in the OP is just a matter of visualization. Different debuggers may choose different ways to visualize the bits in a byte; there is no right or wrong for this, just preference. What really matters is the byte order.
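As an illustration of extracting bitfields with shifts and masks rather than worrying about machine bit order, here is a minimal MSB-first bit reader sketch (the BitReader type and read_bits helper are my own invention, not from the question; whether a stream is packed MSB-first or LSB-first is defined by the format's specification):

#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

/* A tiny MSB-first bit reader: fields are taken from the most significant
   bit of each byte downward. */
typedef struct {
    const uint8_t *buf;
    size_t bitpos;                       /* absolute bit position in buf */
} BitReader;

static unsigned read_bits(BitReader *r, unsigned count)   /* count <= 8 here */
{
    unsigned value = 0;
    for (unsigned i = 0; i < count; i++) {
        size_t byte = r->bitpos >> 3;
        unsigned bit = 7 - (r->bitpos & 7);                /* MSB first */
        value = (value << 1) | ((r->buf[byte] >> bit) & 1u);
        r->bitpos++;
    }
    return value;
}

int main(void)
{
    const uint8_t stream[] = { 0xA3 };   /* 1010 0011                     */
    BitReader r = { stream, 0 };
    printf("%u\n", read_bits(&r, 2));    /* prints 2, i.e. binary 10      */
    printf("%u\n", read_bits(&r, 6));    /* remaining 6 bits: 100011 = 35 */
    return 0;
}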

Bit order in C/C++

I have to implement a protocol which defines data in 8-bit words that start with the least significant bit (LSB) first. I want to represent this data with unsigned char, but I don't know the bit order of the LSB and most significant bit (MSB) in C/C++, which could possibly require swapping the bits.
Can anybody explain to me how to find out how an unsigned char is encoded: MSB-LSB or LSB-MSB?
Example:
unsigned char b = 1;
MSB-LSB: 0000 0001
LSB-MSB: 1000 0000
Endianness is platform dependent. Anyway, you don't have to worry about actual bit order unless you are serializing the bytes, which you may be trying to do, in which case you still don't need to worry about how individual bytes are stored while they're on the machine, since you will have to dig the bits out individually anyway. Fortunately, if you bitwise AND with 1, you get the LSB, regardless of storage order; bitwise AND with 2 and you get the next most significant bit, and so on. The compiler will sort out what constants to generate in the machine code, so that level of detail is abstracted away.
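For example, a minimal sketch of serializing a byte LSB-first with nothing but shifts and masks (the function name send_lsb_first and printing the bits as characters are just for illustration):

#include <stdio.h>

/* Emit the bits of a byte least-significant-bit first, which is what a
   protocol defined as "LSB first" usually asks for on the wire. The
   machine's internal bit storage never enters into it. */
static void send_lsb_first(unsigned char b)
{
    for (int i = 0; i < 8; i++)
        putchar(((b >> i) & 1) ? '1' : '0');   /* bit i, LSB first */
    putchar('\n');
}

int main(void)
{
    unsigned char b = 1;
    send_lsb_first(b);      /* prints 10000000: the LSB goes out first */
    return 0;
}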
There is no such thing in C/C++. The least significant bit is -- well -- the least significant bit. Since the bits don't have addresses, there is no other ordering.