Identifying polynomial terms of the CRC

I was looking at this page and saw this polynomial listed with the following terms:
0xad0424f3 = x^32 + x^30 + x^28 + x^27 + x^25 + x^19 + x^14 + x^11 + x^8 + x^7 + x^6 + x^5 + x^2 + x + 1
That does not seem correct, since converting the hex value to binary gives:
0xad0424f3 = 10101101000001000010010011110011
which would make the polynomial:
x^31 + x^29 + x^27 + x^26 + x^24 + x^18 + x^13 + x^10 + x^7 + x^6 + x^5 + x^4 + x^1 + x^0
Can you help me understand which one is correct?
What about the 64-bit ECMA polynomial,
0xC96C5795D7870F42?
I want to know the number of terms in each polynomial, 0xad0424f3 and 0xC96C5795D7870F42.

That page is on Koopman's web site, where he uses his own notation for CRC polynomials. Since all CRC polynomials have a 1 term, he drops that term, divides the polynomial by x, and represents the result in binary. That's what you're looking at, so the expansion shown on that page, with its 15 terms, is the correct one.
The benefit is that with a 64-bit word, you can then represent all 64-bit and shorter CRC polynomials, with the length of the CRC denoted by the most significant 1 in the word.
The downside is that only Koopman uses that notation, as far as I know, resulting in some confusion by others. Like yourself.
As for your 64-bit CRC, the polynomial you cite comes from the Wikipedia page. It is actually the reversed representation, and is not in Koopman's notation. The expansion into a polynomial is shown right there, underneath the hex representation. It has 34 terms.
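To check the term counts yourself, here is a minimal sketch. The function names koopman_exponents and reversed64_term_count are made up for illustration. It assumes Koopman's notation as described above (bit i stands for x^(i+1), the +1 term is implicit) and the usual reversed notation for a 64-bit CRC (the word holds x^0 through x^63, with x^64 implicit).

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>
#include <vector>

// Exponents (highest first) of a CRC polynomial given in Koopman's notation.
// Bit i of the Koopman word stands for x^(i+1); the "+1" term is implicit.
std::vector<int> koopman_exponents(std::uint64_t koopman) {
    assert(koopman != 0);  // a CRC polynomial has at least one term besides +1
    std::vector<int> exponents;
    for (int bit = 63; bit >= 0; --bit) {
        if ((koopman >> bit) & 1u) {
            exponents.push_back(bit + 1);
        }
    }
    exponents.push_back(0);  // restore the dropped "+1" term
    return exponents;
}

// Number of terms of a 64-bit CRC polynomial given in reversed notation.
// The word holds x^0 (top bit) through x^63 (bottom bit); x^64 is implicit.
int reversed64_term_count(std::uint64_t reversed) {
    return static_cast<int>(std::bitset<64>(reversed).count()) + 1;
}
```

Calling koopman_exponents(0xad0424f3) gives a list that starts at 32 and ends at 0, and reversed64_term_count(0xC96C5795D7870F42) counts the set bits plus the implicit x^64 term.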

Related

CRC checksum calculation algorithm

Can anyone with good knowledge of CRC calculation verify that this code
https://github.com/psvanstrom/esphome-p1reader/blob/main/p1reader.h#L120
is actually calculating crc according to this description?
CRC is a CRC16 value calculated over the preceding characters in the data message (from “/” to “!” using the polynomial: x^16 + x^15 + x^2 + 1). CRC16 uses no XOR in, no XOR out and is computed with least significant bit first. The value is represented as 4 hexadecimal characters (MSB first).
There's nothing in the linked code about where it starts and ends, and how the result is eventually represented, but yes, that code implements that specification.
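For comparison, here is a minimal bit-at-a-time sketch of that specification. It is not the linked code. The function name crc16_lsb_first is made up, and 0xA001 is the polynomial x^16 + x^15 + x^2 + 1 (0x8005) with its bits reversed, which is what an LSB-first implementation shifts against. With a zero initial value and no final XOR, this is the variant commonly catalogued as CRC-16/ARC.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// CRC-16, polynomial x^16 + x^15 + x^2 + 1, least significant bit first,
// initial value 0 (no XOR in) and no XOR out.
std::uint16_t crc16_lsb_first(const unsigned char* data, std::size_t len) {
    assert(data != nullptr || len == 0);
    std::uint16_t crc = 0x0000;
    for (std::size_t i = 0; i < len; ++i) {
        crc ^= data[i];  // bring the next byte into the low bits
        for (int k = 0; k < 8; ++k) {
            // Shift right; if a 1 fell off, subtract (XOR) the reversed polynomial.
            if (crc & 1u) {
                crc = static_cast<std::uint16_t>((crc >> 1) ^ 0xA001u);
            } else {
                crc = static_cast<std::uint16_t>(crc >> 1);
            }
        }
    }
    return crc;
}
```

The caller still has to feed it exactly the characters from "/" through "!" and print the result as 4 hexadecimal characters, which is the part the specification adds on top.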

Software implementation of floating point division, issues with rounding

As a learning project I am implementing floating-point operations (add, sub, mul, div) in software using C++. The goal is to become more comfortable with the underlying details of floating-point behavior.
I am trying to match my processor's results to the exact bit, meaning the IEEE 754 standard. So far it has been working great: add, sub and mul behave perfectly. I tested them on around 110 million random operations and got exactly the same results as the processor produces in hardware (although I did not take into account edge cases, overflow, etc.).
After that, I moved on to the last operation, division. It works fine and achieves the wanted result, but from time to time I get the last mantissa bit wrong: it is not rounded up. I am having a hard time understanding why.
The main reference I have been using is the great talk from John Farrier (the time stamp is at the point where it shows how to round):
https://youtu.be/k12BJGSc2Nc?t=1153
That rounding has been working really well for all the other operations but is giving me trouble for division.
Let me give you a specific example.
I am trying to divide 645.68011474609375 by 493.20962524414063
The final result I get is :
mine : 0-01111111-01001111001000111100000
c++_ : 0-01111111-01001111001000111100001
As you can see everything matches except for the last bit. The way I am computing the division is based on this video:
https://www.youtube.com/watch?v=fi8A4zz1d-s
Following this, I compute 28 bits of accuracy: 24 bits of mantissa (the hidden one + 23 mantissa bits), the 3 bits for guard, round and sticky, plus an extra one for the possible shift.
Using the algorithm from the video, I can get a normalization shift of at most 1. That's why I have an extra bit at the end: if it gets shifted in during normalization, it will be available for the rounding. Now here is the result I get from the division algorithm:
010100111100100011110000 0100
------------------------ ----
^                        grs^
|__ to be normalized        |____ extra bit
As you can see I get a 0 in the 24th position, so I need to shift left by one to get the correct normalization.
This means I will get:
101001111001000111100000 100
Based on John Farrier's video, when the grs bits are 100, I only round up if the LSB of the mantissa is a 1. In my case it is a zero, and that is why I do not round up my result.
The reason I am a bit lost is that I am sure my algorithm is computing the right mantissa; I have double-checked it with online calculators, and the rounding strategy works for all the other operations. Also, computing it this way triggers the normalization, which in the end yields the correct exponent.
Am I missing something? A small detail somewhere?
One thing that strikes me as odd is the sticky bit. In addition and multiplication you get varying amounts of shifting, which gives the sticky bit a higher chance of being set. Here I shift by one at most, which makes the sticky bit not really sticky.
I do hope I gave enough details to make my problem understood. At the bottom of the following gist you can find my division implementation. It is a bit cluttered with prints I am using for debugging, but it should give an idea of what I am doing. The code starts at line 374:
https://gist.github.com/giordi91/1388504fadcf94b3f6f42103dfd1f938
PS: meanwhile I am going through "What Every Computer Scientist Should Know About Floating-Point Arithmetic" to see if I missed something.
The result you get from the division algorithm is inadequate. You show:
010100111100100011110000 0100
------------------------ ----
^                        grs^
|__ to be normalized        |____ extra bit
The mathematically exact quotient continues:
010100111100100011110000 0100 110000111100100100011110…
Thus, the residue at the point where you are rounding exceeds ½ ULP, so the result should be rounded up. I did not study your code in detail, but it looks like you may have just calculated an extra bit or two of the significand[1]. You actually need to know whether the residue is non-zero, not just whether its next bit or two is zero. The final sticky bit should be one if any of the bits at or beyond that position in the exact mathematical result would be non-zero.
Footnote
[1] “Significand” is the preferred term. “Mantissa” is a legacy term for the fraction portion of a logarithm. The significand of a floating-point value is linear. A mantissa is logarithmic.
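To illustrate the point about the sticky bit, here is a minimal sketch that takes the sticky bit from the remainder of an integer division instead of from extra quotient bits. It is not the asker's algorithm. The name soft_div is made up, and the sketch assumes positive, normal binary32 inputs with a result that neither overflows nor underflows. It keeps only one guard bit because the remainder already covers everything below it.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Divide two positive, normal binary32 values with round-to-nearest-even.
// Zero, subnormals, infinities, NaN, negative values, overflow and underflow
// are deliberately not handled in this sketch.
float soft_div(float a, float b) {
    std::uint32_t ua, ub;
    std::memcpy(&ua, &a, sizeof ua);
    std::memcpy(&ub, &b, sizeof ub);

    // With the sign bit required to be 0, the top 9 bits are the biased exponent.
    int ea = static_cast<int>(ua >> 23);
    int eb = static_cast<int>(ub >> 23);
    assert(ea > 0 && ea < 255 && eb > 0 && eb < 255);

    // 24-bit significands with the hidden one restored.
    std::uint64_t ma = (ua & 0x7FFFFFu) | 0x800000u;
    std::uint64_t mb = (ub & 0x7FFFFFu) | 0x800000u;
    int e = ea - eb + 127;

    // Pre-normalize so that the quotient ma/mb lies in [1, 2).
    if (ma < mb) {
        ma <<= 1;
        --e;
    }

    // 25 quotient bits: 24 significand bits plus one guard bit.
    std::uint64_t num = ma << 24;
    std::uint64_t q = num / mb;
    bool sticky = (num % mb) != 0;  // any non-zero residue at all sets sticky

    std::uint64_t sig = q >> 1;
    bool guard = (q & 1u) != 0;
    if (guard && (sticky || (sig & 1u))) {
        ++sig;  // above half an ULP, or exactly half with an odd LSB
    }
    if (sig == (1u << 24)) {  // rounding carried out of the significand
        sig >>= 1;
        ++e;
    }
    assert(e > 0 && e < 255);

    std::uint32_t bits = (static_cast<std::uint32_t>(e) << 23) |
                         (static_cast<std::uint32_t>(sig) & 0x7FFFFFu);
    float result;
    std::memcpy(&result, &bits, sizeof result);
    return result;
}
```

For the example in the question, the guard bit is 1 and the remainder is non-zero, so this rounds up. That should reproduce the hardware result on a platform that evaluates float division in IEEE 754 binary32 with the default rounding mode.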

why using a modulo-2 arithmetic in crc?

I learned about an error detection technique called CRC. CRC calculations are done in modulo-2 arithmetic, without carries in addition or borrows in subtraction. I wonder why CRC uses modulo-2 arithmetic rather than regular binary arithmetic. Is it easier to implement in a digital circuit?
Maybe better late than never for an answer. A CRC treats the data as a string of 1-bit coefficients of a polynomial, where the coefficients are numbers modulo 2. From a math perspective, for an n-bit CRC, the data polynomial is multiplied by x^n, effectively appending n zero-bit coefficients to the data. That data + zeroes is then divided by an (n+1)-bit CRC polynomial, resulting in an n-bit remainder, which is the CRC. When "encoding" the data with a CRC, the remainder is "subtracted" from the data + zeroes, but for single-bit coefficients, adding and subtracting are both XOR, so the CRC is just appended to the data, replacing the zeroes that were appended to generate the CRC.
There are no carries or borrows across coefficients because it's polynomial math.
Although not used for CRCs, Reed-Solomon codes are somewhat similar, but their polynomial coefficients may be numbers modulo some prime other than 2, such as 929. For primes other than 2, addition and subtraction are no longer the same operation, so it's important to keep track of which one is used.
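The division described above can be sketched as plain long division where every "subtract" is an XOR. This is only an illustration: the name mod2_remainder and its parameters are made up, and real CRC code usually works a byte or a table lookup at a time instead.

```cpp
#include <cassert>
#include <cstdint>

// Remainder of (message * x^n) divided by poly, with coefficients modulo 2.
// message holds message_bits coefficients (most significant bit = highest power),
// poly holds the n+1 coefficients of the CRC polynomial.
std::uint32_t mod2_remainder(std::uint32_t message, int message_bits,
                             std::uint32_t poly, int n) {
    assert(n >= 1 && n < 32 && message_bits >= 1 && message_bits <= 32);
    assert((poly >> n) == 1u);  // the x^n coefficient must be the top set bit

    // Multiply by x^n: append n zero coefficients.
    std::uint64_t work = static_cast<std::uint64_t>(message) << n;

    // Long division: wherever a 1 leads, subtract the aligned polynomial.
    // Subtraction modulo 2 is XOR, so nothing is ever borrowed from a neighbor.
    for (int bit = message_bits + n - 1; bit >= n; --bit) {
        if ((work >> bit) & 1u) {
            work ^= static_cast<std::uint64_t>(poly) << (bit - n);
        }
    }
    return static_cast<std::uint32_t>(work);  // only the low n bits can be set
}
```

For example, the 14-bit message 11010011101100 (0x34EC) with the polynomial x^3 + x + 1 (1011) leaves the remainder 100. Appending those three bits to the message gives a codeword whose own remainder is zero.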
https://en.wikipedia.org/wiki/Reed%E2%80%93Solomon_error_correction

Why is the CRC32 generating polynomial 33 bits long?

First off, if there's a better site to ask this question then please do migrate this or close it and let me know where to go.
Secondly, we're discussing CRC in one of my classes, and neither we nor the professor understand why CRC polynomials are one bit longer than the name (or resulting checksum) suggests. I've done some searching, but nothing seems to discuss why it's one bit longer.
A CRC is the remainder after dividing the message by the polynomial. By definition, the remainder has to be shorter than the polynomial, since its degree is lower than the divisor's. Hence the CRC for a "33-bit" polynomial is 32 bits.
Note that the largest exponent of a "33-bit" polynomial is 32 (the lowest term has exponent zero), so the degree of the polynomial, as well as the length of the CRC, is 32.
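A tiny sketch of the counting, using made-up helper names bit_length and crc_width. The constant 0x104C11DB7 used in the example is the common CRC-32 polynomial written out with its x^32 term, which is usually left implicit.

```cpp
#include <cassert>
#include <cstdint>

// Number of bits needed to write value in binary (0 for value == 0).
int bit_length(std::uint64_t value) {
    int bits = 0;
    while (value != 0) {
        ++bits;
        value >>= 1;
    }
    return bits;
}

// Degree of a polynomial written with all its coefficients, which is also
// the width of the CRC (the remainder) it produces.
int crc_width(std::uint64_t full_poly) {
    assert(full_poly > 1);
    return bit_length(full_poly) - 1;
}
```

Here bit_length(0x104C11DB7) is 33 while crc_width(0x104C11DB7) is 32, and the largest possible remainder, 0xFFFFFFFF, needs exactly 32 bits.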

Conversion Big Integer <-> double in C++

I am writing my own long arithmetic library in C++ for fun and it is already pretty much finished. I have even implemented several cryptographic algorithms with it, but one important thing is still missing: I want to convert doubles (and floats/long doubles) into my number type and vice versa. My numbers are represented as a variable-sized array of unsigned long ints plus a sign bit.
I tried to find the answer with google, but the problem is that people rarely ever implement such things themselves, so I only find things about how to use Java BigInteger etc.
Conceptually, it is rather easy: I take the mantissa, shift it by the number of bits dictated by the exponent and set the sign. In the other direction I truncate it so that it fits into the mantissa and set the exponent depending on my log2 function.
But I am having a hard time figuring out the details. I could either play around with some bit patterns and cast the result to a double, but I didn't find an elegant way to achieve that, or I could "calculate" it by starting with 2, exponentiating, multiplying, etc., but that doesn't seem very efficient.
I would appreciate a solution that doesn't use any library calls because I am trying to avoid libraries for my project, otherwise I could just have used gmp, furthermore, I often have two solutions on several other occasions, one using inline assembler which is efficient and one that is more platform independent, so either answer is useful for me.
edit: I use uint64_t for my parts, but I would like to be able to change it depending on the machine, but I am willing to do some different implementations with some #ifdefs to achieve that.
I'm going to make a non-portable assumption here: namely, that unsigned long long has more accurate digits than double. (This is true on all modern desktop systems that I know of.)
First, convert the most significant integer(s) into an unsigned long long. Then convert that to a double S. Let M be the number of remaining integers not used in that first step. Multiply S by (1ull << (sizeof(unsigned)*CHAR_BIT*M)). (If shifting by more than 63 bits, you will have to split that into separate shifts and do some arithmetic.) Finally, if the original number was negative, multiply the result by -1.
This rounds a lot, but even with this rounding, due to the above assumption, no digits are lost that wouldn't be lost anyway with the conversion to a double. I think this is a similar process to what Mark Ransom said, but I'm not certain.
For converting from a double to a biginteger, first separate the mantissa into a double M and the exponent into an int E, using frexp. Multiply M by UINT_MAX, and store that result in an unsigned R. If std::numeric_limits<double>::radix is 2 (I don't know if it is or not for x86/x64), you can easily shift R left by E-(sizeof(unsigned)*CHAR_BIT) bits and you're done. Otherwise the result will instead be R*(E**(sizeof(unsigned)*CHAR_BIT)) (where ** means to the power of).
If performance is a concern, you can add an overload to your bignum class for multiplying by std::integral_constant<unsigned, 10>, which simply returns (LHS<<3)+(LHS<<1). You can similarly optimize other constants if you wish.
This blog post might help you: "Clarifying and optimizing Integer>>asFloat".
Otherwise, you can get an idea of the algorithm from this SO question: "Converting from unsigned long long to float with round to nearest even".
You don't say explicitly, but I assume your library is integer only and the unsigned longs are 32 bit and binary (not decimal). The conversion to double is simple, so I'll tackle that first.
Start with a multiplier for the current piece; if the number is positive it will be 1.0, if negative it will be -1.0. For each of the unsigned long ints in your bignum, multiply by the current multiplier and add it to the result, then multiply your multiplier by pow(2.0, 32) (4294967296.0) for 32 bits or pow(2.0, 64) (18446744073709551616.0) for 64 bits.
You can optimize this process by working with only the 2 most significant values. You need to use 2 even if the number of bits in your integer type is larger than the precision of a double, since the number of used bits in the most significant value might only be 1. You can generate the multiplier by taking a power of 2 to the number of skipped bits, e.g. pow(2.0, most_significant_count*sizeof(bit_array[0])*8). You can't use a bit shift as given in another answer because it will overflow after the first value.
To convert from double, you can get the exponent and mantissa separated from each other with the frexp function. The mantissa will come as a floating point value between 0.5 and 1.0 so you'll want to multiply it by pow(2.0, 32) or pow(2.0, 64) to convert it to an integer, then adjust the exponent by -32 or -64 to compensate.
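Both directions can be sketched roughly like this. It is a sketch under assumptions, not a drop-in for your class: Limbs (little-endian 32-bit pieces), limbs_to_double and double_to_limbs are made-up names, the sign is handled by the caller in the second function, the fractional part is truncated, and the first function is not guaranteed to be correctly rounded for values wider than a double's significand.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

using Limbs = std::vector<std::uint32_t>;  // hypothetical layout: least significant limb first

// Big integer -> double: sum limb * multiplier, multiplier growing by 2^32 per limb.
// Assumes the value fits in the range of double.
double limbs_to_double(const Limbs& limbs, bool negative) {
    double result = 0.0;
    double multiplier = negative ? -1.0 : 1.0;
    for (std::uint32_t limb : limbs) {
        result += static_cast<double>(limb) * multiplier;
        multiplier *= 4294967296.0;  // pow(2.0, 32)
    }
    return result;
}

// Non-negative finite double -> big integer magnitude, fraction truncated.
Limbs double_to_limbs(double value) {
    static_assert(std::numeric_limits<double>::radix == 2 &&
                  std::numeric_limits<double>::digits <= 64,
                  "this sketch expects a binary double");
    assert(std::isfinite(value) && value >= 0.0);
    const int digits = std::numeric_limits<double>::digits;  // 53 for binary64

    Limbs limbs;
    value = std::trunc(value);
    if (value == 0.0) return limbs;  // zero is the empty limb list here

    int exponent = 0;
    double fraction = std::frexp(value, &exponent);  // value = fraction * 2^exponent
    // Scale the fraction in [0.5, 1) up to an exact integer significand.
    std::uint64_t sig = static_cast<std::uint64_t>(std::ldexp(fraction, digits));
    int shift = exponent - digits;  // value = sig * 2^shift

    if (shift < 0) {
        sig >>= -shift;  // drops only zero bits, because value is an integer
        limbs.push_back(static_cast<std::uint32_t>(sig));
        limbs.push_back(static_cast<std::uint32_t>(sig >> 32));
    } else {
        int r = shift % 32;
        limbs.assign(static_cast<std::size_t>(shift / 32), 0u);  // whole zero limbs
        std::uint64_t low = sig << r;
        std::uint64_t high = r ? (sig >> (64 - r)) : 0u;  // bits pushed past 64
        limbs.push_back(static_cast<std::uint32_t>(low));
        limbs.push_back(static_cast<std::uint32_t>(low >> 32));
        limbs.push_back(static_cast<std::uint32_t>(high));
    }
    while (!limbs.empty() && limbs.back() == 0u) limbs.pop_back();  // trim top zeros
    return limbs;
}
```

Scaling by the full significand width instead of by 2^32 keeps every bit of the double, at the cost of the small shift bookkeeping at the end. With 64-bit limbs the same idea applies with 2^64 as the per-limb multiplier.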
To go from a big integer to a double, just do it the same way you parse numbers. For example, you parse the number "531" as "1 + (3 * 10) + (5 * 100)". Compute each portion using doubles, starting with the least significant portion.
To go from a double to a big integer, do it the same way but in reverse starting with the most significant portion. So, to convert 531, you first see that it's more than 100 but less than 1000. You find the first digit by dividing by 100. Then you subtract to get the remainder of 31. Then find the next digit by dividing by 10. And so on.
Of course, you won't be using tens (unless you store your big integers as digits). Exactly how you break it apart depends on how your big integer class is constructed. For example, if it uses 64-bit units, then you'll use powers of 2^64 instead of powers of 10.