How does 0x04C11DB7L represent a polynomial?

I was looking at CRC, and stumbled upon: https://www.naaccr.org/wp-content/uploads/2017/12/C-API-Implementation-CRC32.c
I understand CRC needs to have a 'magic' polynomial to generate its lookup tables. I see that the parameter for 'Polynomial' is:
0x04C11DB7L
How does this value represent a polynomial?

To me, it is quite an elegant way to represent a polynomial. It works in the following way:
You have a hexadecimal number, e.g. 0x04C11DB7. It can be converted into the binary number: 0b100110000010001110110110111.
Let's consider each bit to be a coefficient of the corresponding power of x. The least significant bit corresponds to the 0th power (i.e. the term 1). In addition, the greatest power is omitted from the binary representation: in our case the hex number is 32 bits long, hence the 32nd power of x is implicit.
Now we can write down the polynomial that is represented via 0x04C11DB7:
x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10 + x^8 + x^7 + x^5 + x^4 + x^2 + x + 1
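If you want to check this mapping in code, a minimal C++ sketch (mine, just for illustration) can walk the bits of the constant and print the exponents whose coefficient is 1; the implicit x^32 term has to be added by hand, since it is not stored in the 32-bit value:

#include <cstdint>
#include <iostream>

int main() {
    const uint32_t poly = 0x04C11DB7u;   // CRC-32 polynomial, x^32 term implicit

    std::cout << "x^32";                 // the omitted highest-order term
    for (int bit = 31; bit >= 0; --bit) {
        if (poly & (1u << bit)) {
            std::cout << " + x^" << bit;
        }
    }
    std::cout << '\n';  // prints x^32 + x^26 + x^23 + ... + x^2 + x^1 + x^0
}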

The constant you mention appears in the source as part of this comment:
/* The following CRC lookup table was generated automagically */
/* by the Rocksoft^tm Model CRC Algorithm Table Generation */
/* Program V1.0 using the following model parameters: */
/* */
/* Width : 4 bytes. */
/* Poly : 0x04C11DB7L */
/* Reverse : TRUE. */
/* */
/* For more information on the Rocksoft^tm Model CRC Algorithm, */
/* see the document titled "A Painless Guide to CRC Error */
/* Detection Algorithms" by Ross Williams */
Plugging that paper name and author into a search engine turned up the author's "CRC Pitstop" website (gloriously untouched since 1996!) which hosts a copy of the paper in question.
It describes the "polynomial arithmetic" that CRC algorithms use, but then points out that you don't actually have to understand that part of the theory to understand or implement the arithmetic itself.
The key operation is actually just a special form of division, the free parameter which needs to be chosen is the divisor, and the checksum is the remainder after dividing the input by that parameter.
The use of "Poly" as the parameter name in the table generation algorithm turns out not to be a general abbreviation, but a deliberate choice of term to downplay the role of polynomial arithmetic:
To perform a CRC calculation, we need to choose a divisor. In maths marketing speak the divisor is called the "generator polynomial" or simply the "polynomial", and is a key parameter of any CRC algorithm. It would probably be more friendly to call the divisor something else, but the poly talk is so deeply ingrained in the field that it would now be confusing to avoid it. As a compromise, we will refer to the CRC polynomial as the "poly". Just think of this number as a sort of parrot. "Hello poly!"
So, far from being "a magic polynomial used to derive the lookup tables", the parameter is simply a number which you're going to use in a division operation. As the paper goes on to describe, the lookup tables are simply a way of optimising the long division - essentially, the values are various copies of the "poly" bit-shifted and XOR'd against one another to handle multiple bits of the input at once.
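To make that last point concrete, here is a rough sketch (not the code from the linked file, which uses the reflected variant of the algorithm) of how such a table can be generated for the non-reflected form: each entry is a register seeded with the index byte, then put through eight steps of modulo-2 long division by the poly, i.e. shifted and XOR'd:

#include <cstdint>
#include <cstdio>

int main() {
    const uint32_t poly = 0x04C11DB7u;
    uint32_t table[256];

    for (uint32_t i = 0; i < 256; ++i) {
        // Put the index byte in the top bits of the register, then do
        // 8 steps of modulo-2 long division by the poly.
        uint32_t r = i << 24;
        for (int k = 0; k < 8; ++k) {
            r = (r & 0x80000000u) ? (r << 1) ^ poly : (r << 1);
        }
        table[i] = r;
    }

    std::printf("table[1] = 0x%08X\n", (unsigned)table[1]);  // prints 0x04C11DB7, the poly itself
}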

Related

Identifying polynomial terms of the CRC

I was looking at this page and I saw the terms of this polynomial given as:
0xad0424f3 = x^32 + x^30 + x^28 + x^27 + x^25 + x^19 + x^14 + x^11 + x^8 + x^7 + x^6 + x^5 + x^2 + x + 1
which seems incorrect to me, since converting the hex:
0xad0424f3 is 10101101000001000010010011110011
It would become:
x^31 + x^29 + x^27 + x^26 + x^24 + x^18 + x^13 + x^10 + x^7 + x^6 + x^5 + x^4 + x^1 + x^0
Can you help me understand which one is correct?
What about the 64-bit ECMA polynomial, 0xC96C5795D7870F42?
I want to know the number of terms in each polynomial, 0xad0424f3 and 0xC96C5795D7870F42.
That page is on Koopman's web site, where he has his own notation for CRC polynomials. Since all CRC polynomials have a 1 term, he drops that term, divides the polynomial by x, and represents that in binary. That's what you're looking at.
The benefit is that with a 64-bit word, you can then represent all 64-bit and shorter CRC polynomials, with the length of the CRC denoted by the most significant 1 in the word.
The downside is that only Koopman uses that notation, as far as I know, resulting in some confusion for others, like yourself.
As for your 64-bit CRC, the polynomial you quote from the Wikipedia page is actually the reversed version, and is not in Koopman's notation. The expansion into a polynomial is shown right there, underneath the hex representation. It has 34 terms.
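If it helps, a small sketch (mine, not Koopman's) can expand a constant in his notation back into the usual exponent list by shifting every power up by one and re-attaching the implicit +1 term:

#include <cstdint>
#include <iostream>

int main() {
    const uint32_t koopman = 0xAD0424F3u;  // Koopman notation: +1 term dropped, polynomial divided by x

    for (int bit = 31; bit >= 0; --bit) {
        if (koopman & (1u << bit)) {
            std::cout << "x^" << (bit + 1) << " + ";  // shift every power up by one
        }
    }
    std::cout << "1\n";  // re-attach the implicit +1 term
    // Prints: x^32 + x^30 + x^28 + x^27 + x^25 + x^19 + x^14 + x^11 + x^8 + x^7 + x^6 + x^5 + x^2 + x^1 + 1
}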

Why is the CRC seed called a polynomial?

Why is the CRC seed called a polynomial? Is there any significance, related to the CRC algorithm, in calling it a polynomial? Can't we just call it an n-bit random binary number?
Think of the bits in a CRC of N bits as the coefficients of a polynomial of degree N-1. So if we had a CRC of 1101, that would be x^3 + x^2 + 1. Usually they are much larger. And when working with message digests and similar algorithms, the texts on which they are applied are also considered polynomials of extremely high degree. It is simply a way of looking at them that lends itself to mathematical analysis.

Why use modulo-2 arithmetic in CRC?

I learned about an error detection technique called CRC. CRC calculations are done in modulo-2 arithmetic, without carries in addition or borrows in subtraction. I wonder why CRC uses modulo-2 arithmetic rather than regular binary arithmetic. Is it easier to implement in a digital circuit?
Maybe better late than never for an answer. A CRC treats the data as a string of 1-bit coefficients of a polynomial, since the coefficients are numbers modulo 2. From a math perspective, for an n-bit CRC, the data polynomial is multiplied by x^n, effectively appending n zero coefficients to the data; that data-plus-zeroes is then divided by an (n+1)-bit CRC polynomial, and the n-bit remainder is the CRC. When "encoding" the data with a CRC, the remainder is "subtracted" from the data plus zeroes, but for single-bit coefficients adding and subtracting are both XOR, so the CRC is simply appended to the data, replacing the zeroes that were appended to generate it.
The reason there are no carries or borrows across coefficients is that it's polynomial math.
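For concreteness, here is a minimal bit-at-a-time sketch of that division for a 32-bit CRC, assuming the non-reflected form with a zero initial value and no final XOR (real-world CRC-32 variants add reflection and those constants):

#include <cstdint>
#include <cstddef>

// Modulo-2 long division of (data * x^32) by the CRC-32 polynomial.
// The returned remainder is the CRC.
uint32_t crc32_bitwise(const uint8_t* data, size_t len) {
    const uint32_t poly = 0x04C11DB7u;
    uint32_t rem = 0;
    for (size_t i = 0; i < len; ++i) {
        rem ^= (uint32_t)data[i] << 24;  // bring the next byte into the register
        for (int bit = 0; bit < 8; ++bit) {
            // If the top coefficient is 1, "subtract" (XOR) the shifted poly.
            rem = (rem & 0x80000000u) ? (rem << 1) ^ poly : (rem << 1);
        }
    }
    return rem;  // the n-bit remainder is the CRC
}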
Although this isn't done for CRCs, Reed-Solomon codes are somewhat similar, but the polynomial coefficients may be numbers modulo a prime other than 2, such as 929, and for primes other than 2 it is important to keep track of whether addition or subtraction is being used.
https://en.wikipedia.org/wiki/Reed%E2%80%93Solomon_error_correction

Bitwise calculation of an LFSR sequence using CRC-style notation

My question stems from the observation that we can use a Linear Feedback Shift Register to perform a CRC check. Algebraically this is normally of the form:
S(x) = M(x) * x^k % G(x) (this gives the remainder, for a G(x) of order k)
The implementation of this is shown in this question (with the registers all initialised to zero), and the mathematical bitwise calculation of the XOR division is shown in this other question.
I understand both of these - however, I also know that another common way of using an LFSR is to have no input, but instead preload the registers with non-zero values, and run (with zero as an input) to form a sequence of pseudo random numbers. This is shown in the image below
My question is, just as the CRC can be represented as a modulo-2 division both bitwise and algebraically, can we do the same for an LFSR sequence generator, given the generator polynomial and initial values? And if so, an example would be great!
Thanks very much, feel free to correct me if I've misrepresented or misunderstood a concept!
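Not a full answer, but to illustrate the bitwise view with a toy example: below is a sketch of a 4-bit Fibonacci LFSR for the generator polynomial x^4 + x + 1 (my own small example, not taken from the question), preloaded with a non-zero seed and clocked with no input:

#include <cstdint>
#include <iostream>

int main() {
    uint8_t reg = 0b1001;          // any non-zero 4-bit seed

    for (int i = 0; i < 15; ++i) { // a maximal 4-bit LFSR repeats after 2^4 - 1 = 15 steps
        uint8_t out = reg & 1u;                   // output bit
        uint8_t fb  = ((reg >> 3) ^ reg) & 1u;    // XOR of the tapped bits (bit 3 and bit 0)
        reg = (uint8_t)((reg >> 1) | (fb << 3));  // shift right and feed back into the top
        std::cout << int(out);
    }
    std::cout << '\n';             // one full period of the pseudo-random bit sequence
}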

How to fix the position of the binary point in an unsigned N-bit integer?

I am working on developing a fixed point algorithm in C++. I know that, for an N-bit integer, the fixed point binary integer is represented as U(a,b). For example, for an 8-bit integer (i.e. 256 possible values), if we represent it in the form U(6,2), it means that the binary point is to the left of the 2nd bit starting from the right, of the form:
b5 b4 b3 b2 b1 b0 . b(-1) b(-2)
Thus, it has 6 integer bits and 2 fractional bits. In C++, I know there are bit shift operators I can use, but they are basically used for shifting the bits of the input stream. My question is: how do I define a binary fixed point integer of the form fix<6,2> or U(6,2)? All the major processing operations will be carried out on the fractional part, and I am just looking for a way to do this fix in C++. Any help regarding this would be appreciated. Thanks!
Example: Suppose I have an input discrete signal with 1024 sample points on the x-axis (for now just assume this input signal is coming from some sensor). Each of these sample points has a particular amplitude. Say the sample at time 2 (x-axis) has an amplitude of 3.67 (y-axis). Now I have a variable "int *input;" that takes the sample 2, which in binary is 0000 0100. So basically I want to make this 00000.100 by performing the U(5,3) fix on sample 2 in C++, so that I can perform interpolation operations on fractions of the input sampling period or time.
PS - I don't want to create a separate class or use external libraries for this. I just want to take each 8 bits from my input signal, perform the U(a,b) fix on it, and then do the rest of the operations on the fractional part.
Short answer: left shift.
Long answer:
Fixed point numbers are stored as integers, usually int, which is the fastest integer type for a particular platform.
Normal integers without fractional bits are usually called Q0, Q.0 or QX.0, where X is the total number of bits of the underlying storage type (usually int).
To convert between different Q.X formats, left or right shift. For example, to convert 5 in Q0 to 5 in Q4, left shift it 4 bits, or multiply it by 16.
Usually it's useful to find or write a small fixed point library that does basic calculations, like a*b>>q and (a<<q)/b, because you will do Q.X = Q.Y * Q.Z and Q.X = Q.Y / Q.Z a lot, and you need to convert formats when doing calculations. As you may have observed, using the normal * operator gives you Q.(X+Y) = Q.X * Q.Y, so in order to fit the result into Q.Z format you need to right shift the result by (X+Y-Z) bits.
Division is similar: you get Q.(X-Y) = Q.X / Q.Y from the standard / operator, and to get the result in Q.Z format you shift the dividend before the division. What's different is that division is an expensive operation, and it's not trivial to write a fast one from scratch.
Be aware of the double-word support of your platform; it will make your life a lot easier. With double-word arithmetic, the result of a*b can be twice the size of a or b, so you don't lose range by doing a*b>>c. Without a double word, you have to limit the input range of a and b so that a*b doesn't overflow. This is not obvious when you first start, but soon you will find you need more fractional bits or range to get the job done, and you will finally need to dig into the reference manual of your processor's ISA.
example:
float a = 0.1f;                // 0.1
int aQ16 = a * 65536;          // 0.1 in Q16 format (truncates to 6553)
int bQ16 = 4 << 16;            // 4 in Q16 format
int cQ16 = aQ16 * bQ16 >> 16;  // result = 26212 = 0.399963378906250 in Q16,
                               // not 0.4 in Q16 (= 26214), because of the truncation error
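To illustrate the earlier point about a*b>>q and (a<<q)/b with a double-word intermediate, helpers along these lines might look like this (qmul16 and qdiv16 are names made up for this sketch, not from any particular library):

#include <cstdint>

// Multiply two Q16.16 values: the 64-bit intermediate keeps the full product
// before shifting back down to Q16.
int32_t qmul16(int32_t a, int32_t b) {
    return (int32_t)(((int64_t)a * b) >> 16);
}

// Divide two Q16.16 values: pre-shift the dividend so the quotient lands in Q16.
int32_t qdiv16(int32_t a, int32_t b) {
    return (int32_t)(((int64_t)a << 16) / b);
}

With these, the example above becomes cQ16 = qmul16(aQ16, bQ16), and the same truncation (26212 rather than 26214) still applies.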
If this is your question:
Q. Should I define my fixed-binary-point integer as a template, U<int a, int b>(int number), or not, U(int a, int b)?
I think your answer to that is: "Do you want to define operators that take two fixed-binary-point integers? If so make them a template."
The template is just a little extra complexity if you're not defining operators. So I'd leave it out.
But if you are defining operators, you don't want to be able to add U<4, 4> and U<6, 2>. What would you define your result as? The templates will give you a compile time error should you try to do that.
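For what it's worth, a bare-bones sketch of that idea (hypothetical, just to show where the compile-time check comes from) could look like:

#include <cstdint>

// Hypothetical fixed-point wrapper: A integer bits, B fractional bits.
template <int A, int B>
struct U {
    uint32_t raw;  // stored value, scaled by 2^B
};

// Addition is only defined for operands with the same format...
template <int A, int B>
U<A, B> operator+(U<A, B> x, U<A, B> y) {
    return U<A, B>{x.raw + y.raw};
}

int main() {
    U<6, 2> p{14};      // 14 / 2^2 = 3.5 in U(6,2)
    U<6, 2> q{5};       //  5 / 2^2 = 1.25 in U(6,2)
    U<6, 2> r = p + q;  // fine: 19 / 2^2 = 4.75
    // U<4, 4> s{0};
    // auto bad = p + s;  // ...so mixing U<6,2> and U<4,4> fails to compile
    (void)r;
}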