Bitwise calculation of an LFSR sequence using CRC-style notation - bit-manipulation

My question stems from the observation that a Linear Feedback Shift Register can be used to perform a CRC check. Algebraically, this normally takes the form:
S(x) = M(x) * x^k % G(x) (the remainder, for a G(x) of degree k)
The implementation of this is shown in this question (with all registers initialised to zero), and the mathematical bitwise calculation of the XOR division is shown in this question here.
I understand both of these. However, I also know that another common way of using an LFSR is to have no input, but instead preload the registers with non-zero values and run it (with zero as the input) to form a sequence of pseudo-random numbers. This is shown in the image below.
My question is, just as the CRC can be represented as a modulo-2 division both bitwise and algebraically, can we do the same for an LFSR sequence generator, given the generator polynomial and initial values? And if so, an example would be great!
Thanks very much, feel free to correct me if I've misrepresented or misunderstood a concept!
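One possible way to see the connection, sketched in Python under the assumption that the register is a Galois (internal-XOR) LFSR: with zero input, each clock multiplies the state polynomial by x and reduces it mod G(x), so the state after t clocks is S0(x) * x^t % G(x), where S0(x) is the preloaded value. The polynomial G(x) = x^4 + x + 1, the seed, and the helper names polymod and lfsr_step are illustrative choices, not taken from the question.

```python
def polymod(a, g):
    """Remainder of a(x) / g(x) over GF(2); ints hold the coefficient bits."""
    k = g.bit_length() - 1                    # degree of g
    while a.bit_length() > k:                 # while deg(a) >= deg(g)
        a ^= g << (a.bit_length() - 1 - k)    # XOR-subtract g aligned to a's top bit
    return a


def lfsr_step(state, g):
    """One clock of a Galois LFSR with zero input: state <- state * x mod g."""
    k = g.bit_length() - 1
    state <<= 1                               # multiply by x
    if (state >> k) & 1:                      # a bit fell off the top of the register
        state ^= g                            # feed it back through the taps
    return state


G = 0b10011      # x^4 + x + 1 (assumed example polynomial)
seed = 0b0001    # assumed non-zero preload value

state = seed
states = []
for t in range(1, 16):
    state = lfsr_step(state, G)
    states.append(state)
    # The shift-register state matches the algebraic form S0(x) * x^t mod G(x).
    assert state == polymod(seed << t, G)

print(states)
```

For this particular polynomial the register visits all 15 non-zero states before returning to the seed. A Fibonacci (external-XOR) register produces the same output bit sequence up to a change of state representation, but its register contents do not map onto this formula as directly.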

Related

Identifying polynomial terms of the CRC

I was looking at this page and I saw the terms of this polynomial:
0xad0424f3 = x^32 +x^30 +x^28 +x^27 +x^25 +x^19 +x^14 +x^11 +x^8 +x^7 +x^6 +x^5 +x^2 +x +1
This does not seem correct, since converting the hex to binary:
0xad0424f3 is 10101101000001000010010011110011
It would become:
x^31+ x^29+ x^27+ x^26+ x^24+ x^18+ x^13+ x^10+ x^7+ x^6+ x^5+ x^4+ x^1+ x^0
Can you help me understand which one is correct?
What about the 64-bit ECMA polynomial,
0xC96C5795D7870F42
I want to know the number of terms in each of the polynomials 0xad0424f3 and 0xC96C5795D7870F42.
That page is on Koopman's web site, where he has his own notation for CRC polynomials. Since all CRC polynomials have a 1 term, he drops that term, divides the polynomial by x, and represents that in binary. That's what you're looking at.
The benefit is that with a 64-bit word, you can then represent all 64-bit and shorter CRC polynomials, with the length of the CRC denoted by the most significant 1 in the word.
The downside is that only Koopman uses that notation, as far as I know, resulting in some confusion for others. Like yourself.
As for your 64-bit CRC, the polynomial you cite is from the Wikipedia page. It is actually the reversed version, and is not in Koopman's notation. The expansion into a polynomial is shown right there, underneath the hex representation. It has 34 terms.
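The two conversions described above can be checked with a short Python sketch. The helper names koopman_terms and reversed_terms are made up for this illustration. The first undoes Koopman's notation (shift left by one, restore the +1 term). The second assumes the usual "reversed" convention, where the x^0 coefficient sits in the top bit and the x^64 term is implicit.

```python
def koopman_terms(value):
    """Exponents of the polynomial encoded in Koopman's notation.

    Koopman drops the +1 term and divides by x, so undo that:
    shift left by one bit and set bit 0 again.
    """
    full = (value << 1) | 1
    return [i for i in range(full.bit_length() - 1, -1, -1) if (full >> i) & 1]


def reversed_terms(value, width):
    """Exponents of a polynomial given in bit-reversed form.

    Reverse the bits within `width`, then add the implicit x^width term.
    """
    normal = int(format(value, f"0{width}b")[::-1], 2)
    full = (1 << width) | normal
    return [i for i in range(width, -1, -1) if (full >> i) & 1]


crc32_terms = koopman_terms(0xAD0424F3)
crc64_terms = reversed_terms(0xC96C5795D7870F42, 64)

print(len(crc32_terms), crc32_terms)   # 15 terms, from x^32 down to 1
print(len(crc64_terms))                # 34 terms
```

The first list reproduces the expansion shown on Koopman's page, and the second count agrees with the 34 terms mentioned above.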

Why CRC seed is called polynomial?

Why is the CRC seed called a polynomial? Is there any significance to the CRC algorithm in calling it a polynomial? Can't we just say it is an n-bit random binary number?
Think of the bits in a CRC of N bits as the coefficients of a polynomial of degree N-1. So if we had a CRC of 1101, that would be x^3 + x^2 + 1. Usually they are much larger. And when working with message digests and similar algorithms, the texts to which they are applied are also considered polynomials of extremely high degree. It is simply a way of looking at them that lends itself to mathematical analysis.
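As a small illustration of that mapping, here is a Python sketch that prints the polynomial for a bit pattern. The function name poly_str is hypothetical.

```python
def poly_str(bits):
    """Render an integer's bits as a polynomial over GF(2), highest power first."""
    terms = []
    for i in range(bits.bit_length() - 1, -1, -1):
        if (bits >> i) & 1:                     # coefficient of x^i is 1
            terms.append("1" if i == 0 else "x" if i == 1 else f"x^{i}")
    return " + ".join(terms) or "0"


print(poly_str(0b1101))   # x^3 + x^2 + 1
```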

crc32 hash default/invalid value?

I am building a simple string ID system using crc32 to generate 32 bit integer handles from my strings. I'd like to default the hash inside my StringID wrapper class to an invalid index by default, is there a value that crc32 will never generate? Will I have to use a separate flag?
Clarification: I am not interested in language specific answers. I'd simply like to know if there is an integer outside of the crc32 range that can be used to represent an unhashed value.
Thanks!
Is there a value that crc32 will never generate?
No, it will generate any/all values in the range of a 32-bit integer.
Will I have to use a separate flag?
Not necessarily.
If you decide that (e.g.) 0x00000000 means "CRC not set" and non-zero is the CRC value; then after calculating the CRC (but before storing it or checking the stored value) you can do if(CRCvalue == 0) CRCvalue = 0xFFFFFFFF;.
This weakens the CRC by an extremely tiny amount. Specifically, for 2 random pieces of data, with pure CRC32 there's 1 chance in 4294967296 of the CRCs matching, and with "zero means unset" there's about 1 chance in 4294967294 (the chance of a match rises from 2^32/2^64 to (2^32 + 2)/2^64, because a 0 and a 0xFFFFFFFF now also match, in either order).
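The remapping idea can be sketched in a few lines. Python's zlib.crc32 is used here only as a convenient CRC-32 implementation. The names UNSET and string_id, and the choice of UTF-8 encoding, are assumptions made for this example.

```python
import zlib

UNSET = 0x00000000   # sentinel meaning "no CRC stored" (assumed convention)


def string_id(text):
    """CRC-32 of the string, with a genuine 0 remapped so it never equals UNSET."""
    crc = zlib.crc32(text.encode("utf-8"))
    return 0xFFFFFFFF if crc == UNSET else crc


print(hex(string_id("123456789")))   # 0xcbf43926, the standard CRC-32 check value
print(hex(string_id("")))            # CRC-32 of empty input is 0, so this is remapped
```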
There is an easy demonstration of the fact that any crc32 value can be generated. The CRC is the remainder of division mod P (where P is the generator polynomial) over a Galois field (which is a field, just as the real or complex numbers are). You can subtract from your polynomial its remainder (subtraction is an XOR operation here, so adding and subtracting are the same thing), giving a multiple of P with remainder 0. You can then add to this multiple any of the possible crc32 values (as they are already remainders of divisions, their remainder is just themselves) to get any of the 2^32 possible values.
It is common practice to append as many zero bits as necessary to complete a full 32-bit word (this amounts to multiplication by the constant x^32), and then subtract (XOR) the remainder from that, making the result a multiple of the modulus (remember that addition and subtraction are the same XOR operation), so that crc32(pol) = 0x00000000.
Edit (easier to see):
Indeed, each of the 2^32 possible crc32 values, when divided by the generator polynomial, gives itself as the remainder (each has a degree lower than that of the generator polynomial, just as the numbers 0 .. N-1 are their own residues in arithmetic modulo N on integers), so they are all possible results of the crc32() operator.
The crc operation, as implemented in many places, is not that simple: some implementations initialize the remainder register to 0xffffffff and look for 0xffffffff at termination (indeed, crc32 does this). If you do the maths, you can work out the reason: initializing the register to 0xffffffff is equivalent to having a previous remainder of 0xffffffff in a longer string, and looking for 0xffffffff at the end is like appending 0xffffffff to the original string. This has the effect of concatenating the bit string 0xffffffff before and after your string, making the remainder sensitive to strings of zeros added before or after the string over which the crc32 is calculated. In any case, this modification doesn't alter the underlying algorithm of calculating a polynomial remainder, so all 2^32 values are still possible in this case.
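Checking all 2^32 values of a CRC-32 takes a while, but the same argument can be tried exhaustively on a smaller CRC. The sketch below is an illustration, not crc32 itself: it uses an 8-bit CRC with the assumed polynomial x^8 + x^2 + x + 1 (0x07) and shows that the 256 one-byte messages produce all 256 CRC values, with or without 0xFF pre- and post-conditioning. The function crc8 and its parameters are written for this example.

```python
def crc8(data, poly=0x07, init=0x00, xorout=0x00):
    """Bitwise, MSB-first CRC-8; `poly` omits the implicit x^8 term."""
    reg = init
    for byte in data:
        reg ^= byte
        for _ in range(8):
            # Shift left; if a 1 falls off the top, XOR in the polynomial.
            reg = ((reg << 1) ^ poly) & 0xFF if reg & 0x80 else (reg << 1) & 0xFF
    return reg ^ xorout


# Every one-byte message, without and with 0xFF conditioning.
plain = {crc8(bytes([b])) for b in range(256)}
conditioned = {crc8(bytes([b]), init=0xFF, xorout=0xFF) for b in range(256)}

print(len(plain), len(conditioned))   # 256 256
```

Both sets contain every 8-bit value, so no value is left over to act as an "impossible" marker. The conditioning only permutes which message maps to which CRC.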
No. A CRC-32 can be any 32-bit value. You'll need to indicate an invalid index somewhere else.
My spoof code allows you to choose bit locations in a message to modify and the desired CRC, and will solve for which of those locations to flip to get exactly that CRC.

why using a modulo-2 arithmetic in crc?

I learned an error detection technique called CRC. CRC calculations are done in modulo-2 arithmetic, without carries in addition or borrows in subtraction. I wonder why CRC uses modulo-2 arithmetic rather than regular binary arithmetic. Is it easier to implement in a digital circuit?
Maybe better late than never for an answer. A CRC treats the data as a string of 1-bit coefficients of a polynomial, so the coefficients are numbers modulo 2. From a math perspective, for an n-bit CRC, the data polynomial is multiplied by x^n, effectively appending n zero-bit coefficients to the data. That data + zeroes is then divided by an (n+1)-bit CRC polynomial, resulting in an n-bit remainder, which is the CRC. If "encoding" the data with a CRC, the remainder is "subtracted" from the data + zeroes, but for single-bit coefficients, adding and subtracting are both XOR, so the CRC is just appended to the data, replacing the zeroes that were appended to the data to generate the CRC.
The reason for no carries or borrows across coefficients is because it's polynomial math.
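The steps described above can be followed in a short Python sketch. The 3-bit CRC polynomial x^3 + x + 1 (1011) and the 14-bit message are arbitrary small values chosen for illustration, and polymod is a helper written for this example.

```python
def polymod(a, g):
    """Remainder of a(x) / g(x) with modulo-2 coefficients (no carries or borrows)."""
    k = g.bit_length() - 1                    # degree of g
    while a.bit_length() > k:                 # while deg(a) >= deg(g)
        a ^= g << (a.bit_length() - 1 - k)    # "subtract" g: just an XOR
    return a


G = 0b1011               # x^3 + x + 1: an (n+1)-bit polynomial for an n = 3 bit CRC
M = 0b11010011101100     # data bits
n = G.bit_length() - 1

crc = polymod(M << n, G)         # multiply by x^n (append n zeroes), take the remainder
codeword = (M << n) ^ crc        # "subtracting" the remainder fills in the appended zeroes

print(format(crc, "03b"))        # 100
print(polymod(codeword, G))      # 0: the encoded data is a multiple of G
```

Because the low n bits of M << n are zero, the XOR that "subtracts" the remainder is the same as simply appending the CRC bits.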
Reed-Solomon codes are somewhat similar, but, unlike with a CRC, the polynomial coefficients may be numbers modulo some prime number other than 2, such as 929. For such primes, addition and subtraction are no longer the same operation, so it's important to keep track of which one is being used.
https://en.wikipedia.org/wiki/Reed%E2%80%93Solomon_error_correction

CRC Procedure - Checking Efficiently

Suppose we receive an m-bit message where the last n bits are the CRC bits. As far as I know, in order to check whether it was received correctly, we should XOR all m bits with the polynomial of the specific CRC algorithm. If the result is all zeros, we can say there are no errors.
Here are my questions:
1) What about calculating the n CRC bits from the first (m-n) bits and then comparing them to the last n bits of the received message? This way we can say there are no errors if the received and calculated n bits are equal. Is this approach valid?
2) If it is valid, which is more efficient?
Your description of how to check a CRC doesn't really parse. But anyway, yes, the way that a CRC check is normally done is to calculate the CRC of the pre-CRC bits, and then to compare that to the CRC sent. It is very marginally more efficient that way. More importantly, it is more easily verifiable to be correct, since that is the way the CRC is generated and appended on the other end.
That method also extends to any other style of check value, since other check values do not share the CRC's mathematical property of yielding zero when the check value itself is run through the algorithm after the data that precedes it. CRCs with pre- and post-conditioning, which is most of them, don't have that property either: you would need to undo the post-conditioning and then compare the result with the pre-condition value.
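Both ways of checking can be tried with Python's zlib.crc32, which uses pre- and post-conditioning. This is a sketch under assumptions: the 4-byte CRC is appended in little-endian order (which matches the bit ordering of this reflected CRC), and the function names frame, check_by_compare, and residue are invented for the example.

```python
import struct
import zlib


def frame(payload):
    """Append the CRC-32 of the payload as 4 little-endian bytes."""
    return payload + struct.pack("<I", zlib.crc32(payload))


def check_by_compare(message):
    """Recompute the CRC over the pre-CRC bytes and compare it with the one sent."""
    payload, sent = message[:-4], struct.unpack("<I", message[-4:])[0]
    return zlib.crc32(payload) == sent


def residue(message):
    """Run the whole message, CRC included, through the algorithm."""
    return zlib.crc32(message)


a = frame(b"hello")
b = frame(b"a different, longer payload")
corrupted = bytes([a[0] ^ 0x01]) + a[1:]      # flip one bit in the payload

print(check_by_compare(a), check_by_compare(corrupted))
print(residue(a) == residue(b), residue(a) != 0)
```

With conditioning, running the algorithm over data plus CRC does not give zero. It gives the same non-zero constant for every intact message, so that variant of the check compares against a constant rather than against zero.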