Could somebody help me with CRC division?

I am doing practice problems for midterms. The question is as follows:
Suppose we want to transmit a message 11001001 and protect it from error using the CRC polynomial x^3+1. Use polynomial long division to determine the message that should be transmitted (show all steps to get CRC bits and the complete message transmitted).
In the only solution I could find, the long division stops before the final zero. In the work I have done, there is an extra 1 in the quotient (pic attached). Why is the work in the online solution so different from mine?

Your solution is correct; the solution in the link is missing the 1 bit. In hex, the example is the carry-less (GF(2)) division of 0x648 by 0x9: quotient = 0xd3, remainder = 3, which is what you have in your question. (It was recommended to post this as an answer so others browsing the question would know it has an answer.)
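If it helps to see the arithmetic, here is a minimal sketch (my own illustration, not part of the linked solution) of the same division done bitwise in C++. The constants are the message 11001001 and the divisor 1001 (x^3 + 1) from the question; it prints the remainder 011 and the transmitted word 11001001011.

#include <cstdint>
#include <iostream>

// Compute the 3-bit CRC of the 8-bit message 11001001 for the polynomial
// x^3 + 1 (divisor bits 1001) by GF(2) long division: append three zeros,
// then XOR the divisor under every leading 1.
int main() {
    const uint32_t message  = 0b11001001;     // 8 data bits
    const uint32_t poly     = 0b1001;         // x^3 + 1
    const int      crc_bits = 3;              // degree of the polynomial

    uint32_t dividend = message << crc_bits;  // 11001001000
    for (int shift = 7; shift >= 0; --shift)
        if (dividend & (1u << (shift + crc_bits)))
            dividend ^= poly << shift;        // one step of the long division

    uint32_t crc = dividend;                  // remainder = 011 = 3
    uint32_t transmitted = (message << crc_bits) | crc;   // 11001001011
    std::cout << "CRC = " << crc << ", transmitted = " << transmitted << "\n";
}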

Related

Guidelines for choosing a polynomial in CRC for a given message

I'm trying to write code that performs a cyclic redundancy check for a given input message with a given polynomial.
From specific examples, I have realized that for some polynomials, even if the message arriving at the receiver is wrong, the CRC check reports no error.
What are general guidelines for choosing a polynomial that will detect errors, and what factors determine it (e.g. is it dependent on the message size, does it have anything to do with parity, do longer polynomials help catch more errors)?
For example, I am given the message 1101 on the receiver side with the polynomial 10, where the CRC is generated according to even parity.
First, I perform binary long division and get the remainder 0.
Then, I append it to the message and send it as 11010.
The problem is on the receiver side: even if the received message is wrong, the CRC will not detect the error, since any string ending in 0 is divisible by 10 regardless of the message bits. For example, 11110, 10000, etc. will go undetected.
If by "polynomial 10" you mean the polynomial x, then that is not a valid CRC polynomial. A CRC polynomial must always end with a 1. The one-bit CRC polynomial is x+1, or 11 in your notation. x gives you a zero-bit CRC!
As for guidelines on choosing a polynomial, look at Koopman's research and resulting good performance CRCs for various message lengths.
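To make the first point concrete, here is a small sketch (my own illustration, with an assumed 4-bit message length) comparing the one check bit produced by dividing by 10 against the one produced by dividing by 11 (x + 1): the former is always 0, the latter is the parity of the message.

#include <cstdint>
#include <iostream>

// One-bit "CRC": append a single check bit and divide in GF(2).
// With divisor 10 (the polynomial x) the remainder is always 0, so the
// check bit carries no information; with divisor 11 (x + 1) the remainder
// is the parity of the message bits.
static uint32_t check_bit(uint32_t msg, int msg_bits, uint32_t divisor) {
    uint32_t dividend = msg << 1;                    // room for one check bit
    for (int shift = msg_bits - 1; shift >= 0; --shift)
        if (dividend & (1u << (shift + 1)))
            dividend ^= divisor << shift;
    return dividend;                                 // 1-bit remainder
}

int main() {
    for (uint32_t msg : {0b1101u, 0b1111u, 0b1000u})
        std::cout << "message " << msg
                  << "  check with 10: " << check_bit(msg, 4, 0b10)
                  << "  check with 11: " << check_bit(msg, 4, 0b11) << "\n";
}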

How to correct a single bit error with CRC?

If we are sure that we are in "single error" mode, how can we correct that error using the CRC we got and the CRC we expected? I know how to detect errors, but how do I correct them?
Depending on the number of bits in a message (data + crc), and the CRC polynomial, a single bit error can be corrected. In order for this to work, every single bit error would have to produce a unique CRC. If there are any duplicates, it won't work, but a different CRC polynomial might solve the issue.
If the number of bits is not too large, a table can be used. Each entry in the table would contain a CRC and the bit index of the error. The table can be sorted by CRC so that a binary search can be used.
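Here is a sketch of that table approach (my own illustration, with assumed parameters: an 8-bit CRC, polynomial 0x07, over a 40-bit codeword of 32 data + 8 CRC bits). It builds the syndrome table, checks it for duplicates, and then corrects one injected bit error by lookup.

#include <cstdint>
#include <iostream>
#include <map>
#include <vector>

// crc8() returns the remainder of message(x)*x^8 divided by x^8+x^2+x+1,
// bit at a time, MSB first. A valid codeword (data followed by its CRC)
// therefore hashes to 0, and because the CRC is linear, a single-bit error
// produces a syndrome that depends only on the position of the flipped bit.
static uint8_t crc8(const std::vector<int>& bits) {
    uint8_t crc = 0;
    for (int b : bits) {
        crc ^= static_cast<uint8_t>(b << 7);
        crc = (crc & 0x80) ? static_cast<uint8_t>((crc << 1) ^ 0x07)
                           : static_cast<uint8_t>(crc << 1);
    }
    return crc;
}

int main() {
    const int total_bits = 40;                     // 32 data + 8 CRC

    // Table: syndrome of each possible single-bit error -> error position.
    std::map<uint8_t, int> table;
    bool unique = true;
    for (int pos = 0; pos < total_bits; ++pos) {
        std::vector<int> e(total_bits, 0);
        e[pos] = 1;                                // single-bit error pattern
        if (!table.emplace(crc8(e), pos).second) unique = false;
    }
    std::cout << (unique ? "all syndromes unique, correction possible\n"
                         : "duplicate syndromes, try another polynomial\n");

    // Demo: encode 32 data bits, flip one bit, correct it by table lookup.
    std::vector<int> code(32, 0);
    code[5] = code[17] = code[30] = 1;             // arbitrary data bits
    uint8_t crc = crc8(code);                      // CRC of the data bits
    for (int i = 7; i >= 0; --i) code.push_back((crc >> i) & 1);

    code[12] ^= 1;                                 // injected single-bit error
    uint8_t syndrome = crc8(code);                 // non-zero: error detected
    if (unique && syndrome != 0) code[table[syndrome]] ^= 1;
    std::cout << "syndrome after correction: " << int(crc8(code)) << "\n";  // 0
}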
Another option is to compute the CRC, then reverse cycle the CRC until only the least significant bit is 1 while the rest are 0. This can be expanded to handle single burst correction, reverse cycling the CRC until more than half of the most significant bits are 0, depending on the CRC and message length.
http://www2.hawaii.edu/~tmandel/papers/CRCBurst.pdf
CRCBurst.pdf's algorithm is similar to reverse cycling a CRC, except that it requires the least significant bit to be 1, which is an issue for a short burst at the beginning of a message. With reverse cycling, the CRC can be cycled backwards until the least significant bit is 1, and the leading bits of the CRC that correspond to bits that would precede the message are ignored.
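The register step that reverse cycling relies on is small. Here is a hedged sketch of it (my own illustration, assuming an MSB-first 8-bit CRC register with polynomial 0x07): one forward cycle and its exact inverse, which the approaches above apply repeatedly to walk the syndrome back.

#include <cstdint>
#include <iostream>

// cycle() shifts one zero bit into an MSB-first CRC register;
// reverse_cycle() undoes exactly one such step.
static uint8_t cycle(uint8_t crc) {
    return (crc & 0x80) ? static_cast<uint8_t>((crc << 1) ^ 0x07)
                        : static_cast<uint8_t>(crc << 1);
}

static uint8_t reverse_cycle(uint8_t crc) {
    // A forward step that XORed in the polynomial always leaves bit 0 set,
    // because every valid CRC polynomial ends in 1; that identifies which
    // branch has to be undone.
    return (crc & 1) ? static_cast<uint8_t>(((crc ^ 0x07) >> 1) | 0x80)
                     : static_cast<uint8_t>(crc >> 1);
}

int main() {
    bool ok = true;                      // verify the inverse on all 256 values
    for (int v = 0; v < 256; ++v)
        ok = ok && (reverse_cycle(cycle(static_cast<uint8_t>(v))) == v);
    std::cout << (ok ? "reverse cycling inverts forward cycling\n" : "mismatch\n");
}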
There is a 32 bit CRC that can correct up to 3 error bits for a message with 1024 bits (992 data, 32 CRC), but the table is huge (1.4 GB):
https://stackoverflow.com/a/62201417/3282056
Link to example code:
https://github.com/jeffareid/misc/blob/master/crccor3.c
An error-correcting BCH code could be used instead: 992 bits of data and 30 parity bits are enough for 3-bit error correction.
A CRC is not an error-correcting code, and does not have the information required in general to locate the error, even if you assume that there is only one bit in error. You don't even know if the bit in error is in the message or in the CRC. A CRC is an error-detecting code.
If you have a short enough message, there are ways to locate where the error may be. See https://stackoverflow.com/a/6169837/1180620
There are many error-correcting codes you could choose from. Reed-Solomon codes are commonly used, and can be tuned to your application with the choice of n and k.

Software implementation of floating point division, issues with rounding

As a learning project I am implementing floating point operations (add, sub, mul, div) in software using C++. The goal is to become more comfortable with the underlying details of floating point behavior.
I am trying to match my processor's operations bit for bit, i.e. the IEEE 754 standard. So far it has been working great: add, sub, and mul behave perfectly. I tested them on around 110 million random operations and got exactly the same results as the processor produces in hardware (although I did not take edge cases, overflow, etc. into account).
After that, I moved on to the last operation, division. It works fine and achieves the wanted result, but from time to time I get the last mantissa bit wrong: it is not rounded up. I am having a bit of a hard time understanding why.
The main reference I have been using is the great talk by John Farrier (the timestamp is at the point where he shows how to round):
https://youtu.be/k12BJGSc2Nc?t=1153
That rounding has been working really well for all operations but is giving me trouble for the division.
Let me give you a specific example.
I am trying to divide 645.68011474609375 by 493.20962524414063.
The final result I get is:
mine : 0-01111111-01001111001000111100000
c++_ : 0-01111111-01001111001000111100001
As you can see everything matches except for the last bit. The way I am computing the division is based on this video:
https://www.youtube.com/watch?v=fi8A4zz1d-s
Following this, I compute 28 bits of accuracy: 24 bits of mantissa (the hidden one plus 23 fraction bits), the 3 bits for guard, round, and sticky, plus an extra bit for the possible shift.
Using the algorithm from the video, I can get a normalization shift of at most 1; that is why I have an extra bit at the end, so that if it gets shifted in during normalization it will be available for the rounding. Now here is the result I get from the division algorithm:
010100111100100011110000 0100
------------------------ ----
^ grs^
|__ to be normalized |____ extra bit
As you can see, I get a 0 in the 24th position, so I will need to shift left by one to get the correct normalization.
This means I will get:
10100111100100011110000 100
Based on John Farrier's video, in the case of GRS bits 100, I only round up if the LSB of the mantissa is a 1. In my case it is a zero, and that is why I do not round up my result.
The reason I am a bit lost is that I am sure my algorithm is computing the right mantissa bits (I have double-checked them with online calculators), and the rounding strategy works for all the other operations. Also, computing it this way triggers the normalization, which in the end yields the correct exponent.
Am I missing something? A small detail somewhere?
One thing that strikes me as odd is the sticky bit: in addition and multiplication you can get a larger shift, which gives the sticky bit more chances to be set; here I shift by at most one, which means the sticky bit hardly ever does anything.
I do hope I gave enough details to make my problem understood. At the link below you can find my division implementation; it is a bit cluttered with prints I am using for debugging, but it should give an idea of what I am doing. The code starts at line 374:
https://gist.github.com/giordi91/1388504fadcf94b3f6f42103dfd1f938
PS: meanwhile I am going through "What Every Computer Scientist Should Know About Floating-Point Arithmetic" to see if I missed something.
The result you get from the division algorithm is inadequate. You show:
010100111100100011110000 0100
------------------------ ----
^ grs^
|__ to be normalized |____ extra bit
The mathematically exact quotient continues:
010100111100100011110000 0100 110000111100100100011110…
Thus, the residue at the point where you are rounding exceeds ½ ULP, so it should be rounded up. I did not study your code in detail, but it looks like you may have just calculated an extra bit or two of the significand¹. You actually need to know that the residue is non-zero, not just whether its next bit or two is zero. The final sticky bit should be one if any of the bits at or beyond that position in the exact mathematical result would be non-zero.
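To make that concrete, here is a minimal sketch (my own illustration, not your algorithm; it assumes positive, normal, finite binary32 inputs and round-to-nearest-even). It divides the integer significands with long division and folds the entire remainder into the sticky bit before rounding, and with your example inputs it should reproduce the hardware result.

#include <cmath>
#include <cstdint>
#include <cstring>
#include <cstdio>

// Divide two float significands as integers and round to nearest-even,
// deriving the sticky bit from the full residue (r != 0) rather than from
// a fixed number of extra quotient bits.
static float divide(float a, float b) {
    int ea, eb;
    // frexp gives a significand in [0.5, 1); scale it to a 24-bit integer.
    int64_t ma = static_cast<int64_t>(std::ldexp(std::frexp(a, &ea), 24));
    int64_t mb = static_cast<int64_t>(std::ldexp(std::frexp(b, &eb), 24));

    int64_t q = (ma << 26) / mb;   // 26 or 27 significant quotient bits
    int64_t r = (ma << 26) % mb;   // residue: everything not captured in q

    // Locate the 24-bit significand inside q (the quotient is in (0.5, 2)).
    int shift = (q >> 26) ? 3 : 2;
    int64_t sig    = q >> shift;                        // hidden 1 + 23 bits
    int64_t guard  = (q >> (shift - 1)) & 1;
    int64_t below  = q & ((int64_t{1} << (shift - 1)) - 1);
    bool    sticky = below != 0 || r != 0;              // anything left at all

    int e2 = (ea - eb) - 26 + shift;                    // value = sig * 2^e2
    if (guard && (sticky || (sig & 1))) ++sig;          // round to nearest-even
    if (sig == (int64_t{1} << 24)) { sig >>= 1; ++e2; } // rounding carried out

    return std::ldexp(static_cast<float>(sig), e2);
}

int main() {
    float a = 645.68011474609375f, b = 493.20962524414063f;
    float mine = divide(a, b), hw = a / b;
    uint32_t um, uh;
    std::memcpy(&um, &mine, 4);
    std::memcpy(&uh, &hw, 4);
    std::printf("mine %08x  hardware %08x  %s\n",
                um, uh, um == uh ? "match" : "differ");
}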
Footnote
¹ “Significand” is the preferred term. “Mantissa” is a legacy term for the fraction portion of a logarithm. The significand of a floating-point value is linear. A mantissa is logarithmic.

Division by Multiplication and Shifting

Why, when you use the multiplication/shift method of division (for instance, multiply by 2^32/10, then shift right by 32), do you get the expected result minus one with negative numbers?
For instance, if you do 99/10 you get 9, as expected, but if you do -99 / 10 you get -10.
I verified that this is indeed the case (I did this manually with bits) but I can't understand the reason behind it.
If anyone can explain why this happens in simple terms I would be thankful.
Why, when you use the multiplication/shift method of division (for instance, multiply by 2^32/10, then shift right by 32), do you get the expected result minus one with negative numbers?
You get the expected result, rounded down.
-99/10 is -9.9 which is -10 rounded down.
Edit: I googled a bit more; this article mentions that you're supposed to handle negatives as a special case:
Be aware that in the debug mode the optimized code can be slower, especially if you have both negative and positive numbers and you have to handle the sign yourself.
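For what it's worth, here is a tiny demonstration (my own, using the constants from the question; production compilers derive their own magic numbers and fix-ups) of the floor behaviour next to C++'s truncating division, plus the usual adjust-by-one correction for negative, inexact quotients.

#include <cstdint>
#include <iostream>

// The multiply/shift (arithmetic shift right) computes a floor, while
// C++ '/' truncates toward zero; they differ by one exactly when the
// dividend is negative and the quotient is inexact.
int main() {
    const int64_t magic = (int64_t{1} << 32) / 10;          // 2^32 / 10

    for (int32_t n : {99, -99}) {
        int64_t floored   = (int64_t{n} * magic) >> 32;      // floor-style result
        int32_t truncated = n / 10;                          // what '/' gives
        // Converting floor to truncation: bump by one when the dividend is
        // negative and the quotient was not exact.
        int64_t fixed = floored + ((n < 0 && floored * 10 != n) ? 1 : 0);
        std::cout << n << "/10  multiply-shift: " << floored
                  << "  fixed: " << fixed << "  operator/: " << truncated << "\n";
    }
}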

How to correct a message using Hamming Code

So I want to work on this summer project to correct errors in a message transmission using Hamming Code, but I cannot figure out how it really works. I've read many articles online, but I don't really understand the algorithm. Can anybody explain it in simple terms?
Thanks.
It's all about Hamming distance.
The Hamming distance between two base-2 values is the number of bits at which they differ. So if you transmit A, but I receive B, then the number of bits which must have been switched in transmission is the Hamming distance between A and B.
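A tiny illustration of that definition (my own): the Hamming distance is just the population count of the XOR of the two words.

#include <cstdint>
#include <iostream>

// Hamming distance = number of bit positions at which two words differ,
// i.e. the number of 1 bits in their XOR.
static int hamming_distance(uint32_t a, uint32_t b) {
    uint32_t x = a ^ b;                   // 1 exactly where a and b differ
    int d = 0;
    while (x) { x &= x - 1; ++d; }        // clear the lowest set bit, count it
    return d;
}

int main() {
    std::cout << hamming_distance(0b1011, 0b1001) << "\n";   // prints 1
}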
Hamming codes are useful when the bits in each code word are transmitted somehow separately. We don't care whether they're serial or parallel, but they aren't for instance combined into an analogue value representing several bits, or compressed/encrypted after encoding.
Thus, each bit is independently (at random, with some fixed probability) either received correctly or flipped. Assuming the transmission is fairly reliable, most bits are received correctly. So errors in a small number of bits are more likely, and simultaneous errors in large numbers of bits are unlikely.
So, a Hamming code usually aims to correct 1-bit errors, and/or to detect 2-bit errors (see the Wikipedia article for details of the two main types). Codes which correct/detect bigger errors can be constructed, but AFAIK aren't used as much.
The code works by evenly spacing out the code points in "Hamming space", which in mathematical terms is the metric space consisting of all values of the relevant word size, with Hamming distance as the metric. Imagine that each code point is surrounded by a little "buffer zone" of invalid values. If a value is received that isn't a code point, then an error must have occurred, because only valid code points are ever transmitted.
If a value in the buffer zone is received, then on the assumption that a 1-bit error occurred, the value which was transmitted must be distance 1 from the value received. But because the code points are spread out, there is only one code point that close. So it's "corrected" to that code point, on the grounds that a 1-bit error is more likely than the greater error that would be needed for any other code point to produce the value received. In probability terms, the conditional probability that you sent the nearby code point is greater than the conditional probability that you sent any other code point, given that I received the value I did. So I guess that you sent the nearby one, with a certain confidence based on the reliability of the transmission and the number of bits in each word.
If an invalid value is received which is equidistant from two code points, then I can't say that one is more likely to be the true value than the other. So I detect the error, but I can't correct it.
Obviously 3-bit errors are not corrected by a SECDED Hamming code: the received value is then closer to some other code point than to the value you actually sent, and I erroneously "correct" it to the wrong value. So you either need transmission reliable enough that you don't care about them, or else you need higher-level error detection as well (for example, a CRC over an entire message).
Specifically from Wikipedia, the algorithm is as follows:
Number the bits starting from 1: bit 1, 2, 3, 4, 5, etc.
Write the bit numbers in binary. 1, 10, 11, 100, 101, etc.
All bit positions that are powers of two (have only one 1 bit in the binary form of their position) are parity bits.
All other bit positions, with two or more 1 bits in the binary form of their position, are data bits.
Each data bit is included in a unique set of 2 or more parity bits, as determined by the binary form of its bit position.
Parity bit 1 covers all bit positions which have the least significant bit set: bit 1 (the parity bit itself), 3, 5, 7, 9, etc.
Parity bit 2 covers all bit positions which have the second least significant bit set: bit 2 (the parity bit itself), 3, 6, 7, 10, 11, etc.
Parity bit 4 covers all bit positions which have the third least significant bit set: bits 4–7, 12–15, 20–23, etc.
Parity bit 8 covers all bit positions which have the fourth least significant bit set: bits 8–15, 24–31, 40–47, etc.
In general each parity bit covers all bits where the binary AND of the parity position and the bit position is non-zero.
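Here is a small sketch of that scheme (my own illustration, using Hamming(7,4) as an assumed example size): positions 1, 2, and 4 are parity bits, each parity bit covers the positions whose number shares that bit, and the XOR of the positions of all received 1 bits is either 0 or the position of the single flipped bit.

#include <iostream>
#include <vector>

// encode(): place 4 data bits at positions 3, 5, 6, 7 and fill the parity
// bits at positions 1, 2, 4 so that each covered group has even parity.
// syndrome(): XOR the positions of all set bits; for a single-bit error this
// is exactly the 1-based position of the flipped bit, and 0 means no error.
static std::vector<int> encode(const std::vector<int>& data) {
    std::vector<int> code(8, 0);                       // index 0 unused
    int d = 0;
    for (int pos = 1; pos <= 7; ++pos)
        if (pos != 1 && pos != 2 && pos != 4)          // data positions
            code[pos] = data[d++];
    for (int p : {1, 2, 4})                            // parity positions
        for (int pos = 3; pos <= 7; ++pos)
            if (pos != p && (pos & p)) code[p] ^= code[pos];
    return code;
}

static int syndrome(const std::vector<int>& code) {
    int s = 0;
    for (int pos = 1; pos <= 7; ++pos)
        if (code[pos]) s ^= pos;
    return s;
}

int main() {
    std::vector<int> code = encode({1, 0, 1, 1});
    code[6] ^= 1;                                      // flip one bit in transit
    int s = syndrome(code);
    std::cout << "syndrome = " << s << "\n";           // prints 6
    if (s != 0) code[s] ^= 1;                          // correct the flipped bit
    std::cout << "after correction: " << syndrome(code) << "\n";   // prints 0
}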
The Wikipedia article explains it quite nicely.
If you don't understand a specific aspect of the algorithm, then you will need to rephrase (or detail) your question, so that someone can address your specific part of the problem.