Hamming code given a generator matrix question - error-correction

Can I just say from the outset that this isn't a homework question, as I'm way too
old for that. But it is related to an open source radio decoder project I'm working on:
http://github.com/IanWraith/DMRDecode
Part of the radio protocol I'm interested in uses a Hamming (7,4,3) code to protect
4 bits in a particular part of a data packet. So for every 4 bits of data it adds
3 parity bits which is easy enough for me even 20 years after I studied this at
technical college. The specification document just gives the Hamming generator matrix which is as follows
1000 101
0100 111
0010 110
0001 011
DDDD HHH
1234 210
Now my question is: does this mean the following
H2 is the XORed product of D1 , D2 , D3
H1 is the XORed product of D2 , D3 , D4
H0 is the XORed product of D1 , D2 , D4
or have I got this horribly wrong?
Thanks for your time.
Ian

For the generator matrix you give, your interpretation is correct. Your tables do mean:
H0 = D1 ^ D2 ^ D4
H1 = D2 ^ D3 ^ D4
H2 = D1 ^ D2 ^ D3
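As a quick illustration (my own sketch, not code from the DMR spec), encoding with that matrix is just three XORs:

#include <stdio.h>

int main(void)
{
    int d1 = 1, d2 = 0, d3 = 1, d4 = 1;   /* example data bits */

    /* parity bits per the generator matrix quoted in the question */
    int h0 = d1 ^ d2 ^ d4;
    int h1 = d2 ^ d3 ^ d4;
    int h2 = d1 ^ d2 ^ d3;

    printf("D1..D4 = %d%d%d%d  H2 H1 H0 = %d%d%d\n", d1, d2, d3, d4, h2, h1, h0);
    return 0;
}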
However, the normal Hamming(7,4) matrix, in the same notation, would be
1000 011
0100 101
0010 110
0001 111
DDDD HHH
1234 210
Only H0 is the same between the two matrices. The other two bits are
H1 = D1 ^ D3 ^ D4
H2 = D2 ^ D3 ^ D4
It would be handy to be sure that the specification actually matches what's done in practice.
Equally critical is the specification for the order of the bits in the transmitted word. For instance, for the typical Hamming(7,4) encoding, the order
H0, H1, D1, H2, D2, D3, D4
has the property that the XOR with the parity check matrix tells you either (1) that all bits seem to be correct (== {0,0,0}) or (2) one bit appears to be wrong and it is the one in the bit position given by the result of the parity check matrix. I.e., if the three bits returned from multiplying the received code by the parity check matrix are {1, 0, 1}, then the 5th bit (101 interpreted in base 2) has been flipped. In the above ordering, this means D2 has been flipped.
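To make that concrete, here is a small sketch (mine, using the standard H0, H1, D1, H2, D2, D3, D4 ordering just described, not the DMR spec's layout) that recomputes the parity checks on a received word and reads the syndrome as a bit position:

#include <stdio.h>

int main(void)
{
    /* valid codeword for D1..D4 = 1,0,1,1: H0 = D1^D2^D4 = 0, H1 = D1^D3^D4 = 1, H2 = D2^D3^D4 = 0 */
    int bits[7] = { 0, 1, 1, 0, 0, 1, 1 };   /* H0 H1 D1 H2 D2 D3 D4 */

    bits[4] ^= 1;   /* flip the 5th bit (D2) to simulate a transmission error */

    /* each syndrome bit re-checks one parity equation over the received word */
    int s0 = bits[0] ^ bits[2] ^ bits[4] ^ bits[6];   /* H0 ^ D1 ^ D2 ^ D4 */
    int s1 = bits[1] ^ bits[2] ^ bits[5] ^ bits[6];   /* H1 ^ D1 ^ D3 ^ D4 */
    int s2 = bits[3] ^ bits[4] ^ bits[5] ^ bits[6];   /* H2 ^ D2 ^ D3 ^ D4 */

    int pos = (s2 << 2) | (s1 << 1) | s0;   /* 0 = no error, otherwise the 1-based bit position */
    printf("syndrome = %d%d%d, flipped bit position = %d\n", s2, s1, s0, pos);
    return 0;
}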

This article, Hamming(7,4), will tell you more than you want to know about how to construct the parity bits and where they are encoded into the output.

Related

6-bit CRC datasheet confusion STMicroelectronics L9963E

I’m working on the SPI communication between a microcontroller and the L9963E. The datasheet of the L9963E shows little information about the CRC calculation, but mentions:
a CRC of 6 bits,
a polynomial of x^6 + x^4 + x^3 + 1 = 0b1011001
a seed of 0b111000
The documentation also mentions in the SPI protocol details that the CRC is a value between 0x00 and 0x3F, and is "calculated on the [39-7] field of the frame", see Table 22.
I'm wondering: What is meant by "field [39-7]"? The total frame length is 40 bits, of which the CRC makes up 6 bits. I would expect a CRC calculation over the remaining 34 bits, but field 39-7 would mean either 33 bits (field 7 inclusive) or 32 bits (excluding field 7).
Since I have access to an L9963E evaluation board, which includes pre-loaded firmware, I have hooked up a logic analyser. I have found the following example frames being sent to the L9963E from the eval board; I am assuming that these are valid, error-free frames.
0xC21C3ECEFD
0xC270080001
0xE330081064
0xC0F08C1047
0x827880800E
0xC270BFFFF9
0xC2641954BE
Could someone clear up the datasheet for me, and maybe advise me on how to implement this CRC calculation?
All of the special frames have CRC == 0 (here "xor E000000000" applies the 0b111000 seed to the top six bits of the 40-bit frame, and "mod 59" is a carry-less, GF(2) polynomial modulo by 0x59 = 0b1011001):
(0000000016 xor E000000000) mod 59 = 00
(C1FCFFFC6C xor E000000000) mod 59 = 00
(C1FCFFFC87 xor E000000000) mod 59 = 00
(C1FCFFFCDE xor E000000000) mod 59 = 00
(C1FCFFFD08 xor E000000000) mod 59 = 00
and as noted by Mark Adler only one of the messages in the question works:
(C270BFFFF9 xor E000000000) mod 59 = 00
Bit ranges in datasheets like this are always inclusive.
I suspect that this is just a typo, or the person who wrote it temporarily forgot that the bits are numbered from zero.
Looking at the other bit-field boundaries in Table 19 of the document you linked it wouldn't make sense to exclude the bottom bit of the data field from the CRC, so I suspect the datasheet should say bits 39 to 6 inclusive.
There is a tool called pycrc that can generate C code to calculate a CRC with an arbitrary polynomial.
If a 40-bit message is in the low bits of uint64_t x;, then this:
x ^= (uint64_t)0x38 << 34;          /* apply the 0b111000 seed to bits 39..34 */
for (int j = 0; j < 40; j++)        /* bit-serial carry-less division by the polynomial */
    x = x & 0x8000000000 ? (x << 1) ^ ((uint64_t)0x19 << 34) : x << 1;
x &= 0xffffffffff;                  /* keep only the low 40 bits */
gives x == 0 for all of the example messages in the document. But only for one of your examples in the question. You may not be extracting the data correctly.
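Wrapped up as a self-contained check (the function name frame_crc_ok is mine, not something from the datasheet), that could look like:

#include <stdint.h>
#include <stdio.h>

/* Returns nonzero if a 40-bit frame is consistent with the 6-bit CRC,
   assuming the 0b111000 seed and the 0x59 polynomial described above. */
static int frame_crc_ok(uint64_t x)
{
    x ^= (uint64_t)0x38 << 34;            /* seed in the top 6 bits */
    for (int j = 0; j < 40; j++)          /* carry-less division by the polynomial */
        x = x & 0x8000000000 ? (x << 1) ^ ((uint64_t)0x19 << 34) : x << 1;
    return (x & 0xffffffffff) == 0;       /* remainder must be zero */
}

int main(void)
{
    printf("%d\n", frame_crc_ok(0xC270BFFFF9));   /* the one captured frame that checks out */
    printf("%d\n", frame_crc_ok(0xC21C3ECEFD));   /* expected to fail, per the answer above */
    return 0;
}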

how does CF (carry flag) get set for the computation t = a-b where a and b are unsigned integers

I'm new to x86-64; just a question on how CF gets set. I was reading a textbook which says:
CF: Carry flag is used when most recent operation generated a carry out of the most significant bit. Used to detect overflow for unsigned operations.
I have two questions:
Q1 - suppose we used one of the add instructions to perform the equivalent of the C assignment t = a+b, where variables a, b, and t are integers (only 3 bits for simplicity). So for 011(a) + 101(b) = 1000, which truncates to 000: since we have a carry-out bit of 1 in the fourth digit, the CF flag will be set to 1. Is my understanding correct?
Q2 - if my understanding in Q1 is true, and suppose we used one of the sub instructions to perform the equivalent of the C assignment t = a-b, where a, b, and t are unsigned integers: since a and b are unsigned, we can't actually do a+(-b), and I don't get how 011(a) - 101(b) can carry out of the most significant bit?
The carry flag is often called "borrow" when performing a subtraction. After a subtraction, it is set if a 1 had to be borrowed from the next bit (or would have been borrowed if you used the grade-school subtraction method). The borrow flag is like a -1 in that bit position:
  011         -1 211
- 101   ->    -  101
-----         ------
               B 110
You can get the same result by prepending a zero bit to the arguments (zero-extending them by one bit), and then the carry or borrow will be the high bit of the result:
0011 - 0101 = 0011 + (-0101) = 0011 + 1011 = 1110
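The same thing is easy to see in C (a small sketch of mine, widening the 3-bit values into a normal unsigned type):

#include <stdio.h>

int main(void)
{
    unsigned a = 3;                   /* 011 */
    unsigned b = 5;                   /* 101 */

    unsigned t = (a - b) & 7;         /* keep 3 bits: 110 */
    int borrow = a < b;               /* CF after SUB: set when a borrow was needed */

    unsigned s = (a + b) & 7;         /* 011 + 101 = 1000 -> 000 in 3 bits */
    int carry = ((a + b) >> 3) & 1;   /* CF after ADD: the bit carried out of bit 2 */

    printf("a-b = %u, borrow = %d\n", t, borrow);   /* prints 6, 1 */
    printf("a+b = %u, carry  = %d\n", s, carry);    /* prints 0, 1 */
    return 0;
}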

uint8_t different decimal values for same binary

I have the following issue using an ARM® Cortex™-M4F CPU running mbedOS 5.9.
Say I have the binary value 10101000 and that I also have the following union/struct:
union InputWord_u
{
    uint8_t all;
    struct BitField_s
    {
        uint8_t start   : 1; // D7
        uint8_t select  : 3; // D6, D5, D4
        uint8_t payload : 4; // D3, D2, D1, D0
    } bits;
};
I have a simple program where I access my word and assign the values as such:
InputWord_u word;
word.bits.start = 0b1;
word.bits.select = 0b010;
word.bits.payload = 0b1000;
Therefore, word.all == 10101000 and is a uint8_t.
If I print this with printf("%u", word.all); then I receive the value 133.
If I then define the following uint8_t:
uint8_t value = 0b10101000;
And print this using printf("%u", value); then I receive the value 168.
I expect both values to equal 168.
I appreciate that this is likely me grossly misunderstanding how a Struct is represented in memory. Nevertheless, could someone please explain what is exactly going on?
Thanks.
The standard guarantees hardly anything about the representation of bit-fields.
Therefore, word.all == 10101000
What you've tripped over here is that you've assumed that the bit-fields are packed starting from most significant bit to least significant.
However, it appears that your bit fields were stored in the reverse order, and in fact word.all == 1000'010'1. To get the result you expect, you could reorder the bit-fields:
struct BitField_s
{
    uint8_t payload : 4; // D3, D2, D1, D0
    uint8_t select  : 3; // D6, D5, D4
    uint8_t start   : 1; // D7
} bits;
But be aware that bit-fields are not portable: other systems might not have the same order.
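If you need a layout you can rely on, a sketch like this (my example, not the OP's code) packs the bits with explicit shifts, so it does not depend on the compiler's bit-field order:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint8_t start   = 0x1;   /* 0b1    */
    uint8_t select  = 0x2;   /* 0b010  */
    uint8_t payload = 0x8;   /* 0b1000 */

    /* pack explicitly, MSB first: D7 = start, D6..D4 = select, D3..D0 = payload */
    uint8_t word = (uint8_t)((start << 7) | (select << 4) | payload);

    printf("%u\n", word);   /* prints 168 regardless of the compiler's bit-field layout */
    return 0;
}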
The problem is that you calculated the value the reverse way, as
(start << 7) | (select << 4) | payload
whereas the actual value is calculated as
(payload << 4) | (select << 1) | start
So your bit-field starts with the less significant part of the uint8_t. It has nothing to do with the little-endianness of the system, because little-endianness defines the order of bytes within uint16_t, uint32_t, etc.
The order of bits of a bit-field inside a byte is defined by the compiler. For example, MSVC uses low-to-high order, as in your example.
The binary values of 133 and 168:
133 = 10000101
168 = 10101000
This shows that the actual layout is different from the layout you assumed.
It seems that it is arranged in the following manner (bit 7 on the left):
| payload (4 bits) | select (3 bits) | start (1 bit) |   <- all
And you are assuming the following order:
| start (1 bit) | select (3 bits) | payload (4 bits) |
Also, different compilers may use different layouts.

Concatenation of prefixes of a boolean array

I have a boolean array A of size n that I want to transform in the following way: concatenate every prefix of size 1, 2, ..., n.
For instance, for n=5, it will transform "abcde" into "aababcabcdabcde".
Of course, a simple approach can loop over every prefix and use elementary bit operations (shift, mask, plus); so the complexity of this approach is obviously O(n).
There are some well-known tricks regarding bit manipulation, like the ones here.
My question is: is it possible to achieve a faster algorithm for the transformation described above with a complexity better than O(n) by using bit manipulations ?
I am aware that the interest of such an improvement may be just theoretical because the simple approach could be the fastest in practice, but I am still curious about the fact that there exists a theoretical improvement or not.
To be precise, I need to perform this transformation p times, where p can be much bigger than n; some pre-computations could be done for a given n and used later for the computation of the p transformations.
I'm not sure if this is what you're looking for, but here is a different algorithm, which may be interesting depending on your assumptions.
First compute two masks that depend only on n, so for any particular n these are just constants:
C (copy mask), a mask that has every n'th bit set and is n² bits long. So for n = 5, C = 0000100001000010000100001. This will be used to create n copies of A concatenated together.
E (extract mask), a mask that indicates which bits to take from the big concatenation, which is built up from n blocks of n bits with values 1, 3, 7, 15, ... E.g. for n = 5, E = 1111101111001110001100001. Pad the left with zeroes if necessary.
Then the real computation that takes an A and constructs the concatenation of prefixes is:
pext(A * C, E)
Where pext is compress_right, discarding the bits for which the extraction mask is 0 and compacting the remaining bits to the right.
The multiplication can be replaced by a "doubling" technique like this: (which can also be used to compute the C mask)
l = n
while l < n²:
    A = A | (A << l)
    l = l * 2
Which in general produces too many concatenated copies of A but you can just pretend the excess isn't there by not looking at it (the pext drops any excess input anyway). Efficiently producing E for an unknown and arbitrarily large n seems harder, but that's not really a use case for this approach anyway.
The actual time complexity depends on your assumptions, but of course in the full "arbitrary bit count" setting both multiplication and compaction are heavyweight operations and the fact that the output has a size quadratic in input size really doesn't help. For small enough n such that eg n² <= 64 (depending on word size), so everything fits in a machine word, it all works out well (even for unknown n, since all the required masks can be precomputed). On the other hand, for such small n it is also feasible to table the entire thing, doing a lookup for a pair (n, A).
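For small n this is easy to try out. Here is a sketch (mine) for n = 5, assuming element i of the array sits in bit i of A (first element in the least significant bit); pext() is a small portable stand-in for the BMI2 instruction:

#include <stdint.h>
#include <stdio.h>

static uint64_t pext(uint64_t x, uint64_t mask)
{
    uint64_t out = 0;
    for (int i = 0, j = 0; i < 64; i++)
        if (mask >> i & 1)
            out |= ((x >> i) & 1) << j++;
    return out;
}

int main(void)
{
    const int n = 5;
    const uint64_t C = 0x108421;    /* 0000100001000010000100001: every 5th bit set */
    const uint64_t E = 0x1F79C61;   /* 1111101111001110001100001: blocks 11111 01111 00111 00011 00001 */
    uint64_t A = 0x16;              /* example 5-bit input */

    uint64_t fast = pext(A * C, E);

    /* naive reference: place the prefix of size k at offset k*(k-1)/2 */
    uint64_t slow = 0;
    for (int k = 1, pos = 0; k <= n; pos += k, k++)
        slow |= (A & ((1u << k) - 1)) << pos;

    printf("fast = %llx, slow = %llx\n",
           (unsigned long long)fast, (unsigned long long)slow);
    return 0;
}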
I may have found another way to proceed.
The idea is to use multiplication to propagate the initial input I to the correct position. The multiplication coefficient J is the vector whose bits are set to one at position i*(i-1)/2 for i in [1:n].
However, a direct multiplication of I by J will produce many unwanted terms, so the idea is to
mask some bits from vectors I and J
multiply these masked vectors
remove some junk bits from the result.
We have thus several iterations to do; the final result is the sum of the intermediate results. We can write the result as "sum on i of ((I & Ai) * Bi) & Ci", so we have 2 masks and 1 multiplication per iteration (Ai, Bi and Ci are constants depending on n).
This algorithm seems to be O(log(n)), so it is better than O(log(n)^2), but it needs multiplications, which may be costly. Note also that this algorithm requires registers of size n*(n+1)/2, which is better than n^2.
Here is an example for n=7
Input:
I = abcdefg
We set J = 1101001000100001000001
We also note temporary results:
Xi = I & Ai
Yi = Xi * Bi
Zi = Yi & Ci
iteration 1
----------------------------
1                             A1
11 1  1   1    1     1        B1
11 1  1   1    1     1        C1
----------------------------
a                             X1
aa a  a   a    a     a        Y1
aa a  a   a    a     a        Z1
iteration 2
----------------------------
 11                           A2
 1 1  1   1    1     1        B2
  1 11 11  11   11    11      C2
----------------------------
 bc                           X2
  bcbc bc  bc   bc    bc      Y2
  b bc bc  bc   bc    bc      Z2
iteration 3
----------------------------
   1111                       A3
      1   1    1     1        B3
         1   11   111   1111  C3
----------------------------
   defg                       X3
         defgdefg defg  defg  Y3
         d   de   def   defg  Z3
FINAL SUM
----------------------------
aa a  a   a    a     a        Z1
  b bc bc  bc   bc    bc      Z2
         d   de   def   defg  Z3
----------------------------
aababcabcdabcdeabcdefabcdefg  SUM
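Here is a sketch of the same computation in C (mine; the Ai, Bi, Ci constants are derived by hand from the n = 7 example above, with element i of the array in bit i of I), cross-checked against the naive prefix loop:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint32_t I = 0x5B;   /* arbitrary 7-bit test input */

    /* Ai selects the input bits handled in iteration i, Bi copies them to the
       start of every prefix long enough to contain them, Ci strips the junk. */
    uint32_t Z1 = ((I & 0x01) * 0x20844B) & 0x20844B;
    uint32_t Z2 = ((I & 0x06) * 0x20844A) & 0xC319B4;
    uint32_t Z3 = ((I & 0x78) * 0x208440) & 0xF1C6200;
    uint32_t fast = Z1 + Z2 + Z3;   /* the Ci are disjoint, so + is safe */

    /* naive reference: place the prefix of size k at offset k*(k-1)/2 */
    uint32_t slow = 0;
    for (int k = 1; k <= 7; k++)
        slow |= (I & ((1u << k) - 1)) << (k * (k - 1) / 2);

    printf("fast = %07X, slow = %07X\n", (unsigned)fast, (unsigned)slow);
    return 0;
}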

Understanding two different ways of implementing CRC generation with LFSR

There are two ways of implementing CRC generation with linear feedback shift registers (LFSR), as shown in this figure. The coefficients of the generator polynomial in this picture are 100111, and the red "+" circles are exclusive-or operators. The initialization register values are 00000 for both.
For example, if the input data bit stream is 10010011, both A and B will give CRC checksum of 1010. The difference is A finishes with 8 shifts, while B with 8+5=13 shifts because of the 5 zeros appended to the input data. I can understand B very easily since it closely mimics the modulo-2 division. However, I can not understand mathematically how A can give the same result with 5 less shifts. I heard people were talking A took advantage of the pre-appending zeros, but I didn't get it. Can anyone explain it to me? Thanks!
Here is my quick understanding.
Let M(x) be the input message of order m (i.e. has m+1 bits) and G(x) be the CRC polynomial of order n. CRC result for such a message is given by
C(x) = (M(x) * x^n) % G(x)
This is what circuit B is implementing. The additional 5 cycles it takes are due to the x^n multiplication.
Instead of following this approach, circuit A tries to do something smarter. It's trying to answer the question
If C(x) is the CRC of M(x), what would be the CRC for message {M(x), D}
where D is the new bit. So it solves one bit at a time instead of the entire message, as in the case of circuit B. Hence circuit A will take just 8 cycles for an 8-bit message.
Now, since you already understand why circuit B looks the way it does, let's look at circuit A. The math for the effect of adding bit D to message M(x) on the CRC, specifically for your case, is as below.
Let the current CRC C(x) be {c4, c3, c2, c1, c0} and the new bit that is shifted in be D.
NewCRC = ({M(x), D} * x^5) % G(x) = (({M(x), 0} * x^5) % G(x)) XOR ((D * x^5) % G(x))
which is ({c3, c2, c1, c0, 0} XOR {0, 0, c4, c4, c4}) XOR ({0, 0, D, D, D})
which is {c3, c2, c1^c4^D, c0^c4^D, c4^D}
i.e. the LFSR circuit A.
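To see that the two update rules really agree, here is a small sketch (mine) that runs both on the 8-bit stream 10010011 with generator 100111; both print 0A, i.e. 01010:

#include <stdint.h>
#include <stdio.h>

#define POLY_LOW 0x07u   /* generator 100111 without its top bit: 00111 */

/* circuit B: shift the message followed by 5 appended zeros through the register */
static uint8_t crc_b(uint8_t msg, int len)
{
    uint8_t r = 0;
    for (int i = len + 5 - 1; i >= 0; i--) {
        uint8_t bit = (i >= 5) ? (msg >> (i - 5)) & 1 : 0;   /* appended zeros at the end */
        uint8_t top = (r >> 4) & 1;
        r = ((r << 1) | bit) & 0x1F;
        if (top) r ^= POLY_LOW;
    }
    return r;
}

/* circuit A: fold each message bit into the feedback, no appended zeros needed */
static uint8_t crc_a(uint8_t msg, int len)
{
    uint8_t r = 0;
    for (int i = len - 1; i >= 0; i--) {
        uint8_t fb = ((r >> 4) ^ (msg >> i)) & 1;
        r = (r << 1) & 0x1F;
        if (fb) r ^= POLY_LOW;
    }
    return r;
}

int main(void)
{
    uint8_t msg = 0x93;   /* 10010011, the bit stream from the question */
    printf("A: %02X  B: %02X\n", crc_a(msg, 8), crc_b(msg, 8));
    return 0;
}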
You can say that architecture (A) implements the modulo division by aligning the MSB of the polynomial with the MSB of the message, so it does something like the following (in my example I actually use a different CRC polynomial):
But in architecture (B), you can say we try to predict the MSB of the message, so we align the MSB of the CRC polynomial with the MSB-1 of the message, something like the following:
I can recommend this tutorial for more details about this operation.