Error detection and correction - Hamming code

My message is 10011010, so the parity bits for it are 0110, and the codeword is 011100101010.
Suppose an error occurs in the 10th bit, so the word becomes 011100101110. Recomputing the parity bits:
p1 = bits at 1,3,5,7,9,11 = 010111 = even number of 1s, therefore 0
p2 = bits at 2,3,6,7,10,11 = 110111 = odd number of 1s, therefore 1
p4 = bits at 4,5,6,7,12 = 10010 = even number of 1s, therefore 0
p8 = bits at 8,9,10,11,12 = 01110 = odd number of 1s, therefore 1
Comparing with the original codeword, the parity is wrong for positions 4 and 8, i.e. 4+8 = 12, but in fact the error was made in bit 10. Where have I made a mistake?

It works a little differently. When you recompute a parity check, you don't include the parity bit itself in the count; you compare the recomputed value against the stored parity bit. So:
p1 = bits at 3,5,7,9,11 = 10111 = 0 (stored p1 = 0: OK)
p2 = bits at 3,6,7,10,11 = 10111 = 0 (stored p2 = 1: WRONG)
p4 = bits at 5,6,7,12 = 0010 = 1 (stored p4 = 1: OK)
p8 = bits at 9,10,11,12 = 1110 = 1 (stored p8 = 0: WRONG)
So the error position is 2 + 8 = 10.
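
For reference, here is a small Python sketch (names are mine) of the equivalent one-pass syndrome computation: each check includes its own parity bit, so the failing checks sum directly to the error position:

def hamming_syndrome(word):
    # word: received codeword as a bit string, positions numbered from 1
    bits = [int(b) for b in word]
    syndrome = 0
    for p in (1, 2, 4, 8):
        # even parity over every position whose index has bit p set,
        # including the parity bit itself
        ones = sum(bits[i - 1] for i in range(1, len(bits) + 1) if i & p)
        if ones % 2:
            syndrome += p
    return syndrome

print(hamming_syndrome("011100101110"))  # 10, the position of the flipped bit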

Related

Using the AND bitwise operator between a number and its negative counterpart

I stumbled upon this simple line of code and I cannot figure out what it does. I understand each of its parts separately, but I don't really understand it as a whole.
// We have an integer(32 bit signed) called i
// The following code snippet is inside a for loop declaration
// in place of a simple incrementor like i++
// for(;;HERE){}
i += (i&(-i))
If I understand correctly, it ANDs i with negative i and then adds that number to i. I first thought this would be an optimized way of calculating the absolute value of an integer, but as I have since learned, C++ does not store negative integers simply by flipping a sign bit; please correct me if I'm wrong.
Assuming two's complement representation, and assuming i is not INT_MIN, the expression i & -i results in the value of the lowest bit set in i.
If we look at the value of this expression for various values of i:
0 00000000: i&(-i) = 0
1 00000001: i&(-i) = 1
2 00000010: i&(-i) = 2
3 00000011: i&(-i) = 1
4 00000100: i&(-i) = 4
5 00000101: i&(-i) = 1
6 00000110: i&(-i) = 2
7 00000111: i&(-i) = 1
8 00001000: i&(-i) = 8
9 00001001: i&(-i) = 1
10 00001010: i&(-i) = 2
11 00001011: i&(-i) = 1
12 00001100: i&(-i) = 4
13 00001101: i&(-i) = 1
14 00001110: i&(-i) = 2
15 00001111: i&(-i) = 1
16 00010000: i&(-i) = 16
The pattern is clear: the result is always the lowest set bit, which is itself a power of two.
Extrapolating that to i += (i&(-i)), assuming i is positive, it adds the value of the lowest set bit to i. For values that are a power of two, this just doubles the number.
For other values, it increases the number by the value of that lowest set bit. Repeating this in a loop, you eventually end up with a power of 2 (see the sketch below). As for what such an increment could be used for, that depends on the context in which the expression is used.
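
As a quick illustration (a Python sketch, since the original context is a C++ loop), here is the progression for a starting value that is not a power of two:

i = 11                  # 0b1011
steps = [i]
while i & (i - 1):      # i & (i - 1) == 0 exactly when i is a power of two
    i += i & -i         # add the lowest set bit
    steps.append(i)
print(steps)            # [11, 12, 16]: 1011 -> 1100 -> 10000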

Walkthrough: sum 2 integers using bit manipulation

I am trying to understand the logic behind the following code which sums 2 integers using bit manipulation:
def sum(a, b):
    while b != 0:
        carry = a & b
        a = a ^ b
        b = carry << 1
    return a
As an example I used: a = 11 and b = 7
11 in binary representation is 1011
7 in binary representation is 0111
Then I walked through the algorithm:
iter #1: a = 1011, b = 0111
carry = 0011 (3 decimal)
a = 1100 (12 decimal)
b = 0110 (6 decimal)
iter #2: a = 1100, b = 0110
carry = 0100 (4 decimal)
a = 1010 (10 decimal)
b = 1000 (8 decimal)
iter #3: a = 1010, b = 1000
carry = 1000 (8 decimal)
a = 00010 (2 decimal)
b = 10000 (16 decimal)
iter #4: a = 00010, b = 10000
carry = 00000 (0 decimal)
a = 10010 (18 decimal)
b = 00000 (0 decimal)
We're done (because b is now 0).
As we can see, in all iterations a+b is always 18 which is the right answer.
However, I fail to understand what actually happens here. The value of a keeps going down with each iteration until it suddenly pops up to 18 in the last iteration. Also, can we learn anything from the value of the carry during the process?
I would love to understand the intuition behind this.
Thanks to WJS's answer, I think I got it.
let's add 11 and 7 as before, but let's do it in the following order:
First, calculate it without the carry.
Second, calculate only the carry.
Then add both parts.
01011
00111
-----
01100 (neglecting carry)
00110 (the carry bits, shifted left)
-----
10010 (sum)
Now, to find the first part, how do we get rid of the carry bits? With XOR.
To find the second part, we use AND and then shift the result 1 bit left to place each carry under the correct column.
Now all we have to do is sum both parts. The whole point is not using the + operator, so how can we do that? Recursion! We assign the first part to a and the second part to b, and we repeat this process until b = 0, which means we are done; see the sketch below.
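
A recursive rendering of the same idea (a sketch; the function name is mine):

def add(a, b):
    # a holds the carry-less sum, b holds the shifted carries
    return a if b == 0 else add(a ^ b, (a & b) << 1)

print(add(11, 7))  # 18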
Perhaps a simpler example will help (a and b below are written in binary):
a = 11
b = 11
a & b == 11 since AND returns 1's where both bits in the same
position are 1. These are the carry bits.
Now get rid of the the carry locations using exclusive or
a = a ^ b == 00
But a `carry` would cause addition to add bits one position to
the left so shift the carry bits left by 1 bit.
b = carry << 1 = 110
now repeat the process
carry = a & b = 0 & 110 == 0 no more carries
b = carry << 1 == 0
done.
11 + 11 = 110 = 3 + 3 = 6
Understanding the roles of & (AND) and ^ (XOR) is key. Applying them to slightly more complex examples should help. But ignore the interim decimal values, as they don't help much; think only about what is happening in binary.
I think this is easy to understand if you look at what happens with individual bits.
The first step is calculating the carry, which in binary only happens when both bits are 1, so a & b calculates that for every bit. Then the bitwise addition itself (ignoring carries) happens via XOR, which works because:
0+0=0 (==0^0)
1+0=1 (==1^0)
1+1=0 (==1^1, generates carry bit which we ignore)
The next step is to shift the carry to the left (<< 1), move it into b, and repeat until the carry is empty.
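
To see the invariant noticed in the question (a + b stays equal to the final sum the whole time), here is a short trace script (a sketch; names are mine):

def add_with_trace(a, b):
    while b != 0:
        a, b = a ^ b, (a & b) << 1   # sum-without-carry, shifted carries
        print(f"a={a:05b} ({a}), b={b:05b} ({b}), a+b={a + b}")
    return a

add_with_trace(11, 7)  # every line prints a+b=18; the value migrates from b into a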

Optimal way to compress a 60-bit string

Given 15 random hexadecimal digits (60 bits), where there is always at least 1 duplicate in every 20-bit run (5 hexadecimal digits):
What is the optimal way to compress the bytes?
Here are some examples:
01230 45647 789AA
D8D9F 8AAAF 21052
20D22 8CC56 AA53A
AECAB 3BB95 E1E6D
9993F C9F29 B3130
Initially I tried to use Huffman encoding on just 20 bits, because Huffman coding can go from 20 bits down to ~10 bits, but storing the table takes more than 9 bits.
Here is the breakdown showing 20 bits -> 10 bits for 01230
Character Frequency Assignment Space Savings
0 2 0 2×4 - 2×1 = 6 bits
2 1 10 1×4 - 1×2 = 2 bits
1 1 110 1×4 - 1×3 = 1 bit
3 1 111 1×4 - 1×3 = 1 bit
I then tried Huffman encoding on all 300 bits (five 60-bit runs), and here is the mapping for the above example:
Character Frequency Assignment Space Savings
---------------------------------------------------------
a 10 101 10×4 - 10×3 = 10 bits
9 8 000 8×4 - 8×3 = 8 bits
2 7 1111 7×4 - 7×4 = 0 bits
3 6 1101 6×4 - 6×4 = 0 bits
0 5 1100 5×4 - 5×4 = 0 bits
5 5 1001 5×4 - 5×4 = 0 bits
1 4 0010 4×4 - 4×4 = 0 bits
8 4 0111 4×4 - 4×4 = 0 bits
d 4 0101 4×4 - 4×4 = 0 bits
f 4 0110 4×4 - 4×4 = 0 bits
c 4 1000 4×4 - 4×4 = 0 bits
b 4 0011 4×4 - 4×4 = 0 bits
6 3 11100 3×4 - 3×5 = -3 bits
e 3 11101 3×4 - 3×5 = -3 bits
4 2 01000 2×4 - 2×5 = -2 bits
7 2 01001 2×4 - 2×5 = -2 bits
This yields a savings of 8 bits overall, but 8 bits isn't enough to store the Huffman table. It seems that, because of the randomness of the data, the more bits you try to encode with Huffman the less effective it becomes. Huffman encoding seemed to work best with 20 bits (50% reduction), but storing the table in 9 bits or fewer isn't possible AFAIK. (A quick cross-check of the 8-bit figure is sketched below.)
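
The optimal total code length can be computed from the frequencies alone (a sketch using Python's heapq; frequencies taken from the table above):

import heapq

freqs = [10, 8, 7, 6, 5, 5, 4, 4, 4, 4, 4, 4, 3, 3, 2, 2]
heapq.heapify(freqs)
total = 0
while len(freqs) > 1:
    x, y = heapq.heappop(freqs), heapq.heappop(freqs)
    total += x + y               # each merge adds one bit to every symbol beneath it
    heapq.heappush(freqs, x + y)
print(total)                     # 292 bits for the 75 symbols, vs. 300 bits raw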
In the worst case, a 60-bit string still has at least 3 duplicates; in the average case there are more than 3 (my assumption). Because of those 3 or more duplicates, the most distinct symbols you can have in a run of 60 bits is just 12.
Because of the duplicates, plus having fewer than 16 symbols, I can't help but feel there is some type of compression that can be used.
If I simply count the number of 20-bit values with at least two hexadecimal digits equal, there are 524,416 of them, a smidge more than 2^19. So the most you could possibly save is a little less than one bit out of the 20.
Hardly seems worth it.
Let me split your question into two parts:
How do I compress (perfectly) random data: You can't. Every bit is new entropy which can't be "guessed" by a compression algorithm.
How do I compress "one duplicate in five characters": There are exactly 10 options for where the duplicate can be (see the table below). This is basically the entropy. Just store which option it is (maybe grouped for the whole line).
These are the options:
AAbcd = 1 AbAcd = 2 AbcAd = 3 AbcdA = 4 (<-- cases where first character is duplicated somewhere)
aBBcd = 5 aBcBd = 6 aBcdB = 7 (<-- cases where second character is duplicated somewhere)
abCCd = 8 abCdC = 9 (<-- cases where third character is duplicated somewhere)
abcDD = 0 (<-- cases where last characters are duplicated)
So for your first example:
01230 45647 789AA
The first group (01230) is option 4, the second (45647) is option 3, and the third (789AA) is option 0.
You can compress this by multiplying each consecutive option by 10: (4*10 + 3)*10 + 0 = 430.
And uncompress it using divide and modulo: 430 % 10 = 0, (430/10) % 10 = 3, (430/10/10) % 10 = 4. So you could store your number like this:
1AE 0123 4567 789A
^^^ this is 430 in hex and requires only 10 bits
The three option digits combined never exceed 999, so 10 bits are enough.
Compared to storing those 3 removed characters normally (12 bits), you save 2 bits. As someone else already commented, this is probably not worth it. Relative to the whole line it's even less: 2 bits / 60 bits = 3.3% saved.
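
A small Python sketch of this scheme (the pair table and helper names are mine), assuming each 5-digit group contains at least one duplicated pair:

from itertools import combinations

# The 10 possible positions of the duplicated pair, numbered as in the
# table above: (0,1) -> 1, (0,2) -> 2, ..., (2,4) -> 9, (3,4) -> 0
OPTION = {p: (i + 1) % 10 for i, p in enumerate(combinations(range(5), 2))}

def encode_group(group):
    # return (option digit, group with the second member of the pair removed)
    for (i, j), opt in OPTION.items():
        if group[i] == group[j]:
            return opt, group[:j] + group[j + 1:]
    raise ValueError("no duplicate in group")

opts, shortened = zip(*map(encode_group, ["01230", "45647", "789AA"]))
packed = (opts[0] * 10 + opts[1]) * 10 + opts[2]
print(packed, shortened)  # 430 ('0123', '4567', '789A')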
If you want to get rid of the duplicates first, do this, then look at the links at the end of this answer. If you don't want to get rid of the duplicates, then still look at the links at the end of this answer:
Array.prototype.contains = function(v) {
    for (var i = 0; i < this.length; i++) {
        if (this[i] === v) return true;
    }
    return false;
};
Array.prototype.unique = function() {
    var arr = [];
    for (var i = 0; i < this.length; i++) {
        if (!arr.contains(this[i])) {
            arr.push(this[i]);
        }
    }
    return arr;
};
var duplicates = [1, 3, 4, 2, 1, 2, 3, 8];
var uniques = duplicates.unique(); // result = [1,3,4,2,8]
console.log(uniques);
That shortens the data you have to deal with. Then you might want to check out Smaz:
Smaz is a simple compression library suitable for compressing strings.
If that doesn't work, then you could take a look at this:
http://ed-von-schleck.github.io/shoco/
Shoco is a C library to compress and decompress short strings. It is very fast and easy to use. The default compression model is optimized for English words, but you can generate your own compression model based on your specific input data.
Let me know if it works!

0 minus 0 gives carryout of 1 in adder-subtractor circuit

In this adder-subtractor design, with the "M" input as the flag for subtraction, 0 minus 0 seems to produce an incorrect Cout. Let's assume we're only using one full adder here (ignore A1/B1, A2/B2, A3/B3) for simplicity, and M=1, A0=0, B0=0:
The full adder will get the inputs of:
0 (B0) XOR 1 (M) = 1
0 (A0) = 0
1 (M) = 1
This results in 0 + 1 + 1 = 10 in binary, i.e. a sum of 0 with Cout = 1 - but I expected Cout to equal 0:
I think inverting the final Cout will provide the correct result, but everywhere I look online for this adder-subtractor circuit has no inverter for the final Cout. Is this circuit supposed to have an inverter at the final Cout to fix this problem?
The carry out equal to 1 is perfectly normal in this case. When subtracting, the circuit computes A + ~B + 1 in two's complement, and a carry out of 1 simply means that no borrow occurred.
When you work with unsigned logic, the carry out is used as an overflow flag. Assuming you're working with 4-bit operands, the operation:
a = 1000, b = 1001 (Decimal a = 8, b = 9)
1000 +
1001 =
--------
1 0001
produces a carry out of 1'b1 because the result of 8+9 cannot be represented in 4 bits.
On the other hand, when working with signed logic, the carry out signal loses its 'overflow' meaning. Let's look at an example:
a = 0111, b = 0010 (Decimal a = 7, b = 2)
0111 +
0010 =
--------
0 1001
In this case the result is 1001, which is -7 in two's complement. It's obvious that we had an overflow, since we added two positive numbers and got a negative one. The carry out, however, is equal to 0. As a last case, consider:
a = 1111, b = 0001 (Decimal a = -1, b = 1)
1111 +
0001 =
--------
1 0000
we see that even though the result is correct (-1 + 1 = 0), the carry out is set.
To conclude: if you work in signed logic and need to know whether there was an overflow, check the signs of the two operands against the result's sign.
Both operands positive (MSB = 0) and result negative (MSB = 1): overflow
Both operands negative (MSB = 1) and result positive (MSB = 0): overflow
Any other case: no overflow
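
A small Python model of one 4-bit adder-subtractor stage (a sketch; names are mine) reproduces both behaviors, Cout = 1 for 0 - 0 and the sign-based overflow rule:

def add_sub(a, b, m, bits=4):
    # m = 0: add; m = 1: subtract (each B bit is XORed with M, M feeds carry-in)
    b_in = b ^ ((1 << bits) - 1) if m else b
    total = a + b_in + m
    result = total & ((1 << bits) - 1)
    cout = (total >> bits) & 1
    # signed overflow: operand signs equal, result sign different
    sa = (a >> (bits - 1)) & 1
    sb = (b_in >> (bits - 1)) & 1
    sr = (result >> (bits - 1)) & 1
    return result, cout, (sa == sb and sr != sa)

print(add_sub(0b0000, 0b0000, m=1))  # (0, 1, False): 0 - 0 gives Cout = 1, no overflow
print(add_sub(0b0111, 0b0010, m=0))  # (9, 0, True): 7 + 2 overflows to -7 in 4 bits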

What's the insight behind A XOR B in bitwise operation?

What I know about the A XOR B operation is that the output is 1 if A != B, and 0 if A == B. However, I have no insight into this operation when A and B are not single bits.
For example, if A = 1, B = 3, then A XOR B = 2; also, if A = 2, B = 3, then A XOR B = 1. Is there any pattern to the XOR operation for non-binary values?
I have a good understanding of Boolean mathematics, so I already understand how XOR works. What I am asking is: how do you, for example, predict the outcome of A XOR B without going through the manual bit-by-bit calculation, when A and B are not single bits? Let's pretend that 2 XOR 3 = 1 is not just a mathematical artifact.
Thanks!
Just look at the binary representations of the numbers and apply the following rules to each bit:
0 XOR 0 = 0
0 XOR 1 = 1
1 XOR 0 = 1
1 XOR 1 = 0
So, 1 XOR 3 is:
1 = 001
3 = 011
XOR = 010 = 2
To convert a (decimal) number to binary, repeatedly divide by two until you get to 0; the remainders in reverse order form the binary number.
To convert it back, add up the powers of two corresponding to each position holding a 1 (the right-most position corresponds to the 0th power).
XOR on integers and other data is simply XOR of the individual bits:
A: 0|0|0|1 = 1
B: 0|0|1|1 = 3
=======
A^B: 0|0|1|0 = 2
^-- Each column is a single bit xor
When you use bit operations on numbers that are more than one bit, it simply performs the operation on each corresponding bit in the inputs, and that becomes the corresponding bit in the output. So:
A = 1 = 00000001
B = 3 = 00000011
--------
result= 00000010 = 2
A = 2 = 00000010
B = 3 = 00000011
--------
result= 00000001 = 1
The result has a 0 bit wherever the input bits were the same, a 1 bit wherever they were different.
You use the same method when performing AND and OR on integers.
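
A short Python check of the rule, using the question's own numbers:

for a, b in [(1, 3), (2, 3)]:
    print(f"{a:04b} ^ {b:04b} = {a ^ b:04b}  ({a} XOR {b} = {a ^ b})")
# 0001 ^ 0011 = 0010  (1 XOR 3 = 2)
# 0010 ^ 0011 = 0001  (2 XOR 3 = 1)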