Checksum and Bitshift - c++

I'm learning to create a raw packet and send it, following this tutorial. Everything makes sense until I reach the code where the checksum is generated.
unsigned short csum (unsigned short *buf, int nwords)
{
unsigned long sum;
for (sum = 0; nwords > 0; nwords--)
sum += *buf++;
sum = (sum >> 16) + (sum & 0xffff);
sum += (sum >> 16);
return ~sum;
}
It looks like he's summing up all the words in the buffer, but when I hit
sum = (sum >> 16) + (sum & 0xffff);
sum += (sum >> 16);
I get completely lost. It looks like he shifts all the bits right, essentially discarding everything except the carried-over bits, and then adds that back into the original sum? Why is the & 0xffff necessary? After all that, why does he add the carry-out bits again? Is it because there might be a second carry out?

The line:
sum = (sum >> 16) + (sum & 0xffff);
Adds the left and right 16-bit words in the 32-bit integer. It basically splits the number in half and adds the two halves together. sum>>16 gives you the left half, and sum & 0xffff gives you the right half.
When these two halves are added together, the result could possibly overflow 16 bits again. This line:
sum += (sum >> 16);
Adds the overflow back into the original number.

The checksum being computed is 16-bit (unsigned short is very often 16-bit), but the variable sum is unsigned long, and thus probably 32 bit.
So the operation sum >> 16 captures the high word of the sum, which counts all the times the running total has exceeded what 16 bits can hold. This is then added to sum & 0xffff, which is just the low word of the sum.
This way, all bits of the sum are "folded in" so that they contribute to the final result.
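To make the folding concrete, here is a minimal sketch of the same routine written with fixed-width types. The name checksum16 and its signature are my own, not from the tutorial, and it assumes the buffer is short enough that the 32-bit running sum cannot itself overflow.

```cpp
#include <cstddef>
#include <cstdint>

// Ones' complement checksum over 16-bit words, in the style of the
// function from the question. Name and signature are illustrative.
std::uint16_t checksum16(const std::uint16_t* words, std::size_t nwords)
{
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < nwords; ++i)
        sum += words[i];                  // carries pile up in bits 16..31

    sum = (sum >> 16) + (sum & 0xffffu);  // high half + low half
    sum += (sum >> 16);                   // add back a carry produced by the line above
    return static_cast<std::uint16_t>(~sum);
}
```

With the words 0xffff, 0xffff, 0x0001 the raw sum is 0x1ffff. The first fold gives 1 + 0xffff = 0x10000, which has overflowed 16 bits again, so the second line adds that carry back in before the low 16 bits are complemented. That is the "second carry out" the question asks about.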

Related

Convert every 5 bits into integer values in C++

Firstly, if anyone has a better title for me, let me know.
Here is an example of the process I am trying to automate with C++
I have an array of values that appear in this format:
9C07 9385 9BC7 00 9BC3 9BC7 9385
I need to convert them to binary and then convert every 5 bits to decimal like so with the last bit being a flag:
I'll do this with only the first word here.
9C07
10011 | 10000 | 00011 | 1
19 | 16 | 3
These are actually x,y,z coordinates, and the final bit determines their order: a '0' would make it x=19 y=16 z=3, and a '1' makes it x=16 y=3 z=19.
I already have a buffer filled with these hex values, but I have no idea where to go from here.
I assume these are integer literals, not strings?
The way to do this is with bitwise right shift (>>) and bitwise AND (&)
#include <cstdint>
struct Coordinate {
std::uint8_t x;
std::uint8_t y;
std::uint8_t z;
constexpr Coordinate(std::uint16_t n) noexcept
{
if (n & 1) { // flag
x = (n >> 6) & 0x1F; // 1 1111
y = (n >> 1) & 0x1F;
z = n >> 11;
} else {
x = n >> 11;
y = (n >> 6) & 0x1F;
z = (n >> 1) & 0x1F;
}
}
};
The following code would extract the three coordinates and the flag from the 16 least significant bits of value (ie. its least significant word).
int flag = value & 1; // keep only the least significant bit
value >>= 1; // shift right by one bit
int third_integer = value & 0x1f; // keep only the five least significant bits
value >>= 5; // shift right by five bits
int second_integer = value & 0x1f; // keep only the five least significant bits
value >>= 5; // shift right by five bits
int first_integer = value & 0x1f; // keep only the five least significant bits
value >>= 5; // shift right by five bits (only useful if there are other words in "value")
What you need is most likely some loop doing this on each word of your array.
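Such a loop might look roughly like this. It is only a sketch: Coord, decode_word and decode_all are names I made up, and it assumes the buffer already holds the values as 16-bit integers rather than as text.

```cpp
#include <cstdint>
#include <vector>

struct Coord { int x, y, z; };  // hypothetical result type

// Split one 16-bit word into three 5-bit fields plus the 1-bit flag.
Coord decode_word(std::uint16_t w)
{
    const int first  = (w >> 11) & 0x1f;  // top five bits
    const int second = (w >> 6) & 0x1f;   // middle five bits
    const int third  = (w >> 1) & 0x1f;   // lowest five bits above the flag
    if (w & 1)                            // flag set: order is rotated
        return Coord{second, third, first};
    return Coord{first, second, third};
}

// Apply decode_word to every word in the buffer.
std::vector<Coord> decode_all(const std::vector<std::uint16_t>& words)
{
    std::vector<Coord> out;
    out.reserve(words.size());
    for (std::uint16_t w : words)
        out.push_back(decode_word(w));
    return out;
}
```

For 0x9C07 the fields are 19, 16, 3 with the flag set, so decode_word yields x=16 y=3 z=19, matching the example in the question.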

8-digit BCD check

I've a 8-digit BCD number and need to check it out to see if it is a valid BCD number. How can I programmatically (C/C++) make this?
Ex: 0x12345678 is valid, but 0x00f00abc isn't.
Thanks in advance!
You need to check each 4-bit quantity to make sure it's less than 10. For efficiency you want to work on as many bits as you can at a single time.
Here I break the digits apart to leave a zero between each one, then add 6 to each and check for overflow.
uint32_t highs = (value & 0xf0f0f0f0) >> 4;
uint32_t lows = value & 0x0f0f0f0f;
bool invalid = (((highs + 0x06060606) | (lows + 0x06060606)) & 0xf0f0f0f0) != 0;
Edit: actually we can do slightly better. It doesn't take 4 bits to detect overflow, only 1. If we divide all the digits by 2, it frees a bit and we can check all the digits at once.
uint32_t halfdigits = (value >> 1) & 0x77777777;
bool invalid = ((halfdigits + 0x33333333) & 0x88888888) != 0;
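Wrapped up as a function, the halved-digit check could be sketched like this. The function name is mine, and the comments spell out why adding 3 after halving is equivalent to adding 6 before.

```cpp
#include <cstdint>

// Returns true if any 4-bit digit of value is 10..15.
// The function name is illustrative, not from the answer above.
bool has_invalid_bcd_digit(std::uint32_t value)
{
    // Halve every digit: each nibble now holds 0..7, so its top bit is free.
    const std::uint32_t half = (value >> 1) & 0x77777777u;
    // digit >= 10  <=>  digit/2 >= 5  <=>  digit/2 + 3 >= 8,
    // which sets the freed top bit without carrying into the next nibble.
    return ((half + 0x33333333u) & 0x88888888u) != 0;
}
```

With the values from the question, 0x12345678 gives false and 0x00f00abc gives true.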
The obvious way to do this is:
/* returns 1 if x is valid BCD */
int
isvalidbcd (uint32_t x)
{
for (; x; x = x>>4)
{
if ((x & 0xf) >= 0xa)
return 0;
}
return 1;
}
This link tells you all about BCD, and recommends something like this as a more optimised solution (reworked to check all the digits, hence the 64-bit data type; untested):
/* returns 1 if x is valid BCD */
int
isvalidbcd (uint32_t x)
{
/* adding 6 to an invalid digit carries into the next nibble; the XOR exposes those carries */
return ((((uint64_t)x + 0x66666666ULL) ^ (uint64_t)x) & 0x111111110ULL) == 0;
}
For a digit to be invalid, it needs to be 10-15. That in turn means 8 + 4 or 8+2 - the low bit doesn't matter at all.
So:
long mask8 = value & 0x88888888;
long mask4 = value & 0x44444444;
long mask2 = value & 0x22222222;
return ((mask8 >> 2) & ((mask4 >> 1) | mask2)) == 0;
Slightly less obvious:
long mask8 = (value>>2);
long mask42 = value | (value >> 1);
return (mask8 & mask42 & 0x22222222) == 0;
By shifting before masking, we don't need 3 different masks.
Inspired by @Mark Ransom
bool invalid = (0x88888888 & (((value & 0xEEEEEEEE) >> 1) + (0x66666666 >> 1))) != 0;
// or
bool valid = !((((value & 0xEEEEEEEEu) >> 1) + 0x33333333) & 0x88888888);
Mask off each BCD digit's 1's place, shift right, then add 6 and check for BCD digit overflow.
How this works:
By adding 6 to each digit, we look for an overflow (marked *) out of the 4-bit sum.
abcd
+ 110
-----
*efgd
But the bit value of d does not contribute to the sum, so first mask off that bit and shift right. Now the overflow bit is in the 8's place. This all is done in parallel and we mask these carry bits with 0x88888888 and test if any are set.
0abc
+ 11
-----
*efg
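Since several of these one-liners are easy to get subtly wrong, it may be worth checking them against the plain per-digit loop. Below is a sketch of how that could be done. The function names are mine, and the two parallel versions are my restatements of the mask-based and add-based ideas above using unsigned 32-bit types.

```cpp
#include <cstdint>

// Reference: test each 4-bit digit in turn.
bool valid_by_loop(std::uint32_t x)
{
    for (; x != 0; x >>= 4)
        if ((x & 0xfu) >= 0xau)
            return false;
    return true;
}

// A digit is invalid when its 8 bit is set together with its 4 bit or 2 bit.
bool valid_by_masks(std::uint32_t x)
{
    const std::uint32_t bit8  = x >> 2;        // 8's bit moved to the 2's place
    const std::uint32_t bit42 = x | (x >> 1);  // 2's bit OR 4's bit, at the 2's place
    return (bit8 & bit42 & 0x22222222u) == 0;
}

// Drop the 1's bit, halve, add 3, and look for a result of 8 or more per digit.
bool valid_by_add(std::uint32_t x)
{
    return ((((x & 0xEEEEEEEEu) >> 1) + 0x33333333u) & 0x88888888u) == 0;
}

// True when all three approaches give the same verdict for x.
bool variants_agree(std::uint32_t x)
{
    const bool ref = valid_by_loop(x);
    return valid_by_masks(x) == ref && valid_by_add(x) == ref;
}
```

Calling variants_agree over a range of inputs (for example every 16-bit pattern repeated in both halves of the word) is a cheap way to gain confidence before picking one of the parallel versions.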

checksum code in C++

Can someone please explain what this code is doing? I have to interpret this code and use it as a checksum, but I am not sure it is entirely correct. In particular, how do the overflows work, and what do *cp, const char* cp and sum & 0xFFFF mean? The basic idea was to take a string as input from the user and convert it to binary form 16 bits at a time, then sum all the 16-bit values together (in binary) to get a 16-bit sum. If there is any overflow bit in the addition, add it to the LSB of the final sum. Then take the ones' complement of the result.
How close is this code to doing the above?
unsigned int packet::calculateChecksum()
{
unsigned int c = 0;
int i;
string j;
int k;
cout<< "enter a message" << message;
getline(cin, message) ; // Some string.
//std::string message =
std::vector<uint16_t> bitvec;
const char* cp = message.c_str()+1;
while (*cp) {
uint16_t bits = *(cp-1)>>8 + *(cp);
bitvec.push_back(bits);
cp += 2;
}
uint32_t sum=0;
uint16_t overflow=0;
uint32_t finalsum =0;
// Compute the sum. Let overflows accumulate in upper 16 bits.
for(auto j = bitvec.begin(); j != bitvec.end(); ++j)
sum += *j;
// Now fold the overflows into the lower 16 bits. Loop until no overflows.
do {
sum = (sum & 0xFFFF) + (sum >> 16);
} while (sum > 0xFFFF);
// Return the 1s complement sum in finalsum
finalsum = 0xFFFF & sum;
//cout<< "the finalsum is" << c;
c = finalsum;
return c;
}
I see several issues in the code:
cp is a pointer to the null-terminated char array holding the input message. The while (*cp) test is a problem because cp is incremented by 2 inside the loop body, so it can easily step over the terminating \0 (for example, when the input message has 2 characters), read past the end of the array, and possibly cause a segmentation fault.
*(cp) and *(cp-1) fetch two neighbouring characters (bytes) of the input message. But why is the two-byte word formed by *(cp-1)>>8 + *(cp)? I think it would make more sense to form the 16-bit word as (*(cp-1)<<8) + *(cp), i.e. the preceding character occupies the high byte and the following character the low byte. Note the parentheses: + binds more tightly than << and >>, so without them the shift amount becomes 8 + *(cp).
To answer your question, sum & 0xFFFF just yields a number whose upper 16 bits are zero and whose lower 16 bits are the same as in sum; 0xFFFF is a bit mask.
The funny thing is that even if the above code does not do exactly what you describe as the requirement, as long as the sending and receiving parties use the same piece of incorrect code, your checksum creation and verification will still pass, because both ends are consistent with each other :)
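For comparison, here is a minimal sketch of what the described algorithm could look like with those issues addressed. The function name is mine, padding an odd-length message with a zero byte is my assumption (the question doesn't say what should happen), and it assumes messages short enough that the 32-bit sum cannot overflow.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Sum the message as big-endian 16-bit words, fold the carries back in,
// and return the ones' complement. Name and padding rule are assumptions.
std::uint16_t string_checksum(const std::string& message)
{
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < message.size(); i += 2) {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        const std::uint32_t hi = static_cast<unsigned char>(message[i]);
        const std::uint32_t lo = (i + 1 < message.size())
            ? static_cast<unsigned char>(message[i + 1])
            : 0u;                          // odd length: pad with a zero byte
        sum += (hi << 8) | lo;             // first char in the high byte
    }
    while (sum > 0xffffu)                  // fold until nothing is left above bit 15
        sum = (sum & 0xffffu) + (sum >> 16);
    return static_cast<std::uint16_t>(~sum);
}
```

Indexing by position and checking against size() avoids stepping over the terminator, and the final ~ applies the ones' complement step that the question describes but the posted code leaves out.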

Grabbing n bits from a byte

I'm having a little trouble grabbing n bits from a byte.
I have an unsigned integer. Let's say our number in hex is 0x2A, which is 42 in decimal. In binary it looks like this: 0010 1010. How would I grab the first 5 bits which are 00101 and the next 3 bits which are 010, and place them into separate integers?
If anyone could help me that would be great! I know how to extract from one byte which is to simply do
int x = (number >> (8*n)) & 0xff // n being the # byte
which I saw on another post on stack overflow, but I wasn't sure on how to get separate bits out of the byte. If anyone could help me out, that'd be great! Thanks!
Integers are represented inside a machine as a sequence of bits; fortunately for us humans, programming languages provide a mechanism to show us these numbers in decimal (or hexadecimal), but that does not alter their internal representation.
You should review the bitwise operators &, |, ^ and ~ as well as the shift operators << and >>, which will help you understand how to solve problems like this.
The last 3 bits of the integer are:
x & 0x7
The five bits above those (bits 3 through 7, the top five bits of the byte) are:
x >> 3 // all but the last three bits
& 0x1F // the last five bits.
"grabbing" parts of an integer type in C works like this:
You shift the bits you want to the lowest position.
You use & to mask the bits you want - ones means "copy this bit", zeros mean "ignore"
So, in your example, let's say we have a number int x = 42;
first 5 bits:
(x >> 3) & ((1 << 5)-1);
or
(x >> 3) & 31;
To fetch the lower three bits:
(x >> 0) & ((1 << 3)-1)
or:
x & 7;
Say you want hi bits from the top, and lo bits from the bottom. (5 and 3 in your example)
top = (n >> lo) & ((1 << hi) - 1)
bottom = n & ((1 << lo) - 1)
Explanation:
For the top, first get rid of the lower bits (shift right), then mask what remains with an "all ones" mask (if you have a binary number like 0010000, subtracting one gives 0001111, with as many 1s as the original number had trailing 0s).
For the bottom it's the same, except that no initial shift is needed.
top = (42 >> 3) & ((1 << 5) - 1) = 5 & (32 - 1) = 5 = 00101b
bottom = 42 & ((1 << 3) - 1) = 42 & (8 - 1) = 2 = 010b
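The same two formulas as a small reusable function, as a sketch. Split and split_bits are names I picked, and it assumes both hi and lo are less than 32 so the shifts stay well defined.

```cpp
#include <cstdint>

struct Split {             // hypothetical result type
    std::uint32_t top;     // the hi bits sitting above the lo bits
    std::uint32_t bottom;  // the lowest lo bits
};

// Assumes hi < 32 and lo < 32.
Split split_bits(std::uint32_t n, unsigned hi, unsigned lo)
{
    const std::uint32_t hi_mask = (std::uint32_t{1} << hi) - 1;  // hi ones
    const std::uint32_t lo_mask = (std::uint32_t{1} << lo) - 1;  // lo ones
    return Split{(n >> lo) & hi_mask, n & lo_mask};
}
```

split_bits(0x2A, 5, 3) gives top = 5 (00101) and bottom = 2 (010), as in the worked lines above.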
You could use bitfields for this. Bitfields are special structs where you can specify variables in bits.
typedef struct {
unsigned char a:5;
unsigned char b:3;
} my_bit_t;
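unsigned char c = 0x2A; // 42, the value from the question
my_bit_t * n = (my_bit_t *)&c; // cast required; which bits end up in a and b is implementation-defined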
int first = n->a;
int sec = n->b;
Bit fields are described in more detail at http://www.cs.cf.ac.uk/Dave/C/node13.html#SECTION001320000000000000000
The charm of bit fields is, that you do not have to deal with shift operators etc. The notation is quite easy. As always with manipulating bits there is a portability issue.
int x = (number >> 3) & 0x1f;
will give you an integer whose lowest 5 bits are bits 7 through 3 of number, with zeros in all the other bits.
Similarly,
int y = number & 0x7;
will give you an integer whose lowest 3 bits are the lowest 3 bits of number, with zeros in the rest.
just get rid of the 8* in your code.
int input = 42;
int high3 = input >> 5;
int low5 = input & (32 - 1); // 32 = 2^5
bool isBit3On = input & 4; // 4 = 2^(3-1)

Concatenate binary numbers of different lengths

So I have 3 numbers. One is a char, and the other two are int16_t (also known as shorts, but according to a table I found shorts won't reliably be 16 bits).
I'd like to concatenate them together. So say that the values of them were:
10010001
1111111111111101
1001011010110101
I'd like to end up with a long long containing:
1001000111111111111111011001011010110101000000000000000000000000
Using some solutions I've found online, I came up with this:
long long result;
result = num1;
result = (result << 8) | num2;
result = (result << 24) | num3;
But it doesn't work; it gives me very odd numbers when it's decoded.
In case there's a problem with my decoding code, here it is:
char num1 = num & 0xff;
int16_t num2 = num << 8 & 0xffff;
int16_t num3 = num << 24 & 0xffff;
What's going on here? I suspect it has to do with the size of a long long, but I can't quite wrap my head around it and I want room for more numbers in it later.
To get the correct bit pattern as you requested, you should use:
result = num1;
result = (result << 16) | num2;
result = (result << 16) | num3;
result<<=24;
This will yield the exact bit pattern that you requested, 24 bits at the lsb-end left 0:
1001000111111111111111011001011010110101000000000000000000000000
For that last shift, you should only be shifting by 16, not by 24. 24 is the current length of your binary string, after the combination of num1 and num2. You need to make room for num3, which is 16 bits, so shift left by 16.
Edit:
Just realized the first shift is wrong too. That should be 16 also, for similar reasons.
Yes, you are overflowing the value that can be stored in a long. You can use an arbitrary-precision library such as GMP to store the big number.
If I understand correctly what you are doing, I would use:
result = num1;
result = (result << 16) | num2;
result = (result << 16) | num3;
num1out = (result >> 32) & 0xff;
num2out = (result >> 16) & 0xffff;
num3out = result & 0xffff;
The left shift during building is by the width of the next number to insert. The right shift on extraction is by the total number of bits the field was left shifted during building.
I have tested the above code. long long is wide enough for this task with the g++ compiler, and I believe many others.
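One thing to watch with signed inputs: a negative int16_t (such as 1111111111111101, which is -3) or a negative char is sign-extended when it is promoted for the |, which sets all the higher bits of the result. A sketch that sidesteps this by converting each field to its unsigned type first is below. The names pack, unpack and Unpacked are mine.

```cpp
#include <cstdint>

// Build the 40-bit value num1:num2:num3 in the low bits of a 64-bit integer.
std::uint64_t pack(char num1, std::int16_t num2, std::int16_t num3)
{
    // Convert to unsigned types first so negative values are not sign-extended.
    std::uint64_t result = static_cast<std::uint8_t>(num1);
    result = (result << 16) | static_cast<std::uint16_t>(num2);  // make room for 16 bits
    result = (result << 16) | static_cast<std::uint16_t>(num3);  // and 16 more
    return result;
}

struct Unpacked {  // hypothetical result type
    char num1;
    std::int16_t num2;
    std::int16_t num3;
};

// Reverse of pack: shift right by the number of bits below each field.
Unpacked unpack(std::uint64_t packed)
{
    return Unpacked{
        static_cast<char>((packed >> 32) & 0xff),
        static_cast<std::int16_t>((packed >> 16) & 0xffff),
        static_cast<std::int16_t>(packed & 0xffff)};
}
```

With the three values from the question, pack returns 0x91FFFD96B5. Shifting that left by a further 24 bits gives the left-aligned pattern shown in the question.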