checksum code in C++ - c++

Can someone please explain what this code is doing? I have to interpret this code and use it as a checksum, but I am not sure it is entirely correct. In particular, how do the overflows work, and what do *cp, const char* cp and sum & 0xFFFF mean? The basic idea is to take a string as input from the user and convert it to binary form 16 bits at a time, then add all of the 16-bit words together (in binary) to get a 16-bit sum. If the addition produces any overflow bits, add them to the LSB of the final sum. Finally, take the ones' complement of the result.
How close is this code to doing the above?
unsigned int packet::calculateChecksum()
{
unsigned int c = 0;
int i;
string j;
int k;
cout<< "enter a message" << message;
getline(cin, message) ; // Some string.
//std::string message =
std::vector<uint16_t> bitvec;
const char* cp = message.c_str()+1;
while (*cp) {
uint16_t bits = *(cp-1)>>8 + *(cp);
bitvec.push_back(bits);
cp += 2;
}
uint32_t sum=0;
uint16_t overflow=0;
uint32_t finalsum =0;
// Compute the sum. Let overflows accumulate in upper 16 bits.
for(auto j = bitvec.begin(); j != bitvec.end(); ++j)
sum += *j;
// Now fold the overflows into the lower 16 bits. Loop until no overflows.
do {
sum = (sum & 0xFFFF) + (sum >> 16);
} while (sum > 0xFFFF);
// Return the 1s complement sum in finalsum
finalsum = 0xFFFF & sum;
//cout<< "the finalsum is" << c;
c = finalsum;
return c;
}

I see several issues in the code:
cp is a pointer into the zero-terminated char array holding the input message. The while (*cp) test is a problem because cp is incremented by 2 inside the loop body, so it is fairly easy to step over the terminating \0 of the array (e.g. when the input message has 2 characters) and read past the end, which may result in a segmentation fault.
*(cp) and *(cp-1) fetch two neighbouring characters (bytes) of the input message. But why is the two-byte word formed by *(cp-1)>>8 + *(cp)? I think it would make sense to form the 16-bit word as (*(cp-1)<<8) + *(cp), i.e. the preceding character sits in the high byte and the following character sits in the low byte of the 16-bit word. Note the parentheses: + binds more tightly than << and >>, so without them the shift count becomes 8 + *(cp).
To answer your question: sum & 0xFFFF yields a number whose upper 16 bits are zero and whose lower 16 bits are the same as in sum. 0xFFFF is a bit mask.
The funny thing is that even if the above code does not do exactly what you described as the requirement, as long as the sending and receiving parties use the same piece of incorrect code, checksum creation and verification will still pass, because both ends are consistent with each other :)
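For comparison, here is a minimal sketch of the procedure described in the question (16-bit words, end-around carry, then ones' complement). The function name onesComplementChecksum and the zero padding of an odd trailing byte are my own assumptions, not something the question specifies. Note that the posted function returns the folded sum without inverting it, so the final complement step is missing there.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: ones' complement checksum over 16-bit words,
// first character of each pair in the high byte.
// Assumption: an odd trailing byte is padded with a zero low byte.
std::uint16_t onesComplementChecksum(const std::string& message)
{
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < message.size(); i += 2) {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        std::uint32_t hi = static_cast<unsigned char>(message[i]);
        std::uint32_t lo = (i + 1 < message.size())
                               ? static_cast<unsigned char>(message[i + 1])
                               : 0u;
        sum += (hi << 8) | lo;  // carries pile up in the upper 16 bits
    }
    // Fold the carries back into the low 16 bits until none remain.
    while (sum > 0xFFFF) {
        sum = (sum & 0xFFFF) + (sum >> 16);
    }
    // Ones' complement of the folded sum, truncated to 16 bits.
    return static_cast<std::uint16_t>(~sum & 0xFFFF);
}
```

Indexing by position instead of walking a pointer two bytes at a time avoids stepping over the terminator. The 32-bit accumulator is enough for messages of up to roughly 65536 words; a longer message would need a wider accumulator or folding inside the loop.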

Related

Arduino code: shifting bits seems to change data type from int to long

on my Arduino, the following code produces output I don't understand:
void setup(){
Serial.begin(9600);
int a = 250;
Serial.println(a, BIN);
a = a << 8;
Serial.println(a, BIN);
a = a >> 8;
Serial.println(a, BIN);
}
void loop(){}
The output is:
11111010
11111111111111111111101000000000
11111111111111111111111111111010
I do understand the first line: leading zeros are not printed to the serial terminal. However, after shifting the bits the data type of a seems to have changed from int to long (32 bits are printed). The expected behaviour is that bits are shifted to the left, and that bits which are shifted "out" of the 16 bits an int has are simply dropped. Shifting the bits back does not turn the "32bit" variable to "16bit" again.
Shifting by 7 or less positions does not show this effect.
I probably should say that I am not using the Arduino IDE, but the Makefile from https://github.com/sudar/Arduino-Makefile.
What is going on? I almost expect this to be "normal", but I don't get it. Or is it something in the printing routine which simply adds 16 "1"'s to the output?
Enno
In addition to the other answers: integers might be stored in 16 bits or 32 bits depending on which Arduino you have.
The function printing numbers in Arduino is defined in /arduino-1.0.5/hardware/arduino/cores/arduino/Print.cpp
size_t Print::printNumber(unsigned long n, uint8_t base) {
char buf[8 * sizeof(long) + 1]; // Assumes 8-bit chars plus zero byte.
char *str = &buf[sizeof(buf) - 1];
*str = '\0';
// prevent crash if called with base == 1
if (base < 2) base = 10;
do {
unsigned long m = n;
n /= base;
char c = m - base * n;
*--str = c < 10 ? c + '0' : c + 'A' - 10;
} while(n);
return write(str);
}
All other functions rely on this one, so yes your int gets promoted to an unsigned long when you print it, not when you shift it.
However, the library is correct. Shifting left by 8 positions sets the sign bit of the integer to '1', so when the integer value is promoted to unsigned long the runtime correctly pads it with 16 extra '1's instead of '0's.
If you are using such a value not as a number but to contain some flags, use unsigned int instead of int.
ETA: for completeness, I'll add further explanation for the second shifting operation.
Once the sign bit of the int is set, shifting to the right makes the runtime pad the number with '1's in order to preserve its negative value. Shifting to the right by k positions corresponds to dividing the number by 2^k, and since the number is negative to start with, the result must remain negative.
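The widening step can be reproduced off the board with fixed-width types. This is only a sketch: int16_t and uint32_t stand in for the 16-bit int and 32-bit unsigned long of an AVR-based Arduino (an assumption about the board), and the two function names are made up for illustration.

```cpp
#include <cstdint>

// What the print routine receives when it is handed a signed 16-bit value:
// the conversion to the wider unsigned type sign-extends negative numbers.
std::uint32_t widenSigned(std::int16_t value)
{
    return static_cast<std::uint32_t>(value);
}

// The same widening for an unsigned 16-bit value pads with zeros instead.
std::uint32_t widenUnsigned(std::uint16_t value)
{
    return static_cast<std::uint32_t>(value);
}
```

The bit pattern 0xFA00 (250 shifted left by 8) read as a signed 16-bit number is -1536, which widens to 0xFFFFFA00, the second line of the output above. The same pattern held in an unsigned 16-bit variable widens to 0x0000FA00.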

How to random flip binary bit of char in C/C++

If I have a char array A, I use it to store hex
A = "0A F5 6D 02" size=11
The binary representation of this char array is:
00001010 11110101 01101101 00000010
Is there any function that can flip bits at random?
That is, if the parameter is 5:
00001010 11110101 01101101 00000010
-->
10001110 11110001 01101001 00100010
It randomly chooses 5 bits to flip.
I am trying to convert this hex data to binary data and use a bitmask to achieve this, then turn it back into hex. Is there a quicker way to do this?
Sorry, my description of the question was not clear enough. Put simply, I have some hex data and I want to simulate bit errors in it. For example, if I have 5 bytes of hex data:
"FF00FF00FF"
binary representation is
"1111111100000000111111110000000011111111"
If the bit error rate is 10%, then I want 4 of these 40 bits to be in error. One extreme random result would be errors in the first 4 bits:
"0000111100000000111111110000000011111111"
First of all, find out which char the bit belongs to (param is the index of the bit to flip):
char *byteToWrite = &A[sizeof(A) - (param / 8) - 1];
That gives you a pointer to the char at that array offset (the -1 accounts for zero-based indexing versus the size).
Then use the modulus (or more bit shifting if you're feeling adventurous) to find out which bit in that char to flip:
*byteToWrite ^= (1u << param % 8);
So for a param of 5, that should toggle the 5th bit of the byte at A[10].
Store the values of 2^n in an array.
Seed a random number generator.
Loop x times (in this case 5) and do data ^= stored_values[random_num].
As an alternative to storing the 2^n values in an array, you could shift 1 by a random amount, like:
data ^= (1 << (random % 8))
Reflecting the first comment, you really could just write out that line 5 times in your function and avoid the overhead of a for loop entirely.
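You have a 32-bit number. You can treat the bits as parts of the number and just XOR this number with some random number that has exactly 5 bits set.
#include <cstdio>
#include <cstdlib>
#include <cstring>
// Count the set bits of a 32-bit value by summing neighbouring bit groups.
int count_1s(unsigned int foo)
{
unsigned int m = 0x55555555;
unsigned int r = (foo&m) + ((foo>>1)&m);
m = 0x33333333;
r = (r&m) + ((r>>2)&m);
m = 0x0F0F0F0F;
r = (r&m) + ((r>>4)&m);
m = 0x00FF00FF;
r = (r&m) + ((r>>8)&m);
m = 0x0000FFFF;
return (r&m) + ((r>>16)&m);
}
int main()
{
char input[] = "0A F5 6D 02";
unsigned char data[4] = {};
// parse the four hex bytes out of the string
sscanf(input, "%2hhx %2hhx %2hhx %2hhx", &data[0], &data[1], &data[2], &data[3]);
unsigned int x;
memcpy(&x, data, sizeof x); // copy the bytes into one 32-bit number
// note: rand() may deliver as few as 15 random bits (RAND_MAX can be 32767)
unsigned int y = rand();
while(count_1s(y) != 5)
{
y = rand(); // let's have this more random
}
x ^= y; // flip the 5 chosen bits
memcpy(data, &x, sizeof x);
printf("%02x %02x %02x %02x\n", data[0], data[1], data[2], data[3]);
return 0;
}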
I see no reason to convert the entire string back and forth from and to hex notation. Just pick a random character out of the hex string, convert this to a digit, change it a bit, convert back to hex character.
In plain C:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int main (void)
{
char *hexToDec_lookup = "0123456789ABCDEF";
char hexstr[] = "0A F5 6D 02";
/* 0. make sure we're fairly random */
srand(time(0));
/* 1. loop 5 times .. */
int i;
for (i=0; i<5; i++)
{
/* 2. pick a random hex digit
we know it's one out of 8, grouped per 2 */
int hexdigit = rand() & 7;
hexdigit += (hexdigit>>1);
/* 3. convert the digit to binary */
int hexvalue = hexstr[hexdigit] > '9' ? hexstr[hexdigit] - 'A'+10 : hexstr[hexdigit]-'0';
/* 4. flip a random bit */
hexvalue ^= 1 << (rand() & 3);
/* 5. write it back into position */
hexstr[hexdigit] = hexToDec_lookup[hexvalue];
printf ("[%s]\n", hexstr);
}
return 0;
}
It might even be possible to omit the convert-to-and-from-ASCII steps -- flip a bit in the character string, check if it's still a valid hex digit and if necessary, adjust.
First, randomly choose x positions (each position consists of an array index and a bit position).
Now, to flip the i-th bit from the right of a non-negative number n, divide n by 2^i and check whether the quotient is odd:
code:
int divisor = 1 << i; // 2^i
int quotient = n / divisor;
int bit = quotient % 2; // the i-th bit from the right (i = 0 is the LSB)
n = bit ? n - divisor : n + divisor; // clear the bit if it was set, set it otherwise
Take the input bit index mod 8 (e.g. 5 % 8).
Shift 0x80 to the right by that value (e.g. 5).
XOR the result with the (input / 8)-th element of your character array.
code:
void flip_bit(int bit)
{
Array[bit/8] ^= (0x80>>(bit%8));
}
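One thing the loop-based suggestions above do not guarantee is that the chosen bits are distinct. Picking the same position twice flips the bit back, so fewer than 5 bits end up changed. A possible sketch that shuffles all bit positions and flips the first few is below. The name flipRandomBits and the use of std::mt19937 are my own choices, and it keeps the 0x80 >> (bit % 8) convention where bit 0 is the most significant bit of the first byte.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper: flip exactly `count` distinct, randomly chosen bits.
// Bit 0 is the most significant bit of data[0].
void flipRandomBits(std::vector<std::uint8_t>& data, std::size_t count,
                    std::mt19937& rng)
{
    // List every bit position once, then shuffle the list.
    std::vector<std::size_t> positions(data.size() * 8);
    std::iota(positions.begin(), positions.end(), std::size_t{0});
    std::shuffle(positions.begin(), positions.end(), rng);

    // Never try to flip more bits than exist.
    count = std::min(count, positions.size());
    for (std::size_t k = 0; k < count; ++k) {
        std::size_t bit = positions[k];
        data[bit / 8] ^= static_cast<std::uint8_t>(0x80u >> (bit % 8));
    }
}
```

For a bit error rate, count would be the number of bits times the rate (e.g. 40 bits at 10% gives 4). Shuffling every position costs time proportional to the message size, so for very long messages and few errors it may be cheaper to draw positions one at a time and reject repeats.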

Sum of binary numbers in C++ and overflow bit?

I need help with adding the 16 bits that are concatenated in 'bits'. Every time a set of 16 bits is concatenated, I want them to be added (binary addition) to an array...till all sets of 16 are complete in my string. If there is an overflow, length of final sum >16...then add that extra bit to the final sum as 0000000000000001 (where 1 is the 16th bit).
For a string entered: "hello"
std::vector<std::string> bitvec;
std::string bits;
for (int i = 0; i < s.size(); i += 2) {
bits = std::bitset<8>(s[i]).to_string() + std::bitset<8>(s[i + 1]).to_string();
bitvec.push_back(bits);
}
Possible problems:
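If s holds "hello", then std::bitset<8>(s[i]) is built from the integer value of the character s[i] (the char converts to the integer constructor), so it holds that character's 8-bit code. Only the string constructor expects a string containing nothing but "1"s and "0"s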
Once your bitsets are initialized properly, you can't add them together by using the to_string() function, that will just concatenate the representations: "1011" + "1100" will become "10111100"
Oh, wait, maybe that's what you do want.
It sort of sounds like you are inventing a complicated way to sum the pairs of ascii values interpreted as 16 bit numbers, but it's not clear. Your code is roughly equivalent to something like:
std::vector<uint16_t> bitvec;
const char* cp = s.c_str();
while (*cp) {
// first character in the high byte; for an odd length the terminating '\0' fills the low byte
uint16_t bits = (static_cast<unsigned char>(cp[0]) << 8) + static_cast<unsigned char>(cp[1]);
bitvec.push_back(bits);
cp += cp[1] ? 2 : 1; // advance, but never step past the terminator
}
//sum over the numbers contained in bitvec here?
uint32_t sum=0;
for(std::vector<uint16_t>::iterator j=bitvec.begin();j!=bitvec.end();++j) {
sum += *j;
uint16_t overflow = sum>>16; //capture the overflow bit, move it back to lsb
sum &= (1<<16)-1; //clear the overflow
sum += overflow; //add it back as lsb
}
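If you do want to stay with the bit strings from the question, here is a rough sketch of how they can be built and then turned back into numbers for the addition. As far as I can tell, std::bitset<8> constructed from a character value holds that character's code, and std::bitset<16> constructed from a string of "0"s and "1"s can be read back with to_ulong(). The names toBitStrings and toWord are invented for this example, and the zero padding of an odd-length input is an assumption.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: one 16-character bit string per pair of characters.
// Assumption: an odd-length input is padded with a zero byte.
std::vector<std::string> toBitStrings(const std::string& s)
{
    std::vector<std::string> out;
    for (std::size_t i = 0; i < s.size(); i += 2) {
        unsigned hi = static_cast<unsigned char>(s[i]);
        unsigned lo = (i + 1 < s.size()) ? static_cast<unsigned char>(s[i + 1]) : 0u;
        out.push_back(std::bitset<8>(hi).to_string() + std::bitset<8>(lo).to_string());
    }
    return out;
}

// Hypothetical helper: turn a 16-character bit string back into a number.
std::uint16_t toWord(const std::string& bits)
{
    return static_cast<std::uint16_t>(std::bitset<16>(bits).to_ulong());
}
```

For "hello" this gives three strings, the first being "0110100001100101" ('h' is 0x68, 'e' is 0x65). The numbers from toWord can then be summed with the overflow handling shown above, which is simpler than adding the strings digit by digit.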

Checksum calculation - two’s complement sum of all bytes

I have instructions on creating a checksum of a message described like this:
The checksum consists of a single byte equal to the two’s complement sum of all bytes starting from the “message type” word up to the end of the message block (excluding the transmitted checksum). Carry from the most significant bit is ignored.
Another description I found was:
The checksum value contains the twos complement of the modulo 256 sum of the other words in the data message (i.e., message type, message length, and data words). The receiving equipment may calculate the modulo 256 sum of the received words and add this sum to the received checksum word. A result of zero generally indicates that the message was correctly received.
I understand this to mean that I sum the values of all bytes in the message (excluding the checksum), take this number modulo 256, and then take the two's complement of the result; that is my checksum.
But I am having trouble with an example message (taken from the design doc, so I must assume it has been encoded correctly).
unsigned char arr[] = {0x80,0x15,0x1,0x8,0x30,0x33,0x31,0x35,0x31,0x30,0x33,0x30,0x2,0x8,0x30,0x33,0x35,0x31,0x2d,0x33,0x32,0x31,0x30,0xe};
So the last byte, 0xE, is the checksum. My code to calculate the checksum is as follows:
bool isMsgValid(unsigned char arr[], int len) {
int sum = 0;
for(int i = 0; i < (len-1); ++i) {
sum += arr[i];
}
//modulo 256 sum
sum %= 256;
char ch = sum;
//twos complement
unsigned char twoscompl = ~ch + 1;
return arr[len-1] == twoscompl;
}
int main(int argc, char* argv[])
{
unsigned char arr[] = {0x80,0x15,0x1,0x8,0x30,0x33,0x31,0x35,0x31,0x30,0x33,0x30,0x2,0x8,0x30,0x33,0x35,0x31,0x2d,0x33,0x32,0x31,0x30,0xe};
int arrsize = sizeof(arr) / sizeof(arr[0]);
bool ret = isMsgValid(arr, arrsize);
return 0;
}
The spec is here: http://www.sinet.bt.com/227v3p5.pdf
I assume I have misunderstood the algorithm required. Any idea how to create this checksum?
Flippin spec writer made a mistake in their data example. Just spotted this then came back on here and found others spotted too. Sorry if I wasted your time. I will study responses because it looks like some useful comments for improving my code.
You miscopied the example message from the pdf you linked. The second parameter length is 9 bytes, but you used 0x08 in your code.
The document incorrectly states "8 bytes" in the third column when there are really 9 bytes in the parameter. The second column correctly states "00001001".
In other words, your test message should be:
{0x80,0x15,0x1,0x8,0x30,0x33,0x31,0x35,0x31,0x30,0x33,0x30, // param1
0x2,0x9,0x30,0x33,0x35,0x31,0x2d,0x33,0x32,0x31,0x30,0xe} // param2
^^^
With the correct message array, ret == true when I try your program.
Agree with the comment: looks like the checksum is wrong. Where in the .PDF is this data?
Some general tips:
Use an unsigned type as the accumulator; that gives you well-defined behavior on overflow, and you'll need that for longer messages. Similarly, if you store the result in a char variable, make it unsigned char.
But you don't need to store it; just do the math with an unsigned type, complement the result, add 1, and mask off the high bits so that you get an 8-bit result.
Also, there's a trick here, if you're on hardware that uses twos-complement arithmetic: just add all of the values, including the checksum, then mask off the high bits; the result will be 0 if the input was correct.
The receiving equipment may calculate the modulo 256 sum of the received words and add this sum to the received checksum word.
It's far easier to use this condition to understand the checksum:
{byte 0} + {byte 1} + ... + {last byte} + {checksum} = 0 mod 256
{checksum} = -( {byte 0} + {byte 1} + ... + {last byte} ) mod 256
As the others have said, you really should use unsigned types when working with individual bits. This is also true when doing modular arithmetic. If you use signed types, you leave yourself open to a rather large number of sign-related mistakes. OTOH, pretty much the only mistake you open yourself up to using unsigned numbers is things like forgetting 2u-3u is a positive number.
(do be careful about mixing signed and unsigned numbers together: there are a lot of subtleties involved in that too)
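Putting the tips above together, a minimal sketch with unsigned arithmetic throughout could look like this. The function names are mine. The validity check uses the "everything sums to 0 mod 256" condition rather than recomputing the checksum and comparing.

```cpp
#include <cstddef>
#include <cstdint>

// Two's complement of the modulo-256 sum of the first `len` bytes.
std::uint8_t twosComplementChecksum(const unsigned char* data, std::size_t len)
{
    unsigned int sum = 0;
    for (std::size_t i = 0; i < len; ++i) {
        sum += data[i];
    }
    // Negate in unsigned arithmetic, then keep only the low 8 bits.
    return static_cast<std::uint8_t>((~sum + 1u) & 0xFFu);
}

// A message whose last byte is the checksum is valid when all of its
// bytes, checksum included, add up to 0 modulo 256.
bool isMessageValid(const unsigned char* msg, std::size_t len)
{
    if (len == 0) {
        return false;
    }
    unsigned int sum = 0;
    for (std::size_t i = 0; i < len; ++i) {
        sum += msg[i];
    }
    return (sum & 0xFFu) == 0;
}
```

With the corrected example message (second parameter length 0x9), the first 23 bytes sum to 1010, which is 0xF2 modulo 256, and the two's complement of that is 0x0E, matching the transmitted checksum. With the miscopied 0x8 the computed checksum would be 0x0F.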

How to write individual bytes to file in C++

Given that I generate a string of random length containing "0" and "1" characters, how can I write the data to a file as bits instead of ASCII text?
Given my random string has 12 bits, I know that I should write 2 bytes (or add 4 more 0 bits to make 16 bits) in order to write the 1st byte and the 2nd byte.
Regardless of the size, given I have an array of char[8] or int[8] or a string, how can I write each individual group of bits as one byte in the output file?
I've googled a lot everywhere (it's my 3rd day looking for an answer) and didn't understand how to do it.
Thank you.
You don't do I/O with an array of bits.
Instead, you do two separate steps. First, convert your array of bits to a number. Then, do binary file I/O using that number.
For the first step, the types uint8_t and uint16_t found in <stdint.h> and the bit manipulation operators << (shift left) and | (or) will be useful.
You haven't said what API you're using, so I'm going to assume you're using I/O streams. To write data to the stream just do this:
f.write(buf, len);
You can't write single bits, the best granularity you are going to get is bytes. If you want bits you will have to do some bitwise work to your byte buffer before you write it.
If you want to pack your 8 element array of chars into one byte you can do something like this:
char data[8] = ...;
char byte = 0;
for (unsigned i = 0; i != 8; ++i)
{
byte |= (data[i] & 1) << i;
}
f.put(byte);
If data contains ASCII '0' or '1' characters rather than actual 0 or 1 bits replace the |= line with this:
byte |= (data[i] == '1') << i;
Make an unsigned char out of the bits in an array:
unsigned char make_byte(char input[8]) {
unsigned char result = 0;
for (int i=0; i<8; i++)
if (input[i] != '0')
result |= (1 << i);
return result;
}
This assumes input[0] should become the least significant bit in the byte, and input[7] the most significant.
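For a string of arbitrary length, the two steps (pack, then write) might be combined roughly as follows. This is a sketch under my own assumptions: the names packBits and writeBytes are invented, the first character of the string becomes the most significant bit of the first byte (the opposite of the LSB-first order used just above), and an incomplete last byte is padded with zero bits, as the question suggests.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: pack a string of '0'/'1' characters into bytes,
// most significant bit first, padding the last byte with zero bits.
std::vector<std::uint8_t> packBits(const std::string& bits)
{
    std::vector<std::uint8_t> bytes((bits.size() + 7) / 8, 0);
    for (std::size_t i = 0; i < bits.size(); ++i) {
        if (bits[i] == '1') {
            bytes[i / 8] |= static_cast<std::uint8_t>(0x80u >> (i % 8));
        }
    }
    return bytes;
}

// Hypothetical helper: write the packed bytes to a file in binary mode.
bool writeBytes(const std::string& path, const std::vector<std::uint8_t>& bytes)
{
    std::ofstream f(path, std::ios::binary);
    f.write(reinterpret_cast<const char*>(bytes.data()),
            static_cast<std::streamsize>(bytes.size()));
    return static_cast<bool>(f);
}
```

A 12-bit string such as "101100001111" packs into the two bytes 0xB0 and 0xF0. To read the data back you also need to store the original bit count somewhere, since the padding bits cannot be told apart from real zeros.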