How can I concatenate bytes? For example:
I have a byte array, BYTE buffer[2] = {0x00, 0x02}, and I want to concatenate these two bytes, but backwards,
something like this:
0x0200 <---
and later convert those bytes to decimal: 0x0200 = 512.
But I don't know how to do this in C, because I can't use memcpy or strcat, since buffer is a BYTE array and not a CHAR array; I don't even know whether it can be done at all.
Can somebody help me with some code, or explain how I can concatenate bytes and convert the result to decimal?
I also have another byte array, buff = {0x00, 0x00, 0x0C, 0x00, 0x00, 0x00}, and I need to do the same with it.
Help, please.
Regards.
BYTE is not a standard type and is probably a typedef for unsigned char. Here, I'll use the definitions from <stdint.h>, which provides integer types of specified bit widths; a byte there is uint8_t.
Concatenating two bytes "backwards" is easy if you think about it:
uint8_t buffer[2] = {0x00, 0x02};
uint16_t x = buffer[1] * 256 + buffer[0];
It isn't called backwards, by the way, but Little Endian byte order. The opposite would be Big Endian, where the most significant byte comes first:
uint16_t x = buffer[0] * 256 + buffer[1];
Then, there's no such thing as "converting to decimal". Internally, all numbers are binary. You can print them as decimal numbers or as hexadecimal numbers or as numbers of any base or even as Roman numerals if you like, but it's still the same number:
printf("dec: %u\n", x); // prints 512
printf("hex: %x\n", x); // prints 200
Now let's look at what happens for byte arrays of any length:
uint8_t buffer[4] = {0x11, 0x22, 0x33, 0x44};
uint32_t x = buffer[3] * 256 * 256 * 256
           + buffer[2] * 256 * 256
           + buffer[1] * 256
           + buffer[0];
See a pattern? You can rewrite this as:
uint32_t x = (((buffer[3]) * 256
             + buffer[2]) * 256
             + buffer[1]) * 256
             + buffer[0];
You can convert this logic to a function easily:
uint64_t int_little_endian(uint8_t *arr, size_t n)
{
    uint64_t res = 0ul;
    while (n--) res = res * 256 + arr[n];
    return res;
}
Likewise for Big Endian, where you move "forward":
uint64_t int_big_endian(uint8_t *arr, size_t n)
{
    uint64_t res = 0ul;
    while (n--) res = res * 256 + *arr++;
    return res;
}
Lastly, code that deals with byte conversions usually doesn't use the arithmetic operations of multiplication and addition, but so-called bit-wise operators. A multiplication by 2 is represented by shifting all bits of a number left by one. (Much as a multiplication by 10 in decimal is done by shifting all digits left by one and appending a zero.) Our multiplication by 256 becomes a bit-shift of 8 bits to the left, which in C notation is x << 8.
Addition is done by applying the bit-wise or. These two operations are not identical, because the bit-wise or operates on individual bits and does not account for carries. In our case, where the added bit groups never overlap, they behave the same. Your Little-Endian conversion function now looks like this:
uint64_t int_little_endian(uint8_t *arr, size_t n)
{
    uint64_t res = 0ul;
    while (n--) res = res << 8 | arr[n];
    return res;
}
And if that doesn't look like some nifty C code, I don't know what does. (If these bitwise operators confuse you, leave them aside for now; in your example, you're fine with multiplication and addition.)
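Applied to the second buffer from your question, the little-endian version could be used like this (just a quick sketch; the 0x0C sits in the third byte, so the value is 0x0C0000 = 786432):
#include <stdint.h>
#include <stdio.h>
#include <inttypes.h>

uint64_t int_little_endian(uint8_t *arr, size_t n)
{
    uint64_t res = 0ul;
    while (n--) res = res << 8 | arr[n];
    return res;
}

int main(void)
{
    uint8_t buff[6] = {0x00, 0x00, 0x0C, 0x00, 0x00, 0x00};
    uint64_t x = int_little_endian(buff, sizeof buff);

    printf("dec: %" PRIu64 "\n", x);   /* prints 786432 */
    printf("hex: %" PRIx64 "\n", x);   /* prints c0000 */
    return 0;
}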
Related
I need to compare given text data with the checkSumCalculator method, and I try to send the data with the command method. I found the code and changed it according to my own needs, but I don't understand some parts.
How can the 0x00 hex char be increased by the given data? What is the point of ANDing check_data with 0xFF? And why is (check_data & 0xFF) subtracted from 0x100? I am very confused.
void Widget::command()
{
    std::string txt = "<DONE:8022ff";
    unsigned char check_sum = checkSumCalculator(&txt[0], txt.size());
    QString reply = QString::fromStdString(txt)
                    + QString("%1>").arg(check_sum, 2, 16, QChar('0'));
    emit finished(reply, true);
}
static unsigned char checkSumCalculator(void *data, int length)
{
    unsigned char check_data = 0x00;
    for (int i = 0; i < length; i++)
        check_data += ((unsigned char*)data)[i];
    check_data = (0x100 - (check_data & 0xFF)) & 0xFF;
    return check_data;
}
checkSumCalculator starts by adding together all the byte values in the data buffer. Because check_data has type unsigned char, this sum is done modulo 0x100 (256), one more than the maximum value an unsigned char can hold (0xFF = 255); the value is said to "wrap around" ((unsigned char)(0xFF + 1) is again 0).
These two lines:
check_data = (0x100 - (check_data & 0xFF)) & 0xFF;
return check_data;
are really more complicated than needed. All that would be needed is:
return -check_data;
That is, at the end it negates the value. Because the arithmetic is modulo 256, this is essentially the same as flipping the bits and adding 1 (-check_data = ~check_data + 1). This is instead implemented in a more convoluted way:
check_data & 0xFF doesn't do much, because it's a bitwise AND with all the possible bits that can be set in an unsigned char. The value is promoted to int (due to C's default integer promotions), where all the bits higher than the lower 8 are necessarily 0. So this is the same as plain check_data. Ultimately, this masking has no bearing on the result.
Subtracting from 0x100 is the same as -check_data, as far as the lower 8 bits are concerned (which is what we end up caring about).
The final & 0xFF is also redundant, because even though the expression was promoted to a wider type, it is converted back to unsigned char by the return.
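If you want to convince yourself that the long form and the plain negation agree, a small test like this (just a sketch reusing the string from the question; the two function names are made up) prints the same value twice:
#include <stdio.h>
#include <string.h>

/* the original, convoluted form */
static unsigned char checksum_original(const void *data, int length)
{
    unsigned char check_data = 0x00;
    for (int i = 0; i < length; i++)
        check_data += ((const unsigned char *)data)[i];
    check_data = (0x100 - (check_data & 0xFF)) & 0xFF;
    return check_data;
}

/* the simplified form: just negate the running sum */
static unsigned char checksum_negated(const void *data, int length)
{
    unsigned char sum = 0;
    for (int i = 0; i < length; i++)
        sum += ((const unsigned char *)data)[i];
    return (unsigned char)-sum;   /* same thing: 256 - sum, modulo 256 */
}

int main(void)
{
    const char *txt = "<DONE:8022ff";
    int len = (int)strlen(txt);
    /* both values printed are equal; adding either back to the byte sum gives 0 modulo 256 */
    printf("%02x %02x\n", checksum_original(txt, len), checksum_negated(txt, len));
    return 0;
}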
I am in the process of rewriting code from a Big Endian machine for a Little Endian machine.
Let's say there is a variable called a, a 32-bit integer that holds the current timestamp of a user request.
In Big Endian machine, right now the code is this way:
uint32 a = current_timestamp_of_user_request;
uint8 arr[3] = {0};
arr[0] = ((a >> (8 * 2)) & 0x000000FF);
arr[1] = ((a >> (8 * 1)) & 0x000000FF);
arr[2] = ((a >> (8 * 0)) & 0x000000FF);
Now, when I write the same logic for the little endian machine, can I use the same code (method a), or should I convert the code this way (let's call this method b)?
uint32 a = current_timestamp_of_user_request;
uint32 b = htonl(a);
uint8 arr[3] = {0};
arr[0] = ((b >> (8 * 2)) & 0x000000FF);
arr[1] = ((b >> (8 * 1)) & 0x000000FF);
arr[2] = ((b >> (8 * 0)) & 0x000000FF);
I wrote this program to verify:
#include <stdio.h>
#include <stdlib.h>
#include <arpa/inet.h>   /* for htonl() */

int main(void) {
    long int a = 3265973637;
    long int b = 0;
    int arr[3] = {0, 0, 0};

    arr[0] = ((a >> (8 * 2)) & 0x000000FF);
    arr[1] = ((a >> (8 * 1)) & 0x000000FF);
    arr[2] = ((a >> (8 * 0)) & 0x000000FF);
    printf("arr[0] = %d\t arr[1] = %d\t arr[2] = %d\n", arr[0], arr[1], arr[2]);

    b = htonl(a);
    arr[0] = ((b >> (8 * 2)) & 0x000000FF);
    arr[1] = ((b >> (8 * 1)) & 0x000000FF);
    arr[2] = ((b >> (8 * 0)) & 0x000000FF);
    printf("After htonl:\n");
    printf("arr[0] = %d\t arr[1] = %d\t arr[2] = %d\n", arr[0], arr[1], arr[2]);
    return 0;
}
Results:
Result with little endian machine:
bgl-srtg-lnx11: /scratch/nnandiga/test>./x86
arr[0] = 170 arr[1] = 205 arr[2] = 133
After htonl:
arr[0] = 205 arr[1] = 170 arr[2] = 194
Result with big endian machine:
arr[0] = 170 arr[1] = 205 arr[2] = 133
After htonl:
arr[0] = 170 arr[1] = 205 arr[2] = 133
It looks like, without conversion to big endian order, the same logic (without htonl()) gives exactly the same results when filling the array arr. Now, can you please tell me whether I should use htonl() or not, if I want the array to be the same on both little endian and big endian machines (the little endian result should exactly match the big endian result)?
Your code as originally written will do what you want on both big endian and little endian machines.
If for example the value of a is 0x00123456, then 0x12 goes in arr[0], 0x34 goes in arr[1], and 0x56 goes in arr[2]. This occurs regardless of what the endianness of the machine is.
When you use the >> and & operators, they operate on the value of the expression in question, not the representation of that value.
When you call htonl, you change the value to match a particular representation. So on a little endian machine htonl(0x00123456) will result in the value 0x56341200. Then when you operate on that value you get different results.
Where endianness matters is when the representation of a number using multiple bytes is read or written as bytes, i.e. to disk, over a network, or to/from a byte buffer.
For example, if you do this:
uint32_t a = 0x12345678;
...
write(fd, &a, sizeof(a));
Then the four bytes that a consists of are written to the file descriptor (be it a file or a socket) one at a time. A big endian machine will write 0x12, 0x34, 0x56, 0x78 in that order while a little endian machine will write 0x78, 0x56, 0x34, 0x12.
If you want the bytes to be written in a consistent order then you would first call a = htonl(a) before calling write. Then the bytes will always be written as 0x12, 0x34, 0x56, 0x78.
Because your code operates on the value and not the individual bytes of the value, you don't need to worry about endianness.
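To see the value/representation distinction in one small program (a sketch; the array name rep is made up), note that the shifted-out bytes come out the same on every machine, while the copied representation depends on the machine:
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    uint32_t a = 0x00123456;
    uint8_t arr[3], rep[4];   /* rep holds the raw in-memory representation */

    /* value-based extraction: identical on every machine */
    arr[0] = (a >> 16) & 0xFF;   /* 0x12 */
    arr[1] = (a >>  8) & 0xFF;   /* 0x34 */
    arr[2] = (a >>  0) & 0xFF;   /* 0x56 */

    /* representation-based copy: depends on the machine's byte order */
    memcpy(rep, &a, sizeof a);   /* 56 34 12 00 on little endian, 00 12 34 56 on big endian */

    printf("%02x %02x %02x\n", arr[0], arr[1], arr[2]);
    printf("%02x %02x %02x %02x\n", rep[0], rep[1], rep[2], rep[3]);
    return 0;
}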
You should use htonl(). On a big-endian machine this does nothing, it just returns the original value. On a little-endian machine it swaps the bytes appropriately. So by using this, you don't have to concern yourself with the endian-ness of the machine, you can use the same code after calling it.
I have a CFBitVector that looks like '100000000000000'
I pass a byte array to CFBitVectorGetBits, which fills it with the values from this CFBitVector. After this call, the two bytes look like:
bytes[0] == '0x80'
bytes[1] == '0x00'
This is exactly what I would expect. However, when copying the contents of bytes to the unsigned int bytesValue, the value is 128 when it should be 32768. The decimal value 128 is represented by the hex value 0x0080. Essentially, it seems that the byte order is reversed when performing the memcpy. What is going on here? Is this just an issue with endianness?
Thanks
CFMutableBitVectorRef bitVector = CFBitVectorCreateMutable(kCFAllocatorDefault, 16);
CFBitVectorSetCount(bitVector, 16);
CFBitVectorSetBitAtIndex(bitVector, 0, 1);
CFRange range = CFRangeMake(0, 16);
Byte bytes[2] = {0,0};
unsigned int bytesValue = 0;
CFBitVectorGetBits(bitVector, range, bytes);
memcpy(&bytesValue, bytes, sizeof(bytes));
return bytesValue;
What is going on here? Is this just an issue with endianness?
Yes.
Your computer is little endian. The 16-bit value 32768 would be represented in-memory as:
00 80
On a little endian machine. You have:
80 00
Which is the opposite, representing 128 as you're seeing.
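If the goal is to always get 32768 from those two bytes, regardless of the machine's byte order, one option is to assemble the value from the bytes explicitly instead of memcpy'ing the raw representation, e.g. (a small sketch):
/* bytes[0] is treated as the most significant byte, independent of host endianness */
unsigned int bytesValue = ((unsigned int)bytes[0] << 8) | bytes[1];
return bytesValue;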
I'm attempting to implement circular bit-shifting in C++. It kind of works, except after a certain point I get a bunch of zeroes.
for (int n = 0; n < 12; n++) {
    unsigned char x = 0x0f;
    x = ((x << n) | (x >> (8 - n))); //chars are 8 bits
    cout << hex << "0x" << (int)x << endl;
}
My output is:
0xf
0x1e
0x3c
0x78
0xf0
0xe1
0xc3
0x87
0xf
0x0
0x0
0x0
As you can see, I start getting 0x0's instead of the expected 0x1e, 0x3c, etc.
If I expand the for loop to iterate 60 times or so, the numbers come back correctly (after a bunch of zeroes.)
I'm assuming that a char occupies more space than 8 bits and that the "gaps" of unused data are zeroes. My understanding is a bit limited, so any suggestions would be appreciated. Is there a way to toss out those zeroes?
Shifting by a negative amount is undefined behavior.
You loop n from 0 up to 11, but you have an 8 - n in your shifts, so that will go negative.
If you want to handle n > 8, you'll need to reduce the shift modulo 8 (assuming you want an 8-bit circular shift):
for (int n = 0; n < 12; n++) {
    unsigned char x = 0x0f;
    int shift = n % 8; // wrap the shift amount
    x = ((x << shift) | (x >> (8 - shift))); // chars are 8 bits
    cout << hex << "0x" << (int)x << endl;
}
Shifting a byte left by more than 7 will always result in 0.
Also, shifting by a negative amount is not defined.
In order to fix this you have to limit the shift to the size of the type.
Basically:
unsigned char x = 0xf;
int shift = n & 7;
x = ((x << shift) | (x >> (8 - shift)));
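If it helps, here is the same idea packaged as a small helper and driven by your original loop (a minimal sketch; the name rotl8 is made up). The & 7 keeps the shift in the 0..7 range for any n, and the n == 0 branch sidesteps the 8 - 0 shift entirely:
#include <stdint.h>
#include <stdio.h>

/* rotate an 8-bit value left by n positions, for any non-negative n
   (helper name made up for this sketch) */
static uint8_t rotl8(uint8_t x, unsigned n)
{
    n &= 7;                       /* reduce the rotation modulo 8 */
    if (n == 0)
        return x;                 /* nothing to do, and avoids shifting by 8 */
    return (uint8_t)((x << n) | (x >> (8 - n)));
}

int main(void)
{
    for (unsigned n = 0; n < 12; n++)
        printf("0x%x\n", rotl8(0x0f, n));   /* 0xf, 0x1e, 0x3c, ... then wraps back to 0xf */
    return 0;
}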
I have run into an interesting problem lately:
Let's say I have an array of bytes (uint8_t, to be exact) of length at least one. I need a function that extracts a subsequence of bits from this array, starting at bit X (zero-based index, inclusive) and having length L, and returns it as a uint32_t. If L is smaller than 32, the remaining high bits should be zero.
Although this is not very hard to solve, my current idea seems a bit cumbersome to me: a table of all the possible masks for a given byte (start at bit 0-7, take 1-8 bits), then constructing the number one byte at a time using this table.
Can somebody come up with a nicer solution? Note that I cannot use Boost or the STL for this. And no, it is not homework; it's a problem I ran into at work, and we do not use Boost or the STL in the code where this goes. You can assume that 0 < L <= 32 and that the byte array is large enough to hold the subsequence.
One example of correct input/output:
array: 00110011 1010 1010 11110011 01 101100
subsequence: X = 12 (zero based index), L = 14
resulting uint32_t = 00000000 00000000 00 101011 11001101
Only the first and last bytes of the subsequence involve some bit slicing to get the required bits out; the intermediate bytes can be shifted into the result whole. Here's some sample code, absolutely untested -- it does what I described, but some of the bit indices could be off by one:
uint8_t bytes[];
int X, L;
uint32_t result;

int startByte = X / 8,             /* starting byte number */
    startBit  = 7 - X % 8,         /* first included bit within the starting byte, from LSB */
    endByte   = (X + L) / 8,       /* ending byte number */
    endBit    = 7 - (X + L) % 8;   /* bit just past the subsequence within the ending byte, from LSB */

/* Special case where start and end are within the same byte:
   just get the bits from startBit down to endBit + 1 */
if (startByte == endByte) {
    uint8_t byte = bytes[startByte];
    result = (byte >> (endBit + 1)) & ((1 << (startBit - endBit)) - 1);
}
/* All other cases: get the ending bits of the starting byte,
   all other bytes in between,
   and the starting bits of the ending byte */
else {
    uint8_t byte = bytes[startByte];
    result = byte & ((1 << (startBit + 1)) - 1);
    for (int i = startByte + 1; i < endByte; i++)
        result = (result << 8) | bytes[i];
    byte = bytes[endByte];
    result = (result << (7 - endBit)) | (byte >> (endBit + 1));
}
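For what it's worth, here is the same logic packaged as a function and checked against the example from the question (X = 12, L = 14 should give 0x2bcd); the name get_bit_run is made up for this sketch:
#include <stdint.h>
#include <stdio.h>

/* Function name made up for this sketch. Bit 0 is the most significant bit of bytes[0],
   as in the question. Note: like the snippet above, this reads bytes[endByte], so if the
   run ends exactly on a byte boundary the array needs one readable byte past the run. */
static uint32_t get_bit_run(const uint8_t *bytes, int X, int L)
{
    int startByte = X / 8,
        startBit  = 7 - X % 8,
        endByte   = (X + L) / 8,
        endBit    = 7 - (X + L) % 8;
    uint32_t result;

    if (startByte == endByte) {
        result = (bytes[startByte] >> (endBit + 1)) & ((1u << (startBit - endBit)) - 1);
    } else {
        result = bytes[startByte] & ((1u << (startBit + 1)) - 1);
        for (int i = startByte + 1; i < endByte; i++)
            result = (result << 8) | bytes[i];
        result = (result << (7 - endBit)) | (bytes[endByte] >> (endBit + 1));
    }
    return result;
}

int main(void)
{
    const uint8_t bytes[] = { 0x33, 0xAA, 0xF3, 0x6C };   /* bit pattern from the question */
    printf("%#x\n", get_bit_run(bytes, 12, 14));          /* prints 0x2bcd */
    return 0;
}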
Take a look at std::bitset and boost::dynamic_bitset.
I would be thinking of something like loading a uint64_t with a cast and then shifting left and right to lose the uninteresting bits.
uint32_t extract_bits(uint8_t* bytes, int start, int count)
{
    /* Note: this assumes a big-endian machine (so that bytes[0] becomes the most
       significant byte of the load), that reading 8 bytes from bytes stays in
       bounds, and that the unaligned, type-punned load is acceptable. */
    int shiftleft = start;          /* push the first wanted bit up to bit 63 */
    int shiftright = 64 - count;    /* then drop everything below the run */
    uint64_t *ptr = (uint64_t*)(bytes);
    uint64_t hold = *ptr;
    hold <<= shiftleft;
    hold >>= shiftright;
    return (uint32_t)hold;
}
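A variant that assembles the 64-bit window byte by byte sidesteps the endianness and aliasing assumptions noted above, and only reads the bytes that actually contain the run (a sketch; the name extract_bits_portable is made up):
#include <stdint.h>

/* Function name made up for this sketch: build the window explicitly, most
   significant byte first, so the result does not depend on host byte order. */
uint32_t extract_bits_portable(const uint8_t *bytes, int start, int count)
{
    int first = start / 8;                 /* first byte containing a wanted bit */
    int last  = (start + count - 1) / 8;   /* last byte containing a wanted bit */
    uint64_t hold = 0;

    for (int i = first; i <= last; i++)
        hold = (hold << 8) | bytes[i];     /* at most 5 bytes for count <= 32 */

    hold >>= (last + 1) * 8 - (start + count);   /* drop the bits after the run */
    if (count < 32)
        hold &= (1u << count) - 1;               /* drop the bits before the run */
    return (uint32_t)hold;
}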
For the sake of completeness, I'm adding my own solution, inspired by the comments and answers here. Thanks to all who bothered to think about the problem.
static const uint8_t firstByteMasks[8] = { 0xFF, 0x7F, 0x3F, 0x1F, 0x0F, 0x07, 0x03, 0x01 };

uint32_t getBits( const uint8_t *buf, const uint32_t bitoff, const uint32_t len, const uint32_t bitcount )
{
    uint64_t result = 0;
    int32_t startByte = bitoff / 8;                    // starting byte number
    int32_t endByte = ((bitoff + bitcount) - 1) / 8;   // ending byte number
    int32_t rightShift = 16 - ((bitoff + bitcount) % 8);

    if ( endByte >= len ) return -1;
    if ( rightShift == 16 ) rightShift = 8;

    result = buf[startByte] & firstByteMasks[bitoff % 8];
    result = result << 8;
    for ( int32_t i = startByte + 1; i <= endByte; i++ )
    {
        result |= buf[i];
        result = result << 8;
    }
    result = result >> rightShift;
    return (uint32_t)result;
}
A few notes: I tested the code and it seems to work just fine; however, there may be bugs, and if I find any, I will update the code here. Also, there are probably better solutions!
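For example, with the input from the question (X = 12, L = 14), a quick check could look like this (a sketch; note the argument order is buf, bit offset, buffer length in bytes, bit count):
#include <stdint.h>
#include <stdio.h>

/* assumes the getBits definition above is in scope */
int main(void)
{
    const uint8_t buf[4] = { 0x33, 0xAA, 0xF3, 0x6C };   /* 00110011 10101010 11110011 01101100 */
    printf("%#x\n", getBits(buf, 12, 4, 14));             /* prints 0x2bcd = 00101011 11001101 */
    return 0;
}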