I originally had 2 WORDs (4 bytes in total). I have stored them in an unsigned int. How can I split this so that the 2 left-most bytes end up in one unsigned short variable and the other 2 bytes in another unsigned short variable?
I hope my question is clear, otherwise please tell me and I will add more details! :)
Example: I have this hexadecimal value stored in an unsigned int: 0x4f07aabb
How can I turn this into two unsigned shorts so that one of them holds 0x4f07 and the other holds 0xaabb?
If you are sure that unsigned int has at least 4 bytes on your target system (this is not guaranteed!), you can do:
unsigned short one = static_cast<unsigned short>(original >> (2 * 8));       // upper 16 bits
unsigned short two = static_cast<unsigned short>(original % (1 << (2 * 8))); // lower 16 bits
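For the example value from the question, a quick sanity check could look like this (masking with & 0xFFFF is equivalent to the modulo above):
unsigned int original = 0x4f07aabb;
unsigned short one = static_cast<unsigned short>(original >> 16);     // one == 0x4f07
unsigned short two = static_cast<unsigned short>(original & 0xFFFFu); // two == 0xaabb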
This is only guaranteed to work if the original value indeed fits in 4 bytes (possibly with leading zero padding). If you're not fond of bit shifting, you could also do
#include <cstdint>
#include <cstring>

uint32_t original = 0x4f07aabb; // guarantee 32 bits
uint16_t parts[2];
std::memcpy(&parts[0], &original, sizeof(uint32_t)); // copy the 4 bytes into two 16-bit halves
unsigned short one = static_cast<unsigned short>(parts[0]);
unsigned short two = static_cast<unsigned short>(parts[1]);
This will yield the two values depending on the target system's endianness; on a little-endian architecture, the two halves come out swapped. You can check endianness with C++20's std::endian::native (in the <bit> header).
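For instance, a sketch of such a check, assuming a C++20 compiler and the parts array from the snippet above:
#include <bit>

// Pick each half according to the native byte order (C++20 only).
unsigned short high = (std::endian::native == std::endian::little) ? parts[1] : parts[0]; // 0x4f07
unsigned short low  = (std::endian::native == std::endian::little) ? parts[0] : parts[1]; // 0xaabb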
I have an unsigned char, let's say 0x5E, which I want to split into two equal parts.
What are the required bit shifts to get this done? I did the following to convert a hex value in an unsigned long into two parts:
unsigned int first_half = (my_long & 0xffffffff00000000) >> 32;
unsigned int second_half = my_long & 0x00000000ffffffff;
How do I go about doing it with the unsigned char? Does the 32 get replaced by 8 because it's a single-byte character?
The original code does >> 32 because it tries to shift half of the bits down. my_long was an unsigned long int, which has 64 bits there, so half of that is 32.
A char is a single byte, which is 8 bits, so half of that means it would be shifted by 4 bits.
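A minimal sketch for the 0x5E example from the question (the variable names are only for illustration):
unsigned char c = 0x5E;
unsigned char first_half  = c >> 4;   // 0x5, the high nibble
unsigned char second_half = c & 0x0F; // 0xE, the low nibble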
How do I merge two unsigned chars into a single unsigned short in C++?
The most significant byte is contained in array[0] and the least significant byte is located at array[1] (big-endian).
(array[0] << 8) | array[1]
Note that an unsigned char is implicitly converted ("promoted") to int when you use it in any calculation. Therefore array[0] << 8 doesn't overflow. Also, the result of this calculation is an int, so you may need to cast it back to unsigned short if your compiler issues a warning.
Set the short to the least significant byte, shift the most significant byte left by 8, and OR the two together:
unsigned short s;
s = array[1];
s |= (unsigned short) array[0] << 8;
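For example, with illustrative values:
unsigned char array[2] = {0x12, 0x34}; // most significant byte first
unsigned short s;
s = array[1];
s |= (unsigned short) array[0] << 8;
// s is now 0x1234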
I need to read binary data which contains a column of numbers (time tags), using 8 bytes to record each number. I know that they are recorded in little-endian order. If read correctly, they should be decoded as (example):
...
2147426467
2147426635
2147512936
...
I recognize that the above numbers are right at the 2^31 - 1 threshold.
I try to read the data and invert the endianness with the following code (length is the total number of bytes and buffer is a pointer to an array that contains the bytes):
unsigned long int tag;
//uint64_t tag;
for (int j = 0; j < length; j += 8) // read the whole file in 8-byte blocks
{
    tag = 0;
    for (int i = 0; i <= 7; i++) // read each block, byte by byte
    {
        tag ^= ((unsigned char)buffer[j+i]) << 8*i; // shift each byte to invert endianness and combine with ^=
    }
}
When run, the code gives:
...
2147426467
2147426635
18446744071562097256
similar big numbers
...
The last number is not (2^64 - 1 - correct value).
Same result using uint64_t tag.
The code succeeds when tag is declared as
unsigned int tag;
but fails for tags greater than 2^32 - 1. At least this makes sense.
I suppose I need some kind of cast on buffer[i+j], but I don't know how to do it.
(static_cast<uint64_t>(buffer[j+i]))
also doesn't work.
I read a similar question but still need some help.
We assume that buffer[j+i] is a char, and that char is signed on your platform. Casting to unsigned char converts buffer[j+i] into an unsigned type. However, when the << operator is applied, the unsigned char value is promoted to int, since an int can hold all values representable by unsigned char. The shift therefore still happens in a (typically 32-bit) int: shifting by 8*i for i >= 4 is undefined behaviour, and an intermediate result with the sign bit set gets sign-extended when it is XORed into the 64-bit tag, which explains the huge numbers you see.
Your attempt to cast buffer[j+i] directly to uint64_t fails because, if char is signed, sign extension is applied before the value is converted to the unsigned type.
A double cast would work (that is, cast to unsigned char and then to the type of tag), but using a variable to hold the intermediate value should make the intention of the code clearer. For me, the code would look like:
decltype(tag) val = static_cast<unsigned char>(buffer[j+i]); // strip the sign first, then widen
tag ^= val << 8*i;                                           // the shift now happens in tag's type
You use a temporary value.
The intermediate result of the shift is computed in the promoted type, which on your platform is a 32-bit int.
Once you shift the byte further than 32 bits, it is shifted into oblivion.
In order to fix this you need to explicitly store the value in a 64-bit integer first.
So instead of
{tag ^= ((unsigned char)buffer[j+i])<<8*i ;}
you should use something like this:
{
    unsigned long long tmp = (unsigned char)buffer[j+i]; // store in 64 bits first
    tmp <<= 8*i;                                         // the shift now happens in 64 bits
    tag ^= tmp;
}
I am dealing with a very large list of booleans in C++: around 2^N items of N booleans each. Because memory is critical in such a situation (the growth is exponential), I would like to build an N-bit-long variable to store each element.
For small N, for example 24, I just use unsigned long int. It takes 64 MB ((2^24)*32/8/1024/1024). But I need to go up to 36. The only option with a built-in type is unsigned long long int, but that takes 512 GB ((2^36)*64/8/1024/1024/1024), which is a bit too much.
With a 36-bit variable it would work for me, because the size drops to 288 GB ((2^36)*36/8/1024/1024/1024), which fits on a node of my supercomputer.
I tried std::bitset, but std::bitset<N> creates an element of at least 8 B.
So a list of std::bitset<1> is much larger than a list of unsigned long int.
That is because std::bitset only changes the representation, not the container.
I also tried boost::dynamic_bitset<> from Boost, but the result is even worse (at least 32 B!), for the same reason.
I know one option is to write all elements as one chain of booleans, 2473901162496 bits (2^36*36), and store them in 38654705664 (2473901162496/64) unsigned long long int, which gives 288 GB (38654705664*64/8/1024/1024/1024). Accessing an element is then just a game of finding in which words the 36 bits are stored (it can be either one or two). But it means a lot of rewriting of the existing code (3000 lines), because mapping becomes impossible, and because adding and deleting items during execution in some functions would surely be complicated, confusing and challenging, and the result would most likely not be efficient.
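(For reference, a rough sketch of the indexing this would require; the function is hypothetical and bounds checking is omitted:)
#include <cstdint>
#include <cstddef>

// Read the i-th 36-bit element out of a packed array of 64-bit words.
uint64_t get36(const uint64_t* words, std::size_t i) {
    std::size_t bit  = i * 36;   // absolute bit offset
    std::size_t word = bit / 64; // first word holding the element
    std::size_t off  = bit % 64; // offset inside that word
    uint64_t value = words[word] >> off;
    if (off > 64 - 36)           // the element straddles two words
        value |= words[word + 1] << (64 - off);
    return value & ((uint64_t(1) << 36) - 1); // keep only the 36 bits
}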
How to build a N-bits variable in C++?
How about a struct with 5 chars (and perhaps some fancy operator overloading, as needed, to keep it compatible with the existing code)? A struct with a long and a char probably won't work because of padding/alignment...
Basically your own mini BitSet optimized for size:
struct Bitset40 {
    unsigned char data[5];
    bool getBit(int index) const {
        return (data[index / 8] & (1 << (index % 8))) != 0;
    }
    void setBit(int index, bool newVal) {
        if (newVal) {
            data[index / 8] |= (1 << (index % 8));   // set the bit
        } else {
            data[index / 8] &= ~(1 << (index % 8));  // clear the bit
        }
    }
};
Edit: As geza has also pointed out in the comments, the "trick" here is to get as close as possible to the minimum number of bytes needed, without wasting memory through alignment losses, padding or pointer indirection (see http://www.catb.org/esr/structure-packing/).
Edit 2: If you feel adventurous, you could also try a bit field (and please let us know how much space it actually consumes):
struct Bitset36 {
    unsigned long long data : 36;
};
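For instance, a quick check of the footprint of both structs could look like this (the result for Bitset36 depends on the compiler's padding):
#include <iostream>

int main() {
    std::cout << sizeof(Bitset40) << '\n'; // expected: 5
    std::cout << sizeof(Bitset36) << '\n'; // likely 8, due to alignment of unsigned long long
    return 0;
}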
I'm not an expert, but this is what I would "try". Find the size of the smallest type your compiler supports (it should be char). You can check with sizeof and you should get 1. That means 1 byte, so 8 bits.
So if you wanted a 24-bit type, you would need 3 chars. For 36 you would need a 5-char array, and you would have 4 bits of wasted padding at the end. This can easily be accounted for.
i.e.
char typeSize[3] = {0}; // should hold 24 bits
Now make bit masks to access each bit position of typeSize:
const unsigned char one = 0b0000'0001;
const unsigned char two = 0b0000'0010;
const unsigned char three = 0b0000'0100;
const unsigned char four = 0b0000'1000;
const unsigned char five = 0b0001'0000;
const unsigned char six = 0b0010'0000;
const unsigned char seven = 0b0100'0000;
const unsigned char eight = 0b1000'0000;
Now you can use bitwise OR to set bits to 1 where needed:
typeSize[1] |= four;
typeSize[0] |= (four | five);
To turn off bits, use the & operator with the complement:
typeSize[0] &= ~four;
typeSize[2] &= ~(four | five);
You can test each bit with the & operator:
typeSize[0] & four
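If you want a clean true/false out of that expression, one way (just for illustration) is:
bool isSet = (typeSize[0] & four) != 0; // true if that bit is set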
Bear in mind, I don't have a compiler handy to try this out, so hopefully this is a useful approach to your problem.
Good luck ;-)
You can use an array of unsigned long int and store and retrieve the needed bit chains with bitwise operations. This approach avoids space overhead.
Simplified example for an unsigned byte array B[] and 12-bit variables V (represented as ushort):
Set V[0]:
B[0] = V & 0xFF;        // low byte of V
B[1] = B[1] & 0xF0;     // clear the low nibble of the second byte
B[1] = B[1] | (V >> 8); // fill it with the highest nibble of V
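Reading V[0] back is the mirror image (a sketch in the same notation):
V = B[0] | ((B[1] & 0x0F) << 8); // low byte, plus the low nibble of B[1] as the high bits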
I'm programming with a PLC and I'm reading values out of it.
It gives me the data as unsigned char. That's fine, but the values in my PLC can be over 255, and since an unsigned char can't hold a value over 255, I get the wrong information.
The structure I get from the library:
struct PlcVarValue
{
unsigned long ulTimeStamp ALIGNATTRIB;
unsigned char bQuality ALIGNATTRIB;
unsigned char byData[1] ALIGNATTRIB;
};
ulTimeStamp gives the time
bQuality gives true/false (be able to read it or not)
byData[1] gives the data.
Anyway, I'm trying this now (where ppValues is an array of pointers to PlcVarValue):
unsigned char* variableValue = ppValues[0]->byData;
int iVariableValue = *variableValue;
This works fine... until ppValues[0]->byData is > 255.
When I try the following when the number is for example 257:
unsigned char testValue = ppValues[0]->byData[0];
unsigned char testValue2 = ppValues[0]->byData[1];
the output is testValue = 1 and testValue2 = 1, which doesn't make sense to me.
So my question is, how can I get this solved so it gives me the correct number?
That actually looks like a variable-sized structure, where an array of size 1 at the end is a common way to declare it. See e.g. this tutorial about it.
In this case, both bytes being 1 for the value 257 is correct. Think of the two bytes as a 16-bit value and combine the bits: one byte becomes the high byte, where 1 corresponds to 256; then add the low byte, which is 1, and you have 256 + 1, which of course equals 257. Simple binary arithmetic.
Which byte is the high one and which is the low one we can't say, but it's easy to check: if you can force a message that contains the value 258 instead, one byte will still be 1 but the other will be 2.
Combining them into a single unsigned 16-bit value is also easy if you know the bitwise shift and OR operators:
uint8_t high_byte = ...
uint8_t low_byte = ...
uint16_t word = high_byte << 8 | low_byte;
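Applied to the struct from the question, it might look like the sketch below. Which index holds the high byte is an assumption here (the 257 example matches either order), so verify with a value such as 258 first:
#include <cstdint>

// Assumption: byData[1] is the high byte and byData[0] the low byte.
uint16_t value = static_cast<uint16_t>(ppValues[0]->byData[1]) << 8
               | ppValues[0]->byData[0];
int iVariableValue = value; // now also correct for values above 255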