Bitwise unpacking using signed data - c++

I've been trying for a while to pack and unpack some chars into an integer. Although there are some topics related to this question, my problem is with the signed shift. I don't get the 'trick' to unpack a signed value, i.e.:
char c1 = -119;
char c2 = 26;
// pack
int packed = (unsigned char)c1 | (c2 << 8);
// unpack
c1 = packed >> 0;
c2 = packed >> 8;
// printf(c1, c2) -> Unpacked data: -119 | 26
That works as expected, but when I try to pack more data, i.e.:
char c0 = -42;
char c1 = -119;
char c2 = 26;
// pack
int packed = (unsigned char)c0 | (unsigned char)(c1 << 8) | (c2 << 16);
// unpack
c0 = packed >> 0;
c1 = packed >> 8;
c2 = packed >> 16;
// printf -> Unpacked data: -42 | 0 | 26
The c1 value is lost. I guess it's related to the sign bit being shifted into the high-order position.
How could I get the c1 value back?
Thanks in advance.

You are casting c1 to unsigned char after shifting it out of the range of that type, so the result of the cast is zero. You should do the cast before shifting:
int packed = (unsigned char)c0 | ((unsigned char)c1 << 8) | (c2 << 16);

(unsigned char)(c1 << 8)
This will:
- shift the wrong (sign-extended) value, then
- trim the result to 8 bits (yielding 0).
You don't want either of those, so you should use ((unsigned char)c1 << 8).

On some platforms int is only 16 bits wide. For this code to be portable, use int32_t. The correct way to accomplish this (if slightly paranoid) is:
int32_t packed = ((uint8_t)c0) | (((uint8_t)c1)<<8) | (((uint8_t)c2) << 16);
I also tend to list these in reverse order, so it is more natural which characters become the most and least significant bytes.
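To illustrate the cast-before-shift advice from the answers above, here is a minimal sketch of a full round trip for three chars. The helper names pack3 and unpack_byte are made up for this example, and the conversion back to a signed 8-bit value assumes a two's complement platform (only guaranteed since C++20).

```cpp
#include <cstdint>

// Pack three (possibly negative) chars into the low 24 bits of a 32-bit value.
// Each char is converted to uint8_t BEFORE shifting, so no sign-extended
// bits leak into the neighbouring bytes.
inline std::uint32_t pack3(char c0, char c1, char c2)
{
    return  static_cast<std::uint32_t>(static_cast<std::uint8_t>(c0))
         | (static_cast<std::uint32_t>(static_cast<std::uint8_t>(c1)) << 8)
         | (static_cast<std::uint32_t>(static_cast<std::uint8_t>(c2)) << 16);
}

// Extract byte number `index` (0 = least significant) as a signed 8-bit value.
// Assumption: two's complement representation of std::int8_t.
inline std::int8_t unpack_byte(std::uint32_t packed, unsigned index)
{
    return static_cast<std::int8_t>((packed >> (8 * index)) & 0xFFu);
}
```

With the values from the question, pack3(-42, -119, 26) gives 0x001A89D6, and unpack_byte returns -42, -119 and 26 for indices 0, 1 and 2.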

Related

C++ Bitshift 4 int_8t into a normal integer (32 bit )

I had already asked a question how to get 4 int8_t into a 32bit int, I was told that I have to cast the int8_t to a uint8_t first to pack it with bitshifting into a 32bit integer.
int8_t offsetX = -10;
int8_t offsetY = 120;
int8_t offsetZ = -60;
using U = std::uint8_t;
int toShader = (U(offsetX) << 24) | (U(offsetY) << 16) | (U(offsetZ) << 8) | (0 << 0);
std::cout << (int)(toShader >> 24) << " "<< (int)(toShader >> 16) << " " << (int)(toShader >> 8) << std::endl;
My Output is
-10 -2440 -624444
It's not what I expected, of course, does anyone have a solution?
In the shader I want to unpack the int16 later and that is only possible with a 32bit integer because glsl does not have any other data types.
int offsetX = data[gl_InstanceID * 3 + 2] >> 24;
int offsetY = data[gl_InstanceID * 3 + 2] >> 16 ;
int offsetZ = data[gl_InstanceID * 3 + 2] >> 8 ;
What is written in the square brackets does not matter; the question is about shifting the bits correctly, or casting the result after the bracket.
If any of the offsets is negative, then the shift results in undefined behaviour.
Solution: Convert the offsets to an unsigned type first.
However, this brings another potential problem: If you convert to unsigned, then negative numbers will have very large values with set bits in most significant bytes, and OR operation with those bits will always result in 1 regardless of offsetX and offsetY. A solution is to convert into a small unsigned type (std::uint8_t), and another is to mask the unused bytes. Former is probably simpler:
using U = std::uint8_t;
int third = U(offsetX) << 24u
| U(offsetY) << 16u
| U(offsetZ) << 8u
| 0u << 0u;
I think you're forgetting to mask the bits that you care about before shifting them.
Perhaps this is what you're looking for:
int32 offsetX = (data[gl_InstanceID * 3 + 2] & 0xFF000000) >> 24;
int32 offsetY = (data[gl_InstanceID * 3 + 2] & 0x00FF0000) >> 16 ;
int32 offsetZ = (data[gl_InstanceID * 3 + 2] & 0x0000FF00) >> 8 ;
if (offsetX & 0x80) offsetX |= 0xFFFFFF00;
if (offsetY & 0x80) offsetY |= 0xFFFFFF00;
if (offsetZ & 0x80) offsetZ |= 0xFFFFFF00;
Without the bit mask, the X part will end up in offsetY, and the X and Y part in offsetZ.
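The mask-then-sign-extend idea from this answer can be checked on the CPU side before moving it into the shader. The following is only a C++ sketch of the same logic, not GLSL; extract_signed_byte is a hypothetical helper name, and the final conversion to a signed type assumes two's complement.

```cpp
#include <cstdint>

// Extract the 8-bit field that starts at bit `shift` and sign-extend it
// to a 32-bit signed integer.
inline std::int32_t extract_signed_byte(std::uint32_t word, unsigned shift)
{
    std::uint32_t field = (word >> shift) & 0xFFu;  // isolate the byte
    if (field & 0x80u)                              // is the byte's sign bit set?
        field |= 0xFFFFFF00u;                       // fill the upper 24 bits with ones
    return static_cast<std::int32_t>(field);        // assumption: two's complement
}
```

For the offsets in the question (-10, 120, -60) the packed word is 0xF678C400, and the helper returns -10, 120 and -60 for shifts of 24, 16 and 8.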
On the CPU side you can use a union to avoid bit shifts, bit masking and branches:
int8_t x,y,z,w; // your 8bit ints
int32_t i; // your 32bit int
union my_union // just helper union for the casting
{
int8_t i8[4];
int32_t i32;
} a;
// 4x8bit -> 32bit
a.i8[0]=x;
a.i8[1]=y;
a.i8[2]=z;
a.i8[3]=w;
i=a.i32;
// 32bit -> 4x8bit
a.i32=i;
x=a.i8[0];
y=a.i8[1];
z=a.i8[2];
w=a.i8[3];
If you do not like unions, the same can be done with pointers.
Beware that this is not possible on the GLSL side (it has neither unions nor pointers), so there you have to use bit shifts and masks as in the other answer.
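As a side note, reading a union member other than the one last written is formally undefined behaviour in standard C++ (many compilers allow it as an extension). A sketch of the same reinterpretation with std::memcpy, which is well defined, could look like this. Bytes4, pack_bytes and unpack_bytes are names invented for the example, and the position of each byte inside the 32-bit value depends on the host's endianness, just as with the union.

```cpp
#include <cstdint>
#include <cstring>

// Four 8-bit signed values that travel together.
struct Bytes4 { std::int8_t v[4]; };

static_assert(sizeof(Bytes4) == sizeof(std::uint32_t),
              "Bytes4 must be exactly 32 bits wide");

// 4 x 8 bit -> 32 bit: copy the raw bytes, no shifting or masking.
inline std::uint32_t pack_bytes(const Bytes4& in)
{
    std::uint32_t out;
    std::memcpy(&out, in.v, sizeof out);
    return out;
}

// 32 bit -> 4 x 8 bit: the inverse copy.
inline Bytes4 unpack_bytes(std::uint32_t in)
{
    Bytes4 out;
    std::memcpy(out.v, &in, sizeof in);
    return out;
}
```

Because both directions use the same byte layout, the round trip restores the original values regardless of endianness; only the numeric value of the intermediate integer differs between machines.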

Split an integer into bytes and combine back into the integers results into error

This toy program splits an integer into 4 bytes and later combines these bytes to get back the input value, but the result is wrong. The program works for positive integers, but I am interested in signed integers. I need help.
Expected Output: -12345
Actual Output: -57
int main()
{
int j,i = -12345;
char b[4];
b[0] = (i >> 24) & 0xFF;
b[1] = (i >> 16) & 0xFF;
b[2] = (i >> 8) & 0xFF;
b[3] = (i >> 0) & 0xFF;
j = (int)((b[0] << 24) | (b[1] << 16) | (b[2] << 8) | (b[3] << 0));
std::cout << j;
return 0;
}
There are actually two problems that lead to your "error".
The first is that the result of e.g. b[0] << 24 will be an int. When you cast that to a char (and assuming that char is an 8-bit type) then you cut off the top 24 bits of the value, truncating it.
The second problem is that char could be unsigned (it's implementation-defined if char is signed or unsigned). If char is unsigned then the value -1 (0xffffffff) will become 255 (0x000000ff).
When you then bring all that together it will almost certainly result in wrong values.
In general, whenever you feel the need to do a C-style cast (like in (char)(b[0] << 24)) when programming in C++, you should take that as a sign that you're doing something wrong.
One possible way to solve your problem is to always work with explicit unsigned data types.
First you need to copy the original int value to an unsigned int:
unsigned ui;
memcpy(&ui, &i, sizeof ui);
Then use ui instead of i when doing the "split". And explicitly use unsigned char:
unsigned char b[sizeof(unsigned)] = { 0 };
b[0] = (ui >> 24) & 0xFF;
b[1] = (ui >> 16) & 0xFF;
b[2] = (ui >> 8) & 0xFF;
b[3] = (ui >> 0) & 0xFF;
Then to put it all back, again use an explicit unsigned type, and copy it to the resulting variable:
unsigned uj = ((unsigned)b[0] << 24) | ((unsigned)b[1] << 16) | ((unsigned)b[2] << 8) | ((unsigned)b[3] << 0);
memcpy(&j, &uj, sizeof j);
I suggest using unsigned data types here to avoid possible problems that can come from sign-extension during conversion.
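Putting the steps of this answer together, a minimal sketch might look like the following. The function names byte_at and join_bytes are hypothetical, and std::int32_t / std::uint32_t are used instead of int / unsigned so that the width is fixed at 32 bits.

```cpp
#include <cstdint>
#include <cstring>

// Return byte number `index` of `value`, counting from the most
// significant byte (index 0) to the least significant one (index 3).
inline unsigned char byte_at(std::int32_t value, unsigned index)
{
    std::uint32_t u;
    std::memcpy(&u, &value, sizeof u);           // copy the bit pattern into an unsigned
    return static_cast<unsigned char>((u >> (24 - 8 * index)) & 0xFFu);
}

// Rebuild the signed value from four bytes, most significant first.
inline std::int32_t join_bytes(unsigned char b0, unsigned char b1,
                               unsigned char b2, unsigned char b3)
{
    // Widen each byte to 32-bit unsigned BEFORE shifting.
    const std::uint32_t u = (static_cast<std::uint32_t>(b0) << 24)
                          | (static_cast<std::uint32_t>(b1) << 16)
                          | (static_cast<std::uint32_t>(b2) << 8)
                          |  static_cast<std::uint32_t>(b3);
    std::int32_t value;
    std::memcpy(&value, &u, sizeof value);       // copy the bit pattern back
    return value;
}
```

On a two's complement machine -12345 has the bit pattern 0xFFFFCFC7, so the four bytes are 0xFF, 0xFF, 0xCF and 0xC7, and joining them gives -12345 again.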
Your code works only for positive numbers. i is negative, and by shifting it to the right b[0] becomes positive, so the final deserialization gives the wrong result.
Try:
int main()
{
int j, i = -12345;
const char* bytes = reinterpret_cast<const char*>(&i);
j = *reinterpret_cast<const int*>(bytes);
std::cout << j;
return 0;
}

Correct way to concatenate bitwise operations?

I need to concatenate some bitwise operations, but the current output seems to be wrong. The split operations are similar to this:
unsigned char a = 0x12;
unsigned char x = 0x00;
x = a << 4;
x = x >> 4;
expected result x = 0x02;
current result x = 0x02;
If I try to concatenate the operations, the result is not correct:
unsigned char a = 0x12;
unsigned char x = 0x00;
x = (a << 4) >> 4;
expected result x = 0x02;
current result x = 0x12;
Thanks in advance for any suggestion.
The problem is that a is converted to int (via integral promotion) before the shift, so (0x12 << 4) >> 4 is essentially 0x12.
What you want to do is convert (a << 4) back to unsigned char by using static_cast.
The final code:
unsigned char a = 0x12;
unsigned char x = 0x00;
x = static_cast<unsigned char>(a << 4) >> 4;
The compiler IS applying integral promotions for the >> and << operations.
You might think that
x = (a << 4) >> 4;
Would use a byte-wide register for the operation, but the compiler promotes the char a to an int before doing the shift, preserving the bits that are shifted to the left.
You can solve this by doing this:
x = ((a << 4) & 0xff) >> 4;
Again, the issue is that integral promotion preserves the bits until the final cast.
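To see the effect of integral promotion side by side, here is a small sketch with two hypothetical helpers, one reproducing the behaviour from the question and one applying the truncation suggested in the answers.

```cpp
// The shift happens on an int, so the upper nibble survives the round trip.
inline unsigned char nibble_promoted(unsigned char a)
{
    return static_cast<unsigned char>((a << 4) >> 4);
}

// Truncating the intermediate result to 8 bits drops the upper nibble,
// which is what the two-statement version in the question did implicitly.
inline unsigned char nibble_truncated(unsigned char a)
{
    return static_cast<unsigned char>(static_cast<unsigned char>(a << 4) >> 4);
}
```

For a = 0x12 the first helper returns 0x12 and the second returns 0x02.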

c++ 64 bit network to host translation

I know there are answers for this question using gcc byteswap and other alternatives on the web, but I was wondering why my code below isn't working.
Firstly, I get gcc warnings (which I feel shouldn't be there). The reason I don't want to use byteswap directly is that I need to determine whether my machine is big endian or little endian and use byteswap accordingly, i.e. if my machine is big endian I could memcpy the bytes as is without any translation, otherwise I need to swap them and copy the result.
static inline uint64_t ntohl_64(uint64_t val)
{
unsigned char *pp =(unsigned char *)&val;
uint64_t val2 = ( pp[0] << 56 | pp[1] << 48
| pp[2] << 40 | pp[3] << 32
| pp[4] << 24 | pp[5] << 16
| pp[6] << 8 | pp[7]);
return val2;
}
int main()
{
int64_t a=0xFFFF0000;
int64_t b=__const__byteswap64(a);
int64_t c=ntohl_64(a);
printf("\n %lld[%x] [%lld] [%lld]\n ", a, a, b, c);
}
Warnings:-
In function ‘uint64_t ntohl_64(uint64_t)’:
warning: left shift count >= width of type
warning: left shift count >= width of type
warning: left shift count >= width of type
warning: left shift count >= width of type
Output:-
4294901760[00000000ffff0000] 281470681743360[0000ffff00000000] 65535[000000000000ffff]
I am running this on a little endian machine, so byteswap and ntohl_64 should result in exactly the same values, but unfortunately I get completely unexpected results. It would be great if someone could point out what's wrong.
The reason your code does not work is that you're shifting unsigned chars. They are promoted to int before the shift, so with a 32-bit int any shift count of 32 or more is undefined behaviour (that is what the warnings are about) and the bits never reach the upper half of the 64-bit result. Some implementations end up with weird results due to the way the machine code shifts work; x86 is an example, since it typically takes the shift count modulo 32. You have to cast them to whatever you want the final size to be first, like:
((uint64_t)pp[0]) << 56
Your optimal solution with gcc would be to use htobe64. This function does everything for you.
P.S. It's a little bit off topic, but if you want to make the function portable across endianness you could do:
Edit based on Nova Denizen's comment:
static inline uint64_t htonl_64(uint64_t val)
{
union{
uint64_t retVal;
uint8_t bytes[8];
};
// most significant byte first (network order)
bytes[0] = (val & 0xff00000000000000) >> 56;
bytes[1] = (val & 0x00ff000000000000) >> 48;
bytes[2] = (val & 0x0000ff0000000000) >> 40;
bytes[3] = (val & 0x000000ff00000000) >> 32;
bytes[4] = (val & 0x00000000ff000000) >> 24;
bytes[5] = (val & 0x0000000000ff0000) >> 16;
bytes[6] = (val & 0x000000000000ff00) >> 8;
bytes[7] = (val & 0x00000000000000ff);
return retVal;
}
static inline uint64_t ntohl_64(uint64_t val)
{
union{
uint64_t inVal;
uint8_t bytes[8];
};
inVal = val;
return ((uint64_t)bytes[0]) << 56 |
((uint64_t)bytes[1]) << 48 |
((uint64_t)bytes[2]) << 40 |
((uint64_t)bytes[3]) << 32 |
((uint64_t)bytes[4]) << 24 |
((uint64_t)bytes[5]) << 16 |
((uint64_t)bytes[6]) << 8 |
bytes[7];
}
Assuming the compiler doesn't do something to the uint64_t on its way back through the return, and assuming the user treats the result as an 8-byte value (and not an integer), that code should work on any system. With any luck, your compiler will be able to optimize out the whole expression if you're on a big endian system and use some builtin byte swapping technique if you're on a little endian machine (and it's guaranteed to still work on any other kind of machine).
uint64_t val2 = ( pp[0] << 56 | pp[1] << 48
| pp[2] << 40 | pp[3] << 32
| pp[4] << 24 | pp[5] << 16
| pp[6] << 8 | pp[7]);
pp[0] is an unsigned char and 56 is an int, so pp[0] is promoted to int and pp[0] << 56 performs the left shift as an int, with an int result. This isn't what you want, because you want all these shifts to have type unsigned long long.
The way to fix this is to cast, like ((unsigned long long)pp[0]) << 56.
Since pp[x] is only 8 bits wide and is promoted to a 32-bit int, the expression pp[0] << 56 cannot produce the 64-bit result you want. You can instead mask the original 64-bit value and then shift:
uint64_t val2 = (( val & 0xff ) << 56 ) |
(( val & 0xff00 ) << 40 ) |
...
In any case, just use compiler built-ins, they usually result in a single byte-swapping instruction.
Casting and shifting works as PlasmaHH suggested, but I don't know why 32-bit shifts are upconverted automatically and 64-bit ones are not.
typedef uint64_t __u64;
static inline uint64_t ntohl_64(uint64_t val)
{
unsigned char *pp =(unsigned char *)&val;
return ((__u64)pp[0] << 56 |
(__u64)pp[1] << 48 |
(__u64)pp[2] << 40 |
(__u64)pp[3] << 32 |
(__u64)pp[4] << 24 |
(__u64)pp[5] << 16 |
(__u64)pp[6] << 8 |
(__u64)pp[7]);
}
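If the goal is simply to read or write a 64-bit value in network (big-endian) byte order, one option that needs no endianness detection at all is to assemble the value byte by byte. The sketch below uses the hypothetical names load_be64 and store_be64; it is not taken from any of the answers above and gives the same result on little- and big-endian hosts.

```cpp
#include <cstdint>

// Read 8 bytes in big-endian (network) order into a host integer.
inline std::uint64_t load_be64(const unsigned char* p)
{
    std::uint64_t v = 0;
    for (int i = 0; i < 8; ++i)
        v = (v << 8) | p[i];   // v is 64 bits wide, so nothing is lost in the shift
    return v;
}

// Write a host integer as 8 bytes in big-endian (network) order.
inline void store_be64(std::uint64_t v, unsigned char* p)
{
    for (int i = 7; i >= 0; --i) {
        p[i] = static_cast<unsigned char>(v & 0xFFu);  // lowest byte goes last
        v >>= 8;
    }
}
```

Because the accumulator is already a uint64_t, no per-byte cast is needed before shifting, which avoids the promotion problem discussed in this thread.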

Get signed integer from 2 16-bit signed bytes?

So this sensor I have returns a signed value between -500-500 by returning two (high and low) signed bytes. How can I use these to figure out what the actual value is? I know I need to do 2's complement, but I'm not sure how. This is what I have now -
real_velocity = temp.values[0];
if(temp.values[1] != -1)
real_velocity += temp.values[1];
//if high byte > 1, negative number - take 2's complement
if(temp.values[1] > 1) {
real_velocity = ~real_velocity;
real_velocity += 1;
}
But it just returns the negative value of what would be a positive. So for instance, -200 returns bytes 255 (high) and 56 (low). Added together these are 311. But when I run the above code it tells me -311. Thank you for any help.
-200 in hex is 0xFF38,
you're getting two bytes 0xFF and 0x38,
converting these back to decimal you might recognise them
0xFF = 255,
0x38 = 56
your sensor is not returning 2 signed bytes but simply the high and low byte of a signed 16-bit number.
so your result is
value = (highbyte << 8) + lowbyte
value being a 16 bit signed variable.
Based on the example you gave, it appears that the value is already 2's complement. You just need to shift the high byte left 8 bits and OR the values together.
real_velocity = (short) ((temp.values[0] & 0xFF) | ((temp.values[1] & 0xFF) << 8));
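As a small sketch of that shift-and-OR approach, the hypothetical helper below takes both bytes as unsigned char so that a low byte of 0x80 or above cannot sign-extend into the high byte. The final conversion to std::int16_t assumes two's complement.

```cpp
#include <cstdint>

// Combine the high and low byte of a 16-bit two's complement number.
// Passing a signed char (e.g. -1) converts it to its unsigned bit pattern (255).
inline std::int16_t combine_i16(unsigned char high, unsigned char low)
{
    const std::uint16_t raw = static_cast<std::uint16_t>((high << 8) | low);
    return static_cast<std::int16_t>(raw);  // assumption: two's complement
}
```

With the bytes from the question, combine_i16(255, 56) yields -200, and the ends of the sensor's range come out as 0x01F4 for 500 and 0xFE0C for -500.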
You can shift the bits and mask the values.
int main()
{
char data[2];
data[0] = 0xFF; //high
data[1] = 56; //low
int value = 0;
if (data[0] & 0x80) //sign
value = 0xFFFF8000;
value |= ((data[0] & 0x7F) << 8) | data[1];
std::cout<<std::hex<<value<<std::endl;
std::cout<<std::dec<<value<<std::endl;
std::cin.get();
}
Output:
ffffff38
-200
real_velocity = temp.values[0];
real_velocity = real_velocity << 8;
real_velocity |= temp.values[1];
// And, assuming 32-bit integers
real_velocity <<= 16;
real_velocity >>= 16;
For 8-bit bytes, first just convert to unsigned:
typedef unsigned char Byte;
unsigned const u = (Byte( temp.values[1] ) << 8) | Byte( temp.values[0] );
Then if that is greater than the upper range for 16-bit two's complement, subtract 2^16:
int const i = int(u >= (1u << 15)? u - (1u << 16) : u);
You could do tricks at the bit level, but I don't think there's any point in that.
The above assumes that CHAR_BIT = 8, that unsigned is more than 16 bits, and that the machine and desired result are two's complement.
#include <iostream>
using namespace std;
int main()
{
typedef unsigned char Byte;
struct { char values[2]; } const temp = { 56, char( 255 ) };
unsigned const u = (Byte( temp.values[1] ) << 8) | Byte( temp.values[0] );
int const i = int(u >= (1u << 15)? u - (1u << 16) : u);
cout << i << endl;
}