Bitwise OR before casting to int32 - C++

I was messing about with arrays and noticed this, e.g.:
int32_t array[1];
int16_t value = -4000;
When I tried to write the value into the top and bottom half of the int32 array value,
array[0] = (value << 16) | value;
the compiler would promote the value to a 32-bit int (sign-extending it) before doing the bit shift and the bitwise OR. Thus, instead of 16-bit -4000 being written in the top and bottom halves, the top value will be -1 and the bottom will be -4000.
Is there a way to OR in the 16 bit value of -4000 so both halves are -4000? It's not really a huge problem. I am just curious to know if it can be done.

Sure thing, just undo the sign-extension:
array[0] = (value << 16) | (value & 0xFFFF);
Don't worry, the compiler should handle this reasonably.
To avoid shifting a negative number:
array[0] = ((value & 0xFFFF) << 16) | (value & 0xFFFF);
Fortunately that extra useless & on the left (even more of a no-op than the one on the right) doesn't show up in the generated code.
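Putting it together, here's a minimal sketch (assuming two's complement, and taking a uint32_t detour to sidestep the signed-shift undefined behaviour discussed below):

#include <cstdint>
#include <cstdio>

int main() {
    int16_t value = -4000;
    int32_t array[1];
    // Mask to the low 16 bits, then shift in unsigned arithmetic so we
    // never left-shift into (or past) the sign bit of a signed int.
    uint32_t half = (uint32_t)value & 0xFFFFu;   // 0x0000F060
    array[0] = (int32_t)((half << 16) | half);   // 0xF060F060
    printf("%08X\n", (unsigned)array[0]);        // prints F060F060
}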

Left shift on signed types is defined only in some cases. From the standard:
6.5.7/4 [...] If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting
value; otherwise, the behavior is undefined.
According to this definition, it seems what you have is undefined behaviour.

Use an unsigned value:
const uint16_t uvalue = value;
array[0] = (uvalue << 16) | uvalue;
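A quick sanity check of the unsigned route, as a sketch:

#include <cstdint>
#include <cstdio>

int main() {
    int16_t value = -4000;
    const uint16_t uvalue = value;   // wraps to 0xF060, no sign extension
    // Cast to uint32_t before shifting; a bare uvalue << 16 would be
    // performed in (signed) int after integer promotion.
    int32_t packed = (int32_t)(((uint32_t)uvalue << 16) | uvalue);
    printf("%08X\n", (unsigned)packed);  // prints F060F060
}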

Normally when faced with this kind of issue I first set the resultant value to zero, then bitwise assign the values in.
So the code would be:
int32_t array[1];
int16_t value = -4000;
array[0] = 0x0000FFFF;
array[0] &= value;
array[0] |= (value << 16);

Cast the 16 too. Otherwise the int type is contagious.
array[0] = (value << (int16_t)16) | value;
Edit: I don't have a compiler handy to test. Per comment below, this may not be right, but will get you in the right direction.

Related

Sign extension from 24 bit to 32 bit in C++

I have 3 unsigned bytes that are coming over the wire separately.
[byte1, byte2, byte3]
I need to convert these to a signed 32-bit value but I am not quite sure how to handle the sign of the negative values.
I thought of copying the bytes to the upper 3 bytes in the int32 and then shifting everything to the right but I read this may have unexpected behavior.
Is there an easier way to handle this?
The representation is using two's complement.
You could use:
uint32_t sign_extend_24_32(uint32_t x) {
    const int bits = 24;
    uint32_t m = 1u << (bits - 1);
    return (x ^ m) - m;
}
This works because:
if the old sign was 1, then the XOR makes it zero and the subtraction will set it and borrow through all higher bits, setting them as well.
if the old sign was 0, the XOR will set it, the subtract resets it again and doesn't borrow so the upper bits stay 0.
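For instance, a quick check of the trick with -4000 stored in 24 bits (0xFFF060):

#include <cstdint>
#include <cstdio>

int main() {
    uint32_t x = 0xFFF060;              // -4000 in 24-bit two's complement
    uint32_t m = 1u << 23;              // mask for the 24-bit sign position
    uint32_t r = (x ^ m) - m;           // XOR flips the old sign, subtract borrows
    printf("%08X %d\n", r, (int32_t)r); // prints FFFFF060 -4000
}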
Templated version
template<class T>
T sign_extend(T x, const int bits) {
    T m = 1;
    m <<= bits - 1;
    return (x ^ m) - m;
}
Assuming both representations are two's complement, simply
upper_byte = (Signed_byte(incoming_msb) >= 0? 0 : Byte(-1));
where
using Signed_byte = signed char;
using Byte = unsigned char;
and upper_byte is a variable representing the missing fourth byte.
The conversion to Signed_byte is formally implementation-dependent, but a two's complement implementation doesn't have a choice, really.
You could let the compiler itself perform the sign extension. Assuming that the least significant byte is byte1 and the most significant byte is byte3:
int val = (signed char) byte3;              // C guarantees the sign extension
val <<= 16;                                 // shift the byte to its final place
val |= ((int) (unsigned char) byte2) << 8;  // place the second byte
val |= (int) (unsigned char) byte1;         // and the least significant one
I have used C style cast here when static_cast would have been more C++ish, but as an old dinosaur (and Java programmer) I find C style cast more readable for integer conversions.
This is a pretty old question, but I recently had to do the same (while dealing with 24-bit audio samples), and wrote my own solution for it. It's using a similar principle as this answer, but more generic, and potentially generates better code after compiling.
template <size_t Bits, typename T>
inline constexpr T sign_extend(const T& v) noexcept {
    static_assert(std::is_integral<T>::value, "T is not integral");
    static_assert((sizeof(T) * 8u) >= Bits, "T is smaller than the specified width");
    if constexpr ((sizeof(T) * 8u) == Bits) return v;
    else {
        using S = struct { signed Val : Bits; };
        return reinterpret_cast<const S*>(&v)->Val;
    }
}
This has no hard-coded math, it simply lets the compiler do the work and figure out the best way to sign-extend the number. With certain widths, this can even generate a native sign-extension instruction in the assembly, such as MOVSX on x86.
This function assumes you copied your N-bit number into the lower N bits of the type you want to extend it to. So for example:
int16_t a = -42;
int32_t b{};
memcpy(&b, &a, sizeof(a));
b = sign_extend<16>(b);
Of course it works for any number of bits, extending it to the full width of the type that contained the data.
Here's a method that works for any bit count, even if it's not a multiple of 8. This assumes you've already assembled the 3 bytes into an integer value.
const int bits = 24;
int mask = (1 << bits) - 1;
bool is_negative = (value & ~(mask >> 1)) != 0;
value |= -is_negative & ~mask;
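Wrapped up as a function, a sketch of that sequence (the name sign_extend_n is just illustrative):

#include <cstdint>
#include <cstdio>

int32_t sign_extend_n(int32_t value, int bits) {
    int32_t mask = (int32_t)((1u << bits) - 1);      // low `bits` bits set
    bool is_negative = (value & ~(mask >> 1)) != 0;  // sign bit of the field
    value |= -(int32_t)is_negative & ~mask;          // fill the high bits with 1s
    return value;
}

int main() {
    printf("%d\n", sign_extend_n(0xFFF060, 24));     // prints -4000
}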
You can use a bitfield
template<size_t L>
inline int32_t sign_extend_to_32(const char *x)
{
    struct { int32_t i : L; } s;
    memcpy(&s, x, 3);
    return s.i;
    // or, assuming little endian:
    // return s.i = (x[2] << 16) | (x[1] << 8) | x[0];
}
Easy and no undefined behavior invoked
int32_t r = sign_extend_to_32<24>(your_3byte_array);
Of course copying the bytes to the upper 3 bytes in the int32 and then shifting everything to the right, as you thought, is also a good idea. There's no undefined behavior if you use memcpy as above. An alternative is reinterpret_cast in C++ and union in C, which can avoid the use of memcpy. However, there's implementation-defined behavior there, because right shift is not always a sign-extending shift (although almost all modern compilers implement it as one).
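For reference, a sketch of that "copy high, then shift right" approach with memcpy (assumes little endian; the right shift of a negative value is implementation-defined, but is an arithmetic shift on mainstream compilers):

#include <cstdint>
#include <cstring>

int32_t sign_extend_24(const unsigned char bytes[3]) { // bytes[0] = LSB
    int32_t v = 0;
    memcpy((unsigned char*)&v + 1, bytes, 3);  // place into the upper 3 bytes
    return v >> 8;                             // shift down, extending the sign
}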
Assuming your 24-bit value is stored in an int32_t variable val, you can extend the sign as follows (this relies on an arithmetic right shift, which is implementation-defined for signed types, though virtually all compilers provide it):
val = (val << 8) >> 8;

Merge char* arrays to uint16_t

I'm trying to merge a char*-array into a uint16_t. This is my code so far:
char* arr = new char[2]{0};
arr[0] = 0x0; // Binary: 00000000
arr[1] = 0xBE; // Binary: 10111110
uint16_t merged = (arr[0] << 8) + arr[1];
cout << merged << " As Bitset: " << bitset<16>(merged) << endl;
I was expecting merged to be 0xBE, or in binary 00000000 10111110.
But the output of the application is:
65470 As Bitset: 1111111110111110
In the following descriptions I'm reading bits from left to right.
So arr[1] is at the right position, which is the last 8 bits.
The first 8 bits however are set to 1, which they should not be.
Also, if I change the values to
arr[0] = 0x1; // Binary: 00000001
arr[1] = 0xBE; // Binary: 10111110
The output is:
0000000010111110
Again, arr[1] is at the right position. But now the first 8 bits are 0, whereas the last one of the first 8 should be 1.
Basically what I want to do is append arr[1] to arr[0] and interpret the new number as a whole.
Perhaps char is a signed type in your case: arr[1] then holds 0xBE, which is interpreted as a negative value (-66 in the likely case of two's complement). When it is promoted to int for the addition, the sign bit is extended, hence the leading ones. (Left-shifting a negative char value would additionally be undefined behavior according to the standard.)
3.9.1 Fundamental types....
It is implementation-defined whether a char object can hold negative values.
5.8 Shift operators....
The value of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are zero-filled. If E1 has an unsigned
type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable
in the result type. Otherwise, if E1 has a signed type and non-negative value, and E1 × 2^E2 is representable
in the corresponding unsigned type of the result type, then that value, converted to the result type, is the
resulting value; otherwise, the behavior is undefined.
You need to assign to the wider type before shifting, otherwise you're shifting away† your high bits before they ever even hit the only variable here that's big enough to hold them.
uint16_t merged = arr[0];
merged <<= 8;
merged += arr[1];
Or, arguably:
const uint16_t merged = ((uint16_t)arr[0] << 8) + arr[1];
You also may want to consider converting through unsigned char first to avoid oddities with the high bit set. Try out a few different values in your unit test suite and see what happens.
† Well, your program has undefined behaviour from this out-of-range shift, so who knows what might happen!
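For example, a sketch of the unsigned-char route suggested above:

#include <bitset>
#include <cstdint>
#include <iostream>

int main() {
    char arr[2] = {0x0, (char)0xBE};
    // Going through unsigned char stops 0xBE from sign-extending to 0xFFBE.
    uint16_t merged = ((uint16_t)(unsigned char)arr[0] << 8)
                    + (unsigned char)arr[1];
    std::cout << merged << " As Bitset: " << std::bitset<16>(merged) << "\n";
    // prints: 190 As Bitset: 0000000010111110
}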

Bits shifted by bit shifting operators (<<, >>) in C, C++

Can we access the bits shifted out by the bit shifting operators (<<, >>) in C and C++?
For example:
23>>1
can we access the last bit shifted out (1 in this case)?
No, the shift operators only give the value after shifting. You'll need to do other bitwise operations to extract the bits that are shifted out of the value; for example:
unsigned all_lost = value & ((1 << shift)-1); // all bits to be removed by shift
unsigned last_lost = (value >> (shift-1)) & 1; // last bit to be removed by shift
unsigned remaining = value >> shift; // lose those bits
By using 23>>1, the bit 0x01 is purged - you have no way of retrieving it after the bit shift.
That said, nothing's stopping you from checking for the bit before shifting:
int value = 23;
bool bit1 = value & 0x01;
int shifted = value >> 1;
You can access the bits before shifting, e.g.
value = 23; // start with some value
lsbits = value & 1; // extract the LSB
value >>= 1; // shift
It's worth noting that on the MSVC compiler an intrinsic function, _bittest, exists that can speed up this operation.
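A minimal MSVC-only sketch (note that _bittest takes a pointer to a long and a bit position):

#include <intrin.h>
#include <cstdio>

int main() {
    long value = 23;
    unsigned char bit0 = _bittest(&value, 0); // reads bit 0, value unchanged
    printf("%d\n", (int)bit0);                // prints 1 (23 is odd)
}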

How to set specific bits?

Let's say I've got a uint16_t variable where I must set specific bits.
Example:
uint16_t field = 0;
That would mean the bits are all zero: 0000 0000 0000 0000
Now I get some values that I need to set at specific positions.
val1=1; val2=2; val3=0; val4=4; val5=0;
The structure for setting the bits is the following:
0|000| 0000| 0000 000|0
val1 should be set at the first bit on the left, so it's only one or zero.
val2 should be set at the next three bits, val3 on the next four bits, val4 on the next seven bits, and val5 on the last bit.
The result would be this:
1010 0000 0000 1000
I only found out how to set one specific bit, but not 'groups' (shift or bitset).
Does anyone have an idea how to solve this issue?
There are (at least) two basic approaches. One would be to create a struct with some bitfields:
struct bits {
    unsigned a : 1;
    unsigned b : 7;
    unsigned c : 4;
    unsigned d : 3;
    unsigned e : 1;
};
bits b;
b.a = val1;
b.b = val2;
b.c = val3;
b.d = val4;
b.e = val5;
To get the 16-bit value, you could (for one example) create a union of that struct with a uint16_t. Just one minor problem: the standard doesn't guarantee what order the bit fields will end up in when you look at the 16-bit value. Just for example, you might need to reverse the order I've given above to get the order from most to least significant bits that you really want (but changing compilers might muck things up again).
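As a sketch of that union approach (reading the member that wasn't most recently written is technically undefined in C++, though compilers commonly allow it, and the field order may need reversing as noted):

#include <cstdint>
#include <cstdio>

union packed {
    struct {
        unsigned a : 1;
        unsigned b : 7;
        unsigned c : 4;
        unsigned d : 3;
        unsigned e : 1;
    } bits;
    uint16_t value;
};

int main() {
    packed p{};
    p.bits.a = 1; p.bits.b = 2; p.bits.c = 0; p.bits.d = 4; p.bits.e = 0;
    printf("0x%04X\n", (unsigned)p.value); // layout-dependent result
}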
The other obvious possibility would be to use shifting and masking to put the pieces together into a number:
int16_t result = val1 | (val2 << 1) | (val3 << 8) | (val4 << 12) | (val5 << 15);
For the moment, I've assumed each of the inputs starts out in the correct range (i.e., has a value that can be represented in the chosen number of bits). If there's a possibility that could be wrong, you'd want to mask it to the correct number of bits first. The usual way to do that is something like:
uint16_t result = input & ((1 << num_bits) - 1);
In case you're curious about the math there, it works like this. Let's assume we want to ensure an input fits in 4 bits. Shifting 1 left 4 bits produces 00010000 (in binary). Subtracting one from that then clears the one bit that's set, and sets all the less significant bits, giving 00001111 for our example. That gives us the four least significant bits set. When we do a bit-wise AND between that and the input, any higher bits that were set in the input are cleared in the result.
One of the solutions would be to set a K-bit value starting at the N-th bit of field as:
uint16_t value_mask = ((1<<K)-1) << N; // for K=4 and N=3 will be 00..01111000
field = field & ~value_mask; // zeroing according bits inside the field
field = field | ((value << N) & value_mask); // AND with value_mask is for extra safety
Or, if you can use a struct instead of uint16_t, you can use bit fields and let the compiler perform all these actions for you.
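A sketch wrapping those steps in a helper (the name set_field is just illustrative):

#include <cstdint>
#include <cstdio>

void set_field(uint16_t& field, uint16_t value, int K, int N) {
    uint16_t value_mask = ((1u << K) - 1) << N;
    field &= ~value_mask;                // clear the target bits
    field |= (value << N) & value_mask;  // insert the new value
}

int main() {
    uint16_t field = 0xFFFF;
    set_field(field, 0x5, 4, 3);         // write 0101 at bits 3..6
    printf("0x%04X\n", field);           // prints 0xFFAF
}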
finalvle = 0;
finalvle = (val1&0x01)<<15;
finalvle += (val2&0x07)<<12;
finalvle += (val3&0x0f)<<8;
finalvle += (val4&0x7f)<<1;
finalvle += (val5&0x01);
You can use the bitwise or and shift operators to achieve this.
Use shift << to 'move bits to the left':
int i = 1; // ...0001
int j = i << 3; // ...1000
You can then use bitwise or | to put it at the right place, (assuming you have all zeros at the bits you are trying to overwrite).
int k = 0; // ...0000
k |= i; // ...0001
k |= j; // ...1001
Edit: Note that @Inspired's answer also explains zeroing out a certain area of bits. Overall it explains how you would go about implementing it properly.
Try this code:
uint16_t shift(uint16_t num, int shift)
{
    return num | (int)pow(2, shift);
}
where shift is the position of the bit you want to set. (Note that pow is a floating-point function; forming the mask with 1u << shift is the more usual approach.)

Arithmetic vs logical shift operation in C++

I have some code that stuffs parameters of various lengths (u8, u16, u32) into a u64 with the left shift operator.
Then at various places in the code I need to get back the original parameters from this big bloated parameter.
Just wondering how, in the code, we should ensure that it's a logical right shift and not an arithmetic one while getting back the original parameters.
So the question is: are there any #defs or other ways to ensure and check whether the compiler will screw up?
Here's the C++ code:
u32 x, y, z;
u64 uniqID = 0;
uniqID = (s64) x << 54 |
         (s64) y << 52 |
         (s64) z << 32 |
         uniqID; // the original uniqID value.
And later on while getting the values back :
z = (u32) ((uniqID >> 32) & 0x0FFFFF); // 20 bits
y = (u32) ((uniqID >> 52) & 0x03);     // 2 bits
x = (u32) ((uniqID >> 54) & 0x03F);    // 6 bits
The general rule is that a logical shift is suitable for unsigned binary numbers, while an arithmetic shift is suitable for signed 2's complement numbers. In C and C++, right-shifting an unsigned value is guaranteed to be a logical shift, while right-shifting a negative signed value is implementation-defined (in practice, compilers such as gcc use an arithmetic shift). So if you have an unsigned type, you will get a logical shift.
You can always write your own method to check and do the shifting if you need some portability between compilers. Or you can use inline asm to do this and avoid any issues (but you would be tied to a platform).
In short, to be 100% sure, check your compiler documentation.
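To see the difference concretely, a small sketch (the signed result shown is the typical implementation-defined behavior, not a guarantee):

#include <cstdint>
#include <cstdio>

int main() {
    int64_t s = -4000;                  // 0xFFFFFFFFFFFFF060
    uint64_t u = (uint64_t)s;
    printf("%016llX\n", (unsigned long long)(s >> 8)); // FFFFFFFFFFFFFFF0 (arithmetic)
    printf("%016llX\n", (unsigned long long)(u >> 8)); // 00FFFFFFFFFFFFF0 (logical)
}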
This looks like C/C++, so just make sure uniqID is an unsigned integer type.
Alternatively, just cast it:
z = (u32) (((unsigned long long)uniqID >> 32) & 0x0FFFFF); // 20 bits
y = (u32) (((unsigned long long)uniqID >> 52) & 0x03);     // 2 bits
x = (u32) (((unsigned long long)uniqID >> 54) & 0x03F);    // 6 bits