I am new to working with bits & bytes in C++ and I'm looking at some previously developed code, and I need some help understanding what is going on. There is a byte array being populated with some data, and I noticed that the data was being '&'-ed with 0x0F (please see the code snippet below). I don't really understand what is going on there, so if somebody could explain it, it would be greatly appreciated. Thanks!
//Message Definition
/*
Byte 1: Bit(s) 3:0 = Unused; set to zero
Bit(s) 7:4 = Message ID; set to 10
*/
/*
Byte 2: Bit(s) 3:0 = Unused; set to zero
Bit(s) 7:4 = Acknowledge Message ID; set to 11
*/
//Implementation
BYTE Msg_Arry[2];
int Msg_Id = 10;
int AckMsg_Id = 11;
Msg_Arry[0] = Msg_Id & 0x0F; //MsgID & Unused
Msg_Arry[1] = AckMsg_Id & 0x0F; //AckMsgID & Unused
0x0f is 00001111 in binary. When you perform a bitwise-and (&) with this, it has the effect of masking off the top four bits of the char (because 0 & anything is always 0).
x & 0xF
returns the low four bits of the data.
If you think of the binary representation of x, and use the and operator with 0x0f (00001111 in binary), the top four bits of x will always become zero, and the bottom four bits will become what they were before the operation.
In the given example, it actually does nothing. Msg_Id and AckMsg_Id are both less than 0x0F, and so masking them has no effect here.
In general, the bitwise-and operator (&) on integer types performs a bit-for-bit AND between the given operands.
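To see the mask do something, here is a minimal sketch (the value 0xAB is just an illustrative example, not from the question):
#include <iostream>

int main()
{
    unsigned char value = 0xAB;          // binary 1010 1011
    unsigned char low  = value & 0x0F;   // keep bits 3:0 -> 0x0B
    unsigned char high = value >> 4;     // keep bits 7:4 -> 0x0A
    std::cout << std::hex << int(low) << " " << int(high) << std::endl;  // prints b a
    return 0;
}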
So I have a little piece of code that takes two uint8_t's and places them next to each other, and then returns a uint16_t. The point is not adding the two variables, but putting them next to each other and creating a uint16_t from them.
When the first uint8_t is 0 and the second uint8_t is 1, I expect the resulting uint16_t to be 1.
However, this is in my code not the case.
This is my code:
uint8_t *bytes = new uint8_t[2];
bytes[0] = 0;
bytes[1] = 1;
uint16_t out = *((uint16_t*)bytes);
It is supposed to reinterpret the bytes uint8_t pointer as a uint16_t pointer, and then read the value through it. I expect that value to be 1 since x86 is little endian. However it returns 256.
Setting the first byte to 1 and the second byte to 0 makes it work as expected. But I am wondering why I need to switch the bytes around in order for it to work.
Can anyone explain that to me?
Thanks!
There is no uint16_t or compatible object at that address, and so the behaviour of *((uint16_t*)bytes) is undefined.
I expect that value to be 1 since x86 is little endian. However it returns 256.
Even if the program was fixed to have well defined behaviour, your expectation is backwards. In little endian, the least significant byte is stored in the lowest address. Thus 2 byte value 1 is stored as 1, 0 and not 0, 1.
Does endianness also affect the order of the bits within a byte, or not?
There is no way to access a bit by "address"[1], so there is no concept of endianness. When converting to text, bits are conventionally shown most significant on the left and least significant on the right, just like the digits of decimal numbers. I don't know if this is true in right-to-left writing systems.
[1] You can sort of create "virtual addresses" for bits using bitfields. The order of bitfields, i.e. whether the first bitfield is most or least significant, is implementation defined and not necessarily related to byte endianness at all.
Here is a correct way to set two octets as uint16_t. The result will depend on endianness of the system:
// no need to complicate a simple example with dynamic allocation
uint16_t out;
// note that there is an exception in language rules that
// allows accessing any object through narrow (unsigned) char
// or std::byte pointers; thus following is well defined
std::byte* data = reinterpret_cast<std::byte*>(&out);
data[0] = 1;
data[1] = 0;
Note that assuming that input is in native endianness is usually not a good choice, especially when compatibility across multiple systems is required, such as when communicating through network, or accessing files that may be shared to other systems.
In these cases, the communication protocol, or the file format typically specify that the data is in specific endianness which may or may not be the same as the native endianness of your target system. De facto standard in network communication is to use big endian. Data in particular endianness can be converted to native endianness using bit shifts, as shown in Frodyne's answer for example.
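For example, a two-octet field specified as big-endian can be decoded portably with shifts, regardless of the native byte order (a sketch; the function name decode_be16 and the buffer buf are illustrative assumptions):
#include <cstdint>

// Assemble two octets received in network (big-endian) order into a native
// uint16_t, independent of the host's endianness.
uint16_t decode_be16(const unsigned char* buf)
{
    return static_cast<uint16_t>((buf[0] << 8) | buf[1]);   // buf[0] is the most significant octet
}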
In a little-endian system the low-order bytes are placed first. In other words: the low byte is placed at offset 0, and the high byte at offset 1 (and so on). So this:
uint8_t* bytes = new uint8_t[2];
bytes[0] = 1;
bytes[1] = 0;
uint16_t out = *((uint16_t*)bytes);
Produces the out = 1 result you want.
However, as you can see this is easy to get wrong, so in general I would recommend that instead of trying to place stuff correctly in memory and then cast it around, you do something like this:
uint16_t out = lowByte + (highByte << 8);
That will work on any machine, regardless of endianness.
Edit: Bit shifting explanation added.
x << y means to shift the bits in x y places to the left (>> moves them to the right instead).
If X contains the bit-pattern xxxxxxxx, and Y contains the bit-pattern yyyyyyyy, then (X << 8) produces the pattern: xxxxxxxx00000000, and Y + (X << 8) produces: xxxxxxxxyyyyyyyy.
(And Y + (X<<8) + (Z<<16) produces zzzzzzzzxxxxxxxxyyyyyyyy, etc.)
A single shift to the left is the same as multiplying by 2, so X << 8 is the same as X * 2^8 = X * 256. That means that you can also do: Y + (X*256) + (Z*65536), but I think the shifts are clearer and show the intent better.
Note that again: Endianness does not matter. Shifting 8 bits to the left will always clear the low 8 bits.
You can read more here: https://en.wikipedia.org/wiki/Bitwise_operation. Note the difference between Arithmetic and Logical shifts - in C/C++ unsigned values use logical shifts, and signed use arithmetic shifts.
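As a concrete, compilable sketch of that approach (the byte values are taken from the question):
#include <cstdint>
#include <iostream>

int main()
{
    uint8_t lowByte  = 1;
    uint8_t highByte = 0;

    // Combine the bytes without relying on memory layout.
    uint16_t out = lowByte + (highByte << 8);
    std::cout << out << std::endl;                       // prints 1 on any machine

    // Split the value back into bytes, again endianness-independent.
    uint8_t lo = out & 0xFF;
    uint8_t hi = (out >> 8) & 0xFF;
    std::cout << int(lo) << " " << int(hi) << std::endl; // prints 1 0
    return 0;
}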
If p is a pointer to some multi-byte value, then:
"Little-endian" means that the byte at p is the least-significant byte, in other words, it contains bits 0-7 of the value.
"Big-endian" means that the byte at p is the most-significant byte, which for a 16-bit value would be bits 8-15.
Since Intel x86 is little-endian, bytes[0] contains bits 0-7 of the uint16_t value and bytes[1] contains bits 8-15. Since you are trying to set bit 0, you need:
bytes[0] = 1; // Bits 0-7
bytes[1] = 0; // Bits 8-15
Your code works, but you misinterpreted how to read "bytes":
#include <cstdint>
#include <cstddef>
#include <iostream>

int main()
{
    uint8_t *in = new uint8_t[2];
    in[0] = 3;
    in[1] = 1;
    uint16_t out = *((uint16_t*)in);
    std::cout << "out: " << out << "\n in: " << in[1] * 256 + in[0] << std::endl;
    return 0;
}
By the way, you should take care of alignment when casting this way.
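One way to sidestep the alignment problem is to copy the bytes instead of casting the pointer; a sketch (same values as above, the result still depends on the host's byte order):
#include <cstdint>
#include <cstring>
#include <iostream>

int main()
{
    uint8_t in[2] = {3, 1};

    // std::memcpy avoids the alignment (and aliasing) issues of the pointer cast.
    uint16_t out;
    std::memcpy(&out, in, sizeof(out));
    std::cout << "out: " << out << "\n in: " << in[1] * 256 + in[0] << std::endl;
    return 0;
}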
One way to think about numbers is in terms of MSB and LSB order, where the MSB is the most significant (highest) bit and the LSB is the least significant (lowest) bit.
For example:
(u)int32: MSB: Bit 31 ... LSB: Bit 0
(u)int16: MSB: Bit 15 ... LSB: Bit 0
(u)int8 : MSB: Bit 7  ... LSB: Bit 0
With your cast to a 16-bit value, the bytes are arranged like this:
16-bit value:   MSB ............. LSB      <=   BYTE[1]      BYTE[0]
                Bit 15 ......... Bit 0          Bit 7 .. 0   Bit 7 .. 0
                0000 0001 0000 0000             0000 0001    0000 0000
which is 256 -> the correct value.
I am trying to use the first bit of an unsigned int as a flag for whether the client or server should re-key their encryption keys. I would like to use the rest of the unsigned int as the length of the remaining data that is being sent.
Currently I am trying this but it doesn't seem to work.
unsigned int payloadLength;
read(sock, &payloadLength, sizeof(payloadLength));
short bit = (payloadLength >> 0) & 1U; // get first bit
payloadLength &= 1UL << 0; // set first bit to 0
payloadLength = ntohl(payloadLength);
if (bit == 1)
    //rekey
else
    //read more data
The rekey flag seems to be set right, but when trying to get the length it always ends up as the wrong number.
Edit: I should have clarified I meant most significant bit, not first bit
Let's take a look at this code:
short bit = (payloadLength >> 0) & 1U; // get first bit
payloadLength &= 1UL << 0; // set first bit to 0
payloadLength = ntohl(payloadLength);
Although it's perfectly legal for you to order these statements this way, there's a risk that this won't work the way you want it to. Specifically, you probably want to decode the payload to use the host byte ordering before you start poking and prodding the bytes. Otherwise, if you send data from one system to another, there's a risk that you'll be reading the wrong bits back. But that's not too hard to fix - just move the last statement, which decodes payloadLength, to the top, like this:
/* CAUTION: This still has errors! */
payloadLength = ntohl(payloadLength);
short bit = (payloadLength >> 0) & 1U; // get first bit
payloadLength &= 1UL << 0; // set first bit to 0
Next, there's a question about which bit you are trying to read. It looks like you are trying to read the least-significant bit of the number. If that's the case, you don't need to include any bitshifts, as bitshifting by zero positions doesn't have any effect. Let's remove those, giving this:
/* CAUTION: This still has errors! */
payloadLength = ntohl(payloadLength);
short bit = payloadLength & 1U; // get first bit
payloadLength &= 1UL; // set first bit to 0
Your code to extract the lowest bit of the number is correct. It'll mask out everything except the lowest bit, then store the value in the variable bit. (Out of curiosity, is there any reason you're storing this as a short? If you're looking for a boolean "do I need to reencode this?", you might want to just go with bool and write something like
bool reencode = (payloadLength & 1U) != 0;
That's up to you to decide, though.)
However, your code to clear the last bit is incorrect. Right now, when you write
payloadLength &= 1UL; // set first bit to 0
you are actually doing the opposite of what you intended - you're clearing every bit except the first. That's because if you AND payloadLength with a value, you're zeroing every bit in payloadLength except for the bits that are equal to 1 in 1UL. However, 1UL only has a 1 bit in the last position. You probably meant to write something like
payloadLength &= ~1UL; // set first bit to 0
where the ~ operator flips the bits of your mask. Overall, this would give you the following:
payloadLength = ntohl(payloadLength);
short bit = payloadLength & 1U; // get first bit
payloadLength &= ~1UL; // set first bit to 0
I have one last question, though. By repurposing the lowest bit of the number this way, you are requiring that your payload length always be an even number, since you're harnessing the 1's bit of the payload length to encode whether to rekey. If you're okay with that, great! You don't need to do anything.
On the other hand, if that's an issue, you have a couple of options, which I'll leave to you to select from:
Instead of using the least-significant bit, use the most-significant bit. This will cause problems if you try to send payloads of size 2^31 or higher, though.
Use the least-significant bit, but have the upper 31 bits of the number represent the actual payload length. To extract the payload length, just shift everything over one position. This has the drawback of not supporting payloads of size 2^31 or greater, though.
Don't use the bits at all to encode this! Instead, send a payload length, then have the payload start with a header that tells you whether to rekey, etc. This uses more bytes per payload, but also gives you more flexibility in the future (what if you need to send other flags as well?)
Hope this helps!
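For reference, here is a minimal sketch of the first option (most-significant-bit flag), assuming a 32-bit length field sent in network byte order; the function names are illustrative, not from the question:
#include <cstdint>
#include <arpa/inet.h>   // htonl / ntohl (POSIX)

// Pack the rekey flag into the top bit and the length into the low 31 bits.
uint32_t encodeHeader(bool rekey, uint32_t length)
{
    uint32_t host = (length & 0x7FFFFFFFu) | (rekey ? 0x80000000u : 0u);
    return htonl(host);                  // convert to network order before sending
}

// Decode on the receiving side: convert to host order first, then split.
void decodeHeader(uint32_t wire, bool &rekey, uint32_t &length)
{
    uint32_t host = ntohl(wire);
    rekey  = (host & 0x80000000u) != 0;
    length = host & 0x7FFFFFFFu;
}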
payloadLength &= 1UL << 0; will zero all bits other than the first one.
payloadLength &= ~1UL; will zero the first bit.
I came across the following code to convert 16-bit numbers to 10-bit numbers and store it inside an integer. Could anyone maybe explain to me what exactly is happening with the AND 0x03?
// Convert the data to 10-bits
int xAccl = (((data[1] & 0x03) * 256) + data[0]);
if (xAccl > 511) {
    xAccl -= 1024;
}
Link to where I got the code: https://www.instructables.com/id/Measurement-of-Acceleration-Using-ADXL345-and-Ardu/
The bitwise operator & applies a mask; in this case it clears everything except the two lowest bits, i.e. it zeroes the six highest bits of the byte.
Basically, this code does a modulo % 1024 (for unsigned values).
data[1] takes the 2nd byte; & 0x03 masks that byte with binary 11, so it keeps 2 bits; * 256 is the same as << 8, i.e. it pushes those 2 bits into the 9th and 10th positions; adding data[0] combines the two bytes (personally I'd have used |, not +).
So xAccl now holds the first 10 bits, with data[1] supplying the two high bits and data[0] the low eight.
The > 511 seems to be a sign check; essentially, it is saying "if the 10th bit is set, treat the entire thing as a negative integer as though we'd used 10-bit twos complement rules".
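Written out as a small helper, the same conversion might look like this (a sketch; the function name to10Bit is illustrative):
#include <cstdint>

// Combine the low byte and the 2 useful bits of the high byte into a signed
// 10-bit reading, mirroring the snippet above.
int to10Bit(uint8_t lo, uint8_t hi)
{
    int value = ((hi & 0x03) << 8) | lo;   // 0 .. 1023
    if (value > 511)                       // 10th bit set -> negative in two's complement
        value -= 1024;
    return value;                          // -512 .. 511
}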
I have a question regarding both masking and bit shifting.
I have the following code:
void WriteLCD(unsigned char word, unsigned commandType, unsigned usDelay)
{
    // Most Significant Bits
    // Need to do bit masking for upper nibble, and shift left by 8.
    LCD_D = (LCD & 0x0FFF) | (word << 8);
    EnableLCD(commandType, usDelay); // Send Data

    // Least Significant Bits
    // Need to do bit masking for lower nibble, and shift left by 12.
    LCD_D = (LCD & 0x0FFF) | (word << 12);
    EnableLCD(commandType, usDelay); // Send Data
}
The "word" is 8 bits, and is being put through a 4 bit LCD interface. Meaning I have to break the most significant bits and least significant bits apart before I send the data.
LCD_D is a 16-bit register, and only the most significant bits I pass to it should actually "do" anything; I want the existing lower 12 bits preserved in case they were doing something else.
Is my logic in terms of bit masking and shifting the "word" correct in terms of passing the upper and lower nibbles appropriately to the LCD_D?
Thanks for the help!
Looks OK apart from needing to cast "word" to an unsigned short (16-bit) before the shift, in both cases, so that the shift is not performed on a char and loses data, e.g.:
LCD_D = (LCD & 0x0FFF) | ((unsigned short) word << 8);
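Put together, the corrected function might look like this (a sketch; LCD, LCD_D and EnableLCD are assumed to be defined as in the original code):
void WriteLCD(unsigned char word, unsigned commandType, unsigned usDelay)
{
    // Upper nibble: cast before shifting so the data is not lost.
    LCD_D = (LCD & 0x0FFF) | ((unsigned short)word << 8);
    EnableLCD(commandType, usDelay); // Send Data

    // Lower nibble: same idea, shifted into the top four bits.
    LCD_D = (LCD & 0x0FFF) | ((unsigned short)word << 12);
    EnableLCD(commandType, usDelay); // Send Data
}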
I don't understand what this code is doing at all, could someone please explain it?
long input; //just here to show the type, assume it has a value stored
unsigned int output( input >> 4 & 0x0F );
Thanks
It bit-shifts the input 4 bits to the right, then masks the result down to the lower 4 bits.
Take this example 16 bit number: (the dots are just for visual separation)
1001.1111.1101.1001 >> 4 = 0000.1001.1111.1101
0000.1001.1111.1101 & 0x0F = 1101 (or 0000.0000.0000.1101 to be more explicit)
& is the bitwise AND operator. "& 0x0F" is sometimes done to zero the upper four bits of a value, i.e. to ignore the first (leftmost) four bits of a byte and keep only the lowest four.
0x0f = 00001111. So a bitwise & operation of 0x0f with any other bit pattern will retain only the rightmost 4 bits, clearing the left 4 bits.
If the input has a value of 01010001, after doing &0x0F, we'll get 00000001 - which is a pattern we get after clearing the left 4 bits.
Just as another example, this is a code I've used in a project:
Byte verflag = (Byte)(bIsAck & 0x0f) | ((version << 4) & 0xf0). Here I'm combining two values into a single Byte value to save space because it's being used in a packet header structure. bIsAck is a BOOL and version is a Byte whose value is very small. So both these values can be contained in a single Byte variable.
The first (high) nibble in the resulting variable contains the value of version and the second (low) nibble contains the value of bIsAck. I can retrieve the values into separate variables at the receiving end by shifting right 4 bits when reading out the value of version.
Hope this is somewhere near to what you asked for.
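A hedged sketch of that pack/unpack round trip (Byte is assumed to be an 8-bit type; the function names are illustrative):
#include <cstdint>

using Byte = uint8_t;

// Pack: version goes into the high nibble, the ack flag into the low nibble.
Byte packVerFlag(bool bIsAck, Byte version)
{
    return static_cast<Byte>((bIsAck & 0x0F) | ((version << 4) & 0xF0));
}

// Unpack at the receiving end: mask for the flag, shift right 4 for the version.
void unpackVerFlag(Byte verflag, bool &bIsAck, Byte &version)
{
    bIsAck  = (verflag & 0x0F) != 0;
    version = verflag >> 4;
}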
That is doing a bitwise right shift of the contents of "input" by 4 bits, then a bitwise AND of the result with 0x0F (binary 1111).
What it does depends on the contents and type of "input". Is it an int? A long? A string (which would mean the shift and bitwise AND are being done on a pointer to the first byte).
Google for "c++ bitwise operations" for more details on what's going on under the hood.
Additionally, look at C++ operator precedence because the C/C++ precedence is not exactly the same as in many other languages.
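For instance, because >> binds more tightly than &, the expression parses as (input >> 4) & 0x0F; a minimal sketch using the 16-bit example above:
#include <iostream>

int main()
{
    long input = 0x9FD9;                       // 1001.1111.1101.1001
    unsigned int output(input >> 4 & 0x0F);    // same as (input >> 4) & 0x0F
    std::cout << output << std::endl;          // prints 13 (binary 1101)
    return 0;
}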