Parsing 32 bit integer in C program - c++

I have a 32 bit integer, split into parts like this:
--------------------------------------
| Part1 | Part2 | Part 3 |
--------------------------------------
Part 1 higher 16 bits. (Part 2 + Part 3) = lower 16 bits.
Part 2 is 10 bits and Part 3 is 6 bits
I need help on how do we read and update part 1, part2 and part 3 in C program.

Given an integer x with the above format, you can replace Part2 like this:
x = (x & ~(0x3ff << 6)) | (newPart2 << 6);
and Part3 like so:
x = (x & ~0x3f) | newPart3;
This assumes that both newPart2 and newPart3 are e.g. unsigned int with their new values right-adjusted.

int i
To extract the individual parts
part1 = (i & 0xFFFF0000) >> 16
part2 = (i & 0x0000FFC0) >> 6
part3 = (i & 0x0000003F)
To compose the integer
i = (part1 << 16) | (part2 << 6) | (part3)

Try cast to this structure
struct {
uint32_t part_1:16;
uint32_t part_2:10;
uint32_t part_3:6;
} parts;
Could be the one below depending on endianness
struct {
uint32_t part_1:6;
uint32_t part_2:10;
uint32_t part_3:16;
} parts;
Obviously not portable!
Since you need to read and update, a pointer will do. For example, if you 32bit value is called x, you do the following
parts *ptr = (parts *)&x;
ptr->part_2 = <part2 update>

The theory to be used behind this are and, or and shift operations with masks.
To access some bits of the integer, first create a mask where there are ones in the bits you want to be used. Now apply and and(&) operation between the mask and the integer. According to the behavior of the & the bits where the mask is 0 will be 0 and where the mask is 1 will have the value of that bit in the integer. Now that we have only the bits we want we align them to the right, that is done shifting the bits to the right the correct number of positions as to leave the rightmost bit of the mask in the less significant position of the byte.
To write in a part of a byte, we need fist to nullify what was in that part for that we use the negated mask that is used to read that part. Once that part is negated we apply an or(|) operation with the new value that must be aligned to that position.
To read:
unsigned int read_part_1(unsigned int composed) {
return (composed & 0xffff0000) >> 16;
}
unsigned int read_part_2(unsigned int composed) {
return (composed & 0x0000ffc0) >> 6;
}
unsigned int read_part_3(unsigned int composed) {
return (composed & 0x0000003f);
}
To write(val aligned to the right):
unsigned int write_part_1(unsigned int composed, unsigned int val) {
return (composed & ~0xffff0000) | ((val & 0x0000ffff) << 16);
}
unsigned int write_part_2(unsigned int composed, unsigned int val) {
return (composed & ~0x0000ffc0) | ((val & 0x000003ff) << 10);
}
unsigned int write_part_3(unsigned int composed, unsigned int val) {
return (composed & ~0x0000003f) | (val & 0x0000003f);
}

Related

Reverse nibbles of a hexadecimal number in C++

What would be the fastest way possible to reverse the nibbles (e.g digits) of a hexadecimal number in C++?
Here's an example of what I mean : 0x12345 -> 0x54321
Here's what I already have:
unsigned int rotation (unsigned int hex) {
unsigned int result = 0;
while (hex) {
result = (result << 4) | (hex & 0xF);
hex >>= 4;
}
return result;
}
This problem can be split into two parts:
Reverse the nibbles of an integer. Reverse the bytes, and swap the nibble within each byte.
Shift the reversed result right by some amount to adjust for the "variable length". There are std::countl_zero(x) & -4 (number of leading zeroes, rounded down to a multiple of 4) leading zero bits that are part of the leading zeroes in hexadecimal, shifting right by that amount makes them not participate in the reversal.
For example, using some of the new functions from <bit>:
#include <stdint.h>
#include <bit>
uint32_t reverse_nibbles(uint32_t x) {
// reverse bytes
uint32_t r = std::byteswap(x);
// swap adjacent nibbles
r = ((r & 0x0F0F0F0F) << 4) | ((r >> 4) & 0x0F0F0F0F);
// adjust for variable-length of input
int len_of_zero_prefix = std::countl_zero(x) & -4;
return r >> len_of_zero_prefix;
}
That requires C++23 for std::byteswap which may be a bit optimistic, you can substitute it with some other byteswap.
Easily adaptable to uint64_t too.
i would do it without loops based on the assumption that the input is 32 bits
result = (hex & 0x0000000f) << 28
| (hex & 0x000000f0) << 20
| (hex & 0x00000f00) << 12
....
dont know if faster, but I find it more readable

Convert every 5 bits into integer values in C++

Firstly, if anyone has a better title for me, let me know.
Here is an example of the process I am trying to automate with C++
I have an array of values that appear in this format:
9C07 9385 9BC7 00 9BC3 9BC7 9385
I need to convert them to binary and then convert every 5 bits to decimal like so with the last bit being a flag:
I'll do this with only the first word here.
9C07
10011 | 10000 | 00011 | 1
19 | 16 | 3
These are actually x,y,z coordinates and the final bit determines the order they are in a '0' would make it x=19 y=16 z=3 and '1' is x=16 y=3 z=19
I already have a buffer filled with these hex values, but I have no idea where to go from here.
I assume these are integer literals, not strings?
The way to do this is with bitwise right shift (>>) and bitwise AND (&)
#include <cstdint>
struct Coordinate {
std::uint8_t x;
std::uint8_t y;
std::uint8_t z;
constexpr Coordinate(std::uint16_t n) noexcept
{
if (n & 1) { // flag
x = (n >> 6) & 0x1F; // 1 1111
y = (n >> 1) & 0x1F;
z = n >> 11;
} else {
x = n >> 11;
y = (n >> 6) & 0x1F;
z = (n >> 1) & 0x1F;
}
}
};
The following code would extract the three coordinates and the flag from the 16 least significant bits of value (ie. its least significant word).
int flag = value & 1; // keep only the least significant bit
value >>= 1; // shift right by one bit
int third_integer = value & 0x1f; // keep only the five least significant bits
value >>= 5; // shift right by five bits
int second_integer = value & 0x1f; // keep only the five least significant bits
value >>= 5; // shift right by five bits
int first_integer = value & 0x1f; // keep only the five least significant bits
value >>= 5; // shift right by five bits (only useful if there are other words in "value")
What you need is most likely some loop doing this on each word of your array.

Manually changing a group of bytes in an unsigned int

I'm working with C and I'm trying to figure out how to change a set of bits in a 32-bit unsigned integer.
For example, if I have
int a = 17212403u;
In binary, that becomes 1000001101010001111110011. Now, supposing I labeled these bits, which are arranged in little-endian format, such that the bit utmost right represents the ones, the second to the right is the twos, and so on, how can I manually change a group of bits?
For example, suppose I wanted to change the bits such that the 11th bit to the 15th bit has the decimal value of 17. How would this be possible?
I was thinking of getting that range by doing as such:
unsigned int range = (a << (sizeof(a) * 8) - 14) >> (28)
But I'm not sure where to go on from now.
You will (1) first have to clear the bits 11..15, and (2) then to set the bits according to the value you want to set. To achieve (1), create a "mask" that has all bits set to 1 except the ones that you want to clear; use then a & bitMask to set the bits to 0. Then, use | myValue to set the bits to the value wanted.
Use the bit shift operator << to place the mask and the value at the right positions:
int main(int argc, char** argv) {
// Let's assume a range of 5 bits
unsigned int bitRange = 0x1Fu; // is ...00000000011111
// Let's assume to position the range from bit 11 onwards (i.e. move 10 left):
bitRange = bitRange << 10; // something like 000000111110000000000
unsigned int bitMask = ~bitRange; // something like 111111000001111111111
unsigned int valueToSet = (17u << 10); // corresponds to 000000101110000000000
unsigned int a = (17212403u & bitMask) | valueToSet;
return 0;
}
This is the long version to explain what's going on. In brief, you could also write:
unsigned int a = (17212403u & ~(0x1Fu << 10)) | (17u << 10)
The 11th to 15th bit is 5 bits, assuming you meant including the 15th bit. 5 bits is the hex value: 0x1f
Then you shift these 5 bits 11 position to the left:0x1f << 11
Now we have a mask for the bits 11 through 15 that we want to clear in the original variable, which - we do that by inverting the mask, bitwise and the variable with the inverted mask: a & ~(0x1f << 11)
Next is shifting the value 17 up to the 11th bit: 17 << 11
Then we bitwise or that into the 5 bits we have cleared:
unsigned int b = (a & ~(0x1f << 11)) | (17 << 11)
Consider using bit fields. This allows you to name and access sub-sections of the integer as though they were integer members of a struct.
For info on C bitfields see:
https://www.tutorialspoint.com/cprogramming/c_bit_fields.htm
Below is code to do what you want, using bitfields. The "middle5" member of the struct holds bits 11-15. The "lower11" member is a filler for the lower 11 bits, so that the "middle5" member will be in the right place.
#include <stdio.h>
void showBits(unsigned int w)
{
unsigned int bit = 1<<31;
while (bit > 0)
{
printf("%d", ((bit & w) != 0)? 1 : 0);
bit >>= 1;
}
printf("\n");
}
int main(int argc, char* argv[])
{
struct aBitfield {
unsigned int lower11: 11;
unsigned int middle5: 5;
unsigned int upper16: 16;
};
union uintBits {
unsigned int whole;
struct aBitfield parts;
};
union uintBits b;
b.whole = 17212403u;
printf("Before:\n");
showBits(b.whole);
b.parts.middle5 = 17;
printf("After:\n");
showBits(b.whole);
}
Output of the program:
Before:
00000001000001101010001111110011
After:
00000001000001101000101111110011
Of course, you would want to use more meaningful naming for the various fields.
Be careful though, bitfields may be implemented differently on different platforms - so it may not be completely portable.

How to read/write arbitrary bits in C/C++

Assuming I have a byte b with the binary value of 11111111
How do I for example read a 3 bit integer value starting at the second bit or write a four bit integer value starting at the fifth bit?
Some 2+ years after I asked this question I'd like to explain it the way I'd want it explained back when I was still a complete newb and would be most beneficial to people who want to understand the process.
First of all, forget the "11111111" example value, which is not really all that suited for the visual explanation of the process. So let the initial value be 10111011 (187 decimal) which will be a little more illustrative of the process.
1 - how to read a 3 bit value starting from the second bit:
___ <- those 3 bits
10111011
The value is 101, or 5 in decimal, there are 2 possible ways to get it:
mask and shift
In this approach, the needed bits are first masked with the value 00001110 (14 decimal) after which it is shifted in place:
___
10111011 AND
00001110 =
00001010 >> 1 =
___
00000101
The expression for this would be: (value & 14) >> 1
shift and mask
This approach is similar, but the order of operations is reversed, meaning the original value is shifted and then masked with 00000111 (7) to only leave the last 3 bits:
___
10111011 >> 1
___
01011101 AND
00000111
00000101
The expression for this would be: (value >> 1) & 7
Both approaches involve the same amount of complexity, and therefore will not differ in performance.
2 - how to write a 3 bit value starting from the second bit:
In this case, the initial value is known, and when this is the case in code, you may be able to come up with a way to set the known value to another known value which uses less operations, but in reality this is rarely the case, most of the time the code will know neither the initial value, nor the one which is to be written.
This means that in order for the new value to be successfully "spliced" into byte, the target bits must be set to zero, after which the shifted value is "spliced" in place, which is the first step:
___
10111011 AND
11110001 (241) =
10110001 (masked original value)
The second step is to shift the value we want to write in the 3 bits, say we want to change that from 101 (5) to 110 (6)
___
00000110 << 1 =
___
00001100 (shifted "splice" value)
The third and final step is to splice the masked original value with the shifted "splice" value:
10110001 OR
00001100 =
___
10111101
The expression for the whole process would be: (value & 241) | (6 << 1)
Bonus - how to generate the read and write masks:
Naturally, using a binary to decimal converter is far from elegant, especially in the case of 32 and 64 bit containers - decimal values get crazy big. It is possible to easily generate the masks with expressions, which the compiler can efficiently resolve during compilation:
read mask for "mask and shift": ((1 << fieldLength) - 1) << (fieldIndex - 1), assuming that the index at the first bit is 1 (not zero)
read mask for "shift and mask": (1 << fieldLength) - 1 (index does not play a role here since it is always shifted to the first bit
write mask : just invert the "mask and shift" mask expression with the ~ operator
How does it work (with the 3bit field beginning at the second bit from the examples above)?
00000001 << 3
00001000 - 1
00000111 << 1
00001110 ~ (read mask)
11110001 (write mask)
The same examples apply to wider integers and arbitrary bit width and position of the fields, with the shift and mask values varying accordingly.
Also note that the examples assume unsigned integer, which is what you want to use in order to use integers as portable bit-field alternative (regular bit-fields are in no way guaranteed by the standard to be portable), both left and right shift insert a padding 0, which is not the case with right shifting a signed integer.
Even easier:
Using this set of macros (but only in C++ since it relies on the generation of member functions):
#define GETMASK(index, size) ((((size_t)1 << (size)) - 1) << (index))
#define READFROM(data, index, size) (((data) & GETMASK((index), (size))) >> (index))
#define WRITETO(data, index, size, value) ((data) = (((data) & (~GETMASK((index), (size)))) | (((value) << (index)) & (GETMASK((index), (size))))))
#define FIELD(data, name, index, size) \
inline decltype(data) name() const { return READFROM(data, index, size); } \
inline void set_##name(decltype(data) value) { WRITETO(data, index, size, value); }
You could go for something as simple as:
struct A {
uint bitData;
FIELD(bitData, one, 0, 1)
FIELD(bitData, two, 1, 2)
};
And have the bit fields implemented as properties you can easily access:
A a;
a.set_two(3);
cout << a.two();
Replace decltype with gcc's typeof pre-C++11.
You need to shift and mask the value, so for example...
If you want to read the first two bits, you just need to mask them off like so:
int value = input & 0x3;
If you want to offset it you need to shift right N bits and then mask off the bits you want:
int value = (intput >> 1) & 0x3;
To read three bits like you asked in your question.
int value = (input >> 1) & 0x7;
just use this and feelfree:
#define BitVal(data,y) ( (data>>y) & 1) /** Return Data.Y value **/
#define SetBit(data,y) data |= (1 << y) /** Set Data.Y to 1 **/
#define ClearBit(data,y) data &= ~(1 << y) /** Clear Data.Y to 0 **/
#define TogleBit(data,y) (data ^=BitVal(y)) /** Togle Data.Y value **/
#define Togle(data) (data =~data ) /** Togle Data value **/
for example:
uint8_t number = 0x05; //0b00000101
uint8_t bit_2 = BitVal(number,2); // bit_2 = 1
uint8_t bit_1 = BitVal(number,1); // bit_1 = 0
SetBit(number,1); // number = 0x07 => 0b00000111
ClearBit(number,2); // number =0x03 => 0b0000011
You have to do a shift and mask (AND) operation.
Let b be any byte and p be the index (>= 0) of the bit from which you want to take n bits (>= 1).
First you have to shift right b by p times:
x = b >> p;
Second you have to mask the result with n ones:
mask = (1 << n) - 1;
y = x & mask;
You can put everything in a macro:
#define TAKE_N_BITS_FROM(b, p, n) ((b) >> (p)) & ((1 << (n)) - 1)
"How do I for example read a 3 bit integer value starting at the second bit?"
int number = // whatever;
uint8_t val; // uint8_t is the smallest data type capable of holding 3 bits
val = (number & (1 << 2 | 1 << 3 | 1 << 4)) >> 2;
(I assumed that "second bit" is bit #2, i. e. the third bit really.)
To read bytes use std::bitset
const int bits_in_byte = 8;
char myChar = 's';
cout << bitset<sizeof(myChar) * bits_in_byte>(myChar);
To write you need to use bit-wise operators such as & ^ | & << >>. make sure to learn what they do.
For example to have 00100100 you need to set the first bit to 1, and shift it with the << >> operators 5 times. if you want to continue writing you just continue to set the first bit and shift it. it's very much like an old typewriter: you write, and shift the paper.
For 00100100: set the first bit to 1, shift 5 times, set the first bit to 1, and shift 2 times:
const int bits_in_byte = 8;
char myChar = 0;
myChar = myChar | (0x1 << 5 | 0x1 << 2);
cout << bitset<sizeof(myChar) * bits_in_byte>(myChar);
int x = 0xFF; //your number - 11111111
How do I for example read a 3 bit integer value starting at the second bit
int y = x & ( 0x7 << 2 ) // 0x7 is 111
// and you shift it 2 to the left
If you keep grabbing bits from your data, you might want to use a bitfield. You'll just have to set up a struct and load it with only ones and zeroes:
struct bitfield{
unsigned int bit : 1
}
struct bitfield *bitstream;
then later on load it like this (replacing char with int or whatever data you are loading):
long int i;
int j, k;
unsigned char c, d;
bitstream=malloc(sizeof(struct bitfield)*charstreamlength*sizeof(char));
for (i=0; i<charstreamlength; i++){
c=charstream[i];
for(j=0; j < sizeof(char)*8; j++){
d=c;
d=d>>(sizeof(char)*8-j-1);
d=d<<(sizeof(char)*8-1);
k=d;
if(k==0){
bitstream[sizeof(char)*8*i + j].bit=0;
}else{
bitstream[sizeof(char)*8*i + j].bit=1;
}
}
}
Then access elements:
bitstream[bitpointer].bit=...
or
...=bitstream[bitpointer].bit
All of this is assuming are working on i86/64, not arm, since arm can be big or little endian.

Bit shifts and their logical operators

This program below moves the last (junior) and the penultimate bytes variable i type int. I'm trying to understand why the programmer wrote this
i = (i & LEADING_TWO_BYTES_MASK) | ((i & PENULTIMATE_BYTE_MASK) >> 8) | ((i & LAST_BYTE_MASK) << 8);
Can anyone explain to me in plain English whats going on in the program below.
#include <stdio.h>
#include <cstdlib>
#define LAST_BYTE_MASK 255 //11111111
#define PENULTIMATE_BYTE_MASK 65280 //1111111100000000
#define LEADING_TWO_BYTES_MASK 4294901760 //11111111111111110000000000000000
int main(){
unsigned int i = 0;
printf("i = ");
scanf("%d", &i);
i = (i & LEADING_TWO_BYTES_MASK) | ((i & PENULTIMATE_BYTE_MASK) >> 8) | ((i & LAST_BYTE_MASK) << 8);
printf("i = %d", i);
system("pause");
}
Since you asked for plain english: He swaps the first and second bytes of an integer.
The expression is indeed a bit convoluted but in essence the author does this:
// Mask out relevant bytes
unsigned higher_order_bytes = i & LEADING_TWO_BYTES_MASK;
unsigned first_byte = i & LAST_BYTE_MASK;
unsigned second_byte = i & PENULTIMATE_BYTE_MASK;
// Switch positions:
unsigned first_to_second = first_byte << 8;
unsigned second_to_first = second_byte >> 8;
// Concatenate back together:
unsigned result = higher_order_bytes | first_to_second | second_to_first;
Incidentally, defining the masks using hexadecimal notation is more readable than using decimal. Furthermore, using #define here is misguided. Both C and C++ have const:
unsigned const LEADING_TWO_BYTES_MASK = 0xFFFF0000;
unsigned const PENULTIMATE_BYTE_MASK = 0xFF00;
unsigned const LAST_BYTE_MASK = 0xFF;
To understand this code you need to know what &, | and bit shifts are doing on the bit level.
It's more instructive to define your masks in hexadecimal rather than decimal, because then they correspond directly to the binary representations and it's easy to see which bits are on and off:
#define LAST 0xFF // all bits in the first byte are 1
#define PEN 0xFF00 // all bits in the second byte are 1
#define LEAD 0xFFFF0000 // all bits in the third and fourth bytes are 1
Then
i = (i & LEAD) // leave the first 2 bytes of the 32-bit integer the same
| ((i & PEN) >> 8) // take the 3rd byte and shift it 8 bits right
| ((i & LAST) << 8) // take the 4th byte and shift it 8 bits left
);
So the expression is swapping the two least significant bytes while leaving the two most significant bytes the same.