Manually changing a group of bytes in an unsigned int - c++

I'm working with C and I'm trying to figure out how to change a set of bits in a 32-bit unsigned integer.
For example, if I have
int a = 17212403u;
In binary, that is 1000001101010001111110011. Now, supposing I label these bits (written here with the least significant on the right) such that the rightmost bit represents the ones, the second from the right the twos, and so on, how can I manually change a group of bits?
For example, suppose I want to change bits 11 through 15 so that they hold the decimal value 17. How would this be possible?
I was thinking of getting that range by doing as such:
unsigned int range = (a << (sizeof(a) * 8) - 14) >> (28)
But I'm not sure where to go on from now.

You will (1) first have to clear bits 11..15, and (2) then set those bits according to the value you want. To achieve (1), create a "mask" that has all bits set to 1 except the ones that you want to clear; then use a & bitMask to set those bits to 0. Then, use | myValue to set the bits to the value wanted.
Use the bit shift operator << to place the mask and the value at the right positions:
int main(int argc, char** argv) {
    // Let's assume a range of 5 bits
    unsigned int bitRange = 0x1Fu;          // is ...0000000011111
    // Let's assume to position the range from bit 11 onwards (i.e. move 10 left):
    bitRange = bitRange << 10;              // something like 000000111110000000000
    unsigned int bitMask = ~bitRange;       // something like 111111000001111111111
    unsigned int valueToSet = (17u << 10);  // corresponds to 000000100010000000000
    unsigned int a = (17212403u & bitMask) | valueToSet;
    return 0;
}
This is the long version to explain what's going on. In brief, you could also write:
unsigned int a = (17212403u & ~(0x1Fu << 10)) | (17u << 10);

Bits 11 through 15 make 5 bits, assuming you meant to include the 15th bit. A mask of 5 set bits is the hex value 0x1f.
Then you shift these 5 bits 11 positions to the left: 0x1f << 11
Now we have a mask for bits 11 through 15. To clear those bits in the original variable, we invert the mask and bitwise-and the variable with the inverted mask: a & ~(0x1f << 11)
Next, shift the value 17 up to the 11th bit: 17 << 11
Then bitwise-or that into the 5 bits we just cleared:
unsigned int b = (a & ~(0x1f << 11)) | (17 << 11);

Consider using bit fields. This allows you to name and access sub-sections of the integer as though they were integer members of a struct.
For info on C bitfields see:
https://www.tutorialspoint.com/cprogramming/c_bit_fields.htm
Below is code to do what you want, using bitfields. The "middle5" member of the struct holds bits 11-15. The "lower11" member is a filler for the lower 11 bits, so that the "middle5" member will be in the right place.
#include <stdio.h>
void showBits(unsigned int w)
{
    unsigned int bit = 1u << 31;  // note: 1 << 31 would overflow a signed int
    while (bit > 0)
    {
        printf("%d", ((bit & w) != 0) ? 1 : 0);
        bit >>= 1;
    }
    printf("\n");
}
int main(int argc, char* argv[])
{
    struct aBitfield {
        unsigned int lower11: 11;
        unsigned int middle5: 5;
        unsigned int upper16: 16;
    };
    union uintBits {
        unsigned int whole;
        struct aBitfield parts;
    };
    union uintBits b;
    b.whole = 17212403u;
    printf("Before:\n");
    showBits(b.whole);
    b.parts.middle5 = 17;
    printf("After:\n");
    showBits(b.whole);
}
Output of the program:
Before:
00000001000001101010001111110011
After:
00000001000001101000101111110011
Of course, you would want to use more meaningful naming for the various fields.
Be careful though: bitfields may be laid out differently on different platforms, so this may not be completely portable.

Related

Concatenate Bits from 3 characters, taken from different locations in the bitset

I am trying to concatenate the bits of 3 characters a, b and c into a bitset of 16 bits. The constraints are the following:
Concatenate the last 2 bits of a into newVal1
Concatenate the 8 bits of b into newVal1
Concatenate the first 2 bits of c into newVal1
On paper I get 1111111111110000, the same as the result. But I am not sure about the way I am concatenating the bits: first shift character a left by 14, then shift character b left by 6, and finally, since there is no space left for character c, shift it right by 2. Is there a better way to do it? It's already confusing for me.
#include <iostream>
#include <bitset>
#include <cstdint>

int main() {
    int a = 0b11111111 & 0b00000011;
    int b = 0b11111111;
    int c = 0b11111111 & 0b11000000;
    uint16_t newVal1 = (a << 14) + (b << 6) + (c >> 2);
    std::cout << std::bitset<16>(newVal1).to_string() << std::endl;
    return 0;
}
First of all, you need to consider the signed versus unsigned integer problem. With signed integers you can get unexpected sign extensions, adding all ones at the top, and possible overflow will lead to undefined behavior.
So the first thing I would do is to use all unsigned integer values.
Then to make it clear and simple, my suggestion is that you do all the shifting on newVal1 instead, and just do bitwise OR into it:
unsigned a = /* value of a */;
unsigned b = /* value of b */;
unsigned c = /* value of c */;
unsigned newVal1 = 0;
newVal1 |= a & 0x03;         // Get the lowest two bits of a
newVal1 <<= 8;               // Make space for the next eight bits
newVal1 |= b & 0xffu;        // "Concatenate" eight bits from b
newVal1 <<= 2;               // Make space for the next two bits
newVal1 |= (c >> 6) & 0x03;  // Get the two "top" bits of c
Now the lowest twelve bits of newVal1 should follow the three rules set up for your assignment. The top bits will be all zero.

Arduino code: shifting bits seems to change data type from int to long

on my Arduino, the following code produces output I don't understand:
void setup() {
    Serial.begin(9600);
    int a = 250;
    Serial.println(a, BIN);
    a = a << 8;
    Serial.println(a, BIN);
    a = a >> 8;
    Serial.println(a, BIN);
}

void loop() {}
The output is:
11111010
11111111111111111111101000000000
11111111111111111111111111111010
I do understand the first line: leading zeros are not printed to the serial terminal. However, after shifting the bits the data type of a seems to have changed from int to long (32 bits are printed). The expected behaviour is that bits are shifted to the left, and that bits which are shifted "out" of the 16 bits an int has are simply dropped. Shifting the bits back does not turn the "32bit" variable to "16bit" again.
Shifting by 7 or less positions does not show this effect.
I probably should say that I am not using the Arduino IDE, but the Makefile from https://github.com/sudar/Arduino-Makefile.
What is going on? I almost expect this to be "normal", but I don't get it. Or is it something in the printing routine which simply adds 16 "1"'s to the output?
Enno
In addition to the other answers: integers might be stored in 16 bits or 32 bits depending on which Arduino you have.
The function printing numbers in Arduino is defined in /arduino-1.0.5/hardware/arduino/cores/arduino/Print.cpp
size_t Print::printNumber(unsigned long n, uint8_t base) {
    char buf[8 * sizeof(long) + 1]; // Assumes 8-bit chars plus zero byte.
    char *str = &buf[sizeof(buf) - 1];
    *str = '\0';
    // prevent crash if called with base == 1
    if (base < 2) base = 10;
    do {
        unsigned long m = n;
        n /= base;
        char c = m - base * n;
        *--str = c < 10 ? c + '0' : c + 'A' - 10;
    } while (n);
    return write(str);
}
All other functions rely on this one, so yes, your int gets promoted to an unsigned long when you print it, not when you shift it.
However, the library is correct. By shifting left 8 positions, the sign bit of the integer becomes '1', so when the integer value is promoted to unsigned long the runtime correctly pads it with 16 extra '1's instead of '0's.
If you are using such a value not as a number but to contain some flags, use unsigned int instead of int.
ETA: for completeness, I'll add further explanation for the second shifting operation.
Once you touch the sign bit inside the int, shifting towards the right makes the runtime pad the number with '1's in order to preserve its sign. Shifting to the right by k positions corresponds to dividing the number by 2^k, and since the number is negative to start with, the result must remain negative.

Get Integer From Bits Inside `std::vector<char>`

I have a vector<char> and I want to be able to get an unsigned integer from a range of bits within the vector. I can't seem to write the correct operations to get the desired output. My intended algorithm goes like this:
& the first byte with (0xff >> unused bits in byte on the left)
<< the result left the number of output bytes * number of bits in a byte
| this with the final output
For each subsequent byte:
<< left by the (byte width - index) * bits per byte
| this byte with the final output
| the final byte (not shifted) with the final output
>> the final output by the number of unused bits in the byte on the right
And here is my attempt at coding it, which does not give the correct result:
#include <vector>
#include <iostream>
#include <cstdint>
#include <bitset>

template<class byte_type = char>
class BitValues {
private:
    std::vector<byte_type> bytes;
public:
    static const auto bits_per_byte = 8;
    BitValues(std::vector<byte_type> bytes) : bytes(bytes) {
    }
    template<class return_type>
    return_type get_bits(int start, int end) {
        auto byte_start = (start - (start % bits_per_byte)) / bits_per_byte;
        auto byte_end = (end - (end % bits_per_byte)) / bits_per_byte;
        auto byte_width = byte_end - byte_start;
        return_type value = 0;
        unsigned char first = bytes[byte_start];
        first &= (0xff >> start % 8);
        return_type first_wide = first;
        first_wide <<= byte_width;
        value |= first_wide;
        for(auto byte_i = byte_start + 1; byte_i <= byte_end; byte_i++) {
            auto byte_offset = (byte_width - byte_i) * bits_per_byte;
            unsigned char next_thin = bytes[byte_i];
            return_type next_byte = next_thin;
            next_byte <<= byte_offset;
            value |= next_byte;
        }
        value >>= (((byte_end + 1) * bits_per_byte) - end) % bits_per_byte;
        return value;
    }
};

int main() {
    BitValues<char> bits(std::vector<char>({'\x78', '\xDA', '\x05', '\x5F', '\x8A', '\xF1', '\x0F', '\xA0'}));
    std::cout << bits.get_bits<unsigned>(15, 29) << "\n";
    return 0;
}
(In action: http://coliru.stacked-crooked.com/a/261d32875fcf2dc0)
I just can't seem to wrap my head around these bit manipulations, and I find debugging very difficult! If anyone can correct the above code, or help me in any way, it would be much appreciated!
Edit:
My bytes are 8 bits long
The integer to return could be 8, 16, 32 or 64 bits wide
The integer is stored in big endian
You made two primary mistakes. The first is here:
first_wide <<= byte_width;
You should be shifting by a bit count, not a byte count. Corrected code is:
first_wide <<= byte_width * bits_per_byte;
The second mistake is here:
auto byte_offset = (byte_width - byte_i) * bits_per_byte;
It should be
auto byte_offset = (byte_end - byte_i) * bits_per_byte;
The value in parentheses needs to be the number of bytes to shift right by, which is also the number of bytes byte_i is away from the end. The value byte_width - byte_i has no semantic meaning (one is a delta, the other is an index).
The rest of the code is fine, though the algorithm has two further issues.
First, when using your result type to accumulate bits, you assume you have room on the left to spare. This isn't the case if there are set bits near the right boundary and the choice of range causes the bits to be shifted out. For example, try running
bits.get_bits<uint16_t>(11, 27);
You'll get the result 42, which corresponds to the bit string 00000000 00101010. The correct result is 53290, with the bit string 11010000 00101010. Notice how the rightmost 4 bits got zeroed out. This is because you start off by overshifting your value variable, causing those four bits to be shifted out of the variable. When shifting back at the end, this results in the bits being zeroed out.
The second problem has to do with the right shift at the end. If the rightmost bit of the value variable happens to be a 1 before the right shift at the end, and the template parameter is a signed type, then the right shift that is done is an 'arithmetic' right shift, which causes bits on the right to be 1-filled, leaving you with an incorrect negative value.
Example, try running:
bits.get_bits<int16_t>(5, 21);
The expected result should be 6976 with the bit string 00011011 01000000, but the current implementation returns -1216 with the bit string 11111011 01000000.
I've put my implementation of this below which builds the bit string from the right to the left, placing bits in their correct positions to start with so that the above two problems are avoided:
template<class ReturnType>
ReturnType get_bits(int start, int end) {
    int max_bits = kBitsPerByte * sizeof(ReturnType);
    if (end - start > max_bits) {
        start = end - max_bits;
    }
    int inclusive_end = end - 1;
    int byte_start = start / kBitsPerByte;
    int byte_end = inclusive_end / kBitsPerByte;

    // Put in the partial byte on the right
    uint8_t first = bytes_[byte_end];
    int bit_offset = (inclusive_end % kBitsPerByte);
    first >>= 7 - bit_offset;
    bit_offset += 1;
    ReturnType ret = 0 | first;

    // Add the rest of the bytes
    for (int i = byte_end - 1; i >= byte_start; i--) {
        ReturnType tmp = (uint8_t) bytes_[i];
        tmp <<= bit_offset;
        ret |= tmp;
        bit_offset += kBitsPerByte;
    }

    // Mask out the partial byte on the left
    int shift_amt = (end - start);
    if (shift_amt < max_bits) {
        // Shift in the return type, so wide types don't overflow an int
        ReturnType mask = ((ReturnType)1 << shift_amt) - 1;
        ret &= mask;
    }
    return ret;
}
There is one thing I think you certainly missed: the way you index the bits in the vector differs from what you were given in the problem. I.e. with the algorithm you outlined, the order of the bits will be like 7 6 5 4 3 2 1 0 | 15 14 13 12 11 10 9 8 | 23 22 21 .... Frankly, I didn't read through your whole algorithm, but this is missed in the very first step.
Interesting problem. I've done similar, for some systems work.
Is your char 8 bits wide, or 16? How big is your integer: 32 or 64 bits?
Ignore the vector complexity for a minute.
Think about it as just an array of bits.
How many bits do you have? You have 8*number of chars
You need to calculate a starting char, number of bits to extract, ending char, number of bits there, and number of chars in the middle.
You will need bitwise-and & for the first partial char
you will need bitwise-and & for the last partial char
you will need left-shift << (or right-shift >>), depending upon which order you start from
what is the endian-ness of your Integer?
At some point you will calculate an index into your array that is bitindex/char_bit_width. You gave the value 171 as your bitindex and 8 as your char_bit_width, so you will end up with these useful values calculated:
171/8 = 21 //location of first byte
171%8 = 3 //bits in first char/byte
8 - 171%8 = 5 //bits in last char/byte
sizeof(integer) = 4
sizeof(integer) + ( (171%8)>0?1:0 ) // how many array positions to examine
Some assembly required...

n bit 2s binary to decimal in C++

I am trying to convert a string of signed binary numbers to decimal value in C++ using stoi as shown below.
stoi( binaryString, nullptr, 2 );
My inputs are binary strings in 2s-complement format, and stoi works fine as long as the number of digits is eight. For instance, "1100" results in 12, because stoi probably perceives it as "00001100".
But in a 4-bit system, 1100 in 2s-complement format equals -4. Any clues how to do this kind of conversion for arbitrary bit-length 2s-complement numbers in C++?
Handle signedness for numbers with fewer bits:
convert binary -> decimal
calc the 2s complement if the sign bit is set (wherever your sign bit is, depending on word length).
#define BITSIZE 4
#define SIGNFLAG (1 << (BITSIZE - 1))  // 0b1000
#define DATABITS (SIGNFLAG - 1)        // 0b0111

int x = std::stoi("1100", NULL, 2);    // x = 12
if ((x & SIGNFLAG) != 0) {             // sign flag set
    x = (~x & DATABITS) + 1;           // 2s complement without sign flag
    x = -x;                            // negative number
}
printf("%d\n", x);                     // -4
You can use strtoul, which is the unsigned equivalent. The only difference is that it returns an unsigned long, instead of an int.
You can probably implement

w = -a[N-1] * 2^(N-1) + sum_{i=0}^{N-2} a[i] * 2^i

in C++, where a is binaryString (with a[i] denoting the bit of weight 2^i), N is binaryString.size() and w is the result.
The correct answer would probably depend on what you ultimately want to do with the int after you convert it. If you want to do signed math with it then you would need to 'sign extend' your result after the stoi conversion -- this is what the compiler does internally on a cast operation from one size signed int to another.
You can manually do this with something like this for a 4-bit system:
int myInt;
myInt = std::stoi( "1100", NULL, 2);
myInt |= myInt & 0x08 ? (-16 ) : 0;
Note, I used 0x08 as the test mask and -16 as the OR mask, as this is for a 4-bit result. You can change the masks to be correct for whatever your input bit length is. Also, using a negative int like this will correctly sign-extend no matter what your system's integer size is.
Example for an arbitrary bit-width system (I used bitWidth to denote the size):
myInt = std::stoi( "1100", NULL, 2);
int bitWidth = 4;
myInt |= myInt & (1 << (bitWidth-1)) ? ( -(1<<bitWidth) ) : 0;
You can use the bitset header for this:
#include <iostream>
#include <bitset>
using namespace std;

int main()
{
    bitset<4> bs;
    long no;
    cin >> bs;
    no = bs.to_ulong();
    if (bs[3])        // sign bit set: subtract 2^4
        no -= 16;
    cout << no;
    return 0;
}
Since to_ulong returns an unsigned long, you have to handle the sign bit yourself.

Parsing 32 bit integer in C program

I have a 32 bit integer, split into parts like this:
--------------------------------------
| Part1 | Part2 | Part 3 |
--------------------------------------
Part 1 higher 16 bits. (Part 2 + Part 3) = lower 16 bits.
Part 2 is 10 bits and Part 3 is 6 bits
I need help on how do we read and update part 1, part2 and part 3 in C program.
Given an integer x with the above format, you can replace Part2 like this:
x = (x & ~(0x3ff << 6)) | (newPart2 << 6);
and Part3 like so:
x = (x & ~0x3f) | newPart3;
This assumes that both newPart2 and newPart3 are e.g. unsigned int with their new values right-adjusted.
int i;
To extract the individual parts:
part1 = (i & 0xFFFF0000) >> 16;
part2 = (i & 0x0000FFC0) >> 6;
part3 = (i & 0x0000003F);
To compose the integer:
i = (part1 << 16) | (part2 << 6) | (part3);
Try casting to this structure:
struct parts {
    uint32_t part_1 : 16;
    uint32_t part_2 : 10;
    uint32_t part_3 : 6;
};
It could be the one below, depending on endianness:
struct parts {
    uint32_t part_1 : 6;
    uint32_t part_2 : 10;
    uint32_t part_3 : 16;
};
Obviously not portable!
Since you need to read and update, a pointer will do. For example, if your 32-bit value is called x, you do the following:
struct parts *ptr = (struct parts *)&x;
ptr->part_2 = /* part2 update */;
The theory behind this is the and, or, and shift operations with masks.
To access some bits of the integer, first create a mask with ones in the bits you want to read. Now apply the and (&) operation between the mask and the integer: the bits where the mask is 0 become 0, and the bits where the mask is 1 keep their value from the integer. Now that we have only the bits we want, we align them to the right by shifting right as many positions as needed to move the rightmost bit of the mask into the least significant position.
To write into a part of the integer, we first need to clear what was in that part; for that we use the negation of the mask used to read it. Once that part is cleared, we apply an or (|) operation with the new value, which must be shifted into that position.
To read:
unsigned int read_part_1(unsigned int composed) {
    return (composed & 0xffff0000) >> 16;
}
unsigned int read_part_2(unsigned int composed) {
    return (composed & 0x0000ffc0) >> 6;
}
unsigned int read_part_3(unsigned int composed) {
    return (composed & 0x0000003f);
}
To write (val aligned to the right):
unsigned int write_part_1(unsigned int composed, unsigned int val) {
    return (composed & ~0xffff0000) | ((val & 0x0000ffff) << 16);
}
unsigned int write_part_2(unsigned int composed, unsigned int val) {
    return (composed & ~0x0000ffc0) | ((val & 0x000003ff) << 6);
}
unsigned int write_part_3(unsigned int composed, unsigned int val) {
    return (composed & ~0x0000003f) | (val & 0x0000003f);
}