Change the width of a signed integer to a nonstandard width - c++

For a networking application I need a signed, 2's complement integer. With a custom width. Specified at run time. Assuming the value of the integer falls in the width.
The problem I have is the parity bit. Is there any way to avoid having to set the parity bit manually? Say I have an integer with a width of 11 bits; I'll store it in an array of 2 chars like this:
int myIntWidth = 11;
int32_t myInt= 5;
unsigned char charArray[2];
memcpy(charArray, &myInt, (myIntWidth + 7)/8);

It doesn't work like that. It can't work, because you are copying two bytes from the start of myInt but you don't know where the bytes that you are interested in are stored. You also need to know in which order you are supposed to store the bytes. Depending on that, use one of these two snippets:
unsigned char charArray [2];
charArray [0] = myInt & 0xff; // Lowest 8 bits
charArray [1] = (myInt >> 8) & 0x07; // Next 3 bits
or
unsigned char charArray [2];
charArray [1] = myInt & 0xff; // Lowest 8 bits
charArray [0] = (myInt >> 8) & 0x07; // Next 3 bits

With the help of a lot of the posts above, I've come up with this solution:
inline void reduceSignedIntWidth(int32_t& destInt, int width)
{
//create a value mask, with 1's at the masked part
uint32_t l_mask = (0x01u << width) - 1;
destInt &= l_mask;
}
It modifies destInt in place, keeping the low width bits and padding the rest with zeros. The width must be less than 32 for the shift to be valid.
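As a complement to the masking above, here is a minimal sketch of the full round trip: truncating a signed value to a run-time width, and sign-extending it again on the receiving side. The names truncateToWidth and signExtend are hypothetical, and the sketch assumes the value actually fits in the given width, as the question states.

```cpp
#include <cassert>
#include <cstdint>

// Keep only the low `width` bits of a two's complement value.
// Assumes 1 <= width <= 32 and that `value` fits in `width` bits.
inline std::uint32_t truncateToWidth(std::int32_t value, int width) {
    assert(width >= 1 && width <= 32);
    const std::uint32_t mask =
        (width == 32) ? 0xFFFFFFFFu : ((std::uint32_t{1} << width) - 1u);
    return static_cast<std::uint32_t>(value) & mask;
}

// Recover the signed value from a `width`-bit two's complement field.
inline std::int32_t signExtend(std::uint32_t field, int width) {
    assert(width >= 1 && width <= 32);
    const std::uint32_t mask =
        (width == 32) ? 0xFFFFFFFFu : ((std::uint32_t{1} << width) - 1u);
    const std::uint32_t signBit = std::uint32_t{1} << (width - 1);
    field &= mask;
    // Flipping the sign bit and then subtracting its weight maps the field
    // onto the range [-2^(width-1), 2^(width-1) - 1]. The arithmetic is done
    // in 64 bits so no intermediate value overflows.
    return static_cast<std::int32_t>(
        static_cast<std::int64_t>(field ^ signBit) -
        static_cast<std::int64_t>(signBit));
}
```

With a width of 11, truncateToWidth(-3, 11) gives 0x7FD, and signExtend(0x7FD, 11) gives -3 back. Bit width - 1 of the field acts as the sign bit, so nothing has to be set by hand.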

Related

Concatenate Bits from 3 characters, taken from different locations in the bitset

I am trying to concatenate the bits of 3 characters a, b and c into a bitset of 16 bits. The constraints are the following:
Concatenate the last 2 bits of a into newVal1
Concatenate the 8 bits of b into newVal1
Concatenate the first 2 bits of c into newVal1
On paper I am getting 1111111111110000, which is the same as the program's result. But I am not sure about the way I am concatenating the bits: first shift character a left by 14, then shift character b left by 6, and finally, since there is no space left for character c, shift it right by 2. Is there a better way to do it? It's already confusing for me.
#include <iostream>
#include <bitset>
#include <cstdint>
int main() {
int a = 0b11111111 & 0b00000011;
int b = 0b11111111;
int c = 0b11111111 & 0b11000000;
uint16_t newVal1 = (a << 14) + (b << 6) + (c >> 2 );
std::cout << std::bitset<16>(newVal1).to_string() << std::endl;
return 0;
}
First of all you need to consider the signed and unsigned integer problem. With signed integers you can get unexpected sign extensions, adding all ones at the top. And possible overflow will lead to undefined behavior.
So the first thing I would do is to use all unsigned integer values.
Then to make it clear and simple, my suggestion is that you do all the shifting on newVal1 instead, and just do bitwise OR into it:
unsigned a = /* value of a */;
unsigned b = /* value of b */;
unsigned c = /* value of c */;
unsigned newVal1 = 0;
newVal1 |= a & 0x03; // Get lowest two bits of a
newVal1 <<= 8; // Make space for the next eight bits
newVal1 |= b & 0xffu; // "Concatenate" eight bits from b
newVal1 <<= 2; // Make space for the next two bits
newVal1 |= (c >> 6) & 0x03; // Get the two "top" bits from c
Now the lowest twelve bits of newVal1 should follow the three rules set up for your assignment. The top bits will be all zero.
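The shift-and-OR steps above can be wrapped in a small function. This is only a sketch: concatBits is a hypothetical name, and the inputs are assumed to be 8-bit characters.

```cpp
#include <cassert>  // for the checks in the usage note
#include <cstdint>

// Builds a 12-bit value laid out as:
//   [low 2 bits of a][all 8 bits of b][top 2 bits of c]
// The upper four bits of the 16-bit result are zero.
inline std::uint16_t concatBits(std::uint8_t a, std::uint8_t b, std::uint8_t c) {
    unsigned v = 0;
    v |= a & 0x03u;         // lowest two bits of a
    v <<= 8;                // make space for eight bits
    v |= b;                 // all eight bits of b
    v <<= 2;                // make space for two bits
    v |= (c >> 6) & 0x03u;  // two top bits of c
    return static_cast<std::uint16_t>(v);
}
```

For example, concatBits(0x02, 0xA5, 0x40) yields 0x0A95. To get the left-aligned layout from the question (1111111111110000 for all-ones inputs), shift the result left by 4.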

Increasing hex value with given data

I need to verify given text data with the checkSumCalculator method, and I send the data with the command method. I found the code and changed it to fit my own needs, but I don't understand some parts.
How does the 0x00 char get increased by the given data? What is the point of ANDing check_data with 0xFF? And why subtract (check_data & 0xFF) from 0x100? I am very confused.
void Widget::command()
{
std::string txt = "<DONE:8022ff";
unsigned char check_sum = checkSumCalculator(&txt[0], txt.size());
QString reply= QString::fromStdString(txt) + QString("%1>").arg(check_sum, 2, 16,
QChar('0'));
emit finished(reply, true);
}
static unsigned char checkSumCalculator(void *data, int length)
{
unsigned char check_data = 0x00;
for (int i = 0; i < length; i++)
check_data+= ((unsigned char*)data)[i];
check_data = (0x100 - (check_data & 0xFF)) & 0xFF;
return check_data;
}
checkSumCalculator starts by adding together all the values of the buffer in data. Because the type of check_data is unsigned char, this sum is done modulo 0x100 (256), 1 more than the maximum value an unsigned char can hold (0xFF = 255); the value is said to "wrap around" ((unsigned char)(0xFF + 1), i.e. 256, is again 0).
These two lines:
check_data = (0x100 - (check_data & 0xFF)) & 0xFF;
return check_data;
are really more complicated than needed. All that would be needed would be:
return -check_data;
That is, at the end it negates the value. Because the arithmetic is modulo 256, this is essentially the same as flipping the bits and adding 1 (-check_data = ~check_data + 1). This is instead implemented in a more convoluted way:
check_data & 0xFF doesn't do much, because it's a bitwise AND with all the possible bits that can be set on an unsigned char. The value is promoted to int (due to C's integer promotions), where all the bits above the lower 8 are necessarily 0. So this is the same as (int)check_data. Ultimately, this promotion has no bearing on the result.
Subtracting from 0x100 is the same as -check_data, as far as the lower 8 bits are concerned (which is what we end up caring about).
The final & 0xFF is also redundant because even though the expression was promoted to int, it will be converted to unsigned char on return.
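To make the equivalence concrete, here is a small sketch that computes the checksum both ways and checks the property that makes it useful: the sum of all data bytes plus the checksum is 0 modulo 256. The function names are hypothetical, and std::string stands in for the void* buffer of the original.

```cpp
#include <cassert>  // for the checks in the usage note
#include <string>

// Checksum as written in the question.
inline unsigned char checksumOriginal(const std::string& s) {
    unsigned char sum = 0;
    for (unsigned char ch : s) sum += ch;  // wraps modulo 256
    return (0x100 - (sum & 0xFF)) & 0xFF;
}

// Simplified form: negate the sum modulo 256.
inline unsigned char checksumSimple(const std::string& s) {
    unsigned char sum = 0;
    for (unsigned char ch : s) sum += ch;  // wraps modulo 256
    return static_cast<unsigned char>(-sum);
}

// A receiver can validate a message by adding the checksum to the byte sum.
inline bool checksumMatches(const std::string& s, unsigned char checksum) {
    unsigned char sum = checksum;
    for (unsigned char ch : s) sum += ch;
    return sum == 0;
}
```

For any input, such as the "<DONE:8022ff" string from the question, both functions should return the same byte, and checksumMatches should accept it.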

How to build N bits variables in C++?

I am dealing with very large list of booleans in C++, around 2^N items of N booleans each. Because memory is critical in such situation, i.e. an exponential growth, I would like to build a N-bits long variable to store each element.
For small N, for example 24, I am just using unsigned long int. It takes 64MB ((2^24)*32/8/1024/1024). But I need to go up to 36. The only option with a built-in type is unsigned long long int, but it takes 512GB ((2^36)*64/8/1024/1024/1024), which is a bit too much.
With a 36-bits variable, it would work for me because the size drops to 288GB ((2^36)*36/8/1024/1024/1024), which fits on a node of my supercomputer.
I tried std::bitset, but std::bitset< N > creates an element of at least 8B.
So a list of std::bitset< 1 > is much greater than a list of unsigned long int.
It is because std::bitset just changes the representation, not the container.
I also tried boost::dynamic_bitset<> from Boost, but the result is even worse (at least 32B!), for the same reason.
I know one option is to write all elements as one chain of booleans, 2473901162496 (2^36*36) bits, and then to store them in 38654705664 (2473901162496/64) unsigned long long int, which gives 288GB (38654705664*64/8/1024/1024/1024). Accessing an element is then just a matter of finding which words hold its 36 bits (either one or two). But it means rewriting a lot of the existing code (3000 lines), because mapping becomes impossible and because adding and deleting items during execution in some functions will surely be complicated, confusing, and challenging, and the result will most likely not be efficient.
How to build a N-bits variable in C++?
How about a struct with 5 chars (and perhaps some fancy operator overloading as needed to keep it compatible with the existing code)? A struct with a long and a char probably won't work because of padding / alignment...
Basically your own mini BitSet optimized for size:
struct Bitset40 {
unsigned char data[5];
bool getBit(int index) const {
return (data[index / 8] & (1 << (index % 8))) != 0;
}
void setBit(int index, bool newVal) {
if (newVal) {
data[index / 8] |= (1 << (index % 8));
} else {
data[index / 8] &= ~(1 << (index % 8));
}
}
};
Edit: As geza has also pointed out in the comments, the "trick" here is to get as close as possible to the minimum number of bytes needed (without wasting memory by triggering alignment losses, padding or pointer indirection, see http://www.catb.org/esr/structure-packing/).
Edit 2: If you feel adventurous, you could also try a bit field (and please let us know how much space it actually consumes):
struct Bitset36 {
unsigned long long data:36;
};
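A rough sketch for comparing the two layouts and for storing a whole 36-bit value in the 5-byte struct at once. Packed40, BitField36, store36 and load36 are hypothetical names. The size of the bit-field struct is implementation-defined, so it is worth printing sizeof for both on the target compiler rather than assuming a value.

```cpp
#include <cassert>  // for the checks in the usage note
#include <cstddef>
#include <cstdint>

struct Packed40 { unsigned char data[5]; };           // no alignment beyond 1 byte
struct BitField36 { unsigned long long data : 36; };  // size depends on the compiler

// Store the low 36 bits of value, least significant byte first.
inline void store36(Packed40& p, std::uint64_t value) {
    value &= 0xFFFFFFFFFULL;  // keep 36 bits (nine hex digits)
    for (std::size_t i = 0; i < 5; ++i)
        p.data[i] = static_cast<unsigned char>((value >> (8 * i)) & 0xFFu);
}

// Reassemble the 36-bit value from the five bytes.
inline std::uint64_t load36(const Packed40& p) {
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < 5; ++i)
        value |= static_cast<std::uint64_t>(p.data[i]) << (8 * i);
    return value;
}
```

On mainstream compilers sizeof(Packed40) is 5, while the bit-field version is usually rounded up to the size of unsigned long long, which would defeat the purpose here.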
I'm not an expert, but this is what I would "try". Find the bytes for the smallest type your compiler supports (should be char). You can check with sizeof and you should get 1. That means 1 byte, so 8 bits.
So if you wanted a 24 bit type...you would need 3 chars. For 36 you would need 5 char array and you would have 4 bits of wasted padding on the end. This could easily be accounted for.
i.e.
char typeSize[3] = {0}; // should hold 24 bits
Now make a bit mask to access each position of typeSize.
const unsigned char one = 0b0000'0001;
const unsigned char two = 0b0000'0010;
const unsigned char three = 0b0000'0100;
const unsigned char four = 0b0000'1000;
const unsigned char five = 0b0001'0000;
const unsigned char six = 0b0010'0000;
const unsigned char seven = 0b0100'0000;
const unsigned char eight = 0b1000'0000;
Now you can use the bit-wise or to set the values to 1 where needed..
typeSize[1] |= four;
typeSize[0] |= (four | five);
To turn off bits use the & operator..
typeSize[0] &= ~four;
typeSize[2] &= ~(four| five);
You can read the position of each bit with the & operator.
typeSize[0] & four
Bear in mind, I don't have a compiler handy to try this out so hopefully this is a useful approach to your problem.
Good luck ;-)
You can use an array of unsigned long int and store and retrieve the needed bit chains with bitwise operations. This approach avoids space overhead.
Simplified example for unsigned byte array B[] and 12-bit variables V (represented as ushort):
Set V[0]:
B[0] = V & 0xFF; //low byte
B[1] = B[1] & 0xF0; // clear low nibble
B[1] = B[1] | (V >> 8); //fill low nibble of the second byte with the highest nibble of V
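The same idea generalizes to any field width, including fields that straddle two storage words. Below is a minimal sketch, assuming 64-bit storage words and a width between 1 and 63 bits; PackedArray is a hypothetical class, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Fixed-width unsigned fields stored back to back in 64-bit words.
class PackedArray {
public:
    PackedArray(std::size_t count, unsigned bits)
        : bits_(bits), words_((count * bits + 63) / 64, 0) {
        assert(bits >= 1 && bits <= 63);
    }

    std::uint64_t get(std::size_t i) const {
        const std::size_t pos = i * bits_;
        const std::size_t word = pos / 64;
        const unsigned off = static_cast<unsigned>(pos % 64);
        std::uint64_t v = words_[word] >> off;
        if (off + bits_ > 64)  // field continues in the next word
            v |= words_[word + 1] << (64 - off);
        return v & mask();
    }

    void set(std::size_t i, std::uint64_t value) {
        const std::size_t pos = i * bits_;
        const std::size_t word = pos / 64;
        const unsigned off = static_cast<unsigned>(pos % 64);
        value &= mask();
        // Clear the field's bits in the first word, then OR in the low part.
        words_[word] = (words_[word] & ~(mask() << off)) | (value << off);
        if (off + bits_ > 64) {
            const unsigned spill = off + bits_ - 64;  // bits landing in the next word
            const std::uint64_t hiMask = (std::uint64_t{1} << spill) - 1;
            words_[word + 1] = (words_[word + 1] & ~hiMask) | (value >> (64 - off));
        }
    }

private:
    std::uint64_t mask() const { return (std::uint64_t{1} << bits_) - 1; }
    unsigned bits_;
    std::vector<std::uint64_t> words_;
};
```

With PackedArray arr(10, 36), element 1 starts at bit 36 of the first word and spills 8 bits into the second, which is the "one or two words" case mentioned in the question.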

Bit shifts and their logical operators

The program below swaps the last (least significant) and the penultimate bytes of the variable i of type int. I'm trying to understand why the programmer wrote this:
i = (i & LEADING_TWO_BYTES_MASK) | ((i & PENULTIMATE_BYTE_MASK) >> 8) | ((i & LAST_BYTE_MASK) << 8);
Can anyone explain to me in plain English what's going on in the program below?
#include <stdio.h>
#include <cstdlib>
#define LAST_BYTE_MASK 255 //11111111
#define PENULTIMATE_BYTE_MASK 65280 //1111111100000000
#define LEADING_TWO_BYTES_MASK 4294901760 //11111111111111110000000000000000
int main(){
unsigned int i = 0;
printf("i = ");
scanf("%u", &i);
i = (i & LEADING_TWO_BYTES_MASK) | ((i & PENULTIMATE_BYTE_MASK) >> 8) | ((i & LAST_BYTE_MASK) << 8);
printf("i = %u", i);
system("pause");
}
Since you asked for plain english: He swaps the first and second bytes of an integer.
The expression is indeed a bit convoluted but in essence the author does this:
// Mask out relevant bytes
unsigned higher_order_bytes = i & LEADING_TWO_BYTES_MASK;
unsigned first_byte = i & LAST_BYTE_MASK;
unsigned second_byte = i & PENULTIMATE_BYTE_MASK;
// Switch positions:
unsigned first_to_second = first_byte << 8;
unsigned second_to_first = second_byte >> 8;
// Concatenate back together:
unsigned result = higher_order_bytes | first_to_second | second_to_first;
Incidentally, defining the masks using hexadecimal notation is more readable than using decimal. Furthermore, using #define here is misguided. Both C and C++ have const:
unsigned const LEADING_TWO_BYTES_MASK = 0xFFFF0000;
unsigned const PENULTIMATE_BYTE_MASK = 0xFF00;
unsigned const LAST_BYTE_MASK = 0xFF;
To understand this code you need to know what &, | and bit shifts are doing on the bit level.
It's more instructive to define your masks in hexadecimal rather than decimal, because then they correspond directly to the binary representations and it's easy to see which bits are on and off:
#define LAST 0xFF // all bits in the first byte are 1
#define PEN 0xFF00 // all bits in the second byte are 1
#define LEAD 0xFFFF0000 // all bits in the third and fourth bytes are 1
Then
i = (i & LEAD) // keep the two most significant bytes unchanged
| ((i & PEN) >> 8) // take the second-lowest byte and shift it 8 bits right
| ((i & LAST) << 8); // take the lowest byte and shift it 8 bits left
So the expression is swapping the two least significant bytes while leaving the two most significant bytes the same.
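Put together as a function, the swap might look like the sketch below. swapLowBytes and the constant names are hypothetical, and a 32-bit unsigned value is assumed.

```cpp
#include <cassert>  // for the checks in the usage note
#include <cstdint>

constexpr std::uint32_t kLeadMask = 0xFFFF0000u;  // two most significant bytes
constexpr std::uint32_t kPenMask  = 0x0000FF00u;  // second-lowest byte
constexpr std::uint32_t kLastMask = 0x000000FFu;  // lowest byte

// Swap the two least significant bytes, leaving the upper two untouched.
constexpr std::uint32_t swapLowBytes(std::uint32_t i) {
    return (i & kLeadMask) | ((i & kPenMask) >> 8) | ((i & kLastMask) << 8);
}
```

For example, swapLowBytes(0x12345678) gives 0x12347856, and applying it twice returns the original value.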

Parsing 32 bit integer in C program

I have a 32 bit integer, split into parts like this:
--------------------------------------
| Part1 | Part2 | Part 3 |
--------------------------------------
Part 1 higher 16 bits. (Part 2 + Part 3) = lower 16 bits.
Part 2 is 10 bits and Part 3 is 6 bits
I need help with how to read and update Part 1, Part 2 and Part 3 in a C program.
Given an integer x with the above format, you can replace Part2 like this:
x = (x & ~(0x3ff << 6)) | (newPart2 << 6);
and Part3 like so:
x = (x & ~0x3f) | newPart3;
This assumes that both newPart2 and newPart3 are e.g. unsigned int with their new values right-adjusted.
unsigned int i;
To extract the individual parts
part1 = (i & 0xFFFF0000) >> 16;
part2 = (i & 0x0000FFC0) >> 6;
part3 = (i & 0x0000003F);
To compose the integer
i = (part1 << 16) | (part2 << 6) | (part3);
Try casting to this structure
typedef struct {
uint32_t part_1:16;
uint32_t part_2:10;
uint32_t part_3:6;
} parts;
Could be the one below depending on endianness
typedef struct {
uint32_t part_1:6;
uint32_t part_2:10;
uint32_t part_3:16;
} parts;
Obviously not portable!
Since you need to read and update, a pointer will do. For example, if your 32-bit value is called x, you do the following
parts *ptr = (parts *)&x;
ptr->part_2 = <part2 update>
The theory behind this is AND, OR, and shift operations with masks.
To access some bits of the integer, first create a mask that has ones in the bits you want to use. Now apply an AND (&) operation between the mask and the integer. By the behavior of &, the bits where the mask is 0 will be 0, and the bits where the mask is 1 will have the value of that bit in the integer. Now that we have only the bits we want, we align them to the right by shifting right the number of positions that leaves the rightmost bit of the mask in the least significant position.
To write to a part of the integer, we first need to clear what was in that part; for that we use the negation of the mask used to read that part. Once that part is cleared, we apply an OR (|) operation with the new value, which must be aligned to that position.
To read:
unsigned int read_part_1(unsigned int composed) {
return (composed & 0xffff0000) >> 16;
}
unsigned int read_part_2(unsigned int composed) {
return (composed & 0x0000ffc0) >> 6;
}
unsigned int read_part_3(unsigned int composed) {
return (composed & 0x0000003f);
}
To write(val aligned to the right):
unsigned int write_part_1(unsigned int composed, unsigned int val) {
return (composed & ~0xffff0000) | ((val & 0x0000ffff) << 16);
}
unsigned int write_part_2(unsigned int composed, unsigned int val) {
return (composed & ~0x0000ffc0) | ((val & 0x000003ff) << 6);
}
unsigned int write_part_3(unsigned int composed, unsigned int val) {
return (composed & ~0x0000003f) | (val & 0x0000003f);
}
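The six functions above share one pattern, so they can be collapsed into a pair of helpers driven by a shift and a width. This is a sketch under the layout given in the question (16, 10 and 6 bits); Field, readField and writeField are hypothetical names.

```cpp
#include <cassert>  // for the checks in the usage note
#include <cstdint>

struct Field { unsigned shift; unsigned width; };

constexpr Field kPart1{16, 16};  // upper 16 bits
constexpr Field kPart2{6, 10};   // middle 10 bits
constexpr Field kPart3{0, 6};    // lowest 6 bits

// Mask with ones exactly where the field lives.
constexpr std::uint32_t fieldMask(Field f) {
    return ((f.width >= 32) ? 0xFFFFFFFFu
                            : ((std::uint32_t{1} << f.width) - 1u)) << f.shift;
}

// Isolate the field, then right-align it.
constexpr std::uint32_t readField(std::uint32_t x, Field f) {
    return (x & fieldMask(f)) >> f.shift;
}

// Clear the field, then OR in the new value (val is right-aligned;
// bits that do not fit in the field are discarded).
constexpr std::uint32_t writeField(std::uint32_t x, Field f, std::uint32_t val) {
    return (x & ~fieldMask(f)) | ((val << f.shift) & fieldMask(f));
}
```

Writing 0xBEEF, 0x2AB and 0x15 into Part 1, Part 2 and Part 3 of a zero word gives 0xBEEFAAD5, and reading each field back returns the value that was written.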