Why can't I pack these ints together? - c++

I have the following code. The goal is to combine the two uint32_ts into a single uint64_t and then retrieve the values.
#include <iostream>
#include <cstdint>
int main()
{
uint32_t first = 5;
uint32_t second = 6;
uint64_t combined = (first << 32) | second;
uint32_t firstR = combined >> 32;
uint32_t secondR = combined & 0xffffffff;
std::cout << "F: " << firstR << " S: " << secondR << std::endl;
}
It outputs
F: 0 S: 7
How do I successfully retrieve the values correctly?

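first is a 32-bit type and you bit-shift it by 32 bits. Shifting by a count greater than or equal to the width of the type is undefined behaviour. On x86 the shift count is typically reduced modulo 32, so first << 32 leaves first unchanged and the low word becomes 5 | 6 = 7, which matches your output. You need to cast it to a larger type before bit-shifting it.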
uint64_t combined = (static_cast<uint64_t>(first) << 32) | second;

When you perform first << 32, the shift is carried out within 32 bits, so there is no room left for the shifted bits and the result is not meaningful (formally, the behaviour is undefined). You need to convert the first value to 64 bits before you shift it:
uint64_t combined = (uint64_t(first) << 32) | second;

As per the comments:
#include <iostream>
#include <cstdint>
int main()
{
uint32_t first = 5;
uint32_t second = 6;
uint64_t combined = (uint64_t(first) << 32) | second;
uint32_t firstR = combined >> 32;
uint32_t secondR = combined & 0xffffffff;
std::cout << "F: " << firstR << " S: " << secondR << std::endl;
}
The shift operators return the (promoted) type of their left operand, so first << 32 is still only 32 bits wide. You need to cast first to uint64_t before shifting so that the result has room for both values.
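The cast-then-shift pattern from the answers above can be wrapped in two small helpers. This is only a minimal sketch: the names pack_u32 and unpack_u32 are made up for illustration and do not appear in the question.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical helper: combine two 32-bit values into one 64-bit value.
// The cast widens `high` BEFORE the shift, so the shift happens in 64 bits.
constexpr std::uint64_t pack_u32(std::uint32_t high, std::uint32_t low) {
    return (static_cast<std::uint64_t>(high) << 32) | low;
}

// Hypothetical helper: split a 64-bit value back into {high, low}.
inline std::pair<std::uint32_t, std::uint32_t> unpack_u32(std::uint64_t v) {
    return { static_cast<std::uint32_t>(v >> 32),
             static_cast<std::uint32_t>(v & 0xffffffffu) };
}
```

With the values from the question, pack_u32(5, 6) yields 0x0000000500000006, and unpack_u32 on that value gives back 5 and 6.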

Related

Arrays of enum's packed into bit fields in MSVC++

Using MS Studio 2022 I am trying to pack two items into a union of size 16 bits but I am having problems with the correct syntax.
The first item is an unsigned short int so no problems there. The other is an array of 5 items, all two bits long. So imagine:
enum States {unused, on, off};
// Should be able to store this in a 2 bit field
then I want
States myArray[5];
// Should be able to fit in 10 bits and
// be unioned with my unsigned short
Unfortunately I am completely failing to work out the correct syntax that would make my array fit into 16 bits. Any ideas?
You can't do that. An array is an array, not some packed bits.
What you can do is use manual bit manipulation:
#include <iostream>
#include <cstdint>
#include <bitset>
#include <climits>
enum status {
on = 0x03,
off = 0x01,
unused = 0x00
};
constexpr std::uint8_t status_bit_width = 2;
std::uint16_t encode(status s,std::uint8_t id, std::uint16_t status_vec) {
if(id >= (CHAR_BIT * sizeof(std::uint16_t)) / status_bit_width) {
std::cout << "illegal id" << std::endl;
return status_vec;
}
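std::uint8_t bit_value = s;
// mask covering the 2-bit slot that belongs to this id
std::uint16_t mask = ((1 << status_bit_width) - 1) << (id*status_bit_width);
status_vec &= ~mask; // clear whatever was stored in this slot before
status_vec |= bit_value << (id*status_bit_width);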
return status_vec;
};
int main(void) {
std::uint16_t bits = 0;
bits = encode(on,1,bits);
std::cout << std::bitset<16>(bits) << std::endl;
bits = encode(off,2,bits);
std::cout << std::bitset<16>(bits) << std::endl;
bits = encode(unused,3,bits);
std::cout << std::bitset<16>(bits) << std::endl;
bits = encode(off,4,bits);
std::cout << std::bitset<16>(bits) << std::endl;
bits = encode(off,7,bits);
std::cout << std::bitset<16>(bits) << std::endl;
bits = encode(on,8,bits);
}
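The encode function above only writes values. A matching pair of set and get helpers might look like the sketch below; the names set_state, get_state, kBitsPerState and kStateMask are hypothetical, and slots are assumed to be numbered 0 to 7 within a 16-bit word.

```cpp
#include <cstdint>

// Hypothetical constants: each state occupies 2 bits.
constexpr unsigned kBitsPerState = 2;
constexpr unsigned kStateMask = 0x3;

// Store `value` (0..3) in slot `slot` (0..7), clearing the slot's old bits first.
inline std::uint16_t set_state(std::uint16_t word, unsigned slot, unsigned value) {
    const unsigned shift = slot * kBitsPerState;
    const unsigned cleared = word & ~(kStateMask << shift);
    return static_cast<std::uint16_t>(cleared | ((value & kStateMask) << shift));
}

// Read back the 2-bit value stored in slot `slot` (0..7).
inline unsigned get_state(std::uint16_t word, unsigned slot) {
    return (word >> (slot * kBitsPerState)) & kStateMask;
}
```

Five such slots take 10 bits, so they fit in the 16-bit word the question asks for, and the word itself can be read or written as a plain unsigned short.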

Lemires Nearly Divisionless Modulo Trick

In https://lemire.me/blog/2019/06/06/nearly-divisionless-random-integer-generation-on-various-systems/, Lemire uses -s % s to compute something which according to the paper is supposed to be 2^L % s. According to https://shufflesharding.com/posts/dissecting-lemire this should be equivalent, but I'm getting different results. A 32-bit example:
#include <cstdint>
#include <iostream>
int main() {
uint64_t s = 1440000000;
uint64_t k1 = (1ULL << 32ULL) % s;
uint64_t k2 = (-s) % s;
std::cout << k1 << std::endl;
std::cout << k2 << std::endl;
}
Output:
./main
1414967296
109551616
The results aren't matching. What am I missing?
Unary negation on an unsigned integer is computed modulo 2^N, where N is the width of the type. With a uint64_t, -s is 2^64 - s, so (-s) % s gives 2^64 % s rather than 2^32 % s.
So if you want to simulate 32-bit operations using uint64_t variables, you need to cast the value to 32 bits for that step:
#include <cstdint>
#include <iostream>
int main() {
uint64_t s = 1440000000;
uint64_t k1 = (1ULL << 32ULL) % s;
uint64_t k2 = (-uint32_t(s)) % s;
std::cout << k1 << std::endl;
std::cout << k2 << std::endl;
}
Which leads to the expected result:
1414967296
1414967296
See on godbolt
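The equivalence can also be checked directly in 32-bit arithmetic. The sketch below assumes that uint32_t is at least as wide as int (true on common platforms), so that the subtraction wraps modulo 2^32; the function names are made up for illustration, and s must be non-zero.

```cpp
#include <cstdint>

// 2^32 mod s, computed the obvious way in 64-bit arithmetic.
constexpr std::uint32_t pow32_mod_wide(std::uint32_t s) {
    return static_cast<std::uint32_t>((std::uint64_t{1} << 32) % s);
}

// The same value using only 32-bit arithmetic: 0 - s wraps to 2^32 - s,
// and (2^32 - s) is congruent to 2^32 modulo s.
constexpr std::uint32_t pow32_mod_wrap(std::uint32_t s) {
    return (std::uint32_t{0} - s) % s;
}
```

For s = 1440000000 both functions return 1414967296, the value printed for k1 above.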

Appending bits in C/C++

I want to append two unsigned 32bit integers into 1 64 bit integer. I have tried this code, but it fails. However, it works for 16bit integers into 1 32 bit
Code:
char buffer[33];
char buffer2[33];
char buffer3[33];
/*
uint16 int1 = 6535;
uint16 int2 = 6532;
uint32 int3;
*/
uint32 int1 = 653545;
uint32 int2 = 562425;
uint64 int3;
int3 = int1;
int3 = (int3 << 32 /*(when I am doing 16 bit integers, this 32 turns into a 16)*/) | int2;
itoa(int1, buffer, 2);
itoa(int2, buffer2, 2);
itoa(int3, buffer3, 2);
std::cout << buffer << "|" << buffer2 << " = \n" << buffer3 << "\n";
Output when the 16bit portion is enabled:
1100110000111|1100110000100 =
11001100001110001100110000100
Output when the 32bit portion is enabled:
10011111100011101001|10001001010011111001 =
10001001010011111001
Why is it not working? Thanks
I see nothing wrong with this code. It works for me. If there's a bug, it's in the code that's not shown.
Here is a version of the given code that uses standardized type declarations and iostream manipulators instead of platform-specific library calls. The bit operations are identical to the example given.
#include <iostream>
#include <iomanip>
#include <stdint.h>
int main()
{
uint32_t int1 = 653545;
uint32_t int2 = 562425;
uint64_t int3;
int3 = int1;
int3 = (int3 << 32) | int2;
std::cout << std::hex << std::setw(8) << std::setfill('0')
<< int1 << " "
<< std::setw(8) << std::setfill('0')
<< int2 << "="
<< std::setw(16) << std::setfill('0')
<< int3 << std::endl;
return (0);
}
Resulting output:
0009f8e9 000894f9=0009f8e9000894f9
The bitwise operation looks correct to me. When working with bits, hexadecimal is more convenient. Any bug, if there is one, is in the code that was not shown in the question. As far as "appending bits in C++" goes, what you have in your code appears to be correct.
Try declaring buffer3 as buffer3[65]; 33 chars is not enough for 64 binary digits plus the terminator.
Edit:
In fact the output is just what you should expect, and you can infer it from your own result for the 16-bit input.
itoa takes an int as its first argument, so int3 is truncated to its low 32 bits when it is passed. After the shift and OR those low 32 bits are exactly int2, and itoa does not print leading zeros, so the third string is identical to the second. The high half of int3 is still there; it just never reaches itoa.
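One way to print all 64 bits without itoa is std::bitset. The helper below is a small sketch (to_binary is a made-up name) that strips leading zeros so the output resembles what itoa produces for smaller values.

```cpp
#include <bitset>
#include <cstdint>
#include <string>

// Hypothetical helper: binary text of a 64-bit value, without leading zeros.
inline std::string to_binary(std::uint64_t v) {
    const std::string bits = std::bitset<64>(v).to_string();
    const std::size_t first_one = bits.find('1');
    return first_one == std::string::npos ? "0" : bits.substr(first_one);
}
```

Applied to the int3 from the question, this prints the 20 bits of int1 followed by int2 padded to 32 bits.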

MIPS Simulator --- read instructions into memory (C++)

I need to read a 32 bit address in hex format (ex: 0129ef12) and split up the 32 bits into 6-5-5-16 packets that represent Opcode-Rd-Rs-Immediate, respectively.
This is what I have so far:
typedef unsigned char u8;
typedef unsigned short u16;
union {
unsigned int address;
struct {
u16 imm : 16;
u8 rs : 5;
u8 rd : 5;
u8 opcode : 6;
} i;
} InstRead;
InstRead.address = 0x0129ef12;
cout << hex << int(InstRead.i.opcode) << "\n";
cout << hex << int(InstRead.i.rs) << "\n";
cout << hex << int(InstRead.i.rd) << "\n";
cout << hex << int(InstRead.i.imm) << "\n";
However, this does not give the correct output... i.e the bits are not selected by the lengths 6-5-5-16 that I have specified... What am I doing wrong?
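union {
unsigned int address;
struct {
unsigned int imm : 16,
rs : 5,
rd : 5,
opcode : 6;
} i;
} InstRead;
See if you have better luck with that union, which declares every bit-field with the same underlying type (unsigned int). Some compilers, MSVC for example, only pack adjacent bit-fields into the same storage unit when their declared types have the same size, so mixing u8 and u16 fields can stop them from sharing one 32-bit word. Bit-field layout is implementation-defined, so it will still depend on your compiler.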
Your code works for me (gcc 4.8.3 on Windows 7).
You can use bit operations to extract the fields in a more portable manner:
imm = address & 0xffff;
rs = (address >> 16) & 0x1f; // parentheses matter: >> binds tighter than &
// et cetera
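Spelled out for all four fields, the shift-and-mask approach could look like the sketch below. It assumes the 6-5-5-16 layout from the question with the opcode in the most significant bits; IType and decode_itype are names invented for this example.

```cpp
#include <cstdint>

// Hypothetical container for the decoded fields.
struct IType {
    std::uint32_t opcode;  // bits 31..26
    std::uint32_t rd;      // bits 25..21
    std::uint32_t rs;      // bits 20..16
    std::uint32_t imm;     // bits 15..0
};

// Shift each field down to bit 0, then mask off everything above it.
constexpr IType decode_itype(std::uint32_t word) {
    return IType{ (word >> 26) & 0x3f,
                  (word >> 21) & 0x1f,
                  (word >> 16) & 0x1f,
                  word & 0xffff };
}
```

For the sample word 0x0129ef12 this gives opcode 0, rd 9, rs 9 and imm 0xef12, independent of how the compiler lays out bit-fields.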

Converting hex String to structure

I've got a file containing a large string of hexadecimal. Here's the first few lines:
0000038f
0000111d
0000111d
03030303
//Goes on for a long time
I have a large struct that is intended to hold that data:
typedef struct
{
unsigned int field1: 5;
unsigned int field2: 11;
unsigned int field3: 16;
//Goes on for a long time
}calibration;
What I want to do is read the above string and store it in the struct. I can assume the input is valid (it's verified before I get it).
I've already got a loop that reads the file and puts the whole item in a string:
std::string line = "";
std::string hexText = "";
while(std::getline(readFile, line))
{
hexText += line;
}
//Convert string into calibration
//Convert string into long int
long int hexInt = strtol(hexText.c_str(), NULL, 16);
//Here I get stuck: How to get from long int to calibration...?
How to get from long int to calibration...?
Cameron's answer is good, and probably what you want.
I offer here another (maybe not so different) approach.
Note1: Your file input needs re-work. I will suggest
a) use getline() to fetch one line at a time into a string
b) convert the one entry to a uint32_t (I would use stringstream instead of atol)
once you learn how to detect and recover from invalid input,
you could then work on combining a) and b) into one step
c) then install the uint32_t in your structure, for which my
offering below might offer insight.
Note2: I have worked many years with bit fields, and have developed a distaste for them.
I have never found them more convenient than the alternatives.
The alternative I prefer is bit masks and field shifting.
So far as we can tell from your problem statement, it appears your problem does not need bit-fields (which Cameron's answer illustrates).
Note3: Not all compilers will pack these bit fields for you.
The last compiler I used required what is called a "pragma".
G++ 4.8 on ubuntu seemed to pack the bytes just fine (i.e. no pragma needed)
The sizeof(calibration) for your original code is 4 ... i.e. packed.
Another issue is that packing can unexpectedly change when you change options or upgrade the compiler or change the compiler.
My team's work-around was to always have an assert against struct size and a few byte offsets in the CTOR.
Note4: I did not illustrate the use of 'union' to align a uint32_t array over your calibration struct.
This may be preferred over the reinterpret cast approach. Check your requirements, team lead, professor.
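To make the "bit masks and field shifting" alternative from Note2 concrete, here is a minimal sketch for the first 32-bit word. It assumes field1 occupies the least significant 5 bits, followed by field2 (11 bits) and field3 (16 bits); CalWord, split_word and join_word are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical unpacked form of the first 32-bit word.
struct CalWord {
    std::uint32_t field1;  // bits 4..0   (5 bits)
    std::uint32_t field2;  // bits 15..5  (11 bits)
    std::uint32_t field3;  // bits 31..16 (16 bits)
};

// Extract the three fields with shifts and masks.
constexpr CalWord split_word(std::uint32_t w) {
    return CalWord{ w & 0x1fu, (w >> 5) & 0x7ffu, (w >> 16) & 0xffffu };
}

// Rebuild the 32-bit word from the three fields.
constexpr std::uint32_t join_word(const CalWord& c) {
    return (c.field1 & 0x1fu) | ((c.field2 & 0x7ffu) << 5) | ((c.field3 & 0xffffu) << 16);
}
```

Because the layout is written out explicitly, it does not change with compiler options or pragmas, which is the main attraction over bit-fields.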
Anyway, in the spirit of your original effort, consider the following additions to your struct calibration:
typedef struct
{
uint32_t field1 : 5;
uint32_t field2 : 11;
uint32_t field3 : 16;
//Goes on for a long time
// I made up these next 2 fields for illustration
uint32_t field4 : 8;
uint32_t field5 : 24;
// ... add more fields here
// something typically done by ctor or used by ctor
void clear() { field1 = 0; field2 = 0; field3 = 0; field4 = 0; field5 = 0; }
void show123(const char* lbl=0) {
if(0 == lbl) lbl = " ";
std::cout << std::setw(16) << lbl;
std::cout << " " << std::setw(5) << std::hex << field3 << std::dec
<< " " << std::setw(5) << std::hex << field2 << std::dec
<< " " << std::setw(5) << std::hex << field1 << std::dec
<< " 0x" << std::hex << std::setfill('0') << std::setw(8)
<< *(reinterpret_cast<uint32_t*>(this))
<< " => " << std::dec << std::setfill(' ')
<< *(reinterpret_cast<uint32_t*>(this))
<< std::endl;
} // show
// I did not create show456() ...
// 1st uint32_t: set new val, return previous
uint32_t set123(uint32_t nxtVal) {
uint32_t* myVal = reinterpret_cast<uint32_t*>(this);
uint32_t prevVal = myVal[0];
myVal[0] = nxtVal;
return (prevVal);
}
// return current value of the combined field1, field2 field3
uint32_t get123(void) {
uint32_t* myVal = reinterpret_cast<uint32_t*>(this);
return (myVal[0]);
}
// 2nd uint32_t: set new val, return previous
uint32_t set45(uint32_t nxtVal) {
uint32_t* myVal = reinterpret_cast<uint32_t*>(this);
uint32_t prevVal = myVal[1];
myVal[1] = nxtVal;
return (prevVal);
}
// return current value of the combined field4, field5
uint32_t get45(void) {
uint32_t* myVal = reinterpret_cast<uint32_t*>(this);
return (myVal[1]);
}
// guess that next 4 fields fill 32 bits
uint32_t get6789(void) {
uint32_t* myVal = reinterpret_cast<uint32_t*>(this);
return (myVal[2]);
}
// ... tedious expansion
} calibration;
Here is some test code to illustrate the use:
uint32_t t125()
{
const char* lbl =
"\n 16 bits 11 bits 5 bits hex => dec";
calibration cal;
cal.clear();
std::cout << lbl << std::endl;
cal.show123();
cal.field1 = 1;
cal.show123("field1 = 1");
cal.clear();
cal.field1 = 31;
cal.show123("field1 = 31");
cal.clear();
cal.field2 = 1;
cal.show123("field2 = 1");
cal.clear();
cal.field2 = (2047 & 0x07ff);
cal.show123("field2 = 2047");
cal.clear();
cal.field3 = 1;
cal.show123("field3 = 1");
cal.clear();
cal.field3 = (65535 & 0x0ffff);
cal.show123("field3 = 65535");
cal.set123 (0xABCD6E17);
cal.show123 ("set123(0x...)");
cal.set123 (0xffffffff);
cal.show123 ("set123(0x...)");
cal.set123 (0x0);
cal.show123 ("set123(0x...)");
std::cout << "\n";
cal.clear();
std::cout << "get123(): " << cal.get123() << std::endl;
std::cout << " get45(): " << cal.get45() << std::endl;
// values from your file:
cal.set123 (0x0000038f);
cal.set45 (0x0000111d);
std::cout << "get123(): " << "0x" << std::hex << std::setfill('0')
<< std::setw(8) << cal.get123() << std::endl;
std::cout << " get45(): " << "0x" << std::hex << std::setfill('0')
<< std::setw(8) << cal.get45() << std::endl;
// cal.set6789 (0x03030303);
// std::cout << "get6789(): " << cal.get6789() << std::endl;
// ...
return(0);
}
And the test code output:
16 bits 11 bits 5 bits hex => dec
0 0 0 0x00000000 => 0
field1 = 1 0 0 1 0x00000001 => 1
field1 = 31 0 0 1f 0x0000001f => 31
field2 = 1 0 1 0 0x00000020 => 32
field2 = 2047 0 7ff 0 0x0000ffe0 => 65,504
field3 = 1 1 0 0 0x00010000 => 65,536
field3 = 65535 ffff 0 0 0xffff0000 => 4,294,901,760
set123(0x...) abcd 370 17 0xabcd6e17 => 2,882,366,999
set123(0x...) ffff 7ff 1f 0xffffffff => 4,294,967,295
set123(0x...) 0 0 0 0x00000000 => 0
get123(): 0
get45(): 0
get123(): 0x0000038f
get45(): 0x0000111d
The goal of this code is to help you see how the bit fields map into the lsbyte through msbyte of the data.
If you care at all about efficiency, don't read the whole thing into a string and then convert it. Simply read one word at a time, and convert that. Your loop should look something like:
calibration c;
uint32_t* const begin = reinterpret_cast<uint32_t*>(&c);
uint32_t* dest = begin;
// number of 32-bit words that fit in the structure
const std::ptrdiff_t wordCount = sizeof(calibration) / sizeof(uint32_t);
while (true) {
char hexText[8];
// TODO: Attempt to read 8 bytes from file and then skip whitespace
// TODO: Break out of the loop on EOF
std::uint32_t hexValue = 0; // TODO: Convert hex to dword
// Assumes the structure padding & packing matches the dump version's
// Assumes the structure size is exactly a multiple of 32 bits (w/ padding)
static_assert(sizeof(calibration) % 4 == 0, "calibration must be a whole number of 32-bit words");
assert(dest - begin < wordCount && "Too much data");
*dest++ = hexValue;
}
assert(dest - begin == wordCount && "Too little data");
Converting 8 chars of hex to an actual 4-byte int is a good exercise and is well-covered elsewhere, so I've left it out (along with the file reading, which is similarly well-covered).
Note the two assumptions in the loop: the first one cannot be checked either at run-time or compile time, and must be either agreed upon in advance or extra work has to be done to properly serialize the structure (handling structure packing and padding, etc.). The last one can at least be checked at compile time with the static_assert.
Also, care has to be taken to ensure that the endianness of the hex bytes in the file matches the endianness of the architecture executing the program when converting the hex string. This will depend on whether the hex was written in a specific endianness in the first place (in which case you can convert it from the known endianness to the current architecture's endianness quite easily), or whether it's architecture-dependent (in which case you have no choice but to assume the endianness is the same as your current architecture).
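For the reading and conversion step that both answers leave open, one possible sketch is shown below. It assumes one hex word per line and valid input, as the question states; read_hex_words is a made-up name, and std::stoul will throw on a line that is not hexadecimal.

```cpp
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: read one hexadecimal word per line from a stream.
inline std::vector<std::uint32_t> read_hex_words(std::istream& in) {
    std::vector<std::uint32_t> words;
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) continue;  // skip blank lines
        // Base 16 conversion; the input is assumed valid, so no error handling here.
        words.push_back(static_cast<std::uint32_t>(std::stoul(line, nullptr, 16)));
    }
    return words;
}
```

Each resulting word can then be copied into the structure, or decoded with masks and shifts, one 32-bit value at a time.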