Converting hex string to structure - C++

I've got a file containing a large string of hexadecimal. Here are the first few lines:
0000038f
0000111d
0000111d
03030303
//Goes on for a long time
I have a large struct that is intended to hold that data:
typedef struct
{
    unsigned int field1: 5;
    unsigned int field2: 11;
    unsigned int field3: 16;
    //Goes on for a long time
} calibration;
What I want to do is read the above string and store it in the struct. I can assume the input is valid (it's verified before I get it).
I've already got a loop that reads the file and puts the whole item in a string:
std::string line = "";
std::string hexText = "";
while(!std::getline(readFile, line))
{
    hexText += line;
}

//Convert string into calibration
//Convert string into long int
long int hexInt = strtol(hexText.c_str(), NULL, 16);
//Here I get stuck: How to get from long int to calibration...?

How to get from long int to calibration...?
Cameron's answer is good, and probably what you want.
I offer here another (maybe not so different) approach.
Note1: Your file input needs rework. I suggest:
a) use getline() to fetch one line at a time into a string
b) convert that one entry to a uint32_t (I would use a stringstream rather than strtol)
c) then install the uint32_t in your structure, for which my offering below might offer insight.
Once you learn how to detect and recover from invalid input, you could work on combining a) and b) into one step, as in the sketch below.
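If it helps to see the shape of a) and b) combined, here is a minimal sketch (assuming readFile is an already-open std::ifstream and each line holds one hex entry; readWords is a made-up helper name):

#include <cstdint>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

std::vector<uint32_t> readWords(std::ifstream& readFile)
{
    std::vector<uint32_t> words;
    std::string line;
    while (std::getline(readFile, line)) // note: no '!' -- loop while reads succeed
    {
        std::stringstream ss(line);
        uint32_t word = 0;
        if (ss >> std::hex >> word)      // b) convert the one entry on this line
            words.push_back(word);
        // else: invalid input -- this is where detection/recovery would go
    }
    return words;
}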
Note2: I have worked many years with bit fields, and have developed a distaste for them.
I have never found them more convenient than the alternatives.
The alternative I prefer is bit masks and field shifting.
So far as we can tell from your problem statement, it appears your problem does not need bit-fields (which Cameron's answer illustrates).
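To illustrate the style I mean, a sketch of mask-and-shift accessors for your first three fields (the masks and shift counts below are simply derived from your 5/11/16 bit widths):

#include <cstdint>

// field1: bits 0..4, field2: bits 5..15, field3: bits 16..31
inline uint32_t getField1(uint32_t word) { return  word        & 0x1Fu;   }
inline uint32_t getField2(uint32_t word) { return (word >> 5)  & 0x7FFu;  }
inline uint32_t getField3(uint32_t word) { return (word >> 16) & 0xFFFFu; }

inline uint32_t setField2(uint32_t word, uint32_t v)
{
    return (word & ~(0x7FFu << 5)) | ((v & 0x7FFu) << 5);
}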
Note3: Not all compilers will pack these bit fields for you.
The last compiler I used required what is called a "pragma".
G++ 4.8 on Ubuntu seemed to pack the bytes just fine (i.e. no pragma needed).
The sizeof(calibration) for your original code is 4 ... i.e. packed.
Another issue is that packing can unexpectedly change when you change options or upgrade the compiler or change the compiler.
My team's work-around was to always have an assert against struct size and a few byte offsets in the CTOR.
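For example (a sketch; the expected size here assumes your original three-field struct packs to 4 bytes, as observed above):

#include <cstddef>   // for offsetof

static_assert(sizeof(calibration) == 4, "calibration packing changed");
// offsetof() cannot be applied to bit-fields, so for a bit-field-only
// struct the size check is the practical guard; byte-offset asserts
// work for ordinary members, e.g.
//   static_assert(offsetof(SomeStruct, member) == 4, "layout changed");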
Note4: I did not illustrate the use of 'union' to align a uint32_t array over your calibration struct.
This may be preferred over the reinterpret_cast approach. Check your requirements, team lead, professor.
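For reference, such a union overlay might look like this (a sketch, using the calibration type defined just below; note that reading the inactive member is formally undefined behaviour in C++, though widely supported):

union CalibrationOverlay
{
    calibration cal;                                        // bit-field view
    uint32_t    words[sizeof(calibration) / sizeof(uint32_t)]; // raw 32-bit view
};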
Anyway, in the spirit of your original effort, consider the following additions to your struct calibration:
typedef struct
{
    uint32_t field1 :  5;
    uint32_t field2 : 11;
    uint32_t field3 : 16;
    //Goes on for a long time

    // I made up these next 2 fields for illustration
    uint32_t field4 :  8;
    uint32_t field5 : 24;
    // ... add more fields here

    // something typically done by ctor or used by ctor
    void clear() { field1 = 0; field2 = 0; field3 = 0; field4 = 0; field5 = 0; }

    void show123(const char* lbl = 0) {
        if (0 == lbl) lbl = " ";
        std::cout << std::setw(16) << lbl;
        std::cout << " " << std::setw(5) << std::hex << field3 << std::dec
                  << " " << std::setw(5) << std::hex << field2 << std::dec
                  << " " << std::setw(5) << std::hex << field1 << std::dec
                  << " 0x" << std::hex << std::setfill('0') << std::setw(8)
                  << *(reinterpret_cast<uint32_t*>(this))
                  << " => " << std::dec << std::setfill(' ')
                  << *(reinterpret_cast<uint32_t*>(this))
                  << std::endl;
    } // show

    // I did not create show456() ...

    // 1st uint32_t: set new val, return previous
    uint32_t set123(uint32_t nxtVal) {
        uint32_t* myVal = reinterpret_cast<uint32_t*>(this);
        uint32_t prevVal = myVal[0];
        myVal[0] = nxtVal;
        return (prevVal);
    }

    // return current value of the combined field1, field2, field3
    uint32_t get123(void) {
        uint32_t* myVal = reinterpret_cast<uint32_t*>(this);
        return (myVal[0]);
    }

    // 2nd uint32_t: set new val, return previous
    uint32_t set45(uint32_t nxtVal) {
        uint32_t* myVal = reinterpret_cast<uint32_t*>(this);
        uint32_t prevVal = myVal[1];
        myVal[1] = nxtVal;
        return (prevVal);
    }

    // return current value of the combined field4, field5
    uint32_t get45(void) {
        uint32_t* myVal = reinterpret_cast<uint32_t*>(this);
        return (myVal[1]);
    }

    // guess that next 4 fields fill 32 bits
    uint32_t get6789(void) {
        uint32_t* myVal = reinterpret_cast<uint32_t*>(this);
        return (myVal[2]);
    }

    // ... tedious expansion

} calibration;
Here is some test code to illustrate the use:
uint32_t t125()
{
    const char* lbl =
        "\n             16 bits 11 bits 5 bits    hex => dec";

    calibration cal;
    cal.clear();

    std::cout << lbl << std::endl;
    cal.show123();

    cal.field1 = 1;
    cal.show123("field1 = 1");
    cal.clear();

    cal.field1 = 31;
    cal.show123("field1 = 31");
    cal.clear();

    cal.field2 = 1;
    cal.show123("field2 = 1");
    cal.clear();

    cal.field2 = (2047 & 0x07ff);
    cal.show123("field2 = 2047");
    cal.clear();

    cal.field3 = 1;
    cal.show123("field3 = 1");
    cal.clear();

    cal.field3 = (65535 & 0x0ffff);
    cal.show123("field3 = 65535");

    cal.set123(0xABCD6E17);
    cal.show123("set123(0x...)");

    cal.set123(0xffffffff);
    cal.show123("set123(0x...)");

    cal.set123(0x0);
    cal.show123("set123(0x...)");

    std::cout << "\n";
    cal.clear();
    std::cout << "get123(): " << cal.get123() << std::endl;
    std::cout << " get45(): " << cal.get45() << std::endl;

    // values from your file:
    cal.set123(0x0000038f);
    cal.set45 (0x0000111d);
    std::cout << "get123(): " << "0x" << std::hex << std::setfill('0')
              << std::setw(8) << cal.get123() << std::endl;
    std::cout << " get45(): " << "0x" << std::hex << std::setfill('0')
              << std::setw(8) << cal.get45() << std::endl;

    // cal.set6789(0x03030303);
    // std::cout << "get6789(): " << cal.get6789() << std::endl;
    // ...

    return (0);
}
And the test code output:
             16 bits 11 bits 5 bits    hex => dec
                      0     0     0 0x00000000 => 0
      field1 = 1      0     0     1 0x00000001 => 1
     field1 = 31      0     0    1f 0x0000001f => 31
      field2 = 1      0     1     0 0x00000020 => 32
   field2 = 2047      0   7ff     0 0x0000ffe0 => 65,504
      field3 = 1      1     0     0 0x00010000 => 65,536
  field3 = 65535   ffff     0     0 0xffff0000 => 4,294,901,760
   set123(0x...)   abcd   370    17 0xabcd6e17 => 2,882,366,999
   set123(0x...)   ffff   7ff    1f 0xffffffff => 4,294,967,295
   set123(0x...)      0     0     0 0x00000000 => 0

get123(): 0
 get45(): 0
get123(): 0x0000038f
 get45(): 0x0000111d
The goal of this code is to help you see how the bit fields map into the lsbyte through msbyte of the data.

If you care at all about efficiency, don't read the whole thing into a string and then convert it. Simply read one word at a time, and convert that. Your loop should look something like:
calibration c;
uint32_t* const base = reinterpret_cast<uint32_t*>(&c);
uint32_t* dest = base;
while (true) {
    char hexText[8];
    // TODO: Attempt to read 8 bytes from file and then skip whitespace
    // TODO: Break out of the loop on EOF

    std::uint32_t hexValue = 0; // TODO: Convert hex to dword

    // Assumes the structure padding & packing matches the dump version's
    // Assumes the structure size is exactly a multiple of 32 bits (w/ padding)
    static_assert(sizeof(calibration) % 4 == 0, "calibration must be a whole number of 32-bit words");

    assert(dest - base < static_cast<ptrdiff_t>(sizeof(calibration) / 4) && "Too much data");
    *dest++ = hexValue;
}
assert(dest - base == static_cast<ptrdiff_t>(sizeof(calibration) / 4) && "Too little data");
Converting 8 chars of hex to an actual 4-byte int is a good exercise and is well-covered elsewhere, so I've left it out (along with the file reading, which is similarly well-covered).
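(If it helps to have a shape in mind, one possible conversion, assuming the 8 chars are first copied into a NUL-terminated buffer; toDword is a made-up name and just a sketch:)

#include <cstdint>
#include <cstdlib>

// hexText must hold exactly 8 validated hex digits plus a terminating NUL
std::uint32_t toDword(const char* hexText)
{
    return static_cast<std::uint32_t>(std::strtoul(hexText, nullptr, 16));
}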
Note the two assumptions in the loop: the first cannot be checked at either run time or compile time, so it must either be agreed upon in advance or extra work must be done to properly serialize the structure (handling structure packing and padding, etc.). The second can at least be checked at compile time with the static_assert.
Also, care has to be taken to ensure that the endianness of the hex bytes in the file matches the endianness of the architecture executing the program when converting the hex string. This depends on whether the hex was written in a specific endianness in the first place (in which case you can convert from the known endianness to the current architecture's endianness quite easily), or whether it's architecture-dependent (in which case you have no choice but to assume the endianness is the same as your current architecture).
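If the file's endianness is known and differs from the host's, the per-word swap is straightforward -- a sketch:

#include <cstdint>

std::uint32_t swapBytes(std::uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) <<  8) |
           ((v & 0x00FF0000u) >>  8) |
           ((v & 0xFF000000u) >> 24);
}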


Why are the bit patterns not matching with the use of std::bitset [duplicate]

While I was unit testing my class and its constructors I noticed something peculiar with my outputs.
#include <bitset>
#include <cstdint>
#include <iostream>
#include <vector>

typedef std::uint8_t  u8;
typedef std::uint16_t u16;
typedef std::uint32_t u32;
typedef std::uint64_t u64;

struct Reg8 {
    std::bitset<8> bits;
    u8 value;

    Reg8() : value{0}, bits{value} {}
    explicit Reg8( u8 val) : value{val}, bits{value} {}
    explicit Reg8(u16 val) : value{static_cast<u8>(val)}, bits{value} {}
    explicit Reg8(u32 val) : value{static_cast<u8>(val)}, bits{value} {}
    explicit Reg8(u64 val) : value{static_cast<u8>(val)}, bits{value} {}
};

int main() {
    u8  val8  = 24;
    u16 val16 = 24;
    u32 val32 = 24;
    u64 val64 = 24;

    Reg8 r8a(val8);
    Reg8 r8b(val16);
    Reg8 r8c(val32);
    Reg8 r8d(val64);

    std::cout << "Reg8(u8) r8a value = " << +r8a.value << '\n';
    std::cout << "Reg8(u8) r8a bits = " << r8a.bits << "\n\n";
    std::cout << "Reg8(u16) r8b value = " << +r8b.value << '\n';
    std::cout << "Reg8(u16) r8b bits = " << r8b.bits << "\n\n";
    std::cout << "Reg8(u32) r8c value = " << +r8c.value << '\n';
    std::cout << "Reg8(u32) r8c bits = " << r8c.bits << "\n\n";
    std::cout << "Reg8(u64) r8d value = " << +r8d.value << '\n';
    std::cout << "Reg8(u64) r8d bits = " << r8d.bits << "\n\n";

    std::bitset<8> bitsA{ val8 };
    std::cout << "bits value = " << bitsA.to_ullong() << '\n';
    std::cout << "bits binary = " << bitsA << "\n\n";

    std::bitset<8> bitsB{ val16 };
    std::cout << "bits value = " << bitsB.to_ullong() << '\n';
    std::cout << "bits binary = " << bitsB << "\n\n";

    std::bitset<8> bitsC{ val32 };
    std::cout << "bits value = " << bitsC.to_ullong() << '\n';
    std::cout << "bits binary = " << bitsC << "\n\n";

    std::bitset<8> bitsD{ val64 };
    std::cout << "bits value = " << bitsD.to_ullong() << '\n';
    std::cout << "bits binary = " << bitsD << "\n\n";

    return EXIT_SUCCESS;
}
Here is my output, coming from a little endian machine (Intel Quad Core Extreme) running Windows 7 x64, using Visual Studio 2017 CE in x64 debug mode with the compiler's language option set to the latest C++ draft standard. All other compiler flags (optimizations etc.) are Visual Studio's defaults.
Reg8(u8) r8a value = 24
Reg8(u8) r8a bits = 11001100
Reg8(u16) r8b value = 24
Reg8(u16) r8b bits = 11001100
Reg8(u32) r8c value = 24
Reg8(u32) r8c bits = 11001100
Reg8(u64) r8d value = 24
Reg8(u64) r8d bits = 11001100
bits value = 24
bits binary = 00011000
bits value = 24
bits binary = 00011000
bits value = 24
bits binary = 00011000
bits value = 24
bits binary = 00011000
So why do the bit patterns from the class construction not match that of the ones declared in main?
I'm using the same variable types and values to initialize the bitsets in main and the one in my class, yet their bit patterns don't match up. All of the ones in the class are the same, and all the ones outside of the class are the same.
What I expect to see, and what I want, are the values shown in main. In the bottom half of the output, the bitset variables in main have the value 24, and their bit patterns match 24 in 8 bits:
0001 1000 = 24
However, the bitset stored in my class holds the appropriate value but a bit pattern that does not match:
1100 1100 // doesn't equal 24 in binary
What is going on here?
Remember that a member initialization list initializes in declaration order. Meaning that for:
struct Reg8 {
    std::bitset<8> bits; // <- first to be initialized by the mem-init-list
    u8 value;            // <- second
};
And taking this constructor as an example:
explicit Reg8(u16 val) : value{static_cast<u8>(val)}, bits{value} {}
The order of the initialization list doesn't matter. bits comes before value in the declaration, so bits is initialized first. It is initialized from value, which is still uninitialized, because value{static_cast<u8>(val)} only runs after the initialization of bits. Reading that indeterminate value is what produces the garbage pattern (11001100 is 0xCC, MSVC's debug-mode fill for uninitialized stack memory).
To fix this, swap the declarations around:
struct Reg8 {
    u8 value;            // <- now initialized first
    std::bitset<8> bits; // <- initialized second, from a valid value
};
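Alternatively, you could remove the dependency altogether by initializing bits directly from the constructor argument rather than from value; then the declaration order no longer matters. A sketch of one of the constructors:

explicit Reg8(u16 val) : value{static_cast<u8>(val)}, bits{static_cast<u8>(val)} {}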
Side note: You're missing a <cstdlib> include in your code (for EXIT_SUCCESS).

Why can't I pack these ints together?

I have the following code. The goal is to combine the two uint32_ts into a single uint64_t and then retrieve the values.
#include <iostream>
#include <cstdint>

int main()
{
    uint32_t first = 5;
    uint32_t second = 6;

    uint64_t combined = (first << 32) | second;

    uint32_t firstR = combined >> 32;
    uint32_t secondR = combined & 0xffffffff;

    std::cout << "F: " << firstR << " S: " << secondR << std::endl;
}
It outputs
F: 0 S: 7
How do I successfully retrieve the values correctly?
first is a 32-bit type and you bit-shift it by 32 bits. This is undefined behaviour; in practice here (on x86 the shift count is masked to 5 bits) the shift leaves first unchanged, so combined becomes 5 | 6 = 7, which matches your output. You need to cast it to a larger type before bit-shifting it.
uint64_t combined = (static_cast<uint64_t>(first) << 32) | second;
When you perform first << 32, you are shifting 32 bits within the space of a 32-bit type, so no bits of the original value can remain in the result (formally, the behaviour is undefined). You need to convert the first value to 64 bits before you shift it:
uint64_t combined = (uint64_t(first) << 32) | second;
As per the comments:
#include <iostream>
#include <cstdint>

int main()
{
    uint32_t first = 5;
    uint32_t second = 6;

    uint64_t combined = (uint64_t(first) << 32) | second;

    uint32_t firstR = combined >> 32;
    uint32_t secondR = combined & 0xffffffff;

    std::cout << "F: " << firstR << " S: " << secondR << std::endl;
}
The shift operators yield the (promoted) type of their left operand. So you need to cast the left operand to uint64_t in order for the result to have room for the second value.

Appending bits in C/C++

I want to append two unsigned 32-bit integers into one 64-bit integer. I have tried this code, but it fails. However, it works for two 16-bit integers into one 32-bit integer.
Code:
char buffer[33];
char buffer2[33];
char buffer3[33];

/*
uint16 int1 = 6535;
uint16 int2 = 6532;
uint32 int3;
*/
uint32 int1 = 653545;
uint32 int2 = 562425;
uint64 int3;

int3 = int1;
int3 = (int3 << 32 /*(when I am doing 16 bit integers, this 32 turns into a 16)*/) | int2;

itoa(int1, buffer, 2);
itoa(int2, buffer2, 2);
itoa(int3, buffer3, 2);

std::cout << buffer << "|" << buffer2 << " = \n" << buffer3 << "\n";
Output when the 16bit portion is enabled:
1100110000111|1100110000100 =
11001100001110001100110000100
Output when the 32bit portion is enabled:
10011111100011101001|10001001010011111001 =
10001001010011111001
Why is it not working? Thanks
I see nothing wrong with this code. It works for me. If there's a bug, it's in the code that's not shown.
Here is a version of the given code, using standardized type declarations and iostream manipulators instead of platform-specific library calls. The bit operations are identical to the example given.
#include <iostream>
#include <iomanip>
#include <stdint.h>

int main()
{
    uint32_t int1 = 653545;
    uint32_t int2 = 562425;
    uint64_t int3;

    int3 = int1;
    int3 = (int3 << 32) | int2;

    std::cout << std::hex << std::setw(8) << std::setfill('0')
              << int1 << " "
              << std::setw(8) << std::setfill('0')
              << int2 << "="
              << std::setw(16) << std::setfill('0')
              << int3 << std::endl;

    return (0);
}
Resulting output:
0009f8e9 000894f9=0009f8e9000894f9
The bitwise operation looks correct to me. When working with bits, hexadecimal is more convenient. Any bug, if there is one, is in the code that was not shown in the question. As far as "appending bits in C++" goes, what you have in your code appears to be correct.
Try declaring buffer3 as buffer3[65].
Edit:
Sorry. But I don't understand what the complaint is about; in fact the answer is just as expected, and you can infer it from your own result for the 16-bit input. itoa takes a plain (32-bit) int as its value parameter, so when you pass the 64-bit int3 it is truncated: the upper 32 bits are discarded and only the low 32 bits -- i.e. int2, since int1 was shifted into the upper half -- appear in the output, which is exactly what you show.
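If you want to see all 64 bits without depending on itoa's int parameter (or its non-standard availability), std::bitset is a simple alternative -- a sketch, reusing the question's variables (unlike itoa it prints leading zeroes, but every bit of int3 is visible):

#include <bitset>
#include <iostream>

std::cout << std::bitset<32>(int1) << "|" << std::bitset<32>(int2)
          << " = \n" << std::bitset<64>(int3) << "\n";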

C++ How to create byte[] array from file (I don't mean reading file byte by byte)?

I have a problem that I can neither solve on my own nor find an answer to anywhere. I have a file that contains a string such as:
01000000d08c9ddf0115d1118c7a00c04
I would like to read the file so that I end up with what I would otherwise write manually like this:
char fromFile[] =
"\x01\x00\x00\x00\xd0\x8c\x9d\xdf\x011\x5d\x11\x18\xc7\xa0\x0c\x04";
I would really appreciate any help.
I want to do it in C++ (ideally VC++).
Thank you!
int t194(void)
{
    // imagine you have n pair of char, for simplicity,
    // here n is 3 (you should recognize them)
    char pair1[] = "01"; // note:
    char pair2[] = "8c"; // initialize with 3 char c-style strings
    char pair3[] = "c7"; //

    {
        // let us put these into a ram based stream, with spaces
        std::stringstream ss;
        ss << pair1 << " " << pair2 << " " << pair3;

        // each pair can now be extracted into
        // pre-declared int vars
        int i1 = 0;
        int i2 = 0;
        int i3 = 0;

        // use formatted extractor to convert
        ss >> i1 >> i2 >> i3;

        // show what happened (for debug only)
        std::cout << "Confirm1:" << std::endl;
        std::cout << "i1: " << i1 << std::endl;
        std::cout << "i2: " << i2 << std::endl;
        std::cout << "i3: " << i3 << std::endl << std::endl;
        // output is:
        // Confirm1:
        // i1: 1
        // i2: 8
        // i3: 0
        // Shucks, not correct.
        // We know the default radix is base 10.
        // I hope you can see that the input radix is wrong:
        // because c is not a decimal digit,
        // the i2 and i3 conversions stop before the 'c'
    }

    // pre-declare
    int i1 = 0;
    int i2 = 0;
    int i3 = 0;
    {
        // so we try again, with radix info added
        std::stringstream ss;
        ss << pair1 << " " << pair2 << " " << pair3;

        // strings are already in hex, so we use them as is
        ss >> std::hex // change radix to 16
           >> i1 >> i2 >> i3;

        // now show what happened
        std::cout << "Confirm2:" << std::endl;
        std::cout << "i1: " << i1 << std::endl;
        std::cout << "i2: " << i2 << std::endl;
        std::cout << "i3: " << i3 << std::endl << std::endl;
        // output now:
        // i1: 1
        // i2: 140
        // i3: 199
        // not what you expected? Though correct,
        // now we can see we have the wrong radix for output

        // add output radix to cout stream
        std::cout << std::hex // add radix info here!
                  << "i1: " << i1 << std::endl
                  // Note: only need to do once for std::cout
                  << "i2: " << i2 << std::endl
                  << "i3: " << i3 << std::endl << std::endl
                  << std::dec;
        // output now looks correct, and easily comparable to input
        // i1: 1
        // i2: 8c
        // i3: c7

        // So: What next?
        // read the entire string of hex input into a single string
        // separate this into pairs of chars (perhaps using
        //   string::substr())
        // put space separated pairs into stringstream ss
        // extract hex values until ss.eof()
        // probably should add error checks
        // and, of course, figure out how to use a loop for these steps
        //
        // alternative to consider:
        // read 1 char at a time, build a pairing, convert, repeat
    }

    //
    // Eventually, you should get far enough to discover that the
    // extracts I have done are integers, but you want to pack them
    // into an array of binary bytes.
    //
    // You can go back, and recode to extract bytes (either
    // unsigned char or uint8_t), which you might find interesting.
    //
    // Or ... because your input is hex, and the largest 2 char
    // value will be 0xff, and this fits into a single byte, you
    // can simply static_cast them (I use unsigned char)
    unsigned char bin[] = { static_cast<unsigned char>(i1),
                            static_cast<unsigned char>(i2),
                            static_cast<unsigned char>(i3) };

    // Now confirm by casting these back to ints to cout
    std::cout << "Confirm4: "
              << std::hex << std::setw(2) << std::setfill('0')
              << static_cast<int>(bin[0]) << " "
              << static_cast<int>(bin[1]) << " "
              << static_cast<int>(bin[2]) << std::endl;

    // you also might consider a vector (and I prefer uint8_t)
    // because push_back does a lot of hidden work for you
    std::vector<uint8_t> bytes;
    bytes.push_back(static_cast<uint8_t>(i1));
    bytes.push_back(static_cast<uint8_t>(i2));
    bytes.push_back(static_cast<uint8_t>(i3));

    // confirm
    std::cout << "\nConfirm5: ";
    for (size_t i = 0; i < bytes.size(); ++i)
        std::cout << std::hex << std::setw(2) << std::setfill(' ')
                  << static_cast<int>(bytes[i]) << " ";
    std::cout << std::endl;
Note: The cout (or ss) of bytes or char can be confusing, not always giving the result you might expect. My background is embedded software, and I have surprisingly small experience making stream i/o of bytes work. Just saying this tends to bias my work when dealing with stream i/o.
    // other considerations:
    //
    // you might read 1 char at a time. this can simplify
    // your loop, possibly easier to debug
    // ... would you have to detect and remove eoln? i.e. '\n'
    // ... how would you handle a bad input,
    //     such as a non-hex char, or an odd char count in a line
    //
    // I would probably prefer to use getline();
    // it will read until eoln(), and discard the '\n',
    // then in each string, loop char by char, creating char pairs, etc.
    //
    // Converting a vector<uint8_t> to char bytes[] can be an easier
    // effort in some ways. A vector<> guarantees that all the values
    // contained are 'packed' back-to-back, and contiguous in
    // memory, just right for binary stream output
    //
    // vector.size() tells how many chars have been pushed
    //
    // NOTE: the formatted 'insert' operator ('<<') can not
    // transfer binary data to a stream. You must use
    // stream::write() for binary output.
    //
    std::stringstream ssOut;

    // possible approach:
    // 1 step reinterpret_cast
    // - a binary block output requires "const char*"
    const char* myBuff = reinterpret_cast<const char*>(&bytes.front());
    ssOut.write(myBuff, bytes.size());
    // block write puts binary info into stream

    // confirm
    std::cout << "\nConfirm6: ";
    std::string s = ssOut.str(); // string with binary data
    for (size_t i = 0; i < s.size(); ++i)
    {
        // because binary data is _not_ signed data,
        // we need to 'cancel' the sign bit
        unsigned char ukar = static_cast<unsigned char>(s[i]);

        // because formatted output would interpret some chars
        // (like null, or \n), we cast to int
        int intVal = static_cast<int>(ukar);
        // cast does not generate code

        // now the formatted 'insert' operator
        // converts and displays what we want
        std::cout << std::hex << std::setw(2) << std::setfill('0')
                  << intVal << " ";
    }
    std::cout << std::endl;
    //
    //
    return (0);
} // int t194(void)
The below snippet should be helpful!

#include <cstdlib>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

std::ifstream input( "filePath", std::ios::binary );

// slurp the whole file into a string so it can be sliced into 2-char pairs
std::string hex((std::istreambuf_iterator<char>(input)),
                (std::istreambuf_iterator<char>()));

std::vector<char> bytes;
for (unsigned int i = 0; i < hex.size(); i += 2) {
    std::string byteString = hex.substr(i, 2);
    char byte = (char) strtol(byteString.c_str(), NULL, 16);
    bytes.push_back(byte);
}

char* byteArr = bytes.data();
The way I understand your question, you want just the binary representation of the numbers, i.e. to remove the ASCII (or EBCDIC) part. Your output array will be half the length of the input array.
Here is some crude pseudo code.
For each input char c:

if (isdigit(c)) c -= '0';
else if (isxdigit(c)) c = c - 'a' + 0xa; // need to check for isupper or islower

Then, depending on the index of c in your input array:

if (!(index % 2)) output[outputindex] = (c << 4) & 0xf0;
else output[outputindex++] |= c & 0x0f;
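In actual C++ that pseudo code might come out like this (a sketch; packNibbles is a made-up name, and the input is assumed to be validated hex of even length):

#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

std::vector<unsigned char> packNibbles(const std::string& input)
{
    // convert one hex digit to its 4-bit value
    auto nibble = [](char c) -> unsigned char {
        if (std::isdigit(static_cast<unsigned char>(c)))
            return static_cast<unsigned char>(c - '0');
        return static_cast<unsigned char>(
            std::tolower(static_cast<unsigned char>(c)) - 'a' + 0xa);
    };

    std::vector<unsigned char> output;
    for (std::size_t i = 0; i + 1 < input.size(); i += 2)
        output.push_back(static_cast<unsigned char>(
            (nibble(input[i]) << 4) | nibble(input[i + 1])));
    return output;
}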
Here is a function that takes a string as in your description, and outputs a string that has \x in front of each digit.
#include <iostream>
#include <algorithm>
#include <string>

std::string convertHex(const std::string& str)
{
    std::string retVal;
    std::string hexPrefix = "\\x";
    if (!str.empty())
    {
        std::string::const_iterator it = str.begin();
        do
        {
            if (std::distance(it, str.end()) == 1)
            {
                retVal += hexPrefix + "0";
                retVal += *(it);
                ++it;
            }
            else
            {
                retVal += hexPrefix + std::string(it, it + 2);
                it += 2;
            }
        } while (it != str.end());
    }
    return retVal;
}

using namespace std;

int main()
{
    cout << convertHex("01000000d08c9ddf0115d1118c7a00c04") << endl;
    cout << convertHex("015d");
}
Output:
\x01\x00\x00\x00\xd0\x8c\x9d\xdf\x01\x15\xd1\x11\x8c\x7a\x00\xc0\x04
\x01\x5d
Basically it is nothing more than a do-while loop. A string is built from each pair of characters encountered. If the number of characters left is 1 (meaning that there is only one digit), a "0" is added to the front of the digit.
I think I'd use a proxy class for reading and writing the data. Unfortunately, the code for the manipulators involved is just a little on the verbose side (to put it mildly).
#include <vector>
#include <algorithm>
#include <iterator>
#include <iostream>
#include <iomanip>
#include <string>
#include <sstream>
struct byte {
    unsigned char ch;

    friend std::istream &operator>>(std::istream &is, byte &b) {
        std::string temp;
        if (is >> std::setw(2) >> std::setprecision(2) >> temp)
            b.ch = std::stoi(temp, 0, 16);
        return is;
    }

    friend std::ostream &operator<<(std::ostream &os, byte const &b) {
        return os << "\\x" << std::setw(2) << std::setfill('0')
                  << std::setprecision(2) << std::hex << (int)b.ch;
    }
};

int main() {
    std::istringstream input("01000000d08c9ddf115d1118c7a00c04");
    std::ostringstream result;

    std::istream_iterator<byte> in(input), end;
    std::ostream_iterator<byte> out(result);

    std::copy(in, end, out);
    std::cout << result.str();
}
I do really dislike how verbose iomanipulators are, but other than that it seems pretty clean.
You can try a loop with fscanf (note the hh length modifier, so that %x stores into an unsigned char rather than an int):

unsigned char b;
fscanf(pFile, "%2hhx", &b);

Edit:

#define MAX_LINE_SIZE 128

FILE* pFile = fopen(...);
char fromFile[MAX_LINE_SIZE] = {0};

unsigned char b = 0;
int currentIndex = 0;
while (fscanf(pFile, "%2hhx", &b) > 0 && currentIndex < MAX_LINE_SIZE)
    fromFile[currentIndex++] = b;

How to initialize bitfields with a C++ Constructor?

First off, I’m not concerned with portability, and can safely assume that the endianness will not change. Assuming I read a hardware register value, I would like to overlay that register value over bitfields so that I can refer to the individual fields in the register without using bit masks.
EDIT: Fixed problems pointed out by GMan, and adjusted the code so it's clearer for future readers.
SEE: Anders K. & Michael J's answers below for a more eloquent solution.
#include <iostream>

/// \class HardwareRegister
/// Abstracts out bitfields in a hardware register.
/// \warning This is non-portable code.
class HardwareRegister
{
public:
    /// Constructor.
    /// \param[in] registerValue - the value of the entire register. The
    ///                            value will be overlaid onto the bitfields
    ///                            defined in this class.
    HardwareRegister(unsigned long registerValue = 0)
    {
        /// Lots of casting to get registerValue to overlay on top of the
        /// bitfields
        *this = *(reinterpret_cast<HardwareRegister*>(&registerValue));
    }

    /// Bitfields of this register.
    /// The data type of this field should be the same size as the register:
    ///     unsigned short for a 16 bit register
    ///     unsigned long for a 32 bit register.
    ///
    /// \warning Remember endianness! The order of the following bitfields is
    ///          important.
    ///          Big Endian    - Start with the most significant bits first.
    ///          Little Endian - Start with the least significant bits first.
    unsigned long field1: 8;
    unsigned long field2:16;
    unsigned long field3: 8;
}; //end class HardwareRegister

int main()
{
    unsigned long registerValue = 0xFFFFFF00;
    HardwareRegister testRegister(registerValue);

    // Prints out for a little endian machine:
    //   Field 1 = 0
    //   Field 2 = 65535
    //   Field 3 = 255
    std::cout << "Field 1 = " << testRegister.field1 << std::endl;
    std::cout << "Field 2 = " << testRegister.field2 << std::endl;
    std::cout << "Field 3 = " << testRegister.field3 << std::endl;
}
Don't do this:

*this = *(reinterpret_cast<HW_Register*>(&registerValue));

The 'this' pointer shouldn't be fiddled with in that way; in

HW_Register reg(val);
HW_Register *reg = new HW_Register(val);

'this' is in two different places in memory.
Instead, have an internal union/struct to hold the value; that way it's easy to convert back and forth (since you are not interested in portability), e.g.:

union
{
    struct
    {
        unsigned short field1:2;
        unsigned short field2:4;
        unsigned short field3:2;
        ...
    } bits;
    unsigned short value;
} reg;
edit: true enough with the name 'register'
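Usage is then just reads and writes through the union members, e.g. (a sketch, using the names above; reading the member other than the one last written is formally undefined in C++, though commonly supported):

reg.value = registerValue;           // overlay the whole register
unsigned short f2 = reg.bits.field2; // pull out an individual field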
Bitfields don't work that way. You can't assign a scalar value to a struct full of bitfields. It looks like you already know this since you used reinterpret_cast, but since reinterpret_cast isn't guaranteed to do very much, it's just rolling the dice.
You need to encode and decode the values if you want to translate between bitfield structs and scalars.
HW_Register(unsigned char value)
    : field1( value      & 3 ),
      field2( value >> 2 & 3 ),
      field3( value >> 4 & 7 )
{}
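Encoding in the other direction (fields back into a scalar) is the mirror image of those shifts and masks -- a sketch, assuming the same layout as the constructor above (toByte is a made-up name):

unsigned char toByte() const
{
    return static_cast<unsigned char>(field1 | (field2 << 2) | (field3 << 4));
}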
Edit: The reason you don't get any output is that the ASCII characters corresponding to the numbers in the fields are non-printing. Try this:
std::cout << "Field 1 = " << (int) testRegister.field1 << std::endl;
std::cout << "Field 2 = " << (int) testRegister.field2 << std::endl;
std::cout << "Field 3 = " << (int) testRegister.field3 << std::endl;
Try this:
class HW_Register
{
public:
    HW_Register(unsigned char nRegisterValue = 0)
    {
        Init(nRegisterValue);
    }
    ~HW_Register(void) {};

    void Init(unsigned char nRegisterValue)
    {
        nVal = nRegisterValue;
    }

    unsigned Field1() { return nField1; }
    unsigned Field2() { return nField2; }
    unsigned Field3() { return nField3; }

private:
    union
    {
        struct
        {
            unsigned char nField1:2;
            unsigned char nField2:4;
            unsigned char nField3:2;
        };
        unsigned char nVal;
    };
};

int main()
{
    unsigned char registerValue = 0xFF;
    HW_Register testRegister(registerValue);

    std::cout << "Field 1 = " << testRegister.Field1() << std::endl;
    std::cout << "Field 2 = " << testRegister.Field2() << std::endl;
    std::cout << "Field 3 = " << testRegister.Field3() << std::endl;
    return 0;
}
HW_Register(unsigned char registerValue) : field1(0), field2(0), field3(0)