I'm trying to construct a hex string from a byte array (libcrypto++) in order to connect to SQS in C++, but I have issues with '0' characters.
The result is almost correct, except that some '0's end up at the end of the string.
std::string shaDigest(const std::string &key = "") {
byte out[64] = {0};
CryptoPP::SHA256().CalculateDigest(out, reinterpret_cast<const byte*>(key.c_str()), key.size());
std::stringstream ss;
std::string rep;
for (int i = 0; i < 64; i++) {
ss << std::hex << static_cast<int>(out[i]);
}
ss >> rep;
rep.erase(rep.begin()+64, rep.end());
return rep;
}
output:
correct : c46268185ea2227958f810a84dce4ade54abc4f42a03153ef720150a40e2e07b
mine : c46268185ea2227958f810a84dce4ade54abc4f42a3153ef72015a40e2e07b00
^ ^
Edit: I'm trying to do the same thing that hashlib.sha256('').hexdigest() does in Python.
If that indeed works, here's the solution with my suggestions incorporated.
std::string shaDigest(const std::string &key = "") {
std::array<byte, CryptoPP::SHA256::DIGESTSIZE> out {}; // SHA-256 digest is 32 bytes
CryptoPP::SHA256().CalculateDigest(out.data(), reinterpret_cast<const byte*>(key.c_str()), key.size());
std::stringstream ss;
ss << std::hex << std::setfill('0');
for (byte b : out) {
ss << std::setw(2) << static_cast<int>(b);
}
// No `.substr(0, 64)` is needed here: the 32-byte digest
// always produces exactly 64 hex characters.
return ss.str();
}
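For a quick sanity check against the hashlib.sha256('').hexdigest() mentioned in the edit, a small test like this should pass (assuming the shaDigest above is in scope; the expected value is the well-known SHA-256 of the empty string):
#include <cassert>
#include <iostream>
#include <string>

int main() {
    // SHA-256 of the empty string, the same value hashlib.sha256(b'').hexdigest() prints.
    const std::string expected =
        "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855";
    assert(shaDigest("") == expected);
    std::cout << shaDigest("") << "\n";
}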
You correctly convert the bytes to hexadecimal, and it works as long as the byte value is greater than 15. Below that, the first hex digit is a 0 and is not printed by default. The two missing 0s are for 0x03 -> 3 and 0x0a -> a.
You should use:
for (int i = 0; i < 64; i++) {
ss << std::hex << std::setw(2) << std::setfill('0') << static_cast<int>(out[i]);
}
You need to set the field width so that numbers with fewer than two hexadecimal digits get zero-padded. Note that the width has to be re-set before every number inserted into the stream, whereas the fill character persists.
Example:
#include <iostream>
#include <iomanip>
int main() {
std::cout << std::hex << std::setfill('0');
for (int i=0; i<0x11; i++)
std::cout << std::setw(2) << i << "\n";
}
Output:
$ g++ test.cc && ./a.out
00
01
02
03
04
05
06
07
08
09
0a
0b
0c
0d
0e
0f
10
For reference:
http://en.cppreference.com/w/cpp/io/manip/setw
http://en.cppreference.com/w/cpp/io/manip/setfill
Related
In C++ I can initialize a vector using
std::vector<uint8_t> data = {0x01, 0x02, 0x03};
For convenience (I have Python byte strings that naturally output in a dump of hex), I would like to initialize from a non-delimited hex value of the form:
std::vector<uint8_t> data = 0x229597354972973aabbe7;
Is there a variant of this that is valid C++?
Combining comments from Evg, JHbonarius and 1201ProgramAlarm:
The answer is that there is no direct way to convert a long hex literal into a vector; however, user-defined literals provide a clever notational improvement.
First, using the RHS 0x229597354972973aabbe7 anywhere in the code will fail, because an unsuffixed integer literal must fit into one of the built-in integer types, and this one is too large for any of them; in MSVC it results in E0023 "integer constant is too large". Limiting yourself to smaller hex sequences or exploring larger data types may be possible with suffixed notation, but that would ruin any desire for simplicity.
Manual conversion is therefore necessary, but user-defined literals may provide a slightly more elegant notation. For example, we can enable conversion of a hex sequence to a vector with
std::vector<uint8_t> val1 = 0x229597354972973aabbe7_hexvec;
std::vector<uint8_t> val2 = "229597354972973aabbe7"_hexvec;
using the following code:
#include <vector>
#include <iostream>
#include <string>
#include <algorithm>
#include <cstring> // strlen, used by the raw-literal overload below
// Quick utility function to view results:
std::ostream & operator << (std::ostream & os, std::vector<uint8_t> & v)
{
for (const auto & t : v)
os << std::hex << (int)t << " ";
return os;
}
std::vector<uint8_t> convertHexToVec(const char * str, size_t len)
{
// conversion takes strings of form "FFAA54" or "0x11234" or "0X000" and converts to a vector of bytes.
// Get the first two characters and skip them if the string starts with 0x or 0X for hex specification:
std::string start(str, 2);
int offset = (start == "0x" || start == "0X") ? 2 : 0;
// Round up the number of two-character groups (so both ff_hexvec and fff_hexvec work) and subtract the offset so 0xfff_hexvec is counted properly:
std::vector<uint8_t> result((len + 1 - offset) / 2);
size_t ind = result.size() - 1;
// Loop from right to left in pairs of two, but watch out for a lone character on the left without a pair, because 0xfff_hexvec is valid:
for (const char* it = str + len - 1; it >= str + offset; it -= 2) {
int val = (str + offset) > (it - 1); // check if taking 2 values will run off the start and use this value to reduce by 1 if we will
std::string s(std::max(it - 1, str + offset), 2 - val);
result[ind--] = (uint8_t)stol(s, nullptr, 16);
}
return result;
}
std::vector<uint8_t> operator"" _hexvec(const char*str, std::size_t len)
{
// Handles the forms "0xFFAABB"_hexvec and "12441AA"_hexvec
return convertHexToVec(str, len);
}
std::vector<uint8_t> operator"" _hexvec(const char*str)
{
// Handles the form 0xFFaaBB_hexvec and 0Xf_hexvec
size_t len = strlen(str);
return convertHexToVec(str, len);
}
int main()
{
std::vector<uint8_t> v;
std::vector<uint8_t> val1 = 0x229597354972973aabbe7_hexvec;
std::vector<uint8_t> val2 = "229597354972973aabbe7"_hexvec;
std::cout << val1 << "\n";
std::cout << val2 << "\n";
return 0;
}
The coder must decide whether this is superior to implementing and using a more traditional convertHexToVec("0x41243124FF") function.
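For comparison, the traditional call spelled out, reusing convertHexToVec and the operator<< defined above (strlen comes from <cstring>, already included):
const char* hex = "0x41243124FF";
std::vector<uint8_t> v = convertHexToVec(hex, strlen(hex)); // no user-defined literal involved
std::cout << v << "\n"; // prints: 41 24 31 24 ff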
Is there a variant of this that is valid C++?
I think not.
The following code is valid C++, and uses a more "traditional hex conversion" process.
The code confirms and removes the leading '0x', and also confirms that all remaining chars are hex characters.
modifyFor_SDFE() ('space delimited format extraction') inserts spaces around the two-char byte descriptors. It also adds a space char at the front and back of the modified string. This new string is used to create and initialize a std::stringstream (ss1). With the spaces inserted, the normal stream "formatted extraction" works cleanly.
The code then extracts each hex value, one by one, pushes each into the vector, and ends when the last byte has been pushed (stream.eof()). Note that the vector automatically grows as needed (no overflow will occur).
Note that the '0x' prefix is not needed, because the stream mode is set to hex.
Note also that the overflow concern (expressed above as "0x22...be7 is likely to overflow") has simply been side-stepped by reading only a byte at a time. This might be convenient in future efforts that use much bigger hex strings.
#include <iostream>
using std::cout, std::cerr, std::endl, std::hex,
std::dec, std::cin, std::flush; // c++17
#include <iomanip>
using std::setw, std::setfill;
#include <string>
using std::string;
#include <sstream>
using std::stringstream;
#include <vector>
using std::vector;
typedef vector<uint8_t> UI8Vec_t;
#include <cstdint>
#include <cassert>
#include <cctype> // toupper
class F889_t // Functor ctor and dtor use compiler provided defaults
{
bool verbose;
public:
int operator()(int argc, char* argv[]) // functor entry
{
verbose = ( (argc > 1) ? ('V' == toupper(argv[1][0])) : false );
return exec(argc, argv);
}
// 2 lines
private:
int exec(int , char** )
{
UI8Vec_t resultVec; // output
// example1 input
// string data1 = "0x229597354972973aabbe7"; // 23 chars, hex string
// to_ui8_vec(resultVec, data1);
// cout << (verbose ? "" : "\n") << " vector result "
// << show(ui8Vec); // show results
// example2 input 46 chars (no size limit)
string data = "0x330508465083084bBCcf87eBBaa379279543795922fF";
to_ui8_vec (resultVec, data);
cout << (verbose ? " vector elements " : "\n ")
<< show(resultVec) << endl; // show results
if(verbose) { cout << "\n F889_t::exec() (verbose) ("
<< __cplusplus << ")" << endl; }
return 0;
} // int exec(int, char**)
// 7 lines
void to_ui8_vec(UI8Vec_t& retVal, // output (pass by reference)
string sData) // input (pass by value)
{
if(verbose) { cout << "\n input data '" << sData
<< "' (" << sData.size() << " chars)" << endl;}
{ // misc format checks:
size_t szOrig = sData.size();
{
// confirm leading hex indicator exists
assert(sData.substr(0,2) == string("0x"));
sData.erase(0,2); // discard leading "0x"
}
size_t sz = sData.size();
assert(sz == (szOrig - 2)); // paranoia
// to test that this will detect any typos in data:
// temporarily append or insert an invalid char, i.e. sData += 'q';
assert(sData.find_first_not_of("0123456789abcdefABCDEF") == std::string::npos);
}
modifyFor_SDFE (sData); // SDFE - 'Space Delimited Formatted Extraction'
stringstream ss1(sData); // create / initialize stream with SDFE
if(verbose) { cout << " SDFE data '" << ss1.str() // echo init
<< "' (" << sData.size() << " chars)" << endl; }
extract_values_from_SDFE_push_back_into_vector(retVal, ss1);
} // void to_ui8_vec (vector<uint8_t>&, string)
// 13 lines
// modify s (of any size) for 'Space Delimited Formatted Extraction'
void modifyFor_SDFE (string& s)
{
size_t indx = s.size();
while (indx > 2)
{
indx -= 2;
s.insert (indx, 1, ' '); // indx, count, delimiter
}
s.insert(0, 1, ' '); // delimiter at front of s
s += ' '; // delimiter at tail of s
} // void modifyFor_SDFE (string&)
// 6 lines
void extract_values_from_SDFE_push_back_into_vector(UI8Vec_t& retVal,
stringstream& ss1)
{
do {
unsigned int n = 0;
ss1 >> hex >> n; // use SDFE, hex mode - extract one field at a time
if(!ss1.good()) // check ss1 state
{
if(ss1.eof()) break; // quietly exit, this is a normal stream exit
// else make some noise before exit loop
cerr << "\n err: data input line invalid [" << ss1.str() << ']' << endl; break;
}
retVal.push_back(static_cast<uint8_t>(n & 0xff)); // append to vector
} while(true);
} // void extract_values_from_SDFE_push_back_into_vector(UI8Vec_t&, stringstream&)
// 6 lines
string show(const UI8Vec_t& ui8Vec)
{
stringstream ss ("\n ");
for (unsigned int i = 0; i < ui8Vec.size(); ++i) {
ss << setfill('0') << setw(2) << hex
<< static_cast<int>(ui8Vec[i]) << ' '; }
if(verbose) { ss << " (" << dec << ui8Vec.size() << " elements)"; }
return ss.str();
}
// 5 lines
}; // class F889_t
int main(int argc, char* argv[]) { return F889_t()(argc, argv); }
Typical outputs when invoked with 'verbose' second parameter
$ ./dumy889 verbose
input data '0x330508465083084bBCcf87eBBaa379279543795922fF' (46 chars)
SDFE data ' 33 05 08 46 50 83 08 4b BC cf 87 eB Ba a3 79 27 95 43 79 59 22 fF ' (67 chars)
vector elements 33 05 08 46 50 83 08 4b bc cf 87 eb ba a3 79 27 95 43 79 59 22 ff (22 elements)
When invoked with no parameters
$ ./dumy889
33 05 08 46 50 83 08 4b bc cf 87 eb ba a3 79 27 95 43 79 59 22 ff
The line counts do not include empty lines, nor lines that are only a comment or only a brace. You may count the lines as you wish.
I have written some code that loads some files containing a list of words (one word per line). Each word is added to a multiset. Later I try to search the multiset with multiset.find("aWord"), where I look for the word and substrings of the word in the multiset.
This code works fine if I compile it with Qt on a Windows system.
But it doesn't work if I compile it with Qt on my Mac!
My goal is to make it work from Qt on my Mac.
I am working on a MacBook Air (13" early 2018) with a
macOS Mojave version 10.14.4 installation
Build version 18E226
local 18.5.0 Darwin Kernel Version 18.5.0: Mon Mar 11 20:40:32 PDT
2019; root:xnu-4903.251.3~3/RELEASE_X86_64 x86_64
Using a Qt installation:
QTKit:
Version: 7.7.3
Obtained from: Apple
Last Modified: 13/04/2019 12.11
Kind: Intel
64-Bit (Intel): Yes
Get Info String: QTKit 7.7.3, Copyright 2003-2012, Apple Inc.
Location: /System/Library/Frameworks/QTKit.framework
Private: No
And xcode installation:
Xcode 10.2
Build version 10E125
I have tried to print out:
every string that I am searching for
and every string I should find in the multiset, in hex format
and concluded that some of the letters do not match in their hex values, even though I think my whole system runs UTF-8 and the file is also UTF-8 encoded.
Dictionary.h
#ifndef DICTIONARY_H
#define DICTIONARY_H
#include <iostream>
#include <vector>
#include <set>
class Dictionary
{
public:
Dictionary();
void SearchForAllPossibleWordsIn(std::string searchString);
private:
std::multiset<std::string, std::less<std::string>> mDictionary;
void Initialize(std::string folder);
void InitializeLanguage(std::string folder, std::string languageFileName);
};
#endif // DICTIONARY_H
Dictionary.cpp
#include "Dictionary.h"
#include <vector>
#include <set>
#include <iostream>
#include <fstream>
#include <exception>
Dictionary::Dictionary()
{
Initialize("../Lektion10Projekt15-1/");
}
void Dictionary::Initialize(std::string folder)
{
InitializeLanguage(folder,"da-utf8.wl");
}
void Dictionary::InitializeLanguage(std::string folder, std::string languageFileName)
{
std::ifstream ifs;
ifs.open(folder+languageFileName,std::ios_base::in);
if (ifs.fail()) {
std::cerr <<"Error! Class: Dictionary. Function: InitializeLanguage(...). return: ifs.fail to load file '" + languageFileName + "'" << std::endl;
}else {
std::string word;
while (!ifs.eof()) {
std::getline(ifs,word);
mDictionary.insert(word);
}
}
ifs.close();
}
void Dictionary::SearchForAllPossibleWordsIn(std::string searchString)
{
std::vector<std::string> result;
for (unsigned int a = 0 ; a <= searchString.length(); ++a) {
for (unsigned int b = 1; b <= searchString.length()-a; ++b) {
std::string substring = searchString.substr(a,b);
if (mDictionary.find(substring) != mDictionary.end())
{
result.push_back(substring);
}
}
}
if (!result.empty()) {
for (unsigned int i = 0; i < result.size() ;++i) {
std::cout << result[i] << std::endl;
}
}
}
main.cpp
#include <iostream>
#include "Dictionary.h"
int main()
{
Dictionary myDictionary;
myDictionary.SearchForAllPossibleWordsIn("byggearbejderen");
return 0;
}
I have tried to change the following line in main.cpp
myDictionary.SearchForAllPossibleWordsIn("byggearbejderen");
to (note: the first word in the word list is byggearbejderen)
std::ifstream ifs;
ifs.open("../Lektion10Projekt15-1/da-utf8.wl",std::ios::in);
if (ifs.fail()) {
std::cerr <<"Error!" << std::endl;
}else {
std::getline(ifs,searchword);
}
ifs.close();
myDictionary.SearchForAllPossibleWordsIn(searchword);
And then in main.cpp I added some printouts with the expected string and substrings as hex values.
std::cout << " cout as hex test:" << std::endl;
myDictionary.SearchForAllPossibleWordsIn(searchword);
std::cout << "Suposet search resul for ''bygearbejderen''" << std::endl;
for (char const elt: "byggearbejderen")
std::cout << std::hex << std::setw(2) << std::setfill('0') << static_cast<int>(elt) << " ";
std::cout << "byggearbejderen" << std::endl;
for (char const elt: "arbejderen")
std::cout << std::hex << std::setw(2) << std::setfill('0') << static_cast<int>(elt) << " ";
std::cout << "arbejderen" << std::endl;
for (char const elt: "ren")
std::cout << std::hex << std::setw(2) << std::setfill('0') << static_cast<int>(elt) << " ";
std::cout << "ren" << std::endl;
for (char const elt: "en")
std::cout << std::hex << std::setw(2) << std::setfill('0') << static_cast<int>(elt) << " ";
std::cout << "en" << std::endl;
for (char const elt: "n")
std::cout << std::hex << std::setw(2) << std::setfill('0') << static_cast<int>(elt) << " ";
std::cout << "n" << std::endl;
And I also added the same printing of the result in Dictionary.cpp:
std::cout << "result of seartchword as hex" << std::endl;
if (!result.empty()) {
for (unsigned int i = 0; i < result.size() ;++i)
{
for (char const elt: result[i] )
{
std::cout << std::hex << std::setw(2) << std::setfill('0') << static_cast<int>(elt) << " ";
}
std::cout << result[i] << std::endl;
}
}
which gave the following output:
result of seartchword as hex
ffffffef ffffffbb ffffffbf 62 79 67 67 65 61 72 62 65 6a 64 65 72 65 6e 0d byggearbejderen
61 72 62 65 6a 64 65 72 65 6e 0d arbejderen
72 65 6e 0d ren
65 6e 0d en
6e 0d n
Suposet search resul for ''bygearbejderen''
62 79 67 67 65 61 72 62 65 6a 64 65 72 65 6e 00 byggearbejderen
61 72 62 65 6a 64 65 72 65 6e 00 arbejderen
72 65 6e 00 ren
65 6e 00 en
6e 00 n
where I noticed that some values were different.
I don't know why this is the case on macOS but not on Windows. I don't know if there are any encoding settings in my environment that I need to change or set correctly.
I would like my main.cpp to look like this:
#include <iostream>
#include "Dictionary.h"
int main()
{
Dictionary myDictionary;
myDictionary.SearchForAllPossibleWordsIn("byggearbejderen");
return 0;
}
resulting in the following output:
byggearbejderen
arbejderen
ren
en
n
Line endings for text files are different on Windows than they are on a Mac. Windows uses both CR/LF characters (ASCII codes 13 and 10, respectively). Old Macs used the CR character alone, Linux systems use just the LF. If you create a text file on Windows, then copy it to your Mac, the line endings might not be handled correctly.
If you look at the last character in your output, you'll see it is a 0d, which would be the CR character. I don't know how you generated that output, but it is possible that the getline on the Mac is treating that as a normal character, and including it in the string that has been read in.
The simplest solution is to either process that text file beforehand to get the line endings correct, or strip the CR off the end of the words after they are read in.
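For example, inside Dictionary::InitializeLanguage the reading loop could strip a stray CR like this (a sketch based on the code posted above):
std::string word;
while (std::getline(ifs, word)) {
    // getline() consumes the LF, but a Windows-style CR may still be at the end.
    if (!word.empty() && word.back() == '\r') {
        word.pop_back();
    }
    mDictionary.insert(word);
}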
The server needs to send a std::vector<float> to a Qt application over a TCP socket. I am using Qt 5.7.
On the server side, using boost::asio:
std::vector<float> message_ = {1.2, 8.5};
asio::async_write(socket_, asio::buffer<float>(message_),
[this, self](std::error_code ec, std::size_t)
This works and I manage to get it back on my client using boost::asio's read_some(). As both Qt and asio have their own event manager, I want to avoid using asio in my Qt app.
So on the client side I have (which does not work):
client.h:
#define FLOATSIZE 4
QTcpSocket *m_socket;
QDataStream m_in;
QString *m_string;
QByteArray m_buff;
client.cpp (constructor):
m_in.setDevice(m_socket);
m_in.setFloatingPointPrecision(QDataStream::SinglePrecision);
// m_in.setByteOrder(QDataStream::LittleEndian);
client.cpp (read function, which is connected via QObject::connect(m_socket, &QIODevice::readyRead, this, &mywidget::ask2read); ):
uint availbytes = m_socket->bytesAvailable(); // which is 8, so that seems good
while (availbytes >= FLOATSIZE)
{
nbytes = m_in.readRawData(m_buff.data(), FLOATSIZE);
bool conv_ok = false;
const float f = m_buff.toFloat(&conv_ok);
availbytes = m_socket->bytesAvailable();
m_buff.clear();
}
The m_buff.toFloat() call returns 0.0, which indicates failure according to the Qt documentation. I have tried changing the float precision and switching between little and big endian, but I cannot manage to get my std::vector<float> back. Any hints?
Edit: everything runs on the same PC/compiler.
Edit: see my answer for a solution and sehe's for more detail on what is going on
I managed to resolve the issue by editing the Qt (client) side to read the socket like this:
uint availbytes = m_socket->bytesAvailable();
while (availbytes >= 4)
{
char buffer[FLOATSIZE];
nbytes = m_in.readRawData(buffer, FLOATSIZE);
float f = bytes2float(buffer);
availbytes = m_socket->bytesAvailable();
}
I use those two conversion functions, bytes2float and bytes2int:
float bytes2float(char* buffer)
{
union {
float f;
uchar b[4];
} u;
u.b[3] = buffer[3];
u.b[2] = buffer[2];
u.b[1] = buffer[1];
u.b[0] = buffer[0];
return u.f;
}
and:
int bytes2int(char* buffer)
{
int a = int((unsigned char)(buffer[3]) << 24 |
(unsigned char)(buffer[2]) << 16 |
(unsigned char)(buffer[1]) << 8 |
(unsigned char)(buffer[0]));
return a;
}
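As a side note, reading a union member other than the one last written is technically undefined behaviour in C++ (it happens to work on common compilers); a memcpy-based sketch expresses the same byte reinterpretation without that caveat (the function name is only illustrative):
#include <cstring>

float bytes2float_memcpy(const char* buffer)
{
    float f;
    std::memcpy(&f, buffer, sizeof(f)); // copy the 4 raw bytes into the float
    return f;
}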
I also found this function to display bytes, which is useful to see what is going on behind the scenes (from https://stackoverflow.com/a/16063757/7272199):
template <typename T>
void print_bytes(const T& input, std::ostream& os = std::cout)
{
const unsigned char* p = reinterpret_cast<const unsigned char*>(&input);
os << std::hex << std::showbase;
os << "[";
for (unsigned int i=0; i<sizeof(T); ++i)
os << static_cast<int>(*(p++)) << " ";
os << "]" << std::endl;;
}
Re. your answer: Which side is this on? Also, are your platforms not the same (OS/architecture?). I had assumed from the question that both processes run on the same PC and compiler etc.
For one thing, you can see that ASIO does not do anything related to endianness.
#include <boost/asio.hpp>
#include <iostream>
#include <iomanip>
namespace asio = boost::asio;
#include <iostream>
void print_bytes(unsigned char const* b, unsigned char const* e)
{
std::cout << std::hex << std::setfill('0') << "[ ";
while (b!=e)
std::cout << std::setw(2) << static_cast<int>(*b++) << " ";
std::cout << "]\n";
}
template <typename T> void print_bytes(const T& input) {
using namespace std;
print_bytes(reinterpret_cast<unsigned char const*>(std::addressof(*begin(input))),
reinterpret_cast<unsigned char const*>(std::addressof(*end(input))));
}
int main() {
float const fs[] { 1.2, 8.5 };
std::cout << "fs: "; print_bytes(fs);
{
std::vector<float> gs(2);
asio::buffer_copy(asio::buffer(gs), asio::buffer(fs));
for (auto g : gs) std::cout << g << " "; std::cout << "\n";
std::cout << "gs: "; print_bytes(gs);
}
{
std::vector<char> binary(2*sizeof(float));
asio::buffer_copy(asio::buffer(binary), asio::buffer(fs));
std::cout << "binary: "; print_bytes(binary);
std::vector<float> gs(2);
asio::buffer_copy(asio::buffer(gs), asio::buffer(binary));
for (auto g : gs) std::cout << g << " "; std::cout << "\n";
std::cout << "gs: "; print_bytes(gs);
}
}
Prints
fs: [ 9a 99 99 3f 00 00 08 41 ]
1.2 8.5
gs: [ 9a 99 99 3f 00 00 08 41 ]
binary: [ 9a 99 99 3f 00 00 08 41 ]
1.2 8.5
gs: [ 9a 99 99 3f 00 00 08 41 ]
Theory
I suspect the Qt side ruins things. Since the naming of the function readRawData certainly implies a lack of endianness awareness, I'd guess your system's endianness wreaks havoc (https://stackoverflow.com/a/2945192/85371, also the comment).
Suggestion
In that case, consider using Boost Endian.
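For instance, a minimal sketch of what the Boost.Endian route could look like (the helper names are mine; the float is first copied bit-for-bit into a uint32_t, since the conversion functions work on integers):
#include <boost/endian/conversion.hpp>
#include <cstdint>
#include <cstring>

// Sender: put the float's bits on the wire in a fixed (big-endian) byte order.
uint32_t to_wire(float f) {
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return boost::endian::native_to_big(bits);
}

// Receiver: reverse the process.
float from_wire(uint32_t wire) {
    uint32_t bits = boost::endian::big_to_native(wire);
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}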
I think it's a bad idea to use a high-level send method on the server side (you are trying to send a C++ vector) and a low-level read on the client side.
I'm quite sure there is an endianness problem somewhere.
Anyway, try this on the client side:
char buffer[FLOATSIZE];
bytes = m_in.readRawData(buffer, FLOATSIZE);
if (bytes != FLOATSIZE)
return ERROR;
const float f = (float)(ntohl(*((int32_t *)buffer)));
If boost::asio uses the network byte order for the floats (as it should), this will work.
Trying to understand why I get the following output from my program:
$ ./chartouintest
UInts: 153 97 67 49 139 0 3 129
Hexes: 99 61 43 31 8b 00 03 81
uint64 val: 8103008b31436199
$
I am trying to output just the actual UInt64 numerical value, but can't seem to do it (the output is not right).
Here is the code:
#include <iostream>
#include <iomanip>
#include <stdlib.h>
union bytes {
unsigned char c[8];
uint64_t l;
} memobj;
int main() {
//fill with random bytes
for(unsigned int i=0; i < sizeof(memobj.c); ++i) { memobj.c[i] = (unsigned char)rand();}
//see values of all elements as unsigned int8's and hex vals
std::cout << "UInts: ";
for (int x=0; x < sizeof(memobj.c); ++x) { std::cout << (unsigned int)memobj.c[x] << " "; }
std::cout << std::endl;
std::cout << "Hexes: ";
for (int x=0; x < sizeof(memobj.c); ++x) { std::cout << std::setw(2) << std::setfill('0') << std::hex << (unsigned int)memobj.c[x] << " "; }
std::cout << std::endl;
std::cout << "uint64 val: " << memobj.l << std::endl;
}
What am I doing wrong?
Writing one member of a union and reading another is undefined behavior (with exceptions, but in this case it's UB).
You shouldn't expect anything: the compiler can do whatever it wants with your code, for instance giving a nice "expected" result in debug mode and garbage or a crash in release mode. Of course, another compiler might play another trick. You'll never know for sure, so why bother?
Why not do it the right way? memcpy, perhaps?
EDIT:
To really answer the question, a note about std::cout: std::hex switches the stream to hexadecimal representation, which is why the final "uint64 val:" display is printed in hex (and not in decimal, as the OP expects). Other than that, nothing is wrong with the output, despite the UB threat.
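A minimal sketch of the memcpy approach (same intent as the original program, no union involved; std::dec is added because of the std::hex point above):
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <iostream>

int main() {
    unsigned char c[8];
    // Fill with random bytes, as in the original program.
    for (unsigned int i = 0; i < sizeof c; ++i) { c[i] = (unsigned char)rand(); }

    uint64_t l;
    std::memcpy(&l, c, sizeof l); // well-defined copy of the 8 bytes

    // std::dec ensures the value is printed in decimal even if the
    // stream was previously switched to hex.
    std::cout << "uint64 val: " << std::dec << l << std::endl;
}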
I've got a file containing a large string of hexadecimal. Here's the first few lines:
0000038f
0000111d
0000111d
03030303
//Goes on for a long time
I have a large struct that is intended to hold that data:
typedef struct
{
unsigned int field1: 5;
unsigned int field2: 11;
unsigned int field3: 16;
//Goes on for a long time
}calibration;
What I want to do is read the above string and store it in the struct. I can assume the input is valid (it's verified before I get it).
I've already got a loop that reads the file and puts the whole item in a string:
std::string line = "";
std::string hexText = "";
while(std::getline(readFile, line))
{
hexText += line;
}
//Convert string into calibration
//Convert string into long int
long int hexInt = strtol(hexText.c_str(), NULL, 16);
//Here I get stuck: How to get from long int to calibration...?
How to get from long int to calibration...?
Cameron's answer is good, and probably what you want.
I offer here another (maybe not so different) approach.
Note1: Your file input needs re-work. I suggest:
a) use getline() to fetch one line at a time into a string
b) convert that one entry to a uint32_t (I would use stringstream instead of atol); once you learn how to detect and recover from invalid input, you could then work on combining a) and b) into one step
c) then install the uint32_t in your structure, for which my offering below might offer insight.
Note2: I have worked many years with bit fields, and have developed a distaste for them.
I have never found them more convenient than the alternatives.
The alternative I prefer is bit masks and field shifting (a small sketch of that alternative follows these notes).
So far as we can tell from your problem statement, it appears your problem does not need bit-fields (which Cameron's answer illustrates).
Note3: Not all compilers will pack these bit fields for you.
The last compiler I used required what is called a "pragma".
G++ 4.8 on Ubuntu seemed to pack the bytes just fine (i.e. no pragma needed);
the sizeof(calibration) for your original code is 4, i.e. packed.
Another issue is that packing can unexpectedly change when you change options, upgrade the compiler, or switch compilers.
My team's work-around was to always have an assert against the struct size and a few byte offsets in the CTOR.
Note4: I did not illustrate the use of 'union' to align a uint32_t array over your calibration struct.
This may be preferred over the reinterpret cast approach. Check your requirements, team lead, professor.
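As a reference for Note2, here is a minimal sketch of the mask-and-shift alternative, assuming the same 5/11/16-bit layout packed into one 32-bit word (the struct and accessor names are only illustrative):
#include <cstdint>

// One 32-bit word holds field1 (5 bits), field2 (11 bits), field3 (16 bits).
struct CalWord {
    uint32_t raw = 0;

    uint32_t field1() const { return  raw         & 0x1Fu;   }  // bits  0..4
    uint32_t field2() const { return (raw >> 5)   & 0x7FFu;  }  // bits  5..15
    uint32_t field3() const { return (raw >> 16)  & 0xFFFFu; }  // bits 16..31

    void field1(uint32_t v) { raw = (raw & ~0x1Fu)            | (v & 0x1Fu);            }
    void field2(uint32_t v) { raw = (raw & ~(0x7FFu << 5))    | ((v & 0x7FFu) << 5);    }
    void field3(uint32_t v) { raw = (raw & ~(0xFFFFu << 16))  | ((v & 0xFFFFu) << 16);  }
};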
Anyway, in the spirit of your original effort, consider the following additions to your struct calibration:
typedef struct
{
uint32_t field1 : 5;
uint32_t field2 : 11;
uint32_t field3 : 16;
//Goes on for a long time
// I made up these next 2 fields for illustration
uint32_t field4 : 8;
uint32_t field5 : 24;
// ... add more fields here
// something typically done by ctor or used by ctor
void clear() { field1 = 0; field2 = 0; field3 = 0; field4 = 0; field5 = 0; }
void show123(const char* lbl=0) {
if(0 == lbl) lbl = " ";
std::cout << std::setw(16) << lbl;
std::cout << " " << std::setw(5) << std::hex << field3 << std::dec
<< " " << std::setw(5) << std::hex << field2 << std::dec
<< " " << std::setw(5) << std::hex << field1 << std::dec
<< " 0x" << std::hex << std::setfill('0') << std::setw(8)
<< *(reinterpret_cast<uint32_t*>(this))
<< " => " << std::dec << std::setfill(' ')
<< *(reinterpret_cast<uint32_t*>(this))
<< std::endl;
} // show
// I did not create show456() ...
// 1st uint32_t: set new val, return previous
uint32_t set123(uint32_t nxtVal) {
uint32_t* myVal = reinterpret_cast<uint32_t*>(this);
uint32_t prevVal = myVal[0];
myVal[0] = nxtVal;
return (prevVal);
}
// return current value of the combined field1, field2 field3
uint32_t get123(void) {
uint32_t* myVal = reinterpret_cast<uint32_t*>(this);
return (myVal[0]);
}
// 2nd uint32_t: set new val, return previous
uint32_t set45(uint32_t nxtVal) {
uint32_t* myVal = reinterpret_cast<uint32_t*>(this);
uint32_t prevVal = myVal[1];
myVal[1] = nxtVal;
return (prevVal);
}
// return current value of the combined field4, field5
uint32_t get45(void) {
uint32_t* myVal = reinterpret_cast<uint32_t*>(this);
return (myVal[1]);
}
// guess that next 4 fields fill 32 bits
uint32_t get6789(void) {
uint32_t* myVal = reinterpret_cast<uint32_t*>(this);
return (myVal[2]);
}
// ... tedious expansion
} calibration;
Here is some test code to illustrate the use:
uint32_t t125()
{
const char* lbl =
"\n 16 bits 11 bits 5 bits hex => dec";
calibration cal;
cal.clear();
std::cout << lbl << std::endl;
cal.show123();
cal.field1 = 1;
cal.show123("field1 = 1");
cal.clear();
cal.field1 = 31;
cal.show123("field1 = 31");
cal.clear();
cal.field2 = 1;
cal.show123("field2 = 1");
cal.clear();
cal.field2 = (2047 & 0x07ff);
cal.show123("field2 = 2047");
cal.clear();
cal.field3 = 1;
cal.show123("field3 = 1");
cal.clear();
cal.field3 = (65535 & 0x0ffff);
cal.show123("field3 = 65535");
cal.set123 (0xABCD6E17);
cal.show123 ("set123(0x...)");
cal.set123 (0xffffffff);
cal.show123 ("set123(0x...)");
cal.set123 (0x0);
cal.show123 ("set123(0x...)");
std::cout << "\n";
cal.clear();
std::cout << "get123(): " << cal.get123() << std::endl;
std::cout << " get45(): " << cal.get45() << std::endl;
// values from your file:
cal.set123 (0x0000038f);
cal.set45 (0x0000111d);
std::cout << "get123(): " << "0x" << std::hex << std::setfill('0')
<< std::setw(8) << cal.get123() << std::endl;
std::cout << " get45(): " << "0x" << std::hex << std::setfill('0')
<< std::setw(8) << cal.get45() << std::endl;
// cal.set6789 (0x03030303);
// std::cout << "get6789(): " << cal.get6789() << std::endl;
// ...
return(0);
}
And the test code output:
16 bits 11 bits 5 bits hex => dec
0 0 0 0x00000000 => 0
field1 = 1 0 0 1 0x00000001 => 1
field1 = 31 0 0 1f 0x0000001f => 31
field2 = 1 0 1 0 0x00000020 => 32
field2 = 2047 0 7ff 0 0x0000ffe0 => 65,504
field3 = 1 1 0 0 0x00010000 => 65,536
field3 = 65535 ffff 0 0 0xffff0000 => 4,294,901,760
set123(0x...) abcd 370 17 0xabcd6e17 => 2,882,366,999
set123(0x...) ffff 7ff 1f 0xffffffff => 4,294,967,295
set123(0x...) 0 0 0 0x00000000 => 0
get123(): 0
get45(): 0
get123(): 0x0000038f
get45(): 0x0000111d
The goal of this code is to help you see how the bit fields map into the lsbyte through msbyte of the data.
If you care at all about efficiency, don't read the whole thing into a string and then convert it. Simply read one word at a time, and convert that. Your loop should look something like:
calibration c;
uint32_t* dest = reinterpret_cast<uint32_t*>(&c);
while (true) {
char hexText[8];
// TODO: Attempt to read 8 bytes from file and then skip whitespace
// TODO: Break out of the loop on EOF
std::uint32_t hexValue = 0; // TODO: Convert hex to dword
// Assumes the structure padding & packing matches the dump version's
// Assumes the structure size is exactly a multiple of 4 bytes (32 bits, w/ padding)
static_assert(sizeof(calibration) % 4 == 0);
assert(static_cast<std::size_t>(reinterpret_cast<char*>(dest) - reinterpret_cast<char*>(&c)) < sizeof(calibration) && "Too much data");
*dest++ = hexValue;
}
assert(static_cast<std::size_t>(reinterpret_cast<char*>(dest) - reinterpret_cast<char*>(&c)) == sizeof(calibration) && "Too little data");
Converting 8 chars of hex to an actual 4-byte int is a good exercise and is well-covered elsewhere, so I've left it out (along with the file reading, which is similarly well-covered).
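(For completeness, one possible way to do that 8-character-to-uint32_t conversion, as a sketch using std::strtoul; the helper name is mine:)
#include <cstdint>
#include <cstdlib>
#include <string>

// Converts exactly 8 hex characters (e.g. "0000038f") into a 32-bit value.
std::uint32_t hexToDword(const std::string& hex8)
{
    return static_cast<std::uint32_t>(std::strtoul(hex8.c_str(), nullptr, 16));
}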
Note the two assumptions in the loop: the first one cannot be checked either at run-time or compile time, and must be either agreed upon in advance or extra work has to be done to properly serialize the structure (handling structure packing and padding, etc.). The last one can at least be checked at compile time with the static_assert.
Also, care has to be taken to ensure that the endianness of the hex bytes in the file matches the endianness of the architecture executing the program when converting the hex string. This will depend on whether the hex was written in a specific endianness in the first place (in which case you can convert it from the know endianness to the current architecture's endianness quite easily), or whether it's architecture-dependent (in which case you have no choice but to assume the endianness is the same as your current architecture).