Read/write binary object as hex? - c++

I need to serialize various structs to a file.
If possible I'd like the files to be pure ASCII. I could write some kind of serializer for each struct, but there are hundreds and many contain floats and doubles which I'd like to represent accurately.
I can't use a third-party serialization library and I don't have the time to write hundreds of serializers.
How can I ASCII-safe serialize this data?
Also streams please, I hate the look of C-style printf("%02x",data).

I found this solution online and it addresses just this problem:
https://jdale88.wordpress.com/2009/09/24/c-anything-tofrom-a-hex-string/
Reproduced below:
#include <string>
#include <sstream>
#include <iomanip>
// ------------------------------------------------------------------
/*!
Convert a block of data to a hex string
*/
void toHex(
void *const data, //!< Data to convert
const size_t dataLength, //!< Length of the data to convert
std::string &dest //!< Destination string
)
{
unsigned char *byteData = reinterpret_cast<unsigned char*>(data);
std::stringstream hexStringStream;
hexStringStream << std::hex << std::setfill('0');
for(size_t index = 0; index < dataLength; ++index)
hexStringStream << std::setw(2) << static_cast<int>(byteData[index]);
dest = hexStringStream.str();
}
// ------------------------------------------------------------------
/*!
Convert a hex string to a block of data
*/
void fromHex(
const std::string &in, //!< Input hex string
void *const data //!< Data store
)
{
size_t length = in.length();
unsigned char *byteData = reinterpret_cast<unsigned char*>(data);
std::stringstream hexStringStream; hexStringStream >> std::hex;
for(size_t strIndex = 0, dataIndex = 0; strIndex < length; ++dataIndex)
{
// Read out and convert the string two characters at a time
const char tmpStr[3] = { in[strIndex++], in[strIndex++], 0 };
// Reset and fill the string stream
hexStringStream.clear();
hexStringStream.str(tmpStr);
// Do the conversion
int tmpValue = 0;
hexStringStream >> tmpValue;
byteData[dataIndex] = static_cast<unsigned char>(tmpValue);
}
}
This can easily be adapted to read from and write to file streams. The stringstream used in fromHex is still necessary, though, because the conversion must be done two characters at a time.
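As a rough sketch of the same idea with input validation added, the helpers below hex-encode the raw bytes of an object and decode them again. The names bytes_to_hex and hex_to_bytes are hypothetical and not part of the linked post, and the approach is only meaningful for trivially copyable types read back on the same platform.

```cpp
#include <cctype>
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: hex-encode the raw bytes of an object.
std::string bytes_to_hex(const void* data, std::size_t size) {
    const unsigned char* bytes = static_cast<const unsigned char*>(data);
    std::ostringstream out;
    out << std::hex << std::setfill('0');
    for (std::size_t i = 0; i < size; ++i)
        out << std::setw(2) << static_cast<int>(bytes[i]);
    return out.str();
}

// Hypothetical helper: decode hex text into a buffer of known size.
// Returns false if the text has the wrong length or contains a non-hex character.
bool hex_to_bytes(const std::string& hex, void* data, std::size_t size) {
    if (hex.size() != size * 2) return false;
    unsigned char* bytes = static_cast<unsigned char*>(data);
    for (std::size_t i = 0; i < size; ++i) {
        const unsigned char hi = static_cast<unsigned char>(hex[i * 2]);
        const unsigned char lo = static_cast<unsigned char>(hex[i * 2 + 1]);
        if (!std::isxdigit(hi) || !std::isxdigit(lo)) return false;
        // Convert exactly two characters at a time, as in fromHex above.
        std::istringstream in(hex.substr(i * 2, 2));
        int value = 0;
        in >> std::hex >> value;
        bytes[i] = static_cast<unsigned char>(value);
    }
    return true;
}
```

Checking the length up front avoids reading past the end of an odd-length string, which the original fromHex does not guard against.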

Any way you do it, you're going to need serialization code for
each struct type. You can't just bit-copy a struct to the
external world, and expect it to work.
And if you want pure ascii, don't bother with hex. For
serializing float and double, set the output stream to
scientific, and the precision to 8 for float, and 16 for
double. (It will take a few more bytes, but it will actually
work.)
For the rest: if the structs are written cleanly, according to
some in-house programming guidelines, and only contain basic
types, you should be able to parse them directly. Otherwise,
the simplest solution is generally to design a very simple
descriptor language, describe each struct in it, and run a code
generator over it to get the serialization code.
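The advice about scientific notation and precision can be sketched as follows. The function names are hypothetical. The sketch assumes IEEE 754 doubles, for which max_digits10 is 17, so the precision passed to the stream is 16 (for float the corresponding value is 8).

```cpp
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: format a double so that parsing the text yields the same value.
std::string double_to_text(double value) {
    std::ostringstream out;
    // In scientific notation the precision counts digits after the decimal point,
    // so max_digits10 - 1 gives max_digits10 significant digits in total.
    out << std::scientific
        << std::setprecision(std::numeric_limits<double>::max_digits10 - 1)
        << value;
    return out.str();
}

// Hypothetical helper: parse text produced by double_to_text.
double text_to_double(const std::string& text) {
    return std::stod(text);
}
```

The output is plain ASCII and, unlike a hex dump of the bytes, does not depend on the byte order of the machine that wrote it.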

save struct into a binary file and read it

I have an array of structs in a class, and I want to save it to a file.
If I pass ac.mem[i].username to write(), something other than the username is stored in the file.
And if I pass ac.mem[i], nothing is saved at all.
This is part of my code:
const int len=5;
class account {
public:
struct members {
string username;
string password;
int acsess;
}mem[len];
};
class account ac;
....
ac.mem[0] = { "admin","soran",5 };
ac.mem[1] = { "hamid","hamid",4 };
fstream acc1("account", ios::binary);
for (int i = 0; i <= 1; i++) {
acc1.write((char*)&ac.mem[i].username, sizeof(ac.mem[i].username));
}
acc1.close();
....
ifstream acc2("account", ios::binary);
for (int i = 0; i <= len; ++i) {
acc1.read((char*)&ac.mem[i].username, sizeof(ac.mem[i].username));
cout << i + 1 << "." << setw(10) << ac.mem[i].username << setw(20) << ac.mem[i].password << setw(20) << ac.mem[i].acsess << endl;
}
acc2.close();
std::string objects are fairly complex types: internally they maintain pointers to memory. When you just write the internal representation to a file (by casting the object's address to char*), all you write out are these pointers plus possibly some additional management data.
The actual string contents, though, are stored at the locations these pointers point to. When reading back, you can never assume that the same data still lives at the addresses you have just restored from the file. Even if the original string object still exists, overwriting the internals of a second std::string this way leaves two objects owning the same memory, which leads to undefined behaviour through double deletion, if reading and writing the objects as raw memory is not undefined already.
What you actually want to write to the file are the contents of the string, which you get from either std::string::c_str or std::string::data. You might additionally want to include the terminating null character (assuming there are no null characters inside the string) so that you can read back multiple strings, stopping each read exactly at the null terminator. Writing to the file might then look like:
std::string s; // assign some content...
std::ofstream f; // open some path
if(f) // stream opened successfully?
{
f.write(s.c_str(), s.length() + 1);
}
Note that std::string::length returns the length without the terminating null character, so if you want or need to include it, you have to add one, as done above.
Alternatively, you can write out the string's length first and then skip the null character. The advantage is that on reading back you already know how many characters to read and can pre-allocate accordingly (std::string::reserve). For compatibility across compilers and especially machines, make sure to write out fixed-size data types from the <cstdint> header, e.g.:
uint32_t length = s.length();
f.write(reinterpret_cast<char const*>(&length), sizeof(length));
f.write(s.c_str(), s.length());
This approach also copes with null characters inside the string (though if such data exists, std::vector<unsigned char> or preferably std::vector<uint8_t> might be a better alternative, since std::string is intended for text).
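A minimal sketch of the length-prefixed variant, including the read side, might look like this. The names write_string, read_string and round_trip are hypothetical. The sketch assumes writer and reader use the same byte order and that strings are shorter than 4 GiB.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Write a 32-bit length followed by the raw characters (no terminating null).
void write_string(std::ostream& out, const std::string& s) {
    const std::uint32_t length = static_cast<std::uint32_t>(s.size());
    out.write(reinterpret_cast<const char*>(&length), sizeof(length));
    out.write(s.data(), length);
}

// Read a string written by write_string. Returns false on a short or failed read.
bool read_string(std::istream& in, std::string& s) {
    std::uint32_t length = 0;
    if (!in.read(reinterpret_cast<char*>(&length), sizeof(length))) return false;
    std::string buffer(length, '\0');
    if (length > 0 && !in.read(&buffer[0], length)) return false;
    s.swap(buffer);
    return true;
}

// Convenience for experimenting: serialize to memory and read straight back.
std::string round_trip(const std::string& s) {
    std::stringstream buffer(std::ios::in | std::ios::out | std::ios::binary);
    write_string(buffer, s);
    std::string result;
    read_string(buffer, result);
    return result;
}
```

For untrusted files it would be sensible to check the length against an upper bound before allocating.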
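If you want to use C, you could refer to the following code. Note that the name is stored as a fixed-size char array inside the struct: with a char* member, fwrite would only store the pointer value in the file, not the characters.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#pragma warning(disable : 4996)
typedef struct {
char name[32];
int phone;
}address;
int main(void)
{
int i;
address a[3];
for (i = 0; i < 3; i++)
{
memset(&a[i], 0, sizeof(address)); /* zero the padding and the unused name bytes */
strcpy(a[i].name, "jojo");
a[i].phone = 123456;
}
FILE* fp;
fp = fopen("list.txt", "ab");
if (fp == NULL)
{
return 1;
}
for (i = 0; i < 3; i++)
{
printf("%s, %d\n", a[i].name, a[i].phone);
fwrite(&a[i], sizeof(address), 1, fp);
}
fclose(fp);
return 0;
}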

C++ Unable to convert std::vector<BYTE> to a string or char array

I am currently working with the Registry using this GitHub library:
https://github.com/GiovanniDicanio/WinReg
I am trying to convert this vector<BYTE> to a char array or a string, to make a hash out of it with the help of SHA-512. But I am stuck on the conversion, even though I tried different methods. I don't get any compiler errors; the app just crashes at runtime. I am using a DLL that I load into my process.
RegKey NetworkInterface_key(HKEY_LOCAL_MACHINE, L"SYSTEM\\CurrentControlSet\\Control\\Class\\{4d36e972-e325-11ce-bfc1-08002be10318}\\0001");
const std::vector<BYTE> InstallTimeStamp = NetworkInterface_key.GetBinaryValue(L"InstallTimeStamp");
MY SOLUTION:
Changed std::vector<BYTE> -> std::vector<unsigned char>
Used this method:
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>
template <typename T>
std::string to_hex(T data)
{
std::ostringstream result;
result << std::setw(2) << std::setfill('0') << std::hex << std::uppercase << static_cast<int>(data);
return result.str();
}
std::string dump(const std::vector<unsigned char>& data)
{
if (data.empty()) return "";
const std::size_t size = data.size();
std::ostringstream result;
for (std::size_t i = 0; i < size; i++)
{
result << "0x" + to_hex(data[i]);
if (i + 1 != size) // no trailing space after the last byte
result << " ";
}
return result.str();
}
Credits: U. Bulle -> C++ Converting Vector<BYTE> to string where first vector byte is 0
You don't need that library, just do this:
HKEY key = 0;
BYTE timestamp[16] = { 0 };
LRESULT err = ::RegOpenKeyEx(HKEY_LOCAL_MACHINE, L"SYSTEM\\CurrentControlSet\\Control\\Class\\{4d36e972-e325-11ce-bfc1-08002be10318}\\0001", 0, KEY_READ, &key);
if (err == 0)
{
DWORD dwType = 0;
DWORD dwSize = 16;
::RegQueryValueEx(key, L"InstallTimeStamp", NULL, &dwType, timestamp, &dwSize);
RegCloseKey(key);
}
As for converting those 16 bytes into a "string": that doesn't make a lot of sense, given that those 16 bytes are binary data. You could do this:
std::string strTimestamp((char*)timestamp, 16);
But I suspect you just want a pointer to pass to a sha512 function that expects a char* data type. If that's the case, just do this:
const char* ts = (char*)timestamp;
Just remember the length of that array is fixed and is not a null terminated string. So your hash function should take a length parameter as well.
The RegKey::GetBinaryValue() method returns a std::vector<BYTE>. To convert that data to a char[] array, you don't really have to actually convert it at all, you can simply type-cast a pointer to the data instead:
const std::vector<BYTE> InstallTimeStamp = ...;
const char *pInstallTimeStamp = reinterpret_cast<const char*>(InstallTimeStamp.data());
But, if you want to convert the data to a std::string, then std::string has constructors that are appropriate for that purpose, e.g.:
const std::vector<BYTE> InstallTimeStamp = ...;
std::string sInstallTimeStamp(reinterpret_cast<const char*>(InstallTimeStamp.data()), InstallTimeStamp.size());
Or:
const std::vector<BYTE> InstallTimeStamp = ...;
std::string sInstallTimeStamp(InstallTimeStamp.begin(), InstallTimeStamp.end());
However, that being said, hashes operate on bytes, not on characters or strings, so you really should not need to convert the vector data to anything else at all, just hash its contents as-is. Unless you are using a hashing API that requires char/string input (if so, you should find a better hash API), in which case the above should suffice.

How to store a unsigned char array to float value?

I am trying to read sensor data sent from an Arduino to a Raspberry Pi over RS232 serial communication. I searched for this and found something related at the link below, but was unable to get the full idea.
The os (kernel) has an internal buffer of 4096 bytes. If this buffer is full and a new character arrives on the serial port, the oldest character in the buffer will be overwritten and thus will be lost. After a successful call to RS232_OpenComport(), the os will start to buffer incoming characters.
The values arrive correctly from the Arduino on the Raspberry Pi (output attached below) and are stored in an unsigned char buffer, which is defined as unsigned char buf[4096].
int main()
{
int i, n,
cport_nr=0, /* /dev/ttyS0 (COM1 on windows) */
bdrate=9600; /* 9600 baud */
unsigned char buf[4096];
char mode[]={'8','N','1',0};
while(1)
{
n = RS232_PollComport(cport_nr, buf, 4095);
if(n > 0)
{
buf[n] = 0;
for(i=0; i < n; i++)
{
if(buf[i] < 32) /* replace unreadable control-codes by dots */
{
buf[i] = '.';
}
}
printf("received %i bytes: %s\n", n, (char *)buf);
}
}
}
Now I want to store these values in a float/double variable so that I can perform further operations on them. How can I store a received value, say 0.01, in a float/double for later use?
From the output in the screenshot, it looks like you are sending the string representation of the numbers rather than the actual numbers. You just need to detect those "unreadable control-codes" that you are currently replacing with a '.', as they will probably tell you where one number ends and the next begins. Just make QSerialPort * serial; a proper class member.
Also, check for errors when opening the port: serial->open(QIODevice::ReadWrite); Then insert some qDebug() calls in serialreceived() to see whether the slot is called at all and whether canReadLine() works. You should use QByteArray to read your data: if the response contains any byte that is not valid string data, the resulting QString will be terminated prematurely. Use readLine() instead of readAll(), like this:
QByteArray data = serial->readLine();
qDebug() << data.toHex(' '); // prints the hex representation of your char array
QString str(data);
qDebug() << str;
First, it would be better to use some other ASCII character (e.g. a space) to separate the numbers, because the dot is part of a floating-point number. Then you can construct a std::string object from your raw unsigned char array, split it into multiple strings and convert each string to float.
#include <iostream>
#include <string>
#include <vector>
#include <boost/algorithm/string/classification.hpp>
#include <boost/algorithm/string/split.hpp>
int main() {
// imagine that this buff is already after read and preprocessing
unsigned char buff[1024] = "13.60 13.60 -11.12 -0.3 and let's say that the rest is garbage";
int n = 28; // let's say that you received 28 bytes
std::string strBuff(reinterpret_cast<char*>(buff), n); // construct a string from buff using just first 28 bytes
std::vector<std::string> numbers;
boost::split(numbers, strBuff, boost::is_any_of(" "), boost::token_compress_on);
for (const auto& number : numbers) { // named "number" so it does not shadow n above
try {
std::cout << std::stof(number) << std::endl;
} catch (const std::exception& e) {
std::cout << number << " is not convertible to float: " << e.what() << std::endl;
}
}
return 0;
}
I took the string splitting method from this answer but you can use anything that works for you.
I used reinterpret_cast because std::string accepts char instead of unsigned char as a CTor arg.
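If Boost is not available, the same split-and-convert step can be sketched with the standard library alone. The function name parse_numbers is hypothetical, and the sketch assumes the numbers are separated by whitespace, as suggested above.

```cpp
#include <cstddef>
#include <exception>
#include <sstream>
#include <string>
#include <vector>

// Parse whitespace-separated numbers from the first `count` bytes of a receive buffer.
// Tokens that are not complete, valid numbers are skipped.
std::vector<double> parse_numbers(const unsigned char* buffer, std::size_t count) {
    const std::string text(reinterpret_cast<const char*>(buffer), count);
    std::istringstream in(text);
    std::vector<double> values;
    std::string token;
    while (in >> token) {
        try {
            std::size_t used = 0;
            const double value = std::stod(token, &used);
            if (used == token.size()) values.push_back(value);
        } catch (const std::exception&) {
            // Not a number, or out of range: ignore this token.
        }
    }
    return values;
}
```

The returned values are ordinary doubles, so they can be used directly in later calculations.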

Read blocks of a binary file buffer into different types

I am trying to read a binary file into memory, and then use it like so:
struct myStruct {
std::string mystring; // is 40 bytes long
uint myint1; // is 4 bytes long
};
typedef unsigned char byte;
byte *filedata = ReadFile(filename); // reads file into memory, closes the file
myStruct aStruct;
aStruct.mystring = filedata.????
I need a way of accessing the binary file with an offset, and getting a certain length at that offset.
This is easy if I store the binary file data in a std::string (filedata.substr(offset, len)), but I figured that using a string to store binary data is not a good way of doing things.
Reasonably extensive (IMO) searching hasn't turned anything relevant up, any ideas? I am willing to change storage type (e.g. to std::vector) if you think it is necessary.
If you're not going to use a serialization library, then I suggest adding serialization support to each class:
#include <cstring>
#include <string>
struct My_Struct
{
std::string my_string;
unsigned int my_int;
void Load_From_Buffer(unsigned char const *& p_buffer)
{
my_string = std::string(reinterpret_cast<char const *>(p_buffer)); // reads up to the terminating nul
p_buffer += my_string.length() + 1; // +1 to account for the terminating nul character.
std::memcpy(&my_int, p_buffer, sizeof(my_int)); // memcpy avoids a misaligned read
p_buffer += sizeof(my_int);
}
};
unsigned char * const buffer = ReadFile(filename);
unsigned char const * p_buffer = buffer;
My_Struct my_variable;
my_variable.Load_From_Buffer(p_buffer);
Some other useful interface methods:
unsigned int Size_On_Stream(void) const; // Returns the size the object would occupy in the stream.
void Store_To_Buffer(unsigned char *& p_buffer); // Stores object to buffer, increments pointer.
With templates you can extend the serialization functionality:
void Load_From_Buffer(std::string& s, unsigned char const *& p_buffer)
{
s = std::string(reinterpret_cast<char const *>(p_buffer));
p_buffer += s.length() + 1;
}
template <typename T>
void Load_From_Buffer(T& object, unsigned char const *& p_buffer)
{
object.Load_From_Buffer(p_buffer);
}
Edit 1: Reason not to write structure directly
In C and C++, the size of a structure may not be equal to the sum of the size of its members.
Compilers are allowed to insert padding, or unused space, between members so that each member is aligned on a suitable address boundary.
For example, a 32-bit processor likes to fetch things on 4 byte boundaries. Having one char in a structure followed by an int would put the int at relative address 1, which is not a multiple of 4. The compiler would pad the structure so that the int lines up on relative address 4.
Structures may contain pointers or items that contain pointers.
For example, the std::string type may have a size of 40, although the string may contain 3 characters or 300. It has a pointer to the actual data.
Endianness.
With multibyte integers, some processors store the Most Significant Byte (MSB) first, a.k.a. Big Endian (the way humans read numbers), while others store the Least Significant Byte first, a.k.a. Little Endian. The Little Endian format takes less circuitry to read than the Big Endian. A file written on one kind of machine will be misread on the other unless the file format fixes the byte order.
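To make the padding and byte-order points concrete, here is a small sketch. The struct Padded and the helpers store_u32_le and load_u32_le are hypothetical names, and little-endian is an arbitrary choice for the file format.

```cpp
#include <cstddef>
#include <cstdint>

// sizeof(Padded) is typically 8, not 5, because the compiler pads after `tag`.
struct Padded {
    char tag;            // 1 byte
    std::uint32_t value; // usually aligned to a 4-byte boundary
};

// Store a 32-bit value least significant byte first, regardless of host byte order.
void store_u32_le(std::uint32_t value, unsigned char* out) {
    for (int i = 0; i < 4; ++i)
        out[i] = static_cast<unsigned char>((value >> (8 * i)) & 0xFFu);
}

// Rebuild the value from four bytes written by store_u32_le.
std::uint32_t load_u32_le(const unsigned char* in) {
    std::uint32_t value = 0;
    for (int i = 0; i < 4; ++i)
        value |= static_cast<std::uint32_t>(in[i]) << (8 * i);
    return value;
}
```

Writing each member through helpers like these, rather than copying the whole struct, keeps both the padding and the host byte order out of the file.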
Edit 2: Variant records
When outputting things like arrays and containers, you must decide whether you want to output the full container (including unused slots) or only the items in the container. Outputting only the items in the container would use a variant record technique.
There are two techniques for outputting variant records: quantity followed by items, or items followed by a sentinel. The latter is how C-style strings are written, with the sentinel being a nul character.
The other technique is to output the quantity of items, followed by the items. So if I had 6 numbers, 0, 1, 2, 3, 4, 5, the output would be:
6 // The number of items
0
1
2
3
4
5
In a Store_To_Buffer method, I would create a temporary to hold the quantity, write that out, then follow with each item from the container. Load_From_Buffer does the reverse: it reads the quantity first, then that many items.
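The quantity-followed-by-items technique could be sketched like this for a container of 32-bit integers. The function names store_items and load_items are hypothetical, and the sketch assumes the buffer is read on a machine with the same byte order as the writer.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Append the item count, then each item, to a byte buffer.
void store_items(const std::vector<std::int32_t>& items,
                 std::vector<unsigned char>& buffer) {
    const std::uint32_t count = static_cast<std::uint32_t>(items.size());
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&count);
    buffer.insert(buffer.end(), p, p + sizeof(count));
    for (std::int32_t item : items) {
        p = reinterpret_cast<const unsigned char*>(&item);
        buffer.insert(buffer.end(), p, p + sizeof(item));
    }
}

// Read the count first, then exactly that many items, advancing `pos` past the record.
// Returns false, leaving `pos` unchanged, if the buffer is too short.
bool load_items(const std::vector<unsigned char>& buffer, std::size_t& pos,
                std::vector<std::int32_t>& items) {
    std::uint32_t count = 0;
    if (pos > buffer.size() || buffer.size() - pos < sizeof(count)) return false;
    std::memcpy(&count, buffer.data() + pos, sizeof(count));
    const std::size_t cursor = pos + sizeof(count);
    if ((buffer.size() - cursor) / sizeof(std::int32_t) < count) return false;
    items.resize(count);
    if (count > 0)
        std::memcpy(items.data(), buffer.data() + cursor, count * sizeof(std::int32_t));
    pos = cursor + count * sizeof(std::int32_t);
    return true;
}
```

Because the count comes first, the reader can size the container once and validate the remaining buffer length before copying anything.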
You could overload the std::ostream output operator and std::istream input operator for your structure, something like this:
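#include <algorithm>
#include <cstdint>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
struct Record {
std::string name;
int value;
};
std::istream& operator>>(std::istream& in, Record& record) {
char name[40] = { 0 };
std::int32_t value(0);
in.read(name, 40);
in.read(reinterpret_cast<char*>(&value), 4);
if (in) { // only update the record after a complete read
record.name.assign(name, std::find(name, name + 40, '\0')); // drop the nul padding
record.value = value;
}
return in;
}
std::ostream& operator<<(std::ostream& out, const Record& record) {
std::string name(record.name);
name.resize(40, '\0'); // pad or truncate to the fixed 40-byte field
const std::int32_t value = record.value; // fixed-size type, so exactly 4 bytes are written
out.write(name.c_str(), 40);
out.write(reinterpret_cast<const char*>(&value), 4);
return out;
}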
int main(int argc, char **argv) {
const char* filename("records");
Record r[] = {{"zero", 0 }, {"one", 1 }, {"two", 2}};
int n(sizeof(r)/sizeof(r[0]));
std::ofstream out(filename, std::ios::binary);
for (int i = 0; i < n; ++i) {
out << r[i];
}
out.close();
std::ifstream in(filename, std::ios::binary);
std::vector<Record> rIn;
Record record;
while (in >> record) {
rIn.push_back(record);
}
for (std::vector<Record>::iterator i = rIn.begin(); i != rIn.end(); ++i){
std::cout << "name: " << i->name << ", value: " << i->value
<< std::endl;
}
return 0;
}

Outputting bit data to binary file C++

I am writing a compression program, and need to write bit data to a binary file using C++. If anyone could advise on the write statement, or a website with advice, I would be very grateful.
Apologies if this is a simple or confusing question, I am struggling to find answers on the web.
Collect the bits into whole bytes, such as an unsigned char or std::bitset (where the bitset size is a multiple of CHAR_BIT), then write whole bytes at a time. Computers "deal with bits", but the available abstraction – especially for IO – is that you, as a programmer, deal with individual bytes. Bitwise manipulation can be used to toggle specific bits, but you're always handling byte-sized objects.
At the end of the output, if you don't have a whole byte, you'll need to decide how that should be stored. Both iostreams and stdio can write unformatted data using ostream::write and fwrite, respectively.
Instead of a single char or bitset<8> (8 being the most common value for CHAR_BIT), you might consider using a larger block size, such as an array of 4-32, or more, chars or the equivalent sized bitset.
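A minimal sketch of collecting bits into whole bytes follows. BitWriter is a hypothetical helper; packing the most significant bit first and padding the last byte with zero bits are arbitrary choices that a matching reader would have to share.

```cpp
#include <vector>

// Hypothetical helper that packs bits, most significant bit first, into bytes.
class BitWriter {
public:
    // Append one bit; a byte is emitted whenever eight bits have been collected.
    void put(bool bit) {
        current_ = static_cast<unsigned char>((current_ << 1) | (bit ? 1u : 0u));
        if (++filled_ == 8) {
            bytes_.push_back(current_);
            current_ = 0;
            filled_ = 0;
        }
    }

    // Pad a trailing partial byte with zero bits; returns the number of padding bits added.
    unsigned flush() {
        if (filled_ == 0) return 0;
        const unsigned padding = 8 - filled_;
        bytes_.push_back(static_cast<unsigned char>(current_ << padding));
        current_ = 0;
        filled_ = 0;
        return padding;
    }

    // The whole bytes produced so far.
    const std::vector<unsigned char>& bytes() const { return bytes_; }

private:
    std::vector<unsigned char> bytes_;
    unsigned char current_ = 0;
    unsigned filled_ = 0;
};
```

After flush(), the collected bytes can be written in one call, for example with ostream::write(reinterpret_cast<const char*>(w.bytes().data()), w.bytes().size()) on a stream opened in binary mode. Storing the padding count or the total bit count in the file lets the reader ignore the padding bits.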
For writing binary, the trick I have found most helpful is to store all the binary as a single array in memory and then move it all over to the hard drive. Doing a bit at a time, or a byte at a time, or an unsigned long long at a time is not as fast as having all the data stored in an array and using one instance of "fwrite()" to store it to the hard drive.
size_t fwrite ( const void * ptr, size_t size, size_t count, FILE * stream );
Ref: http://www.cplusplus.com/reference/clibrary/cstdio/fwrite/
In English:
fwrite( [array* of stored data], [size in bytes of array OBJECT. For unsigned chars -> 1, for unsigned long longs -> 8], [number of instances in array], [FILE*])
Always check your returns for validation of success!
Additionally, an argument can be made that making the object type as large as possible is the fastest way to go ([unsigned long long] > [char]). While I am not versed in the internals of "fwrite()", I suspect that converting from the natural object type used in your code to [unsigned long long] and then writing would take longer than letting "fwrite()" make do with what you have.
Back when I was learning Huffman Coding, it took me a few hours to realize that there was a difference between [char] and [unsigned char]. Note that with this method you should always use unsigned variables to store the raw binary.
With the class below you can write and read bit by bit:
class bitChar{
public:
unsigned char* c;
int shift_count;
string BITS;
bitChar()
{
shift_count = 0;
c = (unsigned char*)calloc(1, sizeof(char));
}
string readByBits(ifstream& inf)
{
string s ="";
char buffer[1];
while (inf.read (buffer, 1))
{
s += getBits(*buffer);
}
return s;
}
void setBITS(string X)
{
BITS = X;
}
int insertBits(ofstream& outf)
{
int total = 0;
while(BITS.length())
{
if(BITS[0] == '1')
*c |= 1;
*c <<= 1;
++shift_count;
++total;
BITS.erase(0, 1);
if(shift_count == 7 )
{
if(BITS.size()>0)
{
if(BITS[0] == '1')
*c |= 1;
++total;
BITS.erase(0, 1);
}
writeBits(outf);
shift_count = 0;
free(c);
c = (unsigned char*)calloc(1, sizeof(char));
}
}
if(shift_count > 0)
{
*c <<= (7 - shift_count);
writeBits(outf);
free(c);
c = (unsigned char*)calloc(1, sizeof(char));
}
outf.close();
return total;
}
string getBits(unsigned char X)
{
stringstream itoa;
for(unsigned s = 7; s > 0 ; s--)
{
itoa << ((X >> s) & 1);
}
itoa << (X&1) ;
return itoa.str();
}
void writeBits(ofstream& outf)
{
outf << *c;
}
~bitChar()
{
if(c)
free(c);
}
};
For example:
#include <iostream>
#include <sstream>
#include <fstream>
#include <string>
#include <stdlib.h>
using namespace std;
int main()
{
string enCoded = "101000001010101010";
//write to file
cout << enCoded << endl ; //print 101000001010101010
bitChar bchar;
bchar.setBITS(enCoded);
ofstream outf("Sample.dat", ios::binary); // binary mode: no newline translation
bchar.insertBits(outf); // also closes outf
//read from file, opened only after the data has been written
ifstream inf("Sample.dat", ios::binary);
string decoded =bchar.readByBits(inf);
cout << decoded << endl ; //print 101000001010101010000000
return 0;
}