Serialize and deserialize vector in binary - c++

I am having problems trying to serialise a vector (std::vector) into a binary format and then correctly deserialise it and be able to read the data. This is my first time using a binary format (I was using ASCII but that has become too hard to use now) so I am starting simple with just a vector of ints.
Whenever I read the data back the vector always has the right length but the data is either 0, undefined or random.
class Example
{
public:
std::vector<int> val;
};
WRITE:
Example example = Example();
example.val.push_back(10);
size_t size = sizeof BinaryExample + (sizeof(int) * example.val.size());
std::fstream file ("Levels/example.sld", std::ios::out | std::ios::binary);
if (file.is_open())
{
file.seekg(0);
file.write((char*)&example, size);
file.close();
}
READ:
BinaryExample example = BinaryExample();
std::ifstream::pos_type size;
std::ifstream file ("Levels/example.sld", std::ios::in | std::ios::binary | std::ios::ate);
if (file.is_open())
{
size = file.tellg();
file.seekg(0, std::ios::beg);
file.read((char*)&example, size);
file.close();
}
Does anyone know what I am doing wrong or what to do or be able to point me in the direction that I need to do?

You can't unserialise a non-POD class by overwriting an existing instance as you seem to be trying to do - you need to give the class a constructor that reads the data from the stream and constructs a new instance of the class with it.
In outline, given something like this:
class A {
public:
A();
A( istream & is );
void serialise( ostream & os );
private:
vector <int> v;
};
then serialise() would write the length of the vector followed by the vector contents. The constructor would read the vector length, resize the vector using the length, then read the vector contents:
void A :: serialise( ostream & os ) {
size_t vsize = v.size();
os.write((char*)&vsize, sizeof(vsize));
os.write((char*)v.data(), vsize * sizeof(int) ); // data() is valid even when v is empty
}
A :: A( istream & is ) {
size_t vsize;
is.read((char*)&vsize, sizeof(vsize));
v.resize( vsize );
is.read((char*)v.data(), vsize * sizeof(int)); // data() is valid even when v is empty
}

You're using the address of the vector object. What you need is the address of the data held by the vector. Writing, for example, would be something like:
size_t size = example.val.size();
file.write((char *)&size, sizeof(size));
file.write((char *)&example.val[0], sizeof(example.val[0]) * size);

I would write in network byte order to ensure the file can be written and read on any platform (htonl and ntohl come from the POSIX header <arpa/inet.h>). So:
#include <fstream>
#include <iostream>
#include <iomanip>
#include <vector>
#include <cstdint>
#include <arpa/inet.h>
int main(void) {
std::vector<int32_t> v = std::vector<int32_t>();
v.push_back(111);
v.push_back(222);
v.push_back(333);
{
std::ofstream ofs;
ofs.open("vecdmp.bin", std::ios::out | std::ios::binary);
uint32_t sz = htonl(v.size());
ofs.write((const char*)&sz, sizeof(uint32_t));
for (uint32_t i = 0, end_i = v.size(); i < end_i; ++i) {
int32_t val = htonl(v[i]);
ofs.write((const char*)&val, sizeof(int32_t));
}
ofs.close();
}
{
std::ifstream ifs;
ifs.open("vecdmp.bin", std::ios::in | std::ios::binary);
uint32_t sz = 0;
ifs.read((char*)&sz, sizeof(uint32_t));
sz = ntohl(sz);
for (uint32_t i = 0; i < sz; ++i) {
int32_t val = 0;
ifs.read((char*)&val, sizeof(int32_t));
val = ntohl(val);
std::cout << i << '=' << val << '\n';
}
}
return 0;
}

Read the other answers to see how you should read/write a binary structure.
I add this one because I believe your motivations for using a binary format are mistaken. A binary format won't be easier than an ASCII one; usually it's the other way around.
You have many options to save/read data for long-term use (ORM, databases, structured formats, configuration files, etc.). A flat binary file is usually the worst and the hardest to maintain, except for very simple structures.

Related

Writing and reading a file

I'm trying to use ifstream/ofstream to read/write, but for some reason the data gets corrupted along the way. Here are the read/write methods and the test:
void FileWrite(const char* FilePath, std::vector<char> &data) {
std::ofstream os (FilePath);
int len = data.size();
os.write(reinterpret_cast<char*>(&len), 4);
os.write(&(data[0]), len);
os.close();
}
std::vector<char> FileRead(const char* FilePath) {
std::ifstream is(FilePath);
int len;
is.read(reinterpret_cast<char*>(&len), 4);
std::vector<char> ret(len);
is.read(&(ret[0]), len);
is.close();
return ret;
}
void test() {
std::vector<char> sample(1024 * 1024);
for (int i = 0; i < 1024 * 1024; i++) {
sample[i] = rand() % 256;
}
FileWrite("C:\\test\\sample", sample);
auto sample2 = FileRead("C:\\test\\sample");
int err = 0;
for (int i = 0; i < sample.size(); i++) {
if (sample[i] != sample2[i])
err++;
}
std::cout << err << "\n";
int a;
std::cin >> a;
}
It writes the length correctly, reads it correctly and starts reading the data correctly, but at some point (depending on the input, usually at around the 1000th byte) it goes wrong and everything that follows is wrong. Why is that?
For starters, you should open the file stream in binary mode for both reading and writing:
std::ofstream os (FilePath,std::ios::binary);
(edit: assuming char really means "signed char")
Note that a signed char can only hold values up to CHAR_MAX, which is typically 127.
If the random number is bigger, the result wraps around to a negative value. More importantly, a stream opened in text mode may translate some bytes (on Windows, '\n' is written as "\r\n", and byte 0x1A can be treated as end-of-file when reading), which corrupts arbitrary binary data. Opening the stream in binary mode avoids this.
Also, you shouldn't close the stream yourself here, the destructor does it for you.
Two more simple points:
1) &(data[0]) should be just &data[0], the () are redundant
2) try to keep the same convention. You use upper camel case for the FilePath variable, but lower camel case for all the other variables.
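Putting the main point together, a binary-mode version of the two functions might look like the sketch below. It is an illustration, not the asker's code: the names fileWrite and fileRead, the use of unsigned char, the fixed-width std::uint32_t length prefix and the bool return value are all assumptions made for the example.

```cpp
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Writes a 4-byte length followed by the raw bytes.
bool fileWrite(const std::string& path, const std::vector<unsigned char>& data) {
    if (data.size() > 0xFFFFFFFFu)
        return false;  // length would not fit in the 32-bit prefix
    std::ofstream os(path, std::ios::binary);  // binary: no byte translation
    if (!os)
        return false;
    const std::uint32_t len = static_cast<std::uint32_t>(data.size());
    os.write(reinterpret_cast<const char*>(&len), sizeof(len));
    os.write(reinterpret_cast<const char*>(data.data()), len);
    os.flush();
    return static_cast<bool>(os);
}

// Reads a file produced by fileWrite. Returns an empty vector on failure.
std::vector<unsigned char> fileRead(const std::string& path) {
    std::ifstream is(path, std::ios::binary);
    std::uint32_t len = 0;
    if (!is.read(reinterpret_cast<char*>(&len), sizeof(len)))
        return {};
    std::vector<unsigned char> data(len);
    if (!is.read(reinterpret_cast<char*>(data.data()), len))
        return {};
    return data;
}
```

Using unsigned char sidesteps the signed-char wrap-around mentioned above, and checking the stream after each read makes a truncated file visible instead of silently returning garbage.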

Reading Binary Files into an array of ints c++

I have a method which writes a binary file from an int array. (it could be wrong too)
void bcdEncoder::writeBinaryFile(unsigned int packedBcdArray[], int size)
{
fstream binaryIo;
binaryIo.open("PridePrejudice.bin", ios::out| ios::binary | ios::trunc);
binaryIo.seekp(0);
binaryIo.write((char*)packedBcdArray, size * sizeof(packedBcdArray[0]));
binaryIo.seekp(0);
binaryIo.close();
}
I need to now read that binary file back. And preferably have it read it back into another array of unsigned ints without any information loss.
I have something like the following code, but I have no idea on how reading binary files really works, and no idea how to read it into an array of ints.
void bcdEncoder::readBinaryFile(string fileName)
{
// myArray = my dynamic int array
fstream binaryIo;
binaryIo.open(fileName, ios::in | ios::binary | ios::trunc);
binaryIo.seekp(0);
binaryIo.seekg(0);
binaryIo.read((int*)myArray, size * sizeof(myFile));
binaryIo.close();
}
Question:
How to complete the implementation of the function that reads binary files?
If you're using C++, use the standard library. Note that the >> and << operators used below read and write the values as whitespace-separated text, not as raw bytes.
vector<unsigned int> bcdEncoder::readBinaryFile(string fileName)
{
vector<unsigned int> ret; //std::list may be preferable for large files
ifstream in{ fileName };
unsigned int current;
// Test the extraction itself so a failed read at end of file is not appended
while (in >> current) {
ret.emplace_back(current);
}
return ret;
}
Writing is just as simple (for this we'll accept an int[], but a standard library container would be preferable):
void bcdEncoder::writeBinaryFile(string fileName, unsigned int arr[], size_t len)
{
ofstream f { fileName };
for (size_t i = 0; i < len; i++)
f << arr[i] << ' '; // separate the values so they can be read back
}
Here's the same thing but with an std::vector:
void bcdEncoder::writeBinaryFile(string fileName, vector<unsigned int> arr)
{
ofstream f { fileName };
for (auto&& i : arr)
f << i << ' '; // separate the values so they can be read back
}
To simplify read operation consider storing size (i.e the number of elements in the array) before the data:
void bcdEncoder::writeBinaryFile(unsigned int packedBcdArray[], int size)
{
fstream binaryIo;
binaryIo.open("PridePrejudice.bin", ios::out| ios::binary | ios::trunc);
binaryIo.seekp(0);
binaryIo.write((char*)&size, sizeof(size)); // write() expects a char pointer
binaryIo.write((char*)packedBcdArray, size * sizeof(packedBcdArray[0]));
binaryIo.close();
}
The read would look something like:
void bcdEncoder::readBinaryFile(string fileName)
{
std::vector<unsigned int> myData;
int size;
fstream binaryIo;
binaryIo.open(fileName, ios::in | ios::binary); // no ios::trunc: it would make an input-only open fail
binaryIo.read((char*)&size, sizeof(size)); // read the number of elements
myData.resize(size); // allocate memory for an array
binaryIo.read((char*)myData.data(), size * sizeof(unsigned int));
binaryIo.close();
// todo: do something with myData
}
Modern alternative using std::array
Here's a code snippet that uses more modern C++ to read a binary file into an std::array.
const int arraySize = 9216; // Hard-coded
std::array<uint8_t, arraySize> fileArray;
std::ifstream binaryFile("<my-binary-file>", std::ios::in | std::ios::binary);
if (binaryFile.is_open()) {
binaryFile.read(reinterpret_cast<char*>(fileArray.data()), arraySize);
}
Because you're using a std::array, you need to know the size at compile time (or rather, you need to know that the file has at least that many bytes available). If you don't know the size of the file ahead of time, use a std::vector and look at this example: https://stackoverflow.com/a/36661779/1576548
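When the size is not known ahead of time, one common approach is to open the file positioned at its end, ask for the position, and size a std::vector from that. The sketch below is a guess at what such a reader could look like, not the code from the linked answer: readAllUInts is a hypothetical name, and it assumes the file holds nothing but raw unsigned int values written on the same platform.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Reads every complete unsigned int in the file; returns empty on failure.
std::vector<unsigned int> readAllUInts(const std::string& path) {
    // std::ios::ate positions the stream at the end, so tellg() is the size.
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in)
        return {};
    const std::streamoff bytes = static_cast<std::streamoff>(in.tellg());
    if (bytes <= 0)
        return {};
    std::vector<unsigned int> values(
        static_cast<std::size_t>(bytes) / sizeof(unsigned int));
    in.seekg(0, std::ios::beg);
    in.read(reinterpret_cast<char*>(values.data()),
            static_cast<std::streamsize>(values.size() * sizeof(unsigned int)));
    if (!in)
        return {};
    return values;
}
```

Any trailing bytes that do not make up a whole unsigned int are ignored in this sketch.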
Thanks for the tips guys, looks like I worked it out!! A major part of my problem was that half the arguments and syntax I added to the methods were not required, and actually messed things up. Here are my working methods.
void bcdEncoder::writeBinaryFile(unsigned int packedBcdArray[], int size, string fileName)
{
ofstream binaryIo;
binaryIo.open(fileName.substr(0, fileName.length() - 4) + ".bin", ios::binary);
if (binaryIo.is_open()) {
binaryIo.write((char*)packedBcdArray, size * sizeof(packedBcdArray[0]));
binaryIo.close();
// Send binary file to reader
readBinaryFile(fileName.substr(0, fileName.length() - 4) + ".bin", size);
}
else
cout << "Error writing bin file..." << endl;
}
And the read:
void bcdEncoder::readBinaryFile(string fileName, int size)
{
AllocateArray packedData(size);
unsigned int *packedArray = packedData.createIntArray();
ifstream binaryIo;
binaryIo.open(fileName, ios::binary);
if (binaryIo.is_open()) {
binaryIo.read((char*)packedArray, size * sizeof(packedArray[0]));
binaryIo.close();
decodeBCD(packedArray, size * 5, fileName);
}
else
cout << "Error reading bin file..." << endl;
}
With the AllocateArray being my class that creates dynamic arrays without vectors somewhat safely with destructors included.

c++ writing and reading objects to binary files

I'm trying to read an array object (Array is a class I've made) using read and write functions that read from and write to binary files. So far the write function works, but it won't read from the file properly for some reason. This is the write function:
void writeToBinFile(const char* path) const
{
ofstream ofs(path, ios_base::out | ios_base::app | ios_base::binary);
if (ofs.is_open())
{
ostringstream oss;
for (unsigned int i = 0; i < m_size; i++)
{
oss << ' ';
oss << m_data[i];
}
ofs.write(oss.str().c_str(), oss.str().size());
}
}
This is the read function :
void readFromBinFile(const char* path)
{
ifstream ifs(path, ios_base::in | ios_base::binary || ios_base::ate);
if (ifs.is_open())
{
stringstream ss;
int charCount = 0, spaceCount = 0;
ifs.unget();
while (spaceCount != m_size)
{
charCount++;
if (ifs.peek() == ' ')
{
spaceCount++;
}
ifs.unget();
}
ifs.get();
char* ch = new char[sizeof(char) * charCount];
ifs.read(ch, sizeof(char) * charCount);
ss << ch;
delete[] ch;
for (unsigned int i = 0; i < m_size; i++)
{
ss >> m_data[i];
m_elementCount++;
}
}
}
These are the class fields:
T* m_data;
unsigned int m_size;
unsigned int m_elementCount;
I'm using the following code to write and then read (1 execution for reading another for writing):
Array<int> arr3(5);
//arr3[0] = 38;
//arr3[1] = 22;
//arr3[2] = 55;
//arr3[3] = 7;
//arr3[4] = 94;
//arr3.writeToBinFile("binfile.bin");
arr3.readFromBinFile("binfile.bin");
for (unsigned int i = 0; i < arr3.elementCount(); i++)
{
cout << "arr3[" << i << "] = " << arr3[i] << endl;
}
The problem is now in the readFromBinFile function: it gets stuck in an infinite loop, and peek() returns -1 for some reason that I can't figure out.
Also note that I write a space before each element, so that I can tell the elements of the array apart and so that the leading space separates the array data from any data previously stored in the file.
The major problem, in my mind, is that you write fixed-size binary data in variable-size textual form. It could be so much simpler if you just stick to pure binary form.
Instead of writing to a string stream and then writing that output to the actual file, just write the binary data directly to the file:
ofs.write(reinterpret_cast<char*>(m_data), sizeof(m_data[0]) * m_size);
Then do something similar when reading the data.
For this to work, you of course need to save the number of entries in the array/vector first before writing the actual data.
So the actual write function could be as simple as
void writeToBinFile(const char* path) const
{
ofstream ofs(path, ios_base::out | ios_base::binary);
if (ofs)
{
ofs.write(reinterpret_cast<const char*>(&m_size), sizeof(m_size));
ofs.write(reinterpret_cast<const char*>(&m_data[0]), sizeof(m_data[0]) * m_size);
}
}
And the read function
void readFromBinFile(const char* path)
{
ifstream ifs(path, ios_base::in | ios_base::binary);
if (ifs)
{
// Read the size
ifs.read(reinterpret_cast<char*>(&m_size), sizeof(m_size));
// Read all the data
ifs.read(reinterpret_cast<char*>(&m_data[0]), sizeof(m_data[0]) * m_size);
}
}
Depending on how you define m_data you might need to allocate memory for it before reading the actual data.
Oh, and if you want to append data at the end of the array (though the code you show rewrites the whole array anyway), you write the size at the beginning, seek to the end, and then write the new data.
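The append idea could be sketched roughly as follows. This is a hedged illustration only: appendInt is a hypothetical function, the file is assumed to already exist with the layout "unsigned int count, then count ints", and the count is updated after the new element is written rather than before.

```cpp
#include <fstream>
#include <string>

// Appends one int to a file laid out as [unsigned int count][count ints].
bool appendInt(const std::string& path, int value) {
    // in | out opens an existing file for update without truncating it.
    std::fstream f(path, std::ios::in | std::ios::out | std::ios::binary);
    if (!f)
        return false;

    unsigned int count = 0;
    if (!f.read(reinterpret_cast<char*>(&count), sizeof(count)))
        return false;

    // Seek to the end and write the new element there.
    f.seekp(0, std::ios::end);
    f.write(reinterpret_cast<const char*>(&value), sizeof(value));

    // Seek back to the start and overwrite the stored count.
    ++count;
    f.seekp(0, std::ios::beg);
    f.write(reinterpret_cast<const char*>(&count), sizeof(count));
    f.flush();
    return static_cast<bool>(f);
}
```

Note that the file must not be opened with ios_base::app for this to work, because in append mode every write goes to the end of the file and the count at the beginning could not be updated.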

Writing/reading a struct from a file to a std::vector<> [duplicate]

This question already has answers here:
STATUS_ACCESS_VIOLATION when reading file int struct
(4 answers)
dynamically allocating memory to struct when reading from file in C++
(3 answers)
Closed 9 years ago.
I have successfully followed the answer posted here to write a structure (of type image_info_t) to a file.
I repeat the process in a loop for N number of image_info_t's and all the data is serialized and added to the file correctly.
I now need to read the file, but I need to be able to read an arbitrary number, M, of image_info_t structs from the file (all in order). The answer referenced above explicitly hardcodes the number of structures to read back from the file (i.e., student_t master[3];). However, I need this number to be dynamic.
I have read here that "C++ standard requires that arrays use either an integer literal or a integer constant when declaring its size. Use <vector> instead"
My question is: how can I do this? How can I read the set of image_info_t structs back from the file into a std::vector?
Here is my current (non-working) code that I am using to read the image_info_t data back from the file.
std::ifstream input_file(path, std::ios::binary);
const int kpts_size = kpts.size();
feature_t master[kpts_size]; //DOES NOT WORK. If I change to `feature_t master[10];` it works.
input_file.read((char*)&master, sizeof(master));
input_file.close();
Note: this is not an access violation question, and is not related to the "Possible dup" answer. When you tag it as such, people stop reading my question which certainly doesn't help anyone.
#include <cstdint>
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
//Some kind of structure containing many sample data types..
typedef struct image_info
{
char image_type;
std::uint32_t md5_hash;
std::string image_name;
std::vector<std::uint8_t> bytes;
} image_info_t;
//Used for writing the above structure to a stream..
std::ostream& operator << (std::ostream& os, const image_info &entry)
{
std::size_t image_name_size = entry.image_name.size();
std::size_t image_bytes_size = entry.bytes.size();
os.write(&entry.image_type, sizeof(entry.image_type));
os.write(reinterpret_cast<const char*>(&entry.md5_hash), sizeof(entry.md5_hash));
os.write(reinterpret_cast<const char*>(&image_name_size), sizeof(image_name_size));
os.write(entry.image_name.c_str(), entry.image_name.size());
os.write(reinterpret_cast<const char*>(&image_bytes_size), sizeof(image_bytes_size));
os.write(reinterpret_cast<const char*>(&entry.bytes[0]), entry.bytes.size());
return os;
}
//Used for reading the above structure from a stream..
std::istream& operator >> (std::istream& is, image_info &entry)
{
std::size_t image_name_size = 0;
std::size_t image_bytes_size = 0;
is.read(&entry.image_type, sizeof(entry.image_type));
is.read(reinterpret_cast<char*>(&entry.md5_hash), sizeof(entry.md5_hash));
is.read(reinterpret_cast<char*>(&image_name_size), sizeof(image_name_size));
entry.image_name.resize(image_name_size);
is.read(&entry.image_name[0], image_name_size);
is.read(reinterpret_cast<char*>(&image_bytes_size), sizeof(image_bytes_size));
entry.bytes.resize(image_bytes_size);
is.read(reinterpret_cast<char*>(&entry.bytes[0]), image_bytes_size);
return is;
}
//Used for writing an array/vector of the above structure to a stream..
std::ostream& operator << (std::ostream& os, const std::vector<image_info> &entry)
{
std::size_t entry_size = entry.size();
os.write(reinterpret_cast<const char*>(&entry_size), sizeof(entry_size));
for (std::size_t i = 0; i < entry_size; ++i)
os << entry[i];
return os;
}
//Used for reading an array/vector of the above structure from a stream..
std::istream& operator >> (std::istream& is, std::vector<image_info> &entry)
{
std::size_t entry_size = 0;
is.read(reinterpret_cast<char*>(&entry_size), sizeof(entry_size));
entry.resize(entry_size);
for (std::size_t i = 0; i < entry_size; ++i)
is >> entry[i];
return is;
}
int main()
{
std::vector<image_info_t> outdata;
std::vector<image_info_t> indata;
image_info_t one;
image_info_t two;
one.image_name = "one";
one.image_type = 'a';
one.md5_hash = 1;
one.bytes.push_back(0);
two.image_name = "two";
two.image_type = 'b';
two.md5_hash = 2;
two.bytes.push_back(1);
outdata.push_back(one);
outdata.push_back(two);
std::fstream out("C:/Users/School/Desktop/Image_Info_T.bin", std::ios::out | std::ios::binary);
if (out.is_open())
{
out << outdata;
out.close();
}
std::fstream in("C:/Users/School/Desktop/Image_Info_T.bin", std::ios::in | std::ios::binary);
if (in.is_open())
{
in >> indata;
}
std::cout<<indata[0].image_name<<" "<<indata[1].image_name;
}
If you want to avoid using a vector, you can initialize your array by doing the following:
feature_t* master = new feature_t[kpts.size()];
//code
delete[] master;
Alternatively, with a vector you can simply make a vector of feature_t, i.e.:
std::vector<feature_t> master;
Generally I find the easiest way to add structs or classes to a vector is to make an instance of them, then fill all the values and add it to the vector, so I might do:
feature_t temp;
while (getline(file, str))
{
temp.a = ...;
temp.b = ...;
master.push_back(temp);
}
In C, new would be replaced with malloc (or one of its derivative functions), so you would use:
feature_t* master = malloc(sizeof(feature_t) * kpts.size());
//code
free(master);

Split a binary file into chunks c++

I've been bashing my head against trying to divide a file into chunks so that I can send it over sockets. I can read and write a file easily without splitting it into chunks. The code below runs and kind of works: it writes a text file, but the output contains a garbage character. If this were just for text files that would be no problem, but JPEGs don't work with the garbage.
I've been at it for a few days and have done my research, so it's time to get some help. I want to stick strictly to binary reads, as this needs to handle any file.
I've seen a lot of slick examples out there (none of them worked for me with JPEGs), mostly something along the lines of while(file)... I subscribe to the "if you know the size, use a for-loop, not a while-loop" camp.
Thank you for the help!!
vector<char*> readFile(const char* fn){
vector<char*> v;
ifstream::pos_type size;
char * memblock;
ifstream file;
file.open(fn,ios::in|ios::binary|ios::ate);
if (file.is_open()) {
size = fileS(fn);
file.seekg (0, ios::beg);
int bs = size/3; // arbitrary. Actual program will use the socket send size
int ws = 0;
int i = 0;
for(i = 0; i < size; i+=bs){
if(i+bs > size)
ws = size%bs;
else
ws = bs;
memblock = new char [ws];
file.read (memblock, ws);
v.push_back(memblock);
}
}
else{
exit(-4);
}
return v;
}
int main(int argc, char **argv) {
vector<char*> v = readFile("foo.txt");
ofstream myFile ("bar.txt", ios::out | ios::binary);
for(vector<char*>::iterator it = v.begin(); it!=v.end(); ++it ){
myFile.write(*it,strlen(*it));
}
}
The problem is that you are using strlen to calculate the size of the array to be written. A 0 byte can be part of binary data, and strlen stops at the first one, so you would not be writing the right size. Instead, use a pair of char*,int where the int specifies the size that is to be written and you will be golden.
Like:
#include <iostream>
#include <vector>
#include <fstream>
#include <stdlib.h>
#include <string.h>
using namespace std;
ifstream::pos_type fileS(const char* fn)
{
ifstream file;
file.open(fn,ios::in|ios::binary);
file.seekg(0, ios::end);
ifstream::pos_type ret= file.tellg();
file.seekg(0,ios::beg);
ret=ret-file.tellg();
file.close();
return ret;
}
vector< pair<char*,int> > readFile(const char* fn){
vector< pair<char*,int> > v;
ifstream::pos_type size;
char * memblock;
ifstream file;
file.open(fn,ios::in|ios::binary|ios::ate);
if (file.is_open()) {
size = fileS(fn);
file.seekg (0, ios::beg);
int bs = size/3; // arbitrary. Actual program will use the socket send size
int ws = 0;
int i = 0;
cout<<"size:"<<size<<" bs:"<<bs<<endl;
for(i = 0; i < size; i+=bs){
if(i+bs > size)
ws = size%bs;
else
ws = bs;
cout<<"read:"<<ws<<endl;
memblock = new char [ws];
file.read (memblock, ws);
v.push_back(make_pair(memblock,ws));
}
}
else{
exit(-4);
}
return v;
}
int main(int argc, char **argv) {
vector< pair<char*,int> > v = readFile("a.png");
ofstream myFile ("out.png", ios::out | ios::binary);
for(vector< pair<char*,int> >::iterator it = v.begin(); it!=v.end(); ++it ){
pair<char*,int> p=*it;
myFile.write(p.first,p.second);
}
}
myFile.write(*it,strlen(*it));
This is using string length on binary data. I suspect that is your culprit. If not, it's certainly a code smell.
You should never do this:
myFile.write(*it,strlen(*it));
on binary data. strlen counts bytes until it hits a byte which contains a 0 (NUL as we like to say, but it's an honest 0). If you read enough binary data, you will hit a NUL, and you'll get a short count. But actually the situation could be a lot worse, because nowhere do you store the NUL for strlen to find. You're just counting on there being one beyond the end of the datablock you acquire to read the file into.
So don't do that. Remember the number of bytes in each block (you could use a vector<pair<char*, int>>, but there are a lot of more C++-like possibilities) and use that to write the data.