How to serialize and deserialize an object into/from binary files manually? - c++

I've been trying to write the object below to a file and have had a lot of trouble, since strings are dynamically allocated.
class Student{
string name, email, telephoneNo;
int addmissionNo;
vector<string> issued_books;
public:
// There are some methods to initialize name, email, etc...
};
So I learned that I can't simply write an object to a file, or read one back, without serialization. I searched the internet for serialization in C++ and found the Boost library. But I wanted to do it on my own (I know rewriting a library that already exists is not a good idea, but I want to see what's going on inside the code). I then learned about overloading the iostream << and >> operators, and I know how to serialize/deserialize to/from text.
But I want to serialize into a binary file. So I tried overloading ostream write and istream read, but then I ran into size issues (write and read need the size of the object they write/read). I also learned that stringstream can help serialize/deserialize objects to/from binary, but I don't know how to do that.
So my real question is: how do I serialize and deserialize an object into/from binary files without third-party libraries?

I have found a solution to serialize and deserialize an object into/from a file. Here is an explanation.
As I said, this is my class. I have added two functions that overload iostream's write and read.
class Student{
string name, email, telephoneNo;
int addmissionNo;
vector<string> issuedBooks;
public:
void create(); // initialize the private members
void show(); // showing details
// and some other functions as well...
// here I'm overloading the iostream's write and read
friend ostream& write(ostream& out, Student& obj);
friend istream& read(istream& in, Student& obj);
};
But I also said that I had tried this already. The problem I had was how to read without knowing the size of the object's members. So I made the changes below (please read the comments too).
// write: overload the standard library write function and return an ostream
// #param out: an ostream
// #param obj: a Student object
ostream& write(ostream& out, Student& obj){
// writing the object's members one by one.
out.write(obj.name.c_str(), obj.name.length() + 1); // +1 for the terminating '\0'
out.write(obj.email.c_str(), obj.email.length() + 1);
out.write(obj.telephoneNo.c_str(), obj.telephoneNo.length() + 1);
out.write((char*)&obj.addmissionNo, sizeof(obj.addmissionNo)); // the int's address is cast to char* so that its raw bytes are written
// write the number of issued books first, so that read knows how many strings follow
size_t count = obj.issuedBooks.size();
out.write((char*)&count, sizeof(count));
// writing the vector of issued books
for (string& book: obj.issuedBooks){
out.write(book.c_str(), book.length() + 1);
}
return out;
}
// read: overload the standard library read function and return an istream
// #param in: an istream
// #param obj: a Student object
istream& read(istream& in, Student& obj){
// getline is used rather than read
// since getline reads a whole line and can be given a delimiting character
getline(in, obj.name, '\0'); // delimiting character is '\0'
getline(in, obj.email, '\0');
getline(in, obj.telephoneNo, '\0');
in.read((char*)&obj.addmissionNo, sizeof(obj.addmissionNo));
// read the book count, then exactly that many strings
size_t count = 0;
in.read((char*)&count, sizeof(count));
obj.issuedBooks.clear();
for (size_t i = 0; i < count && in; ++i){
string book;
getline(in, book, '\0');
obj.issuedBooks.push_back(book);
}
return in;
}
As you can see, I have written length+1 bytes to include the terminating '\0'. This is useful in the read function, since we use getline instead of read: getline reads until the '\0', so no size is needed for the strings. The vector is the exception: its element count is written before the books, because otherwise read cannot know how many strings belong to the current record. Note that this scheme assumes the strings themselves contain no '\0' characters. And here I'm writing and reading into/from a file.
void writeStudent(Student s, ofstream& f){
char ch; // flag for the loop
do{
s.create(); // making a student
f.open("students", ios::app | ios::binary); // the file to be written
write(f, s); // the overloaded function
f.close();
cout << "Do you want to add another record? (y/n): ";
cin >> ch;
cin.ignore();
} while(toupper(ch) == 'Y'); // loop until user stop adding records.
}
void readStudent(Student s, ifstream& f){
char ch; // flag for the loop
do{
f.open("students", ios::in | ios::binary);
cout << "Enter the account no of the student: ";
int no;
cin >> no;
int found = 0;
while (read(f, s)){
if (s.retAddmissionNo() == no){
found = 1;
s.show();
}
}
if (!found)
cout << "Account Not found!\n";
f.close();
cout << "Do you want another record? (y/n): ";
cin >> ch;
} while(toupper(ch) == 'Y');
}
That's how I solved my problem. If something is wrong here, please comment. Thank you!
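An alternative to the '\0' delimiter is to prefix each string with its length, which also works for strings that contain '\0' bytes. The following is only a minimal sketch under stated assumptions: write_string, read_string and round_trip are hypothetical helper names, the length is stored as a 32-bit integer in the machine's native byte order (so the data is not portable across architectures), and a stringstream stands in for the file, as mentioned in the question.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: writes a 32-bit length followed by the raw bytes.
// Assumption: native byte order, strings shorter than 4 GiB.
std::ostream& write_string(std::ostream& out, const std::string& s) {
    const std::uint32_t len = static_cast<std::uint32_t>(s.size());
    out.write(reinterpret_cast<const char*>(&len), sizeof len);
    return out.write(s.data(), static_cast<std::streamsize>(len));
}

// Hypothetical helper: reads the length, then exactly that many bytes.
// The target string is only modified if the whole read succeeds.
std::istream& read_string(std::istream& in, std::string& s) {
    std::uint32_t len = 0;
    if (!in.read(reinterpret_cast<char*>(&len), sizeof len)) return in;
    std::string tmp(len, '\0');
    if (len > 0) in.read(&tmp[0], static_cast<std::streamsize>(len));
    if (in) s.swap(tmp);
    return in;
}

// Round trip through an in-memory binary stream instead of a file.
std::string round_trip(const std::string& s) {
    std::stringstream buf(std::ios::in | std::ios::out | std::ios::binary);
    write_string(buf, s);
    std::string result;
    read_string(buf, result);
    return result;
}
```

The same two helpers could be called once per string member, and once per element of the vector after writing its element count.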

Related

C++: Problems reading input from a binary file

I have a class AccountManagement in AccountManagement.cpp. I have another class called Account in Account.cpp. I have a template that orders the given data inside the list using the OrderedList class, which also has its own iterator. The AccountManagement class writes the list of accounts to a binary file as shown below:
void AccountManagement::saveData(const char * file) //saves data in the specified binary file
{
ofstream out(file, ios::out | ios::binary);
if(!out)
{
cerr<<"Problem opening output file!"<<endl;
}
OrderedList<Account>::iterator it = this->account_manager.begin();
for(int i = 0; i < this->total_accounts; i++)
{
Account temp = *it;
out.write((char*)&temp, sizeof(Account));
it++;
}
out.close();
}
I have defined the following function inside the AccountManagement class; it reads all the data from the binary file and outputs it. This function works perfectly fine. It is shown here:
void AccountManagement::output()
{
ifstream in("accounts.dat", ios::in | ios::binary);
if(!in)
{
cerr<<"File doesn't exist!"<<endl;
exit(1);
}
Account acc;
while(in.read((char*)&acc, sizeof(Account)))
{
cout<<acc;
}
in.close();
}
However, when I use this same function (under a different name) in another file, which also includes the Account.h header, to retrieve data from the same "accounts.dat" file, it gives me a segmentation fault. What could be the problem? Here is the function:
void loadData()
{
ifstream in("accounts.dat", ios::in | ios::binary);
if(!in)
{
cerr<<"File doesn't exist!"<<endl;
exit(1);
}
Account acc;
while(in.read((char*)&acc, sizeof(Account)))
{
cout<<acc;
}
in.close();
}
Account's class declaration:
class Account
{
friend ostream& operator<<(ostream&,const Account&); //overloading << operator
friend istream& operator>>(istream&,Account&); //overloading >> operator
public:
void operator=(const Account&); //overloading = operator
bool operator<=(const Account&); //overloading <= operator
bool operator<(const Account&); //overloading < operator
private:
string number; //Account Number
char name[100]; //Account holder's name
char sex; //M or F indicating the gender of account holder
MYLIB::Date dob; //date of birth of account holder
char address[100]; //address of account holder
char balance[20]; //balance of account holder
};
I don't know about the MYLIB::Date class, but it's enough that you have a std::string object in there.
The std::string object allocates memory dynamically to fit the string it contains. Memory allocated on the heap is available only to the current process; you can't save a pointer (which is inside the std::string class), load it from some other process, and hope there will be valid memory at that pointer.
If you save a pointer to dynamically allocated memory in one process, and load and use it from another process, then you will have undefined behavior.
You need to serialize the string in order to save it, and possibly the MYLIB::Date object as well.
Disclaimer: It will work on small embedded systems with a single unified address map; unfortunately, all the big user-oriented operating systems (like Windows, OSX and Linux) have separate address spaces and walls between processes.
Your function AccountManagement::output() gives the impression that it works perfectly if you save the object and load it again in the same process, provided the string hasn't changed in the meantime.
What's wrong ?
As soon as your object is no longer a POD object (i.e. it contains data that uses pointers, or has virtual functions, etc.), you can't save it just by writing its memory to disk.
In your case, the second function fails for this reason. The first function only gives the impression that it works. The string is a complex object that stores somewhere pointers to dynamically allocated memory. If you write the object and read it back as you did, without changing the object, the values that are in memory are simply re-read. The value of the hidden pointer that is read is exactly what it was before the read. That's a very lucky situation. But in most cases it will fail.
How to solve it ?
To save your object, you should serialize it: write/read each component to the file separately, using an appropriate function.
The easiest way to do this is to use an existing serialization library, such as Boost.Serialization.
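One way to catch this mistake at compile time is to allow the raw byte dump only for trivially copyable types. The sketch below is an illustration, not part of the original code: write_raw, FixedRecord and StringRecord are hypothetical names. Note the limitation: the check rejects members such as std::string, but a struct holding a raw pointer is still trivially copyable, so the check does not catch that case.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <type_traits>

// Hypothetical helper: dumps the object's bytes, but refuses to compile
// for types that are not trivially copyable (e.g. anything with a std::string).
template <typename T>
std::ostream& write_raw(std::ostream& out, const T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T is not trivially copyable; serialize it member by member");
    return out.write(reinterpret_cast<const char*>(&value), sizeof value);
}

// Only fixed-size members: a byte dump reproduces the whole object.
struct FixedRecord {
    char name[100];
    char sex;
};

// Contains a std::string: write_raw(out, StringRecord()) would not compile.
struct StringRecord {
    std::string number;
    char sex;
};
```

With such a guard in place, the write of an Account like the one above would be rejected by the compiler instead of crashing when the file is read back.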

Extract information from a colon delimited file - C++

I am trying to extract information into class objects from a colon delimited file. Each line of the file is set up in the same format. Here are the first few lines of the file:
s:Charles:Babbage:80:530213286:1133764834:mechanical engineering:3.8
e:Marissa:Meyer:37:549114177:53321:ceo:4456000
s:Alonzo:Church:92:586312110:1100539644:mathematics:4.0
e:Dana:Ulery:74:573811211:23451:engineer:124569
This is a school project and the purpose is to teach us about class inheritance. We have a base class Person and two child classes Student and Employee. We are supposed to import and store the information for students into Student objects and employee into Employee objects. I have an array of objects for each class; I'm sorting the students into the array of Student objects, same for employees, and in addition adding all people to the array of People objects.
I don't know how to extract each piece of information between the delimiting colons. Right now I'm trying to use .getline, but it doesn't seem to be working. How do I use this function (or another function) to extract information between delimiters into char arrays? Here's what I have so far for the case that the data is for an employee:
ifstream fin;
char* tempImport;
tempImport = new char[50];
int* tempIntArray;
tempIntArray = new int[10];
double tempDouble;
int tempInt;
// get the specifier of student or employee
fin.getline(tempImport, ':');
if(tempImport[0]=='e'){
// get first name
fin.getline(tempImport, ':');
employees[employeeIndex].setFirstName(tempImport);
allPeople[personIndex].setFirstName(tempImport);
// get last name
fin.getline(tempImport, ':');
employees[employeeIndex].setFirstName(tempImport);
allPeople[personIndex].setFirstName(tempImport);
// get age
fin.getline(tempImport, ':');
employees[employeeIndex].setAge(tempImport[0] - 0);
allPeople[personIndex].setAge(tempImport[0] - 0);
// get SSN
fin.getline(tempImport, ':');
for(int i=0;i<9;i++){
tempIntArray[i] = tempImport[i] - 0;
}
employees[employeeIndex].setSsn(tempIntArray);
allPeople[personIndex].setSsn(tempIntArray);
// get Employee ID
fin.getline(tempImport, ':');
for(int i=0;i<5;i++){
tempIntArray[i] = tempImport[i] - 0;
}
employees[employeeIndex].setEmpID(tempIntArray);
// get title
fin.getline(tempImport, ':');
employees[employeeIndex].setTitle(tempImport);
// get salary
fin >> tempDouble;
employees[employeeIndex].setSalary(tempInt);
employeeIndex++;
personIndex++;
}
It looks like you're missing a parameter when you call ifstream::getline(). See:
http://www.cplusplus.com/reference/istream/istream/getline/
You need the 3-parameter version of the method in order to specify a delimiter. When you call the 2-parameter version, it interprets ':' as the streamsize. (Basically, ':' just resolves to the ASCII code for a colon, so that number gets passed in. What you really want for streamsize is the length of your tempImport buffer.)
However, if I may suggest (and your assignment allows it), the std::getline() version of the function may be better. (It allows you to use std::string instead of char*, which is a more C++ish way of doing things. Also you don't have to worry about if your input is bigger than your buffer.) Here's the documentation on that:
http://www.cplusplus.com/reference/string/string/getline/
So basically you could do something like this:
std::string tempImport;
std::getline(fin, tempImport, ':');
As a debugging suggestion, you could print tempImport after each time you call getline() on it (regardless of which kind you use). Take those out before you submit, but those print statements could help you debug your parsing in the meantime.
std::cerr << "getline(): " << tempImport << std::endl;
Edit:
Regarding the comment below, I was able to get this to compile. (It doesn't do anything useful, but shows that std::getline() is indeed present and compiles.) Does it compile for you?
#include <fstream>
#include <string>
int main (int argc, char** argv)
{
std::ifstream ifs;
std::string str;
std::getline(ifs, str, ':');
return 0;
}
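To illustrate the std::getline() suggestion on a whole record, here is a small sketch that splits one line of the file into its fields. The helper name split_fields is hypothetical, and the sketch assumes that no field contains a colon. A trailing empty field (a line ending in ':') would be dropped.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits one record into its delimiter-separated fields
// by calling std::getline repeatedly on a string stream.
std::vector<std::string> split_fields(const std::string& line, char delim = ':') {
    std::vector<std::string> fields;
    std::istringstream in(line);
    std::string field;
    while (std::getline(in, field, delim)) {
        fields.push_back(field);
    }
    return fields;
}
```

Numeric fields can then be converted with std::stoi or std::stod, for example std::stoi(fields[3]) for the age.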
If you'll pardon my saying so, you seem to have been taught one of the worst possible imitations of 'object oriented programming' (though if it's any comfort, it's also a fairly common one).
Personally, I think I'd write things quite a bit differently.
I'd probably start by eliminating all the setSalary, setTitle, etc. They're a horrible perversion of what OOP was supposed to do, losing a great deal in readability while gaining nothing in encapsulation.
Rather than providing member functions to manipulate all the members of the class, the class should provide a higher-level member to reconstitute an instance of itself from a stream.
When you do get the data, you probably do not want to create separate objects for your arrays of People/Employees/Students. Rather, each item will go into the array of either Employees or Students. Then the People will be just an array of pointers to the items in the other two arrays.
As to the details of reading the data: personally I'd probably write a ctype class that classified : as whitespace, and then just read data. For your class, you probably want to stick to using getline though.
class Person {
virtual std::istream &read(std::istream &is);
friend std::istream &operator>>(std::istream &is, Person &p) {
return p.read(is);
}
};
class Student : public Person {
std::string first_name;
std::string last_name;
std::string age;
std::string ssn;
std::string ID;
std::string title;
std::string salary;
virtual std::istream &read(std::istream &is) {
std::getline(is, first_name, ':');
std::getline(is, last_name, ':');
std::getline(is, age, ':');
// ...
return is;
}
};
With those in place, reading data from the file will normally be pretty simple:
std::string t;
Employee e;
Student s;
while (std::getline(infile, t, ':'))
if (t == "e") {
infile >> e;
Employees.push_back(e);
}
else if (t =="s") {
infile >> s;
Students.push_back(s);
}
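The ctype idea mentioned above can be sketched as follows. This is only an illustration under assumptions: the names field_separators, StudentRow and parse_student are hypothetical, and the facet treats only ':' and '\n' as whitespace, so that a field such as "mechanical engineering" is read as a single token.

```cpp
#include <istream>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical facet: only ':' and '\n' are classified as whitespace.
struct field_separators : std::ctype<char> {
    field_separators() : std::ctype<char>(get_table()) {}
    static const mask* get_table() {
        static const std::vector<mask> table = [] {
            std::vector<mask> t(table_size, mask());  // nothing is whitespace...
            t[':'] = space;                           // ...except the separator
            t['\n'] = space;                          // ...and the end of a record
            return t;
        }();
        return table.data();
    }
};

// Hypothetical record type holding a few of the student fields.
struct StudentRow {
    std::string first, last, major;
    int age;
    double gpa;
};

// Reads one "s:first:last:age:ssn:id:major:gpa" line with plain operator>>.
bool parse_student(const std::string& line, StudentRow& row) {
    std::istringstream in(line);
    in.imbue(std::locale(in.getloc(), new field_separators));  // the locale owns the facet
    std::string kind, ssn, id;
    in >> kind >> row.first >> row.last >> row.age >> ssn >> id >> row.major >> row.gpa;
    return !in.fail() && kind == "s";
}
```

The trade-off is that the parsing code becomes a plain chain of >> extractions, at the cost of a locale facet that readers may find unfamiliar.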

Reading and then working on stream data

I'm working on problem 4-6 from Accelerated C++. The question asks that I rewrite the Student_info struct, read() function, and grade() function, so that the final grade is calculated immediately and then stored as the only grade in Student_info.
Previously, the program worked as follows:
read() reads from an input stream and stores the data into a Student_info object
Each object is added to a vector
Once every object is read and added, grade() is called on every Student_info object in the vector
With the new constraints I feel I must combine the read() and grade() functions, so there is no need to store intermediate grades. The problem is when reading from the stream I don't know I have run into the end of file, until I do. When doing this I try to call the grade() function on the end of file data.
I don't see a workaround considering the constraint is to read and then immediately work on the data. How can this be handled?
struct Student_info
{
std::string name;
double final_grade;
};
istream& read(istream& is, Student_info& s)
{
double midterm, final;
is >> s.name >> midterm >> final;
// Error, when EOF is read, grade() will process bad data
s.final_grade = grade(midterm, final);
return is;
}
int main()
{
vector<Student_info> students;
Student_info record;
while (read(cin, record))
students.push_back(record);
}
You can check whether the record was successfully read inside the read function. For example like this:
istream& read(istream& is, Student_info& s)
{
string name;
double midterm, final;
if( is >> name >> midterm >> final ) {
s.name = name;
s.final_grade = grade(midterm, final);
}
return is;
}
Note that you could read directly into s.name as in your original code, but my implementation has transaction semantics: it either reads the whole structure or leaves it alone in case it failed to read all the fields.

Serializing a class with a pointer in C++

I want to serialize an object of type Person. I want to use it later on for data saving or even game saving. I know how to do it for primitives like int, char, bool, and even c-strings like char[].
The problem is, I want the string to be as big as it needs to rather than declaring a char array of size 256 and hoping no one enters something too big. I read that serializing a class with std::string as a member doesn't work because it has an internal pointer, but is there a way to serialize my class which has a char* as a member?
I realize Boost has a serialization library, but I'd like to do this without the need of external libraries, it seems like a good activity to try.
Here's my Person class:
class Person
{
private:
char* _fname;
char* _lname;
public:
Person();
Person(const char* fname, const char* lname);
Person(const string& fname, const string& lname);
string fname() const;
void fname(const char* fname);
void fname(const string& fname);
string lname() const;
void lname(const char* lname);
void lname(const string& lname);
};
First: use std::string in your class; it will make your life so much easier in the long run.
But this advice works for both std::string and char* (with minor tweaks that should be obvious).
Basically you want to serialize data of unknown size (at compile time). This means when you de-serialize the data you must either have a technique that tells you how long the data is (prefix the object with a size) or a way to find the end of the data (a termination marker).
A termination marker is easier for serialization, but harder for de-serialization (as you must seek forward to find the end). You must also escape any occurrences of the termination marker within your object, and the de-serialization must know about the escaping and remove it.
Because of these complications I prefer not to use a termination marker. Instead, I prefix the object with a size. The cost of this is that I must encode the size of the object in a way that will not break.
So if we prefix an object with its size you can do this:
// Place a ':' between the size and the string.
// There must be a marker, as >> would continue reading the size if
// fname contained a digit as its first character.
// I don't like using a space, as >> skips spaces if you are not careful,
// and it is then hard to tell where the string starts if the first characters
// in fname are spaces.
std::cout << strlen(fname) << ":" << fname;
Then you can de-serialize like this:
size_t size;
char mark;
std::cin >> size >> mark;
if (!std::cin || mark != ':')
{ throw BadDataException();
}
result = new char[size+1](); // Note the () to zero fill the array.
std::cin.read(result, size);
Edit 1 (based on comments) Update: to use with string:
size_t size;
char mark;
std::cin >> size >> mark;
if (!std::cin || mark != ':')
{ throw BadDataException();
}
std::string result(size, ' '); // Initialize string with enough space (count first, then fill character).
std::cin.read(&result[0], size); // Just read directly into the string
Edit 2 (based on comments)
Helper function to serialize a string
struct StringSerializer
{
std::string& value;
StringSerializer(std::string const& v):value(const_cast<std::string&>(v)){}
friend std::ostream& operator<<(std::ostream& stream, StringSerializer const& data)
{
return stream << data.value.size() << ':' << data.value;
}
friend std::istream& operator>>(std::istream& stream, StringSerializer const& data)
{
std::size_t size;
char mark(' ');
stream >> size >> mark;
if (!stream || mark != ':')
{ stream.setstate(std::ios::badbit);
return stream;
}
data.value.resize(size);
stream.read(&data.value[0], size);
return stream;
}
};
Serialize a Person
std::ostream& operator<<(std::ostream& stream, Person const& data)
{
return stream << StringSerializer(data.fname) << " "
<< StringSerializer(data.lname) << " "
<< data.age << "\n";
}
std::istream& operator>>(std::istream& stream, Person& data)
{
stream >> StringSerializer(data.fname)
>> StringSerializer(data.lname)
>> data.age;
std::string line;
std::getline(stream, line);
if (!line.empty())
{ stream.setstate(std::ios::badbit);
}
return stream;
}
Usage:
int main()
{
Person p;
std::cin >> p;
std::cout << p;
std::ofstream f("data");
f << p;
}
You can't serialize a pointer; you need to serialize the data the pointer points to.
You'll need to serialize the whole web of objects, starting from Person (or Game) and looking into each object that is reachable from your start object.
When deserializing, you read the data from your storage, allocate memory for it, and use the address of this freshly allocated memory as a member of the Person/Game object.
Pointer fields make it a bit harder, but not impossible, to serialize. If you don't want to use any of the serialization libraries, here is how you can do it.
You should determine the size of what is pointed to at the time of serialization (e.g. it may be of fixed size or it may be a C-string with null character at the end), then you can save a mark indicating that you're serializing an indirect object together with size and the actual content of the area pointed to.
When you stumble upon that mark during deserialization, you can allocate the right amount of memory, copy the object into it and store the pointer to the area in the deserialized object.
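The mark-plus-size scheme described above could look roughly like this for a char* member. This is a sketch under assumptions, not a definitive format: write_cstr and read_cstr are hypothetical names, the mark is a single byte (1 for a present string, 0 for a null pointer), and the length is a 32-bit integer in native byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <istream>
#include <memory>
#include <ostream>
#include <sstream>

// Hypothetical helper: writes a one-byte mark, then (if the pointer is
// not null) the length and the characters it points to.
std::ostream& write_cstr(std::ostream& out, const char* p) {
    const char present = p ? 1 : 0;
    out.write(&present, 1);
    if (p) {
        const std::uint32_t len = static_cast<std::uint32_t>(std::strlen(p));
        out.write(reinterpret_cast<const char*>(&len), sizeof len);
        out.write(p, static_cast<std::streamsize>(len));
    }
    return out;
}

// Hypothetical helper: on seeing the mark, allocates fresh memory of the
// right size and copies the content into it. Returns null for a null
// pointer or on a read error.
std::unique_ptr<char[]> read_cstr(std::istream& in) {
    char present = 0;
    if (!in.read(&present, 1) || present == 0) return nullptr;
    std::uint32_t len = 0;
    if (!in.read(reinterpret_cast<char*>(&len), sizeof len)) return nullptr;
    // Zero-filled buffer, so the string is already terminated.
    std::unique_ptr<char[]> buf(new char[static_cast<std::size_t>(len) + 1]());
    if (len > 0 && !in.read(buf.get(), static_cast<std::streamsize>(len))) return nullptr;
    return buf;
}
```

A class that keeps raw char* members could call release() on the returned pointer and store the result, taking over ownership of the new allocation.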
I recommend using a vector to encapsulate strings for serialization.
#include <string>
#include <vector>
using namespace std;
typedef vector<unsigned char> cbuff;
inline cbuff vchFromString(const std::string &str) {
unsigned char *strbeg = (unsigned char*) str.c_str();
return cbuff(strbeg, strbeg + str.size());
}
inline std::string stringFromVch(const cbuff &vch) {
std::string res;
std::vector<unsigned char>::const_iterator vi = vch.begin();
while (vi != vch.end()) {
res += (char) (*vi);
vi++;
}
return res;
}
class Example
{
cbuff label;
Example(string labelIn)
{
SetLabel(labelIn);
}
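// IMPLEMENT_SERIALIZE and READWRITE are macros that must be supplied by a
// serialization framework; they are not defined in this snippet.
IMPLEMENT_SERIALIZE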
(
READWRITE(label);
)
void SetLabel(string labelIn)
{
label = vchFromString(labelIn);
}
string GetLabel()
{
return (stringFromVch(label));
}
};

SIGABRT in binary read/write

I wrote a very small code snippet and have already gotten the following error:
malloc: *** error for object 0x100100080: pointer being freed was not allocated
Problem is, I have no idea what pointer the compiler's talking about. I pass a variable in by address to the read/write functions, but I never freed it as far as I know. Where's the error in my code? I ran it with Leaks and Zombies, but got nothing.
Here's my program:
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <algorithm>
using namespace std;
class Bank
{
private:
string __name;
public:
Bank()
{
__name = "";
}
Bank(string name)
{
__name = name;
}
string getName() const { return __name; }
};
int main (int argc, char * const argv[])
{
Bank bank("Bank of America");
Bank bank2;
cout << "Bank1: " << bank.getName() << endl;
string filename = bank.getName() + ".bank";
ofstream fout(filename.c_str(), ios::binary);
if (fout.good())
fout.write((char *)&bank, sizeof(bank));
fout.close();
ifstream fin(filename.c_str(), ios::binary);
if (fin.good())
fin.read((char *)&bank2, sizeof(bank2));
fin.close();
cout << "Bank2: " << bank2.getName() << endl;
return 0;
}
You can't read an object that contains a std::string (or anything that's not Plain Ol' Data) with fin.read().
The object is read and written as a stream of bytes, but std::string contains a pointer to memory that is stored elsewhere; that memory is not written by your fout.write() and is not initialized properly by your fin.read().
It is because the string is not initialized properly by your fin.read() that you are getting the heap error: when the object goes out of scope, the destructor of the improperly initialized std::string is called and tries to free memory that it doesn't own.
You probably want to write a custom i/o method for your object and save or load it piece-by-piece. For a shortcut to doing this, use the Boost serialization library.
Because your Bank class contains a std::string, you can't read/write it as binary like you are thinking. A std::string has internal pointers. If you write it as binary, you are just going to be writing pointers and not the actual string contents. Likewise, when you read the string, you are going to be reading a pointer. In this case, you end up making both your bank and bank2 objects have strings which point to the same memory, so when that memory is freed it gets freed twice.
You'll need to have some other way of writing your bank data to a file. In this case, a simple ASCII file with the bank name would be fine.
You cannot do what you are doing, simply because std::string cannot be copied like that. Internally a string object allocates memory and a simple copy of the outer structure doesn't do what you expect.
You need to serialize this structure properly.
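Don't use leading double underscores, please: identifiers containing a double underscore are reserved for the implementation
Pass objects by const reference: Bank(const string& name), please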
This is evil: fout.write((char *)&bank, sizeof(bank));
You may want to write << and >> ostream operators of your Bank class.
For example:
friend std::ostream& operator<<(std::ostream &out, const Bank& b);
friend std::istream& operator>>(std::istream &in, Bank& b); // not const: the operator modifies b
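A possible definition of these two operators, combined with the earlier suggestion of a simple text file, is sketched below. It is an assumption-laden illustration rather than the poster's code: the member is renamed to name_, one bank name is stored per line, and the name is assumed to contain no newline.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

class Bank {
    std::string name_;  // renamed from __name, which is a reserved identifier
public:
    Bank() {}
    explicit Bank(const std::string& name) : name_(name) {}
    std::string getName() const { return name_; }

    // Writes the name followed by a newline, which acts as the record terminator.
    friend std::ostream& operator<<(std::ostream& out, const Bank& b) {
        return out << b.name_ << '\n';
    }

    // Reads one whole line, so names containing spaces survive the round trip.
    friend std::istream& operator>>(std::istream& in, Bank& b) {
        return std::getline(in, b.name_);
    }
};
```

With these in place, `fout << bank;` and `fin >> bank2;` replace the raw write and read calls, and only the characters of the name reach the file.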
The member functions write of ostream and read of istream are specifically designed to input and output binary data. If you do want to manipulate binary data, use the following:
ifstream fin(filename.c_str(), ios::in|ios::binary|ios::ate);
if (fin.good()){
streamsize size = fin.tellg(); // opened with ios::ate, so the position is the file size
char* memblock = new char [size];
fin.seekg(0, ios::beg);
fin.read(memblock, size);
fin.close();
// ... use the bytes in memblock ...
delete[] memblock;
}