Writing and reading objects with virtual methods to a binary file - c++

Hi I'm currently working on a simulation program that tries to save the state (variables and objects) of the program to a binary file when requested so that it can resume the simulation if needed.
Just as a note: I know that this is not compatible across different CPU architectures and that is absolutely fine!
Everything seemed to be working fine until I wrote an object that has virtual methods to a file and then tried to read it back.
The following code illustrates this problem:
header.hpp
using namespace std;
class parent
{
public:
int mValue;
virtual string getName() =0;
virtual size_t getSize() =0;
parent(int value) : mValue(value)
{
}
};
class bob : public parent
{
public:
bob(int value) : parent(value)
{
}
string getName();
size_t getSize() { return sizeof(bob); }
};
string bob::getName()
{
string name("bob");
return name;
}
class sarah : public parent
{
public:
sarah(int value) : parent(value)
{
}
string getName();
size_t getSize() { return sizeof(sarah); }
};
string sarah::getName()
{
string name("sarah");
return name;
}
write.cpp
#include <iostream>
#include <fstream>
#include <string>
#include "header.hpp"
int main()
{
sarah girl(1);
bob boy(2);
parent* child1 = &girl;
parent* child2 = &boy;
cout << "Created child called " << child1->getName() << endl;
cout << "Created child called " << child2->getName() << endl;
//save sarah and bob to a binary file
ofstream file("temp.bin", ios::binary | ios::trunc);
if(!file.is_open())
return 1;
//format <size><data><size><data>....
size_t tempSize=0;
//write child1
tempSize = child1->getSize();
file.write( (char*) &tempSize,sizeof(size_t));
file.write( (char*) child1,tempSize);
tempSize = child2->getSize();
file.write( (char*) &tempSize,sizeof(size_t));
file.write( (char*) child2,tempSize);
file.close();
return 0;
}
read.cpp
#include <iostream>
#include <fstream>
#include <string>
#include <cstdlib>
#include "header.hpp"
int main()
{
//read sarah and bob from a binary file
ifstream file("temp.bin", ios::binary);
//format <size><data><size><data>....
size_t tempSize=0;
//get size of child1
file.read( (char*) &tempSize, sizeof(size_t));
//allocate memory for child1
parent* child1= (parent*) malloc(tempSize);
//read child 1 back
file.read( (char*) child1,tempSize);
//get size of child2
file.read( (char*) &tempSize, sizeof(size_t));
//allocate memory for child2
parent* child2= (parent*) malloc(tempSize);
//read child 2 back
file.read( (char*) child2,tempSize);
file.close();
//Using virtual methods causes SEGFAULT
cout << "Recreated child" << child1->getName() << endl;
cout << "Recreated child" << child2->getName() << endl;
return 0;
}
And building and running as follows:
g++ -g write.cpp -o write ; ./write
g++ -g read.cpp -o read ; ./read
When I step through the read program in gdb I've noticed the problem appears to be the v-table pointer. When I recreate "sarah" (child1) in the read program the v-table pointer is the one that existed for the write program, not the read program. So presumably this v-table pointer for "sarah" in the write program points to an invalid region of memory which is causing the SEGFAULT.
I have two questions:
Is it possible to save the v-table pointer information to the binary file in the "write" program so that my objects are perfectly recreated in the "read" program without resorting to a library such as Boost::Serialization or POST++ to handle this for me?
If it isn't possible, or if it's quite complicated, then I will have to add a constructor and a "saveState()" method (acting on an ifstream and an ofstream object respectively) so that each class (in this case sarah and bob) handles reading and saving its own state from and to a binary file. The problem with this is that I have multiple classes derived from the class "parent", so I would need a way for the "read" program to work out from the binary file which constructor to call.
I came up with one way of working out which constructor to call. This would be
Giving each class that derives from "parent" a unique ID
In the "write" program add unique ID to the binary file
In the "read" program read each unique ID and then use a switch statement to call the relevant constructor.
This isn't very elegant though as every time I add a new class that derives from "parent" I have to give it an ID and add it to the switch statement in "read". Is there a better way of doing it?
Thanks for reading, I know my post is long!

Every time your program is compiled, its functions and v-tables can end up at different addresses. On some operating system configurations they can even move each time you start the program, because of a security feature called address space layout randomization. If you know for sure that you will be reading and writing an object from the exact same binary, you might be able to do what you want by putting your read and write functions in the same program instead of two different ones. Even then, as soon as you make a change and recompile, you can no longer read your old data files.
Boost::Serialization was created specifically to avoid all these issues, including I'm sure some I'm not even aware of, is heavily peer reviewed and tested, and has an extremely liberal license as a bonus. Use of such a library is not something to be "resorted" to, it's a privilege.
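If you do decide to hand-roll it, the ID-based scheme from the question can avoid the growing switch statement by letting each class register a factory under its own ID. What follows is only a minimal sketch, not a replacement for a tested library. The names Parent, Bob, Sarah, registry, Registrar and loadState are made up for illustration, and the file layout (<type id><value>, native byte order) is an assumption. It saves only the data members, never the v-table pointer.

```cpp
#include <cstdint>
#include <functional>
#include <istream>
#include <map>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the question's `parent` hierarchy.
class Parent {
public:
    explicit Parent(int value) : mValue(value) {}
    virtual ~Parent() = default;
    virtual std::string getName() const = 0;
    virtual std::uint32_t typeId() const = 0;

    // Writes <type id><value>; the raw object bytes are never written.
    void saveState(std::ostream& out) const {
        const std::uint32_t id = typeId();
        out.write(reinterpret_cast<const char*>(&id), sizeof id);
        out.write(reinterpret_cast<const char*>(&mValue), sizeof mValue);
    }

    int mValue;
};

// Maps a type id to a function that constructs the matching class.
using Factory = std::function<std::unique_ptr<Parent>(int)>;

inline std::map<std::uint32_t, Factory>& registry() {
    static std::map<std::uint32_t, Factory> r;  // created on first use
    return r;
}

// Each class registers itself once, so the reader needs no switch statement.
template <class T>
struct Registrar {
    Registrar() {
        registry()[T::staticId()] = [](int v) {
            return std::unique_ptr<Parent>(new T(v));
        };
    }
};

class Bob : public Parent {
public:
    static std::uint32_t staticId() { return 1; }
    explicit Bob(int v) : Parent(v) {}
    std::string getName() const override { return "bob"; }
    std::uint32_t typeId() const override { return staticId(); }
};
static Registrar<Bob> registerBob;

class Sarah : public Parent {
public:
    static std::uint32_t staticId() { return 2; }
    explicit Sarah(int v) : Parent(v) {}
    std::string getName() const override { return "sarah"; }
    std::uint32_t typeId() const override { return staticId(); }
};
static Registrar<Sarah> registerSarah;

// Reads one <type id><value> record and builds the matching object.
// Returns nullptr at end of stream or for an unknown type id.
inline std::unique_ptr<Parent> loadState(std::istream& in) {
    std::uint32_t id = 0;
    int value = 0;
    in.read(reinterpret_cast<char*>(&id), sizeof id);
    in.read(reinterpret_cast<char*>(&value), sizeof value);
    if (!in) return nullptr;
    auto it = registry().find(id);
    if (it == registry().end()) return nullptr;
    return it->second(value);
}
```

Adding a new derived class then means giving it an ID and one Registrar line next to the class; the reading code stays unchanged. A class with more members would need its own virtual save and load steps, which this sketch leaves out.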

Related

Why does my class std::vector member always throw a segfault?

I've searched endlessly on SE for a logical explanation for why this is happening. It is probably something very simple that I've overlooked, however I cannot spot it and would really appreciate some assistance with this.
Last week I implemented a class to read the output of a system call from a .ini file and then find and store the required information into custom objects that are then stored in a vector inside a Config class. It is a Singleton config class storing a unique_ptr for each instance of my custom class that is created.
The thing is, when I implemented this last week on my laptop, I had zero issues reading and writing to my member vector and was able to get it working exactly how I needed it. Since pulling to my desktop computer, this vector, and any STL container that I use as a member of my class, throws a segmentation fault when I try to do anything with it, even get its size.
I've tried to shorten the code below to only include sections that actually use this vector. I have replaced my config with A, and custom class with T, and no matter where I try to use my member container, or any other test STL containers that I add to the class, I get a segfault.
For the record, I am using Qt with C++11.
Update: This example breaks on line 50 of c.cpp when debugging, and anywhere that tries to call the vector.
Debug points to this line in stl_vector.h
// [23.2.4.2] capacity
/** Returns the number of elements in the %vector. */
size_type
size() const _GLIBCXX_NOEXCEPT
/*-> this line */ { return size_type(this->_M_impl._M_finish - this->_M_impl._M_start); }
main.cpp
#include "c.h"
int main(int argc, char *argv[])
{
C *c = C::getInstance();
delete c;
return 0;
}
t.h - Class stores information from file
#include <string>
class T
{
public:
T();
bool Active();
std::string getA();
void setA(std::string);
private:
std::string a;
};
t.cpp
#include "t.h"
T::T()
{
}
bool T::Active()
{
if(a == "")
{
return false;
}
return true;
}
std::string T::getA()
{
return this->a;
}
void T::setA(std::string newa)
{
this->a = newa;
}
c.h - Class stores T objects and parses file for information
#include "t.h"
#include <QDebug>
#include <vector>
#include <algorithm>
#include <iostream>
#include <memory>
#include <sstream>
#include <fstream>
class C
{
public:
static C* getInstance();
private:
C();
static C* instance;
static bool init;
std::vector<std::unique_ptr<T>> t_list;
void readLines(const std::string&);
};
c.cpp
#include "c.h"
bool C::init = false;
C* C::instance = nullptr;
C::C()
{
system("echo this is a test command > a.ini");
instance->readLines("a.ini");
}
C* C::getInstance()
{
if(!init)
{
instance = new C;
init = true;
}
return instance;
}
void C::readLines(const std::string &path)
{
T* new_t;
std::ifstream file(path.c_str());
if(!file.is_open())
{
qDebug() << "Unable to open " << path.c_str();
}
std::ofstream o("test.txt");
std::string line;
while(std::getline(file, line))
{
// Split string before searching
std::stringstream ss(line);
std::string seg;
std::vector<std::string> split;
std::string left, right;
// Search patterns
size_t find_a = line.find("a");
size_t del = line.find(':');
if(find_a != std::string::npos)
{
o << "test_Size: " << t_list.size() << std::endl;
if(new_t->Active())
{
T* temp = new_t;
std::unique_ptr<T> move_t(temp);
t_list.push_back(std::move(move_t));
}
o << "test: " << t_list.size() << std::endl;
std::string n;
// Check if previous a has any null elements
// Split string to find a
n = line.substr(line.find("a "));
n = n.substr(n.find(" ", +2));
new_t->setA(n);
}
else
{
continue;
}
}
// Add last a
T* t = new_t;
std::unique_ptr<T> move_t(t);
//t_list.push_back(std::move(move_t));
o << "a: " << t_list.back().get()->getA() << std::endl;
o << t_list.size() << std::endl;
o.close();
file.close();
}
UPDATE after code change:
I see two things now: One is that new_t in C::readLines is never initialized, so this could break when new_t->Active() is called a bit later in the function. However, I believe that the main problem you're running into is in C::C(), where it says
instance->readLines("a.ini");
At this point in the execution, C::instance is not yet initialized: you're only just constructing the object that will later be assigned to it. Because of this, the this pointer in the readLines call is invalid, and any attempt to access object members causes undefined behavior. This latter problem can be fixed by just calling
readLines("a.ini");
in which case the object currently being constructed (which will later be instance) is used for this. I have no idea what you want to happen for the first, though, so all I can say is: If you want to have a vector<unique_ptr<T>>, you will have to create objects of type T with either new T() or (arguably preferably) std::make_unique<T>(), which requires C++14, and put them in there.
I'll also say that this is a rather ugly way to implement a singleton in C++. I mean, singletons are never really pretty, but if you're going to do it in C++, the usual way is something like the accepted answer of C++ Singleton design pattern .
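For reference, the usual form is, as far as I know, a function-local static. The sketch below is an illustration under assumptions, not the original poster's code: Config, Item and addItem are hypothetical stand-ins for C, T and readLines, and the parsing is replaced by two fixed entries.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's T.
struct Item {
    std::string a;
};

class Config {
public:
    // Function-local static: constructed on the first call, initialized in a
    // thread-safe way since C++11, destroyed automatically at program exit.
    static Config& getInstance() {
        static Config instance;
        return instance;
    }

    Config(const Config&) = delete;
    Config& operator=(const Config&) = delete;

    std::size_t size() const { return items_.size(); }
    const Item& at(std::size_t i) const { return *items_.at(i); }

private:
    Config() {
        // Members are called on the object being constructed (this), not
        // through a static pointer that is still null at this point.
        addItem("first");
        addItem("second");
    }

    void addItem(const std::string& value) {
        // The object lives on the heap and is owned by the vector.
        std::unique_ptr<Item> item(new Item{value});
        items_.push_back(std::move(item));
    }

    std::vector<std::unique_ptr<Item>> items_;
};
```

Callers use Config::getInstance() and never delete the result, which also removes the delete c; line from main.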
Old answer:
The problem (if it is the only one, which I cannot verify because you didn't provide an MCVE) is in the lines
T move_t = new_T;
std::unique_ptr<Adapter> ptr_t(&move_t); // <-- particularly this one
m_ts.push_back(std::move(ptr_t));
You're passing a pointer to a local object into a std::unique_ptr, but the whole purpose of std::unique_ptr is to handle objects allocated with new to avoid memory leaks. Not only will the pointer you pass into it be invalid once the scope surrounding this declaration is left, even if that weren't the case the unique_ptr would attempt to delete an object that's not on the heap at the end of its lifecycle. Both problems cause undefined behavior.
To me, it looks as though you really want to use a std::vector<T> instead of std::vector<std::unique_ptr<T>>, but that's a design issue you'll have to answer yourself.
Answering my own question here. I was trying to use a member variable from within the constructor of the object that holds it, so the vector I was trying to access was not yet instantiated and didn't exist in memory. That is what caused the segmentation fault: I was accessing memory that was not allocated yet, hence any call acting on any member of my C class caused this issue.
I fixed this problem by adding a public function to the class that then calls the private readLines() function. I call that public function from the object that will take ownership of it, and since this occurs after it has been instantiated, the memory is accessible and the problem disappears.

Persist C++ class instance through Matlab mex function calls

Currently trying to figure out how to solve this problem.
Right now I am trying to create mex functions for a C++ library which talks to a microcontroller over serial ports on a Windows 10 PC so that I can call functions in that library in matlab. I am currently working through how to persist an instance of my class through multiple matlab mexFunction calls.
So far the only thing I could come up with is to write a wrapper around the class, declare a global extern unique pointer to my class instance, and include it in my mexFunction() files.
Can anyone tell me if this might work, and if so, how exactly does matlab/C++ handle the mexFunction files and their method calls? The scope of my class instance is what I'm unsure of.
A concrete example might be...
What would happen if I declared an extern unique pointer to an object in a .cpp file and included it in my mexFunction files? Would the pointer stay in scope throughout the matlab script that calls multiple different mexFunctions that manipulate that object?
If I need to rephrase the question or provide more information, please let me know.
Yes, you can do this. If the MEX-files all link to the same shared library (DLL), then they all have access to global variables defined in it. You would need to define your global object in the shared library, not in one of the MEX-files.
MEX-files stay loaded in memory after first execution, until you call clear functions (or clear all). The global object will be destructed when the shared object is cleared from memory. To prevent undesired clearing of your state, you can lock one of the MEX-files in memory using mexLock. I would recommend having one "initialize" MEX-file that constructs the object and locks itself in memory. With a special parameter you can make it unlock itself and destroy the object.
Here is an example:
libXYZ.dylib / libXYZ.so / XYZ.dll -- a shared library, contains a std::unique_ptr<XYZ>.
XYZ_set.mex... -- a MEX-file that initializes the XYZ object, and locks itself in memory. Links to the libXYZ shared library.
XYZ_get.mex... -- another MEX-file that links to the libXYZ shared library and accesses the XYZ object created by the other MEX-file.
XYZ_lib.h:
#include <memory>
#include <iostream>
struct XYZ {
XYZ(double a);
~XYZ();
double get();
private:
double a_;
};
extern std::unique_ptr<XYZ> XYZ_data;
XYZ_lib.cpp:
#include "XYZ_lib.h"
std::unique_ptr<XYZ> XYZ_data;
XYZ::XYZ(double a) : a_(a) {
std::cout << "Constructing XYZ with " << a_ << '\n';
}
XYZ::~XYZ() {
std::cout << "Destructing XYZ, value was " << a_ << '\n';
}
double XYZ::get() {
return a_;
}
XYZ_set.cpp:
#include "XYZ_lib.h"
#include <mex.h>
/// \brief An output stream buffer for MEX-files.
///
/// Creating an object of this class replaces the stream buffer in `std::cout` with the newly
/// created object. This buffer will be used as long as the object exists. When the object
/// is destroyed (which happens automatically when it goes out of scope), the original
/// stream buffer is replaced.
///
/// Create an object of this class at the beginning of any MEX-file that uses `std::cout` to
/// print information to the *MATLAB* terminal.
class streambuf : public std::streambuf {
public:
streambuf() {
stdoutbuf = std::cout.rdbuf( this );
}
~streambuf() {
std::cout.rdbuf( stdoutbuf );
}
protected:
virtual std::streamsize xsputn( const char* s, std::streamsize n ) override {
mexPrintf( "%.*s", static_cast<int>(n), s ); // "%.*s" expects an int precision
return n;
}
virtual int overflow( int c = EOF ) override {
if( c != EOF ) {
mexPrintf( "%c", c ); // print the single character
}
return 1;
}
private:
std::streambuf* stdoutbuf;
};
void mexFunction( int, mxArray*[], int nrhs, const mxArray* prhs[] ) {
streambuf buf; // Allows std::cout to work in MEX-files
// Always do lots of testing for correct input in MEX-files!
if (nrhs!=1) {
mexErrMsgTxt("Requires 1 input");
}
if (mxIsChar(prhs[0])) {
// Assume it's "-unlock" or something like that. Unlock MEX-file
mexUnlock();
std::cout << "XYZ can now be cleared from memory\n";
} else {
// Here we create new data
if (!mxIsDouble(prhs[0]) || mxIsEmpty(prhs[0])) {
mexErrMsgTxt("Expected double input");
}
double a = *mxGetPr(prhs[0]);
XYZ_data = std::unique_ptr<XYZ>(new XYZ(a));
// If the MEX-file is not locked, lock it
if (!mexIsLocked()) {
mexLock();
}
}
}
(Sorry for the streambuf class here, it's noise, but I wanted to use it so you can see the constructor and destructor in the shared library being called.)
XYZ_get.cpp:
#include "XYZ_lib.h"
#include <mex.h>
void mexFunction( int, mxArray* plhs[], int, const mxArray* [] ) {
if (XYZ_data) {
plhs[0] = mxCreateDoubleScalar(XYZ_data->get());
} else {
mexErrMsgTxt("XYZ not initialized!");
}
}
Compiling:
In a shell (I'm using MacOS, hence the dylib extension, adjust as necessary):
g++ -std=c++11 -Wall -fpic XYZ_lib.cpp -shared -o libXYZ.dylib
In MATLAB:
mex XYZ_set.cpp libXYZ.dylib
mex XYZ_get.cpp libXYZ.dylib
Running:
>> XYZ_get
Error using XYZ_get
XYZ not initialized!
>> XYZ_set(4)
Constructing XYZ with 4
>> XYZ_set(6)
Constructing XYZ with 6
Destructing XYZ, value was 4
>> XYZ_get
ans =
6
>> clear all
>> XYZ_set -unlock
XYZ can now be cleared from memory
>> clear all
Destructing XYZ, value was 6
As you can see, XYZ_get accesses the value in an object that was newed by XYZ_set. clear all typically clears everything from memory, but here the locked MEX-file stays. XYZ_set -unlock calls it with a string argument, which causes it to unlock itself. clear all now clears that MEX-file from memory also, and now the XYZ object is destroyed.
I need to mention here that C++ doesn't have a consistent ABI, and these MEX-files will only load if the shared library was compiled with the same compiler.
An alternative, and often simpler, is to create only one single MEX-file (statically linked with your C++ code), and a bunch of M-files that call the MEX-file. The M-files provide the nice interface (can do input checking also), and the MEX-file sits in a private/ directory where nobody can mess with it. The MEX-file can still do the locking thing, so it can hold on to objects that are preserved from call to call.

Why does this cpp program fail?

Note: While debugging, I found that the program runs normally up to the last line, but when it passes the final closing brace, an error window pops up. I'm not very familiar with C++, so I couldn't locate the problem. Please help!
#include <iostream>
#include <fstream>
#include <vector>
using namespace std;
class test {
public:
int x;
void add_list(int);
void display();
private:
vector<int> list;
};
void test::add_list(int op)
{
list.push_back(op);
}
void test::display()
{
cout << x << endl;
for (unsigned int i=0;i<list.size(); i++)
cout << "->" << list[i];
cout << endl;
}
int main (void)
{
test test1;
test1.x = 3;
test1.add_list(2);
test1.add_list(4);
int size = sizeof (test1);
ofstream fout ("data.dat", ios_base::binary);
fout.write((char *)&test1, size);
fout.close();
ifstream fin ("data.dat", ios_base::binary);
test test2;
fin.read((char *)&test2, size);
test2.display();
fin.close();
return 0;
}
These lines
fout.write((char *)&test1, size);
and
fin.read((char *)&test2, size);
won't work because the test class contains an object that holds pointers. std::vector allocates extra memory with new to store the items that are pushed onto it, and keeps pointers to that memory. When you write the object to disk, only those pointer values are written, not the items they point to. When you load the object back again, the pointers have the same values, but your program may not have the same memory allocated, and it certainly won't have allocated it for the new object.
In your case test2 appears to work because its internal pointers end up being the same as test1's. However, when your program finishes, the destructors of test1 and test2 both try to release the same memory, and the second release leads to your error.
To fix it, you should change your code to write the object in a defined format that doesn't use pointers (e.g. write out the item count followed by each item's integer value), and then read it back the same way. A single write call on the whole object won't be able to do it.
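A minimal sketch of such a format, assuming the file is read back on the same kind of machine (native byte order and int size). The class name Test and the members save, load, add and items are invented for this illustration; the layout is <x><count><item 0>...<item count-1>.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

class Test {
public:
    int x = 0;

    void add(int value) { list_.push_back(value); }
    const std::vector<int>& items() const { return list_; }

    // Writes the values themselves, never the vector's internal pointers.
    void save(std::ostream& out) const {
        const std::uint64_t count = list_.size();
        out.write(reinterpret_cast<const char*>(&x), sizeof x);
        out.write(reinterpret_cast<const char*>(&count), sizeof count);
        for (int value : list_) {
            out.write(reinterpret_cast<const char*>(&value), sizeof value);
        }
    }

    // Reads the same layout back; leaves the object untouched on failure.
    bool load(std::istream& in) {
        int newX = 0;
        std::uint64_t count = 0;
        in.read(reinterpret_cast<char*>(&newX), sizeof newX);
        in.read(reinterpret_cast<char*>(&count), sizeof count);
        if (!in) return false;

        std::vector<int> loaded;
        for (std::uint64_t i = 0; i < count; ++i) {
            int value = 0;
            in.read(reinterpret_cast<char*>(&value), sizeof value);
            if (!in) return false;  // file shorter than the count claims
            loaded.push_back(value);
        }
        x = newX;
        list_.swap(loaded);
        return true;
    }

private:
    std::vector<int> list_;
};
```

In the question's main, test1.save(fout) and test2.load(fin) would replace the two raw write/read calls. Each object then owns its own vector storage, so nothing is freed twice.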

How to read pointer of class from file

I am trying to write pointers to a class into a file and then read them back. Writing works fine, but reading gives a type conversion error. Please help.
Take this example with an integer. If we use int instead of int*, the code compiles and runs, but does not behave correctly.
#include<iostream>
#include<windows.h>
#include<fstream>
using namespace std;
void save(int *ptr)
{
ofstream data;
data.open("info.txt",ios::app);
if (data.is_open())
{
data<<ptr;
data.close();
}
else
{
cout << "Unable to open file";
}
}
int* loaddata()
{
ifstream data;
int ptr;
data.open("info.txt");
if (data.is_open())
{
while (!data.eof() )
{
data>>ptr;
}
data.close();
}
else
{
cout << "Unable to open file";
}
return ptr;
}
void main()
{
int a=0;
save(&a);
int *ptr=loaddata();
}
A pointer is just a memory address. You can write it just fine, as you said, but when you read it, it is still just a memory address. Unless the object that it was pointing to is at the exact same memory location when you read it, you will be "reading" a pointer to random data, which you cannot convert to the class of the object it was pointing to before.
It's like storing the location (lat/long) of a butterfly, then trying to find that butterfly just from that position. The butterfly is most likely in a completely different place now.
What you are trying to do is normally called serialization.
The idea is to write each class instance (all the data it contains) together with an ID. The ID can be the address of the instance, because an address is unique among live objects. The serialization library makes sure each instance is written only once (only one copy of the data is needed); every later write of the same instance writes only the ID.
Reading back is quite simple as well. When the serialization library needs an instance of a class, it looks up the unique ID (for example the pointer/address mentioned above) and, if it has not already done so, creates a new instance with the content that was written. After that, every pointer read with that ID is set to the address of the newly created instance.
Have a look at the serializer pattern or a concrete implementation like Boost.Serialization: http://www.boost.org/doc/libs/1_58_0/libs/serialization/doc/index.html
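The ID scheme described above can be sketched in a few lines for the int case from the question. This is an illustration only, not how Boost implements it: PointerWriter, PointerReader and the record layout (<id><has-data flag><value>) are assumptions, small sequential numbers are used as IDs instead of raw addresses, and null pointers are not handled.

```cpp
#include <cstdint>
#include <istream>
#include <map>
#include <memory>
#include <ostream>
#include <sstream>

// Writes each distinct object once; later writes of the same address
// store only its ID.
class PointerWriter {
public:
    explicit PointerWriter(std::ostream& out) : out_(out) {}

    void write(const int* p) {
        auto it = ids_.find(p);
        const bool firstTime = (it == ids_.end());
        if (firstTime) {
            const std::uint32_t nextId = static_cast<std::uint32_t>(ids_.size());
            it = ids_.emplace(p, nextId).first;
        }
        const std::uint32_t id = it->second;
        const char hasData = firstTime ? 1 : 0;
        out_.write(reinterpret_cast<const char*>(&id), sizeof id);
        out_.write(&hasData, 1);
        if (firstTime) {
            out_.write(reinterpret_cast<const char*>(p), sizeof *p);
        }
    }

private:
    std::ostream& out_;
    std::map<const int*, std::uint32_t> ids_;  // address -> ID
};

// Recreates each object once; repeated IDs yield the same new object.
class PointerReader {
public:
    explicit PointerReader(std::istream& in) : in_(in) {}

    std::shared_ptr<int> read() {
        std::uint32_t id = 0;
        char hasData = 0;
        in_.read(reinterpret_cast<char*>(&id), sizeof id);
        in_.read(&hasData, 1);
        if (!in_) return nullptr;
        if (hasData) {
            int value = 0;
            in_.read(reinterpret_cast<char*>(&value), sizeof value);
            if (!in_) return nullptr;
            objects_[id] = std::make_shared<int>(value);
        }
        auto it = objects_.find(id);
        if (it == objects_.end()) return nullptr;  // ID without data: corrupt file
        return it->second;
    }

private:
    std::istream& in_;
    std::map<std::uint32_t, std::shared_ptr<int>> objects_;  // ID -> new object
};
```

The addresses of the recreated objects differ from the original ones, but two pointers that referred to the same object before saving refer to the same object again after loading.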

SIGABRT in binary read/write

I wrote a very small code snippet and have already gotten the following error:
malloc: *** error for object 0x100100080: pointer being freed was not allocated
Problem is, I have no idea what pointer the compiler's talking about. I pass a variable in by address to the read/write functions, but I never freed it as far as I know. Where's the error in my code? I ran it with Leaks and Zombies, but got nothing.
Here's my program:
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <algorithm>
using namespace std;
class Bank
{
private:
string __name;
public:
Bank()
{
__name = "";
}
Bank(string name)
{
__name = name;
}
string getName() const { return __name; }
};
int main (int argc, char * const argv[])
{
Bank bank("Bank of America");
Bank bank2;
cout << "Bank1: " << bank.getName() << endl;
string filename = bank.getName() + ".bank";
ofstream fout(filename.c_str(), ios::binary);
if (fout.good())
fout.write((char *)&bank, sizeof(bank));
fout.close();
ifstream fin(filename.c_str(), ios::binary);
if (fin.good())
fin.read((char *)&bank2, sizeof(bank2));
fin.close();
cout << "Bank2: " << bank2.getName() << endl;
return 0;
}
You can't read an object that contains a std::string (or anything that's not Plain Old Data) with fin.read().
The object is read and written as a stream of bytes, but std::string contains a pointer to memory that is stored elsewhere and is not written with your fout.write() and is not initialized properly with your fin.read()
It is because it is not initialized properly with your fin.read() that you are getting the heap error; when the object goes out of scope, the destructor of the improperly initialized std::string is being called, and trying to free memory that it doesn't own.
You probably want to write a custom i/o method for your object and save or load it piece-by-piece. For a shortcut to doing this, use the Boost serialization library.
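As a rough sketch of the piece-by-piece approach, the string can be stored as its length followed by its characters. The member names save and load and the 32-bit length prefix are assumptions chosen for this example, and the format is only meant to be read back on the same kind of machine.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

class Bank {
public:
    Bank() = default;
    explicit Bank(const std::string& name) : name_(name) {}

    const std::string& getName() const { return name_; }

    // Layout: <length><characters>. The characters are written, not the
    // pointer that std::string keeps internally.
    void save(std::ostream& out) const {
        const std::uint32_t length = static_cast<std::uint32_t>(name_.size());
        out.write(reinterpret_cast<const char*>(&length), sizeof length);
        out.write(name_.data(), length);
    }

    // Reads the same layout; leaves the object unchanged on failure.
    bool load(std::istream& in) {
        std::uint32_t length = 0;
        in.read(reinterpret_cast<char*>(&length), sizeof length);
        if (!in) return false;

        std::string name(length, '\0');  // the string allocates its own buffer
        if (length > 0) {
            in.read(&name[0], length);
        }
        if (!in) return false;
        name_.swap(name);
        return true;
    }

private:
    std::string name_;
};
```

With this, bank.save(fout) and bank2.load(fin) take the place of the raw write and read calls, and bank2 ends up owning its own copy of the characters.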
Because your Bank class contains a std::string, you can't read/write it as binary like you are thinking. A std::string has internal pointers. If you write it as binary, you are just going to be writing pointers and not the actual string contents. Likewise, when you read the string, you are going to be reading a pointer. In this case, you end up making both your bank and bank2 objects have strings which point to the same memory, so when that memory is freed it gets freed twice.
You'll need to have some other way of writing your bank data to a file. In this case, a simple ASCII file with the bank name would be fine.
You cannot do what you are doing, simply because std::string cannot be copied like that. Internally a string object allocates memory and a simple copy of the outer structure doesn't do what you expect.
You need to serialize this structure properly.
Don't use names containing double underscores such as __name, please; they are reserved for the implementation
Pass objects by const reference: Bank(const string& name), please
This is evil: fout.write((char *)&bank, sizeof(bank));
You may want to write << and >> stream operators for your Bank class.
For example:
friend std::ostream& operator<<(std::ostream &out, const Bank& b);
friend std::istream& operator>>(std::istream &in, Bank& b);
The member functions write of ostream and read of istream are specifically designed to output and input binary data. If you do want to manipulate binary data, use the following:
ifstream fin(filename.c_str(), ios::in|ios::binary|ios::ate);
if (fin.good()){
// the file was opened at its end (ios::ate), so tellg() gives its size
streamsize size = fin.tellg();
char* memblock = new char [size];
fin.seekg(0, ios::beg);
fin.read(memblock, size);
fin.close();
// ... use the raw bytes in memblock here ...
delete[] memblock;
}