I have a program that employs an entity-component-system framework. Essentially this means that I have a collection of entities that have various components attached to them. Entities are actually just integer ID numbers, and components are attached to them by mapping the component to the specified ID number of the entity.
Now, I need to store collections of entities and their associated components to a file that can be modified later on, so basically I need save and load functionality. However, being somewhat of a newcomer to C++, I have a hard time figuring out exactly how to do this.
Coming from Java and C#, my first choice would be to serialize the objects into, say, JSON, and then deserialize them when the JSON is loaded. However, C++ does not have any reflection features. So, the question is: how do I save and load C++ objects? I don't mean the actual file operations, I mean the way the objects and structs should be handled in order to preserve them between program launches.
One way of doing this is to create persistent objects in C++ and store your data in them.
Check out the following links:
C++ object persistence library similar to eternity
http://sourceforge.net/projects/litesql/
http://en.wikipedia.org/wiki/ODB_(C%2B%2B)
http://drdobbs.com/cpp/184408893
http://tools.devshed.com/c/a/Web-Development/C-Programming-Persistence/
C++ doesn't support persistence directly (there are proposals for adding persistence and reflection to C++ in the future). Persistence support is not as trivial as it may seem at first. The size and memory layout of the same object may vary from one platform to another. Different byte ordering, or endianness, complicates matters even further. To make an object persistent, we have to preserve its state on a non-volatile storage device, i.e. write it out so that its state is retained outside the scope of the program in which it was created.
Another way is to store the objects into an array, then push the array buffer to a file.
The advantages are that the disk does not have to waste time spinning up repeatedly and that the write can be performed contiguously.
You can increase the performance by using threads: dump the objects to a buffer and, once done, trigger a thread to handle the output.
Example:
The following code has not been compiled and is for illustrative purposes only.
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <fstream>
#include <string>
#include <vector>
using std::ofstream;
using std::fill;
#define MAX_DATA_LEN 1024 // Assuming the maximum size of the data is 1024 bytes
class stream_interface
{
public: // the members must be public to be callable from main()
    virtual ~stream_interface() {}
    virtual void load_from_buffer(const unsigned char *& buf_ptr) = 0;
    virtual size_t size_on_stream(void) const = 0;
    virtual void store_to_buffer(unsigned char *& buf_ptr) const = 0;
};
struct Component
    : public stream_interface
{
    unsigned int entity;
    std::string data;
    const unsigned int data_length; // fixed size of the data field on the stream
    Component() : entity(0), data_length(MAX_DATA_LEN) {}
    void load_from_buffer(const unsigned char *& buf_ptr)
    {
        // memcpy avoids reading through a possibly misaligned pointer.
        std::memcpy(&entity, buf_ptr, sizeof(entity));
        buf_ptr += sizeof(entity);
        // The field is zero padded: stop at the first zero byte or at the end of the field.
        const unsigned char * text_end = std::find(buf_ptr, buf_ptr + data_length, 0);
        data.assign(reinterpret_cast<const char *>(buf_ptr), text_end - buf_ptr);
        buf_ptr += data_length;
    }
    size_t size_on_stream(void) const
    {
        return sizeof(entity) + data_length;
    }
    void store_to_buffer(unsigned char *& buf_ptr) const
    {
        std::memcpy(buf_ptr, &entity, sizeof(entity));
        buf_ptr += sizeof(entity);
        // Zero the whole field, then copy at most data_length bytes of text.
        fill(buf_ptr, buf_ptr + data_length, 0);
        const size_t count = std::min<size_t>(data.size(), data_length);
        std::memcpy(buf_ptr, data.data(), count);
        buf_ptr += data_length;
    }
};
int main(void)
{
    Component c1;
    c1.data = "Some Data";
    c1.entity = 5;
    ofstream data_file("ComponentList.bin", std::ios::binary);
    // Determine size of buffer
    size_t buffer_size = c1.size_on_stream();
    // Allocate the buffer (std::vector releases it automatically)
    std::vector<unsigned char> buffer(buffer_size);
    unsigned char * buf_ptr = &buffer[0];
    // Write / store the object into the buffer.
    c1.store_to_buffer(buf_ptr);
    // Write the buffer to the file / stream.
    data_file.write(reinterpret_cast<const char *>(&buffer[0]), buffer_size);
    data_file.close();
    return 0;
}
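For the loading half, which the example above leaves out, a round trip might look like the following. This is only a minimal sketch under assumptions of my own: the names ComponentRecord, write_u32, read_u32, save_component, load_component and round_trip_ok are hypothetical, integers are written as four little-endian bytes regardless of the host, and the string is stored with a length prefix instead of a fixed 1024-byte field.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical in-memory record: one component attached to one entity ID.
struct ComponentRecord {
    std::uint32_t entity;
    std::string data;
};

// Write a 32-bit value as four bytes, least significant byte first.
inline void write_u32(std::ostream& out, std::uint32_t value) {
    for (int shift = 0; shift < 32; shift += 8) {
        out.put(static_cast<char>((value >> shift) & 0xFFu));
    }
}

// Read four little-endian bytes; returns false if the stream ends early.
inline bool read_u32(std::istream& in, std::uint32_t& value) {
    value = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        const int byte = in.get();
        if (!in) return false;
        value |= static_cast<std::uint32_t>(byte) << shift;
    }
    return true;
}

// Layout: [entity: 4 bytes][length: 4 bytes][length bytes of data]
inline void save_component(std::ostream& out, const ComponentRecord& c) {
    write_u32(out, c.entity);
    write_u32(out, static_cast<std::uint32_t>(c.data.size()));
    out.write(c.data.data(), static_cast<std::streamsize>(c.data.size()));
}

// Returns false if the stream is truncated.
// A real loader would also cap length before resizing.
inline bool load_component(std::istream& in, ComponentRecord& c) {
    std::uint32_t length = 0;
    if (!read_u32(in, c.entity) || !read_u32(in, length)) return false;
    c.data.resize(length);
    if (length > 0) in.read(&c.data[0], static_cast<std::streamsize>(length));
    return !in.fail();
}

// Usage sketch: round trip through an in-memory stream. A std::fstream
// opened with std::ios::binary would be used the same way.
inline bool round_trip_ok() {
    std::stringstream stream(std::ios::in | std::ios::out | std::ios::binary);
    ComponentRecord saved = {5, "Some Data"};
    save_component(stream, saved);
    ComponentRecord loaded = {0, ""};
    return load_component(stream, loaded)
        && loaded.entity == saved.entity && loaded.data == saved.data;
}
```

Writing each field explicitly costs a few more lines than dumping raw memory, but the file no longer depends on the size, padding or byte order of the machine that wrote it.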
I am trying to use the PEM_read_bio function to get data from a file.
The version of SSLeay we are using is from 1997, so documentation is a bit thin on the ground. Thankfully in this case it seems there is a matching function documented here: https://www.openssl.org/docs/man1.1.0/crypto/PEM_read_bio.html
I originally tried this:
char ** names;
char ** headers;
unsigned char ** data;
long len;
BIO *in = BIO_new_file("C:\\filename.txt", "r");
if (!in)
{
// error
}
else
{
int result = PEM_read_bio(in, names, headers, data, &len);
}
BIO_free(in);
OPENSSL_free(names);
OPENSSL_free(headers);
OPENSSL_free(data);
However this results in a run-time check failure: The variable 'names' is being used without being initialized.
The documentation mentions OPENSSL_malloc( num ) is used to initialize memory, but it fails to mention whether it does this behind the scenes, or the user does it.
OPENSSL_malloc is similar in usage to C's malloc, but how are we supposed to know how much memory to allocate in advance, before reading the file?
I have tried the following at the beginning:
char ** names = reinterpret_cast<char **>(OPENSSL_malloc(2));
char ** headers = reinterpret_cast<char **>(OPENSSL_malloc(2));
unsigned char ** data = reinterpret_cast<unsigned char **>(OPENSSL_malloc(2));
long len;
This results in apparently random data.
The documentation you linked to says:
The name, header and data pointers are allocated via OPENSSL_malloc() and should be freed by the caller via OPENSSL_free() when no longer needed.
That means PEM_read_bio() calls OPENSSL_malloc() for you, and then you call OPENSSL_free() on the allocated memory it returns when you are done with it.
You are passing uninitialized pointers to PEM_read_bio(), which is why it is failing. The name, header and data parameters are all output parameters. You need to pass in the addresses of your own pointer variables to receive the memory that PEM_read_bio() allocates for you, e.g.:
char *name;
char *headers;
unsigned char *data;
long len;
BIO *in = BIO_new_file("C:\\filename.txt", "r");
if (!in)
{
// error
}
else
{
int result = PEM_read_bio(in, &name, &headers, &data, &len);
if (!result)
{
// error
}
else
{
...
OPENSSL_free(name);
OPENSSL_free(headers);
OPENSSL_free(data);
}
BIO_free(in);
}
GCC allows declaring a struct like this, with a zero-length array as its last member (a compiler extension, not standard C++):
struct Msg : public BaseMsg // BaseMsg could contain the message code and the state common to all the different message frames
{
// some class state stuff whose layout must be contiguous
size_t len; // length of variable data
char buffer[0]; // here one could put data of variable size
};
In the past, I have used this style in order to manage message frames. For example, I could do:
Msg * msg = (Msg*) malloc(sizeof(Msg) + additional_length);
memcpy(msg->buffer, /* some src addr for additional data */, additional_length);
In this way, I could put the constant message state and some additional data, whose size is often variable, contiguously into one object of type Msg. Then I perform the send/receive once.
It is not a trivial technique, but in my modest experience it is more concise, clear and efficient than the alternatives (sending two separate messages or doing more casting tricks).
So my question is whether there are more efficient techniques, or whether there is already a design pattern or library that simplifies the solution of such problems.
Thanks in advance for your attention
If you are using C++, I cannot see why this would be inefficient:
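#include <cstddef>
#include <cstring>
class Msg
{
public:
    Msg(size_t size, const char *const data)
    {
        m_data = new char[sizeof(size) + size];
        // Be careful with endianness: the length is stored in host byte order
        std::memcpy(m_data, &size, sizeof(size));
        // The payload goes right after the length prefix
        std::memcpy(m_data + sizeof(size), data, size);
    }
    const char *
    data() const
    {
        return m_data + sizeof(size_t);
    }
    size_t
    length() const
    {
        size_t size;
        std::memcpy(&size, m_data, sizeof(size));
        return size;
    }
    ~Msg()
    {
        delete[] m_data;
    }
private:
    // Copying would delete m_data twice, so it is disabled
    Msg(const Msg &);
    Msg &operator=(const Msg &);
    char *m_data;
};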
You could even add send() and receive() methods. If there is any good reason why this is bad, I would love to know.
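On the question of existing alternatives, one commonly used option is to keep the whole frame in a single std::vector<char>. That gives contiguous storage, much like the buffer[0] trick, while copying and destruction are handled automatically. A minimal sketch, not a definitive design: the class name Frame and its members are hypothetical, and the length prefix is stored in host byte order, so as written it only suits machines that agree on the size of size_t and on endianness.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical frame: [payload length (size_t, host byte order)][payload bytes]
class Frame {
public:
    Frame(const char* payload, std::size_t size)
        : bytes_(sizeof(std::size_t) + size) {
        std::memcpy(bytes_.data(), &size, sizeof(size));
        if (size > 0) std::memcpy(bytes_.data() + sizeof(size), payload, size);
    }
    // Whole frame (prefix plus payload), ready for a single send call.
    const char* raw() const { return bytes_.data(); }
    std::size_t raw_size() const { return bytes_.size(); }
    // Payload only.
    const char* data() const { return bytes_.data() + sizeof(std::size_t); }
    std::size_t length() const {
        std::size_t size;
        std::memcpy(&size, bytes_.data(), sizeof(size));
        return size;
    }
private:
    std::vector<char> bytes_;  // owns the storage; copy and move come for free
};
```

Something like Frame f("hello", 5); followed by one send of f.raw() and f.raw_size() bytes would then transmit the prefix and the payload together.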
Consider a typical function that fills in a buffer:
const char* fillMyBuffer( char* buf, int size );
Suppose this function fills the buffer with some useful data, that I want to use almost immediately after the call, and then I want to get rid of the buffer.
An efficient way of doing this is to allocate on the stack:
doStuff();
{
char myBuf[BUF_LEN];
const char* pBuf = fillMyBuffer( myBuf, BUF_LEN );
processBuffer( pBuf );
}
doOtherStuff();
So this is great for my library because the buffer is allocated on the stack, so it costs essentially nothing to allocate, use and discard. It lasts for the entire scope of the containing braces.
But I have a library where I do this pattern all the time. I'd like to automate this a little. Ideally I'd like code that looks like this:
doStuff();
{
// tricky - the returned buffer lasts the entire scope of the braces.
const char* pBuf = fillMyBufferLocal();
processBuffer( pBuf );
}
doOtherStuff();
But how to achieve this?
I did the following, which seems to work, but I know it is counter to the standard:
class localBuf
{
public:
operator char* () { return &mBuf[0]; }
char mBuf[BUF_LEN];
};
#define fillMyBufferLocal() fillMyBuffer( localBuf(), BUF_LEN );
As a practical matter, the buffer lasts on the stack during the entire lifetime of the containing braces. But the standard only requires the temporary object to last until the function returns (strictly, until the end of the full expression containing the call). I.e., technically it's just as unsafe as if I'd allocated the buffer on the stack inside the function.
Is there a safe way to achieve this?
I would generally recommend your original solution. It separates the allocation of the buffer from filling it. However, if you want to implement this fillMyBufferLocal alternative, it will have to dynamically allocate the buffer and return a pointer to it. Of course, if you return a raw pointer to dynamically allocated memory, it's very unclear that the memory should later be destroyed. Instead, return a smart pointer that encapsulates the appropriate ownership:
std::unique_ptr<char[]> fillMyBufferLocal()
{
std::unique_ptr<char[]> buffer(new char[BUF_LEN]);
// Fill it
return buffer;
}
Then you can use it like so:
auto buffer = fillMyBufferLocal();
processBuffer(buffer.get());
I do not think you should want to do this. It just makes the code harder to understand.
Automatic storage duration means that when an object goes out of scope, it is destroyed. Here you want to trick the system into something that behaves like creating an object with automatic storage duration (i.e. allocates on the stack), but without respecting the corresponding rules (i.e. without being destroyed when returning from fillMyBuffer()).
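The lifetime rule behind this can be observed directly. In the sketch below (the names Probe, events, first_byte and demo are made up for the illustration), a temporary passed to a function is destroyed at the end of the full expression containing the call, before the next statement runs, which is why a pointer into it must not be kept.

```cpp
#include <string>
#include <vector>

// Records the order of events so it can be inspected afterwards.
static std::vector<std::string> events;

// Stand-in for the localBuf class from the question.
struct Probe {
    char buf[16];
    Probe() { buf[0] = 'x'; events.push_back("constructed"); }
    ~Probe() { events.push_back("destroyed"); }
    operator char*() { return &buf[0]; }
};

// Stand-in for fillMyBuffer: returns a pointer into the caller's buffer.
inline const char* first_byte(char* buf) {
    events.push_back("function ran");
    return buf;
}

inline void demo() {
    const char* p = first_byte(Probe());  // the temporary dies at the ';'
    events.push_back("next statement");   // p is already dangling here
    (void)p;                              // it must never be dereferenced
}
```

After demo() runs, events holds "constructed", "function ran", "destroyed", "next statement", in that order.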
The closest, meaningful thing you can do in my opinion is to use a global buffer that fillMyBuffer() can reuse, or let that buffer be a static variable inside fillMyBuffer(). For instance:
template<int BUF_LEN = 255>
const char* fill_my_buffer()
{
static char myBuf[BUF_LEN];
// Fill...
return myBuf;
}
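One consequence of the static buffer is worth spelling out: every call to the same instantiation returns the same storage, so a second call overwrites what the first one returned, and the function is not safe to call from several threads at once. A small sketch, using a hypothetical fill_with function in place of the elided fill step:

```cpp
#include <cstring>

// Hypothetical variant of fill_my_buffer that takes the text to store.
template<int BUF_LEN = 255>
const char* fill_with(const char* text)
{
    static char myBuf[BUF_LEN];             // one buffer per instantiation
    std::strncpy(myBuf, text, BUF_LEN - 1); // truncates text that is too long
    myBuf[BUF_LEN - 1] = '\0';
    return myBuf;
}

inline bool shares_storage()
{
    const char* first = fill_with<>("first");
    const char* second = fill_with<>("second");
    // Same address: the text "first" has been overwritten by "second".
    return first == second && std::strcmp(first, "second") == 0;
}
```

Instantiations with different sizes, such as fill_with<8> and fill_with<16>, each get their own buffer, so the sharing only applies per size.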
However, I strongly suggest reconsidering your requirements, and either:
Keep using the solution you are currently adopting (i.e. transparently allocate on the stack); or
Allocate the buffer dynamically inside fillMyBuffer() and return a RAII wrapper (like a unique_ptr) to this dynamically allocated buffer.
UPDATE:
As a last, desperate attempt, you could define a macro that does the allocation and the invocation of fillMyBuffer() for you:
#define PREPARE_BUFFER(B, S) \
char B##_storage[S]; \
const char* B = fillMyBuffer(B##_storage, S)
You would then use it this way:
PREPARE_BUFFER(pBuf, 256);
processBuffer(pBuf);
You could write a class that contains a stack-based buffer and converts to char const *, e.g.
#include <array>
void processBuffer(char const * buffer);
char const * fillMyBuffer(char * buffer, int size);
int const BUF_LEN = 123;
class Wrapper
{
public:
    // Fills the member buffer once, at construction
    Wrapper(char const * (*fill)(char *, int))
    {
        fill(&m_buffer[0], static_cast<int>(m_buffer.size()));
    }
    operator char const * () const { return &m_buffer[0]; }
private:
    std::array<char, BUF_LEN> m_buffer;
};
void foo()
{
    Wrapper wrapper(fillMyBuffer); // the buffer lives as long as wrapper does
    processBuffer(wrapper);
}
I have a packet struct which has a variable-length string, for example:
BYTE StringLen;
String MyString; // String is not a real type, just trying to represent a string of unknown size
My question is how I can implement this packet as a struct without knowing the size of its members (in this case strings). Here is an example of how I want it to "look":
void ProcessPacket (PacketStruct* packet)
{
pointer = &packet.MyString;
}
I think it's not possible, since the compiler doesn't know the size of the string until run time. So how can I make it look high-level and comprehensible?
The reason I need structs is to document every packet, so that the user doesn't actually have to look at any of the functions that analyze the packet.
So I can summarize the question as: is there a way to declare a struct with members of undefined size, or something close to a struct?
I would recommend a shell class that just interprets the packet data.
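#include <string>
struct StringPacket {
    char *data_;
    StringPacket (char *data) : data_(data) {}
    // The first byte of the packet holds the string length
    unsigned char len () const { return *data_; }
    // The string itself starts right after the length byte
    std::string str () const { return std::string(data_+1, len()); }
};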
As mentioned in comments, you wanted a way to treat a variable-sized packet like a struct. The old C way to do that was to create a struct that looked like this:
struct StringPacketC {
unsigned char len_;
char str_[1]; /* Modern C allows char str_[]; but C++ doesn't */
};
And then, cast the data (remember, this is C code):
struct StringPacketC *strpack = (struct StringPacketC *)packet;
However, this enters undefined behavior, since to access the full range of data in strpack you would have to read beyond the 1-byte array boundary defined in the struct. Even so, it is a commonly used technique in C.
In C++, you don't have to resort to such a hack, because you can define accessor methods to treat the variable-length data appropriately.
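When the bytes come from the network, such an accessor would normally also check that the length byte does not claim more data than was actually received. A minimal sketch of that check: the function name read_string_packet and its parameters are hypothetical, and the layout (one length byte followed by that many characters) is the one from the question.

```cpp
#include <cstddef>
#include <string>

// Parses [len: 1 byte][len characters] from a received buffer.
// Returns false, leaving out untouched, if the buffer is too short.
inline bool read_string_packet(const unsigned char* packet, std::size_t received,
                               std::string& out)
{
    if (packet == nullptr || received < 1) return false;
    const std::size_t len = packet[0];
    if (received - 1 < len) return false;  // length byte exceeds available data
    out.assign(reinterpret_cast<const char*>(packet + 1), len);
    return true;
}
```

Given the six bytes {5, 'H', 'e', 'l', 'l', 'o'} it stores "Hello" in out, while a buffer such as {9, 'H', 'i'} is rejected.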
You can copy the string into a high-level std::string (at least, if my guess that String is a typedef for const char* is correct):
void ProcessPacket( const PacketStruct& packet )
{
std::string highLevelString( packet.MyString,
static_cast< size_t >( packet.StringLen ) );
...
}
A simple variant according to your posting would be:
struct PacketStruct {
std::string MyString;
size_t length () const { return MyString.length(); }
const char* operator & () const { return MyString.c_str(); }
};
This can be used (almost) as you desired above:
void ProcessPacket (const PacketStruct& packet)
{
const char * pointer = &packet;
size_t length = packet.length();
std::cout << pointer << '\t' << length << std::endl;
}
and should be invoked like:
int main()
{
PacketStruct p;
p.MyString ="Hello";
ProcessPacket(p);
}
If we have a POD struct say A, and I do this:
char* ptr = reinterpret_cast<char*>(A);
char buf[20];
for (int i =0;i<20; ++i)
buf[i] = ptr[i];
network_send(buf,..);
If the receiving end, a remote box, is not necessarily the same hardware or OS, can I safely do this to 'unserialize':
void onRecieve(..char* buf,..) {
A* result = reinterpret_cast<A*>(buf); // given same bytes in same order from the sending end
Will the 'result' always be valid? The C++ standard states that with POD structures, the result of reinterpret_cast should point to the first member, but does that mean the actual byte order will also be correct, even if the receiving end is a different platform?
No, you cannot. You can only ever cast "down" to char*, never back to an object pointer:
Source                          Destination
    \                             /
     \                           /
      V                         V
 read as char* ---> write as if to char*
In code:
Foo Source;
Foo Destination;
char buf[sizeof(Foo)];
// Serialize:
char const * ps = reinterpret_cast<char const *>(&Source);
std::copy(ps, ps + sizeof(Foo), buf);
// Deserialize:
char * pd = reinterpret_cast<char *>(&Destination);
std::copy(buf, buf + sizeof(Foo), pd);
In a nutshell: If you want an object, you have to have an object. You cannot just pretend a random memory location is an object if it really isn't (i.e. if it isn't the address of an actual object of the desired type).
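On the byte-order part of the question: copying the raw bytes of a struct only round-trips between machines that agree on endianness, field sizes and padding. A common portable alternative is to write each field in a fixed byte order. A minimal sketch, assuming a hypothetical struct Sample with one 16-bit and one 32-bit field and a big-endian wire format (encode, decode and kWireSize are made-up names as well):

```cpp
#include <cstddef>
#include <cstdint>

struct Sample {            // hypothetical message, not from the question
    std::uint16_t id;
    std::uint32_t value;
};

const std::size_t kWireSize = 6;  // 2 + 4 bytes, no padding on the wire

// Writes s into out[0..5], most significant byte first.
inline void encode(const Sample& s, unsigned char* out) {
    out[0] = static_cast<unsigned char>(s.id >> 8);
    out[1] = static_cast<unsigned char>(s.id & 0xFF);
    for (int i = 0; i < 4; ++i)
        out[2 + i] = static_cast<unsigned char>((s.value >> (24 - 8 * i)) & 0xFF);
}

// Rebuilds a Sample from in[0..5]; the result does not depend on host endianness.
inline Sample decode(const unsigned char* in) {
    Sample s;
    s.id = static_cast<std::uint16_t>((in[0] << 8) | in[1]);
    s.value = 0;
    for (int i = 0; i < 4; ++i)
        s.value = (s.value << 8) | in[2 + i];
    return s;
}
```

Because the shifts operate on values rather than on memory, the same code produces the same six bytes on little-endian and big-endian hosts, and the padding the compiler may insert into Sample never reaches the wire.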
You may consider using a template for this and letting the compiler handle it for you:
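#include <algorithm>
#include <cstring>
typedef unsigned char byte;
// True when the host stores the most significant byte first
inline bool is_big_endian() {
    const unsigned short probe = 1;
    return *reinterpret_cast<const unsigned char*>(&probe) == 0;
}
template<typename T>
struct base_type {
    union {
        T scalar;
        byte bytes[sizeof(T)];
    };
    // Writes val to dest in little-endian order, whatever the host order is
    void serialize(T val, byte* dest) {
        scalar = val;
        // Reading bytes after writing scalar is type punning through a union: some compilers
        // (GCC, for one) document it as supported, but the C++ standard does not guarantee it
        if (is_big_endian()) { std::reverse_copy(bytes, bytes + sizeof(T), dest); } // swap bytes and write
        else { std::memcpy(dest, bytes, sizeof(T)); } // just write
    }
};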