I'm parsing an XML file in C++ (using libxml) and, from the extracted info, building a data structure on the fly, modifying it as further info comes out of the parsed file. Now I need to save the data structure as-is to secondary memory, and later retrieve it back so that I can continue working without having to build it all over again. Can someone please help me out on how to do this?
I suggest using a library like Boost.Serialization for this.
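For example, with Boost.Serialization you give each type a serialize() function and the library walks nested containers for you. A minimal sketch (Node here is a made-up stand-in for whatever you build from the XML):

#include <fstream>
#include <string>
#include <vector>
#include <boost/archive/binary_iarchive.hpp>
#include <boost/archive/binary_oarchive.hpp>
#include <boost/serialization/string.hpp>
#include <boost/serialization/vector.hpp>

struct Node {
    std::string name;
    std::vector<Node> children;   // nested containers are handled automatically

    template <class Archive>
    void serialize(Archive& ar, unsigned /*version*/) {
        ar & name & children;
    }
};

void save(const Node& root, const char* path) {
    std::ofstream ofs(path, std::ios::binary);
    boost::archive::binary_oarchive oa(ofs);
    oa << root;   // writes the whole tree in one shot
}

void load(Node& root, const char* path) {
    std::ifstream ifs(path, std::ios::binary);
    boost::archive::binary_iarchive ia(ifs);
    ia >> root;   // rebuilds the tree from disk
}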
I assume that 'secondary memory' is a hard drive or something. In that case, use fwrite and fread, or in C++ overload << and >> if you want. How you do this depends on your data structure; if it has members that are pointers, things get a bit more complicated.
There's not enough info in your question to really help you.
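To make the stream-operator idea concrete anyway, here is roughly what it looks like for a pointer-free type (a sketch; Point is a made-up example):

#include <istream>
#include <ostream>

struct Point { int x, y; };

// write each member's bytes individually, never the whole struct at once
std::ostream& operator<<(std::ostream& os, const Point& p) {
    os.write(reinterpret_cast<const char*>(&p.x), sizeof p.x);
    os.write(reinterpret_cast<const char*>(&p.y), sizeof p.y);
    return os;
}

std::istream& operator>>(std::istream& is, Point& p) {
    is.read(reinterpret_cast<char*>(&p.x), sizeof p.x);
    is.read(reinterpret_cast<char*>(&p.y), sizeof p.y);
    return is;
}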
I am creating a large map in MessagePack using C++. I need multiple languages to be able to access the data.
How would I store this as a string in Redis? Is there an idiomatic way to put this in memory, or should I use the following?
#include <msgpack.hpp>
#include <redox.hpp>

msgpack::sbuffer buffer2;
msgpack::packer<msgpack::sbuffer> pk2(&buffer2);
pk2.pack_map(2);
pk2.pack(std::string("x"));
pk2.pack(3);
pk2.pack(std::string("y"));
pk2.pack(3.4321);

redox::Redox rdx;
rdx.connect();
// pass an explicit length: the packed data is binary and may contain
// embedded '\0' bytes that would truncate a plain C string
rdx.command<int>({"rpush", "key_name",
                  std::string(buffer2.data(), buffer2.size())});
What's sensible depends on what you're trying to achieve. You didn't explain why you're using the Redis List data structure to store your msgpack data, so unless there's some unspecified reason to do so, I'd go with simple Strings.
Also, the example provided does not make sense IMO because you're not providing the key name to rpush into.
Edit: thanks for correcting the snippet
Lastly, if you're using msgpack for your data, you can do really interesting things with Lua scripting as Redis provides the cmsgpack library to manipulate packed messages.
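Coming back to the Strings suggestion, a minimal sketch reusing buffer2 from the question (this assumes Redox's commandSync API as shown in its README):

redox::Redox rdx;
if (rdx.connect()) {
    // SET stores the whole packed map under one key; the explicit
    // length keeps embedded '\0' bytes in the binary data intact
    auto& c = rdx.commandSync<std::string>(
        {"SET", "key_name", std::string(buffer2.data(), buffer2.size())});
    if (c.ok()) { /* value stored */ }
    c.free();
}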
I'm using a binary file to recover an object using boost::archive::binary_iarchive, but the file is too heavy (18 GB) and loading the object pulls the entire file into memory. Is there a way to read the file in parts (a lazy load) to avoid the memory use?
What I have:
std::ifstream ifs(filename, std::ios::binary);   // binary archives need a binary-mode stream
boost::archive::binary_iarchive ia(ifs);
MyObject obj;
ia >> obj;
Upgrading my comment to an answer:
#cmaster got really close to an approach that can work, but he accidentally put the problem upside down.
The raw file was never the issue (it was streaming all along).
The problem is that deserialization tries to put all the data in memory (into that vector, e.g.). So the only real solutions are to either:
put this data into a (shared?) memory map; you can use the allocators from Boost Interprocess to help you achieve this (a sketch follows below). This is a lot of effort, but relatively straightforward, conceptually.
or modify the deserialization code to convert to a different on-disk format on the fly (instead of inserting into e.g. that vector), which would then allow mmap as cmaster suggested.
In other words, you'd "cannibalize" the Boost Serialization implementation to migrate the data away from Boost Serialization towards a raw binary format that affords using it directly in mapped memory.
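A minimal sketch of the memory-map option (the file name, the "points" tag, and the double payload are all made up; your element type goes where double is):

#include <boost/interprocess/managed_mapped_file.hpp>
#include <boost/interprocess/allocators/allocator.hpp>
#include <boost/interprocess/containers/vector.hpp>

namespace bip = boost::interprocess;

using MappedAlloc = bip::allocator<double, bip::managed_mapped_file::segment_manager>;
using MappedVec   = bip::vector<double, MappedAlloc>;

int main() {
    // open or create a 1 GiB on-disk segment; the OS pages it in lazily
    bip::managed_mapped_file seg(bip::open_or_create, "data.bin", 1ull << 30);

    MappedVec* v = seg.find_or_construct<MappedVec>("points")(
        seg.get_segment_manager());

    v->push_back(3.14);   // lives in the mapped file, not on the heap
}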
You can use mmap() to map the file into your address space. With that, it doesn't matter that the file is too large because the kernel knows that any data in the mapped region is just a copy of the file on the hard disk. Consequently, it does not even need to swap the data out when it needs the memory for something else. The kernel will just lazily load the parts of the file that you need as you touch them, which is especially good if you don't need everything in the file.
The nice thing about mmap() is that you have the entire file contents accessible as a huge char array, which is quite convenient for many use cases. The only precondition that must be met is that your process runs as a 64 bit process, otherwise your virtual address space will be too small to fit the file into it.
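A short sketch of that approach ("huge.bin" is a placeholder; this is POSIX, with error handling trimmed):

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main() {
    int fd = open("huge.bin", O_RDONLY);
    if (fd < 0) return 1;

    struct stat st;
    fstat(fd, &st);

    // map the file read-only; pages fault in lazily as they are touched
    void* p = mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) return 1;

    const char* data = static_cast<const char*>(p);
    char first = data[0];   // touches (and loads) only the first page
    (void)first;

    munmap(p, st.st_size);
    close(fd);
}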
I have an application that takes a handle and performs some tasks. The handle is currently being created with CreateFile. My problem is that CreateFile takes a filepath as one of its arguments. I am looking for a way to get a handle for a byte array, because the data I need to process is not on disk. Does anyone know of any functions that take a byte array and return a handle, or how I would go about doing this?
You have a few choices:
re-design your processing logic to read data from a memory pointer instead of a HANDLE, then you can pass your byte array as-is to your processing logic. In case you also need to process a file, you can read the file data into a byte array, then process it accordingly.
re-design your processing logic to read data from an IStream interface, then you can use SHCreateStreamOnFileEx() and SHCreateMemStream(), like Jonathan Potter suggested.
if you must read data from a HANDLE using ReadFile() or a related function, you can either:
a. write your byte array to a temp file, then read back from that file.
b. create an anonymous pipe using CreatePipe(), then write the byte array to one end and read the data from the other end, like Harry Johnston suggested.
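A sketch of option 3b (Process() is a hypothetical stand-in for your existing HANDLE-based logic):

#include <windows.h>

void Process(HANDLE h);   // hypothetical: consumes the handle via ReadFile()

bool ProcessBytes(const BYTE* data, DWORD size) {
    HANDLE hRead = NULL, hWrite = NULL;
    if (!CreatePipe(&hRead, &hWrite, NULL, 0))
        return false;

    // fine for small buffers; a large one needs a writer thread,
    // since WriteFile() blocks once the pipe's internal buffer fills up
    DWORD written = 0;
    WriteFile(hWrite, data, size, &written, NULL);
    CloseHandle(hWrite);   // closing the write end signals end-of-data

    Process(hRead);        // reads until ReadFile() reports EOF
    CloseHandle(hRead);
    return true;
}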
Using CreateFile() with the FILE_ATTRIBUTE_TEMPORARY attribute allows the operating system to keep the file in memory. You still have a copy happening as you have to write your memory buffer to the file, and then read that data back from that file, but if you have enough cache memory, nothing will hit the hard drive.
See here for more details:
CREATEFILE2_EXTENDED_PARAMETERS structure | Caching Behavior
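For illustration, a sketch of that (the path is a placeholder; FILE_FLAG_DELETE_ON_CLOSE makes the file vanish when the handle closes):

#include <windows.h>

HANDLE MakeCachedTempFile(const BYTE* data, DWORD size) {
    HANDLE h = CreateFileW(L"C:\\Temp\\scratch.bin",
                           GENERIC_READ | GENERIC_WRITE,
                           0, NULL, CREATE_ALWAYS,
                           FILE_ATTRIBUTE_TEMPORARY | FILE_FLAG_DELETE_ON_CLOSE,
                           NULL);
    if (h == INVALID_HANDLE_VALUE)
        return h;

    DWORD written = 0;
    WriteFile(h, data, size, &written, NULL);
    SetFilePointer(h, 0, NULL, FILE_BEGIN);   // rewind so the consumer starts at offset 0
    return h;
}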
You could possibly also use file mapping, where the data written to the file is forced to stay in memory, but that's a lot more complicated for probably no gain, as it would quite possibly be slower overall.
I have a file in binary format having a large amount of data.
If I have knowledge of the file structure, how do I read information from the binary file, and populate a record of these structures?
The data is complex.
I'd like to do it with Qt, but plain C++ would be fine as well, if required.
Thanks for your help.
If the binary file is really large, it's better to load it into a (char *) array, if enough RAM is available, using the low-level read functions (http://crasseux.com/books/ctutorial/Reading-files-at-a-low-level.html), and then parse it from there.
But this will only help you load large files, not parse complex structures.
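A sketch of that approach (POSIX calls; for files over ~2 GB the read would need to be looped):

#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>
#include <vector>

std::vector<char> slurp(const char* path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return {};

    struct stat st;
    fstat(fd, &st);

    std::vector<char> buf(st.st_size);
    ssize_t n = read(fd, buf.data(), buf.size());
    close(fd);

    if (n != static_cast<ssize_t>(buf.size()))
        buf.clear();   // short read: signal failure with an empty buffer
    return buf;
}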
Not sure, but you could also take a look at yacc.
It doesn't sound like yacc will be a solution here; he isn't trying to parse text, he wants to read binary-formatted data into a data structure.
You can read the data in and then map it to a struct that matches the format. If the data is complex, you may need to lay structs over it in a variety of ways, depending on how the data layout works. So basically: read the file into a char buffer, find the offset where your struct starts, interpret the bytes at that offset as your struct, and then access the members. Without more detail it's impossible to be more specific than this.
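A sketch of the overlay idea (RecordHeader is made up; your struct must match the file's actual layout, packing, and byte order):

#include <cstdint>
#include <cstring>

#pragma pack(push, 1)   // stop the compiler from padding between members
struct RecordHeader {
    std::uint32_t id;
    std::uint16_t type;
    std::uint16_t length;
};
#pragma pack(pop)

RecordHeader readHeader(const char* buf, std::size_t offset) {
    RecordHeader h;
    // memcpy instead of a raw pointer cast avoids alignment problems
    std::memcpy(&h, buf + offset, sizeof h);
    return h;
}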
http://courses.cs.vt.edu/~cs2604/fall00/binio.html should be helpful for you; I've learned from there. (Hint: always cast your data as (char *).)
I am a beginning C++ student. I have a structure array that holds employee info.
I can put values into the structure, write those values to a binary dat file and
read the values back into the program so that it can be displayed to the console.
Here is my problem: once I close the program, I can't read the data from the file back into memory on the next run - instead it reads "garbage."
I tried a few things and then read this in my book:
NOTE: Structures containing pointers cannot be correctly stored to
disk using the techniques of this section. This is because if the
structure is read into memory on a subsequent run of the program, it
cannot be guaranteed that all program variables will be at the same
memory locations.
I am pretty sure this is what is going on when I try to open a .dat file with previously stored information and try to read it into a structure array.
I can send my code examples if that would help clarify my question.
Any suggestions would be appreciated.
Speaking generally (since I don't have your code), there are two reasons you shouldn't just write the raw bytes of a struct or class to a file:
As your book mentioned, writing a pointer to disk is pointless, since you're just storing a random address and not the data at that address. You need to write out the data itself.
You especially should not attempt to write a struct/class all at once with something like
fwrite(&myStruct, sizeof(myStruct), 1, file). Compilers sometimes put empty bytes between variables in structs in order to let the processor read them faster - this is called padding. A different compiler, or compiling for a different computer architecture, can pad structures differently, so a file that opens correctly on one computer might not open correctly on another.
There's lots of ways you can write out data to a file, be it in a binary format or in some human-readable format like XML. Regardless of what method you choose to use (each has strengths and weaknesses), every one of them involves writing each piece of data you care to save one by one, then reading them back in one by one. Higher level languages like Java or C# have ways to do this automagically by storing metadata about objects, but this comes at the cost of more memory usage and slower program execution.
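For instance, a sketch of the member-by-member approach (Employee here is a made-up example; the string is stored as a length followed by its characters, never as a pointer):

#include <cstdint>
#include <fstream>
#include <string>

struct Employee {
    std::string name;
    std::int32_t id;
};

void save(std::ofstream& out, const Employee& e) {
    std::uint32_t len = static_cast<std::uint32_t>(e.name.size());
    out.write(reinterpret_cast<const char*>(&len), sizeof len);
    out.write(e.name.data(), len);   // the actual characters, not the pointer
    out.write(reinterpret_cast<const char*>(&e.id), sizeof e.id);
}

void load(std::ifstream& in, Employee& e) {
    std::uint32_t len = 0;
    in.read(reinterpret_cast<char*>(&len), sizeof len);
    e.name.resize(len);
    in.read(&e.name[0], len);
    in.read(reinterpret_cast<char*>(&e.id), sizeof e.id);
}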