boost property tree performance - c++

I am planning to use Boost property tree for our application: http://www.boost.org/doc/libs/1_41_0/doc/html/property_tree.html. Every time we call pt.get("debug.level", 0);, does it read the whole file again, or is the value served from an internal cache? Are there any performance evaluation results for this library? Does it read the whole file into memory and serve the data from there? Can anybody share their experience using this library?

The library works well. You load the file into memory, operate on the property tree (query, update, whatever), and then write it out again when you finish.
We have used it for some JSON files large enough to run out of address space when loading them on a 32 bit machine using a boost::property_tree with std::string. Replacing std::string with a caching string class worked fine.
For most applications where you're really just looking at configuration files it will be fine.

Related

QT Applications - Replacing embedded resources

Is it possible to replace embedded resources [e.g. styles, images, text] in a Linux [ELF] binary?
I noticed that I can change text but if I type more text or if I remove text, then the segmentation faults start coming up. I have not gone through the ELF spec yet but I am wondering if it is possible.
I managed to extract the images from the binary using the mediaextract project, but I need to do just the opposite without breaking the binary structure.
This answer is specific for Qt's resource system (.qrc, rcc).
From the docs:
Currently, Qt always stores the data directly in the executable, even on Windows, macOS, and iOS, where the operating system provides native support for resources. This might change in a future Qt release.
So yes, the Qt resources are contained in the binary.
rcc'ing a .qrc file yields a .cpp file containing (mainly) simple char arrays which represent resource data, the resource names and some other metadata.
Compiling such a .cpp file creates byte fields in the binary.
You can alter such resources within a binary, but only in very limited ways.
For starters, if the binary contains any kind of self-check (like hashing the data section and comparing it to some pre-calculated hash), you will not be able to change the data in a reasonable way.
If your data doesn't have the same byte length as the original data, you can't simply replace it because it would alter the internal layout of the binary and invalidate relative addresses.
In case of replacing with shorter strings you might get away with zero-padding at the end.
Resources are compressed by default (in the ZIP format). It is possible to turn off compression.
If compression was turned on during compilation (which you don't control, as it seems), you'd need to create new data which compresses to the same length as the original.
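The same-length constraint can be sketched as follows. This is only an illustration under stated assumptions: the binary has been read into a byte buffer, the resource is stored uncompressed, and there is no self-check. The function name patch_in_place is hypothetical and not part of Qt or any tool.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Overwrite the first occurrence of old_bytes in image with new_bytes,
// zero-padding up to the original length so that no offset in the image moves.
// Returns false if old_bytes is not found or new_bytes is longer than old_bytes.
bool patch_in_place(std::vector<char>& image,
                    const std::string& old_bytes,
                    const std::string& new_bytes) {
    if (old_bytes.empty() || new_bytes.size() > old_bytes.size()) return false;
    const std::size_t size_before = image.size();

    auto it = std::search(image.begin(), image.end(),
                          old_bytes.begin(), old_bytes.end());
    if (it == image.end()) return false;

    // Copy the replacement, then fill the rest of the old span with zeros.
    auto pad_begin = std::copy(new_bytes.begin(), new_bytes.end(), it);
    std::fill(pad_begin, it + static_cast<std::ptrdiff_t>(old_bytes.size()), '\0');

    // The layout must be untouched: same total size as before.
    assert(image.size() == size_before);
    (void)size_before;
    return true;
}
```

A longer replacement is rejected rather than inserted, because growing the span would shift everything after it.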

Modifying executable upon download (Like Ninite)

I'm currently developing an application (Windows) that needs internal modifications upon download time.
Also, I'm delivering it using a Linux host, so, can't compile on demand as proposed.
How does Ninite deal with it?
In Ninite.com, each time you select different options, you get the same .exe, however, with minor modifications inside.
Option 1
Compile the program with predefined data (in Windows).
Use PHP to fseek the file and replace my custom strings.
Option 2
Append the original .EXE with a different resource file
Other?
Has someone developed something like this? What would be the best approach?
Thank you.
You can just append data to the back of your original executable. The Windows PE file format is robust enough that this does not invalidate the executable itself. (It will however invalidate any existing digital signatures.)
Finding the start of this data can be a challenge if its size isn't known up front. In that case, it may be necessary to append the variable-length data, and then append the data length (itself a fixed length field - 4 bytes should do). To read the extra data, read the last 4 bytes to get the data length. Get the file length, subtract 4 for the length field, then subtract the variable length to get the start of the data.
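A minimal sketch of that trailer scheme, assuming a 4-byte little-endian length field (the byte order is my assumption; the answer only fixes the field size). The names append_payload and read_payload are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <string>

// Append payload to the file, followed by its length as 4 little-endian bytes.
bool append_payload(const std::string& path, const std::string& payload) {
    assert(payload.size() <= 0xFFFFFFFFu && "payload must fit a 4-byte length");
    std::ofstream out(path, std::ios::binary | std::ios::app);
    if (!out) return false;
    out.write(payload.data(), static_cast<std::streamsize>(payload.size()));
    const std::uint32_t len = static_cast<std::uint32_t>(payload.size());
    for (int i = 0; i < 4; ++i) {
        out.put(static_cast<char>((len >> (8 * i)) & 0xFF));
    }
    return static_cast<bool>(out);
}

// Read the payload back: the last 4 bytes give the length, and the payload
// starts at file_length - 4 - length.
bool read_payload(const std::string& path, std::string& payload) {
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) return false;
    const std::streamoff file_len = in.tellg();
    if (file_len < 4) return false;

    in.seekg(file_len - 4, std::ios::beg);
    unsigned char raw[4];
    if (!in.read(reinterpret_cast<char*>(raw), 4)) return false;
    std::uint32_t len = 0;
    for (int i = 0; i < 4; ++i) {
        len |= static_cast<std::uint32_t>(raw[i]) << (8 * i);
    }
    if (static_cast<std::streamoff>(len) > file_len - 4) return false;

    in.seekg(file_len - 4 - static_cast<std::streamoff>(len), std::ios::beg);
    payload.resize(len);
    if (len > 0 && !in.read(&payload[0], static_cast<std::streamsize>(len))) {
        return false;
    }
    return true;
}
```

The download server would do the append step (in any language), and the application would call something like read_payload on its own executable path at startup.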
The most portable way could be to have a plugin (whose path is wired inside your main program) inside your application. That plugin would be modified (e.g. on Linux by generating C++ code gencod.cc, forking a g++ -Wall -shared -fPIC -O gencod.cc -o gencod.so compilation, then dlopen-ing the ./gencod.so) and your application could have something to generate the C++ source code of that plugin and to compile it.
I guess that the same might be doable on Windows (which I don't know). Probably the issue is to compile it (the compilation command would be different on Windows and on Linux). Beware that AFAIK on Windows a process cannot modify its own executable (but you should check).
Qt has a portable layer for plugins. See QPluginLoader & Qt Plugins HowTo
Alternatively, don't modify the application, but use some persistent file or data (at a well defined place, -whose location or filepath is wired in the executable- preferably in textual format like JSON, or maybe using sqlite, or a real database) keeping the changing information. Read also about application checkpointing.
If you need to implement your own application checkpointing, you had better design your application with this concern in mind from very early on. Study garbage collection algorithms (a checkpointing procedure is similar to a precise copying GC) and read more about continuations. See also this answer to a very similar question.

Is it a good idea to include a large text variable in compiled code?

I am writing a program that produces a formatted file for the user, but it's not only producing the formatted file, it does more.
I want to distribute a single binary to the end user and when the user runs the program, it will generate the xml file for the user with appropriate data.
In order to achieve this, I want to give the file contents to a char array variable that is compiled in code. When the user runs the program, I will write out the char file to generate an xml file for the user.
char* buffers = "a xml format file contents, \
this represent many block text \
from a file,...";
I have two questions.
Q1. Do you have any other ideas for how to compile my file contents into binary, i.e, distribute as one binary file.
Q2. Is this even a good idea as I described above?
What you describe is by far the norm for C/C++. For large amounts of text data, or for arbitrary binary data (or indeed any data you can store in a file, e.g. a zip file), you can write the data to a file and link it into your program directly.
An example may be found on sites like this one
I'd recommend using a separate file to hold the data rather than putting the data into the binary, unless you have your own reasons. I don't know other portable ways to put strings into a binary file, but your solution seems OK.
However, note that when you use \ at the end of a line to form multi-line strings, you have to take care of the indentation, because the lines are concatenated from the beginning of the next line:
char* buffers = "a xml format file contents, \
this represent many block text \
from a file,...";
Or you can use another form:
char *buffers =
"a xml format file contents,"
"this represent many block text"
"from a file,...";
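If a C++11 compiler is available (an assumption; neither form above needs one), a raw string literal is a third option that keeps line breaks and double quotes exactly as typed. A minimal sketch, with a made-up XML template and the hypothetical names kXmlTemplate and write_template:

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical embedded file contents. Everything between R"xml( and )xml"
// is taken literally, including newlines and the double quotes.
const char kXmlTemplate[] = R"xml(<?xml version="1.0"?>
<report>
  <title>example</title>
</report>
)xml";

// Write the embedded template out as a file for the user.
bool write_template(const std::string& path) {
    assert(!path.empty());
    std::ofstream out(path, std::ios::binary);
    if (!out) return false;
    out << kXmlTemplate;
    return static_cast<bool>(out);
}
```

Unlike the adjacent-literal form, nothing has to be escaped and no explicit \n is needed at the end of each line.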
My answer probably provides a lot of redundant information for the topic starter, but here is what I'm aware of:
Embedding in source code: the plain C/C++ solution is a bad idea, because each time you want to change your content you will need to:
recompile
relink
It can be acceptable only if your content changes very rarely or never, or if build time is not an issue (if your app is small).
Embedding in binary: a few somewhat more flexible solutions for embedding content in executables exist, but none of them is cross-platform (you haven't stated your target platform):
Windows: resource files. With most IDEs it is very simple
Linux: objcopy.
MacOS: Application Bundles. Even more simple than on Windows.
You will not need to recompile the C++ file(s), only re-link.
Application virtualization: there are special utilities that wrap all your application resources into a single executable, which then runs much as it would on a virtual machine.
I'm only aware of such utilities for Windows (ThinApp, BoxedApp), but there are probably such things for other OSes too, or even cross-platform ones.
Consider distributing your application in some form of installer: when the installer starts, it creates all the resources and unpacks the executable. This is similar to generating everything from the main executable. It can be a large and complex package or just a simple self-extracting archive.
Of course, the choice depends on what kind of application you are creating, who your target audience is, how you will ship the package to end users, etc. If it is a game and you are targeting children, it's not the same as a Unix console utility for C++ coders =)
It depends. If you are writing some small Unix-style utility with no prospect of internationalization, then it's probably fine. You don't want to bloat a distribution with a file no one would ever touch anyway.
But in general it is a bad practice, because eventually someone might want to modify this data and he or she would have to rebuild the whole thing just to fix a typo or anything.
The decision is really up to you.
If you just want to keep your distribution in one piece, you might also find this thread interesting: Store data in executable
Why don't you distribute your application with an additional configuration file? e.g. package your application executable and config file together.
If you do want to make it a single file, try embedding your config file into the executable as a resource.
I see it more of an OS than C/C++ issue. You can add the text to the resource part of your binary/program. In Windows programs HTML, graphics and even movie files are often compiled into resources that make part of the final binary.
That is handy for possible future translation into another language, plus you can modify resource part of the binary without recompiling the code.

Methods of storing application data/settings without the registry?

I need some methods of storing and getting data from a file (in WIN32 api c++ application, not MFC or .NET)
e.g. saving the x, y, width and height of the window when you close it, and loading the data when you open the window.
I have tried .ini files, with the functions WritePrivateProfileString and GetPrivateProfileString/GetPrivateProfileInt, but on MSDN it says
"This function is provided only for compatibility with 16-bit Windows-based applications. Applications should store initialization information in the registry."
and when I tried to read an .ini file on my Windows 7 64-bit machine, I got a blue screen! (in debug mode with Visual Studio) O.O
I notice that most other application use XML to store data, but I don't have a clue how to read/write xml data in c++, are there any libraries or windows functions which will allow me to use xml data?
Any other suggestions would be good too, thanks.
There is nothing wrong with .ini files; the only problem with them is where to write them. CIniFile from CodeProject is a good enough class. The .ini file should be placed in %APPDATA%/<Name Of Your Application> (or %LOCALAPPDATA%\<Same Name Here>, as described below).
EDIT: If we are talking about the Windows family of operating systems from Windows 2000 onward, then the function SHGetFolderPath is a portable way to retrieve the user-specific folder where application configuration files should be stored. To store data in the roaming folder, use CSIDL_APPDATA with SHGetFolderPath. To store data in the local folder, use CSIDL_LOCAL_APPDATA.
The difference between the local and roaming folders lies in the nature of the data to be stored. If the data is too large or machine-specific, store it in the local folder. Your data (coordinates and size of the window) is local in nature (on another machine you may have a different resolution), so you should actually use CSIDL_LOCAL_APPDATA.
Windows Vista and later have extended function SHGetKnownFolderPath with its own set of constants, but if you seek compatibility stick to the former SHGetFolderPath.
TinyXML is a popular and simple XML parser for C++.
Apart from that, you can really use any format you want to store your settings, though it's considered good practice to keep settings in text format so that they can be hand-edited if necessary.
It's fairly simple to write your own functions for reading/writing a file in INI or similar format. The format is entirely up to you, as long as it's easily comprehensible to humans. Some possibilities are:
; Comment
# Comment
Key = Value (standard INI format)
Key Value
Key: Value
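As a rough sketch of how little code such a hand-rolled format needs, the following handles only the "Key = Value" variant plus ';' and '#' comment lines. All function names here are hypothetical, and sections, quoting and escaping are deliberately left out.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// Strip leading and trailing spaces, tabs and carriage returns.
std::string trim(const std::string& s) {
    const char* ws = " \t\r";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Parse "Key = Value" lines; blank lines, comments and lines without '='
// are skipped.
std::map<std::string, std::string> parse_settings(std::istream& in) {
    std::map<std::string, std::string> settings;
    std::string line;
    while (std::getline(in, line)) {
        line = trim(line);
        if (line.empty() || line[0] == ';' || line[0] == '#') continue;
        const std::size_t eq = line.find('=');
        if (eq == std::string::npos) continue;
        settings[trim(line.substr(0, eq))] = trim(line.substr(eq + 1));
    }
    return settings;
}

// Convenience overload for text that is already in memory.
std::map<std::string, std::string> parse_settings_text(const std::string& text) {
    std::istringstream in(text);
    return parse_settings(in);
}

// Write the settings back, one "Key = Value" per line, sorted by key.
void write_settings(std::ostream& out,
                    const std::map<std::string, std::string>& settings) {
    for (const auto& kv : settings) {
        assert(kv.first.find('=') == std::string::npos);  // keys cannot hold '='
        out << kv.first << " = " << kv.second << '\n';
    }
}
```

Values come back as strings; for the window-geometry case they could be converted with std::stoi after loading.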
You could use Boost.PropertyTree for this.
Property trees are versatile data structures, but are particularly suited for holding configuration data. The tree provides its own, tree-specific interface, and each node is also an STL-compatible Sequence for its child nodes.
It supports serialization, and so is well-suited to managing and persisting changeable configuration data. There is an example here on how to load and save using the XML data format that this library supports.
The library uses RapidXML internally but hides the parsing and encoding details, which would save you some implementation time (XML libraries all have their idiosyncrasies), while still allowing you to use XML as the data representation on disk.
libxml2. I have seen quite a lot places where it is used. Easy to use and loads of examples to get you started and not a vast library as such. And in C, take it wherever you want.
pugixml is another good (and well documented) XML parser library. And if you care about portability, XML is a better option.
While INI files may not be the best format, I think you can safely ignore the warning MSDN puts on WritePrivateProfileString and GetPrivateProfileString.
Those two functions are NEVER going away. It would break THOUSANDS of applications.
That warning has been there for years and I suspect was added when the registry was all the rage and someone naively thought it would one day completely replace INI files.
I might be wrong but it would be very unlike Microsoft to break so many existing apps like this for no good reasons. (Not that they do not occasionally break backwards compatibility, but this would cause huge problems for zero benefit.)
Ohhh my GOD! Have you ever thought of a straightforward solution rather than the super-duper-all-can-do framework way?
Sorry...
You want to store two numbers between restarts???
Save: Open a file, write these two numbers, close the file:
std::ofstream out(file_name);
out << x << ' ' << y;
out.close();
Load: Open a file, read these two numbers, close the file:
std::ifstream in(file_name);
if(!in) return error...
in >> x >> y;
if(!in) return error...
in.close();
Libconfig is the best solution in C++ as far as I have tried.
Works multi platform with minimum coding.
You must try that!
I like the TinyXML solution suggested.
But for Windows, I like .ini even more.
So I'll suggest the inih library, free and open source on GitHub here. Very simple and easy to use - 1 header file library iirc.

Out of Core Implementation of a Quadtree

I am trying to build a quadtree data structure (or let's just say a tree) in secondary memory (hard disk).
I have a C++ program to do so and I use fopen to create the files. Also, I am using tesseral coding to store each cell in a file named with its corresponding code to store it on the disk in one directory.
The problem is that after creating about 1,100 files, fopen just returns NULL and stops creating new files. I can create further files manually in that directory, but using C++ it can not create any further files.
I know about max limit of inode on ext3 filesystem which is (from Wikipedia) 32,000 but mine is way less than that, also note that I can create files manually on the disk; just not through fopen.
Also, I would really appreciate any ideas regarding the best way to store a very dynamic quadtree on disk (I need the nodes to be in separate files, and the quadtree might have a depth of 50).
Using nested directories is one idea, but I think it will slow down the performance because of following the links on the filesystem to access the file.
Thanks,
Nima
What's the errno value of the failed fopen() call?
Do you keep the files you have created open? If yes you are most probably exceeding the maximum number of open files per process.
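Both points can be sketched in a few lines: report the errno text when fopen fails, and close each file before moving on to the next node. The name write_node is hypothetical, and the "Too many open files" wording mentioned in the comment is what EMFILE typically maps to on Linux, not a guaranteed string.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <string>

// Write one node to its own file and close the handle before returning, so
// the number of open descriptors stays constant however many nodes exist.
// On failure, error receives the errno text (for a descriptor leak this is
// typically "Too many open files").
bool write_node(const std::string& path, const std::string& data,
                std::string& error) {
    assert(!path.empty() && "node path must not be empty");
    errno = 0;
    std::FILE* f = std::fopen(path.c_str(), "wb");
    if (f == nullptr) {
        error = std::strerror(errno);
        return false;
    }
    const bool wrote_all =
        std::fwrite(data.data(), 1, data.size(), f) == data.size();
    const int write_errno = errno;          // fclose may overwrite errno
    const bool closed = std::fclose(f) == 0;
    if (!wrote_all) {
        error = std::strerror(write_errno);
        return false;
    }
    if (!closed) {
        error = std::strerror(errno);
        return false;
    }
    return true;
}
```

With this pattern, writing several thousand node files in a loop should not hit the per-process limit, since only one file is open at a time.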
When you use directories as data structures, you delegate the work of maintaining that structure to the file system, which is not necessarily designed to do that.
Edit: Frank is probably right that you've exceeded the number of available file descriptors. You can increase those, but that shows that you're also using internals of your ABI as a data structure. Slow and (as resources are exhausted) unstable.
Either code for a very specific OS installation, or use a SQL database.
I have no idea why fopen wouldn't work. Look at errno.
However, storing everything in one directory is a bad idea. When you add a lot of files, it will get slow. Having a directory for every level of the tree will also be slow.
Instead, combine multiple levels into one directory. You could, for example, have one directory for every four levels of the tree. This would limit the number of directories, amount of nesting, and number of files per directory, giving very good performance.
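One way to picture this grouping, assuming each node's tesseral code is a string with one digit (0 to 3) per tree level. That representation, the .node extension and the name node_path are all my assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Map a node code such as "0312200131" to a relative path in which every
// full group of levels_per_dir digits (except the last group) is a directory.
// With 4 levels per directory, a depth-50 tree nests at most 12 directories,
// and each directory holds at most a few hundred entries.
std::string node_path(const std::string& code, std::size_t levels_per_dir = 4) {
    assert(levels_per_dir > 0);
    if (code.empty()) return "root.node";

    std::string path;
    std::size_t pos = 0;
    // Emit a directory while more than one group's worth of digits remains.
    while (code.size() - pos > levels_per_dir) {
        path += code.substr(pos, levels_per_dir);
        path += '/';
        pos += levels_per_dir;
    }
    // The remaining 1..levels_per_dir digits name the file itself.
    path += code.substr(pos);
    path += ".node";
    return path;
}
```

For example, node_path("0312200131") gives "0312/2001/31.node", while the ancestor with code "0312" is stored as "0312.node" next to the "0312" directory.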
The limitation could come from:
stdio (C library): at most 256 handles. This can be increased to 1024 (in VC, call _setmaxstdio).
The OS kernel limit on file handles per process (usually 1024).