Fastest small datastore on Windows - c++

My app keeps track of the state of about 1000 objects. Those objects are read from and written to a persistent store (serialized) in no particular order.
Right now the app uses the registry to store each object's state. This is nice because:
It is simple
It is very fast
An individual object's state can be read/written without needing to read some larger entity (like pulling a snippet out of a large XML file)
There is a decent editor (RegEdit) which allows easy manipulation of individual items
Having said that, I'm wondering if there is a better way. SQLite seems like a possibility, but you don't get the same level of multiple-reader/multiple-writer support that you get with the registry, and there is no simple way to edit existing entries.
Any better suggestions? A bunch of flat files?

If what you mean by 'multiple-reader/multiple-writer' is that you have a lot of threads writing to the store concurrently, SQLite is threadsafe (you can have concurrent SELECTs, and concurrent writes are handled transparently). See the [FAQ][1] and grep for 'threadsafe'.
[1]: http://www.sqlite.org/faq.html "FAQ"

If you do begin to experiment with SQLite, you should know that "out of the box" it might not seem as fast as you would like, but it can quickly be made much faster by applying some established optimization tips:
SQLite optimization
Depending on the size of the data and the amount of RAM available, one of the best performance gains will come from setting SQLite to use an all-in-memory database rather than writing to disk.
For an in-memory database, pass ":memory:" as the filename argument to sqlite3_open (or pass an empty filename and make sure TEMP_STORE is defined appropriately so temporary databases are kept in memory).
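For illustration, here is a minimal sketch of opening an in-memory database with the C API (the `objects` table is just a placeholder schema, not anything from the question):

#include <sqlite3.h>
#include <cstdio>

int main() {
    sqlite3* db = nullptr;
    // ":memory:" asks SQLite for a database that lives entirely in RAM.
    if (sqlite3_open(":memory:", &db) != SQLITE_OK) {
        std::fprintf(stderr, "open failed: %s\n", sqlite3_errmsg(db));
        return 1;
    }
    char* err = nullptr;
    // Placeholder table: one row per object, state stored as a blob.
    sqlite3_exec(db, "CREATE TABLE objects(id INTEGER PRIMARY KEY, state BLOB);",
                 nullptr, nullptr, &err);
    if (err) { std::fprintf(stderr, "exec failed: %s\n", err); sqlite3_free(err); }
    sqlite3_close(db);
    return 0;
}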
On the other hand, if you tell SQLite to use the hard disk, you will get a benefit similar to your current use of RegEdit to manipulate the program's data "on the fly."
The way you could simulate your current RegEdit technique with SQLite would be to use the sqlite3 command-line tool to connect to the on-disk database. You can run UPDATE statements on the data from the command line while your main program is running (and/or while it is paused in break mode).

I doubt any sane person would go this route these days; however, some of what you describe could be done with Windows' Structured/Compound Storage. I only mention this since you're asking about Windows, and this is/was an official Windows way to do it.
This is how DOC files were put together (but not the new DOCX format). From MSDN it will appear really complicated, but I've used it, and it isn't the worst API in Win32. A rough sketch follows the list below.
It is not simple
It is fast; I would guess it's faster than the registry.
An individual object's state can be read/written without needing to read some larger entity.
There is no decent editor; however, there are some really basic tools (VC++ 6.0 had the "DocFile Viewer" under Tools; yeah, that's what that thing did). I found a few more online.
You get a file instead of registry keys.
You gain some old-school Windows developer geek-cred.
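For a rough idea of what the API looks like, here is a hedged sketch (the file and stream names are made up; a real program would open an existing storage with StgOpenStorage to update a single object rather than recreating the file each time):

#include <windows.h>
#include <objbase.h>   // StgCreateDocfile, IStorage, IStream (link with ole32.lib)

// Create a compound file and write one object's serialized state into a
// named stream inside it. Error handling is trimmed to the essentials.
bool WriteObjectState(const wchar_t* file, const wchar_t* streamName,
                      const void* data, ULONG size) {
    IStorage* storage = nullptr;
    HRESULT hr = StgCreateDocfile(file,
        STGM_CREATE | STGM_READWRITE | STGM_SHARE_EXCLUSIVE, 0, &storage);
    if (FAILED(hr)) return false;

    IStream* stream = nullptr;
    hr = storage->CreateStream(streamName,
        STGM_CREATE | STGM_READWRITE | STGM_SHARE_EXCLUSIVE, 0, 0, &stream);
    if (SUCCEEDED(hr)) {
        ULONG written = 0;
        hr = stream->Write(data, size, &written);
        stream->Release();
    }
    storage->Release();
    return SUCCEEDED(hr);
}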
Other random thoughts:
I think XML is the way to go (despite the random access issue). Heck, INI files may work. The registry gives you very fine-grained security if you need it; people seem to forget this when they claim files are better. An embedded DB seems like overkill if I'm understanding what you're doing.

Do you need to persist the objects on each change event, or can you keep them in memory and store them on shutdown? If the latter, just load them up at startup and serialize them at the end; assuming your app runs for a long time (and you don't share that state with another program), in-memory is going to be a winner.
If you've got fixed-size structures, then you could consider just using a memory-mapped file and allocating memory from that.
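A minimal Windows sketch of that idea, assuming a made-up fixed-size ObjectState record (error handling and unmapping are omitted):

#include <windows.h>

struct ObjectState { int id; double value; char name[32]; };  // hypothetical fixed-size record

// Map a file large enough for `count` records and return a pointer to them;
// writing through the pointer updates the file-backed pages.
ObjectState* MapStates(const wchar_t* path, size_t count,
                       HANDLE& file, HANDLE& mapping) {
    file = CreateFileW(path, GENERIC_READ | GENERIC_WRITE, FILE_SHARE_READ,
                       nullptr, OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (file == INVALID_HANDLE_VALUE) return nullptr;

    const DWORD bytes = static_cast<DWORD>(count * sizeof(ObjectState));
    mapping = CreateFileMappingW(file, nullptr, PAGE_READWRITE, 0, bytes, nullptr);
    if (!mapping) { CloseHandle(file); return nullptr; }

    return static_cast<ObjectState*>(
        MapViewOfFile(mapping, FILE_MAP_ALL_ACCESS, 0, 0, bytes));
}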

If the only thing you do is serialize/deserialize individual objects (no fancy queries), then use a btree database, for example Berkeley DB. It is very fast at storing and retrieving chunks of data by key (I assume your objects have some id that can be used as a key) and access by multiple processes is supported.
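A hedged sketch of what storing one object by ID can look like with the Berkeley DB C API (the file name and record layout are placeholders):

#include <db.h>        // Berkeley DB C API
#include <cstring>

// Store one serialized object under its numeric id; returns true on success.
bool put_object(DB* dbp, unsigned int id, const void* buf, unsigned int len) {
    DBT key, data;
    std::memset(&key, 0, sizeof(key));
    std::memset(&data, 0, sizeof(data));
    key.data  = &id;
    key.size  = sizeof(id);
    data.data = const_cast<void*>(buf);
    data.size = len;
    return dbp->put(dbp, nullptr, &key, &data, 0) == 0;
}

// Typical setup (error checks omitted):
//   DB* dbp = nullptr;
//   db_create(&dbp, nullptr, 0);
//   dbp->open(dbp, nullptr, "objects.db", nullptr, DB_BTREE, DB_CREATE, 0);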

Related

Cross-platform atomic writes/renames without a transactional FS in C++

I'm working on an app that needs to ensure the consistency of its data saved to disk. I need to guarantee that the data never gets corrupted when dumped to disk, i.e. a reboot or app shutdown could happen while the data is being saved.
I know the steps that need to be done:
http://blogs.msdn.com/b/adioltean/archive/2005/12/28/507866.aspx
But I was wondering whether there's already an implementation that allows for this, preferably in a cross-platform way? I presume boost::filesystem guarantees an atomic rename (on Windows and POSIX), so I'm wondering if I missed this functionality somewhere in Boost? Thanks
UPD: I had hopes for boost::interprocess::message_queue, but it just hangs on reading the queue if the process is killed in the middle of adding to it, and the memory-mapped file takes up its maximum size on disk, which is expected to be the worst case anyway.
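For reference, a minimal sketch of the write-to-temp-then-rename pattern described in the linked article, using boost::filesystem (whether the final rename is truly atomic still depends on the platform and filesystem, and this sketch does not fsync the file or its directory):

#include <boost/filesystem.hpp>
#include <fstream>
#include <string>

namespace fs = boost::filesystem;

// Write new contents to a temp file next to the target, flush, then rename
// it over the target so readers never see a half-written file.
bool save_atomically(const fs::path& target, const std::string& contents) {
    const fs::path tmp = target.string() + ".tmp";

    std::ofstream out(tmp.string().c_str(), std::ios::binary | std::ios::trunc);
    out.write(contents.data(), static_cast<std::streamsize>(contents.size()));
    out.flush();
    if (!out) return false;
    out.close();

    boost::system::error_code ec;
    fs::rename(tmp, target, ec);   // replaces target; atomicity is platform-dependent
    return !ec;
}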
You can suffer decreased performance and/or lose all the app data if you rely on renaming. Maybe a better way is to store some key information (a record ID and fingerprint, for example) after each record, and seek the last correct key information when the application starts?
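If you go that way, the "fingerprint" could be an ordinary CRC-32 appended after each record; on startup the reader scans forward and keeps the last record whose checksum still matches. A minimal sketch of the checksum itself (the record framing around it is up to you):

#include <cstdint>
#include <cstddef>

// Bitwise CRC-32 (polynomial 0xEDB88320), suitable as a per-record fingerprint.
std::uint32_t crc32(const unsigned char* data, std::size_t len) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < len; ++i) {
        crc ^= data[i];
        for (int k = 0; k < 8; ++k)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}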

Detecting process memory injection on windows (anti-hack)

Standard hacking case: a hack injects into a started process and writes over process memory using a WriteProcessMemory call. In games this is not something you want, because it can allow the hacker to change a portion of the game and give himself an advantage.
There is a possibility to force the user to run a third-party program along with the game, and I would need to know the best way to prevent such injection. I already tried using the function EnumProcessModules, which lists all of the process's DLLs, with no success. It seems to me that the hacks inject directly into process memory (the end of the stack?), so they go undetected. At the moment I have come down to a few options.
Create a blacklist of files, file patterns, process names and memory patterns of the best-known public hacks and scan for them with the program. The problem with this is that I would need to maintain the blacklist and also ship updates of the program to cover all available hacks. I also found this useful answer: Detecting memory access to a process. But it could be possible that some existing DLL already uses those calls, so there could be false positives.
Use ReadProcessMemory to monitor changes at well-known memory offsets (hacks usually use the same offsets to achieve something). I would need to run a few hacks, monitor their behaviour and get samples of hack behaviour to compare against a normal run.
Would it be possible to somehow rearrange the process memory after it starts? Maybe just pushing the process memory down the stack could confuse the hack.
This is an example of the hack call:
WriteProcessMemory(phandler,0xsomeoffset,&datatowrite,...);
So unless the hack is a little smarter and searches for the actual start of the process, that would already be a great success. I wonder if there is a system call that could relocate the memory to another location or somehow insert some null data in front of the stack.
So, what would be the best way to go about this? It is a really interesting and dark area of programming, so I would like to hear as many interesting ideas as possible. The goal is to either prevent the hack from working or to detect it.
Best regards
From time to time, compute the hash or CRC of the application's image stored in memory and compare it with a known hash or CRC.
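A hedged sketch of that idea on Windows: checksum the module's .text section at startup, then repeat the computation periodically and compare. A determined attacker can also patch the check itself, so this only raises the bar; the rolling checksum below is a placeholder for a real hash or CRC.

#include <windows.h>
#include <cstdint>
#include <cstring>

// Compute a simple checksum over the .text section of the main module.
uint32_t ChecksumTextSection() {
    HMODULE mod = GetModuleHandleW(nullptr);
    auto* base = reinterpret_cast<unsigned char*>(mod);
    auto* dos  = reinterpret_cast<IMAGE_DOS_HEADER*>(base);
    auto* nt   = reinterpret_cast<IMAGE_NT_HEADERS*>(base + dos->e_lfanew);
    IMAGE_SECTION_HEADER* sec = IMAGE_FIRST_SECTION(nt);

    uint32_t sum = 0;
    for (WORD i = 0; i < nt->FileHeader.NumberOfSections; ++i, ++sec) {
        if (std::memcmp(sec->Name, ".text", 5) != 0) continue;
        const unsigned char* p = base + sec->VirtualAddress;
        for (DWORD j = 0; j < sec->Misc.VirtualSize; ++j)
            sum = sum * 31 + p[j];   // placeholder rolling checksum
    }
    return sum;
}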
Our service http://activation-cloud.com provides the ability to check the integrity of an application against a signature stored in a database.

Store huge amount of data in memory

I am looking for a way to store several GB of data in memory. The data is loaded into a tree structure. I want to be able to access this data from my main function, but I'm not interested in reloading the data into the tree every time I run the program. What is the best way to do this? Should I create a separate program for loading the data and then call it from the main function, or are there better alternatives?
thanks
Mads
I'd say the best alternative would be using a database, which would then be your "separate program for loading the data".
If you are using a POSIX-compliant system, then take a look at mmap.
I think Windows has another function to memory map a file.
You could probably solve this using shared memory: have one long-lived process build the tree and expose its address, and then other processes that start up can get hold of that same memory for querying. Note that in that case you will need to make sure the tree is safe to read by multiple simultaneous processes. If the reads really are pure reads, that should be easy enough.
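A hedged sketch of the long-lived side with Boost.Interprocess (the segment and object names are made up; a real tree would have to allocate its nodes from the segment so that all pointers stay valid across processes):

#include <boost/interprocess/managed_shared_memory.hpp>

namespace bip = boost::interprocess;

int main() {
    // Create (or reopen) a named shared memory segment; the size is an example.
    bip::managed_shared_memory segment(bip::open_or_create, "tree_segment",
                                       64ull * 1024 * 1024);

    // Construct an object that other processes can look up by name with find<>().
    int* root_marker = segment.find_or_construct<int>("root_marker")(42);
    (void)root_marker;
    return 0;
}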
You should look into a technique called a Memory mapped file.
I think the best solution is to configure a cache server and put data there.
Look into Ehcache:
Ehcache is an open source, standards-based cache used to boost performance, offload the database and simplify scalability. Ehcache is robust, proven and full-featured and this has made it the most widely-used Java-based cache.
It's written in Java, but should support any language you choose:
The Cache Server has two apis: RESTful resource oriented, and SOAP.
Both support clients in any programming language.
You must be running a 64-bit system to use more than 4 GB of memory. If you build the tree and make it a global variable, you can access the tree and its data from any function in the program. I suggest you perhaps try an alternative method that requires less memory consumption. If you post what type of program and what type of tree you're doing, I can perhaps help you find an alternative method.
Since you don't want to keep reloading the data...file storage and databases are out of question, but several gigs of memory seem like such a hefty price.
Also note that on Windows systems, you can access the memory of another program using ReadProcessMemory(), all you need is a pointer to use for the location of the memory.
You may alternatively implement the data loader as an executable program and the main program as a DLL loaded and unloaded on demand. That way you can keep the data in memory and still be able to modify the processing code without reloading all the data or doing cross-process memory sharing.
Also, if you can operate on the raw data from disk without preprocessing it (e.g. putting it in a tree, manipulating pointers to its internals), you may want to memory-map the data and avoid loading unused portions of it.

C++ Benchmark tool

I have an application which makes database requests. I guess it doesn't actually matter what kind of database I am using, but let's say it's a simple SQLite-driven database.
Now, this application runs as a service and makes some number of requests per minute (this number might actually be huge).
I want to benchmark the queries to retrieve their number and their maximal/minimal/average running time over some period, and I wish to design my own tool for this (obviously there are existing ones, but I need my own for appropriate reasons :).
So, could you advise an approach for this task?
I guess there are several possible cases:
1) I have access to the application source code. Here, obviously, I want to make some sort of cross-application integration, probably using pipes. Could you advise how this should be done and (if there is one) suggest any other possible solution?
2) I don't have the sources. So, is it even possible to perform some neat injection from my application to benchmark the other one? I hope there is a way, maybe a hacky one, whatever.
Thanks a lot.
See C++ Code Profiler for a range of profilers.
Or C++ Logging and performance tuning library for rolling your own simple version
My answer is valid only for case 1).
In my experience, profiling is a fun but difficult task. Using professional tools can be effective, but it can take a lot of time to find the right one and learn how to use it properly. I usually start in a very simple way. I have prepared two very simple classes. The first one, ProfileHelper, records the start time in its constructor and the end time in its destructor. The second class, ProfileHelperStatistic, is a container with extra statistical capability (a std::multimap plus a few methods to return the average, standard deviation and other funny stuff).
The ProfileHelper holds a reference to the container, and just before exiting, its destructor pushes the data into the container. You can declare the ProfileHelperStatistic in main, and if you create a ProfileHelper on the stack at the beginning of a specific function, the job is done. The constructor of the ProfileHelper will store the starting time and the destructor will push the result into the ProfileHelperStatistic.
It is fairly easy to implement, and with minor modifications it can be made cross-platform. The time taken to create and destroy the objects is not recorded, so it will not pollute the results. Calculating the final statistics can be expensive, so I suggest you run that once at the end.
You can also customize the information that you store in ProfileHelperStatistic by adding extra fields (like a timestamp or memory usage, for example).
The implementation is fairly easy: two classes that are no bigger than 50 lines each. Just two hints:
1) catch everything in the destructor!
2) consider using a collection with constant-time insertion if you are going to store a lot of data.
This is a simple tool and it can help you profile your application in a very effective way. My suggestion is to start with a few macro functions (5-7 logical blocks) and then increase the granularity. Remember the 80-20 rule: 20% of the source code uses 80% of the time.
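A minimal modern sketch of the two classes described above (the statistics side is reduced to collecting samples; averages and deviations would be computed from the multimap at the end of the run):

#include <chrono>
#include <map>
#include <string>

// Collects (label, elapsed-milliseconds) samples; average, standard
// deviation etc. can be computed from the multimap once at the end.
class ProfileHelperStatistic {
public:
    void add(const std::string& label, double ms) { samples_.insert({label, ms}); }
private:
    std::multimap<std::string, double> samples_;
};

// RAII timer: records the start time on construction and pushes the elapsed
// time into the statistics container on destruction.
class ProfileHelper {
public:
    ProfileHelper(ProfileHelperStatistic& stats, std::string label)
        : stats_(stats), label_(std::move(label)),
          start_(std::chrono::steady_clock::now()) {}
    ~ProfileHelper() {
        try {   // hint 1: never let an exception escape the destructor
            const auto elapsed = std::chrono::steady_clock::now() - start_;
            stats_.add(label_,
                       std::chrono::duration<double, std::milli>(elapsed).count());
        } catch (...) {}
    }
private:
    ProfileHelperStatistic& stats_;
    std::string label_;
    std::chrono::steady_clock::time_point start_;
};

// Usage at the top of a function to be measured:
//   ProfileHelper profile(globalStats, "load_objects");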
One last note about databases: a database tunes its performance dynamically; if you run a query several times, by the end it will be quicker than at the beginning (Oracle does this, and I guess other databases do as well). In other words, if you test the application heavily and artificially, focusing on just a few specific queries, you can get overly optimistic results.
I guess it doesn't actually matter what kind of database I am using, but let's say it's a simple SQLite-driven database.
It's very important what kind of database you use, because the database-manager might have integrated monitoring.
I can only speak about IBM DB/2, but I believe that IBM DB/2 is not the only DBMS with integrated monitoring tools.
Here, for example, is a short overview of what you can monitor in IBM DB/2:
statements (all executed statements, execution count, prepare-time, cpu-time, count of reads/writes: tablerows, bufferpool, logical, physical)
tables (count of reads / writes)
bufferpools (logical and physical reads/writes for data and index, read/write times)
active connections (running statements, count of reads/writes, times)
locks (all locks and type)
and many more
Monitor data can be accessed via SQL or an API from your own software, which is, for example, what DB2 Monitor does.
Under Unix, you might want to use gprof and its graphical front-end, kprof. Compile your app with the -pg flag (I assume you're using g++) and run it through gprof and observe the results.
Note, however, that this type of profiling will measure the overall performance of an application, not just SQL queries. If it's the performance of queries you want to measure, you should use special tools that are designed for your DBMS - for example, MySQL has a builtin query profiler (for SQLite, see this question: Is there a tool to profile sqlite queries? )
There is a (Linux) solution you might find interesting, since it could be used in both cases.
It's the LD_PRELOAD trick. It's an environment variable that lets you specify a shared library to be loaded right before your program is executed. The symbols loaded from this library will override any others available on the system.
The basic idea is to use this custom library as a wrapper around the original functions.
There are a bunch of resources available that explain how to use this trick: 1, 2, 3
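For example, here is a hedged sketch of such a wrapper around sqlite3_exec (assuming the target application links SQLite dynamically and actually goes through sqlite3_exec; a statically linked SQLite, or code that uses sqlite3_prepare/sqlite3_step, would not be intercepted):

// trace.cpp -- build as a shared library and preload it, e.g.:
//   g++ -shared -fPIC trace.cpp -o libtrace.so -ldl
//   LD_PRELOAD=./libtrace.so ./the_service
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          // for RTLD_NEXT
#endif
#include <dlfcn.h>
#include <chrono>
#include <cstdio>

// Forward declarations so sqlite3.h is not needed to build the wrapper.
struct sqlite3;
using exec_cb = int (*)(void*, int, char**, char**);
using exec_fn = int (*)(sqlite3*, const char*, exec_cb, void*, char**);

extern "C" int sqlite3_exec(sqlite3* db, const char* sql, exec_cb cb,
                            void* arg, char** errmsg) {
    // Resolve the "real" sqlite3_exec from the next library in the search order.
    static exec_fn real =
        reinterpret_cast<exec_fn>(dlsym(RTLD_NEXT, "sqlite3_exec"));
    const auto t0 = std::chrono::steady_clock::now();
    const int rc = real(db, sql, cb, arg, errmsg);
    const double ms = std::chrono::duration<double, std::milli>(
                          std::chrono::steady_clock::now() - t0).count();
    std::fprintf(stderr, "[trace] %8.3f ms  %s\n", ms, sql);
    return rc;
}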
Here, obviously, I want to make some sort of cross-application integration, probably using pipes.
I don't think that's obvious at all.
If you have access to the application, I'd suggest dumping all the necessary information to a log file and process that log file later on.
If you want to be able to activate and deactivate this behavior on-the-fly, without re-starting the service, you could use a logging library that supports enabling/disabling log channels on-the-fly.
Then you'd only need to send a message to the service by whatever means (socket connection, ...) to enable/disable logging.
If you don't have access to the application, then I think the best way would be what MacGucky suggested: let the profiling/monitoring tools of the DBMS do it. E.g. MS-SQL has a nice profiler that can capture requests to the server, including all kinds of useful data (CPU time for each request, IO time, wait time etc.).
And if it's really SQLite (plus you don't have access to the source) then your chances are rather low. If the program in question uses SQLite as a DLL, then you could substitute your own version of SQLite, modified to write the necessary log files.
Use Apache JMeter to test the performance of your SQL queries under high load.

Decreasing performance writing large binary file

In one of our programs we create records and store them in a binary file. Once the write operation is complete, we read this binary file back. The issue is that as long as this binary file is less than 100 MB its performance is good enough, but once the file grows larger its performance takes a hit.
So I thought of splitting this large binary file (> 100 MB) into smaller ones (< 100 MB), but this does not seem to gain any performance. So I was just wondering what a better approach to handle this scenario could be?
It would be a really great help if you guys could comment on this.
Thanks
Maybe you could try using an SQLite database instead.
It is always quite difficult to provide accurate answers with only a glimpse of the system, but have you actually measured the actual throughput?
As a first solution, I would simply recommend using a dedicated disk (so there are no concurrent read/write actions from other processes), and a fast one at that. This way it would just be the cost of a hardware upgrade, and we all know hardware is usually cheaper than software ;) You may even go for a RAID controller to maximize throughput.
If you are still limited by disk throughput, there are newer technologies out there based on Flash: USB keys (though they may not seem very professional) or the "new" solid state drives may provide more throughput than a mechanical disk.
Now, if the disk approaches are not fast enough or you can't get your hands on good SSDs, you have other solutions, but they involve software changes, and I propose them off the top of my head.
A socket approach: the second utility listens on a port and you send it the data there. On a local machine it's relatively fast, and you parallelize the work too, so even if the size of the data grows, you will still begin processing it fairly quickly.
A memory-mapping approach: write to a dedicated area in live memory and have the utility read from that area (Boost.Interprocess may help; there are other solutions).
Note that if the read is sequential, I find it more "natural" to try a 'pipe' approach (a la Unix) so that the two processes execute concurrently. In a traditional pipe, the data may not hit the disk at all.
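A tiny sketch of that pipe idea: the producer streams fixed-size records to stdout, the consumer reads the same struct from stdin, and the shell connects them with `producer | consumer` (the Record layout is just an example):

// producer.cpp -- run as:  ./producer | ./consumer
#include <cstdint>
#include <iostream>

struct Record { std::uint32_t id; double payload[16]; };  // example layout

int main() {
    for (std::uint32_t i = 0; i < 1000000; ++i) {
        Record r{};
        r.id = i;
        // The consumer does the mirror image: std::cin.read(...) into a Record.
        std::cout.write(reinterpret_cast<const char*>(&r), sizeof(r));
    }
    return 0;
}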
A shame, isn't it, that in this age of overwhelming processing power, we are still struggling with our disk IO ?
If your app reads the data sequentially, migrating to a DB will not help increase performance. If random access is used, you should consider moving the data into a DB, especially if different indices are used. You should check whether enough resources are available; if the file is loaded completely into memory, virtual memory management could have an impact on performance (swapping, paging). Depending on your OS settings, a limit on file I/O buffers could be reached. The file system itself could be fragmented.
To get a higher-quality answer you should provide information about the hardware, OS, memory and file system, as well as the way your data file is used. Then you could get hints about kernel tuning etc.
So what is the retrieval mechanism here? How does your application know which of the smaller files to look in to find a record? If you have split up the big file without implementing some form of keyed lookup (indexing, partitioning), you have not addressed the problem, just rearranged it.
Of course, if you have implemented some form of indexing then you have started down the road of building your own database.
Without knowing more regarding your application it would be rash for us to offer specific advice. Maybe the solution would be to apply an RDBMS solution. Possibly a NoSQL approach would be better. Perhaps you need a text indexing and retrieval engine.
So...
How often does your application need to retrieve records? How does it decide which records to get? What is your definition of poor performance? Why did you (your project) decide to use flat files rather than a database in the first place? What sort of records are we talking about?