Key-Value DB (an alternative to Berkeley DB?) [closed] - c++

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed last month.
I'm looking for a hashmap on disk (Berkeley DB would fit exactly, but for the licensing problem).
The requirements are:
FOSS w/a commercial-friendly license (can be used in commercial applications without a fee)
a C/C++ interface
Embeddable
Decent speed; faster than SQLite would be ideal
cross-platform would also be nice
Any suggestions welcome.
Thanks!

There have been a couple of recent options, both of which provide a byte[] -> byte[] map and atomic batch updates, and both of which were released under a BSD license:
LevelDB from Google.
RocksDB from Facebook, which is based on a fork of LevelDB and claims higher performance on SSD-backed storage. (RocksDB has since moved to dual Apache 2.0 / GPLv2 licensing, so check the current terms.)

Your application domain and data specifications are not clear, but RocksDB, a recent solution for embedded persistent key-value storage, seems like a fit. Benchmarks published by Facebook show better performance than LevelDB for server workloads, especially with data sets larger than RAM. It was also released as open source under a BSD license. You can find RocksDB C++ examples and more detail here.

How about the *dbm libraries?
dbm, ndbm, gdbm, sdbm, tdbm and friends.
Plenty to choose from.

If we are strict, none of the suggested mainstream options are hash maps (LevelDB and RocksDB, for example, are log-structured merge trees that keep keys sorted). Hash-based structures are rarely used for persistent storage because a good hash function routes even similar keys to different buckets. Similar keys generally imply related data that is often accessed together, so you want to avoid scattering them across remote parts of the disk, where random read latency may be too high.
The most famous matching implementation would be FASTER from Microsoft.
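To make the locality argument above concrete, here is a minimal in-memory sketch. It is only an illustration: `bucket_of` and `range_scan` are hypothetical helper names, `std::map` stands in for an ordered on-disk index, and the bucket a key lands in depends on your standard library's `std::hash`.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hash-table style placement: hash the key, then reduce modulo the bucket count.
// Neighbouring keys such as "user:1000" and "user:1001" will usually end up in
// unrelated buckets (the exact values are implementation-defined).
std::size_t bucket_of(const std::string& key, std::size_t bucket_count) {
    assert(bucket_count > 0);
    return std::hash<std::string>{}(key) % bucket_count;
}

// Ordered placement: std::map keeps keys sorted, so all keys in [first, last)
// sit next to each other and can be read with one forward walk.
std::vector<std::string> range_scan(const std::map<std::string, std::string>& index,
                                    const std::string& first,
                                    const std::string& last) {
    std::vector<std::string> keys;
    for (auto it = index.lower_bound(first);
         it != index.end() && it->first < last; ++it) {
        keys.push_back(it->first);
    }
    return keys;
}
```

Calling `range_scan(index, "user:", "user;")` returns every key with the `user:` prefix in sorted order (`;` is the character after `:` in ASCII), whereas a hash layout would need one bucket lookup per key.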

Related

what is the limitation of Libmicrohttpd? [closed]

Closed 5 years ago.
I want to develop an HTTP server based on the Libmicrohttpd library.
I'm wondering how many simultaneous user connections Libmicrohttpd can support.
Well, it depends on a number of factors:
HTTP runs over TCP, so you need to figure out how many TCP connections your server can sustain at one time. I'd suggest running some benchmarks to get an idea. You can use Apache Bench and/or Apache JMeter, or write your own benchmark app using libcurl.
The other thing is the number of sockets your OS can support. Depending on the OS, you may need to tune those values. On Linux, you can use the ulimit command; on Windows, you may need to configure registry values.
Another important factor is the payload each connection brings in and the processing the server has to do. You need to run benchmarks for some predefined amounts of data (say, 64 KB, 1 MB, etc.). In this context, you will want to process all the data as soon as possible. Sockets have backlogs with fixed sizes, and those need to be configured as well. That means you will need more memory, so either more RAM or some fine-tuning of OS settings. Memory is therefore a bottleneck here.
Connection timeouts are also important, but you need to decide whether to include them in your benchmarks. That depends on how your server handles connections.
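As a starting point for the socket-limit factor above, the per-process descriptor limit that `ulimit -n` reports can also be read from inside the server. This is a minimal POSIX-only sketch; `FdLimits`, `query_fd_limits` and `client_socket_budget` are hypothetical names, and the number of descriptors you reserve for logs, listening sockets and so on is an assumption you would have to measure.

```cpp
#include <sys/resource.h>  // getrlimit, RLIMIT_NOFILE (POSIX)

#include <cassert>

// Soft and hard limits on open file descriptors for this process.
struct FdLimits {
    rlim_t soft;
    rlim_t hard;
};

// Reads RLIMIT_NOFILE; returns false if the call fails.
bool query_fd_limits(FdLimits& out) {
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        return false;
    }
    out.soft = rl.rlim_cur;
    out.hard = rl.rlim_max;
    return true;
}

// Rough upper bound on client sockets: the soft limit minus descriptors the
// process needs for itself (listening socket, log files, and so on).
rlim_t client_socket_budget(const FdLimits& limits, rlim_t reserved) {
    return limits.soft > reserved ? limits.soft - reserved : 0;
}
```

Each accepted connection consumes one descriptor, so this budget is only a ceiling; memory per connection and request processing time will usually bite first.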
You may also take a look at c10k to get a general idea. See this relevant article too.
These are the things that I could come up with at the moment. I'll update my answer if I find anything else.
Hope this helps!

Looking for C++ datawarehousing for time series data [closed]

Closed 8 years ago.
I need a C++ library that can store and retrieve time series on demand to stream to client front-ends. I will be storing each component as structure of arrays format. I am currently using MySQL for correctness, but the DB access is starting to get ridiculously slow. I am trying to migrate away from this. Intuitively I can build such a library but it is not my business goal and will take quite a bit of implementation to get working. I am looking for an existing solution that can meet the following requirements:
O(1) lookup scheme
Excellent compression, each component is separated, so there should be plenty of redundancy that can be removed
Scalable to terabytes
(optional: Audit tracking)
Most important: transactional support. There is going to be BIG data, and I can't risk a bad run corrupting an entire dataset, which would create an unnecessary burden for backups and downtime during restores.
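To make the "O(1) lookup" and structure-of-arrays requirements above concrete, here is a minimal sketch. It assumes samples arrive at a fixed interval, which the question does not state; `TickSeries` and its `price`/`volume` components are hypothetical. Compression and transactions are not addressed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Structure-of-arrays series: each component lives in its own contiguous array,
// which is also what makes per-component compression attractive.
struct TickSeries {
    std::int64_t start_ns;  // timestamp of sample 0
    std::int64_t step_ns;   // fixed sampling interval (assumption)
    std::vector<double> price;
    std::vector<std::uint32_t> volume;

    void append(double p, std::uint32_t v) {
        price.push_back(p);
        volume.push_back(v);
    }

    std::size_t size() const { return price.size(); }

    // O(1) lookup: the index is computed arithmetically from the timestamp,
    // with no search. Returns false if the timestamp is outside the series.
    bool index_of(std::int64_t t_ns, std::size_t& idx) const {
        assert(step_ns > 0);
        if (t_ns < start_ns) {
            return false;
        }
        const std::int64_t i = (t_ns - start_ns) / step_ns;
        if (static_cast<std::size_t>(i) >= size()) {
            return false;
        }
        idx = static_cast<std::size_t>(i);
        return true;
    }
};
```

With irregular timestamps this no longer works directly; you would need a sorted timestamp array and a binary search, which is O(log n).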
Also check out TempoDB: http://tempo-db.com. I'm a co-founder, and we built the service to solve this problem. We don't have a C++ client yet, but could work with you to develop one.
Take a look at OpenTSDB. It was developed at StumbleUpon by Benoit Sigoure:
http://opentsdb.net/
TeaFiles provide simple and efficient time series storage in flat files, enriched with item metadata and description. They might be a building block of the system you aim for. Currently free open source libraries exist for C++ (github.com/discretelogics/TeaFiles), C# and Python.
I am a founder of discretelogics, and we created this file format to overcome the limitations of flat-file time series storage while preserving its unrivaled speed.
Take a look at HDF5. It has a quick lookup scheme, has C, C++, Python interfaces. Has compression. Can get pretty big. Maintains metadata. Doesn't do auditing. You'll need a wrapper to handle multi-user capability.

Implementing licensing checking library [closed]

Closed 7 years ago.
I am working on a small cross-platform product for Windows and Mac written in C++/Obj-C. I have been asked to implement a licensing module for it. This task is part of a very ambitious project to introduce licensing for all our products. At the end of it, we will have a complete licensing scheme that lets us sell licenses to our customers, with support for yearly renewals, license levels, etc. My problem is that I do not know the first thing about implementing license checkers. Can anyone point me to some how-tos? Are there any open source licensing modules around that I can study?
I use a system of Partial Key Verification (PKV), and I've implemented this in C# with a PHP generator. Google will come up with various hits, explanations, and implementations; but Brandon Staggs wrote a good overview (albeit in Delphi!), here:
http://www.brandonstaggs.com/2007/07/26/implementing-a-partial-serial-number-verification-system-in-delphi/
PKV works by encoding certain information (license type, serial number, product, date, etc.) in the key along with a hash of the user name and hashes of the encoded information. Much of the key actually consists of multiple one-character hashes. The idea is that you only check a subset of these hashes. The exact subset that is checked can be changed over time for some security and to protect against certain kinds of reverse engineering.
I would also encrypt the key to help obfuscate what each char in the license means. Otherwise someone with multiple keys might determine certain char positions mean certain things ("oh, chars 3-4 are the serial number"). This might be a chink in your armour!
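The partial verification idea described above can be sketched roughly as follows. This is a simplified illustration, not the scheme from the linked article: the key layout (8 hex digits plus 4 check characters), the secrets, and the mixing function are all made up for the example, the mixing is not cryptographic, and the user-name hash and encryption steps are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical per-position secrets. The generator knows all of them; a shipped
// validator would ideally contain only the ones it actually checks.
const std::size_t kCheckCount = 4;
const std::uint32_t kSecrets[kCheckCount] = {
    0x9E3779B1u, 0x85EBCA6Bu, 0xC2B2AE35u, 0x27D4EB2Fu};

// One check character ('A'..'Z') derived from the seed and one secret.
char check_char(std::uint32_t seed, std::uint32_t secret) {
    std::uint32_t h = (seed ^ secret) * 2654435761u;  // simple integer mixing
    h ^= h >> 15;
    return static_cast<char>('A' + h % 26);
}

// Generator side: seed as 8 hex digits followed by one check char per secret.
std::string make_key(std::uint32_t seed) {
    char hex[9];
    std::snprintf(hex, sizeof hex, "%08X", static_cast<unsigned>(seed));
    std::string key(hex);
    for (std::uint32_t secret : kSecrets) {
        key += check_char(seed, secret);
    }
    return key;
}

// Validator side: verifies only the check positions listed in `positions`.
bool verify_key(const std::string& key, const std::size_t* positions,
                std::size_t count) {
    if (key.size() != 8 + kCheckCount) {
        return false;
    }
    std::uint32_t seed = 0;
    for (std::size_t i = 0; i < 8; ++i) {  // parse the hex seed
        const char c = key[i];
        std::uint32_t digit;
        if (c >= '0' && c <= '9') {
            digit = static_cast<std::uint32_t>(c - '0');
        } else if (c >= 'A' && c <= 'F') {
            digit = static_cast<std::uint32_t>(c - 'A' + 10);
        } else {
            return false;
        }
        seed = (seed << 4) | digit;
    }
    for (std::size_t i = 0; i < count; ++i) {
        assert(positions[i] < kCheckCount);
        if (key[8 + positions[i]] != check_char(seed, kSecrets[positions[i]])) {
            return false;
        }
    }
    return true;
}
```

A release that checks positions {0, 2} accepts a key whose position 1 is wrong, so a key generator built by reverse engineering that release can be invalidated later by shipping a version that checks a different subset.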
Any license system you develop is going to be imperfect. It will be crackable, and if your products are popular, will be cracked. However there's a strong argument that a license system exists to keep the honest people honest, and produce enough hurdles for the slightly dishonest people - but not so many hurdles that it becomes too much of an inconvenience (eg. I'm generally against hardware locking). Those who do hack your system probably weren't going to pay for it anyway.

C/C++ Libraries for reading from Universal Disk Format devices or files [closed]

Closed 8 years ago.
Are there any good free C/C++ libraries that enable reading from common devices with filesystems such as UDF, and ISO9660 and extracting files/metadata etc.?
So far all I've been able to find is GNU's libcdio, which is promising, and some "Magic UDF", which has so many hits I'm disgusted, pushes other results out of Google, and comes with a pretty extreme-looking price tag.
Cross-platform support is preferable (personal preference, of course), and Windows compatibility is an unfortunate requirement. The less restrictive the license, the better; I have yet to investigate how compatible libcdio's GPLv3 license is.
Note this question is still open, I'll accept another answer if someone locates such a library.
After extensive investigation, I ended up rolling my own solution to perform the operations on UDF that I required. I'm unable to open the source, in all it was about 800 lines of C++. However here are several links which got me through:
The reference standard on which UDF is built
Universal Disk Format specification 2.60
Brief introduction to UDF
Wikipedia Page
UDF Verifier tool (you must sign up for access to this)
A few words of warning: Previous experience implementing ISO9660/ECMA-119 helped me significantly. Knowledge of how block devices operate and interface with the operating system is helpful. Information surrounding the physical layout and separation of sessions is somewhat mythical and difficult to grok.
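For anyone starting down the same road, the first step (telling ISO 9660 and UDF images apart) is small enough to sketch. This is not the answer's 800-line implementation, only a minimal probe under stated assumptions: 2048-byte sectors, the whole image already in memory, and hypothetical names (`ImageKind`, `probe_image`). It walks the volume recognition sequence that starts at sector 16, where each descriptor carries a 5-byte standard identifier at byte offset 1.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

const std::size_t kSectorSize = 2048;            // assumption: 2048-byte sectors
const std::size_t kFirstDescriptorSector = 16;   // sectors 0-15 are the system area

struct ImageKind {
    bool iso9660 = false;  // saw a "CD001" descriptor
    bool udf = false;      // saw an "NSR02" or "NSR03" descriptor
};

// Compares the 5-byte standard identifier at offset 1 of a descriptor.
static bool has_id(const unsigned char* descriptor, const char* id) {
    return std::memcmp(descriptor + 1, id, 5) == 0;
}

// Scans descriptors from sector 16 until an unrecognised identifier or the end
// of the buffer is reached.
ImageKind probe_image(const unsigned char* data, std::size_t size) {
    assert(data != nullptr);
    ImageKind kind;
    for (std::size_t sector = kFirstDescriptorSector;; ++sector) {
        const std::size_t offset = sector * kSectorSize;
        if (offset + 6 > size) {
            break;  // ran off the end of the image
        }
        const unsigned char* d = data + offset;
        if (has_id(d, "CD001")) {
            kind.iso9660 = true;
        } else if (has_id(d, "NSR02") || has_id(d, "NSR03")) {
            kind.udf = true;
        } else if (!has_id(d, "BEA01") && !has_id(d, "TEA01") &&
                   !has_id(d, "BOOT2")) {
            break;  // end of the recognised sequence
        }
    }
    return kind;
}
```

Many discs are "bridge" images that report both file systems. Actually reading files from UDF continues from the anchor volume descriptor pointer, which is where the specifications listed above become necessary.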
See: http://www.thefreecountry.com/sourcecode/cpp.shtml
There are a lot of open source libraries for this, but their reliability is questionable.
On Windows you can use the Image Mastering API. It comes with the Windows SDK and works on both XP and Vista:
http://msdn.microsoft.com/en-us/library/aa364806%28VS.85%29.aspx
7-Zip supports extracting files from UDF and ISO disk images, and is mostly LGPL licensed. Specifically, the UDF implementation code appears to be in CPP/7zip/Archive/Udf/UdfIn.cpp.

Distributed shared memory library for C++? [closed]

Closed 3 years ago.
I am writing a distributed application framework in C++. One of the requirements is the provision of distributed shared memory. Rather than write my own from scratch (and potentially re-invent the wheel), I thought I would see if there were any pre-existing open source libraries. A quick Google search did not yield anything too useful.
Does anyone on here have any experience of a good C++ DSM library that they can recommend?
Ideally, the library will support MRMW (Multiple readers/multiple writers), but I can make do with MRSW (Multiple readers, single writer) if need be. I am developing on Linux.
ACE shared memory is for sharing on one platform.
Distributed Shared Memory is very much non-trivial, as there are issues regarding transactionality to solve. To use Distributed Shared Memory effectively (even for a copy), you will find you need, among other things, distributed synchronization algorithms and protocols that are resilient in the face of failure. (Shshooot! Ain't that always the way!)
Significant research papers have been written about these issues (see some of the chapter bibliographies of Taubenfeld's book).
This is really a warning that "rolling your own" will be a significant project in and of itself.
Have you considered memcached?
It is distributed over the network and can be really fast.
It has bindings for lots of languages, can be accessed from different operating systems, and supports multiple writers and multiple readers.
Try the ACE library; it has a lot of good stuff you'll like. They have a Shared_memory class in there, but I'm not sure it's a DSM. If not, they have plenty of other network/distributed stuff you might find interesting.