Looking for C++ datawarehousing for time series data [closed] - c++

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 8 years ago.
I need a C++ library that can store and retrieve time series on demand to stream to client front-ends. I will be storing each component in structure-of-arrays format. I am currently using MySQL for correctness, but DB access is becoming very slow, so I am trying to migrate away from it. I could build such a library myself, but that is not my business goal and it would take considerable implementation effort. I am looking for an existing solution that meets the following requirements:
O(1) lookup scheme
Excellent compression; each component is stored separately, so there should be plenty of redundancy that can be removed
Scalable to terabytes
(optional: Audit tracking)
Most important: Transactional support. There is going to be BIG data, and I can't risk a bad run corrupting an entire dataset, which would create an unnecessary burden for backups and downtime during restores.

Also check out TempoDB: http://tempo-db.com. I'm a co-founder, and we built the service to solve this problem. We don't have a C++ client yet, but we could work with you to develop one.

Take a look at OpenTSDB. It was developed at StumbleUpon by Benoit Sigoure:
http://opentsdb.net/

TeaFiles provide simple and efficient time series storage in flat files, enriched with item metadata and description. They might be a building block of the system you aim for. Currently free open source libraries exist for C++ (github.com/discretelogics/TeaFiles), C# and Python.
I am a founder of discretelogics, and we created this file format to overcome the limitations of flat-file time series storage while preserving its unrivaled speed.

Take a look at HDF5. It has a quick lookup scheme, has C, C++, Python interfaces. Has compression. Can get pretty big. Maintains metadata. Doesn't do auditing. You'll need a wrapper to handle multi-user capability.


How should i start learning code of any cryptocurrency? [closed]

Closed 1 year ago.
I want to learn the code of a cryptocurrency with all its features, including POS and master node features. I currently have the XSN code (stake-net coin) and I want to study it so I can learn the different features of a blockchain. I have no intention of cloning it. How should I start? That is, which file should I begin with? I have learned the basics of C++, but unfortunately I'm not very good at it yet, and there are a lot of .cpp and header files. Has anyone had the same experience learning it?
Well, you should not start learning by looking at someone else's source code.
The only real way to learn about blockchain programming is to take isolated problems and try to implement them yourself in minimal examples.
You can start by coding each one of them separately in its own example application:
Blockchain data structures and their serialisation (network / disk)
Storing block data in a rolling binary blob file on disk that contains the serialised blocks, alongside some sort of indexed database for looking up block hashes and getting each block's position on disk when the block needs to be "loaded" into memory.
P2P networking component, where you organise your unstructured P2P environment under the premise that most nodes will have a limit on inbound socket connections or be behind a NAT
In the same context you can dig into asynchronous network programming and how to properly do it with select() / epoll()
Proof of work scaling mechanism, which comes up with a hash target value depending on the time that was needed for the last X blocks
"Dominating chain" connector, where the "predominant" chain gets selected out of multiple chain candidates (forking)
When you are done understanding these first simple building blocks, you can think about the next step: the actual functions of the blockchain, like maintaining balances and transferring coins.
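As an illustration of the proof-of-work scaling item in the list above, here is a minimal sketch of a retargeting rule. The function name `retarget`, the 32-bit target, and the 4x clamp per step are assumptions made for this example; they are not taken from XSN or from any particular coin's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical retargeting rule: scale the hash target by the ratio of the
// time the last X blocks actually took to the time they were expected to take.
// A larger target means easier work, a smaller target means harder work.
std::uint32_t retarget(std::uint32_t old_target,
                       std::uint32_t actual_seconds,
                       std::uint32_t expected_seconds) {
    assert(expected_seconds > 0);

    const std::uint64_t expected = expected_seconds;
    const std::uint64_t actual = actual_seconds;

    // Assumed damping: one step changes the target by at most 4x either way.
    const std::uint64_t lowest = expected / 4;
    const std::uint64_t highest = expected * 4;
    const std::uint64_t clamped = std::min(std::max(actual, lowest), highest);

    // clamped is always below 2^32 here, so the product fits in 64 bits.
    std::uint64_t scaled = std::uint64_t{old_target} * clamped / expected;

    // Keep the result inside the representable, non-zero target range.
    const std::uint64_t max_target = std::numeric_limits<std::uint32_t>::max();
    scaled = std::min(std::max<std::uint64_t>(scaled, 1), max_target);
    return static_cast<std::uint32_t>(scaled);
}
```

With an expected timespan of 600 seconds, blocks that arrived in 300 seconds halve the target (harder work), and blocks that took 1200 seconds double it (easier work).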

Options for hot deployment [closed]

Closed 7 years ago.
My requirement is to seamlessly hot deploy a code update to a running service without losing the current status, including collection data. Is there any C++ framework out there I can use to develop such a solution?
You should probably read some research papers on dynamic software updating, e.g. on Kitsune (which you might use).
A major issue is updating the call stack (and instances held in local variables); read also about continuations. You might also have a special case: if your application is event-loop driven, as most GUI applications are, you probably want to update the code outside of event handlers.
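To make the event-loop case concrete, here is a toy sketch in which a replacement handler is only swapped in between events, so no handler is ever replaced while it is on the call stack. All names (`EventLoop`, `ServiceState`, `handler_v1`, `handler_v2`) are hypothetical, and the handlers are plain functions in the same binary; a real system would have to load new code from somewhere (for example a shared library) and possibly migrate the state layout, which this sketch does not attempt.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>

// Long-lived service state; it must survive every code update.
struct ServiceState {
    long total = 0;
    int events_handled = 0;
};

using Handler = std::function<void(ServiceState&, int)>;

// Two "versions" of the event-handling code.
void handler_v1(ServiceState& s, int event) { s.total += event; ++s.events_handled; }
void handler_v2(ServiceState& s, int event) { s.total += 10L * event; ++s.events_handled; }

class EventLoop {
public:
    explicit EventLoop(Handler initial) : handler_(std::move(initial)) {
        assert(static_cast<bool>(handler_));  // an empty handler cannot be called
    }

    void post(int event) { events_.push(event); }

    // Only records the request; the swap happens later, between events.
    void request_update(Handler next) { pending_ = std::move(next); }

    void run() {
        while (!events_.empty()) {
            // Safe point: no handler is executing, so the code can be replaced.
            if (pending_) {
                handler_ = std::move(pending_);
                pending_ = nullptr;
            }
            const int event = events_.front();
            events_.pop();
            handler_(state_, event);
        }
    }

    const ServiceState& state() const { return state_; }

private:
    ServiceState state_;
    Handler handler_;
    Handler pending_;
    std::queue<int> events_;
};
```

The state object lives outside the handlers, which is what lets it carry over from `handler_v1` to `handler_v2` unchanged.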
You certainly should think of dynamic software update very early in your design. Perhaps some terminology and concepts from garbage collection & persistence & serialization techniques are relevant.
Your requirement (to seamlessly hot deploy code update to a running service without losing the current status) is very hard and will need a lot of work (probably years) and is still a difficult & interesting research topic (definitely it is a good PhD subject).
You might want to use your own meta-programming techniques, that is, generate most of the relevant C++ support code with your own code generators.
If you already have a significant code base, you could consider customizing a recent GCC compiler with MELT (e.g. to query the compiler's internal representations and generate some code from them), but even that means a lot of work.
PS. Coding in something better than C++, like Erlang or Common Lisp, would make your goal less difficult.

Communication method for data exchange between a server and several clients for 10+ years [closed]

Closed 7 years ago.
We're running an experiment which will involve collecting data from multiple stations around the world. Each station will be providing HDF5 files with magnetic field measurements at a rate of 1 kHz and some auxiliary data in real time. The latency is going to be a few minutes.
I've been assigned to design this program (in C++, with a client/server model, the server running on Linux and the clients being cross-platform), and apparently I'll be designing it from scratch. My first concern is to avoid really doing everything from scratch, because that would be more error-prone and likely wrong. So my question is: what information/file transfer protocols or libraries should I use so that:
The program can live for 10+ years with minimal maintenance
I can have very good support from the community for when I need help.
Since we need something relatively secure, my first thought was libssh (the only cross platform opensource library available out there for ssh), but then after discussing with some pros there I realized that the support there isn't so wonderful because only a few people work with libssh. The pros there hesitated in suggesting OpenSSL, but with OpenSSL I'll have to write my own authentication (apparently, I'm not an expert and that's why I'm asking).
What would you suggest? Please share your view on whether I should go for OpenSSL, libssh, or something else.
PS: Please, if you're going to start off by saying this question is off-topic, move on and ignore it. Consider being helpful rather than critical.
If you require any additional information, please ask.
I think that OpenSSL might be a good choice.
No, you do not have to "write your own authentication". You just need to generate certificates and keys and put them in the right places; that is all.
I suggest looking at the examples in <openssl-source-dir>/demos and <openssl-source-dir>/apps to get started. Reading a book about OpenSSL would also be a good idea, for many other reasons (sometimes not directly related to SSL/TLS).
I hope that helps.

Virtual Filesystem with C/C++ under Windows [closed]

Closed 3 years ago.
I am currently developing a game which simulates an operating system, so I need an in-game filesystem. Currently, I am using zziplib to load files from a zip archive. However, this is a read-only "filesystem", and I need a way to write new files and serialize them afterwards (and deserialize them during the next execution). Are there any useful libraries out there for this, or should I write one myself, perhaps based on an existing one?
This is probably one of the places where using a simple database as a filesystem makes sense.
Use something like SQLite to store the data (with paths as keys, blobs as data, or something like that).
One of the advantages of doing this is that you don't actually have to worry about the storage, and you can use existing database tools to view/edit the data "offline" rather than having to write your own. (Plus you can store other game info in there as well.)
You might check out PicoStorage and Embedded File System in C++. I haven't directly used either but I've looked at them both. Embedded File System does have a dependency which could be a show stopper -- it requires Qt to be linked in. Perhaps that could be removed, but it uses it mainly for QString and QFile (and would have no reason to require the UI).
Update, 9 years later: As commented, the above links no longer work. This alternative link for PicoStorage may be viable (I was able to download the source from there but I've made no effort to validate it) but I cannot locate a modern equivalent for EFS.
My six pence on top of the answers above. SolFS (now CBFS Storage) and CodebaseFS provide virtual file system capabilities; both have an API for C/C++ and appear to do exactly what you are asking about. Still, the scale of your task is not clear to me. Does your game need to manage dozens, hundreds, zounds, ... of files? What are the sizes of those files? And so on. I would raise these questions before looking for an appropriate solution.

Key-Value DB (an alternative to Berkeley DB?) [closed]

Closed last month.
I'm looking for a hashmap on disk (Berkeley DB would fit exactly, but for the licensing problem).
The requirements are:
FOSS w/a commercial-friendly license (can be used in commercial applications without a fee)
a C/C++ interface
Embeddable
decent speed? faster than SQLite would be ideal
cross-platform would also be nice
Any suggestions welcome.
Thanks!
There have been a couple of recent options, both of which provide a byte[] -> byte[] map and atomic batch updates, and are BSD licensed:
LevelDB from Google.
RocksDB from Facebook, which is based on a fork of LevelDB and claims to provide higher performance on SSD-backed storage.
Although your application domain and data specifications are not clear, RocksDB, which is a recent solution for embedded persistent key-value storage, seems a fit for you. Benchmarks by Facebook show that it performs better than LevelDB for server workloads, especially with data larger than RAM capacity. It is also open-sourced under a BSD license. You can find RocksDB C++ examples and more detail from here.
How about the *dbm libraries?
dbm, ndbm, gdbm, sdbm, tdbm, and friends
Plenty to choose from.
If we are strict, none of the suggested mainstream options are hash-maps. Such structures are rarely used for persistent storage because a good hash function would route to different buckets, even for similar keys. Similar keys generally imply related data that is often accessed together, so you want to avoid putting them on remote parts of the disk. Random read latency may be too high.
The most famous matching implementation would be FASTER from Microsoft.
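The locality point above can be sketched in a few lines. This is only an in-memory illustration under assumed names (`fnv1a`, `bucket_of`, `keys_with_prefix`); it uses FNV-1a as a stand-in for "a good hash function" and `std::map` as a stand-in for an ordered on-disk structure, and says nothing about how any of the libraries mentioned here are implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// FNV-1a (64-bit), a simple well-known string hash, used here for illustration.
std::uint64_t fnv1a(const std::string& key) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
    for (unsigned char c : key) {
        h ^= c;
        h *= 1099511628211ULL;  // FNV prime
    }
    return h;
}

// Bucket a key would land in with a hash layout. Keys that differ only in
// their last character generally end up in unrelated buckets.
std::size_t bucket_of(const std::string& key, std::size_t bucket_count) {
    assert(bucket_count > 0);
    return static_cast<std::size_t>(fnv1a(key) % bucket_count);
}

// With an ordered layout, keys sharing a prefix sit next to each other, so
// related records can be read with one sequential scan.
std::vector<std::string> keys_with_prefix(
        const std::map<std::string, std::string>& store,
        const std::string& prefix) {
    std::vector<std::string> out;
    for (auto it = store.lower_bound(prefix);
         it != store.end() && it->first.compare(0, prefix.size(), prefix) == 0;
         ++it) {
        out.push_back(it->first);
    }
    return out;
}
```

Printing `bucket_of` for keys such as "user:1", "user:2", and "user:3" shows where a hash layout would place them, while `keys_with_prefix(store, "user:")` walks the same keys as one contiguous range.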