I have a lot of RRD files that were generated on a 1st-gen Cubieboard (1 GHz CPU, 1 core, 1 GB RAM). About a year ago, when I migrated the data loggers to an x86_64 machine, I noticed that I could no longer read those old files; I didn't know they were platform-specific.
I know there is a workflow where I export the data from the files into XML and then import the XML on the other architecture, but this is not my first choice: the old board is painfully slow and has other important work to do, such as serving DNS. Its rrdtool version is stuck at 1.4.7 and there are 1.4 GB of files to process.
Is there a way to emulate the Cubieboard on a fast Intel machine, or an x86_64-based tool that can convert those RRD files?
RRD files are not portable between architectures, as you have noticed. The format depends not only on the 32/64-bit integer size, but also on endianness and even on the compiler's structure padding. It may be possible to compile the library in 32-bit mode on your new platform, but the result is still unlikely to be compatible with your old RRD files, as there are other hardware differences to consider.
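To illustrate the kind of differences involved (this is not RRD's actual on-disk structure, just a demonstration of why a raw struct dump is not portable):

```cpp
// Illustration only (not RRD's real structures): the same struct can have a
// different size and padding depending on the platform it is compiled for,
// which is exactly what makes a raw on-disk dump non-portable.
#include <cstdint>
#include <cstdio>
#include <cstring>

struct Sample {
    char   flag;     // 1 byte, followed by compiler-chosen padding
    long   counter;  // 4 bytes on 32-bit ARM/x86, 8 bytes on x86_64 Linux
    double value;    // alignment requirements differ between ABIs
};

int main() {
    std::printf("sizeof(long)   = %zu\n", sizeof(long));
    std::printf("sizeof(Sample) = %zu\n", sizeof(Sample));

    // Endianness check: the Cubieboard's ARM core and x86_64 are both
    // little-endian, so in this particular migration it is the size and
    // padding differences above that break byte-for-byte compatibility.
    uint32_t probe = 1;
    unsigned char first;
    std::memcpy(&first, &probe, 1);
    std::printf("little-endian  = %s\n", first == 1 ? "yes" : "no");
    return 0;
}
```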
In short, your best option is to (slowly?) export to XML and then re-import on the new architecture, as you already mentioned. I have done this before on a large RRD installation, running old and new in parallel for a while to avoid gaps in the data, but it takes time.
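If you go that route, the per-file work is mechanical and easy to batch. A rough sketch of driving rrdtool dump/restore over a directory tree (paths and the rrdtool binary name are assumptions; the dump phase has to run on a machine whose rrdtool can still read the old files, i.e. the Cubieboard, while the restore phase runs on the x86_64 box):

```cpp
// Rough sketch of batching the XML round-trip with std::system calls.
// Run the "dump" mode on the old board and the "restore" mode on the new
// machine; file layout and naming here are illustrative only.
#include <cstdlib>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// On the Cubieboard: turn every .rrd under `dir` into a portable .xml dump.
void dump_all(const fs::path& dir) {
    for (const auto& entry : fs::recursive_directory_iterator(dir)) {
        if (entry.path().extension() != ".rrd") continue;
        fs::path xml = entry.path();
        xml.replace_extension(".xml");
        std::string cmd = "rrdtool dump '" + entry.path().string() +
                          "' > '" + xml.string() + "'";
        std::system(cmd.c_str());
    }
}

// On the x86_64 machine: rebuild native .rrd files from the XML dumps.
void restore_all(const fs::path& dir) {
    for (const auto& entry : fs::recursive_directory_iterator(dir)) {
        if (entry.path().extension() != ".xml") continue;
        fs::path rrd = entry.path();
        rrd.replace_extension(".rrd");
        std::string cmd = "rrdtool restore '" + entry.path().string() +
                          "' '" + rrd.string() + "'";
        std::system(cmd.c_str());
    }
}

int main(int argc, char** argv) {
    if (argc < 3) return 1;
    std::string mode = argv[1];            // "dump" or "restore"
    if (mode == "dump") dump_all(argv[2]);
    else                restore_all(argv[2]);
    return 0;
}
```

A plain shell loop does the same job, of course; the point is only that the conversion itself is a simple per-file dump followed by a per-file restore.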
I seem to remember that Tobi was, at one time, planning an architecture-independent RRD format for RRD 1.6, but even if that comes to pass it won't help you with your legacy data.
I have a large collection of ISO files (around 1 GB each) that have shared 'runs of data' between them. For example, one of the audio tracks may be identical (same length and content) across 5 of the ISOs, but it may not have the same name or location in each.
Is there some compression technique I can apply that will detect and losslessly deduplicate this information across multiple files?
For anyone reading this: after some experimentation, it turns out that by putting all the similar ISO or CHD files into a single 7-Zip archive (solid archive, with the maximum dictionary size of 1536 MB), I was able to achieve extremely high compression via deduplication, even on already-compressed data.
The lrzip program is designed for this kind of thing. It is available in most Linux/BSD package managers, or via Cygwin for Windows.
It uses an extended version of rzip to first de-duplicate the source files and then compresses them. Because it uses mmap(), it does not run into problems with the size of your RAM the way 7zip does.
In my tests lrzip was able to massively de-duplicate similar ISOs, bringing a 32GB set of OS installation discs down to around 5GB.
I have a memory problem. For 12 years our software (C++, 32-bit) has used home-made tables to store data.
The tables are stored on disk; when we want to use their data, they are loaded into memory and stay there.
Some tables are very big, with more than 2 million rows, and take up to 400 MB once loaded. Due to the 32-bit address space and memory fragmentation, we can currently load at most two such tables into memory before other operations no longer get enough memory.
The software is installed on more than 3000 clients. The client OS is Windows 7 to Windows 10 (32-bit and 64-bit), plus an insignificant number of XP and Vista systems.
So we discussed a good (fast, proper) way out of this problem. Here are some ideas:
switching to 64-bit
switching from our own tables to SQLite or EJDB
opening every table in its own process and communicating with that process to get data from the table
extending our own tables so that they can read directly from disk
All of these ideas are more or less proper, practicable and fast (in terms of both implementation effort and execution speed). The advantages and disadvantages of each are complicated and would go beyond the scope of this question.
Does someone have another good idea for solving this problem?
[update]
I will try to explain this from a different angle. First, the software is installed on a wide base of different Windows versions, from XP to Windows 10, on all kinds of computers. It can be used on single desktops as well as on terminal servers with a central LAN data pool (just a folder on a file server).
It collects article data in a special way, so there is a lot of information about all kinds of articles, along with price information from different vendors. Hence there is a strong need to hide/encrypt this information from outsiders.
The current database is essentially an in-memory table of string, double or long values. Each row can contain a different set of columns, but most tables look like a structured database table. The whole table is encrypted and zipped as one block; once loaded, all of the data is expanded in memory, where we can access it very fast. If an index is needed, we build it with a std::map inside the software.
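To make that last point concrete, the kind of in-memory index described there might look roughly like this (the row layout is invented for illustration; it is not the poster's actual code):

```cpp
// Sketch of an in-memory index over already-loaded rows: the table rows live
// in a vector after the block has been decrypted/unzipped, and a std::map
// keyed on the article number points back into it. The Row layout is made up.
#include <cstdio>
#include <map>
#include <string>
#include <vector>

struct Row {
    long        articleNumber;
    std::string name;
    double      price;
};

int main() {
    std::vector<Row> table = {
        {1001, "widget", 2.49},
        {1002, "gadget", 7.99},
    };

    // Build the index once after loading; lookups are then O(log n).
    std::map<long, std::size_t> byArticle;
    for (std::size_t i = 0; i < table.size(); ++i)
        byArticle[table[i].articleNumber] = i;

    auto it = byArticle.find(1002);
    if (it != byArticle.end())
        std::printf("%s costs %.2f\n",
                    table[it->second].name.c_str(), table[it->second].price);
    return 0;
}
```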
We compared our current table format against SQLite and EJDB. A file containing about half a million simple article records takes 3.5 MB in our format, 28 MB in SQLite and 100 MB (spread over several files) in EJDB. SQLite and EJDB store the data as plain strings or simple binary values such as doubles, so with a good editor you can match an article number to a price very easily.
The software uses about 40 DLLs with several third-party library dependencies, so switching from 32-bit to 64-bit is a challenge. It also does not solve our problem with the 32-bit terminal servers at our client installations.
Moving to a real database (MySQL, MongoDB, etc.) is a big challenge too, because we update our data every month across this wide base of computers, and there is not always an internet connection available for a real client-server model.
So what can we do?
Use SQLite or EJDB or something else and encrypt the data in each field?
Rework our database so that it keeps the data on disk in smaller chunks and loads those chunks on demand as they are needed?
Keep only the indexes in memory and manage the on-disk data, perhaps with a B-tree strategy?
Time is short, so reinventing the wheel will not help. What would you do or use in such a situation?
take up to 400 MB. Due to the 32-bit address space and memory fragmentation, we can currently load at most two such tables
Aren't you by any chance "loading" these tables by allocating one large chunk of memory and reading the table contents from disk into it? If so, you should switch to loading tables through smaller memory-mapped blocks (probably 4 MB each, which corresponds to the large memory page size). This way you should be able to utilize most of the 3.5 GB address space available to a 32-bit program.
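A minimal sketch of that approach on Windows (where the clients run), using plain Win32 file mapping; the file name, block size and error handling are illustrative assumptions, not working production code:

```cpp
// Map a large table file in small read-only views instead of allocating one
// huge contiguous buffer. Only one 4 MB view is live at a time, so the
// 32-bit address space is never asked for a single 400 MB block.
#include <windows.h>

int main() {
    HANDLE file = CreateFileW(L"bigtable.dat", GENERIC_READ, FILE_SHARE_READ,
                              nullptr, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL,
                              nullptr);
    if (file == INVALID_HANDLE_VALUE) return 1;

    LARGE_INTEGER size;
    GetFileSizeEx(file, &size);

    HANDLE mapping = CreateFileMappingW(file, nullptr, PAGE_READONLY, 0, 0, nullptr);
    if (!mapping) { CloseHandle(file); return 1; }

    const LONGLONG blockSize = 4LL * 1024 * 1024;   // 4 MB views, as suggested above
    for (LONGLONG offset = 0; offset < size.QuadPart; offset += blockSize) {
        LONGLONG remaining = size.QuadPart - offset;
        SIZE_T toMap = (SIZE_T)(remaining < blockSize ? remaining : blockSize);
        void* view = MapViewOfFile(mapping, FILE_MAP_READ,
                                   (DWORD)(offset >> 32), (DWORD)offset, toMap);
        if (!view) break;
        // ... decode / scan the rows that fall inside this block ...
        UnmapViewOfFile(view);
    }

    CloseHandle(mapping);
    CloseHandle(file);
    return 0;
}
```

Map view offsets must be multiples of the allocation granularity (64 KB), which 4 MB steps satisfy, and views can be kept or dropped independently, so a simple cache of recently used blocks also fits this scheme.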
I installed MonetDB and imported an (uncompressed) 291 GB TSV MySQL dump. It worked like a charm and the database is really fast, but it needs more than 542 GB on disk. It seems that MonetDB can also use compression, but I was not able to find out how to enable (or even force) it. How can I do so? I don't know whether it will really speed up execution, but I would like to try.
There is no user-controllable compression scheme available in the official MonetDB release. The predominant compression scheme is dictionary encoding for string-valued columns. In general, a compression scheme reduces the disk/network footprint at the cost of more CPU cycles.
To speed up queries, it might be better to first look at the TRACE of the SQL queries for simple hints on where the time is actually spent. This often gives hints about 'liberal' use of column types; for example, a BIGINT is overkill if the actual value range is known to fit in 32 bits.
I added the SQLite3 source to my project and compiled it, and the resulting file is huge (~400 KB).
I need my file to be as small as possible. What is the best way to do SQLite queries in C++?
When I say best, I mean the smallest possible file size. Are there any other lightweight SQLite libraries for C++?
From the SQLite about page:
"If optional features are omitted, the size of the SQLite library can be reduced below 300 KiB."
I guess it will be hard to go lower, and I don't think there are alternative implementations that do it with less. 400 KB is a lot, but SQLite does a lot too; even a small database server is more than 50 MB. You might get lower by dynamically linking against something like Microsoft ADO, but that brings many potential installation and security problems (and no SQLite file support). My final words: 400 KB sounds like a lot, but today it is pretty small. Many home pages weigh more than 1 MB, and that is even crazier.
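For what it is worth, the plain C API is already about as lightweight as SQLite access gets from C++; the quoted size reduction comes from compile-time SQLITE_OMIT_* options (for example -DSQLITE_OMIT_DEPRECATED or -DSQLITE_OMIT_LOAD_EXTENSION) rather than from a different wrapper. A minimal query sketch, with made-up file, table and column names:

```cpp
// Minimal SQLite query through the plain C API; no wrapper library needed.
// Database, table and column names are invented for this example.
#include <cstdio>
#include "sqlite3.h"

int main() {
    sqlite3* db = nullptr;
    if (sqlite3_open("example.db", &db) != SQLITE_OK) return 1;

    sqlite3_stmt* stmt = nullptr;
    const char* sql = "SELECT id, name FROM items WHERE id < ?";
    if (sqlite3_prepare_v2(db, sql, -1, &stmt, nullptr) == SQLITE_OK) {
        sqlite3_bind_int(stmt, 1, 10);                 // bind the parameter
        while (sqlite3_step(stmt) == SQLITE_ROW) {     // iterate result rows
            int id = sqlite3_column_int(stmt, 0);
            const unsigned char* name = sqlite3_column_text(stmt, 1);
            std::printf("%d %s\n", id, name ? (const char*)name : "(null)");
        }
        sqlite3_finalize(stmt);
    }
    sqlite3_close(db);
    return 0;
}
```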
I have a machine with a 500 MHz CPU and 256 MB RAM running 32-bit Linux.
I have a large number of files, each around 300 KB in size, and I need to compress them very fast. I have set the zlib compression level to Z_BEST_SPEED. Is there any other measure I could take?
Is it possible to compress 25-30 files like this in a second on such a machine?
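For context, the setup described in the question amounts to something like the following zlib call (a sketch with a made-up input buffer; real code would read each file's contents instead):

```cpp
// One-shot compression of a ~300 KB buffer at Z_BEST_SPEED (level 1),
// i.e. favouring speed over compression ratio. Link with -lz.
#include <cstdio>
#include <vector>
#include <zlib.h>

int main() {
    std::vector<unsigned char> input(300 * 1024, 'x');   // stand-in for one file

    uLongf outLen = compressBound(input.size());          // worst-case output size
    std::vector<unsigned char> output(outLen);

    int rc = compress2(output.data(), &outLen,
                       input.data(), input.size(),
                       Z_BEST_SPEED);
    if (rc != Z_OK) return 1;

    std::printf("compressed %zu bytes to %lu bytes\n",
                input.size(), (unsigned long)outLen);
    return 0;
}
```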
You are essentially talking about roughly 10 MB/s of throughput (25-30 files at ~300 KB each). Even just copying the files from one place to another would be doubtful on such slow hardware. So for compression I would vote no: it is not possible "to compress 25-30 files like this in a second on such a machine".