What is the easiest way to persist maps/structs in Clojure?

The obvious way is to load up JDBC support from Clojure Contrib and write some function to translate a map/struct to a table. One drawback of this is that it isn't very flexible; changes to your structure will require DDL changes. This implies either writing DDL generation (tough) or hand-coding migrations (boring).
What alternatives exist? Answers must be ACID, ruling out serializing to a file, etc.

FleetDB is a database implemented in Clojure. It has a very natural syntax for working with maps/structs, e.g. to insert:
(client ["insert" "accounts" {"id" 1, "owner" "Eve", "credits" 100}])
Then select:
(client ["select" "accounts" {"where" ["=" "id" 1]}])
http://fleetdb.org/

One option for persisting maps in Clojure that still uses a relational database is to store the map data in an opaque blob. If you need to search for records, you can store indexes in separate tables. For example, you can read how FriendFeed stores schemaless data on top of MySQL - http://bret.appspot.com/entry/how-friendfeed-uses-mysql
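A minimal sketch of the blob-plus-index idea, written in Python with the standard sqlite3 module purely for illustration (the same schema works from Clojure over JDBC). The table names `entities` and `index_owner` and the indexed key `owner` are assumptions made up for this example, not FriendFeed's actual schema:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# The map itself is stored as one opaque value; the database never looks inside it.
conn.execute("CREATE TABLE entities (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")
# A separate table holds only the field we want to search on (hypothetical: 'owner').
conn.execute(
    "CREATE TABLE index_owner ("
    "owner TEXT NOT NULL, "
    "entity_id INTEGER NOT NULL REFERENCES entities(id))"
)
conn.execute("CREATE INDEX idx_owner ON index_owner (owner)")


def save(entity):
    """Store the whole map as a blob and index its 'owner' key."""
    with conn:  # one transaction: blob row and index row commit together
        cur = conn.execute(
            "INSERT INTO entities (body) VALUES (?)", (json.dumps(entity),)
        )
        conn.execute(
            "INSERT INTO index_owner (owner, entity_id) VALUES (?, ?)",
            (entity["owner"], cur.lastrowid),
        )
    return cur.lastrowid


def find_by_owner(owner):
    """Look up entities through the index table, then decode the blobs."""
    rows = conn.execute(
        "SELECT e.body FROM entities e "
        "JOIN index_owner i ON i.entity_id = e.id "
        "WHERE i.owner = ? ORDER BY e.id",
        (owner,),
    )
    return [json.loads(body) for (body,) in rows]


save({"owner": "Eve", "credits": 100})
save({"owner": "Bob", "credits": 5, "nickname": "bobby"})
```

Adding a new key to the map (like `nickname` above) needs no DDL change; only a new searchable field needs a new index table.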
Another option is to use the Entity-Attribute-Value model (EAV) for storing data in a database. You can read more about EAV on Wikipedia (I'd post a link but I'm a new user and can only post one link).
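As a rough illustration of EAV, here is a sketch in Python with sqlite3 (again only as a neutral stand-in; the table name `eav` and the helper names are invented for this example). Each key of the map becomes one row, so new keys never require a schema change. Note that this simple version stores every value as text:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE eav (
           entity    INTEGER NOT NULL,
           attribute TEXT    NOT NULL,
           value     TEXT,
           PRIMARY KEY (entity, attribute))"""
)


def put(entity_id, mapping):
    """Write each key/value pair of the map as one (entity, attribute, value) row."""
    with conn:  # all attributes of one update commit atomically
        conn.executemany(
            "INSERT OR REPLACE INTO eav (entity, attribute, value) VALUES (?, ?, ?)",
            [(entity_id, key, str(value)) for key, value in mapping.items()],
        )


def get(entity_id):
    """Reassemble the map for one entity from its attribute rows."""
    rows = conn.execute(
        "SELECT attribute, value FROM eav WHERE entity = ?", (entity_id,)
    )
    return dict(rows.fetchall())


put(1, {"owner": "Eve", "credits": 100})
put(1, {"nickname": "evie"})  # a new attribute, no DDL needed
```

The trade-off is that type information is lost unless you add a type column, and queries spanning several attributes need self-joins.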
Yet another option is to use BerkeleyDB for Java - it's a native Java solution providing ACID and record level locking. (Same problem with posting a link).

Using CouchDB's Java-client lib and clojure.contrib.json.read/write works reasonably well for me. CouchDB's consistency guarantees may not be strong enough for your purposes, though.

Clj-record is an implementation of active record in clojure that may be of interest to you.

You could try one of the Java-based graph databases, such as Neo4J. It might be easy to code up a hashmap interface to make it reasonably transparent.

MongoDB and its framework congomongo (lein: [congomongo "0.1.3-SNAPSHOT"]) work for me. Schemaless databases are incredibly nice to work with, and congomongo is quite easy to get along with. MongoDB adds an _id field to every document to identify it, and there is quite good transparency between Clojure maps and Mongo maps.
https://github.com/somnium/congomongo
EDIT: I would not use MongoDB today. I would suggest you use transit. I would use JSON if the backend (Postgres etc.) supports it, or the msgpack encoding if you want a more compact binary format.

Related

What's the efficiency of the QSQLITE database in Qt?

I am very new to databases and I am trying to implement an offline map viewer. What would be the efficiency of QSqlDatabase?
To take an extreme example: is it possible to download all satellite images at all detail levels of the US from Google's map server, store them in a local SQLite database, and still perform real-time queries based on my current GPS location?
The Qt Database driver for SQLite uses SQLite internally (surprise!). So the question is more like: Is SQLite the right database to use? My answer: I would not use it to store geographical data; consider looking for a database which is optimized for this task.
If this is not an option, SQLite is really efficient. First check that your data is within its limits. Do not forget to create indexes and analyze the database. Then it should be able to handle your task. Here I assume you just want to get an image by its geographical position (other solutions can be a lot faster because your data is sortable; if I remember correctly, SQLite is not optimized for that).
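To make the "create indexes and analyze" advice concrete, here is a small sketch using Python's sqlite3 module rather than Qt's QSqlDatabase (the SQL is the same either way). The `tiles` table and its zoom/x/y layout are assumptions for illustration, not a recommended tile scheme:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tiles (zoom INTEGER, x INTEGER, y INTEGER, image BLOB)")
# Index on the lookup key so a tile is found without scanning the table.
conn.execute("CREATE UNIQUE INDEX idx_tiles_pos ON tiles (zoom, x, y)")

with conn:  # bulk insert inside a single transaction
    conn.executemany(
        "INSERT INTO tiles VALUES (?, ?, ?, ?)",
        [
            (z, x, y, b"fake-image-bytes")
            for z in range(3)
            for x in range(8)
            for y in range(8)
        ],
    )

# Gather statistics so the query planner can make informed choices.
conn.execute("ANALYZE")


def tile_at(zoom, x, y):
    """Fetch one tile image by its position, or None if it is not stored."""
    row = conn.execute(
        "SELECT image FROM tiles WHERE zoom = ? AND x = ? AND y = ?",
        (zoom, x, y),
    ).fetchone()
    return row[0] if row else None


# Ask SQLite how it would run the lookup; the last column describes the plan.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT image FROM tiles WHERE zoom = 1 AND x = 2 AND y = 3"
).fetchall()
uses_index = any("idx_tiles_pos" in row[-1] for row in plan)
```

Checking the query plan like this is a cheap way to confirm the index is actually used before loading a large data set.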
As you will store large blobs, you may want to have a look at the Internal Versus External BLOBs in SQLite document. Maybe this gives you the answer already.

Are small changes in large documents something document databases are good for?

Sometimes documents, with their free-form structure, are attractive for storing data (in contrast to a relational database). But one problem is persistence in combination with making small changes to the data, since the entire document has to be rewritten to disk.
So my question is, are "document databases" especially made to solve this?
UPDATE
I think I understand the concept of "document oriented databases" better now. It's obviously not documents of any kind; each implementation uses its own format, such as JSON. And then the answer to my question also becomes obvious. If the entire JSON-structure had to be rewritten to disk after each change to keep it persisted, it wouldn't be a very good database.
"If the entire JSON-structure had to be rewritten to disk after each change to keep it persisted, it wouldn't be a very good database."
I would say this is not true of any document database I know of. For example, Mongo doesn't store documents as JSON, it stores them as BSON (http://en.wikipedia.org/wiki/BSON).
Also, databases like Mongo will store documents in RAM and persist them to disk later.
In fact many document databases will follow that pattern of storing documents in main memory and then writing them to disk.
But the fact that a given document database will write data to disk - and the fact that some documents might get changed a lot - does not mean the database is non-performant. I wouldn't disregard document databases based on speculation.

Raw Binary Tree Database or MongoDb/MySQL/Etc?

I will be storing terabytes of information, before indexes, and after compression methods.
Should I code up a Binary Tree Database by hand using sort files etc, or use something like MongoDB or even something like MySQL?
I am worried about the (space) cost per record with things like MySQL and other DBs that are around. I also know that some databases allow for compression, but they convert to read-only tables. These tables/records need to be accessed and overwritten with new data fairly often. I think if I were to code something in C++ I'd be able to keep the cost of space per record to a minimum.
What should I do?
There are new non-relational databases that are becoming popular these days, that specialize in managing large-scale data.
Check out Hadoop or Cassandra; both are Apache projects.

Want to store profiles in Qt, use SQLite or something else?

I want to store some settings for different profiles of what a "task" does.
I know in .NET there's a nice ORM. Is there something like that, or an Active Record, or whatever? I know writing a bunch of SQL will be fun.
I'm going to agree with Micheal E and say that you can use QJson, but no, you don't have to manage serialization. QJson has a QObject->QJson serializer/deserializer. So as long as all your relevant data is exposed via Q_PROPERTY, QJson can grab it and write/read it to/from the disk.
Examples here: http://qjson.sourceforge.net/usage.html
From there you can simply dump the data into a file.
One option would be to serialize objects to JSON with QJson. You still need to manage serialization, but it could well be a lot simpler if you don't need sophisticated query capabilities.
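The serialize-to-JSON-and-dump-to-a-file approach can be sketched as follows. This uses Python's standard json module as a stand-in for QJson, so none of the names below are QJson API; the profile contents and the `save_profiles`/`load_profiles` helpers are made up for the example:

```python
import json
import os
import tempfile


def save_profiles(path, profiles):
    """Write all profiles to a temp file, then swap it into place in one step."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(profiles, f, indent=2)
    # os.replace swaps the file in one step, so readers never see a half-written file.
    os.replace(tmp_path, path)


def load_profiles(path):
    """Read the profiles back as plain dictionaries."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Hypothetical settings for two task profiles.
profiles = {
    "nightly": {"task": "backup", "retries": 3},
    "quick": {"task": "sync", "retries": 0},
}
path = os.path.join(tempfile.mkdtemp(), "profiles.json")
save_profiles(path, profiles)
```

This is enough when profiles are loaded and saved as a whole; once you need to query across them, a database such as SQLite is the better fit.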

Best way to store data in C++

I'm just learning C++, just started to mess around with Qt, and I am sitting here wondering how most applications save data. Is there an industry standard? Do they store it in an XML file, text file, SQLite? What about sensitive data that, say, accounting software would need to save? I'm just interested in learning what the best practices for this are.
Thanks
This question is way too broad. The only answer is it depends on the nature of the particular application and the data, and whether or not it is written in C++ has very little to do with it.
For example, user-configurable application settings are often stored in text files, but on Windows they are typically stored in the Registry. Accounting applications typically keep their data in a database of some sort.
There are many good ways to store application data (call it serialization).
Personally, I think for larger datasets, using an open format is much, much easier for debugging. If you go with XML, for example, you can store your data in an open form so that if you have file corruption issues (e.g. a client can't open your file for some reason), the problem is easier to find. If you have sensitive data in there, you can always encrypt it before writing it to file using key-based encryption. Microsoft, for instance, has gone from using a proprietary format to Open XML in their Office docs. They use .*x extensions (.docx, .xlsx, etc.). Each file is really just a compressed folder (a ZIP archive) with XML files.
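To show what "a compressed folder with XML files" looks like in practice, here is a small sketch in Python using only the standard library. The entry name `data/records.xml` and the record layout are invented for the example and have nothing to do with the real Office file layout:

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def write_container(records):
    """Serialize (name, amount) pairs to XML and pack them into a ZIP archive."""
    root = ET.Element("records")
    for name, amount in records:
        item = ET.SubElement(root, "record", name=name)
        item.text = str(amount)
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("data/records.xml", ET.tostring(root, encoding="unicode"))
    return buf.getvalue()


def read_container(blob):
    """Unpack the archive and parse the XML back into (name, amount) pairs."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        root = ET.fromstring(zf.read("data/records.xml"))
    return [(rec.get("name"), int(rec.text)) for rec in root]


blob = write_container([("rent", 1200), ("coffee", 4)])
```

Because the result is an ordinary ZIP archive, a support engineer can open a customer's file with any archive tool and read the XML directly.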
Using binary serialization is, of course, the industry standard at the moment for most standalone applications. Most likely that is because of the application framework they are using (such as MFC, which is old). If you take a look at most of the serialization techniques in modern application frameworks, XML serialization is very well supported.
First you need to clarify what kind of data you would like to save.
If you just want to save some application settings, use QSettings to save your settings to an INI file or registry.
If it is much more than just some application settings, go for XML files or SQL.
There is no standard practice, however if you want to use complex structured data, consider using an embedded database engine such as SQLite or Metakit, or Berkeley DB files. XML files would also do the job and be human readable/writable. Preferences can use INI files or the Windows registry, and so on. In short, it really depends on your usage pattern.
This is a general question. Like many things, the right answer depends on your application and its needs.
Most desktop applications save end-user data to a file (think Word and Excel). The format is up to you, XML, binary, etc. And if you can serialize/deserialize objects to file it will probably make your life easier.
Internal application data such as configuration files or temporary data might be saved to an XML file or a lightweight, local database such as SQLite.
Often, "enterprise" applications used internally by a business will save their data to a back-end database such as SQL Server or Oracle. This is so all of the enterprise's data is saved to a single central location. And then it is available for reporting, etc.
For accounting software, you would need to consider the business domain and end users. For example, if the software is to be sold to large businesses you would probably use some form of a database to save data. Otherwise a binary file would be fine, perhaps with some form of encryption if you are really paranoid.
When you say "the best way", then you have to define what you mean by "good".
The problem is that various requirements conflict with each other, so you can't satisfy all of them simultaneously.
For example, if one requirement is "concurrent multi-user access to the data" then this suggests using a database engine, but that conflicts with "as small as possible" and "minimize dependencies on 3rd-party software".
If a requirement is "portable data format" then this suggests XML, but that conflicts with "compact" and "indexed".
"Do they store it in an XML file, text file, SQLite?"
Yes.
Also, binary files and relational databases.
Anything else?