ORM for Clojure?

I was reading this post about the Clojure web stack:
http://brehaut.net/blog/2011/ring_introduction
and it has this to say about ORMs for Clojure:
"There are no SQL/Relational DB ORMs for Clojure for obvious reasons."
The obvious reason I can see is that the mapping to objects happens automatically when you do a clojure.contrib.sql or ClojureQL query. However, it seems some extra work is needed for one-to-many or many-to-many relations (although maybe not too much work).
I found this write-up for one-to-many: http://briancarper.net/blog/493/
I'm not sure I agree with it; it appears to assume that both tables are pulled from the database in full and the joined table is then filtered in memory. In practice I'd expect the SQL query itself to specify the WHERE criteria.
So I'm left wondering: is there some fairly obvious way to automatically handle one-to-many relations via ClojureQL or clojure.contrib.sql? The only thing I can think of is something like this (using the typical blog post/comment example):
(use 'clojureql.core) ; brings in table, select, where

(defn post [id]
  @(-> (table :posts)
       (select (where (= :id id)))))

(defn comments [post-id]
  @(-> (table :comments)
       (select (where (= :post_id post-id)))))

(defn post-and-comments [id]
  (assoc (first (post id)) :comments (comments id)))
Is there any way to sort of automate this concept or is this as good as it gets?

I asked this question quite a while ago, but I ran across the following and decided to add it as an answer in case anyone is interested:
http://sqlkorma.com/

The "obvious" reason you don't need ORM as such in Clojure is because idiomatic Clojure doesn't have objects, per se.
The best way to represent data in a Clojure program is as lazy seqs of simple data structures (maps & vectors). Mapping these to SQL rows is much less complex, and has much less of an impedance mismatch, than full-blown ORM.
Also, regarding the part of your question about forming a complex SQL query: reading your code, it doesn't really have any clear advantage over SQL itself. Don't be afraid of SQL! It's great for what it does: relational data manipulation.
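To make the "rows as maps" point concrete, here is a minimal sketch using clojure.java.jdbc (the successor to clojure.contrib.sql); the db-spec and table are invented for illustration:
(require '[clojure.java.jdbc :as jdbc])

(def db {:dbtype "postgresql" :dbname "blog"}) ; hypothetical connection map

;; query returns a lazy seq of plain Clojure maps -- no mapping layer needed
(jdbc/query db ["SELECT id, title FROM posts WHERE id = ?" 1])
;; => ({:id 1, :title "Hello world"})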

There's still no high-level library to create complex relational queries that I know of. There are many ways to tackle this problem (the link you provided is one of them), but even though ClojureQL provides a really nice DSL you can build upon, it still misses some important features. Here's a quick and dirty example of a macro that generates nested joins:
(defn parent-id
  "Qualified primary-key column of a parent table, e.g. :users -> :users.user_id."
  [parent]
  (let [id (str (apply str (butlast (name parent))) "_id")]
    (keyword (str (name parent) "." id))))

(defn child-id
  "Qualified foreign-key column on the child, e.g. :users/:posts -> :posts.user_id."
  [parent child]
  (let [parent (apply str (butlast (name parent)))]
    (keyword (str (name child) "." parent "_id"))))

(defn join-on
  "Wraps query (or the parent table) in a ClojureQL join on the key columns."
  [query parent child]
  `(join ~(or query `(table ~parent)) (table ~child)
         (where
          (~'= ~(parent-id parent)
               ~(child-id parent child)))))

(defn zip [a b] (map vector a b))

(defmacro include [parent & tables]
  (let [pairs (zip (conj tables parent) tables)]
    (reduce (fn [query [parent child]] (join-on query parent child)) nil pairs)))
With this you could do (include :users :posts :comments) and get this SQL out of it:
SELECT users.*,posts.*,comments.*
FROM users
JOIN posts ON (users.user_id = posts.user_id)
JOIN comments ON (posts.post_id = comments.post_id)
There's one major issue with this technique, though. The returned columns for all tables will be bundled together into the same map, and as the column names can't be qualified automatically, it won't work if there are similarly named columns in different tables. This also prevents you from grouping the results without access to the schema. I don't think there's a way around knowing the database schema for this kind of thing, so there's still a lot of work to do. I think ClojureQL will always remain a low-level library, so you'll need to wait for some higher-level library to appear or create your own.
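For illustration, if the rows did come back with schema-qualified keys (an assumption; all names below are invented), the regrouping step itself would only be a few lines:
(defn nest-comments
  "Regroups flat join rows into one map per post with a :comments seq."
  [rows]
  (for [[post-id grp] (group-by :posts.post_id rows)]
    (assoc (select-keys (first grp) [:posts.post_id :posts.title])
           :comments (map #(select-keys % [:comments.comment_id :comments.body]) grp))))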
To create such a library, you could have a look at JDBC's DatabaseMetaData class, which provides information about the database schema. I'm working on a database analyzer for Lobos that uses it (and some custom stuff), but I'm still far from starting to work on SQL queries, which I might add in version 2.0.
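As a minimal sketch of what that looks like from Clojure (assuming you already hold an open java.sql.Connection):
(defn table-columns
  "Returns the column names of table using JDBC's DatabaseMetaData."
  [^java.sql.Connection conn table]
  (let [rs (.getColumns (.getMetaData conn) nil nil table nil)]
    (loop [cols []]
      (if (.next rs)
        (recur (conj cols (.getString rs "COLUMN_NAME")))
        cols))))

;; (table-columns conn "users") => ["user_id" "name" ...]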

At the risk of swimming in the waters with some of the SO heavy hitters (to mix my metaphors, thoroughly ;) - surely one of the best features of ORM is that, in the vast majority of cases, the pragmatic programmer never has to use or even think about SQL. At worst, some hacky programming with the results of a couple of queries may be required, on the basis that it can be converted into raw SQL when that optimisation is needed, of course ;).
To say that ORM is not required for the 'obvious' reason somewhat misses the point, and starting to use a DSL to model SQL compounds the mistake. In the vast majority of web frameworks, the object model is a DSL used to describe the data being stored by the web app, and SQL is merely the declarative language required to communicate this to the database.
Order of steps when using RoR, Django or Spring:
1. Describe your models in an OOP format
2. Open a REPL and make some example models
3. Build some views
4. Check results in a web browser
Ok, so you might use a slightly different order, but hopefully you get the point. Thinking in SQL, or in a DSL that describes it, is a long way down the list. Instead, the model layer abstracts all the SQL away, allowing us to create data objects that closely model the data we wish to use in the web site.
I would fully agree that OOP is no silver bullet; however, modelling data in a web framework is something it is definitely good for, and exploiting Clojure's ability to define and manipulate Java classes would seem to be a good match here.
The examples in the question clearly demonstrate how painful SQL can be, and DSLs like Korma are only a partial solution: "Let's suppose that we have some tables in a database..." - er, I thought my DSL was going to create those for me? Or is this just something an OOP language does better? ;)
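As a small illustration of that point about Clojure defining Java classes, a record-based model might look like this sketch (record names and fields invented):
(defrecord Post [id title body])
(defrecord Comment [id post-id body])

(def p (->Post 1 "Hello" "First post!"))

;; records are real Java classes but still behave like maps:
(:title p)            ;; => "Hello"
(assoc p :draft? true)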

Have you checked out the Korma library (http://sqlkorma.com/)? It allows you to define table relationships and abstract over joins. I think a major reason there aren't any ORMs for Clojure is that they go against the ideas of simplicity the language was founded on by Rich Hickey. Check out this talk: http://www.infoq.com/presentations/Simple-Made-Easy
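For the post/comments example from the original question, a Korma version looks roughly like this sketch (connection details and column names are assumptions):
(use 'korma.db 'korma.core)

(defdb db (postgres {:db "blog" :user "user" :password "secret"}))

(defentity comments)
(defentity posts
  (has-many comments))

;; one call fetches each matching post with its comments attached
(select posts
        (where {:id 1})
        (with comments))
Here the has-many declaration replaces the hand-written join from the question.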

The library known as aggregate can take care of most of the itches you describe. It's not a full-scale ORM, but if you tell it the relationship graph of your database schema, it provides CRUD implementations that automatically walk that graph. It's useful if you are already using something like Yesql or raw SQL queries, because it plugs easily into an implementation built on simple result maps.

Whether you want to use them or not, there was already aggregate, and there is Toucan now (and they apparently read the same link as you did).

ORM is premature optimization. Walkable is a newer SQL library for Clojure that takes a holistic approach. Check it out here: https://github.com/walkable-server/walkable
A real-world example for those skeptical of fancy READMEs: https://github.com/walkable-server/realworld

Related

Can you eagerly load all data into memory with Datomic?

I've previously asked this question on how to improve performance with Datomic, but I have yet to find a good solution. One thing that struck me was that when using the Datomic Console to execute the query, I got the impression that the query was MUCH faster. But I also noticed a great increase in startup time and memory consumption when using the Datomic Console compared to when I start my application standalone. This suggests to me that the Datomic Console pulls all data into memory before I explore the contents.
Am I right that this is the case?
If so, is this something I could do myself programmatically from a peer?
If (2) then how can this be done in Clojure?
As described here in the Datomic documentation, the Peer Library loads index segments into the (in-process) Object Cache as it fetches them for querying.
Am I right that this is the case?
I doubt that the Datomic Console explicitly chooses to pull all datoms into memory, but it is possible that the Datomic console eagerly traverses a large chunk of your data in order to show its dashboard.
If so, is this something I could do myself programmatically from a peer?
Well, I guess you could always artificially scan through all the segments. One easy way to do this is via the Datoms API.
If (2) then how can this be done in Clojure?
(require '[datomic.api :as d])

(defn scan-whole-db
  "Touches every index segment so the peer pulls them into the object cache."
  [db]
  (doseq [index [:eavt :aevt :avet :vaet]]
    (dorun (seq (d/datoms db index)))))
That all being said, I'm not at all sure you should expect performance improvements from this strategy. Your Object Cache had better be large enough!
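For reference, the peer's object cache ceiling is set with the datomic.objectCacheMax JVM system property, e.g. (jar name hypothetical):
java -Ddatomic.objectCacheMax=4g -jar my-peer.jar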

Understanding Data-centric app and object composition in Clojure

I've recently been much impressed by the work of Chris Granger and his Light Table. This question is not about Light Table itself, though, but about the "BOT" architecture he described in his blog post "The IDE as a value":
http://www.chris-granger.com/2013/01/24/the-ide-as-data/
Now, I'm pretty new to Clojure but would like to explore this way of programming further: Behavior, Object, Tag:
(behavior* :read-only
           :triggers #{:init}
           :reaction (fn [this]
                       (set-options this {:readOnly "nocursor"})))

(object* :notifier
         :triggers [:notifo.click :notifo.timeout]
         :behaviors [:remove-on-timeout :on-click-rem!]
         :init (fn [this]
                 [:ul#notifos
                  (map-bound (partial notifo this) notifos)]))

(object/tag-behaviors :editor.markdown [:eval-on-change :set-wrap])
Where can I find Clojure code that uses this style and these composition principles?
BOT sounds like Light Table's own flavor of the Entity-Component-System (ECS) architecture. I would start with the Wikipedia entry and then go to this post with code examples in ActionScript (we are in the world of games here).
There are also some examples in a Clojure context.
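To make the ECS idea concrete in Clojure terms, here is a toy sketch (all names invented; this is not Light Table's actual implementation). Entities are ids, components are plain maps, and a system is just a function over the world:
(def world
  {:entities {1 {:position {:x 0 :y 0} :velocity {:dx 1 :dy 2}}
              2 {:position {:x 5 :y 5}}}})

(defn move-system
  "Advances every entity that has both :position and :velocity components."
  [world]
  (update world :entities
          (fn [es]
            (into {}
                  (for [[id comps] es]
                    [id (if-let [{:keys [dx dy]} (:velocity comps)]
                          (-> comps
                              (update-in [:position :x] + dx)
                              (update-in [:position :y] + dy))
                          comps)])))))

(move-system world) ;; entity 1's position becomes {:x 1 :y 2}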

When should a person use native queries with JPA 2.0 instead of JPQL or CriteriaBuilder?

I am very confused about when and when not to use native queries in JPA 2.0. I was under the impression that using native queries could cause me to get out of sync with the JPA cache. If I can accomplish the same thing with a JPQL or CriteriaBuilder query, is there any good reason to use a native query? Similarly, is there any danger in using a native query if I can accomplish the same thing with JPQL or CriteriaBuilder? And finally, if there is a danger in using a native query as far as getting out of sync with the JPA cache, would the same danger exist in executing an equivalent query with JPQL or CriteriaBuilder?
My philosophy has been to avoid native queries, but surely there are times when they are necessary. It seems to me that if I can do it with JPQL or CriteriaBuilder, then I should.
Thanks.
I agree with your philosophy.
The main problem with native queries, IMHO, is maintainability. First of all, they're generally more complex and longer than JPQL queries. They also hardcode table and column names rather than using class and property names.
JPQL queries are already problematic when refactoring, because they hard-code class and property names in strings. But native queries are even worse, because they hard-code table and column names everywhere.
I don't think native select queries are a problem with regard to the cache. Native update, insert, and delete queries are a problem, though, because they modify data behind the back of the first- and second-level caches, which might then become stale.
Another problem is that your native queries could use a syntax that is recognized by one database but not by another, making the application harder to migrate from one database to another.

applying clojure

I have been following Clojure for some time, and some of its features are very exciting (persistent data structures, the functional approach, immutable state). However, since I am still learning, I would like to understand how to apply it in real scenarios, prove its benefits, and then evolve and apply it to more complex problems; i.e., what are the easy wins with Clojure (e.g. in an e-commerce setup) that can be used to learn as well as to ascertain its benefits?
I have investigated Clojure-based web frameworks, but I am not keen on them, as they need hand-written JavaScript (as against GWT). So for me, it is more about backend processing. Can someone explain where they applied Clojure (in real deployments), how it proved useful, and the cons, if any, of using it?
Further analysis:
Lazy evaluation is an oft-cited example of the power of Lisp. Clojure, being a Lisp, offers the same advantage, so a real-world example of such an application (in the context of Clojure) would help me gain insight.
You mentioned that you work with CSV files. I found Clojure very helpful here, because I had to parse a CSV file (using clojure-csv), extract certain columns from it with sequence functions, interleave HTTP form field names using zipmap, and then make HTTP calls to an ASP application using clj-http.client.
(def accumail-url-keys ["CA", "STREET", "STREET2", "CITY", "STATE", "ZIP", "YR", "BILL_NO", "BILL_TYPE"])
.
.
.
(defn ret-params
  "Generates all q-parameters and returns them in a vector of vectors."
  [all-csv-rows]
  (reduce
   (fn [param-vec one-full-csv-row]
     (let [q-param (zipmap accumail-url-keys one-full-csv-row)
           accu-q-param (first (rest (split-at 3 q-param)))
           billing-param (first (split-at 3 q-param))]
       (conj param-vec [accu-q-param billing-param])))
   []
   all-csv-rows))
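For context, a hypothetical end-to-end sketch tying those pieces together (the file name, URL and endpoint are invented):
(require '[clojure-csv.core :as csv]
         '[clj-http.client :as http])

(def rows (csv/parse-csv (slurp "bills.csv"))) ; seq of vectors, one per CSV row

(doseq [[accu-q-param _billing-param] (ret-params rows)]
  ;; send each row's address fields as query params to the (assumed) ASP endpoint
  (http/get "http://example.com/accumail.asp"
            {:query-params (into {} accu-q-param)}))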
That project was an accelerated Clojure learning exercise.
Two sites, 4Clojure.com and http://www.cis.upenn.edu/~matuszek/cis554-2010/Assignments/clojure-01-exercises.html, are good places to start working on Clojure exercises. You can build on those.
Also, the Clojure Google Group is a very helpful place to get information.
The UPenn CIS exercises, as simple as they might seem, have given me a lot to digest, especially getting the skeleton of a tree; recently the skeleton problem got a long discussion in the Clojure Google Group.
Good luck.

Query building in a database-agnostic way

In a C++ application that can use just about any relational database, what would be the best way of generating queries that can be easily extended to allow for a database engine's eccentricities?
In other words, the code may need to retrieve data in a way that is not consistent among the various database engines. What's the best way to design the code on the client side to generate queries in a way that will make supporting a new database engine a relatively painless affair.
For example, if I have (MFC) code that looks like this:
CString query = "SELECT id FROM table";
results = dbConnection->Query(query);
and we decide to support some database that uses, um, "AVEC" instead of "FROM". Now whenever the user uses that database engine, this query will fail.
Options so far:
Worst option: have the code making the query check the database type.
Better option: create a query request method on the db connection object that takes a unique query "code" and returns the appropriate query based on the database engine in use.
Betterer option: create a query builder class that allows the caller to construct queries without using any SQL directly. Once the query is completed, the caller can invoke a "Generate" method which returns a query string appropriate for the active database engine.
Best option: ??
Note: the database engine itself is abstracted away through some thin layers of our own creation. The queries themselves are the only remaining problem.
Solution:
I've decided to go with the "better" option (the query "selector") for two reasons.
Debugging: as mentioned below, debugging is going to be slightly easier with the selector approach, since the queries are pre-built and listed out in a readable form in the code.
Flexibility: it occurred to me that some databases might have vastly better and completely different ways of solving a particular query. For example, with Access I perform a complicated query on multiple tables each time because I have to, but on SQL Server I'd like to set up a view. Selecting from a view and from several tables are completely different queries (I think), and this query selector would handle that easily.
You need your own query-writing object, which can be inherited from by database-specific implementations.
So you would do something like:
DbAgnosticQueryObject* query = new PostgresSQLQuery();
query->setFrom("foo");
query->setSelect("id");
// and so on
CString queryString = query->toString();
It can get pretty complicated in there once you go past simple selects from a single table. There are already ORM packages out there that deal with a lot of these nuances; it may be worth looking at them instead of writing your own.
Best option: Pick a database, and code to it.
How often are you going to up and swap out the database on the back end of a production system? And even if you did, you'd have a lot more to worry about than minor syntax issues. (Major stuff, like join syntax and even datatypes, can differ widely between databases.)
Now, if you are designing a commercial application where you want the customer to be able to use one of several back-end options when they implement it, then you may have to specify "we support Oracle, MS SQL, or MySQL" and code to those specific options.
All of your options can be reduced to
Worst option: have the code making the query check the database type.
It's just a matter of where you're putting the logic to check the database type.
The option that I've seen work best in practice is
Better option: Create query request method on the db connection object that takes a unique query "code" and returns the appropriate query based on the database engine in use.
In my experience it is much easier to test queries independently from the rest of your code. It gets a lot harder if you have objects that are piecing together queries from bits of syntax, because then you have to test the query-creation code and the query itself.
If you pull all of your SQL out into separate files that are written and maintained by hand, you can have someone who is an expert in SQL write them (you can still automate the testing of these queries). If you try to write query-generating functions you'll essentially have a C++ expert writing SQL.
Choose an ORM and start mapping.
If you are to support more than one DB, your problem is only going to get worse.
And just think of the databases that are coming: cloud DBs with no (or close to no) SQL, and object databases.
Take your queries outside the code - put them in the DB or in a resource file and allow overrides for different database engines.
If you use stored procedures it's potentially even easier, since the SPs abstract away your database differences.
I would think that what you would want to do, if you need the ability to support multiple databases, is create a data provider interface (or abstract class) and associated concrete implementations. The data provider would need to support your standard query operators and other common functionality required to support your query operations (have a look at the IEnumerable extension methods in .NET 3.5). Each concrete provider would then translate these into specific queries based on the target database engine.
Essentially, you create a database abstraction layer and have your code interact with it. If you can find one of these for C++, it would probably be worth buying instead of writing. You may also want to look at Inversion of Control (IoC) containers for C++ that would basically do this and more. I know of several for Java and C#, but I'm not familiar with any for C++.