Datomic - working with OR clause - clojure

I'm currently migrating my Clojure app (with Korma) to Datomic and have been going in circles while translating the queries. I realise the queries are not completely flexible (compared to Korma); for example, I would like to evaluate conditional clauses around different variables.
Considering a korma query,
(select users
(where (or (and {:first_name [= "user"]}
{:last_name [= "sample"]})
{:email [= "user.sample#email.com"]})))
Can this be converted to Datomic with something like this?
[:find ?u
 :where (or (and [?u :user/first-name "user"]
                 [?u :user/last-name "sample"])
            [?u :user/email "user.sample#email.com"])]
But this is not the recommended way of querying (according to the Datomic docs), as all clauses used in an or clause must use the same set of variables. How do I set an OR clause around different sets of variables?

Your query should work. All of your clauses do use the same variable: ?u
(d/q '[:find ?u
       :where (or (and [?u :user/first-name "user"]
                       [?u :user/last-name "sample"])
                  [?u :user/email "user.sample#email.com"])]
     [[1 :user/first-name "user"]
      [1 :user/last-name "sample"]
      [2 :user/email "user.sample#email.com"]
      [3 :user/first-name "user"]
      [3 :user/last-name "not sample"]])
=> #{[1] [2]}
If you need them to use different variables, you can use or-join to explicitly list them:
(d/q '[:find ?u
       :in $ ?first-name ?last-name ?email
       :where (or-join [?u ?first-name ?last-name ?email]
                (and [?u :user/first-name ?first-name]
                     [?u :user/last-name ?last-name])
                [?u :user/email ?email])]
     [[1 :user/first-name "user"]
      [1 :user/last-name "sample"]
      [2 :user/email "user.sample#email.com"]
      [3 :user/first-name "user"]
      [3 :user/last-name "not sample"]]
     "user"
     "sample"
     "user.sample#email.com")
=> #{[1] [2]}
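As a rough illustration of what the or / and combination computes, here is a minimal plain-Clojure sketch that needs no Datomic at all. It treats the same sample datoms as a vector and uses set union and intersection; the helper name entities-with is hypothetical and only exists for this example.

```clojure
(require '[clojure.set :as set])

;; Same sample data as in the answer above, as plain vectors.
(def datoms
  [[1 :user/first-name "user"]
   [1 :user/last-name "sample"]
   [2 :user/email "user.sample#email.com"]
   [3 :user/first-name "user"]
   [3 :user/last-name "not sample"]])

(defn entities-with
  "Hypothetical helper: set of entity ids that have attribute a with value v."
  [a v]
  (set (for [[e a' v'] datoms
             :when (and (= a a') (= v v'))]
         e)))

;; (or (and first-name last-name) email) becomes union of an intersection.
(def matches
  (set/union
    (set/intersection (entities-with :user/first-name "user")
                      (entities-with :user/last-name "sample"))
    (entities-with :user/email "user.sample#email.com")))

(println matches) ;; prints #{1 2}
```

Entity 3 is excluded because it only satisfies half of the and branch, which mirrors the query result shown above.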

This is very similar to this question:
SQL LIKE operator in datomic
You need to check out query rules.
http://docs.datomic.com/query.html
Your query would look something like this (untested!):
(let [rules '[[(find-user ?user ?fname ?lname ?email)
               [?user :user/first-name ?fname]
               [?user :user/last-name ?lname]]
              [(find-user ?user ?fname ?lname ?email)
               [?user :user/email ?email]]]]
  ;; d/q takes a database value, so derive one from the connection
  (d/q '[:find ?user
         :in $ % ?fname ?lname ?email
         :where (find-user ?user ?fname ?lname ?email)]
       (d/db conn) rules "user" "sample" "user.sample#email.com"))

Related

How to create PersistentArrayMap preventing evaluation?

My question is quite simple but weird at the same time: I would like to create a PersistentArrayMap without evaluating it, while still getting the value of some symbols inside it. What is the best solution?
Imagine that I have this def:
(def queen "catherine the great")
And would like to do something like this (single quote):
'{:queen queen}
The output is, of course,
=> {:queen queen}
But I expected to get something like this:
=> {:queen "catherine the great"}
I know that I can just do this:
(array-map :queen queen)
But in my case I would like to evaluate only some of the information, because my map is more complicated, like a Datomic query:
'{:find [(pull $ ?c [*])]
:with []
:in [$ ?queen-name]
:where [
[$ ?c :queen/name ?queen-name]
]
:args [queen-name]}
In this case I would like to evaluate only queen-name.
My question is: is there a simple way to do it? Maybe using update?
Something like this?
(assoc-in '{:find [(pull $ ?c [*])]
:with []
:in [$ ?queen-name]
:where [
[$ ?c :queen/name ?queen-name]
]
:args []} [:args] ["catherine the great"])
For both examples, you can use syntax-quote and unquote:
user=> (def queen-name "catherine the great")
#'user/queen-name
user=> `{:queen ~queen-name}
{:queen "catherine the great"}
user> {:find '[(pull $ ?c [*])]
:with '[]
:in '[$ ?queen-name]
:where '[
[$ ?c :queen/name ?queen-name]
]
:args `[~queen-name]}
{:find [(pull $ ?c [*])],
:with [],
:in [$ ?queen-name],
:where [[$ ?c :queen/name ?queen-name]],
:args ["catherine the great"]}
If you want to do this in general, you can use tupelo.quote.
(ns demo.core
(:require [tupelo.quote :as q]))
; problem: free symbols a and b are fully-qualified using current ns
`[a b ~(+ 2 3)] => [demo.core/a
demo.core/b
5]
(q/tmpl-fn '[a b (insert (+ 2 3))]) => [a b 5]
(let [a 1 b 2]
(q/tmpl [a b (insert (+ 2 3))])) => [1 2 5]
(is= [1 [2 3 4] 5] (q/tmpl [1 (insert (t/thru 2 4)) 5]))
(is= [1 2 3 4 5] (q/tmpl [1 (splice (t/thru 2 4)) 5]))
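The namespace-qualification problem noted in the comment above can also be worked around in plain Clojure, without an extra library, by unquoting a quoted symbol (~'sym). The following is a minimal sketch of that idiom; the map is a simplified version of the query map from the question, and the var names qualified and query-map are only for illustration.

```clojure
(def queen-name "catherine the great")

;; Plain syntax-quote qualifies bare symbols with the current namespace,
;; so ?c would come out as something like user/?c.
(def qualified `[?c :queen/name ~queen-name])

;; ~'sym unquotes a quoted symbol, so the symbol is inserted unqualified,
;; while ~queen-name inserts the value of the var.
(def query-map
  `{:find  [~'?c]
    :in    [~'$ ~'?queen-name]
    :where [[~'?c :queen/name ~'?queen-name]]
    :args  [~queen-name]})

(println qualified)
(println query-map)
```

This gets verbose quickly for large queries, which is one reason to prefer quoting the query parts separately or passing values as query inputs.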
For Datomic in particular, you can use query inputs:
(def some-name "John Lennon") ; parameter
;; query
(d/q '[:find ?release-name ; query pattern (quoted)
:in $ ?artist-name
:where [?artist :artist/name ?artist-name]
[?release :release/artists ?artist]
[?release :release/name ?release-name]]
db, some-name ; inputs (not quoted)
)
with result
#{["Power to the People"]
["Unfinished Music No. 2: Life With the Lions"]
["Live Peace in Toronto 1969"]
["Live Jam"]
...}

Does anyone use Datomic to get the structure and entities separately?

So I use queries to filter data and then use pull to get the information out from the Datomic database.
(def rules
[[[search ?txt ?id] [(fulltext $ :artist/name ?txt) [[?id]]]]
[[search ?txt ?id] [(fulltext $ :track/name ?txt) [[?id]]]]])
(d/q
'[:find [(pull ?id [* {:track/artists [:db/id :track/name] :track/_artists [:db/id :artist/name] }]) ...]
:in $ % ?query
:where [search ?query ?id]]
db rules "John Lennon")
And sometimes these queries can get recursive, so for example I can change the pull to:
(d/q
'[:find [(pull ?id [* {:track/artists [:db/id :track/name] :track/_artists [* {:track/artists [:db/id :track/name]}]}]) ...]
:in $ % ?query
:where [search ?query ?id]]
db rules "John Lennon")
Now what I'd like to do is ensure that unique entities are returned along with the :db/id structure, as I want to avoid returning duplicate data as much as possible.
For example: (results elided with ...)
{:entities [{:db/id 1 :track/name "..." ...} {:db/id 2 :track/name "..." ...} {:db/id 3 :artist/name "..." ...}]
:structure [{:db/id 1 :track/artists [{:db/id 3}]} {:db/id 2 :track/artists [{:db/id 3}]}]}
Can this be done at the query level? Or do I need to walk the structure after the query returns and modify it? I'm happy to walk the structure at present, I'm just wondering if anyone has worked out a better approach?

DataScript / datahike rules returning nothing

I am trying to search for a string in more than one field of a database in datahike, so far without success. Here is my best effort so far:
(ns my.ns
(:require [clojure.string :as st]
[datahike.api :as dh]))
(def card-db [[1 :card/name "CardA"]
[1 :card/text "Text of CardA mentioning CardB"]
[2 :card/name "CardB"]
[2 :card/text "A mystery"]])
(def rules '[[(matches-name ?ent ?fn ?str)
              [?ent :card/name ?name]
              (?fn ?name ?str)]
             [(matches-text ?ent ?fn ?str)
              [?ent :card/text ?text]
              (?fn ?text ?str)]])
(defn my-search [db search-strs]
  (dh/q '[:find ?e
          :in $ % ?fn [?str ...]
          :where
          [?e :card/name ?name]
          #_[(?fn ?name ?str)] ;; this finds CardB
          (or
            (matches-name ?e ?fn ?str)
            (matches-text ?e ?fn ?str)) ;; this finds nothing
          ]
        db rules st/includes? search-strs))
#_(count (my-search card-db ["CardB"]))
Expected result: 2
Actual result: 0
The solution needn't use rules as far as I'm concerned. It should just return a match if a string is found in at least one of multiple fields.
I'm using [io.replikativ/datahike "0.1.1"]
DataScript doesn't support or yet, and I don't think Datahike does either. But you can emulate it using rules. Try:
(def rules '[[(matches ?ent ?fn ?str)
              [?ent :card/name ?name]
              (?fn ?name ?str)]
             [(matches ?ent ?fn ?str)
              [?ent :card/text ?text]
              (?fn ?text ?str)]])
(defn my-search [db search-strs]
  (dh/q '[:find ?e
          :in $ % ?fn [?str ...]
          :where [?e :card/name ?name]
                 (matches ?e ?fn ?str)]
        db rules st/includes? search-strs))

Find entities whose ref-to-many attribute contains all elements of input

Suppose I have an entity entry with a ref-to-many attribute :entry/groups. How should I build a query to find entities whose :entry/groups attribute contains all of my input foreign ids?
The following pseudocode illustrates my question better:
[2 3] ; having this as input foreign ids
;; and having these entry entities in db
[{:entry/id "A" :entry/groups [2 3 4]}
{:entry/id "B" :entry/groups [2]}
{:entry/id "C" :entry/groups [2 3]}
{:entry/id "D" :entry/groups [1 2 3]}
{:entry/id "E" :entry/groups [2 4]}]
;; only A, C, D should be pulled
Being new to Datomic/Datalog, I have exhausted all my options, so any help is appreciated. Thanks!
TL;DR
You're tackling the general problem of 'dynamic conjunction' in Datomic's Datalog.
There are 3 strategies here:
1. Write a dynamic Datalog query which uses 2 negations and 1 disjunction, or a recursive rule (see below).
2. Generate the query code (equivalent to Alan Thompson's answer): the drawbacks are the usual drawbacks of generating Datalog clauses dynamically, i.e. you don't benefit from query plan caching.
3. Use the indexes directly (EAVT or AVET).
Dynamic Datalog query
Datalog has no direct way of expressing dynamic conjunction (logical AND / 'for all ...' / set intersection). However, you can achieve it in pure Datalog by combining one disjunction (logical OR / 'exists ...' / set union) and two negations, i.e. (For all ?g in ?Gs, p(?e, ?g)) <=> NOT(Exists ?g in ?Gs such that NOT(p(?e, ?g)))
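The equivalence is easier to see outside Datalog. Below is a minimal plain-Clojure sketch of the same double negation, using a hypothetical in-memory map that mirrors the entries from the question; the names entries, has-all? and entries-with-all are made up for this illustration and have nothing to do with the Datomic API.

```clojure
;; Hypothetical in-memory stand-in for the entries in the question:
;; entry id -> set of group ids.
(def entries
  {"A" #{2 3 4}
   "B" #{2}
   "C" #{2 3}
   "D" #{1 2 3}
   "E" #{2 4}})

(defn has-all?
  "True when there is NO wanted value that is NOT in groups,
  i.e. NOT(Exists g in wanted such that NOT(groups contains g))."
  [groups wanted]
  (not-any? (fn [g] (not (contains? groups g))) wanted))

(defn entries-with-all
  "Ids of the entries whose groups contain every value in wanted."
  [wanted]
  (set (for [[id groups] entries
             :when (has-all? groups wanted)]
         id)))

(println (entries-with-all [2 3])) ;; prints #{"A" "C" "D"}
```

Note that with an empty wanted collection every entry qualifies, which is the usual vacuous-truth behaviour of 'for all'.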
In your case, this could be expressed as:
[:find [?entry ...] :in $ ?groups :where
 ;; these 2 clauses are for restricting the set of considered datoms, which is more efficient
 ;; (and necessary in Datomic's Datalog, which will refuse to scan the whole db)
 ;; NOTE: this imposes ?groups cannot be empty!
 [(first ?groups) ?group0]
 [?entry :entry/groups ?group0]
 ;; here comes the double negation
 (not-join [?entry ?groups]
   [(identity ?groups) [?group ...]]
   (not-join [?entry ?group]
     [?entry :entry/groups ?group]))]
Good news: this can be expressed as a very general Datalog rule (which I may end up adding to Datofu):
[(matches-all ?e ?a ?vs)
[(first ?vs) ?v0]
[?e ?a ?v0]
(not-join [?e ?a ?vs]
[(seq ?vs) [?v ...]]
(not-join [?e ?a ?v]
[?e ?a ?v]))]
... which means your query can now be expressed as:
[:find [?entry ...] :in % $ ?groups :where
(matches-all ?entry :entry/groups ?groups)]
NOTE: there's an alternate implementation using a recursive rule:
[[(matches-all ?e ?a ?vs)
[(seq ?vs)]
[(first ?vs) ?v]
[?e ?a ?v]
[(rest ?vs) ?vs2]
(matches-all ?e ?a ?vs2)]
[(matches-all ?e ?a ?vs)
[(empty? ?vs)]]]
This one has the advantage of accepting an empty ?vs collection (so long as ?e and ?a have been bound in some other way in the query).
Generating the query code
The advantage of generating the query code is that it's relatively simple in this case, and it can probably make the query execution more efficient than the more dynamic alternative. The drawback of generating Datalog queries in Datomic is that you may lose the benefits of query plan caching; therefore, even if you're going to generate queries, you still want to make them as generic as possible (i.e. depending only on the number of v values).
(defn q-find-having-all-vs
  [n-vs]
  (let [v-syms (for [i (range n-vs)]
                 (symbol (str "?v" i)))]
    {:find '[[?e ...]]
     :in (into '[$ ?a] v-syms)
     :where
     (for [?v v-syms]
       ['?e '?a ?v])}))
;; examples
(q-find-having-all-vs 1)
=> {:find [[?e ...]],
:in [$ ?a ?v0],
:where
([?e ?a ?v0])}
(q-find-having-all-vs 2)
=> {:find [[?e ...]],
:in [$ ?a ?v0 ?v1],
:where
([?e ?a ?v0]
[?e ?a ?v1])}
(q-find-having-all-vs 3)
=> {:find [[?e ...]],
:in [$ ?a ?v0 ?v1 ?v2],
:where
([?e ?a ?v0]
[?e ?a ?v1]
[?e ?a ?v2])}
;; executing the query: note that we're passing the attribute and values!
(apply d/q (q-find-having-all-vs (count groups))
       db :entry/groups groups)
Use the indexes directly
I'm not sure at all how efficient the above approaches are in the current implementation of Datomic Datalog. If your benchmarking shows this is slow, you can always fall back to direct index access.
Here's an example in Clojure using the AVET index:
(require 'clojure.set)

(defn find-having-all-vs
  "Given a database value `db`, an attribute identifier `a` and a non-empty seq of entity identifiers `vs`,
  returns a set of entity identifiers for entities which have all the values in `vs` via `a`"
  [db a vs]
  ;; DISCLAIMER: a LOT can be done to improve the efficiency of this code!
  ;; One set of entity ids per value, then intersect them all.
  (apply clojure.set/intersection
         (for [v vs]
           (into #{}
                 (map :e)
                 (d/datoms db :avet a v)))))
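To make the intersection step concrete without a running database, here is a small plain-Clojure sketch. The map entries-by-group is a hypothetical stand-in for what an AVET lookup would return per value (group id to set of entry ids, using the data from the question), and having-all is an illustrative name, not part of any library. Unlike the function above, it also returns an empty set for empty input instead of failing.

```clojure
(require '[clojure.set :as set])

;; Hypothetical stand-in for an AVET lookup: group id -> set of entry ids.
(def entries-by-group
  {1 #{"D"}
   2 #{"A" "B" "C" "D" "E"}
   3 #{"A" "C" "D"}
   4 #{"A" "E"}})

(defn having-all
  "Entry ids present under every value in vs; empty set when vs is empty."
  [index vs]
  (if (empty? vs)
    #{}
    ;; Unknown values map to the empty set, so the intersection is empty too.
    (apply set/intersection (map #(get index % #{}) vs))))

(println (having-all entries-by-group [2 3])) ;; prints #{"A" "C" "D"}
```

Returning an empty set for empty input is a design choice made for this sketch; depending on the use case, "all entities" may be the more natural answer.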
You can see an example of such a query in the James Bond example from the Tupelo-Datomic library. You just specify 2 clauses, one for each desired value in the set:
; Search for people that match both {:weapon/type :weapon/guile} and {:weapon/type :weapon/gun}
(let [tuple-set (td/find :let [$ (live-db)]
:find [?name]
:where {:person/name ?name :weapon/type :weapon/guile }
{:person/name ?name :weapon/type :weapon/gun } ) ]
(is (= #{["Dr No"] ["M"]} tuple-set )))
In pure Datomic it will look similar, but using something like the Entity ID:
[?eid :entry/groups 2]
[?eid :entry/groups 3]
and Datomic will perform an implicit AND operation (i.e. both clauses must match; any surplus entries are ignored). This is logically a "join" operation, even though it is the same entity being queried for both values. You can find more info in the Datomic docs.

How does the not clause work in Datomic?

I am trying to find latitudes which fall between two inputs. My query:
(defn- latlngs-within-new-bounds
  [db a w]
  (d/q '[:find ?lat
         :in $ ?a ?w
         :where
         [?e :location/lat ?lat]
         [(>= ?lat ?a)]
         (not
           [(>= ?lat ?w)])]
       db a w))
My error:
3 Unhandled com.google.common.util.concurrent.UncheckedExecutionException
java.lang.RuntimeException: Unable to resolve symbol: ?lat in this
context
2 Caused by clojure.lang.Compiler$CompilerException
1 Caused by java.lang.RuntimeException
Unable to resolve symbol: ?lat in this context
Util.java: 221 clojure.lang.Util/runtimeException
Any help with understanding what's wrong with my query would be appreciated. Bonus points if you can also use Datomic rules to factor out the in-bounds part of each half.
Your code seems to work for me with datomic-free 0.9.5173:
(defn- latlngs-within-new-bounds
  [db a w]
  (d/q '[:find ?lat
         :in $ ?a ?w
         :where
         [?e :location/lat ?lat]
         [(>= ?lat ?a)]
         (not
           [(>= ?lat ?w)])]
       db a w))
(latlngs-within-new-bounds
  [[1 :location/lat 1]
   [2 :location/lat 2]
   [3 :location/lat 3]
   [4 :location/lat 4]
   [4 :location/lat 5]]
  2 4)
=> #{[2] [3]}
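For reference, the two predicate clauses together describe a half-open range: ?lat >= ?a and not ?lat >= ?w, i.e. a <= lat < w. A minimal plain-Clojure sketch of the same filter, using the latitude values from the sample data above (the function name within-bounds is made up for this example):

```clojure
;; Latitude values taken from the sample datoms above.
(def lats [1 2 3 4 5])

(defn within-bounds
  "Values that are >= a and NOT >= w, i.e. in the half-open range [a, w)."
  [a w xs]
  (set (filter (fn [x] (and (>= x a) (not (>= x w)))) xs)))

(println (within-bounds 2 4 lats)) ;; prints #{2 3}
```

Since the negated comparison is just lat < w, the not clause could presumably be replaced by a single less-than predicate in the query as well.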