Possible to get enum value via Datomic pull syntax? - clojure

In the mbrainz sample data, the :artist/type is an enum. Is it possible to pull the value of the enum out of :db/ident and associate it as the value of the :artist/type key using pull syntax?
This is as close as I could get:
[:find (pull ?e [:artist/name {:artist/type [:db/ident]}])
:where
[?e :artist/name "Ray Charles"]
]
;;=> [[{:artist/name "Ray Charles", :artist/type {:db/ident :artist.type/person}}]]
Is it possible to use pull syntax to reshape the result into something like this?
;;=> [[{:artist/name "Ray Charles", :artist/type :artist.type/person}]]

I don't think you can do it using the Pull API the way you are seeking. You may find that it is easier to use the Tupelo Datomic library:
(require '[tupelo.datomic :as td]
'[tupelo.core :refer [spyx]] )
(let [x1 (td/query-scalar :let [$ db-val]
:find [ ?e ]
:where [ [?e :artist/name "Ray Charles"] ] )
x2 (td/entity-map db-val x1)
]
(spyx x1)
(spyx x2)
)
which gives the result:
x1 => 17592186049074
x2 => {:artist/sortName "Charles, Ray", :artist/name "Ray Charles",
:artist/type :artist.type/person, :artist/country :country/US,
:artist/gid #uuid "2ce02909-598b-44ef-a456-151ba0a3bd70",
:artist/startDay 23, :artist/endDay 10, :artist/startYear 1930,
:artist/endMonth 6, :artist/endYear 2004, :artist/startMonth 9,
:artist/gender :artist.gender/male}
So :artist/type is already converted into the :db/ident value and you can just pull it out of the map.

You can use specter on the result that the pull expression returns:
(->> pull-result
(sp/transform (sp/walker :db/ident) :db/ident))
The value of key :db/ident is extracted for every map that has that key.

Was quite easy to do with postwalk
for any pulled :db/ident you can transform with this function
(defn flatten-ident [coll]
(clojure.walk/postwalk
(fn [item] (get item :db/ident item)) coll))

Related

How to query against attributes of multiple values?

Tested on datascript 1.3.0
datoms:
[{:db/id -1 :name "Oliver Smith" :hobbies ["reading" "sports" "music"]}]
tried to run the query below to find who like sports, but the empty set returned.
'[:find ?name
:where
[?p :name ?name]
[?p :hobbies ?hobbies]
[(some #{"sports"} ?hobbies)]]
How to formulate the query correctly to get the expected result below?
#{[Oliver Smith]}
We have to explicitly define the schema with cardinality/many against the attribute of multiple values to solve the problem since schemaless doesn't work here.
(require '[datascript.core :as d])
(def schema {:hobbies {:db/cardinality db.cardinality/many}})
(def conn (d/create-conn schema))
(def datoms [{:db/id -1 :name "Oliver Smith" :hobbies ["reading" "sports" "music"]}])
(d/transact! conn datoms)
(def query '[:find ?name :where [?p :name ?name] [?p :hobbies "sports"]])
(-> (d/q query #conn) println)

How to create PersistentArrayMap preventing evaluation?

My question is quite simple but weird at the same time, I would like to create a PersistentArrayMap avoiding evaluation at the same time that I would like to get the value inside this. What is the best solution? I mean
Imagine that I have this def:
(def queen "catherine the great")
And would like to do something like this (single quote):
'{:queen queen}
for sure the output is
=> {:queen queen}
But i expected do something like this
=> {:queen "catherine the great"}
I know that I can just do it
(array-map :queen queen)
But in my case I would like only evaluate some information cause my map is more complicated, like a datomic query:
'{:find [(pull $ ?c [*])]
:with []
:in [$ ?queen-name]
:where [
[$ ?c :queen/name ?queen-name]
]
:args [queen-name]}
for this case I would like only to evaluate queen-name.
My question is, there's a simple way to do it? Maybe using update?
something like this?
(assoc-in '{:find [(pull $ ?c [*])]
:with []
:in [$ ?queen-name]
:where [
[$ ?c :queen/name ?queen-name]
]
:args []} [:args] ["catherine the great"])
For both examples, you can use syntax-quote and unquote:
user=> (def queen-name "catherine the great")
#'user/queen-name
user=> `{:queen ~queen-name}
{:queen "catherine the great"}
user> {:find '[(pull $ ?c [*])]
:with '[]
:in '[$ ?queen-name]
:where '[
[$ ?c :queen/name ?queen-name]
]
:args `[~queen-name]}
{:find [(pull $ ?c [*])],
:with [],
:in [$ ?queen-name],
:where [[$ ?c :queen/name ?queen-name]],
:args ["catherine the great"]}
If you want to do this in general, you can use tupelo.quote.
(ns demo.core
(:require [tupelo.quote :as q]))
; problem: free symbols a and b are fully-qualified using current ns
`[a b ~(+ 2 3)] => [demo.core/a
demo.core/b
5]
(q/tmpl-fn '[a b (insert (+ 2 3))]) => [a b 5]
(let [a 1 b 2]
(q/tmpl [a b (insert (+ 2 3))])) => [1 2 5]
(is= [1 [2 3 4] 5] (q/tmpl [1 (insert (t/thru 2 4)) 5]))
(is= [1 2 3 4 5] (q/tmpl [1 (splice (t/thru 2 4)) 5]))
For Datomic in particular, you can use query inputs:
(def some-name "John Lennon") ; parameter
;; query
(d/q '[:find ?release-name ; query pattern (quoted)
:in $ ?artist-name
:where [?artist :artist/name ?artist-name]
[?release :release/artists ?artist]
[?release :release/name ?release-name]]
db, some-name ; inputs (not quoted)
)
with result
#{["Power to the People"]
["Unfinished Music No. 2: Life With the Lions"]
["Live Peace in Toronto 1969"]
["Live Jam"]
...}

How can I filter JSON data by a given date in Clojure?

I have many JSON objects, and I am trying to filter those objects by the date. These objects are being parsed from several JSON files using Cheshire.core, meaning that the JSON objects are in a collection. The date is being passed in in the following format "YYYY-MM-DD" (eg. 2015-01-10). I have tried using the filter and contains? functions to do this, but I am having no luck so far. How can I filter these JSON objects by my chosen date?
Current Clojure code:
(def filter-by-date?
(fn [orders-data date-chosen]
(contains? (get (get orders-data :date) :date) date-chosen)))
(prn (filter (filter-by-date? orders-data "2017-12-25")))
Example JSON object:
{
"id":"05d8d404-b3f6-46d1-a0f9-dbdab7e0261f",
"date":{
"date":"2015-01-10T19:11:41.000Z"
},
"total":{
"GBP":57.45
}
}
JSON after parsing with Cheshire:
[({:id "05d8d404-b3f6-46d1-a0f9-dbdab7e0261f",
:date {:date "2015-01-10T19:11:41.000Z"},
:total {:GBP 57.45}}) ({:id "325bd04-b3f6-46d1-a0f9-dbdab7e0261f",
:date {:date "2015-02-23T10:15:14.000Z"},
:total {:GBP 32.90}})]
First, I'm going to assume you've parsed the JSON first into something like this:
(def parsed-JSON {:id "05d8d404-b3f6-46d1-a0f9-dbdab7e0261f",
:date {:date "2015-01-10T19:11:41.000Z"},
:total {:GBP 57.45}})
The main problem is the fact that the date as stored in the JSON contains time information, so you aren't going to be able to check it directly using equality.
You can get around this by using clojure.string/starts-with? to check for prefixes. I'm using s/ here as an alias for clojure.string:
(defn filter-by-date [date jsons]
(filter #(s/starts-with? (get-in % [:date :date]) date)
jsons))
You were close, but I made a few changes:
You can't use contains? like that. From the docs of contains?: Returns true if key is present in the given collection, otherwise returns false. It can't be used to check for substrings; it's used to test for the presence of a key in a collection.
Use -in postfix versions to access nested structures instead of using multiple calls. I'm using (get-in ...) here instead of (get (get ...)).
You're using (def ... (fn [])) which makes things more complicated than they need to be. This is essentially what defn does, although defn also adds some more stuff as well.
To address the new information, you can just flatten the nested sequences containing the JSONs first:
(->> nested-json-colls ; The data at the bottom of the question
(flatten)
(filter-by-date "2015-01-10"))
#!/usr/bin/env boot
(defn deps [new-deps]
(merge-env! :dependencies new-deps))
(deps '[[org.clojure/clojure "1.9.0"]
[cheshire "5.8.0"]])
(require '[cheshire.core :as json]
'[clojure.string :as str])
(def orders-data-str
"[{
\"id\":\"987654\",
\"date\":{
\"date\":\"2015-01-10T19:11:41.000Z\"
},
\"total\":{
\"GBP\":57.45
}
},
{
\"id\":\"123456\",
\"date\":{
\"date\":\"2016-01-10T19:11:41.000Z\"
},
\"total\":{
\"GBP\":23.15
}
}]")
(def orders (json/parse-string orders-data-str true))
(def ret (filter #(clojure.string/includes? (get-in % [:date :date]) "2015-01-") orders))
(println ret) ; ({:id 987654, :date {:date 2015-01-10T19:11:41.000Z}, :total {:GBP 57.45}})
You can convert the date string to Date object using any DateTime library like joda-time and then do a proper filter if required.
clj-time has functions for parsing strings and comparing date-time objects. So you could do something like:
(ns filter-by-time-example
(:require [clj-time.coerce :as tc]
[clj-time.core :as t]))
(def objs [{"id" nil
"date" {"date" "2015-01-12T19:11:41.000Z"}
"total" nil}
{"id" "05d8d404-b3f6-46d1-a0f9-dbdab7e0261f"
"date" {"date" "2015-01-10T19:11:41.000Z"}
"total" {"GBP" :57.45}}
{"id" nil
"date" {"date" "2015-01-11T19:11:41.000Z"}
"total" nil}])
(defn filter-by-day
[objs y m d]
(let [start (t/date-time y m d)
end (t/plus start (t/days 1))]
(filter #(->> (get-in % ["date" "date"])
tc/from-string
(t/within? start end)) objs)))
(clojure.pprint/pprint (filter-by-day objs 2015 1 10)) ;; Returns second obj
If you're going to repeatedly do this (e.g. for multiple days) you could parse all dates in your collection into date-time objects with
(map #(update-in % ["date" "date"] tc/from-string) objs)
and then just work with that collection to avoid repeating the parsing step.
(ns filter-by-time-example
(:require [clj-time.format :as f]
[clj-time.core :as t]
[cheshire.core :as cheshire]))
(->> json-coll
(map (fn [json] (cheshire/parse-string json true)))
(map (fn [record] (assoc record :dt-date (f/format (get-in record [:date :date])))))
(filter (fn [record] (t/after? (tf/format "2017-12-25") (:dt-date record))))
(map (fn [record] (dissoc record :dt-date))))
Maybe something like this? You might need to change the filter for your usecase but as :dt-time is now a jodo.DateTime you can leverage all the clj-time predicates.

How to construct a query that matches exactly a vector of refs in DataScript?

Setup Consider the following DataScript database of films and cast, with data stolen from learndatalogtoday.org: the following code can be executed in a JVM/Clojure REPL or a ClojureScript REPL, as long as project.clj contains [datascript "0.15.0"] as a dependency.
(ns user
(:require [datascript.core :as d]))
(def data
[["First Blood" ["Sylvester Stallone" "Brian Dennehy" "Richard Crenna"]]
["Terminator 2: Judgment Day" ["Linda Hamilton" "Arnold Schwarzenegger" "Edward Furlong" "Robert Patrick"]]
["The Terminator" ["Arnold Schwarzenegger" "Linda Hamilton" "Michael Biehn"]]
["Rambo III" ["Richard Crenna" "Sylvester Stallone" "Marc de Jonge"]]
["Predator 2" ["Gary Busey" "Danny Glover" "Ruben Blades"]]
["Lethal Weapon" ["Gary Busey" "Mel Gibson" "Danny Glover"]]
["Lethal Weapon 2" ["Mel Gibson" "Joe Pesci" "Danny Glover"]]
["Lethal Weapon 3" ["Joe Pesci" "Danny Glover" "Mel Gibson"]]
["Alien" ["Tom Skerritt" "Veronica Cartwright" "Sigourney Weaver"]]
["Aliens" ["Carrie Henn" "Sigourney Weaver" "Michael Biehn"]]
["Die Hard" ["Alan Rickman" "Bruce Willis" "Alexander Godunov"]]
["Rambo: First Blood Part II" ["Richard Crenna" "Sylvester Stallone" "Charles Napier"]]
["Commando" ["Arnold Schwarzenegger" "Alyssa Milano" "Rae Dawn Chong"]]
["Mad Max 2" ["Bruce Spence" "Mel Gibson" "Michael Preston"]]
["Mad Max" ["Joanne Samuel" "Steve Bisley" "Mel Gibson"]]
["RoboCop" ["Nancy Allen" "Peter Weller" "Ronny Cox"]]
["Braveheart" ["Sophie Marceau" "Mel Gibson"]]
["Mad Max Beyond Thunderdome" ["Mel Gibson" "Tina Turner"]]
["Predator" ["Carl Weathers" "Elpidia Carrillo" "Arnold Schwarzenegger"]]
["Terminator 3: Rise of the Machines" ["Nick Stahl" "Arnold Schwarzenegger" "Claire Danes"]]])
(def conn (d/create-conn {:film/cast {:db/valueType :db.type/ref
:db/cardinality :db.cardinality/many}
:film/name {:db/unique :db.unique/identity
:db/cardinality :db.cardinality/one}
:actor/name {:db/unique :db.unique/identity
:db/cardinality :db.cardinality/one}}))
(def all-datoms (mapcat (fn [[film actors]]
(into [{:film/name film}]
(map #(hash-map :actor/name %) actors)))
data))
(def all-relations (mapv (fn [[film actors]]
{:db/id [:film/name film]
:film/cast (mapv #(vector :actor/name %) actors)}) data))
(d/transact! conn all-datoms)
(d/transact! conn all-relations)
Description In a nutshell, there are two kinds of entities in this database—films and actors (word intended to be ungendered)—and three kinds of datoms:
film entity: :film/name (a unique string)
film entity: :film/cast (multiple refs)
actor entity: :actor/name (unique string)
Question I would like to construct a query which asks: which films have these N actors, and these N actors alone, appeared as the sole stars, for N>=2?
E.g., RoboCop starred Nancy Allen, Peter Weller, Ronny Cox, but no film starred solely the first two of these, Allen and Weller. Therefore, I would expect the following query to produce the empty set:
(d/q '[:find ?film-name
:where
[?film :film/name ?film-name]
[?film :film/cast ?actor-1]
[?film :film/cast ?actor-2]
[?actor-1 :actor/name "Nancy Allen"]
[?actor-2 :actor/name "Peter Weller"]]
#conn)
; => #{["RoboCop"]}
However, the query is flawed because I don't know how to express that any matches should exclude any actors who are not Allen or Weller—again, I want to find the movies where only Allen and Weller have collaborated without any other actors, so I want to adapt the above query to produce the empty set. How can I adjust this query to enforce this requirement?
Because DataScript doesn't have negation (as of May 2016), I don't believe that's possible with one static query in 'pure' Datalog.
My way to go would be:
build the query programmatically to add the N clauses that state that the cast must contain the N actors
Add a predicate function which, given a movie, the database, and the set of actors ids, uses the EAVT index to find if each movie has an actor that is not in the set.
Here's a basic implementation
(defn only-those-actors? [db movie actors]
(->> (datoms db :eavt movie :film/cast) seq
(every? (fn [[_ _ actor]]
(contains? actors actor)))
))
(defn find-movies-with-exact-cast [db actors-names]
(let [actors (set (d/q '[:find [?actor ...] :in $ [?name ...] ?only-those-actors :where
[?actor :actor/name ?name]]
db actors-names))
query {:find '[[?movie ...]]
:in '[$ ?actors ?db]
:where
(concat
(for [actor actors]
['?movie :film/cast actor])
[['(only-those-actors? ?db ?movie ?actors)]])}]
(d/q query db actors db only-those-actors?)))
You can use predicate fun and d/entity together for filtering datoms by :film/cast field of an entity. This approach looks much more straightforward until Datascript doesn't support negation (not operator and so on).
Look at the row (= a (:age (d/entity db e)) in the test case of the Datascript here
[{:db/id 1 :name "Ivan" :age 10}
{:db/id 2 :name "Ivan" :age 20}
{:db/id 3 :name "Oleg" :age 10}
{:db/id 4 :name "Oleg" :age 20}]
...
(let [pred (fn [db e a]
(= a (:age (d/entity db e))))]
(is (= (q/q '[:find ?e
:in $ ?pred
:where [?e :age ?a]
[(?pred $ ?e 10)]]
db pred)
#{[1] [3]})))))
In your case, the predicate body could look something like this
(clojure.set/subset? actors (:film/cast (d/entity db e))
In regards to performance, the d/entity call is fast because it is a lookup by index.

Mapping a list of datomic ids to entity maps

When I query for a list of Datomic entities, e.g like in the example below:
'[:find ?e
:where
[?e :category/name]]
Usually, I'd like to create a list of maps that represent the full entities, i.e
#{[1234] [2223]} => [{:category/name "x" :db/id 1234}, {:category/name "y" :db/id 2223}]
Here is my approach at the moment, in the form of a helper function.
(defn- db-ids->entity-maps
"Takes a list of datomic entity ids retrieves and returns
a list of hydrated entities in the form of a list of maps."
[db-conn db-ids]
(->>
db-ids
seq
flatten
(map #(->>
%
;; id -> lazy entity map
(d/entity (d/db db-conn))
;; realize all values, except for db/id
d/touch
(into {:db/id %})))))
Is there a better way?
With the pull api, this is pretty easy now.
'[:find [(pull ?e [*]) ...]
:in $ [[?e] ...]
:where [?e]]
I used to take this approach to save queries to the DB, the code is probably less reusable but it depends on what is more critical in your current scenario. I haven't a Datomic instance configured as I am not working with it right now so it may contain syntax error but I hope you get the idea.
(def query-result '[:find ?cat-name ?id
:where
[?cat-name :category/name
[?id :db/id]])
=>
#{["x" 1234] ["x" 2223]}
(defn- describe-values
"Adds proper keys to the given values."
[keys-vec query-result]
(vec (map #(zipmap keys-vec %) query-result))
(describe-values [:category/name :db/id] query-result)
=>
[{:db/id 2223, :category/name "x"} {:db/id 1234, :category/name "x"}]