How to query against attributes of multiple values?

Tested on datascript 1.3.0
datoms:
[{:db/id -1 :name "Oliver Smith" :hobbies ["reading" "sports" "music"]}]
I tried to run the query below to find who likes sports, but it returned the empty set.
'[:find ?name
:where
[?p :name ?name]
[?p :hobbies ?hobbies]
[(some #{"sports"} ?hobbies)]]
How to formulate the query correctly to get the expected result below?
#{[Oliver Smith]}

You have to explicitly define a schema that marks the multi-valued attribute as :db.cardinality/many. Without a schema, the vector appears to be stored as one opaque value, so a pattern cannot match its individual elements.
(require '[datascript.core :as d])
(def schema {:hobbies {:db/cardinality :db.cardinality/many}})
(def conn (d/create-conn schema))
(def datoms [{:db/id -1 :name "Oliver Smith" :hobbies ["reading" "sports" "music"]}])
(d/transact! conn datoms)
(def query '[:find ?name :where [?p :name ?name] [?p :hobbies "sports"]])
(-> (d/q query @conn) println)
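To see why the schema matters, the sketch below compares what ends up stored under :hobbies with and without the cardinality declaration. It assumes DataScript 1.x on the classpath; `hobby-values` is a hypothetical helper introduced only for this illustration.

```clojure
(require '[datascript.core :as d])

(def person [{:db/id -1 :name "Oliver Smith"
              :hobbies ["reading" "sports" "music"]}])

;; Hypothetical helper: the set of values stored under :hobbies
;; after transacting `person` into an empty db with the given schema.
(defn hobby-values [schema]
  (let [db (d/db-with (d/empty-db schema) person)]
    (set (map :v (d/datoms db :aevt :hobbies)))))

;; No schema: the whole vector is expected to be one single value.
(hobby-values {})

;; Cardinality many: one datom, and so one value, per hobby.
(hobby-values {:hobbies {:db/cardinality :db.cardinality/many}})
```

With one datom per hobby, the plain pattern [?p :hobbies "sports"] can match directly and no predicate over a collection is needed.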

Related

Why logical "and" predicates do not work?

Tested on datascript 1.3.0
datoms:
[{:db/id -1 :name "Smith" :firstname "Oliver" :age 20}
{:db/id -2 :name "Jones" :firstname "Oliver" :age 20}
{:db/id -3 :name "Smith" :firstname "Amelia" :age 16}
{:db/id -4 :name "Jones" :firstname "Amelia" :age 16}]
I tried the query below, which uses a logical and predicate, to find people who are named Smith and older than 18. Why did it return the whole, unfiltered set?
'[:find ?firstname ?name
:where
[?p :name ?name]
[?p :firstname ?firstname]
[?p :age ?age]
[(and (= ?name "Smith") (> ?age 18))]]
;;; wrong result: #{[Oliver Smith] [Oliver Jones] [Amelia Smith] [Amelia Jones]}
I then changed the query to use separate predicates and got the expected result.
'[:find ?firstname ?name
:where
[?p :name ?name]
[?p :firstname ?firstname]
[?p :age ?age]
[(= ?name "Smith")]
[(> ?age 18)]]
;;; correct result: #{[Oliver Smith]}
Do Datomic, DataScript, or Datalog in general only support conditions spread across separate clauses? Are conventional logical operators such as and incompatible here?
This is because AND is implicit: all clauses are implicitly joined by AND, meaning they must all be true at the same time for the query to match.
According to the manual you cannot use an and-clause just like that. The only way you can use an and-clause is when it is inside an or-clause:
Inside the or clause, you may use an and clause to specify
conjunction. This clause is not available outside of an or clause,
since conjunction is the default in other clauses.
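As a rough illustration of the quoted rule, here is a sketch of an and clause nested inside an or clause, run against the four people from the question. It assumes DataScript 1.x, which as far as I know supports or with nested and; the two-branch condition itself is invented for the example.

```clojure
(require '[datascript.core :as d])

(def db
  (d/db-with (d/empty-db)
             [{:db/id -1 :name "Smith" :firstname "Oliver" :age 20}
              {:db/id -2 :name "Jones" :firstname "Oliver" :age 20}
              {:db/id -3 :name "Smith" :firstname "Amelia" :age 16}
              {:db/id -4 :name "Jones" :firstname "Amelia" :age 16}]))

;; Each `and` groups the patterns of one alternative;
;; the surrounding `or` accepts a person matching either alternative.
(def smith-20-or-jones-16
  (d/q '[:find ?firstname ?name
         :where
         [?p :name ?name]
         [?p :firstname ?firstname]
         (or (and [?p :name "Smith"] [?p :age 20])
             (and [?p :name "Jones"] [?p :age 16]))]
       db))
```

Outside of or, the same grouping is written simply as consecutive clauses, which is what the working query in the question does.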

Datomic not returning the correct "min" result when retrieving entity ID in result tuple

I've got this simple schema and data:
(def product-offer-schema
[{:db/ident :product-offer/product
:db/valueType :db.type/ref
:db/cardinality :db.cardinality/one}
{:db/ident :product-offer/vendor
:db/valueType :db.type/ref
:db/cardinality :db.cardinality/one}
{:db/ident :product-offer/price
:db/valueType :db.type/long
:db/cardinality :db.cardinality/one}
{:db/ident :product-offer/stock-quantity
:db/valueType :db.type/long
:db/cardinality :db.cardinality/one}
])
(d/transact conn product-offer-schema)
(d/transact conn
[{:db/ident :vendor/Alice}
{:db/ident :vendor/Bob}
{:db/ident :product/BunnyBoots}
{:db/ident :product/Gum}
])
(d/transact conn
[{:product-offer/vendor :vendor/Alice
:product-offer/product :product/BunnyBoots
:product-offer/price 9981 ;; $99.81
:product-offer/stock-quantity 78
}
{:product-offer/vendor :vendor/Alice
:product-offer/product :product/Gum
:product-offer/price 200 ;; $2.00
:product-offer/stock-quantity 500
}
{:product-offer/vendor :vendor/Bob
:product-offer/product :product/BunnyBoots
:product-offer/price 9000 ;; $90.00
:product-offer/stock-quantity 15
}
])
When I retrieve the cheapest bunny boots, only retrieving the price, I get the expected result (9000):
(def cheapest-boots-q '[:find (min ?p) .
:where
[?e :product-offer/product :product/BunnyBoots]
[?e :product-offer/price ?p]
])
(d/q cheapest-boots-q db)
;; => 9000
However, when I want to get the entity ID along with the price, it gives me the higher-priced boots:
(def db (d/db conn))
(def cheapest-boots-q '[:find [?e (min ?p)]
:where
[?e :product-offer/product :product/BunnyBoots]
[?e :product-offer/price ?p]
])
(d/q cheapest-boots-q db)
;; => [17592186045423 9981]
I tried adding :with but that gives me an error:
(def cheapest-boots-q '[:find [?e (min ?p)]
:with ?e
:where
[?e :product-offer/product :product/BunnyBoots]
[?e :product-offer/price ?p]
])
(d/q cheapest-boots-q db)
;; => Execution error (ArrayIndexOutOfBoundsException) at datomic.datalog/fn$project (datalog.clj:503).
What am I doing wrong?
As a commenter kind of pointed out, ?e isn't bound in any way to the (min ?p) expression, so it's not defined what you'll get there, beyond a product entity id of some sort.
What you actually want to do is unify those values somehow as part of the query, and not perform aggregation on the results, for example:
(d/q '[:find [?e ?p]
:where
[?e :product-offer/product :product/BunnyBoots]
[?e :product-offer/price ?p]
[(min ?p)]]
db)
You can see that the min clause is part of the query, and as such will take part in the unification on the result, giving you what you want.
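If the min clause does not narrow the result in your setup (expression clauses are normally evaluated once per row), a conservative fallback is to fetch every [?e ?p] tuple with a plain :find ?e ?p and pick the cheapest one in Clojure. The sketch below is plain Clojure with no Datomic calls; the `offers` set stands in for a query result, and its second entity id is invented for illustration.

```clojure
;; Stand-in for the result of a [:find ?e ?p ...] query over the
;; bunny-boot offers. The second entity id is hypothetical.
(def offers #{[17592186045423 9981]
              [17592186045425 9000]})

;; Returns the [entity-id price] tuple with the lowest price,
;; or nil when there are no rows.
(defn cheapest-offer [rows]
  (when (seq rows)
    (apply min-key second rows)))

(cheapest-offer offers)
```

This keeps the entity id and the price together as one row, so the id returned always belongs to the minimum price.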

Filter by cardinality-many field that contains ALL the values

I’ve got an account model that has interests, which is an array of strings (db.type=string and cardinality=many). For example:
{:id 42
:name "Test"
:interests ["cars" "games" "etc"]
:age 50}
Then another list of interests comes from the request. How can I query the accounts that have ALL the specified interests?
For example, for ["cars" "games"] I'll get the account #42 since it includes the passed list. But for ["cars" "games" "books"] I won't because "books" is an extra one.
UPD
What I have is:
;; model
{:db/ident :account/interests
:db/valueType :db.type/string
:db/cardinality :db.cardinality/many
:db/index true}
;; a query
[:find ?id
:in $ [?interest ...]
:where
[?a :account/interests ?interest]
[?a :account/id ?id]]
So how should I build a Datomic query?
Try using entities in a query
(d/q '[:find ?e
:in $ ?interests
:where
[?e :interests _]
[(datomic.api/entity $ ?e) ?ent]
[(:interests ?ent) ?ent_interests]
[(subset? ?interests ?ent_interests)]]
(d/db conn)
#{"cars" "games"})
(time (d/q
'[:find ?id
:in $
:where
[?a :account/interests "foo"]
[?a :account/interests "bar"]
[?a :account/id ?id]]
(d/db conn)))
"Elapsed time: 3.771973 msecs"
(time (d/q
'[:find ?id
:in $ ?iii
:where
[(datomic.api/entity $ ?a) ?e]
[(:account/interests ?e) ?interests]
[(clojure.set/superset? ?interests ?iii)]
[?a :account/id ?id]]
(d/db conn)
#{"50 Cent" "Целоваться"}))
"Elapsed time: 169.767354 msecs"
The first one, with one data pattern per interest, is better: it is far faster than the entity-based version.
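Since the list of interests comes from the request, the faster form has to be built at run time. A minimal sketch of that, in plain Clojure: `all-interests-query` is a hypothetical helper that emits one data pattern per requested interest, using the attribute names from the question.

```clojure
;; Hypothetical helper: builds a vector-form query with one
;; [?a :account/interests "..."] pattern per requested interest.
(defn all-interests-query [interests]
  (-> '[:find ?id :where]
      (into (map (fn [interest] ['?a :account/interests interest]))
            interests)
      (conj '[?a :account/id ?id])))

(all-interests-query ["cars" "games"])
```

The resulting vector can be passed to d/q as the query. Note that an empty interest list produces a query that matches every account with an :account/id, so the caller may want to handle that case separately.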

How to construct a query that matches exactly a vector of refs in DataScript?

Setup: consider the following DataScript database of films and cast, with data stolen from learndatalogtoday.org. The following code can be executed in a JVM/Clojure REPL or a ClojureScript REPL, as long as project.clj contains [datascript "0.15.0"] as a dependency.
(ns user
(:require [datascript.core :as d]))
(def data
[["First Blood" ["Sylvester Stallone" "Brian Dennehy" "Richard Crenna"]]
["Terminator 2: Judgment Day" ["Linda Hamilton" "Arnold Schwarzenegger" "Edward Furlong" "Robert Patrick"]]
["The Terminator" ["Arnold Schwarzenegger" "Linda Hamilton" "Michael Biehn"]]
["Rambo III" ["Richard Crenna" "Sylvester Stallone" "Marc de Jonge"]]
["Predator 2" ["Gary Busey" "Danny Glover" "Ruben Blades"]]
["Lethal Weapon" ["Gary Busey" "Mel Gibson" "Danny Glover"]]
["Lethal Weapon 2" ["Mel Gibson" "Joe Pesci" "Danny Glover"]]
["Lethal Weapon 3" ["Joe Pesci" "Danny Glover" "Mel Gibson"]]
["Alien" ["Tom Skerritt" "Veronica Cartwright" "Sigourney Weaver"]]
["Aliens" ["Carrie Henn" "Sigourney Weaver" "Michael Biehn"]]
["Die Hard" ["Alan Rickman" "Bruce Willis" "Alexander Godunov"]]
["Rambo: First Blood Part II" ["Richard Crenna" "Sylvester Stallone" "Charles Napier"]]
["Commando" ["Arnold Schwarzenegger" "Alyssa Milano" "Rae Dawn Chong"]]
["Mad Max 2" ["Bruce Spence" "Mel Gibson" "Michael Preston"]]
["Mad Max" ["Joanne Samuel" "Steve Bisley" "Mel Gibson"]]
["RoboCop" ["Nancy Allen" "Peter Weller" "Ronny Cox"]]
["Braveheart" ["Sophie Marceau" "Mel Gibson"]]
["Mad Max Beyond Thunderdome" ["Mel Gibson" "Tina Turner"]]
["Predator" ["Carl Weathers" "Elpidia Carrillo" "Arnold Schwarzenegger"]]
["Terminator 3: Rise of the Machines" ["Nick Stahl" "Arnold Schwarzenegger" "Claire Danes"]]])
(def conn (d/create-conn {:film/cast {:db/valueType :db.type/ref
:db/cardinality :db.cardinality/many}
:film/name {:db/unique :db.unique/identity
:db/cardinality :db.cardinality/one}
:actor/name {:db/unique :db.unique/identity
:db/cardinality :db.cardinality/one}}))
(def all-datoms (mapcat (fn [[film actors]]
(into [{:film/name film}]
(map #(hash-map :actor/name %) actors)))
data))
(def all-relations (mapv (fn [[film actors]]
{:db/id [:film/name film]
:film/cast (mapv #(vector :actor/name %) actors)}) data))
(d/transact! conn all-datoms)
(d/transact! conn all-relations)
Description: in a nutshell, there are two kinds of entities in this database, films and actors (word intended to be ungendered), and three kinds of datoms:
film entity: :film/name (a unique string)
film entity: :film/cast (multiple refs)
actor entity: :actor/name (unique string)
Question: I would like to construct a query which asks: which films have these N actors, and these N actors alone, as their sole stars, for N>=2?
E.g., RoboCop starred Nancy Allen, Peter Weller, Ronny Cox, but no film starred solely the first two of these, Allen and Weller. Therefore, I would expect the following query to produce the empty set:
(d/q '[:find ?film-name
:where
[?film :film/name ?film-name]
[?film :film/cast ?actor-1]
[?film :film/cast ?actor-2]
[?actor-1 :actor/name "Nancy Allen"]
[?actor-2 :actor/name "Peter Weller"]]
@conn)
; => #{["RoboCop"]}
However, the query is flawed because I don't know how to express that any matches should exclude any actors who are not Allen or Weller—again, I want to find the movies where only Allen and Weller have collaborated without any other actors, so I want to adapt the above query to produce the empty set. How can I adjust this query to enforce this requirement?
Because DataScript doesn't have negation (as of May 2016), I don't believe that's possible with one static query in 'pure' Datalog.
My way to go would be:
Build the query programmatically, adding the N clauses that state that the cast must contain the N actors.
Add a predicate function which, given a movie, the database, and the set of actor ids, uses the EAVT index to check that the movie has no actor outside the set.
Here's a basic implementation:
(defn only-those-actors? [db movie actors]
  ;; True when every :film/cast ref of `movie` is in the `actors` set.
  (every? (fn [datom] (contains? actors (:v datom)))
          (d/datoms db :eavt movie :film/cast)))
(defn find-movies-with-exact-cast [db actors-names]
  ;; Resolve the names to entity ids first.
  (let [actors (set (d/q '[:find [?actor ...]
                           :in $ [?name ...]
                           :where [?actor :actor/name ?name]]
                         db actors-names))
        ;; One [?movie :film/cast id] pattern per actor, then the predicate,
        ;; which is passed in as an input and receives the db as $.
        query {:find '[[?movie ...]]
               :in '[$ ?actors ?only-those-actors?]
               :where (vec (concat
                             (for [actor actors]
                               ['?movie :film/cast actor])
                             ['[(?only-those-actors? $ ?movie ?actors)]]))}]
    (d/q query db actors only-those-actors?)))
You can use a predicate function together with d/entity to filter by the :film/cast field of an entity. This approach is more straightforward as long as DataScript doesn't support negation (the not operator and so on).
Look at the line (= a (:age (d/entity db e))) in this DataScript test case:
[{:db/id 1 :name "Ivan" :age 10}
{:db/id 2 :name "Ivan" :age 20}
{:db/id 3 :name "Oleg" :age 10}
{:db/id 4 :name "Oleg" :age 20}]
...
(let [pred (fn [db e a]
(= a (:age (d/entity db e))))]
(is (= (q/q '[:find ?e
:in $ ?pred
:where [?e :age ?a]
[(?pred $ ?e 10)]]
db pred)
#{[1] [3]})))))
In your case, the predicate body could look something like this
(clojure.set/subset? actors (:film/cast (d/entity db e)))
In regards to performance, the d/entity call is fast because it is a lookup by index.
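To make the predicate-plus-d/entity idea concrete, here is a sketch on a reduced copy of the film data. It assumes DataScript 1.x; `exact-cast?` and `films-with-exact-cast` are hypothetical names. It compares sets for equality rather than using subset?, because the question asks for an exact match, and it maps :db/id over the cast because d/entity returns entities for ref attributes.

```clojure
(require '[datascript.core :as d])

(def schema {:film/cast  {:db/valueType   :db.type/ref
                          :db/cardinality :db.cardinality/many}
             :film/name  {:db/unique :db.unique/identity}
             :actor/name {:db/unique :db.unique/identity}})

(def db
  (d/db-with (d/empty-db schema)
             [{:db/id -1 :actor/name "Mel Gibson"}
              {:db/id -2 :actor/name "Tina Turner"}
              {:db/id -3 :actor/name "Sophie Marceau"}
              {:db/id -4 :film/name "Braveheart" :film/cast [-3 -1]}
              {:db/id -5 :film/name "Mad Max Beyond Thunderdome"
               :film/cast [-1 -2]}]))

;; True when the film's cast is exactly the given set of actor ids.
(defn exact-cast? [db film actor-ids]
  (= actor-ids
     (set (map :db/id (:film/cast (d/entity db film))))))

(defn films-with-exact-cast [db names]
  (let [actor-ids (set (d/q '[:find [?a ...]
                              :in $ [?n ...]
                              :where [?a :actor/name ?n]]
                            db names))]
    (d/q '[:find [?film-name ...]
           :in $ ?exact-cast? ?actor-ids
           :where
           [?film :film/name ?film-name]
           [(?exact-cast? $ ?film ?actor-ids)]]
         db exact-cast? actor-ids)))

(films-with-exact-cast db ["Mel Gibson" "Tina Turner"])
```

One caveat of this sketch: a name that is not in the database is silently dropped from `actor-ids`, so a caller may want to check that every name resolved before trusting the result.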

Possible to get enum value via Datomic pull syntax?

In the mbrainz sample data, the :artist/type is an enum. Is it possible to pull the value of the enum out of :db/ident and associate it as the value of the :artist/type key using pull syntax?
This is as close as I could get:
[:find (pull ?e [:artist/name {:artist/type [:db/ident]}])
:where
[?e :artist/name "Ray Charles"]
]
;;=> [[{:artist/name "Ray Charles", :artist/type {:db/ident :artist.type/person}}]]
Is it possible to use pull syntax to reshape the result into something like this?
;;=> [[{:artist/name "Ray Charles", :artist/type :artist.type/person}]]
I don't think you can do it using the Pull API the way you are seeking. You may find that it is easier to use the Tupelo Datomic library:
(require '[tupelo.datomic :as td]
'[tupelo.core :refer [spyx]] )
(let [x1 (td/query-scalar :let [$ db-val]
:find [ ?e ]
:where [ [?e :artist/name "Ray Charles"] ] )
x2 (td/entity-map db-val x1)
]
(spyx x1)
(spyx x2)
)
which gives the result:
x1 => 17592186049074
x2 => {:artist/sortName "Charles, Ray", :artist/name "Ray Charles",
:artist/type :artist.type/person, :artist/country :country/US,
:artist/gid #uuid "2ce02909-598b-44ef-a456-151ba0a3bd70",
:artist/startDay 23, :artist/endDay 10, :artist/startYear 1930,
:artist/endMonth 6, :artist/endYear 2004, :artist/startMonth 9,
:artist/gender :artist.gender/male}
So :artist/type is already converted into the :db/ident value and you can just pull it out of the map.
You can use Specter on the result that the pull expression returns:
(->> pull-result
(sp/transform (sp/walker :db/ident) :db/ident))
The value of key :db/ident is extracted for every map that has that key.
This was quite easy to do with postwalk.
Any pulled :db/ident map can be transformed with this function:
(defn flatten-ident [coll]
(clojure.walk/postwalk
(fn [item] (get item :db/ident item)) coll))
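For completeness, here is the postwalk function applied to the pull result shown in the question. This is plain Clojure; `pulled` is just the literal result copied from the question, not a live query.

```clojure
(require '[clojure.walk :as walk])

;; Same function as in the answer, repeated so the sketch is self-contained.
(defn flatten-ident [coll]
  (walk/postwalk (fn [item] (get item :db/ident item)) coll))

;; The pull result from the question, as a literal.
(def pulled
  [[{:artist/name "Ray Charles"
     :artist/type {:db/ident :artist.type/person}}]])

(flatten-ident pulled)
```

Be aware that this replaces every map containing :db/ident with the ident itself, so any other attributes pulled alongside :db/ident in the same map are dropped.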