I tried to use directly Clojure's hashmap with MapDB and ran into weird behaviour. I checked Clojure and MapDB sources and couldn't understand the problem.
First everything looks fine:
lein try org.mapdb/mapdb "1.0.6"
; defining a db for the first time
(import [org.mapdb DB DBMaker])
(defonce db (-> (DBMaker/newFileDB (java.io.File. "/tmp/mapdb"))
.closeOnJvmShutdown
.compressionEnable
.make))
(defonce fruits (.getTreeMap db "fruits-store"))
(do (.put fruits :banana {:qty 2}) (.commit db))
(get fruits :banana)
=> {:qty 2}
(:qty (get fruits :banana))
=> 2
(first (keys (get fruits :banana)))
=> :qty
(= :qty (first (keys (get fruits :banana))))
=> true
CTRL-D
=> Bye for now!
Then I try to access the data again:
lein try org.mapdb/mapdb "1.0.6"
; loading previsously created db
(import [org.mapdb DB DBMaker])
(defonce db (-> (DBMaker/newFileDB (java.io.File. "/tmp/mapdb"))
.closeOnJvmShutdown
.compressionEnable
.make))
(defonce fruits (.getTreeMap db "fruits-store"))
(get fruits :banana)
=> {:qty 2}
(:qty (get fruits :banana))
=> nil
(first (keys (get fruits :banana)))
=> :qty
(= :qty (first (keys (get fruits :banana))))
=> false
(class (first (keys (get fruits :banana))))
=> clojure.lang.Keyword
How come the same keyword be different with respect to = ?
Is there some weird reference problem happening ?
The problem is caused by the way equality of keywords works. Looking at the
implementation of the = function we see that since keywords are not
clojure.lang.Number or clojure.lang.IPersistentCollection their equality is
determined in terms of the Object.equals method. Skimming the source of
clojure.lang.Keyword we learn that keywords don't not override
Object.equals and therefore two keywords are equal iff they are the same
object.
The default serializer of MapDB is org.mapdb.SerializerPojo, a subclass of
org.mapdb.SerializerBase. In its documentation we can read that
it's a
Serializer which uses ‘header byte’ to serialize/deserialize most of classes
from ‘java.lang’ and ‘java.util’ packages.
Unfortunately, it doesn't work that well with clojure.lang classes; It doesn't
preserve identity of keywords, thus breaking equality.
In order to fix it let's attempt to write our own serializer using the
EDN format—alternatively, you could consider, say, Nippy—and use
it in our MapDB.
(require '[clojure.edn :as edn])
(deftype EDNSeralizer []
;; See docs of org.mapdb.Serializer for semantics.
org.mapdb.Serializer
(fixedSize [_]
-1)
(serialize [_ out obj]
(.writeUTF out (pr-str obj)))
(deserialize [_ in available]
(edn/read-string (.readUTF in)))
;; MapDB expects serializers to be serializable.
java.io.Serializable)
(def edn-serializer (EDNSeralizer.))
(import [org.mapdb DB DBMaker])
(def db (.. (DBMaker/newFileDB (java.io.File. "/tmp/mapdb"))
closeOnJvmShutdown
compressionEnable
make))
(def more-fruits (.. db
(createTreeMap "more-fruits")
(valueSerializer (EDNSeralizer.))
(makeOrGet)))
(.put more-fruits :banana {:qty 2})
(.commit db)
Once the more-fruits tree map is reopened in a JVM with EDNSeralizer defined
the :qty object stored inside will be the same object as any other :qty
instance. As a result equality checks will work properly.
Related
I have many JSON objects, and I am trying to filter those objects by the date. These objects are being parsed from several JSON files using Cheshire.core, meaning that the JSON objects are in a collection. The date is being passed in in the following format "YYYY-MM-DD" (eg. 2015-01-10). I have tried using the filter and contains? functions to do this, but I am having no luck so far. How can I filter these JSON objects by my chosen date?
Current Clojure code:
(def filter-by-date?
(fn [orders-data date-chosen]
(contains? (get (get orders-data :date) :date) date-chosen)))
(prn (filter (filter-by-date? orders-data "2017-12-25")))
Example JSON object:
{
"id":"05d8d404-b3f6-46d1-a0f9-dbdab7e0261f",
"date":{
"date":"2015-01-10T19:11:41.000Z"
},
"total":{
"GBP":57.45
}
}
JSON after parsing with Cheshire:
[({:id "05d8d404-b3f6-46d1-a0f9-dbdab7e0261f",
:date {:date "2015-01-10T19:11:41.000Z"},
:total {:GBP 57.45}}) ({:id "325bd04-b3f6-46d1-a0f9-dbdab7e0261f",
:date {:date "2015-02-23T10:15:14.000Z"},
:total {:GBP 32.90}})]
First, I'm going to assume you've parsed the JSON first into something like this:
(def parsed-JSON {:id "05d8d404-b3f6-46d1-a0f9-dbdab7e0261f",
:date {:date "2015-01-10T19:11:41.000Z"},
:total {:GBP 57.45}})
The main problem is the fact that the date as stored in the JSON contains time information, so you aren't going to be able to check it directly using equality.
You can get around this by using clojure.string/starts-with? to check for prefixes. I'm using s/ here as an alias for clojure.string:
(defn filter-by-date [date jsons]
(filter #(s/starts-with? (get-in % [:date :date]) date)
jsons))
You were close, but I made a few changes:
You can't use contains? like that. From the docs of contains?: Returns true if key is present in the given collection, otherwise returns false. It can't be used to check for substrings; it's used to test for the presence of a key in a collection.
Use -in postfix versions to access nested structures instead of using multiple calls. I'm using (get-in ...) here instead of (get (get ...)).
You're using (def ... (fn [])) which makes things more complicated than they need to be. This is essentially what defn does, although defn also adds some more stuff as well.
To address the new information, you can just flatten the nested sequences containing the JSONs first:
(->> nested-json-colls ; The data at the bottom of the question
(flatten)
(filter-by-date "2015-01-10"))
#!/usr/bin/env boot
(defn deps [new-deps]
(merge-env! :dependencies new-deps))
(deps '[[org.clojure/clojure "1.9.0"]
[cheshire "5.8.0"]])
(require '[cheshire.core :as json]
'[clojure.string :as str])
(def orders-data-str
"[{
\"id\":\"987654\",
\"date\":{
\"date\":\"2015-01-10T19:11:41.000Z\"
},
\"total\":{
\"GBP\":57.45
}
},
{
\"id\":\"123456\",
\"date\":{
\"date\":\"2016-01-10T19:11:41.000Z\"
},
\"total\":{
\"GBP\":23.15
}
}]")
(def orders (json/parse-string orders-data-str true))
(def ret (filter #(clojure.string/includes? (get-in % [:date :date]) "2015-01-") orders))
(println ret) ; ({:id 987654, :date {:date 2015-01-10T19:11:41.000Z}, :total {:GBP 57.45}})
You can convert the date string to Date object using any DateTime library like joda-time and then do a proper filter if required.
clj-time has functions for parsing strings and comparing date-time objects. So you could do something like:
(ns filter-by-time-example
(:require [clj-time.coerce :as tc]
[clj-time.core :as t]))
(def objs [{"id" nil
"date" {"date" "2015-01-12T19:11:41.000Z"}
"total" nil}
{"id" "05d8d404-b3f6-46d1-a0f9-dbdab7e0261f"
"date" {"date" "2015-01-10T19:11:41.000Z"}
"total" {"GBP" :57.45}}
{"id" nil
"date" {"date" "2015-01-11T19:11:41.000Z"}
"total" nil}])
(defn filter-by-day
[objs y m d]
(let [start (t/date-time y m d)
end (t/plus start (t/days 1))]
(filter #(->> (get-in % ["date" "date"])
tc/from-string
(t/within? start end)) objs)))
(clojure.pprint/pprint (filter-by-day objs 2015 1 10)) ;; Returns second obj
If you're going to repeatedly do this (e.g. for multiple days) you could parse all dates in your collection into date-time objects with
(map #(update-in % ["date" "date"] tc/from-string) objs)
and then just work with that collection to avoid repeating the parsing step.
(ns filter-by-time-example
(:require [clj-time.format :as f]
[clj-time.core :as t]
[cheshire.core :as cheshire]))
(->> json-coll
(map (fn [json] (cheshire/parse-string json true)))
(map (fn [record] (assoc record :dt-date (f/format (get-in record [:date :date])))))
(filter (fn [record] (t/after? (tf/format "2017-12-25") (:dt-date record))))
(map (fn [record] (dissoc record :dt-date))))
Maybe something like this? You might need to change the filter for your usecase but as :dt-time is now a jodo.DateTime you can leverage all the clj-time predicates.
I've defined a record with a bunch of fields--some of which are computed, some of which don't map directly to keys in the JSON data I'm ingesting. I'm writing a factory function for it, but I want to have sensible default/not-found values. Is there a better way that tacking on :or [field1 "" field2 "" field3 "" field4 ""...]? I could write a macro but I'd rather not if I don't have to.
There are three common idioms for implementing defaults in constructor functions.
:or destructoring
Example:
(defn make-creature [{:keys [type name], :or {type :human
name (str "unnamed-" (name type))}}]
;; ...
)
This is useful when you want to specify the defaults inline. As a bonus, it allows let style bindings in the :or map where the kvs are ordered according to the :keys vector.
Merging
Example:
(def default-creature-spec {:type :human})
(defn make-creature [spec]
(let [spec (merge default-creature-spec
spec)]
;; ....
))
This is useful when you want to define the defaults externally, generate them at runtime and/or reuse them elsewhere.
Simple or
Example:
(defn make-creature [{:keys [type name]}]
(let [type (or type :human)
name (or name (str "unnamed-" (name type)))]
;; ...
))
This is as useful as :or destructoring but only those defaults are evaluated that are actually needed, i. e. it should be used in cases where computing the default adds unwanted overhead. (I don't know why :or evaluates all defaults (as of Clojure 1.7), so this is a workaround).
If you really want the same default value for all the fields, and they really have to be different than nil, and you don't want to write them down again, then you can get the record fields by calling keys on an empty instance, and then construct a map with the default values merged with the actual values:
(defrecord MyFancyRecord [a b c d])
(def my-fancy-record-fields (keys (map->MyFancyRecord {})))
;=> (:a :b :c :d)
(def default-fancy-fields (zipmap my-fancy-record-fields (repeat "")))
(defn make-fancy-record [fields]
(map->MyFancyRecord (merge default-fancy-fields
fields)))
(make-fancy-record {})
;=> {:a "", :b "", :c "", :d ""}
(make-fancy-record {:a 1})
;=> {:a 1, :b "", :c "", :d ""}
To get the list of record fields you could also use the static method getBasis on your record class:
(def my-fancy-record-fields (map keyword (MyFancyRecord/getBasis)))
(getBasis is not part of the public records api so there are no guarantees it won't be removed in future clojure versions. Right now it's available in both clojure and clojurescript, it's usage is explained in "Clojure programming by Chas Emerick, Brian Carper, Christophe Grand" and it's also mentioned in this thread during a discussion about how to get the keys from a record. So, it's up to you to decide if it's a good idea to use it)
I'm generating json as literally as I can in clojure. My problem is that certain branches of the json are only present if given parameters are given. Here is a sample of such a condition
(defn message-for
[name uuid & [generated-uuids]]
{:message {:id (generate-uuid)
:details {:name name}
:metadata {:batch (merge {:id uuid}
(when generated-uuids (let [batches (map #(array-map :id %) generated-uuids)]
{:generatedBatches batches})))}}})
Unfortunately the when/let part is quite ugly. This same could be achieved using when-let as following but it doesn't work because my map returns [] instead of a nil.
(defn message-for
[name uuid & [generated-uuids]]
{:message {:id (generate-uuid)
:details {:name name}
:metadata {:batch (merge {:id uuid}
(when-let [batches (map #(array-map :id %) generated-uuids)]
{:generatedBatches batches}))}}})
Any ideas if I could somehow make when-let consider an empty list/array/seq as false so I could clean up my code a bit?
not-empty returns its argument if it is not empty.
When using when-let with a collection, always use not-empty
to retain the collection type
make refactoring easier
expressivenes
(when-let [batches (not-empty (map ...))]
...)
In your case I'd however prefer something like this:
...
:metadata {:batch (cond-> {:id uuid}
(seq generated-uuids)
(assoc :generatedBatches (map ...)))}
...
Notice that all three of the advantages listed above where met, without a nested let.
Also notice a new advantage
easier to extend with more conditions lateron
seq returns nil on an empty input sequence so you could do:
(when-let [batches (seq (map #(array-map :id %) generated-uuids))]
{:generatedBatches batches}))}}})
I'm trying to call clojure.walk/stringify-keys on a map that might include record instances. Since stringify-keys is recursive, it attempts to convert the keys on my record, (since (map? record-var) is true) which causes an error. Is there any way to tell if a var is a record rather than just a Clojure map? I
d like to provide my own implementation of stringify-keys that is record-aware.
The current implementation of stringify-keys causes the following:
(use '[clojure.walk :only [stringify-keys]])
(defrecord Rec [x])
(let [record (Rec. "foo")
params {:x "x" :rec record}]
(stringify-keys params))
This causes the following exception: UnsupportedOperationException Can't create empty: user.Rec user.Rec (NO_SOURCE_FILE:1)
Records seem to implement the IRecord marker interface:
user=> (defrecord Rec [x])
user.Rec
user=> (require '[clojure.reflect :as r])
nil
user=> (:bases (r/reflect Rec))
#{java.io.Serializable clojure.lang.IKeywordLookup clojure.lang.IPersistentMap clojure.lang.IRecord java.lang.Object clojure.lang.IObj clojure.lang.ILookup java.util.Map}
user=> (instance? clojure.lang.IRecord (Rec. "hi"))
true
Update
1.6 now has the record? functions
you can check the type of each member and see if it is really a map or something else (where something else is presumed to be a record)
user> (type {:a 1 :b 2})
clojure.lang.PersistentArrayMap
If I have the request "size=3&mean=1&sd=3&type=pdf&distr=normal" what's the idiomatic way of writing the function (defn request->map [request] ...) that takes this request and
returns a map {:size 3, :mean 1, :sd 3, :type pdf, :distr normal}
Here is my attempt (using clojure.walk and clojure.string):
(defn request-to-map
[request]
(keywordize-keys
(apply hash-map
(split request #"(&|=)"))))
I am interested in how others would solve this problem.
Using form-decode and keywordize-keys:
(use 'ring.util.codec)
(use 'clojure.walk)
(keywordize-keys (form-decode "hello=world&foo=bar"))
{:foo "bar", :hello "world"}
Assuming you want to parse HTTP request query parameters, why not use ring? ring.middleware.params contains what you want.
The function for parameter extraction goes like this:
(defn- parse-params
"Parse parameters from a string into a map."
[^String param-string encoding]
(reduce
(fn [param-map encoded-param]
(if-let [[_ key val] (re-matches #"([^=]+)=(.*)" encoded-param)]
(assoc-param param-map
(codec/url-decode key encoding)
(codec/url-decode (or val "") encoding))
param-map))
{}
(string/split param-string #"&")))
You can do this easily with a number of Java libraries. I'd be hesitant to try to roll my own parser unless I read the URI specs carefully and made sure I wasn't missing any edge cases (e.g. params appearing in the query twice with different values). This uses jetty-util:
(import '[org.eclipse.jetty.util UrlEncoded MultiMap])
(defn parse-query-string [query]
(let [params (MultiMap.)]
(UrlEncoded/decodeTo query params "UTF-8")
(into {} params)))
user> (parse-query-string "size=3&mean=1&sd=3&type=pdf&distr=normal")
{"sd" "3", "mean" "1", "distr" "normal", "type" "pdf", "size" "3"}
Can also use this library for both clojure and clojurescript: https://github.com/cemerick/url
user=> (-> "a=1&b=2&c=3" cemerick.url/query->map clojure.walk/keywordize-keys)
{:a "1", :b "2", :c "3"}
Yours looks fine. I tend to overuse regexes, so I would have solved it as
(defn request-to-keywords [req]
(into {} (for [[_ k v] (re-seq #"([^&=]+)=([^&]+)" req)]
[(keyword k) v])))
(request-to-keywords "size=1&test=3NA=G")
{:size "1", :test "3NA=G"}
Edit: try to stay away from clojure.walk though. I don't think it's officially deprecated, but it's not very well maintained. (I use it plenty too, though, so don't feel too bad).
I came across this question when constructing my own site and the answer can be a bit different, and easier, if you are passing parameters internally.
Using Secretary to handle routing: https://github.com/gf3/secretary
Parameters are automatically extracted to a map in :query-params when a route match is found. The example given in the documentation:
(defroute "/users/:id" [id query-params]
(js/console.log (str "User: " id))
(js/console.log (pr-str query-params)))
(defroute #"/users/(\d+)" [id {:keys [query-params]}]
(js/console.log (str "User: " id))
(js/console.log (pr-str query-params)))
;; In both instances...
(secretary/dispach! "/users/10?action=delete")
;; ... will log
;; User: 10
;; "{:action \"delete\"}"
You can use ring.middleware.params. Here's an example with aleph:
user=> (require '[aleph.http :as http])
user=> (defn my-handler [req] (println "params:" (:params req)))
user=> (def server (http/start-server (wrap-params my-handler)))
wrap-params creates an entry in the request object called :params. If you want the query parameters as keywords, you can use ring.middleware.keyword-params. Be sure to wrap with wrap-params first:
user=> (require '[ring.middleware.params :refer [wrap-params]])
user=> (require '[ring.middleware.keyword-params :refer [wrap-keyword-params])
user=> (def server
(http/start-server (wrap-keyword-params (wrap-params my-handler))))
However, be mindful that this includes a dependency on ring.