Easiest way to search records in Clojure - clojure

I have a map in Clojure something like this:
(def stuff #{
{:a "help" :b "goodbye"}
{:c "help2" :b "goodbye"}
{:a "steve" :b "goodbye"}
{:c "hello2" :b "sue"}
})
: and I want to provide a search so that:
(search stuff "help")
: would return :
#{
{:a "help" :b "goodbye"}
{:c "help2" :b "goodbye"}
}
: What is the simplest way to do this?

user=> (defn search [s q] (select #(some (partial re-find (re-pattern q)) (vals %)) s))
#'user/search
user=> (search stuff "help")
#{{:a "help", :b "goodbye"} {:c "help2", :b "goodbye"}}
This does the trick.

Full text search is a different topic, but if you can live with regexps I would use something like:
(defn match [re e]
(re-find re (:a e))
(defn search [re m]
(into #{} (filter (partial match re) m)))

(filter (comp #{"help"} :a) stuff): the freshly-composed function first calls :a on the target, then calls #{"help"} on the result: this returns a truthy value iff the :a attribute is exactly "help".
Converting this into a set, and encapsulating it in a function with the arguments you want to tweak, is left as an exercise for the reader. Frankly, though, the code is so simple that it might well be shorter and more readable to rewrite it each time you want to do a "search".

Related

How do I exit a Clojure walk postwalk on a nested maps on the first true predicate match?

I am using clojure.walk/postwalk to compare a predicate to every map in a nested collection and want to exit with true on the first true. How would I do that? I am ok with it walking the whole data structure and then returning true if there is a true match.
As a corollary question, I guess the same question could apply to when one performs a map as opposed to a postwalk.
UPDATE: this was truly a tired/lazy question; I should have provided a code example. That said, I'm leaving it up in case anyone is currently formulating an answer to my half-baked question. The only thing that is worse than asking one is taking it down after someone has been kind enough to start helping. I will be quite content if no one answers, if they request a better question, or if they just give me suggestions of what to research.
a bit different way to do it, also employing tree-seq:
(defn find-deep [pred data not-found]
(->> data
(tree-seq coll? seq)
(some #(when (pred %) [%]))
((fnil first [not-found]))))
user> (find-deep #(= (:c %) 30) [{:a 10 :b [{:c 20 :d {:c 30}}]}] ::none)
;;=> {:c 30}
user> (find-deep #(= (:c %) 40) [{:a 10 :b [{:c 20 :d {:c 30}}]}] ::none)
;;=> :user/none
You may be interested in this function I call walk-seq. It returns a lazy depth-first sequence over a data structure which you can then seek against to find the first match. I find it to be preferable here because it doesn't require callbacks and exceptions to exit early like clojure.walk/postwalk would.
(defn walk-seq
"Returns a lazy depth-first sequence of all forms within a data structure."
[form]
(tree-seq coll? seq form))
(defn seek
"Find the first element in the collection that matches pred,
else returns not-found. Note that using seek can lead to
poor performance and you should always use indexed data
structures instead of multiple seeks over the same data."
([pred coll]
(seek pred coll nil))
([pred coll not-found]
(reduce (fn [nf x] (if (pred x) (reduced x) nf)) not-found coll)))
Usage of walk-seq:
(walk-seq {:a [{:b -1} {:b 1}] :b 2})
=>
({:a [{:b -1} {:b 1}], :b 2}
[:a [{:b -1} {:b 1}]]
:a
[{:b -1} {:b 1}]
{:b -1}
[:b -1]
:b
-1
{:b 1}
[:b 1]
:b
1
[:b 2]
:b
2)
Combining the two:
(seek (every-pred number? pos?) (walk-seq {:a [{:b -1} {:b 1}] :b 2}))
=>
1
It can be done using postwalk by throwing an exception once the predicate is true as I suggested in the comment. This approach is unconventional but concise and lets us reuse the logic of postwalk for walking the datastructure:
(defn walk-some [pred data]
(try
(clojure.walk/postwalk
#(if (pred %)
(throw (ex-info "Found" {:data %}))
%)
data)
false
(catch clojure.lang.ExceptionInfo e
true)))
(walk-some #(and (number? %) (odd? %)) {:a [[9] 3]})
;; => true
(walk-some #(and (number? %) (even? %)) {:a [[9] 3]})
;; => false
Using exceptions for control flow is rarely needed but occasionally it useful to deviate a bit from convention. You may want to define a custom exception type for improved robustness in case your predicate can throw objects of type ExceptionInfo.

accessing Clojure's thread-first macro arguments

I was wondering if there was a way to access the arguments value of a thread-first macro in Clojure while it is being executed on.
for example:
(def x {:a 1 :b 2})
(-> x
(assoc :a 20) ;; I want the value of x after this step
(assoc :b (:a x))) ;; {:a 20, :b 1}
It has come to my attention that this works:
(-> x
(assoc :a 20)
((fn [x] (assoc x :b (:a x))))) ;; {:a 20, :b 20}
But are there any other ways to do that?
You can use as->:
(let [x {:a 1 :b 2}]
(as-> x it
(assoc it :a 20)
(assoc it :b (:a it))))
In addition to akond's comment, note that using as-> can get quite confusing quickly. I recommend either extracting a top level function for these cases, or trying to use as-> in -> only:
(-> something
(process-something)
(as-> $ (do-something $ (very-complicated $)))
(finish-processing))

Lazy self-recursive data structures in Clojure

Is there a way to describe arbitrary lazy self-recursive data structures in Clojure?
Let's say for example I wanted to do something like this:
(def inf-seq (fn rec [] (lazy-seq (cons 42 (rec)))))
(take 3 (inf-seq))
but with a map:
(def inf-map (fn rec [] (??? {:a (rec) :b 42})))
(get-in (inf-map) [:a :a :a :b])
Sequence laziness does not apply to deferred function evaluation in Clojure, which you would obviously need for constructing infinitely nested maps.
Try using Delays:
user=> (def inf-map (fn rec [] {:a (delay (rec)) :b 42}))
#'user/inf-map
user=> (inf-map)
{:a #<Delay#4e9f9a19: :pending>, :b 42}
user=> #(:a (inf-map))
{:a #<Delay#5afd479c: :pending>, :b 42}

dissoc in clojure can't get to work

I have this function:
(defn dissoc-all [m kv]
(let [[k & ks] kv]
(dissoc m k ks)))
Where m is the map and kv is the vector of keys. I use it like this:
(dissoc-all {:a 1 :b 2} [:a :b])
=>{:b 2}
This is not what I've expected. ks has :b but I don't know why it is not being use by dissoc. Anyone can help me with this?
Edit: Added question is that why is this not triggering the 3rd overload of dissoc, which is dissoc [map key & ks]?
Changed name from dissoc-in to dissoc-all as noisesmith have said, -in is not a proper name for this and I agree.
This won't work because ks is a collection of all the elements in kv after the first. So instead of :b it is [:b].
Instead, you can just use apply:
(defn dissoc-in [m vs]
(apply dissoc m vs))
Also, dissoc-in is an odd name for this function, because the standard functions with -in in the name all do nested access, and this does not use the keys to do any nested access of the map.
Why not something like this?
(defn dissoc-all [m ks]
(apply dissoc m ks))
(dissoc-all {:a 1 :b 2} [:a :b])
=> {}
The reason the third overlod of dissoc is not getting called is because it does not expect a collection of keys like [:a :b] - it expects just the keys.
For example:
(dissoc {:a "a" :b "b" :c "c" :d "d"} :a :b :c)
=> {:d "d"}
Further to noisesmith's answer:
You're being confused by the overloads/arities of dissoc, which have this simple effect:
[m & ks]
"Returns a new map of the same (hashed/sorted) type,
that does not contain a mapping for any of ks. "
The explicit arities for no keys and one key are for performance. Many clojure functions are so organised, and the documentation follows the organisation, not the underlying idea.
Now, the action of
(dissoc-all {:a 1 :b 2} [:a :b])
;{:b 2}
is to bind
k to :a
ks to [:b]
Note the latter. The example removes the :a but fails to remove the [:b], which isn't there.
You can use apply to crack open ks:
(defn dissoc-all [m kk]
(let [[k & ks] kk]
(apply dissoc m k ks)))
(dissoc-all {:a 1 :b 2} [:a :b])
;{}
... or, better, do as #noisesmith does and short-circuit the destructuring, using apply at once.

Clojure: How to pass two sets of unbounded parameters?

Contrived example to illustrate:
(def nest1 {:a {:b {:c "foo"}}})
(def nest2 {:d {:e "bar"}})
If I wanted to conj these nests at arbitrary levels, I could explicitly do this:
(conj (-> nest1 :a :b) (-> nest2 :d)) ; yields {:e "bar", :c "foo"}
(conj (-> nest1 :a) (-> nest2 :d)) ; yields {:e "bar", :b {:c "foo"}}
But what if I wanted to create a function that would accept the "depth" of nest1 and nest2 as parameters?
; Does not work, but shows what I am trying to do
(defn join-nests-by-paths [nest1-path nest2-path]
(conj (-> nest1 nest1-path) (-> nest2 nest2-path))
And I might try to call it like this:
; Does not work
(join-nests-by-paths '(:a :b) '(:d))
This doesn't work. I can't simply pass each "path" as a list to the function (or maybe I can, but need to work with it differently in the function).
Any thoughts? TIA...
Sean
Use get-in:
(defn join-by-paths [path1 path2]
(conj (get-in nest1 path1) (get-in nest2 path2)))
user> (join-by-paths [:a :b] [:d])
{:e "bar", :c "foo"}
user> (join-by-paths [:a] [:d])
{:e "bar", :b {:c "foo"}}
Your version is actually doing something like this:
user> (macroexpand '(-> nest1 (:a :b)))
(:a nest1 :b)
which doesn't work, as you said.
get-in has friends assoc-in and update-in, all for working with nested maps of maps. There's a dissoc-in somewhere in clojure.conrtrib.
(In Clojure it's more idiomatic to use vectors instead of quoted lists when you're passing around sequential groups of things.)