clojure set of maps - basic filtering - clojure

Clojure beginner here..
If I have a set of maps, such as
(def kids #{{:name "Noah" :age 5}
{:name "George":age 3}
{:name "Reagan" :age 1.5}})
I know I can get names like this
(map :name kids)
1) How do I select a specific map? For example
I want to get back the map where name="Reagan".
{:name "Reagan" :age 1.5}
Can this be done using a filter?
2) How about returning the name where age = 3?

Yes, you can do it with filter:
(filter #(= (:name %) "Reagan") kids)
(filter #(= (:age %) 3) kids)

There's clojure.set/select:
(clojure.set/select set-of-maps #(-> % :age (= 3)))
And similarly with name and "Reagan". The return value in this case will be a set.
You could also use filter without any special preparations, since filter calls seq on its collection argument (edit: as already described by ffriend while I was typing this):
(filter #(-> % :age (= 3))) set-of-maps)
Here the return value will be a lazy seq.
If you know there will only be one item satisfying your predicate in the set, some will be more efficient (as it will not process any additional elements after finding the match):
(some #(if (-> % :age (= 3)) %) set-of-maps)
The return value here will be the matching element.

Related

Clojure/FP: apply functions to each argument to an operator

Let's say I have several vectors
(def coll-a [{:name "foo"} ...])
(def coll-b [{:name "foo"} ...])
(def coll-c [{:name "foo"} ...])
and that I would like to see if the names of the first elements are equal.
I could
(= (:name (first coll-a)) (:name (first coll-b)) (:name (first coll-c)))
but this quickly gets tiring and overly verbose as more functions are composed. (Maybe I want to compare the last letter of the first element's name?)
To directly express the essence of the computation it seems intuitive to
(apply = (map (comp :name first) [coll-a coll-b coll-c]))
but it leaves me wondering if there's a higher level abstraction for this sort of thing.
I often find myself comparing / otherwise operating on things which are to be computed via a single composition applied to multiple elements, but the map syntax looks a little off to me.
If I were to home brew some sort of operator, I would want syntax like
(-op- (= :name first) coll-a coll-b coll-c)
because the majority of the computation is expressed in (= :name first).
I'd like an abstraction to apply to both the operator & the functions applied to each argument. That is, it should be just as easy to sum as compare.
(def coll-a [{:name "foo" :age 43}])
(def coll-b [{:name "foo" :age 35}])
(def coll-c [{:name "foo" :age 28}])
(-op- (+ :age first) coll-a coll-b coll-c)
; => 106
(-op- (= :name first) coll-a coll-b coll-c)
; => true
Something like
(defmacro -op-
[[op & to-comp] & args]
(let [args' (map (fn [a] `((comp ~#to-comp) ~a)) args)]
`(~op ~#args')))
Is there an idiomatic way to do this in clojure, some standard library function I could be using?
Is there a name for this type of expression?
For your addition example, I often use transduce:
(transduce
(map (comp :age first))
+
[coll-a coll-b coll-c])
Your equality use case is trickier, but you could create a custom reducing function to maintain a similar pattern. Here's one such function:
(defn all? [f]
(let [prev (volatile! ::no-value)]
(fn
([] true)
([result] result)
([result item]
(if (or (= ::no-value #prev)
(f #prev item))
(do
(vreset! prev item)
true)
(reduced false))))))
Then use it as
(transduce
(map (comp :name first))
(all? =)
[coll-a coll-b coll-c])
The semantics are fairly similar to your -op- macro, while being both more idiomatic Clojure and more extensible. Other Clojure developers will immediately understand your usage of transduce. They may have to investigate the custom reducing function, but such functions are common enough in Clojure that readers can see how it fits an existing pattern. Also, it should be fairly transparent how to create new reducing functions for use cases where a simple map-and-apply wouldn't work. The transducing function can also be composed with other transformations such as filter and mapcat, for cases when you have a more complex initial data structure.
You may be looking for the every? function, but I would enhance clarity by breaking it down and naming the sub-elements:
(let [colls [coll-a coll-b coll-c]
first-name (fn [coll] (:name (first coll)))
names (map first-name colls)
tgt-name (first-name coll-a)
all-names-equal (every? #(= tgt-name %) names)]
all-names-equal => true
I would avoid the DSL, as there is no need and it makes it much harder for others to read (since they don't know the DSL). Keep it simple:
(let [colls [coll-a coll-b coll-c]
vals (map #(:age (first %)) colls)
result (apply + vals)]
result => 106
I don't think you need a macro, you just need to parameterize your op function and compare functions. To me, you are pretty close with your (apply = (map (comp :name first) [coll-a coll-b coll-c])) version.
Here is one way you could make it more generic:
(defn compare-in [op to-compare & args]
(apply op (map #(get-in % to-compare) args)))
(compare-in + [0 :age] coll-a coll-b coll-c)
(compare-in = [0 :name] coll-a coll-b coll-c)
;; compares last element of "foo"
(compare-in = [0 :name 2] coll-a coll-b coll-c)
I actually did not know you can use get on strings, but in the third case you can see we compare the last element of each foo.
This approach doesn't allow the to-compare arguments to be arbitrary functions, but it seems like your use case mainly deals with digging out what elements you want to compare, and then applying an arbitrary function to those values.
I'm not sure this approach is better than the transducer version supplied above (certainly not as efficient), but I think it provides a simpler alternative when that efficiency is not needed.
I would split this process into three stages:
transform items in collections into the data in collections you want to operate
on - (map :name coll);
Operate on transformed items in collections, returning collection of results - (map = transf-coll-a transf-coll-b transf-coll-c)
Finally, selecting which result in resulting collection to return - (first calculated-coll)
When playing with collections, I try to put more than one item into collection:
(def coll-a [{:name "foo" :age 43} {:name "bar" :age 45}])
(def coll-b [{:name "foo" :age 35} {:name "bar" :age 37}])
(def coll-c [{:name "foo" :age 28} {:name "bra" :age 30}])
For example, matching items by second char in :name and returning result for items in second place:
(let
[colls [coll-a coll-b coll-c]
transf-fn (comp #(nth % 1) :name)
op =
fetch second]
(fetch (apply map op (map #(map transf-fn %) colls))))
;; => false
In transducers world you can use sequence function which also works on multiple collections:
(let
[colls [coll-a coll-b coll-c]
transf-fn (comp (map :name) (map #(nth % 1)))
op =
fetch second]
(fetch (apply sequence (map op) (map #(sequence transf-fn %) colls))))
Calculate sum of ages (for all items at the same level):
(let
[colls [coll-a coll-b coll-c]
transf-fn (comp (map :age))
op +
fetch identity]
(fetch (apply sequence (map op) (map #(sequence transf-fn %) colls))))
;; => (106 112)

Clojure Atom Doesn't Update When Wrapped in Defined Function

Not sure what is going on here, but I have this code, where the map function successfully executes in my repl without being wrapped in a defined function:
(def dogs '({:name "scout" :age 5} {:name "rux" :age 3} {:name "fenley" :age 2}))
(def ages (atom {:above-four '() :below-four '()}))
(map
#(if (> (get-in % [:age]) 4)
(swap! ages update-in [:above-four] merge %)
(swap! ages update-in [:below-four] merge %)) dogs)
#ages
=> {:above-four ({:name "scout", :age 5}), :below-four ({:name "fenley", :age 2} {:name "rux", :age 3})}
Yet, when I define the map function as such:
(def ages (atom {:above-four '() :below-four '()}))
(def dogs '({:name "scout" :age 5} {:name "rux" :age 3} {:name "fenley" :age 2}))
(defn test-dogs []
(map
#(if (> (get-in % [:age]) 4)
(swap! ages update-in [:above-four] merge %)
(swap! ages update-in [:below-four] merge %)) dogs)
#ages)
I get the following result:
=> {:above-four (), :below-four ()}
I'm very confused, because this function taken straight from the Clojure docs works just fine:
(def m1 (atom {:a "A" :b "B"}))
(defn update-m1 []
(swap! m1 assoc :a "Aaay")
#m1)
=> {:a "Aaay", :b "B"}
Because test-dogs uses map, it returns a lazy sequence. The elements of lazy sequences aren't realized until they're needed.
The problem with your set up is you're trying to use map to run a side effect (the call to swap!; an impure action), and never actually use the result of map. Because you never request results from map, the mapping function containing swap! never runs.
By using mapv (which returns a non-lazy vector), or doseq (which is meant to carry out side effects):
(doseq [dog dogs]
(let [k (if (> (:age dog) 4)
:above-four
:below-four)]
(swap! ages update k merge dog)))
You can force the side effects to run.
I cleaned up the code a bit. The -in versions you were using were unnecessary; as was the the call to get-in. I also got rid of the redundant calls to swap!.
Note though that at least in your example, use of atoms is entirely unnecessary. Even if you have a more complicated use case, make sure their use is justified. Mutable variables just aren't as common in languages like Clojure.

What is the idiomatic way to alter a vector that is stored in an atomized map?

I have an atom called app-state that holds a map. It looks like this:
{:skills [{:id 1 :text "hi"} {:id 2 :text "yeah"}]}
What is the idiomatic way to remove the element inside the vector with :id = 2 ? The result would look like:
{:skills [{:id 1 :text "hi"}]}
...
So far, I have this:
(defn new-list [id]
(remove #(= id (:id %)) (get #app-state :skills)))
swap! app-state assoc :skills (new-list 2)
It works, but I feel like this isn't quite right. I think it could be something like:
swap! app-state update-in [:skills] remove #(= id (:id %))
But this doesn't seem to work.
Any help is much appreciated!
Try this:
(defn new-list [app-state-map id]
(assoc app-state-map :skills (into [] (remove #(= id (:id %)) (:skills app-state-map)))))
(swap! app-state new-list 2)
swap! will pass the current value of the atom to the function you supply it. There's no need to dereference it yourself in the function.
See the docs on swap! for more details.
(swap! state update :skills (partial remove (comp #{2} :id)))
(def skills {:skills [{:id 1 :text "hi"} {:id 2 :text "yeah"}]})
(defn remove-skill [id]
(update skills :skills (fn [sks] (vec (remove #(= id (:id %)) sks)))))
You would then be able to call say (remove-skill 1) and see that only the other one (skill with :id of 2) is left.
I like your way better. And this would need to be adapted for use against an atom.
You can use filter to do this. Here is a function that takes an id and the map and let's you filter out anything that doesn't match your criteria. Of course, you can make the #() reader macro check for equality rather than inequality depending on your needs.
user=> (def foo {:skills [{:id 1 :text "hi"} {:id 2 :text "yeah"}]})
#'user/foo
user=> (defn bar [id sklz] (filter #(not= (:id %) id) (:skills sklz)))
#'user/bar
user=> (bar 1 foo)
({:id 2, :text "yeah"})
user=> (bar 2 foo)
({:id 1, :text "hi"})

Deep data structure match & replace first

I'm trying to figure out an idiomatic, performant, and/or highly functional way to do the following:
I have a sequence of maps that looks like this:
({:_id "abc" :related ({:id "123"} {:id "234"})}
{:_id "bcd" :related ({:id "345"} {:id "456"})}
{:_id "cde" :related ({:id "234"} {:id "345"})})
The :id fields can be assumed to be unique within any one :_id.
In addition, I have two sets:
ids like ("234" "345") and
substitutes like ({:id "111"} {:id "222"})
Note that the fact that substitutes only has :id in this example doesn't mean it can be reduced to a collection of ids. This is a simplified version of a problem and the real data has other key/value pairs in the map that have to come along.
I need to return a new sequence that is the same as the original but with the values from substitutes replacing the first occurrence of the matching id from ids in the :related collections of all of the items. So what the final collection should look like is:
({:_id "abc" :related ({:id "123"} {:id "111"})}
{:_id "bcd" :related ({:id "222"} {:id "456"})}
{:_id "cde" :related ({:id "234"} {:id "345"})})
I'm sure I could eventually code up something that involves nesting maps and conditionals (thinking in iterative terms about loops of loops) but that feels to me like I'm not thinking functionally or cleverly enough given the tools I might have available, either in clojure.core or extensions like match or walk (if those are even the right libraries to be looking at).
Also, it feels like it would be much easier without the requirement to limit it to a particular strategy (namely, subbing on the first match only, ignoring others), but that's a requirement. And ideally, a solution would be adaptable to a different strategy down the line (e.g. a single, but randomly positioned match). The one invariant to strategy is that each id/sub pair should used only once. So:
Replace one, and one only, occurrence of a :related value whose :id matches a value from ids with the corresponding value from substitutes, where the one occurrence is the first (or nth or rand-nth...) occurrence.
(def id-mapping (zipmap ids
(map :id substitutes)))
;; id-mapping -> {"345" "222", "234" "111"}
(clojure.walk/prewalk-replace id-mapping original)
Assuming that the collection is called results:
(require '[clojure.zip :as z])
(defn modify-related
[results id sub]
(loop [loc (z/down (z/seq-zip results))
done? false]
(if (= done? true)
(z/root loc)
(let [change? (->> loc z/node :_id (= id))]
(recur (z/next (cond change?
(z/edit loc (fn [_] identity sub))
:else loc))
change?)))))
(defn modify-results
[results id sub]
(loop [loc (z/down (z/seq-zip results))
done? false]
(if (= done? true)
(z/root loc)
(let [related (->> loc z/node :related)
change? (->> related (map :_id) set (#(contains? % id)))]
(recur (z/next (cond change?
(z/edit loc #(assoc % :related (modify-related related id sub)))
:else loc))
change?)))))
(defn sub-for-first
[results ids substitutes]
(let [subs (zipmap ids substitutes)]
(reduce-kv modify-results results subs)))

NOT - EXISTS / NOT - IN type query in Clojure

I have 2 data structures like the ones below
(ns test)
(def l
[{:name "Sean" :age 27}
{:name "Ross" :age 27}
{:name "Brian" :age 22}])
(def r
[{:owner "Sean" :item "Beer" }
{:owner "Sean" :item "Pizza"}
{:owner "Ross" :item "Computer"}
{:owner "Matt" :item "Bike"}])
I want to have get persons who dont own any item . (Brian in this case so [ {:name "Brian" :age 22}]
If this was SQL I would do left outer join or not exists but I not sure how to do this in clojure in more performant way.
While Chuck's solution is certainly the most sensible one, I find it interesting that it is possible to write a solution in terms of relational algebraic operators using clojure.set:
(require '[clojure.set :as set])
(set/difference (set l)
(set/project (set/join r l {:owner :name})
#{:name :age}))
; => #{{:name "Brian", :age 22}}
You basically want to do a filter on l, but negative. We could just not the condition, but the remove function already does this for us. So something like:
(let [owner-names (set (map :owner r))]
(remove #(owner-names (% :name)) l))
(I think it reads more nicely with the set, but if you want to avoid allocating the set, you can just do (remove (fn [person] (some #(= (% :owner) (person :name)) r)) l).)