Set difference using a projected function

I've got two databases that I'm attempting to keep in sync using a bit of Clojure glue code.
I'd like to make something like a clojure.set/difference that operates on values projected by a function.
Here's some sample data:
(diff #{{:name "bob smith" :favourite-colour "blue"}
        {:name "geraldine smith" :age 29}}
      #{{:first-name "bob" :last-name "smith" :favourite-colour "blue"}}
      :name
      (fn [x] (str (:first-name x) " " (:last-name x))))
;; => {:name "geraldine smith" :age 29}
The best I've got is:
(require '[clojure.set :as set])

(defn diff
  "Return members of l who do not exist in r, based on applying function
  fl to left and fr to right"
  [l r fl fr]
  (let [l-project (into #{} (map fl l))
        r-project (into #{} (map fr r))
        d (set/difference l-project r-project)
        i (group-by fl l)]
    (map (comp first i) d)))
But I feel that this is a bit unwieldy, and I can't imagine it performs very well. I'm throwing away information that I'd like to keep, and then looking it up again.
I did have a go using metadata, to keep the original values around during the set difference, but I can't seem to put metadata on primitive types, so that didn't work...
I'm not sure why, but I have this tiny voice inside my head telling me that this kind of operation on the side is what monads are for, and that I should really get around to finding out what a monad is and how to use it. Any guidance as to whether the tiny voice is right is very welcome!

(defn diff
  [l r fl fr]
  (let [r-project (into #{} (map fr r))]
    (set (remove #(contains? r-project (fl %)) l))))
This no longer exposes the difference operation directly (it is now implicit with the remove / contains combination), but it is succinct and should give the result you are looking for.
Example usage and output:
user> (diff #{{:name "bob smith" :favourite-colour "blue"}
              {:name "geraldine smith" :age 29}}
            #{{:first-name "bob" :last-name "smith" :favourite-colour "blue"}}
            :name
            (fn [x] (str (:first-name x) " " (:last-name x))))
#{{:age 29, :name "geraldine smith"}}
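If the explicit lookup from projected key back to the original record is worth keeping, one possible sketch indexes l by its projected key and then drops every key that r produces. The name diff-indexed is hypothetical, and the sketch assumes that projected keys are unique within l (duplicates would overwrite each other in the index).

```clojure
;; Hypothetical helper, not from the original answer.
;; Assumes each element of l projects to a distinct key under fl.
(defn diff-indexed
  [l r fl fr]
  (let [index (into {} (map (juxt fl identity)) l)] ; projected key -> original element
    (set (vals (apply dissoc index (map fr r))))))  ; drop every key that r projects to

(diff-indexed #{{:name "bob smith" :favourite-colour "blue"}
                {:name "geraldine smith" :age 29}}
              #{{:first-name "bob" :last-name "smith" :favourite-colour "blue"}}
              :name
              (fn [x] (str (:first-name x) " " (:last-name x))))
;; => #{{:name "geraldine smith", :age 29}}
```

Whether this beats the remove / contains? version probably depends on the data; it mainly keeps the "difference on keys" step visible.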

Related

Idiomatic way to wrap object into collection (if it's not a collection already) in Clojure?

I have (for instance) a mix of data structures such as {:name "Peter" :children "Mark"} and {:name "Mark" :children ["Julia" "John"]}, i.e. the :children value is either a single string or a collection of strings. Other functions in my code expect that the value of :children is always a collection of strings, so I need to adapt the data for them.
Of course I can use something like:
(defn data-adapter [m]
  (let [children (:children m)]
    (assoc m :children
           (if (coll? children)
             children
             [children]))))
But is there a more idiomatic/laconic way?

I think you will have to take no for an answer.
(if (coll? x) x [x]) is about as terse and expressive as it gets. It’s what people usually use for this problem (sometimes with sequential? instead of coll?).
cond-> enthusiasts like me sometimes try to use it in place of a simple conditional, but here it is no improvement:
(cond-> x (not (coll? x)) vector)
In the context of your code, however, you can do a little better. A lookup and association is best expressed with update:
(defn data-adapter [m]
  (update m :children #(if (coll? %) % [%])))
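The choice between coll? and sequential? mentioned above starts to matter if a single child could itself be a map or a set. A small sketch of the difference; ensure-seq is a hypothetical name used only for illustration:

```clojure
;; Hypothetical helper: wrap anything that is not already sequential.
(defn ensure-seq [x]
  (if (sequential? x) x [x]))

(ensure-seq "Mark")            ;; => ["Mark"]
(ensure-seq ["Julia" "John"])  ;; => ["Julia" "John"]
(ensure-seq {:name "Julia"})   ;; => [{:name "Julia"}]

;; With coll? the map would pass through unwrapped, because maps are collections:
(coll? {:name "Julia"})        ;; => true
(sequential? {:name "Julia"})  ;; => false
```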

The only advice I would give is to extract that logic into its own function, to keep your actual business logic clean.
(defn data-adapter [m]
  (let [children (:children m)]
    (assoc m :children (ensure-coll children))))
or, more concisely, with update:
(defn data-adapter [m]
  (update m :children ensure-coll))
where ensure-coll could be something like this:
(defn iffun [check & {:keys [t f] :or {t identity f identity}}]
  #((if (check %) t f) %))
(def ensure-coll (iffun coll? :f list))
(or any other implementation you like)
user> (data-adapter {:children 1})
;;=> {:children (1)}
user> (data-adapter {:children [1]})
;;=> {:children [1]}

Perhaps not idiomatic, but laconic:
(flatten [x])
https://clojuredocs.org/clojure.core/flatten
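A hedged note on what this does at the REPL, since flatten goes further than wrapping: it returns a lazy seq rather than a vector, it flattens any nested sequential values, and it leaves sets and maps untouched.

```clojure
(flatten ["Mark"])              ;; => ("Mark")
(flatten [["Julia" "John"]])    ;; => ("Julia" "John")
(flatten [[["Julia"] "John"]])  ;; => ("Julia" "John"), inner nesting is lost
(flatten [#{"Julia"}])          ;; => (#{"Julia"}), a set is not sequential
```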

Clojure/FP: apply functions to each argument to an operator

Let's say I have several vectors
(def coll-a [{:name "foo"} ...])
(def coll-b [{:name "foo"} ...])
(def coll-c [{:name "foo"} ...])
and that I would like to see if the names of the first elements are equal.
I could
(= (:name (first coll-a)) (:name (first coll-b)) (:name (first coll-c)))
but this quickly gets tiring and overly verbose as more functions are composed. (Maybe I want to compare the last letter of the first element's name?)
To directly express the essence of the computation it seems intuitive to
(apply = (map (comp :name first) [coll-a coll-b coll-c]))
but it leaves me wondering if there's a higher level abstraction for this sort of thing.
I often find myself comparing / otherwise operating on things which are to be computed via a single composition applied to multiple elements, but the map syntax looks a little off to me.
If I were to home brew some sort of operator, I would want syntax like
(-op- (= :name first) coll-a coll-b coll-c)
because the majority of the computation is expressed in (= :name first).
I'd like an abstraction to apply to both the operator & the functions applied to each argument. That is, it should be just as easy to sum as compare.
(def coll-a [{:name "foo" :age 43}])
(def coll-b [{:name "foo" :age 35}])
(def coll-c [{:name "foo" :age 28}])
(-op- (+ :age first) coll-a coll-b coll-c)
; => 106
(-op- (= :name first) coll-a coll-b coll-c)
; => true
Something like
(defmacro -op-
  [[op & to-comp] & args]
  (let [args' (map (fn [a] `((comp ~@to-comp) ~a)) args)]
    `(~op ~@args')))
Is there an idiomatic way to do this in clojure, some standard library function I could be using?
Is there a name for this type of expression?

For your addition example, I often use transduce:
(transduce
  (map (comp :age first))
  +
  [coll-a coll-b coll-c])
Your equality use case is trickier, but you could create a custom reducing function to maintain a similar pattern. Here's one such function:
(defn all? [f]
  (let [prev (volatile! ::no-value)]
    (fn
      ([] true)
      ([result] result)
      ([result item]
       (if (or (= ::no-value @prev)
               (f @prev item))
         (do
           (vreset! prev item)
           true)
         (reduced false))))))
Then use it as
(transduce
  (map (comp :name first))
  (all? =)
  [coll-a coll-b coll-c])
The semantics are fairly similar to your -op- macro, while being both more idiomatic Clojure and more extensible. Other Clojure developers will immediately understand your usage of transduce. They may have to investigate the custom reducing function, but such functions are common enough in Clojure that readers can see how it fits an existing pattern. Also, it should be fairly transparent how to create new reducing functions for use cases where a simple map-and-apply wouldn't work. The transducing function can also be composed with other transformations such as filter and mapcat, for cases when you have a more complex initial data structure.

You may be looking for the every? function, but I would enhance clarity by breaking it down and naming the sub-elements:
(let [colls [coll-a coll-b coll-c]
      first-name (fn [coll] (:name (first coll)))
      names (map first-name colls)
      tgt-name (first-name coll-a)
      all-names-equal (every? #(= tgt-name %) names)]
  all-names-equal) ;; => true
I would avoid the DSL, as there is no need and it makes it much harder for others to read (since they don't know the DSL). Keep it simple:
(let [colls [coll-a coll-b coll-c]
      vals (map #(:age (first %)) colls)
      result (apply + vals)]
  result) ;; => 106

I don't think you need a macro, you just need to parameterize your op function and compare functions. To me, you are pretty close with your (apply = (map (comp :name first) [coll-a coll-b coll-c])) version.
Here is one way you could make it more generic:
(defn compare-in [op to-compare & args]
  (apply op (map #(get-in % to-compare) args)))
(compare-in + [0 :age] coll-a coll-b coll-c)
(compare-in = [0 :name] coll-a coll-b coll-c)
;; compares the last element of "foo"
(compare-in = [0 :name 2] coll-a coll-b coll-c)
I actually did not know you can use get on strings, but in the third case you can see we compare the last element of each foo.
This approach doesn't allow the to-compare arguments to be arbitrary functions, but it seems like your use case mainly deals with digging out what elements you want to compare, and then applying an arbitrary function to those values.
I'm not sure this approach is better than the transducer version supplied above (certainly not as efficient), but I think it provides a simpler alternative when that efficiency is not needed.
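If arbitrary functions are needed rather than get-in paths, the same idea can take a function instead of a path. This is only a minimal sketch: op-on is a hypothetical name, and the collections are the sample data from the question.

```clojure
(def coll-a [{:name "foo" :age 43}])
(def coll-b [{:name "foo" :age 35}])
(def coll-c [{:name "foo" :age 28}])

;; Hypothetical helper: apply f to every argument, then combine the results with op.
(defn op-on [op f & args]
  (apply op (map f args)))

(op-on + (comp :age first) coll-a coll-b coll-c)        ;; => 106
(op-on = (comp :name first) coll-a coll-b coll-c)       ;; => true
(op-on = (comp last :name first) coll-a coll-b coll-c)  ;; => true, compares the last letter
```

This is the (apply = (map ...)) form from the question with the collections moved into varargs, so it stays a plain function rather than a macro.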

I would split this process into three stages:
1) Transform the items in each collection into the data you want to operate on: (map :name coll)
2) Operate on the transformed items, returning a collection of results: (map = transf-coll-a transf-coll-b transf-coll-c)
3) Finally, select which result in the resulting collection to return: (first calculated-coll)
When playing with collections, I try to put more than one item into each collection:
(def coll-a [{:name "foo" :age 43} {:name "bar" :age 45}])
(def coll-b [{:name "foo" :age 35} {:name "bar" :age 37}])
(def coll-c [{:name "foo" :age 28} {:name "bra" :age 30}])
For example, matching items by second char in :name and returning result for items in second place:
(let [colls [coll-a coll-b coll-c]
      transf-fn (comp #(nth % 1) :name)
      op =
      fetch second]
  (fetch (apply map op (map #(map transf-fn %) colls))))
;; => false
In the transducer world you can use the sequence function, which also works on multiple collections:
(let [colls [coll-a coll-b coll-c]
      transf-fn (comp (map :name) (map #(nth % 1)))
      op =
      fetch second]
  (fetch (apply sequence (map op) (map #(sequence transf-fn %) colls))))
Calculate sum of ages (for all items at the same level):
(let [colls [coll-a coll-b coll-c]
      transf-fn (comp (map :age))
      op +
      fetch identity]
  (fetch (apply sequence (map op) (map #(sequence transf-fn %) colls))))
;; => (106 112)

Clojure Atom Doesn't Update When Wrapped in Defined Function

Not sure what is going on here, but I have this code, where the map function successfully executes in my repl without being wrapped in a defined function:
(def dogs '({:name "scout" :age 5} {:name "rux" :age 3} {:name "fenley" :age 2}))
(def ages (atom {:above-four '() :below-four '()}))
(map
  #(if (> (get-in % [:age]) 4)
     (swap! ages update-in [:above-four] merge %)
     (swap! ages update-in [:below-four] merge %)) dogs)
@ages
=> {:above-four ({:name "scout", :age 5}), :below-four ({:name "fenley", :age 2} {:name "rux", :age 3})}
Yet, when I define the map function as such:
(def ages (atom {:above-four '() :below-four '()}))
(def dogs '({:name "scout" :age 5} {:name "rux" :age 3} {:name "fenley" :age 2}))
(defn test-dogs []
  (map
    #(if (> (get-in % [:age]) 4)
       (swap! ages update-in [:above-four] merge %)
       (swap! ages update-in [:below-four] merge %)) dogs)
  @ages)
I get the following result:
=> {:above-four (), :below-four ()}
I'm very confused, because this function taken straight from the Clojure docs works just fine:
(def m1 (atom {:a "A" :b "B"}))
(defn update-m1 []
  (swap! m1 assoc :a "Aaay")
  @m1)
=> {:a "Aaay", :b "B"}

Because test-dogs uses map, it returns a lazy sequence. The elements of lazy sequences aren't realized until they're needed.
The problem with your set up is you're trying to use map to run a side effect (the call to swap!; an impure action), and never actually use the result of map. Because you never request results from map, the mapping function containing swap! never runs.
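The effect can be reproduced in isolation. In this sketch, counter, lazy-bump and eager-bump are hypothetical names. At the REPL the top-level map appeared to work only because printing its result realizes the sequence.

```clojure
(def counter (atom 0))

;; The lazy seq returned by map is discarded without being realized,
;; so swap! never runs.
(defn lazy-bump []
  (map (fn [_] (swap! counter inc)) [1 2 3])
  @counter)

;; run! is eager and intended for side effects.
(defn eager-bump []
  (run! (fn [_] (swap! counter inc)) [1 2 3])
  @counter)

(lazy-bump)  ;; => 0
(eager-bump) ;; => 3
```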
You can force the side effects to run by using mapv (which returns a non-lazy vector) or doseq (which is meant for carrying out side effects):
(doseq [dog dogs]
  (let [k (if (> (:age dog) 4)
            :above-four
            :below-four)]
    (swap! ages update k merge dog)))
I cleaned up the code a bit. The -in versions you were using were unnecessary, as was the call to get-in. I also got rid of the redundant calls to swap!.
Note though that at least in your example, use of atoms is entirely unnecessary. Even if you have a more complicated use case, make sure their use is justified. Mutable variables just aren't as common in languages like Clojure.
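As one possible atom-free version of the example, group-by can build the same partition directly. The grouping keys are the ones from the question. The values come back as vectors in input order rather than as lists.

```clojure
(def dogs '({:name "scout" :age 5} {:name "rux" :age 3} {:name "fenley" :age 2}))

;; Partition the dogs by age without any mutable state.
(group-by #(if (> (:age %) 4) :above-four :below-four) dogs)
;; => {:above-four [{:name "scout", :age 5}],
;;     :below-four [{:name "rux", :age 3} {:name "fenley", :age 2}]}
```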

clojure set of maps - basic filtering

Clojure beginner here..
If I have a set of maps, such as
(def kids #{{:name "Noah" :age 5}
            {:name "George" :age 3}
            {:name "Reagan" :age 1.5}})
I know I can get names like this
(map :name kids)
1) How do I select a specific map? For example
I want to get back the map where name="Reagan".
{:name "Reagan" :age 1.5}
Can this be done using a filter?
2) How about returning the name where age = 3?

Yes, you can do it with filter:
(filter #(= (:name %) "Reagan") kids)
(filter #(= (:age %) 3) kids)

There's clojure.set/select:
(clojure.set/select set-of-maps #(-> % :age (= 3)))
And similarly with name and "Reagan". The return value in this case will be a set.
You could also use filter without any special preparations, since filter calls seq on its collection argument (edit: as already described by ffriend while I was typing this):
(filter #(-> % :age (= 3)) set-of-maps)
Here the return value will be a lazy seq.
If you know there will only be one item satisfying your predicate in the set, some will be more efficient (as it will not process any additional elements after finding the match):
(some #(if (-> % :age (= 3)) %) set-of-maps)
The return value here will be the matching element.
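Applied to the kids set from the question, the two requests could look like this (a sketch; it assumes at most one kid matches each predicate when some is used):

```clojure
(def kids #{{:name "Noah" :age 5}
            {:name "George" :age 3}
            {:name "Reagan" :age 1.5}})

;; 1) the map whose name is "Reagan"
(some #(when (= (:name %) "Reagan") %) kids)
;; => {:name "Reagan", :age 1.5}

;; 2) just the name where age is 3
(:name (some #(when (= (:age %) 3) %) kids))
;; => "George"

;; the same with filter, which returns every match as a lazy seq
(map :name (filter #(= (:age %) 3) kids))
;; => ("George")
```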

NOT - EXISTS / NOT - IN type query in Clojure

I have 2 data structures like the ones below
(ns test)
(def l
  [{:name "Sean" :age 27}
   {:name "Ross" :age 27}
   {:name "Brian" :age 22}])
(def r
  [{:owner "Sean" :item "Beer"}
   {:owner "Sean" :item "Pizza"}
   {:owner "Ross" :item "Computer"}
   {:owner "Matt" :item "Bike"}])
I want to get the persons who don't own any item (Brian in this case, so [{:name "Brian" :age 22}]).
If this were SQL I would do a left outer join or NOT EXISTS, but I'm not sure how to do this in Clojure in a more performant way.

While Chuck's solution is certainly the most sensible one, I find it interesting that it is possible to write a solution in terms of relational algebraic operators using clojure.set:
(require '[clojure.set :as set])
(set/difference (set l)
                (set/project (set/join r l {:owner :name})
                             #{:name :age}))
; => #{{:name "Brian", :age 22}}

You basically want to do a filter on l, but negated. We could just negate the condition, but the remove function already does this for us. So something like:
(let [owner-names (set (map :owner r))]
  (remove #(owner-names (% :name)) l))
(I think it reads more nicely with the set, but if you want to avoid allocating the set, you can just do (remove (fn [person] (some #(= (% :owner) (person :name)) r)) l).)
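Put together with the sample data, a self-contained sketch of the set-based variant might look like this. The function name without-items is hypothetical, and it returns a vector because the question asks for one.

```clojure
(def l
  [{:name "Sean" :age 27}
   {:name "Ross" :age 27}
   {:name "Brian" :age 22}])

(def r
  [{:owner "Sean" :item "Beer"}
   {:owner "Sean" :item "Pizza"}
   {:owner "Ross" :item "Computer"}
   {:owner "Matt" :item "Bike"}])

;; Hypothetical helper: people whose :name never appears as an :owner.
(defn without-items [people items]
  (let [owner-names (into #{} (map :owner) items)]         ; one pass over items
    (into [] (remove (comp owner-names :name)) people)))   ; set lookup per person

(without-items l r)
;; => [{:name "Brian", :age 22}]
```

Building the set once keeps each membership check close to constant time, instead of rescanning r for every person.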
