Clojure: how to use compare with set/union [duplicate]


This question already has answers here:
Custom equality in Clojure distinct
(3 answers)
Closed 5 years ago.
For the sake of example, let's assume I have two sets:
(def set-a #{{:id 1 :name "ABC" :zip 78759} {:id 2 :name "DEF" :zip 78759}})
(def set-b #{{:id 1 :name "ABC" :zip 78753} {:id 3 :name "XYZ" :zip 78704}})
I would like to find the union of the two sets, comparing elements only by their :id and :name fields. Without a custom comparator, however, I get four elements in the result, because the :zip field differs.
(clojure.set/union set-a set-b)
#{{:id 3, :name "XYZ", :zip 78704} {:id 1, :name "ABC", :zip 78753}
{:id 1, :name "ABC", :zip 78759} {:id 2, :name "DEF", :zip 78759}}
What is the idiomatic way of finding the union of two sets using a custom comparator or compare?

You could use group-by to do this:
(map first (vals (group-by (juxt :id :name) (concat set-a set-b))))
Or threaded:
(->> (concat set-a set-b)
     (group-by (juxt :id :name))
     (vals)
     (map first))
This groups the elements by the vector of their :id and :name values that (juxt :id :name) produces, takes the groups with vals, and maps first over them to keep the first item in each group. Because concat puts the elements of set-a first, the set-a version wins when both sets contain the same :id and :name.
Or use a helper built specifically for this, such as a distinct-by function.
Note these approaches apply to any collection, not just sets.
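distinct-by is not part of clojure.core. A minimal sketch of such a helper follows; the name, argument order and "first one wins" behaviour are assumptions made for illustration, not an existing library API.

```clojure
;; Hypothetical helper: keeps the first item seen for each value of (key-fn item).
(defn distinct-by
  [key-fn coll]
  (first
   (reduce (fn [[kept seen] item]
             (let [k (key-fn item)]
               (if (contains? seen k)
                 [kept seen]
                 [(conj kept item) (conj seen k)])))
           [[] #{}]
           coll)))

(def set-a #{{:id 1 :name "ABC" :zip 78759} {:id 2 :name "DEF" :zip 78759}})
(def set-b #{{:id 1 :name "ABC" :zip 78753} {:id 3 :name "XYZ" :zip 78704}})

;; Three maps remain. set-a comes first in the concat, so its :zip is kept for id 1.
(def merged (set (distinct-by (juxt :id :name) (concat set-a set-b))))
```

Wrapping the result in set turns the vector back into a set, which the group-by version above does not do on its own.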

If you don't mind throwing :zip away entirely, consider using clojure.set/project.
(clojure.set/union
(clojure.set/project set-a [:id :name])
(clojure.set/project set-b [:id :name]))
#{{:id 3, :name "XYZ"} {:id 2, :name "DEF"} {:id 1, :name "ABC"}}
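Neither answer above uses an actual comparator. If one is wanted, a possible sketch is a sorted set built with sorted-set-by, which treats two elements as the same when the comparator returns 0. The comparator name by-id-and-name is made up for this example.

```clojure
(def set-a #{{:id 1 :name "ABC" :zip 78759} {:id 2 :name "DEF" :zip 78759}})
(def set-b #{{:id 1 :name "ABC" :zip 78753} {:id 3 :name "XYZ" :zip 78704}})

;; Compares only :id and :name; vectors of equal length compare element by element.
(defn by-id-and-name [a b]
  (compare ((juxt :id :name) a) ((juxt :id :name) b)))

;; An element that compares equal to one already present is not added again,
;; so the set-a version of id 1 is kept.
(def merged (into (sorted-set-by by-id-and-name) (concat set-a set-b)))
```

Note that the result is a sorted set that keeps using this comparator, so later lookups and additions also ignore :zip.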

Related

How to filter a map comparing it with another collection

I have a collection of maps like {:id 2489 ,values :.......} {:id 5647 ,values : .....} and so on up to 10000, and I want to filter it based on another collection that contains ids from the first one, like (2489 ,......)
I am new to Clojure and I have tried:
(into {} (filter #(some (fn [u] (= u (:id %))) [2489 3456 4567 5689]) record-sets))
But it gives me only the last one, the map with id 5689, as output: {:id 5689 ,:values ....}. I want all of the matching maps. Can you suggest what I can do?

One problem is that you start out with a sequence of N maps, then you try to stuff them all into a single map with into {}. Every map has the same keys, so each one overwrites the entries of the one before it and only the last survives.
Instead, you need to have the output be a sequence of M maps (M <= N).
Something like this is what you want:
(def data
  [{:id 1 :stuff :fred}
   {:id 2 :stuff :barney}
   {:id 3 :stuff :wilma}
   {:id 4 :stuff :betty}])
(let [ids-wanted #{1 3}
      data-wanted (filterv
                   (fn [item]
                     (contains? ids-wanted (:id item)))
                   data)]
  (println data-wanted))
with result:
[{:id 1, :stuff :fred}
{:id 3, :stuff :wilma}]
Be sure to use the Clojure CheatSheet: http://jafingerhut.github.io/cheatsheet/clojuredocs/cheatsheet-tiptip-cdocs-summary.html
I like filterv over plain filter since it always gives a non-lazy result (i.e. a Clojure vector).
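The overwriting described at the start of this answer can be reproduced directly. This is a small sketch on made-up data, with the names collapsed and kept chosen only for the example.

```clojure
(def data
  [{:id 1 :stuff :fred}
   {:id 2 :stuff :barney}
   {:id 3 :stuff :wilma}])

;; Each map is merged into the accumulator, so later :id and :stuff values
;; replace earlier ones and only the last map's entries remain.
(def collapsed (into {} data))
;; => {:id 3, :stuff :wilma}

;; Keeping a sequential collection preserves every matching map.
(def kept (filterv #(contains? #{1 3} (:id %)) data))
;; => [{:id 1, :stuff :fred} {:id 3, :stuff :wilma}]
```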

You are squashing all your maps into one. First thing, for sake of performance, is to change your list of IDs into a set, then simply filter.
(let [ids (into #{} [2489 3456 4567 5689])]
  (filter (comp ids :id) record-sets))
This will give you a sequence of the matching maps. If you want to convert this sequence into a map keyed by ID, you can do this:
(let [ids (into #{} [2489 3456 4567 5689])]
  (->> record-sets
       (filter (comp ids :id))
       (into {} (map (juxt :id identity)))))
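To see both snippets run end to end, here is a sketch with a small stand-in for record-sets. The sample records and their :values are invented, since the question elides the real data.

```clojure
;; Sample data standing in for record-sets (an assumption for this sketch).
(def record-sets
  [{:id 2489 :values [1 2]}
   {:id 1111 :values [3]}
   {:id 5689 :values [4 5]}])

(def ids #{2489 3456 4567 5689})

;; (comp ids :id) extracts :id, then looks it up in the set.
(def matching (filter (comp ids :id) record-sets))

;; Same filter, then indexed by :id via a transducer passed to into.
(def by-id
  (->> record-sets
       (filter (comp ids :id))
       (into {} (map (juxt :id identity)))))
```

Here matching holds the two maps with ids 2489 and 5689, and by-id maps each of those ids to its full record.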

Another way to do this is with the select-keys function.
select-keys returns a map containing only the entries for the keys you give it.
Since your data is a list of maps, we can first convert it into a map keyed by id using group-by and then call select-keys on that:
(def data
[{:id 1 :stuff :fred}
{:id 2 :stuff :barney}
{:id 3 :stuff :wilma}
{:id 4 :stuff :betty}])
(select-keys (group-by :id data) [1 4])
; => {1 [{:id 1, :stuff :fred}], 4 [{:id 4, :stuff :betty}]}
However, the result is now a map from each id to a vector of matching maps. To get the original structure back, we need to take all the values of the map and then flatten the vectors:
; get all the values in the map
(vals (select-keys (group-by :id data) [1 4]))
; => ([{:id 1, :stuff :fred}] [{:id 4, :stuff :betty}])
; flatten the nested vectors
(flatten (vals (select-keys (group-by :id data) [1 4])))
; => ({:id 1, :stuff :fred} {:id 4, :stuff :betty})
Extracting the values and flattening might seem a bit inefficient, but I think it's less complex than the nested loop that the filter-based methods need.
You can use the threading macro to compose all the steps together:
(-> (group-by :id data)
    (select-keys [1 4])
    vals
    flatten)
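Another thing you can do is store the data as a map keyed by id from the beginning. That way select-keys won't require group-by and the result won't require flattening.
The example below uses an update-values helper (see "Update all keys in a map"), which applies a function to every value of a map. It is not part of clojure.core; Clojure 1.11 and later ship update-vals, which takes its arguments in the same order.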
(update-values (group-by :id data) first)
; => {1 {:id 1, :stuff :fred}, 2 {:id 2, :stuff :barney}, 3 {:id 3, :stuff :wilma}, 4 {:id 4, :stuff :betty}}
This would probably be the most efficient for this problem but this structure might not work for every case.
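Since update-values is not defined anywhere in this answer, the following sketch supplies one possible definition so the indexing step can be run. The implementation via reduce-kv is an assumption about what that helper does, based only on how it is called above.

```clojure
;; Hypothetical definition of the helper: applies f to every value of m.
(defn update-values [m f]
  (reduce-kv (fn [acc k v] (assoc acc k (f v))) {} m))

(def data
  [{:id 1 :stuff :fred}
   {:id 2 :stuff :barney}
   {:id 3 :stuff :wilma}
   {:id 4 :stuff :betty}])

;; group-by gives id -> vector of maps; first unwraps each one-element vector.
(def data-by-id (update-values (group-by :id data) first))

;; With the data indexed by id, selecting needs neither group-by nor flatten.
(def picked (vals (select-keys data-by-id [1 4])))
```

This assumes ids are unique; if two maps share an id, first silently drops all but one of them.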

Clojure Macro using filter returns an object reference. Do not know how to interpret this reference

I am defining this macro
seminar.core=> (defmacro select
#_=> [vara _ coll _ wherearg _ orderarg]
#_=> `(filter ~wherearg))
#'seminar.core/select
And then defining a table
(def persons '({:id 1 :name "olle"} {:id 2 :name "anna"}
{:id 3 :name "isak"} {:id 4 :name "beatrice"}))
When I try to run my macro, so that I get the columns from the table where the id is greater than 2 (i.e {:id 3 :name "isak"} {:id 4 :name "beatrice"})
seminar.core=> (select [:id :name] from persons where [> :id 2] orderby :name)
I receive the following message and do not know quite how to interpret it
#object[clojure.core$filter$fn__4808 0x18e53c53 "clojure.core$filter$fn__4808#18e53c53"]
Update
I added a second argument to filter
seminar.core=> (defmacro select
#_=> [vara _ coll _ wherearg _ orderarg]
#_=> `(filter ~wherearg ~coll))
and receive IllegalArgumentException Key must be integer clojure.lang.APersistentVector.invoke (APersistentVector.java:292) as my return value now. I do not know how to interpret this error

The macroexpand-1 function shows the expanded form of a macro call, which may give you a clue:
(macroexpand-1 '(select [:id :name] from persons where [> :id 2] orderby :name))
;;=> (clojure.core/filter [> :id 2] persons)
The form [> :id 2] is a vector, not a predicate. filter calls it as a function on each element, and a vector invoked with a map as its argument throws "Key must be integer". You have to pass a proper function to filter, e.g. an anonymous function:
(select [:id :name] from persons where #(> (:id %) 2) orderby :name)
;;=> ({:id 3, :name "isak"} {:id 4, :name "beatrice"})
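If the goal is to keep the [> :id 2] syntax from the question, the macro itself can turn that vector into a predicate. The following is a sketch under that assumption, not the method proposed in this answer. It also guesses that the column vector should restrict the keys of each row and that orderby should sort by the given key.

```clojure
(def persons [{:id 1 :name "olle"} {:id 2 :name "anna"}
              {:id 3 :name "isak"} {:id 4 :name "beatrice"}])

;; The where clause [op key value] is destructured at macro-expansion time
;; and rewritten into (op (key row) value). The from/where/orderby symbols
;; are placeholders that the macro ignores.
(defmacro select
  [cols _ coll _ [op k v] _ order-key]
  `(->> ~coll
        (filter (fn [row#] (~op (~k row#) ~v)))
        (sort-by ~order-key)
        (map (fn [row#] (select-keys row# ~cols)))))

(def result
  (select [:id :name] from persons where [> :id 2] orderby :name))
;; => ({:id 4, :name "beatrice"} {:id 3, :name "isak"})
```

Calling macroexpand-1 on the select form shows the generated filter, sort-by and map pipeline.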

Idiomatic way to select a map in vector by a key

Suppose I have this vector of maps:
[{:title "Title1" :id 18347125}
{:title "Title2" :id 18347123}
{:title "Title3" :id 18341121}]
And I wish to select the map with :id 18347125, how would I do this?
I've tried
(for [map maps
:when (= (:id map) id)]
map)
This feels a bit ugly and returns a sequence of length one, and I want to return just the map.

IMHO, there are several ways to solve your problem, and which one is the most idiomatic is a matter of taste. This is my solution, where I simply translated "select the maps whose :id is 18347125" into Clojure.
user> (def xs [{:title "Title1" :id 18347125}
{:title "Title2" :id 18347123}
{:title "Title3" :id 18341121}])
#'user/xs
user> (filter (comp #{18347125} :id) xs)
({:title "Title1", :id 18347125})
The :id keyword is a function that looks itself up in the collection passed to it. The set #{18347125} is also a function: it returns its argument if the argument is a member of the set and nil otherwise, so here it tests whether a value equals 18347125. Using a Clojure set as a predicate function allows for a succinct idiom.
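The pieces of that idiom can be tried one at a time. This is a small sketch using one map from the question; the name m is just for the example.

```clojure
(def m {:title "Title1" :id 18347125})

(:id m)                      ; => 18347125, the keyword looks itself up in the map
(#{18347125} 18347125)       ; => 18347125, the set returns the member when present
(#{18347125} 42)             ; => nil when the value is not in the set
((comp #{18347125} :id) m)   ; => 18347125, truthy only for the wanted id
```

One caveat: a set used as a predicate cannot detect false or nil members, because the value it returns is itself falsey in those cases.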

I'm not sure if it's the simplest way to write it, but I think this is more clear about your intentions:
(->> maps
     (filter #(= (:id %) id))
     first)
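A variant that returns the first match directly, without building an intermediate sequence, can be written with some. The function name find-by-id is made up for this sketch.

```clojure
(def maps [{:title "Title1" :id 18347125}
           {:title "Title2" :id 18347123}
           {:title "Title3" :id 18341121}])

;; some returns the first truthy result of the function, so the when form
;; hands back the whole map on a match and nil otherwise.
(defn find-by-id [maps id]
  (some #(when (= (:id %) id) %) maps))

(find-by-id maps 18347125)
;; => {:title "Title1", :id 18347125}
```

Like the filter and first version, this returns nil when no map has the given id.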

This doesn't do what you asked for exactly, but might be useful nonetheless:
user=> (group-by :id [{:title "Title1" :id 18347125}
{:title "Title2" :id 18347123}
{:title "Title3" :id 18341121}])
{18347125 [{:title "Title1" :id 18347125}]
18347123 [{:title "Title2" :id 18347123}]
18341121 [{:title "Title3" :id 18341121}]}
Now you can simply look the map up by id. Read more about group-by on clojuredocs; it's a very useful function.
Note that it puts the maps inside vectors. This is because group-by is designed to handle grouping (i.e. multiple items with the same key):
user=> (group-by :id [{:title "Title1" :id 123}
{:title "Title2" :id 123}
{:title "Title3" :id 18341121}])
{123 [{:title "Title1" :id 123} {:title "Title2" :id 123}]
18341121 [{:title "Title3" :id 18341121}]}

If you need to query not just once, but multiple times for maps with specific IDs, I would suggest to make your data types match your use case, i.e. to change the vector into a map:
(def maps-by-id (zipmap (map :id maps) maps))
So now your IDs are the keys in this new map of maps:
user=> (maps-by-id 18347125)
{:title "Title1", :id 18347125}

How best to update this tree?

I've got the following tree:
{:start_date "2014-12-07"
:data {
:people [
{:id 1
:projects [{:id 1} {:id 2}]}
{:id 2
:projects [{:id 1} {:id 3}]}
]
}
}
I want to update the people and projects subtrees by adding a :name key-value pair.
Assuming I have these maps to perform the lookup:
(def people {1 "Susan" 2 "John"})
(def projects {1 "Foo" 2 "Bar" 3 "Qux"})
How could I update the original tree so that I end up with the following?
{:start_date "2014-12-07"
:data {
:people [
{:id 1
:name "Susan"
:projects [{:id 1 :name "Foo"} {:id 2 :name "Bar"}]}
{:id 2
:name "John"
:projects [{:id 1 :name "Foo"} {:id 3 :name "Qux"}]}
]
}
}
I've tried multiple combinations of assoc-in, update-in, get-in and map calls, but haven't been able to figure this out.

I have used letfn to break down the update into easier to understand units.
user> (def tree {:start_date "2014-12-07"
:data {:people [{:id 1
:projects [{:id 1} {:id 2}]}
{:id 2
:projects [{:id 1} {:id 3}]}]}})
#'user/tree
user> (def people {1 "Susan" 2 "John"})
#'user/people
user> (def projects {1 "Foo" 2 "Bar" 3 "Qux"})
#'user/projects
user>
(defn integrate-tree
  [tree people projects]
  ;; letfn is like let, but it creates fn, and allows forward references
  (letfn [(update-person [person]
            ;; -> is the "thread first" macro, the result of each expression
            ;; becomes the first arg to the next
            (-> person
                (assoc :name (people (:id person)))
                (update-in [:projects] update-projects)))
          (update-projects [all-projects]
            (mapv
             #(assoc % :name (projects (:id %)))
             all-projects))]
    (update-in tree [:data :people] #(mapv update-person %))))
#'user/integrate-tree
user> (pprint (integrate-tree tree people projects))
{:start_date "2014-12-07",
:data
{:people
[{:projects [{:name "Foo", :id 1} {:name "Bar", :id 2}],
:name "Susan",
:id 1}
{:projects [{:name "Foo", :id 1} {:name "Qux", :id 3}],
:name "John",
:id 2}]}}
nil

Not sure if entirely the best approach:
(defn update-names
  [tree people projects]
  (reduce
   (fn [t [id name]]
     (let [person-idx (ffirst (filter #(= (:id (second %)) id)
                                      (map-indexed vector (:people (:data t)))))
           temp (assoc-in t [:data :people person-idx :name] name)]
       (reduce
        (fn [t [id name]]
          (let [project-idx (ffirst (filter #(= (:id (second %)) id)
                                            (map-indexed vector (get-in t [:data :people person-idx :projects]))))]
            (if project-idx
              (assoc-in t [:data :people person-idx :projects project-idx :name] name)
              t)))
        temp
        projects)))
   tree
   people))
Just call it with your parameters:
(clojure.pprint/pprint (update-names tree people projects))
{:start_date "2014-12-07",
:data
{:people
[{:projects [{:name "Foo", :id 1} {:name "Bar", :id 2}],
:name "Susan",
:id 1}
{:projects [{:name "Foo", :id 1} {:name "Qux", :id 3}],
:name "John",
:id 2}]}}
This uses nested reduces:
Reduce over the people to update the corresponding names.
For each person, reduce over the projects to update the corresponding names.
The noisesmith solution looks better, since it doesn't need to find the person index or project index at each step.
Naturally you tried assoc-in or update-in, but the problem lies in your tree structure: the key path to update John's name is [:data :people 1 :name], so your assoc-in code would look like:
(assoc-in tree [:data :people 1 :name] "John")
But you need to find John's index in the people vector before you can update it, and the same thing happens with the projects inside.

Add items from collection 1 to collection 2, if collection 2 doesn't contain item from collection 1

I've got two maps:
(def people {:1 "John" :2 "Paul" :3 "Ringo" :4 "George"})
(def band
{:data
{:members
{:1 {:id 1 :name "John"}
:2 {:id 2 :name "Paul"}}}})
I want to loop over people and add any members that don't exist in [:data :members] to band, resulting in:
(def band
{:data
{:members
{:1 {:id 1 :name "John"}
:2 {:id 2 :name "Paul"}
:3 {:id 3 :name "Ringo"}
:4 {:id 4 :name "George"}}}})
Here's what I've tried:
(for [[id name] people]
(when-not
(contains? (get-in band [:data :members]) id)
(assoc-in band [:data :members id] {:id id :name name})))
Which yields:
({:data
{:members
{:4 {:id :4, :name "George"},
:1 {:name "John", :id 1},
:2 {:name "Paul", :id 2}}}}
nil
nil
{:data
{:members
{:1 {:name "John", :id 1},
:2 {:name "Paul", :id 2},
:3 {:id :3, :name "Ringo"}}}})
I'm not sure why I'm getting back what looks to be a list of each mutation of band. What am I doing wrong here? How can I add the missing members of people to band [:data :members]?

To be pedantic, you aren't getting back any mutation of band. In fact, one of the most important features of Clojure is that the standard types are immutable, and the primary collection operations return a modified copy without changing the original.
Also, for in Clojure is not a loop, it is a list comprehension. This is why it always returns a sequence of each step. So instead of altering an input one step at a time, you made a new variation on the input for each step, each derived from the immutable original.
The standard construct for making a series of updated copies of an input based on a sequence of values is reduce, which passes a new version of the accumulator and each element of the list to your function.
Finally, you are misunderstanding the role of :keyword syntax. Prefixing an item with a : is not needed in order to construct map keys: just about any Clojure value is a valid key for a map, and keywords are just a convenient idiom.
user=> (def band
{:data
{:members
{1 {:id 1 :name "John"}
2 {:id 2 :name "Paul"}}}})
#'user/band
user=> (def people {1 "John" 2 "Paul" 3 "Ringo" 4 "George"})
#'user/people
user=> (pprint
(reduce (fn [band [id name :as person]]
(if-not (contains? (get-in band [:data :members]) id)
(assoc-in band [:data :members id] {:id id :name name})
band))
band
people))
{:data
{:members
{3 {:id 3, :name "Ringo"},
4 {:id 4, :name "George"},
1 {:name "John", :id 1},
2 {:name "Paul", :id 2}}}}
nil
You may notice the body of the fn passed to reduce is essentially the same as the body of your for comprehension. The difference is that instead of when-not which returns nil on the alternate case, I use if-not, which allows us to propagate the accumulator (here called band, same as the input) regardless of whether any new version of it is made.
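The difference between for and reduce described above shows up even on a tiny example. The names base, via-for and via-reduce below are made up for this sketch.

```clojure
(def base {:count 0})

;; for yields one independent result per element, each built from base.
(def via-for (for [n [1 2 3]] (update base :count + n)))
;; => ({:count 1} {:count 2} {:count 3})

;; reduce feeds each result into the next step, so the changes accumulate.
(def via-reduce (reduce (fn [acc n] (update acc :count + n)) base [1 2 3]))
;; => {:count 6}
```

In both cases base itself is left untouched, which is the immutability point made at the top of this answer.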
