Clojure - update hashmaps inside vectors [duplicate]

This question already has answers here:
How can I update a vector item in Clojure?
(3 answers)
Closed 4 years ago.
Let's suppose I have the following vector
[{:id "1" :type "type"}, {:id "2" :type "another-type"}]
And I want to write a function that updates a hashmap, depending on its id.
(defn update
[vector id value]
....)
And the result would be:
(update vector "1" "value")
[{:id "1" :type "type" :new-key ["value"]}, {:id "2" :type "another-type"}]
What's the most idiomatic way of performing this change?

(mapv
  (fn [m]
    (if (= "1" (:id m)) (assoc m :new-key ["value"]) m))
  vector)
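The mapv expression above hard-codes the id and the value. A minimal sketch of the same idea wrapped in a function follows; the names update-by-id and maps are hypothetical, chosen so that clojure.core/update and clojure.core/vector are not shadowed.

```clojure
;; Hypothetical helper: update-by-id and maps are illustrative names,
;; not taken from the question.
(defn update-by-id
  "Returns a vector in which every map whose :id equals id gains :new-key [value]."
  [maps id value]
  (mapv (fn [m]
          (if (= id (:id m))
            (assoc m :new-key [value])
            m))
        maps))

(update-by-id [{:id "1" :type "type"} {:id "2" :type "another-type"}] "1" "value")
;; => [{:id "1", :type "type", :new-key ["value"]} {:id "2", :type "another-type"}]
```

Maps whose :id does not match are returned unchanged, so an id that is absent from the vector simply yields an equal vector.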

Related

Clojure - Filter keys from a map [duplicate]

This question already has answers here:
How to remove multiple keys from a map?
(4 answers)
Closed 2 years ago.
Let's assume I have a hashmap and I want to filter out the entries whose keys are listed in a given vector. For example, assuming I have
1. map: {:k1 "v1" :k2 "v2" :k3 "v3"}
2. list: [:k2 :k4]
and I want to be left with k1, k3
My current solution is:
(defn rr
  "Remove the keys that are in s from the map m1."
  [m1 s]
  (loop [mm m1 ss s]
    (if (first ss)
      (recur (dissoc mm (first ss)) (rest ss))
      mm)))
Is there a prettier solution?
(apply dissoc {:k1 "v1" :k2 "v2" :k3 "v3"} [:k2 :k4])
Since dissoc accepts multiple keys to remove, you can hand it a whole collection of keys with apply; alternatively, reduce removes the keys one at a time with the same result:
(reduce dissoc {:k1 "v1" :k2 "v2" :k3 "v3"} [:k2 :k4])
so your function rr could be:
(def dissoc-keyset (partial apply dissoc))
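As a quick check, here is a small sketch showing that the apply and reduce forms agree, including on an empty key collection. The name m is only an illustrative binding for the map from the question.

```clojure
;; m is an illustrative name for the example map from the question.
(def m {:k1 "v1" :k2 "v2" :k3 "v3"})

(def dissoc-keyset (partial apply dissoc))

(dissoc-keyset m [:k2 :k4])
;; => {:k1 "v1", :k3 "v3"}

(reduce dissoc m [:k2 :k4])
;; => {:k1 "v1", :k3 "v3"}

;; With no keys to remove, both forms return the map unchanged.
(dissoc-keyset m [])
;; => {:k1 "v1", :k2 "v2", :k3 "v3"}
```

Keys that are not present in the map, such as :k4 here, are ignored.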

How to filter a map comparing it with another collection

I have a collection of maps like {:id 2489, :values ...} {:id 5647, :values ...} and so on, up to 10000 of them, and I want to filter it based on another collection that holds ids from the first one, like (2489, ...).
I am new to Clojure and I have tried:
(into {} (filter #(some (fn [u] (= u (:id %))) [2489 3456 4567 5689]) record-sets))
But it gives me only the last one, the map with id 5689, as output: {:id 5689, :values ...}. I want all of the matching maps. Can you suggest what I can do?
One problem is that you start out with a sequence of N maps, then you try to stuff them into a single map. Because every map has the same keys, each one overwrites the one before it, and only the last survives.
Instead, you need the output to be a sequence of M maps (M <= N).
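The overwriting can be seen in a small sketch. The sample data is made up for illustration, and maps is a hypothetical name.

```clojure
;; Hypothetical sample data standing in for the filtered records.
(def maps [{:id 1 :stuff :fred} {:id 3 :stuff :wilma}])

;; Pouring maps into a single map merges them key by key,
;; so later values for :id and :stuff replace earlier ones.
(into {} maps)
;; => {:id 3, :stuff :wilma}

;; Pouring them into a vector keeps every map.
(into [] maps)
;; => [{:id 1, :stuff :fred} {:id 3, :stuff :wilma}]
```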
Something like this is what you want:
(def data
  [{:id 1 :stuff :fred}
   {:id 2 :stuff :barney}
   {:id 3 :stuff :wilma}
   {:id 4 :stuff :betty}])
(let [ids-wanted  #{1 3}
      data-wanted (filterv
                    (fn [item]
                      (contains? ids-wanted (:id item)))
                    data)]
  (println data-wanted))
with result:
[{:id 1, :stuff :fred}
{:id 3, :stuff :wilma}]
Be sure to use the Clojure CheatSheet: http://jafingerhut.github.io/cheatsheet/clojuredocs/cheatsheet-tiptip-cdocs-summary.html
I like filterv over plain filter since it always gives a non-lazy result (i.e. a Clojure vector).
You are squashing all your maps into one. First thing, for sake of performance, is to change your list of IDs into a set, then simply filter.
(let [ids (into #{} [2489 3456 4567 5689])]
(filter (comp ids :id) record-sets))
This will give you the sequence of correct maps. If you want to convert this sequence of maps into a map keyed by ID, you can do this:
(let [ids (into #{} [2489 3456 4567 5689])]
  (->> record-sets
       (filter (comp ids :id))
       (into {} (map (juxt :id identity)))))
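For a self-contained run of both snippets, here is a sketch with made-up records. The contents of record-sets are hypothetical, since the question elides the real data.

```clojure
;; Hypothetical stand-in for the asker's record-sets.
(def record-sets
  [{:id 2489 :values [1 2]}
   {:id 1111 :values [3]}
   {:id 5689 :values [4 5]}])

;; A set works as a predicate: it returns the element if present, else nil.
(def ids #{2489 3456 4567 5689})

(def wanted (filter (comp ids :id) record-sets))
;; => ({:id 2489, :values [1 2]} {:id 5689, :values [4 5]})

;; Index the kept records by :id using a transducer.
(def wanted-by-id (into {} (map (juxt :id identity)) wanted))
;; => {2489 {:id 2489, :values [1 2]}, 5689 {:id 5689, :values [4 5]}}
```

Using the set as a function only works when the ids are never nil or false, which holds for numeric ids.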
Another way to do this is with the select-keys function in Clojure.
select-keys returns a map containing only the keys given to the function.
Since your data is a list of maps, we can convert it into a hash-map keyed by id using group-by and then call select-keys on it:
(def data
  [{:id 1 :stuff :fred}
   {:id 2 :stuff :barney}
   {:id 3 :stuff :wilma}
   {:id 4 :stuff :betty}])
(select-keys (group-by :id data) [1 4])
; => {1 [{:id 1, :stuff :fred}], 4 [{:id 4, :stuff :betty}]}
However, the result is now a map from id to a vector of matching records. To get the original structure back, we need to take all the values in the map and then flatten the vectors:
; get all the values in the map
(vals (select-keys (group-by :id data) [1 4]))
; => ([{:id 1, :stuff :fred}] [{:id 4, :stuff :betty}])
; flatten the nested vectors
(flatten (vals (select-keys (group-by :id data) [1 4])))
; => ({:id 1, :stuff :fred} {:id 4, :stuff :betty})
Extracting the values and flattening might seem a bit inefficient, but I think it's less complex than the nested loop that needs to be done in the filter-based methods.
You can use the threading macro to compose all the steps together:
(-> (group-by :id data)
    (select-keys [1 4])
    vals
    flatten)
Another option is to store the data as a map keyed by id from the beginning. That way select-keys won't require group-by, and the result won't require flattening.
Update all values in the map (update-values is not in clojure.core; it is assumed here to be a helper that applies a function to each value, as clojure.core/update-vals does in Clojure 1.11+):
(update-values (group-by :id data) first)
; => {1 {:id 1, :stuff :fred}, 2 {:id 2, :stuff :barney}, 3 {:id 3, :stuff :wilma}, 4 {:id 4, :stuff :betty}}
This would probably be the most efficient for this problem but this structure might not work for every case.
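A sketch of the id-keyed layout that relies only on core functions is shown here. The name data-by-id is hypothetical, and the index is built with into and juxt rather than the update-values helper.

```clojure
(def data
  [{:id 1 :stuff :fred}
   {:id 2 :stuff :barney}
   {:id 3 :stuff :wilma}
   {:id 4 :stuff :betty}])

;; Hypothetical name: build the id -> record index once.
;; If two records shared an :id, the later one would win.
(def data-by-id (into {} (map (juxt :id identity)) data))

(select-keys data-by-id [1 4])
;; => {1 {:id 1, :stuff :fred}, 4 {:id 4, :stuff :betty}}

;; No flatten needed: each value is already a single record.
(vals (select-keys data-by-id [1 4]))
```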

Clojure: Find missing records in a collection based on another collection

I have 2 vectors: employ and emp-income. I want to loop through emp-income based on employ to find all the missing records. In this case, the record with id = 2 is missing. I want to create the missing record in emp-income and set its income to the previous record's income value. What is the best way to do it in Clojure?
(def employ
  [{:id 1 :name "Aaron"}
   {:id 2 :name "Ben"}
   {:id 3 :name "Carry"}])
from:
(def emp-income
  [{:emp-id 1 :income 1000}
   {:emp-id 3 :income 2000}])
to:
(def emp-income
  [{:emp-id 1 :income 1000}
   {:emp-id 2 :income 1000}
   {:emp-id 3 :income 2000}])
You could use:
(let [emp-id->income (into {} (map (fn [rec] [(:emp-id rec) rec]) emp-income))]
  (reduce (fn [acc {:keys [id]}]
            (let [{:keys [income]} (or (get emp-id->income id) (peek acc))]
              (conj acc {:emp-id id :income income})))
          []
          employ))
Note this will create a record of {:emp-id id :income nil} if the first record is not found in emp-income. If emp-income contains duplicate :emp-id values, the last record encountered for that :emp-id is used.
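The same reduce can be packaged as a function and run against the sample data. This is a sketch; the name fill-missing-income is hypothetical.

```clojure
(def employ
  [{:id 1 :name "Aaron"}
   {:id 2 :name "Ben"}
   {:id 3 :name "Carry"}])

(def emp-income
  [{:emp-id 1 :income 1000}
   {:emp-id 3 :income 2000}])

;; Hypothetical wrapper around the reduce shown above.
(defn fill-missing-income
  [employ emp-income]
  (let [emp-id->income (into {} (map (juxt :emp-id identity)) emp-income)]
    (reduce (fn [acc {:keys [id]}]
              ;; Fall back to the previously emitted record when the id is missing.
              (let [{:keys [income]} (or (get emp-id->income id) (peek acc))]
                (conj acc {:emp-id id :income income})))
            []
            employ)))

(fill-missing-income employ emp-income)
;; => [{:emp-id 1, :income 1000} {:emp-id 2, :income 1000} {:emp-id 3, :income 2000}]
```

The output follows the order of employ, so "previous record" means the previous employee in that vector.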

Clojure: how to use compare with set/union [duplicate]

This question already has answers here:
Custom equality in Clojure distinct
(3 answers)
Closed 5 years ago.
For the sake of example, let's assume I have two sets:
(def set-a #{{:id 1 :name "ABC" :zip 78759} {:id 2 :name "DEF" :zip 78759}})
(def set-b #{{:id 1 :name "ABC" :zip 78753} {:id 3 :name "XYZ" :zip 78704}})
I would like to find the union of the sets, using only the :id and :name fields. However, without a custom comparator I get four elements in the set, because the :zip field is different.
(clojure.set/union set-a set-b)
#{{:id 3, :name "XYZ", :zip 78704} {:id 1, :name "ABC", :zip 78753}
{:id 1, :name "ABC", :zip 78759} {:id 2, :name "DEF", :zip 78759}}
What is the idiomatic way of finding the union of two sets using a custom comparator or compare?
You could use group-by to do this:
(map first (vals (group-by (juxt :id :name) (concat set-a set-b))))
Or threaded:
(->> (concat set-a set-b)
     (group-by (juxt :id :name))
     (vals)
     (map first))
This groups your elements by a combination of their values, i.e. (juxt :id :name). It then takes the values of the resulting map and maps first over them to get the first item in each group.
Or use code built specifically for this, like distinct-by (not part of clojure.core; libraries such as medley provide one).
Note these approaches apply to any collection, not just sets.
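A runnable sketch of the group-by approach on the sets from the question follows; merged is a hypothetical name. Because concat yields every element of set-a before any element of set-b, and group-by keeps elements in encounter order, the record from set-a wins when both sets contain the same :id and :name.

```clojure
(def set-a #{{:id 1 :name "ABC" :zip 78759} {:id 2 :name "DEF" :zip 78759}})
(def set-b #{{:id 1 :name "ABC" :zip 78753} {:id 3 :name "XYZ" :zip 78704}})

;; Hypothetical name for the deduplicated result.
(def merged
  (->> (concat set-a set-b)
       (group-by (juxt :id :name))
       vals
       (map first)))

(count merged)
;; => 3

(set (map (juxt :id :name) merged))
;; => #{[1 "ABC"] [2 "DEF"] [3 "XYZ"]}
```

The result is a sequence, not a set; wrap it in set if set semantics are needed downstream.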
If you don't mind throwing :zip away entirely, consider using clojure.set/project.
(clojure.set/union
  (clojure.set/project set-a [:id :name])
  (clojure.set/project set-b [:id :name]))
#{{:id 3, :name "XYZ"} {:id 2, :name "DEF"} {:id 1, :name "ABC"}}

Perform "get" on all HashMap elements of a LazySeq

I'm parsing some XML data from Stack Exchange using clojure.data.xml. For example, if I parse the Votes data it returns a LazySeq containing a HashMap for each row of data.
What I am trying to do is to get the values associated with only certain keys for each row, e.g., (get votes [:Id :CreationDate]). I've tried numerous things, most of them leading to casting errors.
The closest I could get to what I need is using (doall (map get votes [:Id :CreationDate])). However, the problem I am experiencing now is that I cannot seem to return more than just the first row (i.e. (1 2011-01-19T00:00:00.000))
Here is an MCVE that can be run on any Clojure REPL, or on the Codepad online IDE.
Ideally I would like to return some kind of collection or map which contains the values I need for each row; the end goal is to write them to something like a CSV file. For example, a collection like
(1 2011-01-19T00:00:00.000
2 2011-01-19T00:00:00.000
3 2011-01-19T00:00:00.000
4 2011-01-19T00:00:00.000)
(def votes '({:Id "1",
              :PostId "2",
              :VoteTypeId "2",
              :CreationDate "2011-01-19T00:00:00.000"}
             {:Id "2",
              :PostId "3",
              :VoteTypeId "2",
              :CreationDate "2011-01-19T00:00:00.000"}
             {:Id "3",
              :PostId "1",
              :VoteTypeId "2",
              :CreationDate "2011-01-19T00:00:00.000"}
             {:Id "4",
              :PostId "1",
              :VoteTypeId "2",
              :CreationDate "2011-01-19T00:00:00.000"}))
(println (doall (map get votes [:Id :CreationDate])))
Additional detail: If this is of any help/interest, the code I am using to get the above lazy seq is as follows:
(ns se-datadump.read-xml
  (:require
    [clojure.data.xml :as xml]))
(def xml-votes
"<votes><row Id=\"1\" PostId=\"2\" VoteTypeId=\"2\" CreationDate=\"2011-01-19T00:00:00.000\" /> <row Id=\"2\" PostId=\"3\" VoteTypeId=\"2\" CreationDate=\"2011-01-19T00:00:00.000\" /> <row Id=\"3\" PostId=\"1\" VoteTypeId=\"2\" CreationDate=\"2011-01-19T00:00:00.000\" /> <row Id=\"4\" PostId=\"1\" VoteTypeId=\"2\" CreationDate=\"2011-01-19T00:00:00.000\" /></votes>")
(defn se-xml->rows-seq
  "Returns LazySequence from a properly formatted XML string,
  which contains a HashMap for every <row> element with each of its attributes.
  This assumes the standard Stack Exchange XML format, where a parent element contains
  only a series of <row> child elements with no further hierarchy."
  [xml-str]
  (let [xml-records (xml/parse-str xml-str)]
    (map :attrs (-> xml-records :content))))
; this returns a sequence identical to the one in the MCVE:
(def votes (se-xml->rows-seq xml-votes))
You apparently need juxt:
(map (juxt :Id :CreationDate) votes)
;; => (["1" "2011-01-19T00:00:00.000"] ["2" "2011-01-19T00:00:00.000"] ["3" "2011-01-19T00:00:00.000"] ["4" "2011-01-19T00:00:00.000"])
If you need a map out of it:
(into {} (map (juxt :Id :CreationDate) votes))
;; => {"1" "2011-01-19T00:00:00.000", "2" "2011-01-19T00:00:00.000", "3" "2011-01-19T00:00:00.000", "4" "2011-01-19T00:00:00.000"}
First of all, let me explain what the piece of code you posted on CodePad is actually doing. I doubt that it does what you intend:
(println (doall (map get votes [:Id :CreationDate])))
The crucial part is: (map get votes [:Id :CreationDate])
This maps over two collections: the lazy sequence 'votes' and a vector. Whenever mapping over more than one collection, the returned lazy sequence will be as long as the shortest collection provided.
For instance one can map over a finite collection and an infinite sequence:
(map + (range) [1 2 3])
;; (1 3 5)
This explains why your result is only two items long:
(map get votes [:Id :CreationDate])
reduces to:
((get (nth votes 0) ([:Id :CreationDate] 0))
 (get (nth votes 1) ([:Id :CreationDate] 1)))
reduces to:
((get {:Id "1",
       :PostId "2",
       :VoteTypeId "2",
       :CreationDate "2011-01-19T00:00:00.000"} :Id)
 (get {:Id "2",
       :PostId "3",
       :VoteTypeId "2",
       :CreationDate "2011-01-19T00:00:00.000"} :CreationDate))
reduces finally to:
(1 2011-01-19T00:00:00.000)
This is just for the purpose of understanding. Whether the compiler performs exactly these steps is another question.
doall is not necessary here, since println already does this implicitly.
As already noted, in your case you'd better use juxt and map only over votes. If you really want the sample output, you additionally need to flatten the result:
(flatten (map (juxt :Id :CreationDate) votes))
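Since the stated end goal is something like a CSV file, one possible next step is to keep each row as a pair and join it into a line instead of flattening. This is a rough sketch: the names rows and csv-lines are hypothetical, the data is shortened to two rows, and no quoting or escaping of field values is attempted.

```clojure
(require '[clojure.string :as str])

;; Shortened version of the sample data from the question.
(def votes
  '({:Id "1" :PostId "2" :VoteTypeId "2" :CreationDate "2011-01-19T00:00:00.000"}
    {:Id "2" :PostId "3" :VoteTypeId "2" :CreationDate "2011-01-19T00:00:00.000"}))

;; One [id date] vector per row.
(def rows (map (juxt :Id :CreationDate) votes))

;; Hypothetical, naive CSV rendering: fields are joined with commas as-is.
(def csv-lines (map #(str/join "," %) rows))
;; => ("1,2011-01-19T00:00:00.000" "2,2011-01-19T00:00:00.000")
```

Keeping the pairs intact preserves the row boundaries, which flatten discards.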