Counting grouped data in clojure - clojure

I have data in clojure which I have grouped by application into following format:
(["name1" [{:application "name1", :date "date1", :description "desc1"}
{:application "name1", :date "date2", :description "desc2"}]]
["name2" [{:application "name2", :date "date1", :description "desc1"}
{:application "name2", :date "date2", :description "desc2"}]]
... and so on)
I need to count the number of events for each application (i.e. the number of maps in each vector) and produce a list of maps in the format:
({:application "name1", :count 2} {:application "name2", :count 2} ... etc)
I have the following code to produce a list of application names and a list of the counts for each application name but I am struggling with how to get them back into the format above.
(let[
application-list (map first group-by-app)
grouped-data (map second group-by-app)
count-list (map count grouped-data)]
Any help would be greatly appreciated.
Thanks in advance.
David

for offers a really succinct, but expressive way to do something with the elements of sequences to build up a new sequence. I'd certainly recommend it here.
(def group-by-app '(["name1" [{:application "name1", :date "date1", :description "desc1"}
{:application "name1", :date "date2", :description "desc2"}]]
["name2" [{:application "name2", :date "date1", :description "desc1"}
{:application "name2", :date "date2", :description "desc2"}]]))
(for [[application events] group-by-app]
{:application application, :count (count events)})
; => ({:application "name1", :count 2} {:application "name2", :count 2})
For completeness, however, it's worth noting that map can map a function over multiple sequences at the same time. So, you could recombine the intermediate data you produced with a function of the application and the count.
(let [application-list (map first group-by-app)
grouped-data (map second group-by-app)
count-list (map count grouped-data)]
(map
(fn [app event-count]
{:application app :count event-count})
application-list count-list))

group-by produces a map, which can be treated as a sequence of key-value pairs. You can simply map over that sequence.
user=> (def foo #{{:application "name1", :date "date1", :description "desc1"}
#_=> {:application "name1", :date "date2", :description "desc2"}
#_=> {:application "name2", :date "date1", :description "desc1"}
#_=> {:application "name2", :date "date2", :description "desc2"}})
#'user/foo
user=> (map #(let [[key tuples] %] [key (count tuples)]) (group-by :application foo))
(["name1" 2] ["name2" 2])
You can turn that back into a map if that's what you need.
user=> (into {} (map #(let [[key tuples] %] [key (count tuples)]) (group-by :application foo)))
{"name1" 2, "name2" 2}
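If the goal is the list-of-maps format from the question rather than name/count pairs, the same idea can be sketched with destructuring in the mapping function. The events vector below is made-up sample data mirroring the question, and count-by-application is a hypothetical helper name, not something from the original code.

```clojure
;; Sample data mirroring the question (an assumption, not the asker's real data).
(def events
  [{:application "name1", :date "date1", :description "desc1"}
   {:application "name1", :date "date2", :description "desc2"}
   {:application "name2", :date "date1", :description "desc1"}
   {:application "name2", :date "date2", :description "desc2"}])

;; Hypothetical helper: group the events, then turn each [key group] entry
;; into a map holding the application name and the size of its group.
(defn count-by-application [evs]
  (map (fn [[app group]]
         {:application app, :count (count group)})
       (group-by :application evs)))

(count-by-application events)
;; => ({:application "name1", :count 2} {:application "name2", :count 2})
```

Map ordering is not guaranteed in general, so it is safer not to rely on the order of the resulting sequence.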

(reduce
(fn [acc [_ maps]]
(let [c (count maps)]
(into acc
(map #(assoc % :count c) maps))))
'()
grouped)
grouped is the grouped data you provided.

Related

Clojure - transform flat map to nested with predecided structure

I'm a newbie to Clojure and I'm trying to convert messages that come in one particular format into another.
That is, I have to convert something like:
{
:image-url ["https://image.png"],
:topic "Some title",
:id "88ebaf91-a01d-4683-9aa7-629bb3ecea01",
:short-description "Some Description",
:mobile-deeplink "https://deeplink.com/link",
:partner-name "partner"}
Into something like
{
:title "Some title",
:id "88ebaf91-a01d-4683-9aa7-629bb3ecea01",
:content {
:url ["https://image.png"],
:description "Some Description",
:deeplink "https://deeplink.com/link",
:partner "partner"}}
So in effect, there is a combination of renaming keys and nesting the flat map
What I have done so far was something on the lines of:
(let [message-map {
:image-url :url
:topic :title
:partner-name :partner
:short-description :description
:mobile-deeplink :deeplink}]
(defn- map-to-body
[message]
(-> message
(clojure.set/rename-keys message-map)
;;some sort of (assoc-in) <- this is where i need help in
)))
Combining assoc-in, a path conversion table, and reduce could be more self-describing and maintainable. You could choose to reduce over either the conversion table or the input message, whichever makes more sense for the data you have.
(defn transform [m]
(let [pp '([:image-url [:content :url]]
[:topic [:title]]
[:id [:id]]
[:short-description [:content :description]]
;; etc.
)]
(reduce
(fn [o [mk ok]]
(assoc-in o ok (get m mk)))
{}
pp)))
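To make the table-driven idea concrete, here is a minimal sketch that passes the conversion table in as an argument. The table entries are inferred from the desired output in the question; path-table and transform-with are hypothetical names used only for illustration.

```clojure
;; Conversion table inferred from the question's desired output:
;; each entry pairs a source key with its destination path.
(def path-table
  [[:image-url         [:content :url]]
   [:topic             [:title]]
   [:id                [:id]]
   [:short-description [:content :description]]
   [:mobile-deeplink   [:content :deeplink]]
   [:partner-name      [:content :partner]]])

;; Hypothetical helper: copy each source value to its destination path.
(defn transform-with [table m]
  (reduce (fn [out [src-key dest-path]]
            (assoc-in out dest-path (get m src-key)))
          {}
          table))

(transform-with path-table
                {:image-url ["https://image.png"]
                 :topic "Some title"
                 :id "88ebaf91-a01d-4683-9aa7-629bb3ecea01"
                 :short-description "Some Description"
                 :mobile-deeplink "https://deeplink.com/link"
                 :partner-name "partner"})
;; => {:content {:url ["https://image.png"]
;;               :description "Some Description"
;;               :deeplink "https://deeplink.com/link"
;;               :partner "partner"}
;;     :title "Some title"
;;     :id "88ebaf91-a01d-4683-9aa7-629bb3ecea01"}
```

Note that a key missing from the input ends up in the output with a nil value, which may or may not be what you want.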
You could chain assoc-in calls here, but I think you are better off
using select-keys. select-keys extracts only the keys you need
from a map into a new map. So you can select :id and :title for
the outer map and assoc the rest under :content.
E.g.
(require 'clojure.set)
(defn transform
[message]
(let [message-map {:image-url :url
:topic :title
:partner-name :partner
:short-description :description
:mobile-deeplink :deeplink}
renamed (clojure.set/rename-keys message message-map)]
(assoc ; XXX
(select-keys renamed [:title :id])
:content (select-keys renamed [:url :description :deeplink :partner]))))
(def src {:image-url ["https://image.png"],
:topic "Some title",
:id "88ebaf91-a01d-4683-9aa7-629bb3ecea01",
:short-description "Some Description",
:mobile-deeplink "https://deeplink.com/link",
:partner-name "partner"})
(def tgt {:title "Some title",
:id "88ebaf91-a01d-4683-9aa7-629bb3ecea01",
:content {
:url ["https://image.png"],
:description "Some Description",
:deeplink "https://deeplink.com/link",
:partner "partner"}})
(assert (= (transform src) tgt))

Get two different keywords from map

I just started learning Clojure and I'd like to get two keywords from a vector of maps.
Let's say there's a vector
(def a [{:id 1, :description "bla", :amount 12, :type "A", :other "x"} {:id 2, :description "blabla", :amount 10, :type "B", :other "y"}])
And I'd like to get a new vector
[{"bla" 12} {"blabla" 10}]
How can I do that??
Thanks!
Assuming you want the :description and :amount separately, not maps that map one to the other, you can use juxt to retrieve both at the same time:
(mapv (juxt :description :amount) a)
;; => [["bla" 12] ["blabla" 10]]
If you actually did mean to make maps, you can use for instance apply and hash-map to do that:
(mapv #(apply hash-map ((juxt :description :amount) %)) a)
;; => [{"bla" 12} {"blabla" 10}]
You can use mapv to map over the source vector. Within the transform function you can destructure each map to extract the keys you want and construct the result:
(mapv (fn [{:keys [description amount]}] {description amount}) a)
(mapv #(hash-map (:description %) (:amount %)) a)

Recursive map query using specter

Is there a simple way in specter to collect all the structures satisfying a predicate?
(./pull '[com.rpl/specter "1.0.0"])
(use 'com.rpl.specter)
(def data {:items [{:name "Washing machine"
:subparts [{:name "Ballast" :weight 1}
{:name "Hull" :weight 2}]}]})
(reduce + (select [(walker :weight) :weight] data))
;=> 3
(select [(walker :name) :name] data)
;=> ["Washing machine"]
How can we get all the values for :name, including ["Ballast" "Hull"]?
Here's one way, using recursive-path and stay-then-continue to do the real work. (If you omit the final :name from the path argument to select, you'll get the full “item / part maps” rather than just the :name strings.)
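;; the snippets below refer to the library through the alias `specter`
(require '[com.rpl.specter :as specter])
(def data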
{:items [{:name "Washing machine"
:subparts [{:name "Ballast" :weight 1}
{:name "Hull" :weight 2}]}]})
(specter/select
[(specter/recursive-path [] p
[(specter/walker :name) (specter/stay-then-continue [:subparts p])])
:name]
data)
;= ["Washing machine" "Ballast" "Hull"]
Update: In answer to the comment below, here's a version of the above that descends into arbitrary branches of the tree, as opposed to only descending into the :subparts branch of any given node. It excludes :name, which is the key whose values we want to extract and should not itself be treated as a branching-off point:
(specter/select
[(specter/recursive-path [] p
[(specter/walker :name)
(specter/stay-then-continue
[(specter/filterer #(not= :name (key %)))
(specter/walker :name)
p])])
:name]
;; adding the key `:subparts2` with the value [{:name "Foo"}]
;; to the "Washing machine" map to exercise the new descent strategy
(assoc-in data [:items 0 :subparts2] [{:name "Foo"}]))
;= ["Washing machine" "Ballast" "Hull" "Foo"]
The selected? selector can be used to collect structures for which another selector matches something within the structure.
From the examples at https://github.com/nathanmarz/specter/wiki/List-of-Navigators#selected:
=> (select [ALL (selected? [(must :a) even?])] [{:a 0} {:a 1} {:a 2} {:a 3}])
[{:a 0} {:a 2}]
I think you could walk the map recursively using the clojure.walk namespace. On each step, you can check the current value against a predicate and, if it matches, push it into an atom to collect the results.
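The clojure.walk suggestion could be sketched roughly as follows. This is only an illustration without specter: collect-names is a hypothetical helper, and the data is the sample from the question. prewalk visits each node before its children, so parents are collected before their subparts.

```clojure
(require '[clojure.walk :as walk])

(def data
  {:items [{:name "Washing machine"
            :subparts [{:name "Ballast" :weight 1}
                       {:name "Hull" :weight 2}]}]})

;; Hypothetical helper: walk the whole structure and record the :name of
;; every map that has one. The atom is only a local accumulator.
(defn collect-names [tree]
  (let [found (atom [])]
    (walk/prewalk
     (fn [node]
       (when (and (map? node) (contains? node :name))
         (swap! found conj (:name node)))
       node) ; return the node unchanged so the walk continues into it
     tree)
    @found))

(collect-names data)
;; => ["Washing machine" "Ballast" "Hull"]
```

The predicate here is hard-coded to "is a map containing :name"; passing it in as an argument would make the helper reusable for other queries.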

How best to update this tree?

I've got the following tree:
{:start_date "2014-12-07"
:data {
:people [
{:id 1
:projects [{:id 1} {:id 2}]}
{:id 2
:projects [{:id 1} {:id 3}]}
]
}
}
I want to update the people and projects subtrees by adding a :name key-value pair.
Assuming I have these maps to perform the lookup:
(def people {1 "Susan" 2 "John"})
(def projects {1 "Foo" 2 "Bar" 3 "Qux"})
How could I update the original tree so that I end up with the following?
{:start_date "2014-12-07"
:data {
:people [
{:id 1
:name "Susan"
:projects [{:id 1 :name "Foo"} {:id 2 :name "Bar"}]}
{:id 2
:name "John"
:projects [{:id 1 :name "Foo"} {:id 3 :name "Qux"}]}
]
}
}
I've tried multiple combinations of assoc-in, update-in, get-in and map calls, but haven't been able to figure this out.
I have used letfn to break down the update into easier to understand units.
user> (def tree {:start_date "2014-12-07"
:data {:people [{:id 1
:projects [{:id 1} {:id 2}]}
{:id 2
:projects [{:id 1} {:id 3}]}]}})
#'user/tree
user> (def people {1 "Susan" 2 "John"})
#'user/people
user> (def projects {1 "Foo" 2 "Bar" 3 "Qux"})
#'user/projects
user>
(defn integrate-tree
[tree people projects]
;; letfn is like let, but it creates fn, and allows forward references
(letfn [(update-person [person]
;; -> is the "thread first" macro, the result of each expression
;; becomes the first arg to the next
(-> person
(assoc :name (people (:id person)))
(update-in [:projects] update-projects)))
(update-projects [all-projects]
(mapv
#(assoc % :name (projects (:id %)))
all-projects))]
(update-in tree [:data :people] #(mapv update-person %))))
#'user/integrate-tree
user> (pprint (integrate-tree tree people projects))
{:start_date "2014-12-07",
:data
{:people
[{:projects [{:name "Foo", :id 1} {:name "Bar", :id 2}],
:name "Susan",
:id 1}
{:projects [{:name "Foo", :id 1} {:name "Qux", :id 3}],
:name "John",
:id 2}]}}
nil
Not sure if this is the best approach:
(defn update-names
[tree people projects]
(reduce
(fn [t [id name]]
(let [person-idx (ffirst (filter #(= (:id (second %)) id)
(map-indexed vector (:people (:data t)))))
temp (assoc-in t [:data :people person-idx :name] name)]
(reduce
(fn [t [id name]]
(let [project-idx (ffirst (filter #(= (:id (second %)) id)
(map-indexed vector (get-in t [:data :people person-idx :projects]))))]
(if project-idx
(assoc-in t [:data :people person-idx :projects project-idx :name] name)
t)))
temp
projects)))
tree
people))
Just call it with your parameters:
(clojure.pprint/pprint (update-names tree people projects))
{:start_date "2014-12-07",
:data
{:people
[{:projects [{:name "Foo", :id 1} {:name "Bar", :id 2}],
:name "Susan",
:id 1}
{:projects [{:name "Foo", :id 1} {:name "Qux", :id 3}],
:name "John",
:id 2}]}}
This uses nested reduces:
Reduce over the people to update the corresponding names.
For each person, reduce over the projects to update the corresponding names.
The noisesmith solution looks better, since it doesn't need to find the person index or project index at each step.
Naturally you tried assoc-in or update-in, but the problem lies in your tree structure: the key path to update John's name is [:data :people 1 :name], so your assoc-in code would look like:
(assoc-in tree [:data :people 1 :name] "John")
But you need to find John's index in the people vector before you can update it, and the same thing happens with the projects inside.
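The index lookup described above might look like the following sketch. index-of-id is a hypothetical helper name, and the tree is the sample from the question.

```clojure
(def tree
  {:start_date "2014-12-07"
   :data {:people [{:id 1 :projects [{:id 1} {:id 2}]}
                   {:id 2 :projects [{:id 1} {:id 3}]}]}})

;; Hypothetical helper: position of the first map in coll whose :id equals
;; id, or nil when there is no such map.
(defn index-of-id [coll id]
  (first (keep-indexed (fn [i m] (when (= id (:id m)) i)) coll)))

;; Find John's position (person id 2), then build the path from it.
(let [idx (index-of-id (get-in tree [:data :people]) 2)]
  (assoc-in tree [:data :people idx :name] "John"))
;; the person at index 1 now also carries :name "John"
```

Each lookup scans the vector, which is why mapping over the people once, as in the letfn answer, tends to be simpler.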

Retrieving maps from a sequence based on contents - Clojure

I have a data set in the format
({:application "name1", :date "date1", :description "desc1"}
{:application "name1", :date "date2", :description "desc2"}
{:application "name2", :date "date1", :description "desc1"}
{:application "name2", :date "date2", :description "desc2"}
etc ...)
My goal is to create a sequence containing the maps for an individual application, as below.
{:application "name1", :date "date1", :description "desc1"}
{:application "name1", :date "date2", :description "desc2"}
I've tried a number of different ways to do this but can't seem to get any to work. My current thinking for how to do it is:
(let[
a (for [x data] (if (= (get x :application) "name1") x))
])
I know there is probably a simple solution to this, but I'm new to Clojure and I just can't figure it out.
Thanks in advance.
David
If you only want name1:
(filter (comp #{"name1"} :application) data)
To group by application name:
(group-by :application data)
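Both calls can be tried on a small sample. The data vector below is made up to mirror the question. A set such as #{"name1"} acts as a predicate that returns its argument when it is a member and nil otherwise, so composing it with :application keeps only the matching maps.

```clojure
;; Sample data mirroring the question (an assumption for illustration).
(def data
  [{:application "name1", :date "date1", :description "desc1"}
   {:application "name1", :date "date2", :description "desc2"}
   {:application "name2", :date "date1", :description "desc1"}
   {:application "name2", :date "date2", :description "desc2"}])

;; Keep only the maps whose :application is "name1".
(filter (comp #{"name1"} :application) data)
;; => ({:application "name1", :date "date1", :description "desc1"}
;;     {:application "name1", :date "date2", :description "desc2"})

;; Group everything at once, then look up a single application by name.
(get (group-by :application data) "name2")
;; => [{:application "name2", :date "date1", :description "desc1"}
;;     {:application "name2", :date "date2", :description "desc2"}]
```

The filter version is enough when only one application is needed; group-by is more useful when each application will be processed in turn.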