Let's make frequency map:
(reduce #(update-in %1 [%2] (fnil inc 0)) {} ["a" "b" "a" "c" "c" "a"])
My concern is expression inside lambda #(...) - is it the canonical way to do it? Can I do it better/shorter?
EDIT: Another way I found:
(reduce #(assoc %1 %2 (inc %1 %2 0)) {} ["a" "b" "a" "c" "c" "a"])
Seems like very similar, what are pros/cons? Performance?
Since Clojure 1.2, there is a frequencies function in clojure.core:
user=> (doc frequencies)
-------------------------
clojure.core/frequencies
([coll])
Returns a map from distinct items in coll to the number of times
they appear.
Example:
user=> (frequencies ["a" "b" "a" "c" "c" "a"])
{"a" 3, "b" 1, "c" 2}
It happens to use transients and ternary get; see (source frequencies) for the code, which is as idiomatic as it gets while being highly performance-aware.
There is no need to use update-in. My way would be:
(defn frequencies [coll]
(reduce (fn [m e]
(assoc m e (inc (m e 0))))
{} coll))
Update: I assumed you knew frequencies was in core also and this was just an exercise.
I did a guest lecture a while ago in which I explained how you can get to this solution step by step. Won't be much new for you, since you were already close to the core solution, but maybe it is of value to someone else reading this question. The slides are in Dutch. If you change .html to .org it's easier to get the source code:
http://michielborkent.nl/gastcollege-han-20-06-2013/gastcollege.html
http://michielborkent.nl/gastcollege-han-20-06-2013/gastcollege.org
Another approach using only 'assoc' and recursion:
(defn my-frequencies-helper [freqs a-seq]
(if (empty? a-seq)
freqs
(let [fst (first a-seq)
new_set (if (contains? freqs fst)
(assoc freqs fst (inc (get freqs fst)))
(assoc freqs fst 1))]
(my-frequencies-helper new_set (rest a-seq)))))
(defn my-frequencies [a-seq]
(my-frequencies-helper {} a-seq))
(my-frequencies [1 1 2 2 :D :D :D])
=> {1 2, 2 2, :D 3}
(my-frequencies [:a "moi" :a "moi" "moi" :a 1])
=> {:a 3, "moi" 3, 1 1}
Related
I am fairly new to Clojure and would help help with some code. I have a function which takes a vector and i would like to loop through the vector and get the value at an index 'i' and the value of 'i' itself. 'i' is the value which is incremented in the loop.
I have checked 'for' at the clojure docs at for and wrote the following code.
(for [i some-vector]
(print (get-intersec i (.length some-vector) loop-count)))
The loop-count variable is supposed to be the loop count.
I have also checked loop but it does not seem like a feasible solution. Can someone help me with a clojure function i can use or help me write a macro or function that can do that.
Thank you.
Ps: To solve my problem, i use my own counter but would like a better solution.
First, keep in mind that for is for list comprehension, that is, creating new sequences. For looping through a sequence for some side effect, like printing, you probably want to use doseq.
To include a numeric count with each element as you loop through, you can use map-indexed:
(def xs [:a :b :c :d])
(doseq [[n elem] (map-indexed #(vector %1 %2) xs)]
(println n "->" elem))
Output:
0 -> :a
1 -> :b
2 -> :c
3 -> :d
If you find yourself doing this a lot, like I did, you can create a macro:
(defmacro doseq-indexed [[[item idx] coll] & forms]
`(doseq [[~idx ~item] (map-indexed #(vector %1 %2) ~coll)]
~#forms))
And use it like this:
> (doseq-indexed [[n elem] xs] (println n "->" elem))
0 -> :a
1 -> :b
2 -> :c
3 -> :d
Don't forget dotimes for simple stuff like this:
(let [data [:a :b :c :d]]
(dotimes [i (count data)]
(println i " -> " (data i))
; or (nth data i)
; or (get data i)
))
with result
0 -> :a
1 -> :b
2 -> :c
3 -> :d
Using loop/recur would look like this:
(let [data [:a :b :c :d]]
(loop [i 0
items data]
(let [curr (first items)]
(when curr
(println i "->" curr)
(recur (inc i) (rest items))))))
Update:
If you need this a lot, I already wrote a function that will add the index value to the beginning of each entry in a sequence:
(ns tst.demo.core
(:use tupelo.test)
(:require [tupelo.core :as t]) )
(dotest
(let [data [:a :b :c :d]]
(t/spy-pretty :indexed-data
(t/indexed data))))
with result
:indexed-data =>
([0 :a]
[1 :b]
[2 :c]
[3 :d])
The general signature is:
(indexed & colls)
Given one or more collections, returns a sequence of indexed tuples
from the collections like:
(indexed xs ys zs) -> [ [0 x0 y0 z0]
[1 x1 y1 z1]
[2 x2 y2 z2]
... ]
If your not set on using for, you could use map-indexed e.g.
(map-indexed (fn [i v]
(get-intersect v (.length some-vector) i))
some-vector))
I don't know what get-intersect is and assume .length is java interop? Anyway, map-indexed expects a function of 2 arguments, the 1st is the index and the second is the value.
Getting data from the database as a list of maps (LazySeq) leaves me in need of transforming it into a map of maps.
I tried to 'assoc' and 'merge', but that didn't bring the desired result because of the nesting.
This is the form of my data:
(def data (list {:structure 1 :cat "A" :item "item1" :val 0.1}
{:structure 1 :cat "A" :item "item2" :val 0.2}
{:structure 1 :cat "B" :item "item3" :val 0.4}
{:structure 2 :cat "A" :item "item1" :val 0.3}
{:structure 2 :cat "B" :item "item3" :val 0.5}))
I would like to get it in the form
=> {1 {"A" {"item1" 0.1}
"item2" 0.2}}
{"B" {"item3" 0.4}}
2 {"A" {"item1" 0.3}}
{"B" {"item3" 0.5}}}
I tried
(->> data
(map #(assoc {} (:structure %) {(:cat %) {(:item %) (:val %)}}))
(apply merge-with into))
This gives
{1 {"A" {"item2" 0.2}, "B" {"item3" 0.4}},
2 {"A" {"item1" 0.3}, "B" {"item3" 0.5}}}
By merging I lose some entries, but I can't think of any other way. Is there a simple way? I was even about to try to use specter.
Any thoughts would be appreciated.
If I'm dealing with nested maps, first stop is usually to think about update-in or assoc-in - these take a sequence of the nested keys. For a problem like this where the data is very regular, it's straightforward.
(assoc-in {} [1 "A" "item1"] 0.1)
;; =>
{1 {"A" {"item1" 0.1}}}
To consume a sequence into something else, reduce is the idiomatic choice. The reducing function is right on the edge of the complexity level I'd consider an anonymous fn for, so I'll pull it out instead for clarity.
(defn- add-val [acc line]
(assoc-in acc [(:structure line) (:cat line) (:item line)] (:val line)))
(reduce add-val {} data)
;; =>
{1 {"A" {"item1" 0.1, "item2" 0.2}, "B" {"item3" 0.4}},
2 {"A" {"item1" 0.3}, "B" {"item3" 0.5}}}
Which I think was the effect you were looking for.
Roads less travelled:
As your sequence is coming from a database, I wouldn't worry about using a transient collection to speed the aggregation up. Also, now I think about it, dealing with nested transient maps is a pain anyway.
update-in would be handy if you wanted to add up any values with the same key, for example, but the implication of your question is that structure/cat/item tuples are unique and so you just need the grouping.
juxt could be used to generate the key structure - i.e.
((juxt :structure :cat :item) (first data))
[1 "A" "item1"]
but it's not clear to me that there's any way to use this to make the add-val fn more readable.
You may continue to use your existing code. Only the final merge has to change:
(defn deep-merge [& xs]
(if (every? map? xs)
(apply merge-with deep-merge xs)
(apply merge xs)))
(->> data
(map #(assoc {} (:structure %) {(:cat %) {(:item %) (:val %)}}))
(apply deep-merge))
;; =>
{1
{"A" {"item1" 0.1, "item2" 0.2},
"B" {"item3" 0.4}},
2
{"A" {"item1" 0.3},
"B" {"item3" 0.5}}}
Explanation: your original (apply merge-with into) only merge one level down. deep-merge from above will recurse into all nested maps to do the merge.
Addendum: #pete23 - one use of juxt I can think of is to make the function reusable. For example, we can extract arbitrary fields with juxt, then convert them to nested maps (with yet another function ->nested) and finally do a deep-merge:
(->> data
(map (juxt :structure :cat :item :val))
(map ->nested)
(apply deep-merge))
where ->nested can be implemented like:
(defn ->nested [[k & [v & r :as t]]]
{k (if (seq r) (->nested t) v)})
(->nested [1 "A" "item1" 0.1])
;; => {1 {"A" {"item1" 0.1}}}
One sample application (sum val by category):
(let [ks [:cat :val]]
(->> data
(map (apply juxt ks))
(map ->nested)
(apply (partial deep-merge-with +))))
;; => {"A" 0.6000000000000001, "B" 0.9}
Note deep-merge-with is left as an exercise for our readers :)
(defn map-values [f m]
(into {} (map (fn [[k v]] [k (f v)])) m))
(defn- transform-structures [ss]
(map-values (fn [cs]
(into {} (map (juxt :item :val) cs))) (group-by :cat ss)))
(defn transform [data]
(map-values transform-structures (group-by :structure data)))
then
(transform data)
I've got this list of fields (that's Facebook's graph API fields list).
["a" "b" ["c" ["t"] "d"] "e" ["f"] "g"]
I want to generate a map out of it. The convention is following, if after a key vector follows, then its an inner object for the key. Example vector could be represented as a map as:
{"a" "value"
"b" {"c" {"t" "value"} "d" "value"}
"e" {"f" "value"}
"g" "value"}
So I have this solution so far
(defn traverse
[data]
(mapcat (fn [[left right]]
(if (vector? right)
(let [traversed (traverse right)]
(mapv (partial into [left]) traversed))
[[right]]))
(partition 2 1 (into [nil] data))))
(defn facebook-fields->map
[fields default-value]
(->> fields
(traverse)
(reduce #(assoc-in %1 %2 nil) {})
(clojure.walk/postwalk #(or % default-value))))
(let [data ["a" "b" ["c" ["t"] "d"] "e" ["f"] "g"]]
(facebook-fields->map data "value"))
#=> {"a" "value", "b" {"c" {"t" "value"}, "d" "value"}, "e" {"f" "value"}, "g" "value"}
But it is fat and difficult to follow. I am wondering if there is a more elegant solution.
Here's another way to do it using postwalk for the whole traversal, rather than using it only for default-value replacement:
(defn facebook-fields->map
[fields default-value]
(clojure.walk/postwalk
(fn [v] (if (coll? v)
(->> (partition-all 2 1 v)
(remove (comp coll? first))
(map (fn [[l r]] [l (if (coll? r) r default-value)]))
(into {}))
v))
fields))
(facebook-fields->map ["a" "b" ["c" ["t"] "d"] "e" ["f"] "g"] "value")
=> {"a" "value",
"b" {"c" {"t" "value"}, "d" "value"},
"e" {"f" "value"},
"g" "value"}
Trying to read heavily nested code makes my head hurt. It is worse when the answer is something of a "force-fit" with postwalk, which does things in a sort of "inside out" manner. Also, using partition-all is a bit of a waste, since we need to discard any pairs with two non-vectors.
To me, the most natural solution is a simple top-down recursion. The only problem is that we don't know in advance if we need to remove one or two items from the head of the input sequence. Thus, we can't use a simple for loop or map.
So, just write it as a straightforward recursion, and use an if to determine whether we consume 1 or 2 items from the head of the list.
If the 2nd item is a value, we consume one item and add in
:dummy-value to make a map entry.
If the 2nd item is a vector, we recurse and use that
as the value in the map entry.
The code:
(ns tst.demo.core
(:require [clojure.walk :as walk] ))
(def data ["a" "b" ["c" ["t"] "d"] "e" ["f"] "g"])
(defn parse [data]
(loop [result {}
data data]
(if (empty? data)
(walk/keywordize-keys result)
(let [a (first data)
b (second data)]
(if (sequential? b)
(recur
(into result {a (parse b)})
(drop 2 data))
(recur
(into result {a :dummy-value})
(drop 1 data)))))))
with result:
(parse data) =>
{:a :dummy-value,
:b {:c {:t :dummy-value}, :d :dummy-value},
:e {:f :dummy-value},
:g :dummy-value}
I added keywordize-keys at then end just to make the result a little more "Clojurey".
Since you're asking for a cleaner solution as opposed to a solution, and because I thought it was a neat little problem, here's another one.
(defn facebook-fields->map [coll]
(into {}
(keep (fn [[x y]]
(when-not (vector? x)
(if (vector? y)
[x (facebook-fields->map y)]
[x "value"]))))
(partition-all 2 1 coll)))
I'm trying to write a function with recur that cut the sequence as soon as it encounters a repetition ([1 2 3 1 4] should return [1 2 3]), this is my function:
(defn cut-at-repetition [a-seq]
(loop[[head & tail] a-seq, coll '()]
(if (empty? head)
coll
(if (contains? coll head)
coll
(recur (rest tail) (conj coll head))))))
The first problem is with the contains? that throws an exception, I tried replacing it with some but with no success. The second problem is in the recur part which will also throw an exception
You've made several mistakes:
You've used contains? on a sequence. It only works on associative
collections. Use some instead.
You've tested the first element of the sequence (head) for empty?.
Test the whole sequence.
Use a vector to accumulate the answer. conj adds elements to the
front of a list, reversing the answer.
Correcting these, we get
(defn cut-at-repetition [a-seq]
(loop [[head & tail :as all] a-seq, coll []]
(if (empty? all)
coll
(if (some #(= head %) coll)
coll
(recur tail (conj coll head))))))
(cut-at-repetition [1 2 3 1 4])
=> [1 2 3]
The above works, but it's slow, since it scans the whole sequence for every absent element. So better use a set.
Let's call the function take-distinct, since it is similar to take-while. If we follow that precedent and make it lazy, we can do it thus:
(defn take-distinct [coll]
(letfn [(td [seen unseen]
(lazy-seq
(when-let [[x & xs] (seq unseen)]
(when-not (contains? seen x)
(cons x (td (conj seen x) xs))))))]
(td #{} coll)))
We get the expected results for finite sequences:
(map (juxt identity take-distinct) [[] (range 5) [2 3 2]]
=> ([[] nil] [(0 1 2 3 4) (0 1 2 3 4)] [[2 3 2] (2 3)])
And we can take as much as we need from an endless result:
(take 10 (take-distinct (range)))
=> (0 1 2 3 4 5 6 7 8 9)
I would call your eager version take-distinctv, on the map -> mapv precedent. And I'd do it this way:
(defn take-distinctv [coll]
(loop [seen-vec [], seen-set #{}, unseen coll]
(if-let [[x & xs] (seq unseen)]
(if (contains? seen-set x)
seen-vec
(recur (conj seen-vec x) (conj seen-set x) xs))
seen-vec)))
Notice that we carry the seen elements twice:
as a vector, to return as the solution; and
as a set, to test for membership of.
Two of the three mistakes were commented on by #cfrick.
There is a tradeoff between saving a line or two and making the logic as simple & explicit as possible. To make it as obvious as possible, I would do it something like this:
(defn cut-at-repetition
[values]
(loop [remaining-values values
result []]
(if (empty? remaining-values)
result
(let [found-values (into #{} result)
new-value (first remaining-values)]
(if (contains? found-values new-value)
result
(recur
(rest remaining-values)
(conj result new-value)))))))
(cut-at-repetition [1 2 3 1 4]) => [1 2 3]
Also, be sure to bookmark The Clojure Cheatsheet and always keep a browser tab open to it.
I'd like to hear feedback on this utility function which I wrote for myself (uses filter with stateful pred instead of a loop):
(defn my-distinct
"Returns distinct values from a seq, as defined by id-getter."
[id-getter coll]
(let [seen-ids (volatile! #{})
seen? (fn [id] (if-not (contains? #seen-ids id)
(vswap! seen-ids conj id)))]
(filter (comp seen? id-getter) coll)))
(my-distinct identity "abracadabra")
; (\a \b \r \c \d)
(->> (for [i (range 50)] {:id (mod (* i i) 21) :value i})
(my-distinct :id)
pprint)
; ({:id 0, :value 0}
; {:id 1, :value 1}
; {:id 4, :value 2}
; {:id 9, :value 3}
; {:id 16, :value 4}
; {:id 15, :value 6}
; {:id 7, :value 7}
; {:id 18, :value 9})
Docs of filter says "pred must be free of side-effects" but I'm not sure if it is ok in this case. Is filter guaranteed to iterate over the sequence in order and not for example take skips forward?
Is there a way to mimic a this variable in something like (def foo {:two 2 :three (inc (:two this))})? Even better would be something like (def foo {:two 2 :three (inc ::two)}). I was told that there is a library that does exactly this, but I can't really find anything similar.
Thanks!
If you want a temporary name for something, that's what let is for.
(def foo (let [x {:two 2}]
(assoc x :three (inc (:two x)))))
I don't know of any library that does what you want. Every once in a while, someone suggests a "generalized arrow", like -> but with a magic symbol you can stick in the intermediary expressions which will be replaced by something else. See for example here and here. But this idea tends to be shot down because it's more complex and confusing for little benefit. let is your friend. See Rich's example:
(let [x []
x (conj x 1)
x (into x [2 3])
x (map inc x)]
...)
(Update: Rearranged & reworked. build-map and (a sketch of) -m> macros added.)
You could write this particular example as
(def foo (zipmap [:two :three] (iterate inc 2)))
The easiest general solution which occurs to me at this moment is
user> (-> {} (assoc :two 2) (#(assoc % :three (inc (:two %)))))
{:three 3, :two 2}
It's actually very flexible, although it does require you to write out assoc repeatedly.
To enable syntax similar to that from the question text, you could use something like this:
(defn build-map* [& kvs]
(reduce (fn [m [k v]]
(assoc m k (v m)))
{}
kvs))
(defmacro build-map [& raw-kvs]
(assert (even? (count raw-kvs)))
(let [kvs (map (fn [[k v]] [k `(fn [m#] (let [~'this m#] ~v))])
(partition 2 raw-kvs))]
`(build-map* ~#kvs)))
user> (build-map :two 2 :three (inc (:two this)))
{:three 3, :two 2}
You could easily change this to use a user-supplied symbol rather than the hardcoded this. Or you could switch to %, which is just a regular symbol outside anonymous function literals. Maybe add an explicit initial map argument, call it -m> (for map threading) and you can do
(-m> {} :two 2 :three (inc (:two %)))
for the same result.
Another funky way (mostly for the fun):
;;; from Alex Osborne's debug-repl,
;;; see http://gist.github.com/252421
;;; now changed to use &env
(defmacro local-bindings
"Produces a map of the names of local bindings to their values."
[]
(let [symbols (map key &env)]
(zipmap (map (fn [sym] `(quote ~sym)) symbols) symbols)))
(let [two 2
three (inc two)]
(into {} (map (fn [[k v]] [(keyword k) v]) (local-bindings))))
{:two 2, :three 3}
Note that this will also capture the bindings introduced by any outer let forms...