Clojure highest 3 values of a map - clojure

Hi l'm trying to do a function that returns the 3 most commom strings
(take 3 (sort-by val > (frequencies s))))
(freq ["hi" "hi" "hi" "ola" "hello" "hello" "string" "str" "ola" "hello" "hello" "str"])
l've got this so far but a noticed that if there are more than 1 string with the same frenquency it won't return. Is there a way to filter the values of the frequencies funcition by their highest (eventually the top 3 highest)?
Thanks in advance.

I would propose slightly different solution which involves inverting frequencies map with group-by value (which is the items' count):
(->> data
frequencies
(group-by val))
;;{3 [["hi" 3]],
;; 2 [["ola" 2] ["str" 2]],
;; 4 [["hello" 4]],
;; 1 [["string" 1]]}
so the only thing you need is to just sort and process it:
(->> data
frequencies
(group-by val)
(sort-by key >)
(take 3)
(mapv (fn [[k vs]] {:count k :items (mapv first vs)})))
;;[{:count 4, :items ["hello"]}
;; {:count 3, :items ["hi"]}
;; {:count 2, :items ["ola" "str"]}]

frequencies gives you a map, where the keys are the original values to
investigate and the values in that map are the frequency of those
values. For your result you are interested for all original values,
that have the most occurrences including those original values with the
same occurrences.
One way would be to "invert" the frequencies result, to get a map from
occurrences to all original values with that occurrence. Then you can
get the highest N keys and from this map and select them (by using
a "sorted map" for inverting the map, we get the sorting by keys without
further steps).
(defn invert-map
([source]
(invert-map source {}))
([source target]
(reduce (fn [m [k v]]
(update m v (fnil conj []) k))
target
source)))
(assert (=
{1 ["do" "re"]}
(invert-map {"do" 1 "re" 1})))
(defn freq
[n s]
(let [fs (invert-map (frequencies s) (sorted-map-by >))
top-keys (take n (keys fs))]
(select-keys fs top-keys)))
(assert (=
{4 ["hello"], 3 ["hi"], 2 ["ola" "str"]}
(freq 3 ["hi" "hi" "hi" "ola" "hello" "hello" "string" "str" "ola" "hello" "hello" "str"])))

Related

Calculating image of a Set in Clojure includes nil value?

(defn image-of
"computes the image of the element x under R"
[R x]
(set
(for [r R]
(when (= (first r) x)
(second r)))))
Function idea: Add the second variable in R when it's first is equal to x.
So this function is supposed to compute image of a relation. This is kinda successful. When running a test I get this result:
Input: (image-of #{[1 :a] [2 :b] [1 :c] [3 :a]} 1)
Expected: #{:c :a}
Actual: #{nil :c :a}
So it includes a nil value for some reason. What in the function causes this? I guess I could filter out any nil values but would like to have the solution on a single line.
So the problem was I didn't know exactly how to use when
This solution does it:
(set (for [r R
:when (= (first r) x)]
(second r)))
Let me suggest a different approach.
The natural way to represent a relation in Clojure is as a map from keys to sets (or other collections) of values. A function to convert your collection of pairs to this form is ...
(defn pairs->map [pairs]
(reduce
(fn [acc [key value]]
(assoc acc key (conj (acc key #{}) value)))
{}
pairs))
For example, ...
(pairs->map #{[1 :a] [2 :b] [1 :c] [3 :a]})
=> {2 #{:b}, 1 #{:c :a}, 3 #{:a}}
You can use this map as a function. I you feed it a key, it returns the corresponding value:
({2 #{:b}, 1 #{:c :a}, 3 #{:a}} 1)
=> #{:c :a}
You construct this map once and or all and use it as often as you like. Looking it up as a function is effectively a constant-time operation. But you run through the entire collection of pairs every time you evaluate image-of.

How do I create a map of string indexes in Clojure?

I want to create a map where the keys are characters in the string and the values of each key are lists of positions of given character in the string.
a bit shorter variant:
(defn process [^String s]
(group-by #(.charAt s %) (range (count s))))
user> (process "asdasdasd")
;;=> {\a [0 3 6], \s [1 4 7], \d [2 5 8]}
notice that indices here are sorted
I am sure there are several solutions for this. My first thought was use map-indexed to get a list of [index character] then reduce the collection in to a map.
(defn char-index-map [sz]
(reduce
(fn [accum [i ch]]
(update accum ch conj i))
{}
(map-indexed vector sz)))
(char-index-map "aabcab")
;;=> {\a (4 1 0), \b (5 2), \c (3)}

How to create a map from a list of key-value pairs using values as a predicate in Clojure?

Just started learning Clojure, so I imagine my main issue is I don't know how to formulate the problem correctly to find an existing solution. I have a map:
{[0 1 "a"] 2, [0 1 "b"] 1, [1 1 "a"] 1}
and I'd like to "transform" it to:
{[0 1] "a", [1 1] "a"}
i.e. use the two first elements of the composite key as they new key and the third element as the value for the key-value pair that had the highest value in the original map.
I can easily create a new map structure:
=> (into {} (for [[[x y z] v] {[0 1 "a"] 2, [0 1 "b"] 1, [1 1 "a"] 1}] [[x y] {z v}]))
{[0 1] {"b" 1}, [1 1] {"a" 1}}
but into accepts no predicates so last one wins. I also experimented with :let and merge-with but can't seem to correctly refer to the map, eliminate the unwanted pairs or replace values of the map while processing.
You can do this by threading together a series of sequence transformations.
(->> data
(group-by #(->> % key (take 2)))
vals
(map (comp first first (partial sort-by (comp - val))))
(map (juxt #(subvec % 0 2) #(% 2)))
(into {}))
;{[0 1] "a", [1 1] "a"}
... where
(def data {[0 1 "a"] 2, [0 1 "b"] 1, [1 1 "a"] 1})
You build up the solution line by line. I recommend you follow in the footsteps of the construction, starting with ...
(->> data
(group-by #(->> % key (take 2)))
;{(0 1) [[[0 1 "a"] 2] [[0 1 "b"] 1]], (1 1) [[[1 1 "a"] 1]]}
Stacking up layers of (lazy) sequences can run fairly slowly, but the transducers available in Clojure 1.7 will allow you to write faster code in this idiom, as seen in this excellent answer.
Into tends to be most useful when you just need to take a seq of values and with no additional transformation construct a result from it using only conj. Anything else where you are performing construction tends to be better suited by preprocessing such as sorting, or by a reduction which allows you to perform accumulator introspection such as you want here.
First of all we have to be able to compare two strings..
(defn greater? [^String a ^String b]
(> (.compareTo a b) 0))
Now we can write a transformation that compares the current value in the accumulator to the "next" value and keeps the maximum. -> used somewhat gratuitusly to make the update function more readable.
(defn transform [input]
(-> (fn [acc [[x y z] _]] ;; take the acc, [k, v], destructure k discard v
(let [key [x y]] ;; construct key into accumulator
(if-let [v (acc key)] ;; if the key is set
(if (greater? z v) ;; and z (the new val) is greater
(assoc acc key z) ;; then update
acc) ;; else do nothing
(assoc acc key z)))) ;; else update
(reduce {} input))) ;; do that over all [k, v]s from empty acc
user> (def m {[0 1 "a"] 2, [0 1 "b"] 1, [1 1 "a"] 1})
#'user/m
user> (->> m
keys
sort
reverse
(mapcat (fn [x]
(vector (-> x butlast vec)
(last x))))
(apply sorted-map))
;=> {[0 1] "a", [1 1] "a"}

Clojure zip function

I need to build a seq of seqs (vec of vecs) by combining first, second, etc elements of the given seqs.
After a quick searching and looking at the cheat sheet. I haven't found one and finished with writing my own:
(defn zip
"From the sequence of sequences return a another sequence of sequenses
where first result sequense consist of first elements of input sequences
second element consist of second elements of input sequenses etc.
Example:
[[:a 0 \\a] [:b 1 \\b] [:c 2 \\c]] => ([:a :b :c] [0 1 2] [\\a \\b \\c])"
[coll]
(let [num-elems (count (first coll))
inits (for [_ (range num-elems)] [])]
(reduce (fn [cols elems] (map-indexed
(fn [idx coll] (conj coll (elems idx))) cols))
inits coll)))
I'm interested if there is a standard method for this?
(apply map vector [[:a 0 \a] [:b 1 \b] [:c 2 \c]])
;; ([:a :b :c] [0 1 2] [\a \b \c])
You can use the variable arity of map to accomplish this.
From the map docstring:
... Returns a lazy sequence consisting of the result of applying f to
the set of first items of each coll, followed by applying f to the set
of second items in each coll, until any one of the colls is exhausted.
Any remaining items in other colls are ignored....
Kyle's solution is a great one and I see no reason why not to use it, but if you want to write such a function from scratch you could write something like the following:
(defn zip
([ret s]
(let [a (map first s)]
(if (every? nil? a)
ret
(recur (conj ret a) (map rest s)))))
([s]
(reverse (zip nil s))))

How do I operate on every item in a vector AND refer to a previous value in Clojure?

Given:
(def my-vec [{:a "foo" :b 10} {:a "bar" :b 13} {:a "baz" :b 7}])
How could iterate over each element to print that element's :a and the sum of all :b's to that point? That is:
"foo" 10
"bar" 23
"baz" 30
I'm trying things like this to no avail:
; Does not work!
(map #(prn (:a %2) %1) (iterate #(+ (:b %2) %1) 0)) my-vec)
This doesn't work because the "iterate" lazy-seq can't refer to the current element in my-vec (as far as I can tell).
TIA! Sean
user> (reduce (fn [total {:keys [a b]}]
(let [total (+ total b)]
(prn a total)
total))
0 my-vec)
"foo" 10
"bar" 23
"baz" 30
30
You could look at this as starting with a sequence of maps, filtering out a sequence of the :a values and a separate sequence of the rolling sum of the :b values and then mapping a function of two arguments onto the two derived sequences.
create sequence of just the :a and :b values with
(map :a my-vec)
(map :b my-vec)
then a function to get the rolling sum:
(defn sums [sum seq]
"produce a seq of the rolling sum"
(if (empty? seq)
sum
(lazy-seq
(cons sum
(recur (+ sum (first seq)) (rest seq))))))
then put them together:
(map #(prn %1 %s) (map :a my-vec) (sums 0 (map :b my-vec)))
This separates the problem of generating the data from processing it. Hopefully this makes life easier.
PS: whats a better way of getting the rolling sum?
Transform it into the summed sequence:
(defn f [start mapvec]
(if (empty? mapvec) '()
(let [[ m & tail ] mapvec]
(cons [(m :a)(+ start (m :b))] (f (+ start (m :b)) tail)))))
Called as:
(f 0 my-vec)
returns:
(["foo" 10] ["bar" 23] ["baz" 30])