Clojure: Is there a better (or more idiomatic to the language) way to count the letters of a string into a hashmap, than:
; Clojure 1.10.1
user=> (def s1 "A string with some letters")
user=> (def d1 (apply merge-with + (map #(hash-map % 1) (seq s1))))
user=> d1
{\space 4, \A 1, \e 3, \g 1, \h 1, \i 2, \l 1, \m 1, \n 1, \o 1, \r 2, \s 3, \t 4, \w 1}
?
Use frequencies (part of clojure.core)
Description: https://clojuredocs.org/clojure.core/frequencies
Implementation: https://github.com/clojure/clojure/blob/clojure-1.9.0/src/clj/clojure/core.clj#L7123
(def s1 "A string with some letters")
(def d1 (apply merge-with + (map #(hash-map % 1) (seq s1))))
(def d2 (frequencies s1))
(println d1) ; { 4, A 1, e 3, g 1, h 1, i 2, l 1, m 1, n 1, o 1, r 2, s 3, t 4, w 1}
(println d2) ; { 4, A 1, e 3, g 1, h 1, i 2, l 1, m 1, n 1, o 1, r 2, s 3, t 4, w 1}
(println (= d1 d2)) ; true
Performance
(defn rand-str
"Generates a random string of len uppercase letters."
[len]
(apply str (repeatedly len #(char (+ (rand-int 26) 65)))))
(def s (rand-str 100000))
(time (frequencies s)) ; ~100 ms
(time (apply merge-with + (map #(hash-map % 1) (seq s)))) ; ~600 ms
Thus frequencies is ~6x faster with 100,000 elements.
The better performance of frequencies is probably due to its use of transient data structures.
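To illustrate the transient point, here is a minimal sketch of counting with a transient map. It follows the general shape of the clojure.core implementation linked above; the name count-items is made up for this example and is not part of any library.

```clojure
;; Hypothetical helper (not in clojure.core): accumulates counts in a
;; transient map and converts it back to a persistent map at the end.
(defn count-items [coll]
  (persistent!
   (reduce (fn [counts x]
             (assoc! counts x (inc (get counts x 0))))
           (transient {})
           coll)))

(= (count-items "A string with some letters")
   (frequencies "A string with some letters"))
;=> true
```

The merge-with version allocates a one-entry map per character and merges them one by one, which is likely where most of the extra time goes.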
(defn to-percentage [wins total]
(if (= wins 0) 0
(* (/ wins total) 100)))
(defn calc-winrate [matches]
(let [data (r/atom [])]
(loop [wins 0
total 0]
(if (= total (count matches))
@data
(recur (if (= (get (nth matches total) :result) 1)
(inc wins))
(do
(swap! data conj (to-percentage wins total))
(inc total)))))))
(calc-winrate [{:result 0} {:result 1} {:result 0} {:result 1} {:result 1}])
I have the code above; the calc-winrate call on the last line returns [0 0 50 0 25]. I'm trying to make it return [0 50 33.33333333333333 50 60].
Am I doing the increment for wins wrong? When I print the value of wins for each iteration I get
0
nil
1
nil
1
so I'm guessing I somehow reset wins or set it to nil?
Also, could this whole loop be replaced with map/map-indexed or something? It feels like map would be perfect to use but I need to keep the previous iteration wins/total in mind for each iteration.
Thanks!
Here's a lazy solution using reductions to get a sequence of running win totals, and transducers to 1) join the round numbers with the running totals 2) divide the pairs 3) convert fractions to percentages:
(defn calc-win-rate [results]
(->> results
(map :result)
(reductions +)
(sequence
(comp
(map-indexed (fn [round win-total] [win-total (inc round)]))
(map (partial apply /))
(map #(* 100 %))
(map float)))))
(calc-win-rate [{:result 0} {:result 1} {:result 0} {:result 1} {:result 1}])
=> (0.0 50.0 33.333332 50.0 60.0)
You can calculate the running win rates as follows:
(defn calc-winrate [matches]
(map
(comp float #(* 100 %) /)
(reductions + (map :result matches))
(rest (range))))
For example,
=> (calc-winrate [{:result 0} {:result 1} {:result 0} {:result 1} {:result 1}])
(0.0 50.0 33.333332 50.0 60.0)
The map operates on two sequences:
(reductions + (map :result matches)) - the running total of wins;
(rest (range)) - (1 2 3 ...), the corresponding number of matches.
The mapping function, (comp float #(* 100 %) /),
divides the corresponding elements of the sequences,
multiplies it by 100, and
turns it into floating point.
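To make the two input sequences concrete, here is a small sketch that evaluates each piece separately for the example data. The names running-wins and match-counts are only for illustration.

```clojure
(def matches [{:result 0} {:result 1} {:result 0} {:result 1} {:result 1}])

;; Running total of wins after each match.
(def running-wins (reductions + (map :result matches)))
running-wins ;=> (0 1 1 2 3)

;; Matches played so far; (rest (range)) is infinite, so take only 5 here.
(def match-counts (take (count matches) (rest (range))))
match-counts ;=> (1 2 3 4 5)

;; Pairwise division gives exact ratios before the float conversion.
(map / running-wins match-counts) ;=> (0 1/2 1/3 1/2 3/5)
```

In calc-winrate itself the take is not needed, because map stops as soon as the shorter sequence runs out.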
Here's a solution with reduce:
(defn calc-winrate [matches]
(let [total-matches (count matches)]
(->> matches
(map :result)
(reduce (fn [{:keys [wins matches percentage] :as result} match]
(let [wins (+ wins match)
matches (inc matches)]
{:wins wins
:matches matches
:percentage (conj percentage (to-percentage wins matches))}))
{:wins 0
:matches 0
:percentage []}))))
So the thing here is to maintain (and update) the state of the calculation thus far.
We do that in the map that's
{:wins 0
:matches 0
:percentage []}
wins holds the number of wins so far, matches is the number of matches we've analysed, and percentage is the vector of running percentages so far.
(if (= (get (nth matches total) :result) 1)
(inc wins))
Your if is missing its else branch. It should be written as follows:
(if (= (get (nth matches total) :result) 1)
(inc wins)
wins) ; this else branch was missing; without it the if returns nil, which recur then binds to wins
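For comparison, here is one way the whole loop could look with that fix applied. This is a sketch rather than a drop-in replacement: it carries the accumulated rates through loop/recur instead of an atom, and the name calc-winrate-loop is made up for this example.

```clojure
;; Sketch only: the rates vector is a loop binding, so no atom is needed.
(defn calc-winrate-loop [matches]
  (loop [remaining matches
         wins 0
         total 0
         rates []]
    (if (empty? remaining)
      rates
      (let [won?  (= 1 (:result (first remaining)))
            wins  (if won? (inc wins) wins) ; both branches return a number
            total (inc total)]
        (recur (rest remaining)
               wins
               total
               (conj rates (double (* 100 (/ wins total)))))))))

(calc-winrate-loop [{:result 0} {:result 1} {:result 0} {:result 1} {:result 1}])
;; returns five rates: roughly 0, 50, 33.3, 50 and 60
```

Incrementing total before computing the rate also avoids dividing by zero on the first match.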
If you go with reductions, the step function can look like this:
(defn calc-winrate2 [x y]
(let [{total :total r :wins} x
{re :result} y]
(if (pos? re)
{:total (inc total) :wins (inc r)}
{:total (inc total) :wins r})))
(reductions calc-winrate2 {:total 0 :wins 0} [ {:result 0} {:result 1} {:result 0} {:result 1} {:result 1}])
I'm converting some C++ code to Clojure, and I want
to return a graph g with a bunch of edges added to it.
I pass in the number of vertices, the graph, and
the test predicate (e.g., a function that could depend on i, j, randomness, ...), something like this:
(defn addSomeEdges [v g test-p]
(doseq [i (range v)]
(doseq [j (range (dec i))]
(if test-p
(add-edges g [i j] )
)))
g)
the problem, of course, is that (add-edges) returns a new g. How can I capture this updated graph using best practices Clojure, please? It seems so simple and natural in C++.
Iteratively accumulating information looks like a reducing function if you split it into two parts:
Generate a bunch of edges to consider including.
Test each edge and, if it passes, include it; otherwise pass the graph on unchanged.
This can be written using reduce:
user> (defn add-edge [g i j]
(assoc g i j))
#'user/add-edge
user> (add-edge {1 2} 2 1)
{1 2, 2 1}
user> (defn addSomeEdges [v g test-p]
(reduce (fn [graph [i j]] ;; this takes the current graph, the points,
(if (test-p graph i j) ;; decides if the edge should be created.
(add-edge graph i j) ;; and returns the next graph
graph)) ;; or returns the graph unchanged.
g ;; This is the initial graph
(for [i (range v)
j (range (dec i))]
[i j]))) ;; this generates the candidate edges to check.
#'user/addSomeEdges
and let's run it!
user> (addSomeEdges 4 {1 2} (fn [g i j] (rand-nth [true false])))
{1 2, 2 0}
user> (addSomeEdges 4 {1 2} (fn [g i j] (rand-nth [true false])))
{1 2, 3 0}
user> (addSomeEdges 4 {1 2} (fn [g i j] (rand-nth [true false])))
{1 2, 2 0, 3 1}
When you think of other tests you can thread these calls together:
user> (as-> {1 2} g
(addSomeEdges 4 g (fn [g i j] (rand-nth [true false])))
(addSomeEdges 7 g (fn [g i j] (< i j)))
(addSomeEdges 9 g (fn [g i j] (contains? (set (keys g)) j))))
{1 2, 3 1, 4 1, 5 3, 6 4, 7 5, 8 6}
There is more than one solution to this. Sometimes, though, when you have a fundamentally mutable/imperative problem, you should just use a mutable/imperative solution:
; simplest version using mutation
(defn addSomeEdges [v g test-p]
(let [g-local (atom g)]
(doseq [i (range v)]
(doseq [j (range (dec i))]
(when (test-p i j ...) ; what other args does this need?
(swap! g-local add-edges [i j]))))
@g-local))
I was a little uncertain about the semantics of test-p, so that part may need refinement.
Note the swap! will call add-edges like so:
(add-edges <curr val of g-local> [i j])
See the Clojure CheatSheet & ClojureDocs.org for more info.
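A tiny sketch of that calling convention, using conj on a vector of edges as a stand-in for add-edges (whose definition is not shown in the question):

```clojure
;; swap! passes the atom's current value as the first argument,
;; followed by any extra arguments given to swap!.
(def edges (atom []))

(swap! edges conj [1 0]) ; same as (conj [] [1 0])
(swap! edges conj [2 0]) ; same as (conj [[1 0]] [2 0])

@edges ;=> [[1 0] [2 0]]
```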
In this question, you are given a value V and a list of unique integers. Your job is to find the number of distinct subsets of size 4 that sum up to V. Each element in the list can be used only once. If none of such subset can be found, output 0 instead.
For example, if the integers are [3, 4, 5, 6, 7, 8, 9, 10] and the value is 30, the output should be 5. The subsets are:
[3, 8, 9, 10]
[4, 7, 9, 10]
[5, 6, 9, 10]
[5, 7, 8, 10]
[6, 7, 8, 9].
This is not hard to solve; the most direct way is to nest four for-loops. What's the Clojure way to do it?
Here is how I would've done it:
(ns example.solve
(:require [clojure.math.combinatorics :as combo]))
(defn solve
[s n v]
(filter (comp (partial = v)
(partial reduce +))
(combo/combinations s n)))
I'm using math.combinatorics in my example because it's the simplest way to get all combinations of 4 elements from a list.
Here is an example of using solve:
=> (solve [3 4 5 6 7 8 9 10] 4 30)
((3 8 9 10) (4 7 9 10) (5 6 9 10) (5 7 8 10) (6 7 8 9))
I would use clojure.math.combinatorics/combinations to get all 4-element subsets and then filter out those that do not sum up to V.
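If you want to try the generate-then-filter idea without adding the library, a minimal sketch could look like the following. Both k-combinations and count-subsets are hypothetical helpers written for this example; k-combinations is a simple hand-rolled stand-in for the library function, not its actual implementation.

```clojure
;; Hand-rolled stand-in for combinations (hypothetical helper): every
;; k-element combination either includes the first element or it does not.
(defn k-combinations [xs k]
  (cond
    (zero? k)   '(())
    (empty? xs) '()
    :else (concat
           (map #(cons (first xs) %) (k-combinations (rest xs) (dec k)))
           (k-combinations (rest xs) k))))

;; Keep only the combinations whose sum is v, then count them.
(defn count-subsets [xs k v]
  (count (filter #(= v (reduce + %)) (k-combinations xs k))))

(count-subsets [3 4 5 6 7 8 9 10] 4 30) ;=> 5
```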
Interestingly, this problem admits a (doubly?) recursive solution which involves only summing and counting (without actually generating the subsets.)
If you look at the initial element 3, then part of the solution is the number of sums taken from 3 elements in the rest of the sequence where the sum is 27, which is a smaller form of the same problem and can thus be solved recursively. The bottom of the recursion is when you are looking for sums produced from 1 element, which boils down to a simple check to see if the desired sum is in the list.
The other part of the solution involves looking at the next element 4, looking for sums in the rest of the list beyond the 4 equal to 26, and so on... This part can also be treated recursively.
Putting this together as a recursive function looks like the following, which produces the desired answer 5 for the example sequence.
(defn solve [xs n len]
(if (seq xs)
(if (= len 1)
(if (some #{n} xs) 1 0)
(+ (solve (rest xs)
(- n (first xs))
(dec len))
(solve (rest xs)
n
len)))
0))
(solve [3 4 5 6 7 8 9 10] 30 4)
;=> 5
In terms of directly answering the question, here is how you could do it using indexes and a for loop:
(defn solve-for [xs v]
(for [ndx0 (range 0 (- (count xs) 3))
ndx1 (range (inc ndx0) (- (count xs) 2))
ndx2 (range (inc ndx1) (- (count xs) 1))
ndx3 (range (inc ndx2) (count xs))
:when (= v (+ (xs ndx0) (xs ndx1) (xs ndx2) (xs ndx3)))]
(list (xs ndx0) (xs ndx1) (xs ndx2) (xs ndx3))))
FWIW, this turns out to be about 70% faster than the approach using clojure.math.combinatorics but twice as slow as the doubly-recursive solution.
I'm currently using re-seq to find the matches of comments inside a piece of java source code.
(re-seq #"(?:/\*(?:[^*]|(?:\*+[^*/]))*\*+/)|(?://.*)" code)
How can I get the index / indices of the matches in the original string code? That is, I want the start (and end) position of each match within code.
You can modify re-seq with the requisite Java interop:
(defn re-seq-pos [pattern string]
(let [m (re-matcher pattern string)]
((fn step []
(when (. m find)
(cons {:start (. m start) :end (. m end) :group (. m group)}
(lazy-seq (step))))))))
Example
(re-seq-pos #"\w+" "foo bar baz") ;=>
({:start 0, :end 3, :group "foo"}
{:start 4, :end 7, :group "bar"}
{:start 8, :end 11, :group "baz"})
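Applied to the comment pattern from the question, usage might look like the sketch below. The sample input string is made up for illustration, and re-seq-pos is repeated so the block runs on its own.

```clojure
;; re-seq-pos repeated from above so this block is self-contained.
(defn re-seq-pos [pattern string]
  (let [m (re-matcher pattern string)]
    ((fn step []
       (when (.find m)
         (cons {:start (.start m) :end (.end m) :group (.group m)}
               (lazy-seq (step))))))))

(def comment-pattern #"(?:/\*(?:[^*]|(?:\*+[^*/]))*\*+/)|(?://.*)")

;; Made-up sample input: one line comment and one block comment.
(def code "int x; // first\n/* block */ int y;")

(map (juxt :start :end) (re-seq-pos comment-pattern code))
;=> ([7 15] [16 27])

;; Slicing the original string with the indices recovers each match.
(map (fn [{:keys [start end]}] (subs code start end))
     (re-seq-pos comment-pattern code))
;=> ("// first" "/* block */")
```

As with Java's Matcher, :end is exclusive, so it can be passed straight to subs.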
In Scala, the partition method splits a sequence into two separate sequences -- those for which the predicate is true and those for which it is false:
scala> List(1, 5, 2, 4, 6, 3, 7, 9, 0, 8).partition(_ % 2 == 0)
res1: (List[Int], List[Int]) = (List(2, 4, 6, 0, 8),List(1, 5, 3, 7, 9))
Note that the Scala implementation only traverses the sequence once.
In Clojure the partition-by function splits the sequence into multiple sub-sequences, each the longest subset that either does or does not meet the predicate:
user=> (partition-by #(= 0 (rem % 2)) [1, 5, 2, 4, 6, 3, 7, 9, 0, 8])
((1 5) (2 4 6) (3 7 9) (0 8))
while split-with produces:
user=> (split-with #(= 0 (rem % 2)) [1, 5, 2, 4, 6, 3, 7, 9, 0, 8])
[() (1 5 2 4 6 3 7 9 0 8)]
Is there a built-in Clojure function that does the same thing as the Scala partition method?
I believe the function you are looking for is clojure.core/group-by. It returns a map of keys to lists of items in the original sequence for which the grouping function returns that key. If you use a true/false producing predicate, you will get the split that you are looking for.
user=> (group-by even? [1, 5, 2, 4, 6, 3, 7, 9, 0, 8])
{false [1 5 3 7 9], true [2 4 6 0 8]}
If you take a look at the implementation, it fulfills your requirement that it only use one pass. Plus, it uses transients under the hood so it should be faster than the other solutions posted thus far. One caveat is that you should be sure of the keys that your grouping function is producing. If it produces nil instead of false, then your map will list failing items under the nil key. If your grouping function produces non-nil values instead of true, then you could have passing values listed under multiple keys. Not a big problem, just be aware that you need to use a true/false producing predicate for your grouping function.
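A short sketch of both points: destructuring the result to get the two groups, and what happens when the grouping function does not return true/false. The set #{2 4} is just an example of such a function.

```clojure
(def xs [1 5 2 4 6 3 7 9 0 8])

;; Destructure the map to get the two groups, like Scala's partition.
(let [{evens true odds false} (group-by even? xs)]
  [evens odds])
;=> [[2 4 6 0 8] [1 5 3 7 9]]

;; A set used as a predicate returns the element or nil, not true/false,
;; so the keys are not the ones you might expect.
(group-by #{2 4} [1 2 3 4])
;=> {nil [1 3], 2 [2], 4 [4]}

;; Coercing the result with boolean restores the true/false keys.
(group-by (comp boolean #{2 4}) [1 2 3 4])
;=> {false [1 3], true [2 4]}
```

Note that if no item lands in a group, its key is absent from the map, so the destructured name is bound to nil rather than an empty vector.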
The nice thing about group-by is that it is more general than just splitting a sequence into passing and failing items. You can easily use this function to group your sequence into as many categories as you need. Very useful and flexible. That is probably why group-by is in clojure.core instead of separate.
Part of clojure.contrib.seq-utils:
user> (use '[clojure.contrib.seq-utils :only [separate]])
nil
user> (separate even? [1, 5, 2, 4, 6, 3, 7, 9, 0, 8])
[(2 4 6 0 8) (1 5 3 7 9)]
Please note that the answers of Jürgen, Adrian and Mikera all traverse the input sequence twice.
(defn single-pass-separate
[pred coll]
(reduce (fn [[yes no] item]
(if (pred item)
[(conj yes item) no]
[yes (conj no item)]))
[[] []]
coll))
A single pass can only be eager. Lazy has to be two pass plus weakly holding onto the head.
Edit: lazy-single-pass-separate is possible but hard to understand. In fact, I believe it is slower than a simple second pass, but I haven't checked that.
(defn lazy-single-pass-separate
[pred coll]
(let [coll (atom coll)
yes (atom clojure.lang.PersistentQueue/EMPTY)
no (atom clojure.lang.PersistentQueue/EMPTY)
fill-queue (fn [q]
(while (zero? (count @q))
(locking coll
(when (zero? (count @q))
(when-let [s (seq @coll)]
(let [fst (first s)]
(if (pred fst)
(swap! yes conj fst)
(swap! no conj fst))
(swap! coll rest)))))))
queue (fn queue [q]
(lazy-seq
(fill-queue q)
(when (pos? (count @q))
(let [item (peek @q)]
(swap! q pop)
(cons item (queue q))))))]
[(queue yes) (queue no)]))
This is as lazy as you can get:
user=> (let [[y n] (lazy-single-pass-separate even? (report-seq))] (def yes y) (def no n))
#'user/no
user=> (first yes)
">0<"
0
user=> (second no)
">1<"
">2<"
">3<"
3
user=> (second yes)
2
Looking at the above, I'd say "go eager" or "go two pass."
It's not hard to write something that does the trick:
(defn partition-2 [pred coll]
((juxt
(partial filter pred)
(partial filter (complement pred)))
coll))
(partition-2 even? (range 10))
=> [(0 2 4 6 8) (1 3 5 7 9)]
Maybe see https://github.com/amalloy/clojure-useful/blob/master/src/useful.clj#L50 - whether it traverses the sequence twice depends on what you mean by "traverse the sequence".
Edit: Now that I'm not on my phone, I guess it's silly to link instead of paste:
(defn separate
[pred coll]
(let [coll (map (fn [x]
[x (pred x)])
coll)]
(vec (map #(map first (% second coll))
[filter remove]))))