Consider a 2d vector like this:
(def v2d [["a" "b" "c"]
          ["d" "e" "f"]
          ["g" "h" "i"]])
I need to swap the x and the y axes. So it needs to be turned into:
[["a" "d" "g"]
["b" "e" "h"]
["c" "f" "i"]]
I came up with this solution:
(defn swap-xy [v2d]
  (apply mapv (fn [& args] (vec args)) v2d))
I'm just wondering whether there is an unnecessary step in there. The type of args turns out to be clojure.lang.PersistentVector$ChunkedSeq, which is why I have to turn it back into a vector with vec.
Could the conversion be avoided?
Here's a small benchmark regarding the conversion in question:
user=> (time (def x (apply (fn [& args] args) (range 10000000))))
"Elapsed time: 0.446455 msecs"
#'user/x
user=> (time (def x (apply (fn [& args] (vec args)) (range 10000000))))
"Elapsed time: 721.011768 msecs"
#'user/x
Your anonymous function just reimplements vector; it would be more idiomatic to write it as (apply mapv vector v2d).
BTW, do you really need it to be a vector? It's significantly faster to generate a sequence of vectors: (apply map vector v2d).
Also, benchmarking with time is not very accurate. It's best to use a tool such as Criterium.
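For reference, here's a quick REPL check of both variants on the v2d from the question (my own transcript, not part of the original answer):
user=> (apply mapv vector v2d)
[["a" "d" "g"] ["b" "e" "h"] ["c" "f" "i"]]
user=> (apply map vector v2d)
(["a" "d" "g"] ["b" "e" "h"] ["c" "f" "i"])
Note that the second variant returns a lazy sequence of vectors rather than a vector of vectors.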
I want to create a nested vector [[1 randint1] [2 randint2] ...] up until 100 without looping, but I'm not sure if it's possible.
I've tried creating multiple hash-maps, but once they're stored in a vector I can't convert the inner maps to vectors as well.
(def rands (take 100 (repeatedly #(rand-int 100))))
(println (vec (map #(hash-map %1 %2) (range 100) rands)))
I expect [[1 randint1] [2 randint2] ...] but get [{1 randint1} {2 randint2} ...].
Here's a loop variant that produces the correct output:
(def foo [])
(loop [i 1]
  (when (<= i 100)
    (def foo (conj foo [i (rand-int 100)]))
    (recur (inc i))))
Thanks to @akond for the help. This works:
(vec (for [i (range 100)] [(inc i) (rand-int 100)]))
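An equivalent loop-free sketch that zips indices directly with the rands sequence defined above (my variant using mapv; same idea, no intermediate hash-maps):
(mapv vector (map inc (range 100)) rands)
; => [[1 randint1] [2 randint2] ...]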
As a newbie to Clojure I often have difficulties to express the simplest things. For example, for replacing the last element in a vector, which would be
v[-1]=new_value
in python, I end up with the following variants in Clojure:
(assoc v (dec (count v)) new_value)
which is pretty long and inexpressive to say the least, or
(conj (vec (butlast v)) new_value)
which is even worse, as it has O(n) running time.
That leaves me feeling silly, like a caveman trying to repair a Swiss watch with a club.
What is the right Clojure way to replace the last element in a vector?
To support my O(n) claim for the butlast version (Clojure 1.8):
user=> (def v (vec (range 1e6)))
#'user/v
user=> (time (first (conj (vec (butlast v)) 55)))
"Elapsed time: 232.686159 msecs"
0
user=> (def v (vec (range 1e7)))
#'user/v
user=> (time (first (conj (vec (butlast v)) 55)))
"Elapsed time: 2423.828127 msecs"
0
So basically, for 10 times the number of elements it is 10 times slower.
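For contrast, the assoc version should be effectively constant-time on the same vectors, since assoc on a vector only rebuilds the path to a single leaf of the trie. A sketch of the comparison (run it yourself; no timings claimed here):
user=> (time (first (assoc v (dec (count v)) 55)))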
I'd use
(defn set-top [coll x]
  (conj (pop coll) x))
For example,
(set-top [1 2 3] :a)
=> [1 2 :a]
But it also works on the front of lists:
(set-top '(1 2 3) :a)
=> (:a 2 3)
The Clojure stack functions - peek, pop, and conj - work on the natural open end of a sequential collection.
But there is no one right way.
How do the various solutions react to an empty vector?
Your Python v[-1]=new_value throws an exception, as does your (assoc v (dec (count v)) new_value) and my (defn set-top [coll x] (conj (pop coll) x)).
Your (conj (vec (butlast v)) new_value) returns [new_value]. The butlast has no effect.
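If you want a variant that tolerates the empty case, here is a guarded sketch (my own naming, not from the answer above); for an empty collection it simply inserts x, mirroring the butlast version:
(defn set-top-safe [coll x]
  (if (seq coll)
    (conj (pop coll) x)
    (conj coll x)))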
If you insist on being "pure", your 2nd or 3rd solutions will work. I prefer to be simpler & more explicit using the helper functions from the Tupelo library:
(s/defn replace-at :- ts/List
  "Replaces an element in a collection at the specified index."
  [coll  :- ts/List
   index :- s/Int
   elem  :- s/Any]
  ...)
(is (= [9 1 2] (replace-at (range 3) 0 9)))
(is (= [0 9 2] (replace-at (range 3) 1 9)))
(is (= [0 1 9] (replace-at (range 3) 2 9)))
As with drop-at, replace-at will throw an exception for invalid values of index.
Similar helper functions exist for
insert-at
drop-at
prepend
append
Note that all of the above work equally well for either a Clojure list (eager or lazy) or a Clojure vector. The conj solution will fail unless you are careful to always coerce the input to a vector first as in your example.
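Applied to the original problem, the call would presumably look like this (assuming replace-at from Tupelo as sketched above):
(replace-at v (dec (count v)) new-value)
This replaces the last element of v whether v is a list or a vector.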
I know I can destructure a vector "from the front" like this:
(fn [[a b & rest]] (+ a b))
Is there any (short) way to access the last two elements instead?
(fn [[rest & a b]] (+ a b)) ;;Not legal
My current alternative is to
(fn [my-vector] (let [[a b] (take-last 2 my-vector)] (+ a b)))
and I was trying to figure out if there is a way to do that more conveniently, directly in the function arguments.
You can peel off the last two elements and add them thus:
((fn [v] (let [[b a] (rseq v)] (+ a b))) [1 2 3 4])
; 7
rseq returns a reversed sequence of a vector in constant time.
We just destructure its first two elements.
We needn't mention the rest of it, which we don't do anything with.
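For reference, a quick REPL check of rseq itself (my own transcript):
user=> (rseq [1 2 3 4])
(4 3 2 1)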
user=> (def v (vec (range 0 10000000)))
#'user/v
user=> (time ((fn [my-vector] (let [[a b] (take-last 2 my-vector)] (+ a b))) v))
"Elapsed time: 482.965121 msecs"
19999997
user=> (time ((fn [my-vector] (let [a (peek my-vector) b (peek (pop my-vector))] (+ a b))) v))
"Elapsed time: 0.175539 msecs"
19999997
My advice would be to throw convenience to the wind and use peek and pop to work with the end of a vector. When your input vector is very large, you'll see tremendous performance gains.
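Wrapped up as a small helper (my naming; it just packages the advice above):
(defn sum-last-two [v]
  (+ (peek v) (peek (pop v))))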
(Also, to answer the question in the title: no.)
Given a collection"
[{:key "key_1" :value "value_1"}, {:key "key_2" :value "value_2"}]
I would like to convert this to:
{"key_1" "value_1" "key_2" "value_2"}
A function to do this would be:
(defn long->wide [xs]
  (apply hash-map (flatten (map vals xs))))
I might simplify this using the threading macro:
(defn long->wide [xs]
  (->> xs
       (map vals)
       (flatten)
       (apply hash-map)))
This still requires explicitly naming the function argument, which I do nothing with other than pass it to the first function. I might then rewrite this using comp to remove it:
(def long->wide
  (comp (partial apply hash-map) flatten (partial map vals)))
This however requires repeated use of partial which to me is a lot of noise in the function.
Is there some function in Clojure that combines comp and ->>, so I can create a higher-order function without repeated use of partial and without having to create a new function?
Since many of the answers here don't answer the original question but instead suggest different approaches, I'll put this one back up too.
I'd go with reduce and destructuring:
(reduce
  (fn [m {:keys [key value]}]
    (assoc m key value))
  {}
  [{:key "key_1" :value "value_1"}, {:key "key_2" :value "value_2"}])
Note that this will also work with string keys, which you mentioned in the comments (note :strs):
(reduce
  (fn [m {:strs [key value]}]
    (assoc m key value))
  {}
  [{"key" "key_1" "value" "value_1"}, {"key" "key_2" "value" "value_2"}])
Another (point-free) version, when using keywords:
(partial into {} (map (juxt :key :value)))
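A quick REPL check of that point-free version on the question's data (my transcript; it relies on the three-argument, transducer-taking arity of into):
user=> ((partial into {} (map (juxt :key :value)))
        [{:key "key_1" :value "value_1"} {:key "key_2" :value "value_2"}])
{"key_1" "value_1", "key_2" "value_2"}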
Since you mentioned in the comments, that you are using values from a DB, there might also be the chance, that you can switch to just return value tuples. Then the whole operation is just:
(into {} [["key_1" "value_1"]["key_2" "value_2"]])
Also note that using vals on a map and expecting "insertion order" is dangerous. Small maps are ordered only by accident:
user=> (take 3 (zipmap (range 3) (range 3)))
([0 0] [1 1] [2 2])
user=> (take 3 (zipmap (range 100) (range 100)))
([0 0] [65 65] [70 70])
Another alternative to the nice answers is:
(apply hash-map (mapcat vals [{:key "key_1" :value "value_1"}, {:key "key_2" :value "value_2"}]))
or:
((comp #(apply hash-map %) #(mapcat vals %)) [{:key "key_1" :value "value_1"}, {:key "key_2" :value "value_2"}])
which are exactly the same.
As always with Clojure, there are many ways to solve most problems.
(partial #(reduce (fn [r m] (assoc r (m :key) (m :value)))
                  {}
                  %))
Not sure whether the creation of anonymous functions violates your condition, but this isn't adding functions to the namespace, so I thought I'd throw it out there. This also has the benefit of not requiring the keys in the input maps to be keywords: :key and :value can be replaced with values of any type, since the map is in the function position. For example:
(partial #(reduce (fn [r m] (assoc r (m "key") (m "value")))
                  {}
                  %))
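And a quick REPL check with string-keyed input maps (my own example data):
user=> ((partial #(reduce (fn [r m] (assoc r (m "key") (m "value"))) {} %))
        [{"key" "key_1" "value" "value_1"} {"key" "key_2" "value" "value_2"}])
{"key_1" "value_1", "key_2" "value_2"}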
What if map and doseq had a baby? I'm trying to write a function or macro like Common Lisp's mapc, but in Clojure. This does essentially what map does, but only for side-effects, so it doesn't need to generate a sequence of results, and wouldn't be lazy. I know that one can iterate over a single sequence using doseq, but map can iterate over multiple sequences, applying a function to each element in turn of all of the sequences. I also know that one can wrap map in dorun. (Note: This question has been extensively edited after many comments and a very thorough answer. The original question focused on macros, but those macro issues turned out to be peripheral.)
This is fast (according to criterium):
(defn domap2
  [f coll]
  (dotimes [i (count coll)]
    (f (nth coll i))))
but it only accepts one collection. This accepts arbitrary collections:
(defn domap3
  [f & colls]
  (dotimes [i (apply min (map count colls))]
    (apply f (map #(nth % i) colls))))
but it's very slow by comparison. I could also write a version like the first, but with different parameter cases [f c1 c2], [f c1 c2 c3], etc., but in the end, I'll need a case that handles arbitrary numbers of collections, like the last example, which is simpler anyway. I've tried many other solutions as well.
Since the second example is very much like the first except for the use of apply and the map inside the loop, I suspect that getting rid of them would speed things up a lot. I have tried to do this by writing domap2 as a macro, but the way that the catch-all variable after & is handled keeps tripping me up, as illustrated above.
Other examples (out of 15 or 20 different versions), benchmark code, and times on a Macbook Pro that's a few years old (full source here):
(defn domap1
  [f coll]
  (doseq [e coll]
    (f e)))

(defn domap7
  [f coll]
  (dorun (map f coll)))

(defn domap18
  [f & colls]
  (dorun (apply map f colls)))

(defn domap15
  [f coll]
  (when (seq coll)
    (f (first coll))
    (recur f (rest coll))))

(defn domap17
  [f & colls]
  (let [argvecs (apply (partial map vector) colls)] ; seq of n-tuples of interleaved vals
    (doseq [args argvecs]
      (apply f args))))
I'm working on an application that uses core.matrix matrices and vectors, but feel free to substitute your own side-effecting functions below.
(ns tst
  (:use criterium.core
        [clojure.core.matrix :as mx]))
(def howmany 1000)
(def a-coll (vec (range howmany)))
(def maskvec (zero-vector :vectorz howmany))
(defn unmaskit!
  [idx]
  (mx/mset! maskvec idx 1.0)) ; sets element idx of maskvec to 1.0
(defn runbench
  [domapfn label]
  (print (str "\n" label ":\n"))
  (bench (def _ (domapfn unmaskit! a-coll))))
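Each variant is presumably benchmarked along these lines (my reconstruction; the full harness is in the linked source):
(runbench domap1 "domap1")
(runbench domap2 "domap2")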
Mean execution times according to Criterium, in microseconds:
domap1: 12.317551 [doseq]
domap2: 19.065317 [dotimes]
domap3: 265.983779 [dotimes with apply, map]
domap7: 53.263230 [map with dorun]
domap18: 54.456801 [map with dorun, multiple collections]
domap15: 32.034993 [recur]
domap17: 95.259984 [doseq, multiple collections interleaved using map]
EDIT: It may be that dorun+map is the best way to implement domap for multiple large lazy sequence arguments, but doseq is still king when it comes to single lazy sequences. Performing the same operation as unmaskit! above, but running the index through (mod idx 1000), and iterating over (range 100000000), doseq is about twice as fast as dorun+map in my tests (i.e. (def domap25 (comp dorun map))).
You don't need a macro, and I don't see why a macro would be helpful here.
user> (defn do-map [f & lists] (apply mapv f lists) nil)
#'user/do-map
user> (do-map (comp println +) (range 2 6) (range 8 11) (range 22 40))
32
35
38
nil
Note that do-map here is eager (thanks to mapv) and only executes for side effects.
Macros can use varargs lists, as the (useless!) macro version of do-map demonstrates:
user> (defmacro do-map-macro [f & lists] `(do (mapv ~f ~@lists) nil))
#'user/do-map-macro
user> (do-map-macro (comp println +) (range 2 6) (range 8 11) (range 22 40))
32
35
38
nil
user> (macroexpand-1 '(do-map-macro (comp println +) (range 2 6) (range 8 11) (range 22 40)))
(do (clojure.core/mapv (comp println +) (range 2 6) (range 8 11) (range 22 40)) nil)
Addendum:
addressing the efficiency / garbage-creation concerns:
Note that below I truncate the output of Criterium's bench function for conciseness:
(defn do-map-loop
  [f & lists]
  (loop [heads lists]
    (when (every? seq heads)
      (apply f (map first heads))
      (recur (map rest heads)))))
user> (crit/bench (with-out-str (do-map-loop (comp println +) (range 2 6) (range 8 11) (range 22 40))))
...
Execution time mean : 11.367804 µs
...
This looks promising because it doesn't create a data structure that we aren't using anyway (unlike mapv above). But it turns out it is slower than the previous definitions (maybe because of the two map calls per iteration?).
user> (crit/bench (with-out-str (do-map-macro (comp println +) (range 2 6) (range 8 11) (range 22 40))))
...
Execution time mean : 7.427182 µs
...
user> (crit/bench (with-out-str (do-map (comp println +) (range 2 6) (range 8 11) (range 22 40))))
...
Execution time mean : 8.355587 µs
...
Since the loop still wasn't faster, let's try a version which specializes on arity, so that we don't need to call map twice on every iteration:
(defn do-map-loop-3
  [f a b c]
  (loop [[a & as] a
         [b & bs] b
         [c & cs] c]
    (when (and a b c)
      (f a b c)
      (recur as bs cs))))
Remarkably, though this is faster than do-map-loop, it is still slower than the version that just used mapv:
user> (crit/bench (with-out-str (do-map-loop-3 (comp println +) (range 2 6) (range 8 11) (range 22 40))))
...
Execution time mean : 9.450108 µs
...
Next I wondered if the size of the input was a factor. With larger inputs...
user> (def test-input (repeatedly 3 #(range (rand-int 100) (rand-int 1000))))
#'user/test-input
user> (map count test-input)
(475 531 511)
user> (crit/bench (with-out-str (apply do-map-loop-3 (comp println +) test-input)))
...
Execution time mean : 1.005073 ms
...
user> (crit/bench (with-out-str (apply do-map (comp println +) test-input)))
...
Execution time mean : 756.955238 µs
...
Finally, for completeness, the timing of do-map-loop (which as expected is slightly slower than do-map-loop-3)
user> (crit/bench (with-out-str (apply do-map-loop (comp println +) test-input)))
...
Execution time mean : 1.553932 ms
As we see, even with larger input sizes, mapv is faster.
(I should note for completeness here that map is slightly faster than mapv, but not by a large degree).