Grouping a seq by different sizes - Clojure


I have the following data:
(def letters [:a :b :c :d :e :f :g ])
(def group-sizes [2 3 2])
What would be an idiomatic way to group letters by size, such that I get:
[[:a :b] [:c :d :e] [:f :g]]
Thanks.

(->> group-sizes
     (reductions + 0)
     (partition 2 1)
     (map (partial apply subvec letters)))
This algorithm requires the input coll letters to be a vector with at least (apply + group-sizes) elements. It returns a lazy seq (or a vector if you use mapv) of vectors that share structure with the input vector.
Thanks to subvec, each group is created in O(1) time, so the overall time complexity should be O(N) where N is (count group-sizes), compared to Diego's algorithm, where N would be the typically much larger (count letters).
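To make the pipeline above easier to follow, here is a sketch that names each intermediate value. The names boundaries, ranges and groups are only illustrative and do not appear in the answer.

```clojure
(def letters [:a :b :c :d :e :f :g])
(def group-sizes [2 3 2])

;; Running totals of the sizes are the group boundaries.
(def boundaries (reductions + 0 group-sizes))   ; (0 2 5 7)

;; Overlapping pairs of boundaries are [start end) index ranges.
(def ranges (partition 2 1 boundaries))         ; ((0 2) (2 5) (5 7))

;; subvec slices the vector by index without copying its elements.
(def groups
  (mapv (fn [[start end]] (subvec letters start end)) ranges))
;; [[:a :b] [:c :d :e] [:f :g]]
```

Destructuring [start end] does the same job as (partial apply subvec letters), it just spells out the two indices.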

After I started writing my answer, I noticed that Leon Grapenthin's solution is almost identical to mine.
Here is my version of it:
(let [end (reductions + group-sizes)
      start (cons 0 end)]
  (map (partial subvec letters) start end))
The only difference from Leon Grapenthin's solution is that I'm using let and cons instead of partition and apply.
Note that both solutions consume group-sizes lazily and therefore produce a lazy sequence as output.
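One way to check the laziness claim is to feed the same let form an endless sequence of sizes and take only a few groups. This is a sketch: endless-sizes, groups and first-three are names made up for the experiment.

```clojure
(def letters [:a :b :c :d :e :f :g])

;; Hypothetical input: an infinite sequence of group sizes.
(def endless-sizes (repeat 2))

(def groups
  (let [end (reductions + endless-sizes)
        start (cons 0 end)]
    (map (partial subvec letters) start end)))

;; Only the first three groups are realized, so the infinite input is harmless.
(def first-three (vec (take 3 groups)))
;; [[:a :b] [:c :d] [:e :f]]
```

Realizing a fourth group would ask subvec for indices 6 to 8 of a 7-element vector, which should throw, so the caller still has to make sure the sizes fit the vector.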

Not necessarily the best way (e.g. you may want to check that the sum of the group sizes is the same as the size of letters to avoid an NPE) but it was my first thought:
(defn sp [[f & r] l]
  (when (seq l)
    (cons (take f l)
          (sp r (drop f l)))))
You could also do it with an accumulator and recur if you have a long list and don't want to blow up the stack.
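A minimal sketch of the accumulator-and-recur variant mentioned above. The name split-by-sizes is hypothetical, and the stopping rule (stop as soon as either the sizes or the elements run out) is an assumption, not something the answer specifies.

```clojure
(defn split-by-sizes
  "Splits coll into consecutive groups whose lengths come from sizes.
  Stops when either sizes or coll is exhausted."
  [sizes coll]
  (loop [sizes sizes
         remaining (seq coll)
         acc []]
    (if (and (seq sizes) remaining)
      (let [n (first sizes)]
        (recur (rest sizes)
               (seq (drop n remaining))
               (conj acc (vec (take n remaining)))))
      acc)))

(split-by-sizes [2 3 2] [:a :b :c :d :e :f :g])
;; => [[:a :b] [:c :d :e] [:f :g]]
```

Because recur reuses the stack frame, this version does not grow the stack with the number of groups. Elements left over after the last size are silently dropped here.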

Related

Combine transduction output with input into a hashmap

I want to do the following in Clojure as idiomatically as possible:
transduce a collection
associate each element of the input collection with the corresponding element in the output collection
return the result in a hashmap
Is there a succinct way to do this using core library functions?
If not, what improvements can you suggest to the following implementation?
(defn to-hash [coll xform]
  (reduce
    merge
    (map
      #(apply hash-map %)
      (mapcat hash-map coll (into [] xform coll)))))

something like this should do the trick without intermediate collections:
(defn process [data xform]
  (zipmap data (eduction xform data)))
user> (process [1 2 3] (comp (map inc) (map #(* % %))))
;;=> {1 4, 2 9, 3 16}
the docs on eduction say the following:
Returns a reducible/iterable application of the transducers
to the items in coll. Transducers are applied in order as if
combined with comp. Note that these applications will be
performed every time reduce/iterator is called.
so no additional collection is created.
This only works, of course, as long as there is a one-to-one relationship between input and output elements. What is the desired output for (process [1 -2 3] (filter pos?)) or (process [1 1 1 2 2 2] (dedupe))?
(By the way, your to-hash implementation has the same flaw.)
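To see what actually happens in those two cases, here is the same process function applied to them. The results follow from zipmap pairing keys and values positionally and stopping at the shorter sequence.

```clojure
(defn process [data xform]
  (zipmap data (eduction xform data)))

;; filter drops -2 from the values, so every later value shifts left:
(process [1 -2 3] (filter pos?))
;; => {1 1, -2 3}

;; dedupe leaves two values, and both get paired with the key 1:
(process [1 1 1 2 2 2] (dedupe))
;; => {1 2}
```

In both cases the map is silently wrong rather than failing, which is why the one-to-one requirement matters.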

A transducer is a function that takes a reducing function and returns a new reducing function. To make it work with transducers where there is not a one-to-one mapping from elements in the input collection to the output, you will have to use your transducer to create a new reducing function (step2 in the code below) that will associate elements into your hash map. Something like this.
(def ^:dynamic assoc-k nil)
(defn assoc-step [dst x]
  (assoc dst assoc-k x))
(defn to-hash [coll xform]
  (let [step (xform (completing assoc-step))
        step2 (fn [dst x] (binding [assoc-k x] (step dst x)))]
    (reduce step2 {} coll)))
This implementation is quite basic and I am not sure to what extent it will work with stateful transducers. But it will work with stateless ones, such as map and filter.
And we can test it with a transducer that keeps odd elements in the input collection and squares them:
(defn square [x] (* x x))
(to-hash (range 10) (comp (filter odd?) (map square)))
;; => {1 1, 3 9, 5 25, 7 49, 9 81}
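As a small experiment on the stateful case, here is the same to-hash run with partition-all, which buffers inputs. The definitions are repeated so the snippet stands alone; the choice of partition-all is mine, not the answer's.

```clojure
(def ^:dynamic assoc-k nil)

(defn assoc-step [dst x]
  (assoc dst assoc-k x))

(defn to-hash [coll xform]
  (let [step (xform (completing assoc-step))
        step2 (fn [dst x] (binding [assoc-k x] (step dst x)))]
    (reduce step2 {} coll)))

;; Each emitted pair is keyed by the input that triggered the emission,
;; and the trailing [4] is lost because plain reduce never calls the
;; completion arity that would flush the buffer.
(to-hash (range 5) (partition-all 2))
;; => {1 [0 1], 3 [2 3]}
```

So with a buffering transducer the keys are not the elements that produced the values in any meaningful sense, and pending state can be dropped. Whether that is acceptable depends on what the map is for.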

Good way in clojure to map function on multiple items of coll or sequence

I'm currently learning Clojure, and I'm trying to learn how to do things the best way. Today I'm looking at the basic concept of doing things on a sequence, I know the basics of map, filter and reduce. Now I want to try to do a thing to pairs of elements in a sequence, and I found two ways of doing it. The function I apply is println. The output is simply 12 34 56 7
(def xs [1 2 3 4 5 6 7])
(defn work_on_pairs [xs]
  (loop [data xs]
    (if (empty? data)
      data
      (do
        (println (str (first data) (second data)))
        (recur (drop 2 data))))))
(work_on_pairs xs)
I mean, I could do it like this:
(map println (zipmap (take-nth 2 xs) (take-nth 2 (drop 1 xs))))
;; prints [1 2] [3 4] [5 6], and we lose the last element because of the zip.
But it is not really nice. My background is in Python, where I could just say zip(xs[::2], xs[1::2]). But I guess this is not the Clojure way to do it.
So I'm looking for suggestions on how to do this same thing, in the best Clojure way.
I realize I'm so new to Clojure I don't even know what this kind of operation is called.
Thanks for any input

This can be done with partition-all:
(def xs [1 2 3 4 5 6 7])
(->> xs
     (partition-all 2)         ; Gives ((1 2) (3 4) (5 6) (7))
     (map (partial apply str)) ; or use (map #(apply str %))
     (apply println))
12 34 56 7
The map line is just to join the pairs so the "()" don't end up in the output.
If you want each pair printed on its own line, change (apply println) to (run! println). Your expected output seems to disagree with your code, so that's unclear.
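For comparison, a short sketch of why partition-all rather than partition is used here, plus the run! variant with its output captured so it can be inspected. The names printed and lines are only for illustration.

```clojure
(require '[clojure.string :as str])

(def xs [1 2 3 4 5 6 7])

;; partition drops an incomplete trailing group, partition-all keeps it.
(partition 2 xs)      ; ((1 2) (3 4) (5 6))
(partition-all 2 xs)  ; ((1 2) (3 4) (5 6) (7))

;; One pair per line, captured as a string instead of going to the console.
(def printed
  (with-out-str
    (->> xs
         (partition-all 2)
         (map #(apply str %))
         (run! println))))

(def lines (str/split-lines printed))
;; ["12" "34" "56" "7"]
```

run! is meant for side effects and returns nil, so unlike map it does not leave an unrealized lazy seq of nils behind.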

If you want to dip into transducers, you can do something similar to the threading (->>) form of the accepted answer, but in a single pass over the data.
Assuming
(def xs [1 2 3 4 5 6 7])
has been evaluated already,
(transduce
  (comp
    (partition-all 2)
    (map #(apply str %)))
  conj
  []
  xs)
should give you the same output if you wrap it in
(apply println ...)
We supply conj (reducing fn) and [] (initial data structure) to specify how the reduce process inside transduce should build up the result.
I wouldn't use a transducer for a list that small, or a process that simple, but it's good to know what's possible!
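When the reducing function is conj and the initial value is an empty collection, into with a transducer argument is a common shorthand for the same thing. A sketch, with xf, via-transduce and via-into as illustrative names:

```clojure
(def xs [1 2 3 4 5 6 7])

;; The composed transducer can be named and reused.
(def xf (comp (partition-all 2)
              (map #(apply str %))))

(def via-transduce (transduce xf conj [] xs))
(def via-into (into [] xf xs))
;; both: ["12" "34" "56" "7"]
```

Both forms run the completion step, so the trailing single element 7 is flushed out of partition-all's buffer.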

update or assoc a list rather than a vector

Updating a vector works fine:
(update [{:idx :a} {:idx :b}] 1 (fn [_] {:idx "Hi"}))
;; => [{:idx :a} {:idx "Hi"}]
However trying to do the same thing with a list does not work:
(update '({:idx :a} {:idx :b}) 1 (fn [_] {:idx "Hi"}))
;; => ClassCastException clojure.lang.PersistentList cannot be cast to clojure.lang.Associative clojure.lang.RT.assoc (RT.java:807)
Exactly the same problem exists for assoc.
I would like to do update and overwrite operations on lazy types rather than vectors. What is the underlying issue here, and is there a way I can get around it?

The underlying issue is that the update function works on associative structures, i.e. vectors and maps. A list is not associative: it has no notion of looking up a value by key or index.
user=> (associative? [])
true
user=> (associative? {})
true
user=> (associative? '())
false
update uses get behind the scenes to do its random access work.
I would like to do update and overwrite operations on lazy types
rather than vectors
It's not clear what you want to achieve here. You're correct that vectors aren't lazy, but if you wish to do random access operations on a collection then vectors are ideal for this scenario and lists aren't.
and is there a way I can get around it?
Yes, but you still wouldn't be able to use the update function, and it doesn't look like there would be any benefit in doing so, in your case.
With a list you'd have to walk the list in order to access an index somewhere in the list - so in many cases you'd have to realise a great deal of the sequence even if it was lazy.
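If the data is finite and small enough to realize, one option in the spirit of the above is to pour the list into a vector first and update that. A sketch, with items and updated as illustrative names:

```clojure
(def items '({:idx :a} {:idx :b}))

(associative? items)        ; false
(associative? (vec items))  ; true

;; vec copies the list into a vector, after which assoc/update work by index.
(def updated (assoc (vec items) 1 {:idx "Hi"}))
;; [{:idx :a} {:idx "Hi"}]
```

The conversion walks the whole list once, so this only makes sense when full realization is acceptable.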

You can define your own function, using take and drop:
(defn lupdate [list n function]
  (let [[head & tail] (drop n list)]
    (concat (take n list)
            (cons (function head) tail))))
user=> (lupdate '(a b c d e f g h) 4 str)
(a b c d "e" f g h)
With lazy sequences, that means that you will compute the first n values (but not the remaining ones, which after all is an important part of why we use lazy sequences). You also have to take into account space and time complexity (concat, etc.). But if you truly need to operate on lazy sequences, that's the way to go.
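To illustrate the point about not computing the remaining values, here is lupdate applied to an infinite sequence. The definition is repeated so the snippet stands alone.

```clojure
(defn lupdate [list n function]
  (let [[head & tail] (drop n list)]
    (concat (take n list)
            (cons (function head) tail))))

;; (range) never ends, yet only the prefix we ask for is realized.
(take 6 (lupdate (range) 4 str))
;; => (0 1 2 3 "4" 5)
```

This would hang with any approach that first converts the input to a vector.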

Looking behind your question to the problem you are trying to solve:
You can use Clojure's sequence functions to construct a simple solution:
(defn elf [n]
  (loop [population (range 1 (inc n))]
    (if (<= (count population) 1)
      (first population)
      (let [survivors (->> population
                           (take-nth 2)
                           ((if (-> population count odd?) rest identity)))]
        (recur survivors)))))
For example,
(map (juxt identity elf) (range 1 8))
;([1 1] [2 1] [3 3] [4 1] [5 3] [6 5] [7 7])
This has complexity O(n). You can speed up count by passing the population count as a redundant argument in the loop, or by dumping the population and survivors into vectors. The sequence functions - take-nth and rest - are quite capable of doing the weeding.
I hope I got it right!
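A sketch of the first suggested speed-up, carrying the population count through the loop instead of calling count each round. The name elf-counted is hypothetical. It relies on the observation that each round leaves (quot cnt 2) survivors, whether cnt is odd or even.

```clojure
(defn elf-counted [n]
  (loop [population (range 1 (inc n))
         cnt n]
    (if (<= cnt 1)
      (first population)
      (let [survivors (cond-> (take-nth 2 population)
                        (odd? cnt) rest)]
        ;; Even cnt: take-nth keeps cnt/2. Odd cnt: it keeps (cnt+1)/2,
        ;; and rest removes one more, leaving (cnt-1)/2.
        (recur survivors (quot cnt 2))))))

(map (juxt identity elf-counted) (range 1 8))
;; => ([1 1] [2 1] [3 3] [4 1] [5 3] [6 5] [7 7])
```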

Transforming list of hashmaps into set

Having two maps:
(def a {:a 1 :b 2 :c 3})
(def b {:b 222 :d 4})
placed into one vector:
(def l [a b])
what's the easiest way to construct a single map (a set in terms of unique keys) where, in case of a key conflict (:b in this case), the left-hand operand takes priority (:b 2 in this case)? In other words, I'd like to get this result:
{:a 1 :b 2 :c 3 :d 4}
Two solutions which came to mind are:
(apply merge-with (fn [left _] left) l)
(reduce conj (reverse l))
The first one doesn't seem idiomatic to me; the second one worries me because of the eager list reversal, which sounds a bit inefficient. Any other ideas?

Numerous other possibilities of which (reduce #(into %2 %1) l) (or with merge instead of into) could be considered. Your merge-with solution is absolutely fine.
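A quick check that the reduce/into form and the merge-with form agree on the example data. The names by-into and by-merge-with are only for illustration.

```clojure
(def a {:a 1 :b 2 :c 3})
(def b {:b 222 :d 4})
(def l [a b])

;; (into %2 %1) pours the accumulated (left) map into the next map,
;; so entries from the left overwrite conflicting ones on the right.
(def by-into (reduce #(into %2 %1) l))

;; merge-with only calls the function on conflicts; it keeps the left value.
(def by-merge-with (apply merge-with (fn [left _] left) l))
;; both: {:a 1, :b 2, :c 3, :d 4}
```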

How about
(apply merge (reverse l))
It seems fine and is similar to the second one.

Clojure 2d list to hash-map

I have an infinite list like that:
((1 1)(3 9)(5 17)...)
I would like to make a hash map out of it:
{:1 1 :3 9 :5 17 ...}
Basically 1st element of the 'inner' list would be a keyword, while second element a value. I am not sure if it would not be easier to do it at creation time, to create the list I use:
(iterate (fn [[a b]] [(computation for a) (computation for b)]) [1 1])
Computation of (b) requires (a) so I believe at this point (a) could not be a keyword... The whole point of that is so one can easily access a value (b) given (a).
Any ideas would be greatly appreciated...
--EDIT--
Ok so I figured it out:
(def my-map (into {} (map #(hash-map (keyword (str (first %))) (first (rest %))) my-list)))
The problem is: it does not seem to be lazy... it just goes forever even though I haven't consumed it. Is there a way to force it to be lazy?

The problem is that hash-maps can be neither infinite nor lazy. They are designed for fast key-value access. So, if you have a hash-map you'll be able to perform fast key look-up. Key-value access is the core idea of hash-maps, but it makes the creation of a lazy infinite hash-map impossible.
Suppose we have a finite 2d list; then you can just use into to create a hash-map:
(into {} (vec (map vec my-list)))
But there is no way to make this hash-map infinite. So, the only solution for you is to create your own hash-map, like Chouser suggested. In this case you'll have an infinite 2d sequence and a function to perform lazy key lookup in it.
Actually, his solution can be slightly improved:
(def my-map (atom {}))
(def my-seq (atom (partition 2 (range))))
(defn build-map [stop]
  (when-let [[k v] (first @my-seq)]
    (swap! my-seq rest)
    (swap! my-map #(assoc % k v))
    (if (= k stop)
      v
      (recur stop))))
(defn get-val [k]
  (if-let [v (@my-map k)]
    v
    (build-map k)))
my-map in my example stores the current hash-map and my-seq stores the sequence of not yet processed elements. get-val function performs a lazy look-up, using already processed elements in my-map to improve its performance:
(get-val 4)
=> 5
@my-map
=> {4 5, 2 3, 0 1}
And a speed-up:
(time (get-val 1000))
=> Elapsed time: 7.592444 msecs
(time (get-val 1000))
=> Elapsed time: 0.048192 msecs

In order to be lazy, the computer will have to do a linear scan of the input sequence each time a key is requested, at the very least if the key is beyond what has been scanned so far. A naive solution is just to scan the sequence every time, like this:
(defn get-val [coll k]
  (some (fn [[a b]] (when (= k a) b)) coll))
(get-val '((1 1) (3 9) (5 17)) 3)
;=> 9
A slightly less naive solution would be to use memoize to cache the results of get-val, though this would still scan the input sequence more than strictly necessary. A more aggressively caching solution would be to use an atom (as memoize does internally) to cache each pair as it is seen, thereby only consuming more of the input sequence when a lookup requires something not yet seen.
Regardless, I would not recommend wrapping this up in a hash-map API, as that would imply efficient immutable "updates" that would likely not be needed and yet would be difficult to implement. I would also generally not recommend keywordizing the keys.
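A minimal sketch of the memoize variant mentioned above. The pairs sequence (each number paired with its square) and the name lookup are made up for the example; they are not the asker's data.

```clojure
(defn get-val [coll k]
  (some (fn [[a b]] (when (= k a) b)) coll))

;; Hypothetical infinite input: [i, i squared] for i = 0, 1, 2, ...
(def pairs (map (fn [i] [i (* i i)]) (range)))

;; memoize caches by argument, so a repeated key skips the scan entirely.
(def lookup (memoize (partial get-val pairs)))

(lookup 5)  ; scans the first six pairs => 25
(lookup 5)  ; answered from the cache  => 25
```

Two caveats: a key that never occurs makes the scan run forever on an infinite input, and holding pairs in a def keeps every realized element in memory.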

If you flatten it down to a list of (k v k v k v k v) with flatten, then you can use apply to call hash-map with that list as its arguments, which will get you the map you seek.
user> (apply hash-map (flatten '((1 1)(3 9)(5 17))))
{1 1, 3 9, 5 17}
though it does not keywordize the first argument.
At least in Clojure, the last value associated with a key is said to be the value for that key. If this were not the case, you couldn't produce a new map with a different value for a key that is already in the map, because the first (and now shadowed) value would be returned by the lookup function. If the lookup function searches to the end, then it is not lazy. You can solve this by writing your own map implementation that uses association lists, though it would lack the performance guarantees of Clojure's trie-based maps because it would devolve to linear time in the worst case.
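The last-value-wins rule, and how a lazy first-match scan disagrees with it, can be seen on a tiny example. The data here is invented purely to contain a duplicate key.

```clojure
;; Hypothetical pairs in which key 1 appears twice.
(def pairs [[1 :old] [3 :x] [1 :new]])

;; Building a real map: the later entry for key 1 replaces the earlier one.
(def as-map (into {} pairs))
;; {1 :new, 3 :x}

;; A scan that stops at the first match returns the earlier value instead.
(def first-match
  (some (fn [[k v]] (when (= k 1) v)) pairs))
;; :old
```

So a lookup can be lazy or agree with map semantics on duplicate keys, but not both.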
I'm not sure keeping the input sequence lazy will have the desired results.

To make a hashmap from your sequence you could try:
(defn to-map [s] (zipmap (map (comp keyword str first) s) (map second s)))
=> (to-map '((1 1)(3 9)(5 17)))
=> {:5 17, :3 9, :1 1}

You can convert that structure to a hash-map later this way
(def it #(iterate (fn [[a b]] [(+ a 1) (+ b 1)]) [1 1]))
(apply hash-map (apply concat (take 3 (it))))
=> {1 1, 2 2, 3 3}
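If the keyword keys from the question are wanted, the same take-a-finite-prefix idea works with into and a mapping transducer. In this sketch the step function (+ a 2) / (+ b 8) is only an assumption chosen to reproduce the sample values (1 1) (3 9) (5 17); the real computation is not given in the question.

```clojure
;; Hypothetical generator matching the sample data from the question.
(def pairs (iterate (fn [[a b]] [(+ a 2) (+ b 8)]) [1 1]))

;; Take a finite prefix, then turn each [k v] into [:k v] while building the map.
(def m
  (into {}
        (map (fn [[k v]] [(keyword (str k)) v]))
        (take 3 pairs)))
;; {:1 1, :3 9, :5 17}

(:3 m)  ; => 9
```

The take is what makes this terminate; without it, into would try to consume the infinite sequence.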
