Modifying nested data in Clojure

How can I manipulate a nested data structure?
I have a list of this kind
[["first_string" {:one 1 :two 2}]
["second_string" {:three 3 :four 4}]
["third_string" {:five 5 :six 6}]
["fourth_string" {:seven 7 :eight 8}]]
And I need to change it to this form:
[["first_string" 1]
["second_string" 3]
["third_string" 5]
["fourth_string" 7]]
Essentially, I want to keep the string in each inner vector and replace the map with the value of its first key.

Try defining a function that operates on a single entry in the vector and then map over it:
(defn manipulate-nested
  [entry]
  [(first entry) (last (first (last entry)))])
(let [input [["first_string" {:one 1 :two 2}]
             ["second_string" {:three 3 :four 4}]
             ["third_string" {:five 5 :six 6}]
             ["fourth_string" {:seven 7 :eight 8}]]]
  (into [] (map manipulate-nested input)))
;; [["first_string" 1]
;; ["second_string" 3]
;; ["third_string" 5]
;; ["fourth_string" 7]]
NB: Regarding "I need to change it to this form": strictly speaking, you're not changing (mutating) the original vector, but describing a modified copy of it.

You can't get a reliable first key of a hash-map, because hash-maps are unordered: calling seq on one gives no guarantee about the order of its entries. So there is no first or second.
There is no way of numerically ordering the keywords :one, :two, :three without parsing their names, which I leave as a separate problem.
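To illustrate the ordering point, here is a minimal sketch using sorted-map (the name sorted-example is only for illustration and is not from the question). A sorted map does have a well-defined first entry, but keywords are ordered by name, not by the number they spell out:

```clojure
;; A sorted map has a well-defined first entry, but keywords compare by
;; name, so :four sorts before :three even though 3 < 4.
(def sorted-example (sorted-map :three 3 :four 4))

(first sorted-example)        ;; => [:four 4]
(val (first sorted-example))  ;; => 4
```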
Here is your problem reposed with ordered structures in place of the hash-maps:
(def data [["first_string" [[:one 1] [:two 2]]]
           ["second_string" [[:three 3] [:four 4]]]
           ["third_string" [[:five 5] [:six 6]]]
           ["fourth_string" [[:seven 7] [:eight 8]]]])
One typical and idiomatic solution is to process each vector in data independently via map, using destructuring in the transformation function's binding vector to bind the desired nested elements, and returning the extracted values in a new vector:
(map (fn [[s [[_ n] _]]] [s n]) data)
The input data structure of your specific problem also offers a more concise way: update the second element of the passed vector instead of constructing a new vector by hand in each step:
(map #(update % 1 (comp second first)) data)
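If the keys you want are known up front, one way to sidestep the ordering question with the original hash-maps is to look each value up explicitly. A minimal sketch, assuming a hypothetical wanted-keys vector that names the key to pull from each entry (this vector is not part of the original question):

```clojure
(def input [["first_string"  {:one 1 :two 2}]
            ["second_string" {:three 3 :four 4}]
            ["third_string"  {:five 5 :six 6}]
            ["fourth_string" {:seven 7 :eight 8}]])

;; Hypothetical: the key to extract from each entry, in entry order.
(def wanted-keys [:one :three :five :seven])

;; mapv walks both collections in step and returns a vector.
(def result
  (mapv (fn [[s m] k] [s (get m k)]) input wanted-keys))

result
;; => [["first_string" 1] ["second_string" 3] ["third_string" 5] ["fourth_string" 7]]
```

Because the lookup is by key, the result does not depend on the order in which a map happens to list its entries.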

Related

Clojure list comprehension - run inner loop only once in certain circumstance?

Is there a way to use a single list comp to achieve the following or do I need some other method?
(def args '([[:db/foo 1 2] [:db/bar 3 4]] {:a 1 :b 2}))
(defn myfunc []
"Want to run inner loop only once if arg is a map."
(for [arg args [cmd & params] arg] ; 'cmd & params' are meaningless if map.
(if (map? arg)
arg
(into [cmd] params))))
Above code produces
=> [[:db/foo 1 2] [:db/bar 3 4] {:a 1, :b 2} {:a 1, :b 2}]
But I actually want
=> [[:db/foo 1 2] [:db/bar 3 4] {:a 1, :b 2}]
Obviously this isn't the full function, I'm doing transforms if the arg is in vector form but want to let map form pass straight through (without being duplicated).
Update: I've found that a list comprehension has its roots in set builder notation, whereas 'conditionally executing the inner loop' is actually an imperative notion, so it's not surprising this isn't easy to express with a single list comprehension, which is a totally fp construct.
Is this what you want to do?:
Build a new sequence from an existing sequence. For every element in the input sequence, there are two possible cases:
It is a map: In that case, just append it to the result sequence
Otherwise it is a sequence and its elements should possibly be transformed and appended to the result sequence
If this is what you want to do, this computation can conveniently be expressed by first mapping the elements of the input sequence to subsequences and then concatenating those sequences to form the resulting sequence.
As suggested by @cfrick, there is mapcat, which combines mapping with concatenation, either by passing a sequence directly to mapcat
(mapcat (fn [x] (if (map? x) [x] x)) args)
or by using mapcat as a transducer
(into [] (mapcat (fn [x] (if (map? x) [x] x))) args)
I don't think for is suitable for expressing this computation.
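The question mentions transforming the vector-form arguments while letting maps pass straight through. A rough sketch of how that could fit into the mapcat approach; transform-command and expand-arg are hypothetical names, and the transform is just the (into [cmd] params) placeholder from the question:

```clojure
(def args '([[:db/foo 1 2] [:db/bar 3 4]] {:a 1 :b 2}))

;; Hypothetical placeholder for the per-command transform.
(defn transform-command [[cmd & params]]
  (into [cmd] params))

;; A map is wrapped in a one-element vector so mapcat emits it exactly once;
;; a vector of commands is transformed element by element.
(defn expand-arg [arg]
  (if (map? arg)
    [arg]
    (map transform-command arg)))

(mapcat expand-arg args)
;; => ([:db/foo 1 2] [:db/bar 3 4] {:a 1, :b 2})
```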
Before I propose a solution, I have a few remarks on the function myfunc that you provided:
The doc string is misplaced. I know it's weird, but the doc string must come before the arguments of a function, so before [].
Your use of for does not do what you expect, I think. Have a look at the examples in its documentation. for is mainly good for constructing a Cartesian product of lists.
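Both remarks can be checked at the REPL. The following is a small sketch with made-up function names (greet and greet-misplaced are not from the question):

```clojure
;; Doc string before the argument vector: stored as :doc metadata on the var.
(defn greet
  "Returns a greeting for name."
  [name]
  (str "Hello, " name))

;; String after the argument vector: just an expression in the body whose
;; value is discarded, so no doc string is recorded.
(defn greet-misplaced
  [name]
  "Returns a greeting for name."
  (str "Hello, " name))

(:doc (meta #'greet))            ;; => "Returns a greeting for name."
(:doc (meta #'greet-misplaced))  ;; => nil

;; Two bindings in for nest like loops, producing every combination.
(for [x [1 2] y [:a :b]] [x y])
;; => ([1 :a] [1 :b] [2 :a] [2 :b])
```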
I hope I understood correctly what you want to achieve. In my proposal, I use recursion via loop to build a vector that contains all the arguments:
(defn myfunc "Want to run inner loop only once if arg is a map."
  [& args]
  (loop [arg-to-parse args
         res []]
    (if (empty? arg-to-parse)
      res
      (let [arg (first arg-to-parse)
            new-arg-to-parse (rest arg-to-parse)]
        (if (map? arg)
          (recur new-arg-to-parse (conj res arg))
          (recur new-arg-to-parse (into res arg)))))))
(myfunc [[:db/foo 1 2] [:db/bar 3 4]] {:a 1 :b 2})
;; => [[:db/foo 1 2] [:db/bar 3 4] {:a 1, :b 2}]
(myfunc {:a 1 :b 2} [[:db/foo 1 2] [:db/bar 3 4]])
;; => [{:a 1, :b 2} [:db/foo 1 2] [:db/bar 3 4]]

Return a sequence with the elements not in common to two original sequences by using clojure

I have two sequences, which can be vector or list. Now I want to return a sequence whose elements are not in common to the two sequences.
Here is an example:
(removedupl [1 2 3 4] [2 4 5 6]) = [1 3 5 6]
(removedupl [] [1 2 3 4]) = [1 2 3 4]
I am pretty puzzled now. This is my code:
(defn remove-dupl [seq1 seq2]
(loop [a seq1 b seq2]
(if (not= (first a) (first b))
(recur a (rest b)))))
But I don't know what to do next.
I encourage you to think about this problem in terms of set operations:
(require 'clojure.set)
(defn extrasection [& ss]
  (clojure.set/difference
    (apply clojure.set/union ss)
    (apply clojure.set/intersection ss)))
Such a formulation assumes that the inputs are sets.
(extrasection #{1 2 3 4} #{2 4 5 6})
=> #{1 6 3 5}
Lists, sequences, and vectors are easily converted to sets by calling the set function on them.
Even if you prefer to stick with a sequence-oriented solution, keep in mind that comparing every element of one sequence against the other is an O(n*n) task [unless they are sorted]. Sets can be constructed in one pass, and lookup is very fast: with hash sets, checking for duplicates is close to O(n) overall, and O(n log n) with sorted sets.
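As a sketch of the set-based approach applied directly to vectors or lists, the inputs can be converted inside the function. symmetric-difference is a name chosen here for illustration; the result is a set, so element order is not preserved:

```clojure
(require '[clojure.set :as set])

;; Elements that occur in exactly one of the two collections.
(defn symmetric-difference [xs ys]
  (let [a (set xs)
        b (set ys)]
    (set/union (set/difference a b)
               (set/difference b a))))

(symmetric-difference [1 2 3 4] [2 4 5 6])  ;; a set containing 1, 3, 5 and 6
(symmetric-difference [] '(1 2 3 4))        ;; a set containing 1, 2, 3 and 4
```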
I'm still new to Clojure but I think the functional mindset is more into composing functions than actually doing it "by hand", so I propose the following solution:
(defn remove-dupl [seq1 seq2]
(concat
(remove #(some #{%} seq1) seq2)
(remove #(some #{%} seq2) seq1)))
EDIT: I think it is better if we define that remove part as a local function and reuse it:
(defn remove-dupl [seq1 seq2]
(let [removing (fn [x y] (remove #(some #{%} x) y))]
(concat (removing seq1 seq2) (removing seq2 seq1))))
EDIT2: As commented by TimothyPratley, a set can be used directly as the predicate:
(defn remove-dupl [seq1 seq2]
(let [removing (fn [x y] (remove (set x) y))]
(concat (removing seq1 seq2) (removing seq2 seq1))))
There are several problems with your code.
It doesn't test for the end of either sequence argument.
It steps through b but not a.
It implicitly returns nil when any two sequences have the same first element.
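For comparison, here is one possible way to keep the loop/recur shape of the question while addressing those three points. It is only a sketch (remove-dupl-loop is a hypothetical name), and it leans on sets for the membership test rather than stepping through both sequences in parallel:

```clojure
(defn remove-dupl-loop [seq1 seq2]
  (let [in-1 (set seq1)
        in-2 (set seq2)]
    (loop [remaining (concat seq1 seq2)
           result    []]
      (if (empty? remaining)            ; explicit end-of-sequence test
        result                          ; always returns the accumulated vector
        (let [x (first remaining)]
          (recur (rest remaining)       ; every step advances the input
                 (if (and (contains? in-1 x) (contains? in-2 x))
                   result               ; common element: skip it
                   (conj result x))))))))

(remove-dupl-loop [1 2 3 4] [2 4 5 6])  ;; => [1 3 5 6]
(remove-dupl-loop [] [1 2 3 4])         ;; => [1 2 3 4]
```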
You want to remove the common elements from the concatenated sequences. You have to work out the common elements first, otherwise you don't know what to remove. So ...
We use
1. clojure.set/intersection to find the common elements,
2. concat to stitch the collections together,
3. remove to remove (1) from (2),
4. vec to convert to a vector.
Thus
(require 'clojure.set)
(defn removedupl [coll1 coll2]
  (let [common (clojure.set/intersection (set coll1) (set coll2))]
    (vec (remove common (concat coll1 coll2)))))
... which gives
(removedupl [1 2 3 4] [2 4 5 6]) ; [1 3 5 6]
(removedupl [] [1 2 3 4]) ; [1 2 3 4]
... as required.

How to convert map to a sequence?

Answers to this question explain how to convert maps, sequences, etc. to various sequences and collections, but do not say how to convert a map to a sequence of alternating keys and values. Here is one way:
(apply concat {:a 1 :b 2})
=> (:b 2 :a 1)
Some alternatives that one might naively think would produce the same result, don't, including passing the map to vec, vector, seq, sequence, into [], into (), and flatten. (Sometimes it's easier to try it than to think it through.)
Is there anything simpler than apply concat?
You can also do
(mapcat identity {:a 1 :b 2})
or
(mapcat seq {:a 1 :b 2})
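Another option, if a vector is acceptable, is the cat transducer (available since Clojure 1.7). A small sketch; the names m and flat are only for illustration, and no particular key order is assumed:

```clojure
(def m {:a 1 :b 2})

;; cat is the transducer counterpart of concat: it splices each
;; [key value] entry into the result.
(def flat (into [] cat m))

;; Keys and values alternate, so rebuilding the map recovers the original.
(= m (apply hash-map flat))   ;; => true
(= flat (mapcat identity m))  ;; => true

;; Nested collections survive intact, because only one level is spliced.
(into [] cat {[1 2] [3 4]})   ;; => [[1 2] [3 4]]
```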
As @noisesmith gently hints below, the following answer is seductive but wrong: left as a warning to other unwary souls! Counterexample:
((comp flatten seq) {[1 2] [3 4], 5 [6 7]})
; (1 2 3 4 5 6 7)
The original answer: (comp flatten seq) seems to do the job:
((comp flatten seq) {1 2, 3 4})
; (1 2 3 4)
But flatten on its own doesn't:
(flatten {1 2, 3 4})
; ()
I'm surprised it doesn't work, and in that case it should return nil, not ().
None of the others you mention (vec, vector, ...) does anything to the individual [key value] pairs that a map yields when treated as a sequence.

Split a vector into vector of vectors in clojure instead of vector of lists

The Clojure documentation of split-at states that it takes a collection of elements and returns a vector of two lists, containing the elements before the given index and the elements from that index on:
(split-at 2 [1 2 3 4 5])
[(1 2) (3 4 5)]
What I want is this:
(split-at' 2 [1 2 3 4 5])
[[1 2] [3 4 5]]
This is a collection cut into two collections that keep the order of the elements (like vectors), preferably without performance penalties.
What is the usual way to do this and are there any performance optimized ways to do it?
If you're working exclusively with vectors, one option would be to use subvec.
(defn split-at' [idx v]
[(subvec v 0 idx) (subvec v idx)])
(split-at' 2 [1 2 3 4 5])
;; => [[1 2] [3 4 5]]
Regarding performance, the docs on subvec state:
This operation is O(1) and very fast, as the resulting vector shares structure with the original and no trimming is done.
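One difference from split-at worth knowing: subvec throws an IndexOutOfBoundsException when the index lies outside the vector, whereas split-at simply returns an empty part. A minimal sketch of a variant that clamps the index first (split-at-clamped is a hypothetical name, and it still assumes the input is a vector):

```clojure
(defn split-at-clamped [idx v]
  ;; Clamp idx into the range 0..(count v) so subvec never throws.
  (let [i (-> idx (max 0) (min (count v)))]
    [(subvec v 0 i) (subvec v i)]))

(split-at-clamped 2 [1 2 3 4 5])  ;; => [[1 2] [3 4 5]]
(split-at-clamped 10 [1 2 3])     ;; => [[1 2 3] []]
(split-at-clamped -1 [1 2 3])     ;; => [[] [1 2 3]]
```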
Why not wrap the results of the core function with vec?
So based on split-at definition:
(defn split-at
  "Returns a vector of [(take n coll) (drop n coll)]"
  {:added "1.0"
   :static true}
  [n coll]
  [(take n coll) (drop n coll)])
We can apply vec to each element of the resulting vector:
(defn split-at-vec
[n coll]
[(vec (take n coll)) (vec (drop n coll))])
Regarding "performance penalties": I think that when you convert the lazy seqs into vectors, you lose the benefits of laziness.
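The same idea can be written a little more compactly by mapping vec over the result of split-at. A sketch, with split-at-mapv as a made-up name; like the version above, it realizes both parts eagerly, but it also accepts lists and lazy sequences:

```clojure
(defn split-at-mapv [n coll]
  ;; mapv returns a vector, and vec turns each half into a vector.
  (mapv vec (split-at n coll)))

(split-at-mapv 2 [1 2 3 4 5])  ;; => [[1 2] [3 4 5]]
(split-at-mapv 2 (range 5))    ;; => [[0 1] [2 3 4]]
```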

Convert map of lists into list of maps (i.e. rows to columns)

I have the following data structure in Clojure
{:a [1 2 3]
:b [4 5 6]
:c [7 8 9]}
And I'd like to convert it into something like
[{:a 1 :b 4 :c 7}
{:a 2 :b 5 :c 8}
{:a 3 :b 6 :c 9}]
At the moment I'm kinda stumped as to how to do this.
In Clojure you can never rely on the order of keys in maps after transformations. They're indexed by key, not by position.
Vectors, however, are ordered. With get-in you can look up a position using a vector of coordinates:
=> (def mat
[[1 2 3]
[4 5 6]
[7 8 9]])
=> (defn transpose
[m]
(apply mapv vector m))
=> (get-in (transpose mat) [1 2])
8
Got it:
(defn transpose-lists [x]
(map (fn [m] (zipmap (keys x) m)) (apply map vector (vals x))))
Unfortunately it doesn't preserve order of the keys.
If anyone has a better solution then of course I'd like to hear it!
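One possible variation, offered only as a sketch: pass the keys explicitly so that the pairing of keys with values never depends on the map's iteration order. columns->rows and its ks parameter are hypothetical names, and ks is assumed to be non-empty with every key present in the map:

```clojure
;; ks: the column keys to use, in a caller-chosen order (hypothetical parameter).
(defn columns->rows [ks m]
  (apply mapv
         (fn [& vs] (zipmap ks vs))  ; pair each key with the value from its own column
         (map m ks)))                ; look columns up by key, in ks order

(columns->rows [:a :b :c]
               {:a [1 2 3]
                :b [4 5 6]
                :c [7 8 9]})
;; => [{:a 1, :b 4, :c 7} {:a 2, :b 5, :c 8} {:a 3, :b 6, :c 9}]
```

The rows are still ordinary maps, so if the iteration order of keys within a row matters, something like (into (sorted-map) row) could be applied to each row.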