How to partially flatten a list in Clojure? - clojure

Let's say I have a data structure like so:
[[1 2 3] [4 5 6] [[7 8 9] [10 11 12]]]
And what I want to end up with is:
[[1 2 3] [4 5 6] [7 8 9] [10 11 12]]
Is there any function that does this automatically?
Basically I'm converting a SQL result set to CSV, and some rows transform into 2 rows in the CSV. So my map function normally returns a vector, but sometimes returns a vector of vectors. clojure.data.csv needs a list of vectors only, so I need to flatten out the rows that got pivoted.

Mapcat is useful for mapping where each element can expand into 0 or more output elements, like this:
(mapcat #(if (vector? (first %)) % [%]) data)
Though I'm not sure if (vector? (first %)) is a sufficient test for your data.
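If the first-element check feels too loose, a slightly stricter test is possible. The following is only a sketch: the helper name expand-rows is hypothetical, and it assumes a "pivoted" row is a non-empty collection whose elements are all sequential.

```clojure
(def data [[1 2 3] [4 5 6] [[7 8 9] [10 11 12]]])

(defn expand-rows
  "Hypothetical helper: splices rows that are themselves collections of rows."
  [rows]
  (mapcat (fn [row]
            (if (and (seq row) (every? sequential? row))
              row      ; a vector of rows: splice its rows into the output
              [row]))  ; a single row: wrap it so mapcat emits it unchanged
          rows))

(expand-rows data)
;=> ([1 2 3] [4 5 6] [7 8 9] [10 11 12])
```

The result is a lazy sequence of vectors; wrap the call in vec if a vector of vectors is needed.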

A different approach using tree-seq:
(def a [[1 2 3] [4 5 6] [[7 8 9] [10 11 12]]])
(filter (comp not vector? first)
        (tree-seq (comp vector? first) seq a))
I am stretching to use tree-seq here. Would someone with more experience care to comment on whether there is a better way to return only the children than what is effectively a filter on (not branch?)?

Clojure: Semi-Flattening a nested Sequence answers your question, but I don't want to mark this question as a duplicate of that one, since you're really asking a different question than he was; I wonder if it's possible to move his answer over here.

Related

Apply reduce for each element of seq

I have collection of lists and I want to apply "reduce +" for each list in collection. I think I should combine "apply", "map" and "reduce +", but I can't understand how.
Example:
[[1 2 3] [4 5 3] [2 5 1]] => [6 12 8]
No need for apply. map and reduce will work fine:
(map (partial reduce +) [[1 2 3] [4 5 3] [2 5 1]])
map calls the function on each member of the list, and partial simply creates a partially applied ('curried') version of reduce that expects one more argument. It could also be written as #(reduce + %) or (fn [lst] (reduce + lst)).
Update
You could actually use apply in place of reduce here as well (just not both):
(map (partial apply +) [[1 2 3] [4 5 3] [2 5 1]])
Further Update
If you have any performance concerns, see the comments on this answer for some great tips by @AlexMiller.
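The expected result in the question is written as a vector, while map returns a lazy sequence. A minimal sketch, assuming a vector is actually wanted, uses mapv instead (the name row-sums is hypothetical):

```clojure
(defn row-sums
  "Hypothetical helper: sums each inner collection and returns a vector."
  [rows]
  (mapv #(reduce + %) rows))

(row-sums [[1 2 3] [4 5 3] [2 5 1]])
;=> [6 12 8]
```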

Clojure: more idiomatic pairing of elements from lists of unequal sizes?

I would like to create a list of pairs from cols and patch. cols would have many more elements, so the elements of patch would be repeated in the pairing.
For example,
(element-wise-patch '(1 3 5 7 9) '(2 4) '())
([1 2] [3 4] [5 2] [7 4] [9 2])
Here is my attempt to implement these semantics. I hope to learn a more idiomatic and simpler solution.
(defn element-wise-patch [cols patch patched]
  (if (<= (count cols) (count patch))
    (concat patched (map vector cols patch))
    (let [[compatible remaining] (split-at (count patch) cols)]
      (element-wise-patch remaining patch (concat patched (map vector compatible patch))))))
I feel there might already be an existing construct for this kind of pairing. My description might also not be precise enough to find similar solutions.
Please give me some pointers, or just help me define my problem more clearly.
Thanks in advance for your help!
Quite simply:
(map vector [1 3 5 7 9] (cycle [2 4]))
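This works because cycle produces an infinite lazy repetition of patch, and map stops as soon as its shortest input (here cols) runs out. A sketch wrapping it in a function, where the name pair-with-cycle is hypothetical:

```clojure
(defn pair-with-cycle
  "Hypothetical helper: pairs each element of cols with the elements of
  patch, repeating patch as often as needed."
  [cols patch]
  (map vector cols (cycle patch)))

(pair-with-cycle [1 3 5 7 9] [2 4])
;=> ([1 2] [3 4] [5 2] [7 4] [9 2])
```

No accumulator argument is needed, since map builds the result sequence directly.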

Shadowing PersistentHashMap's 'get' function

I'm trying to come up with a data structure for exploring data that have been marked with key terms, like "systems theory" or "Internet", using some set and lattice theory concepts I like. I thought maybe I could extend the way that hash maps work. I wrote some tests for the behavior I want, and then I realized I don't really understand how to work, or to do work, with types and protocols.
Here's the idea. I want to index a collection of data by sets of strings. E.g.,
(def data {#{"systems theory" "internet"} [1 2 3]
           #{"systems theory" "biology"} [4 5]
           #{"systems theory"} [6 7 8]})
For free, I get
(data #{"systems theory"})
;=> [6 7 8]
which is good.
But it would also be slick to be able to do something like
(data "biology")
;=> { #{"systems theory"} [4 5] }
When I thought of this I figured it wouldn't be difficult to tell the get method of PersistentHashMap to act as normal unless it's being asked to use a String as a key, in which case it would do whatever is necessary to get the new data structure. But when it came time to write code I just had a mess, and I don't actually know how to design this thing.
I have my copy of Fogus's The Joy of Clojure and I'm going to read about types and protocols and extend-type and such and see if I can make sense of how and where built-in functions are defined and changed. But I would also love a hint.
I would not create a new specialized map implementation but create a simple index map from the original data:
(def data {#{"systems theory" "internet"} [1 2 3]
           #{"systems theory" "biology"} [4 5]
           #{"systems theory"} [6 7 8]})
(def cats (->> data
               (map (fn [[cats val]]
                      (->> cats
                           (map (juxt identity #(hash-map (disj cats %) val)))
                           (into {}))))
               (apply merge)))
(get cats "internet")
;=> {#{"systems theory"} [1 2 3]}
(get cats "biology")
;=> {#{"systems theory"} [4 5]}
(get cats "systems theory")
;=> {#{"biology"} [4 5]}
You could also merge both of them if you want to:
(def full-index (merge data cats))
(get full-index "internet") ;=> {#{"systems theory"} [1 2 3]}
(get full-index #{"systems theory"}) ;=> [6 7 8]
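One caveat with the index above: a term that occurs in several key sets, such as "systems theory", produces several single-entry maps, and plain merge keeps only one of them (which one depends on iteration order). A sketch of a variant that keeps all of them, using merge-with; the names cats-all and tags are hypothetical:

```clojure
(def data {#{"systems theory" "internet"} [1 2 3]
           #{"systems theory" "biology"} [4 5]
           #{"systems theory"} [6 7 8]})

(def cats-all
  (->> data
       ;; one {term {remaining-tags value}} map per term of each key set
       (mapcat (fn [[tags v]]
                 (for [t tags]
                   {t {(disj tags t) v}})))
       ;; when a term occurs more than once, merge the inner maps
       (apply merge-with merge)))

(get cats-all "biology")
;=> {#{"systems theory"} [4 5]}
(get cats-all "systems theory")
;=> {#{"internet"} [1 2 3], #{"biology"} [4 5], #{} [6 7 8]}   (entry order may differ)
```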
If you still want to create the specialized map implementation, you might want to take a look into the following:
PersistentHashMap implementation
sorted.clj: "An implementation of Clojure's sorted collections written in Clojure". For instance, see the code for PersistentTreeMap, which is used to implement sorted-map
data.avl: "Persistent sorted maps and sets with log-time rank queries"
data.priority-map: "A priority map is very similar to a sorted map, but whereas a sorted map produces a sequence of the entries sorted by key, a priority map produces the entries sorted by value." Perhaps the code is easier to understand than the others.
It won't be easy if you want to keep the semantics of a hash-map (for example count should return the sum of the original map count plus the new keys count). You might want to use collection-check to test your implementation.
What you're describing may be possible, but I think you would be better off just writing a function to filter through your list for any sets containing your search terms.
Also, consider the access patterns you are going to be using. I suspect having the strings as keys and the document ids in a set may be more efficient and more flexible.
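Both suggestions can be sketched briefly. The names entries-tagged and by-term are hypothetical, and the sketch assumes the vectors in the question's map hold document ids:

```clojure
(def data {#{"systems theory" "internet"} [1 2 3]
           #{"systems theory" "biology"} [4 5]
           #{"systems theory"} [6 7 8]})

(defn entries-tagged
  "Hypothetical helper: keeps the entries whose key set contains term."
  [m term]
  (into {} (filter (fn [[tags _]] (contains? tags term))) m))

(entries-tagged data "biology")
;=> {#{"systems theory" "biology"} [4 5]}

;; Inverted index: each string maps to the set of ids marked with it.
(def by-term
  (reduce (fn [idx [tags ids]]
            (reduce (fn [i t] (update i t (fnil into #{}) ids))
                    idx
                    tags))
          {}
          data))

(by-term "biology")
;=> #{4 5}
```

With the inverted index, documents marked with several terms can be found by intersecting the per-term sets, for example with clojure.set/intersection.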

Add two collections in clojure

How can I add two collections efficiently in Clojure?
I tried the following. Is there any method more efficient than this?
(reduce #(conj %1 %2) collection01 collection02)
It depends on what you want to achieve. If you want the result to be a collection of a specific type that contains all elements of the given collections, then into is appropriate: (into coll1 coll2) returns a collection of type (type coll1) with the elements from coll1 and coll2.
On the other hand, if you just want to iterate over many collections (i.e. create a sequence of elements in the collections) then it is more efficient to use concat:
user> (concat [1 2 3] (list 4 5 6))
(1 2 3 4 5 6)
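A short sketch of how the two differ in practice. Since into uses conj, the position of the added elements depends on the type of the first collection:

```clojure
;; into conj-es each element of the second collection onto the first,
;; so the result has the first collection's type.
(into [1 2 3] '(4 5 6))
;=> [1 2 3 4 5 6]

;; conj adds to the front of a list, so the added elements end up reversed.
(into '(1 2 3) [4 5 6])
;=> (6 5 4 1 2 3)

;; concat returns a lazy seq regardless of the input types.
(concat [1 2 3] '(4 5 6))
;=> (1 2 3 4 5 6)
```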
Use into:
user> (into [1 2 3] [4 5 6])
[1 2 3 4 5 6]
user> (doc into)
-------------------------
clojure.core/into
([to from])
Returns a new coll consisting of to-coll with all of the items of
from-coll conjoined.
nil

Idiomatically iterating over a 2 (or higher) dimensional sequence in Clojure

Is there a 'proper' way to iterate over a two-dimensional sequence in Clojure?
Suppose I had a list of lists of numbers, like this
((1 2 3)
 (4 5 6)
 (7 8 9))
and I wanted to generate a new list of lists with each number incremented by one. Is there an easy way to do this in Clojure without relying on nested maps or loop/recurs? I've been able to do it, but my solutions are ugly and I find them difficult to understand when I re-read them.
Thanks
What you describe is precisely what clojure.walk is for:
(def matrix [[1 2 3]
             [4 5 6]
             [7 8 9]])
(require '[clojure.walk :refer [prewalk]])
(prewalk #(if (number? %) (inc %) %) matrix)
=> [[2 3 4] [5 6 7] [8 9 10]]
Note 1: it is idiomatic to use vectors instead of parentheses for literal sequential collections.
Note 2: walk preserves type.
You can always just use a list comprehension. I find myself using them quite often coming from an imperative background so I don't know how idiomatic it is. In your specific case, you can do:
(for [my-list my-matrix] (map inc my-list))
For the two-dimensional case, you could do something like:
(map #(map inc %) my-two-d-list)
That's not too bad to read: apply the function #(map inc %) to each element in a list.
For the higher-order case, you're basically talking about tree-traversal. You'd want a function that takes in a tree and a function, and applies that function to each node in the tree. You can find functions for this in clojure.walk.
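As a sketch of that idea, clojure.walk/postwalk can apply a function to every number in a structure of any depth. The helper name map-leaves is hypothetical, and treating "every number" as a leaf is an assumption about the data:

```clojure
(require '[clojure.walk :as walk])

(defn map-leaves
  "Hypothetical helper: applies f to every number in a nested structure."
  [f tree]
  (walk/postwalk #(if (number? %) (f %) %) tree))

(map-leaves inc [[[1 2] [3 4]] [[5 6] [7 8]]])
;=> [[[2 3] [4 5]] [[6 7] [8 9]]]
```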
The other answers by Sean and Matt both show concise and effective ways of getting the right result.
However there are some important extensions you can make to this:
It would be nice to handle the case of higher dimensions
It is good to wrap the functionality in a higher order function
Example code:
;; general higher order function
(defn map-dimensions [n f coll]
  (if (= n 1)
    (map f coll)
    (map #(map-dimensions (dec n) f %) coll)))
;; use partial application to specialise to 2 dimensions
(def map-2d (partial map-dimensions 2))
(map-2d inc
        '((1 2 3)
          (4 5 6)
          (7 8 9)))
=> ((2 3 4) (5 6 7) (8 9 10))
core.matrix, introduced in 2013, now offers a much better way of handling operations over multi-dimensional arrays:
(use 'clojure.core.matrix)
(def M [[1 2 3]
        [4 5 6]
        [7 8 9]])
(emap inc M)
=> [[2 3 4]
    [5 6 7]
    [8 9 10]]
Advantages of using core.matrix:
Clean, idiomatic Clojure code
Lots of general purpose n-dimensional array manipulation functions - transpose, shape, reshape, slice, subarray etc.
Ability to plug in high performance array implementations (e.g. for big numerical arrays)
A belated answer, and maybe not exactly what is needed: you could try flatten. It will return a seq that you can iterate over:
(flatten '((1 2 3)
           (4 5 6)
           (7 8 9)))
=> (1 2 3 4 5 6 7 8 9)
And in order to increment matrix elements and reassemble the matrix:
(partition 3 (map inc (flatten '((1 2 3)
                                 (4 5 6)
                                 (7 8 9)))))