I started with a csv-file. I have slurped it and added some structure with partition-by and sorted with sort-by. But now I'd like to add values to keep track of processing. But get-in and assoc-in don't like lists. And although I haven't added lists the partition-by seems to like them.
So, how to get out of this situation? Is there a way to transform all lists inside a structure to vectors or an alternative version of partition-by that isn't such a crybaby about lists or do I need to rethink my solution somehow? :-)
A simple example of a structure
(sort '(1 2 3 4 5))
If you want to transform the lists and then use the associative interface, clojure.walk has utilities that allow you to do that transformation on arbitrarily nested structures.
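(require 'clojure.walk)
(let [my-nested-structure {:foo '(1 2 4) '(0) :bar :baz {42 '()}}]
  ;; seq? also matches the lazy seqs returned by partition-by and sort; list? would miss them
  (clojure.walk/postwalk #(if (seq? %) (vec %) %) my-nested-structure))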
;; => {:foo [1 2 4], [0] :bar, :baz {42 []}}
postwalk and prewalk are effectively the same in this instance but the difference can matter if your replacement function adds/removes sub-entries.
The specter library might also interest you: its transform, select and so on let you approach get-in-style jobs from a slightly different direction, and they are agnostic about the underlying data structure.
If your structure is a list of lists called xs, then you can:
(mapv vec xs)
That will give you a vector of vectors. A vector is an associative data structure (unlike a list), so get-in and assoc-in will work. However if there are more meaningful keys than the position of an element in a vector, then you might prefer to work with maps, in which case you would have a collection (vector or list) of homogeneous maps, which is a common way of working with data.
And just for verification:
(mapv vec '((1 2 3) (4 5 6)))
;;=> [[1 2 3] [4 5 6]]
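To tie this back to the original question, here is a minimal sketch of the whole round trip. The rows and the :done marker are made-up sample data, not anything from the asker's file; the point is only that get-in and assoc-in work once each group produced by partition-by has been turned into a vector.

```clojure
;; Hypothetical parsed CSV rows of the form [group id].
(def rows [["a" 1] ["a" 2] ["b" 3]])

;; partition-by returns a lazy seq of seqs, so convert every group to a vector.
(def groups (mapv vec (partition-by first rows)))
;; groups => [[["a" 1] ["a" 2]] [["b" 3]]]

;; Vectors are associative, so index paths work.
(get-in groups [0 1]) ;=> ["a" 2]

;; Append a processing marker to the second row of the first group.
(def marked (assoc-in groups [0 1 2] :done))
;; marked => [[["a" 1] ["a" 2 :done]] [["b" 3]]]
```

If the rows were maps instead of vectors, the same assoc-in call could use a keyword such as :status as the last path element.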
Related
I am trying to implement the behavior of a stack in Clojure. Taking a cue from the implementation of frequencies I created a transient vector which I am conj!ing elements to (a la "push"). My issue is pop! removes elements from the end and a few other fns (rest,drop) only work on lazy sequences.
I know I could accomplish this using loop/recur (or reversing and pop!ing), but I want to better understand why removing from the beginning of a transient vector isn't allowed. I read this; is it because the implementation that allows them to be mutated is only O(1) when you're editing nodes at the end, and changing the first node would require copying the entire vector?
Your difficulty is not with transients: pop and peek always work at the same end of a collection as conj does:
the end of a vector or
the head of a list.
So ...
(= (pop (conj coll x)) coll)
and
(= (peek (conj coll x)) x)
... are true for any x for any coll that implements IPersistentStack:
Vectors and lists do.
Conses and ranges don't.
If you want to look at your stack both ways, use a vector, and use the (constant time) rseq to reverse it. You'd have to leave transients, though, as there is no rseq!. Mind you, (comp rseq persistent!) is still constant time.
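A small sketch of the points above, using throwaway values: conj, peek and pop agree on which end they work at, and a vector built through a transient can be viewed from the other end with rseq once it has been made persistent.

```clojure
(def v (conj [1 2] 3))  ; conj adds at the end of a vector
(def l (conj '(1 2) 3)) ; conj adds at the head of a list

(peek v) ;=> 3
(pop v)  ;=> [1 2]
(peek l) ;=> 3
(pop l)  ;=> (1 2)

;; Build with a transient, then look at the stack from the other end.
(def stack (persistent! (reduce conj! (transient []) [:a :b :c])))
(rseq stack)  ;=> (:c :b :a)
(first stack) ;=> :a
```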
By the way, rest and drop work on any sequence, lazy or not:
=> (rest [1 2 3])
(2 3)
I'm working Day 6 of Advent of Code 2018, in which I need to store a 2D map of locations, and then do mapping + filtering on them based on their coordinates. I was thinking of storing the locations in a 2D vector, so that the indices of the vectors denote their coordinates, as this is how I would have done it in imperative languages.
However, the majority of the sequence operations only pass the element to the function, so there is no way to access the index of the element from the function passed to e.g. map. Yes, map-indexed exists, but it doesn't feel very clean to have two nested calls to it every time I operate on the data.
I saw some suggest storing the index, or in this case (x,y) coordinate pair, with the element in the vector: [[[0, 0] "loc1"] [[0, 1] "loc2"] ...]. Would this be better than using nested map-indexed calls, or is there an even cleaner, more idiomatic alternative to storing 2D data and accessing the data with its index?
For this specific problem, the 2D nature of the problem doesn't really matter. So, I'd suggest storing the points as a vector of maps like so:
{:x x
:y y
:nearest-point :A}
and locations like:
{:x x
:y y
:name :A}
for example. For each point, loop over the locations and save the closest one. Then, throw out the infinite ones:
(remove is-it-infinite? points)
Then
(group-by :nearest-point points)
and count the size of each group to get the final answer.
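Here is a rough sketch of that pipeline on made-up data. The helper names manhattan and nearest-name, the two sample locations and the 6x6 grid are all my own assumptions, and the step that throws out the infinite areas is left out, since how to detect them depends on how you bound the grid.

```clojure
(defn manhattan
  "Manhattan distance between two maps with :x and :y keys."
  [a b]
  (+ (Math/abs (long (- (:x a) (:x b))))
     (Math/abs (long (- (:y a) (:y b))))))

(defn nearest-name
  "Name of the single closest location, or nil when two are equally close."
  [locations point]
  (let [[a b] (sort-by #(manhattan point %) locations)]
    (when (or (nil? b) (< (manhattan point a) (manhattan point b)))
      (:name a))))

;; Made-up sample locations.
(def locations [{:x 1 :y 1 :name :A} {:x 4 :y 4 :name :B}])

;; One map per grid point, each tagged with its nearest location.
(def points
  (for [x (range 0 6), y (range 0 6)]
    {:x x :y y :nearest-point (nearest-name locations {:x x :y y})}))

;; Size of each group; tied points end up under the nil key.
(def sizes
  (into {}
        (map (fn [[k ps]] [k (count ps)]))
        (group-by :nearest-point points)))
```

Note that no index bookkeeping is needed anywhere: the coordinates travel with each point, which is why the 2D layout stops mattering.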
You could use matrices:
(require '[clojure.core.matrix :as m])
(def A (m/matrix [[1 4 56] [5 2 8] [35 1 677]]))
(m/emap-indexed (fn [[x y] v] (prn [x y v])) A)
A vector or a map can act as a function to look up its elements by key:
([1 2 3] 2) ;=> 3
({:a 1 :b 2} :a) ;=> 1
But why can't I do this for a list?
('(1 2 3) 2)
; clojure.lang.PersistentList cannot be cast to clojure.lang.IFn (java.lang.ClassCastException)
I think the error message is pretty descriptive in this case. PersistentList doesn't implement IFn, so it cannot act as a function. This is a Clojure design choice, and the reason may be that the list data structure is not designed for random access (getting an element by index): the complexity of that operation is O(n), which is much worse than a vector's O(log32 n), effectively constant time.
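If you do need positional access to a list, nth works on any sequence. A quick sketch of the difference, with a throwaway list xs:

```clojure
(def xs '(1 2 3))

(ifn? [1 2 3]) ;=> true
(ifn? xs)      ;=> false

(nth xs 2)       ;=> 3, found by walking the list, so O(n)
(get xs 2)       ;=> nil, because lists are not associative
(nth (vec xs) 2) ;=> 3, after converting once to a vector
```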
In Clojure, a map entry created within a macro is preserved...
(class (eval `(new clojure.lang.MapEntry :a 7)))
;=> clojure.lang.MapEntry
...but when piped thru from the outside context collapses to a vector...
(class (eval `~(new clojure.lang.MapEntry :a 7)))
;=> clojure.lang.PersistentVector
This behavior is defined inside LispReader.syntaxQuote(Object form) condition if(form instanceof IPersistentCollection).
Does anyone know if this is intended behavior or something that will be fixed?
If you want to understand this behavior you need to dive into construction of Clojure sequences and collections.
In fact, when you iterate over a Clojure map you get a sequence of entries, and each entry behaves like a two-element vector [key val].
Have a proper look: you're asking for the class of a MapEntry, which is itself a kind of vector (it implements IPersistentVector). The concrete classes for maps are clojure.lang.PersistentArrayMap and clojure.lang.PersistentHashMap; clojure.lang.IPersistentMap is the interface they implement. A MapEntry is just one element, one part of the whole map. And because an entry is a vector as far as the evaluator is concerned, evaluating one rebuilds it as a plain vector, so the class of the evaluated MapEntry is PersistentVector, as expected.
Hope my explanation is understandable.
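A short REPL-style sketch of the "a map entry is a vector" point, assuming Clojure 1.8 or later for map-entry?:

```clojure
(def entry (first {:a 7}))

(map-entry? entry) ;=> true
(vector? entry)    ;=> true
(= entry [:a 7])   ;=> true

;; Copying the entry into a fresh vector gives a plain PersistentVector.
(class (into [] entry)) ;=> clojure.lang.PersistentVector
```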
I want a clojure data structure that:
pops from the front
pushes to the rear
lets me assoc indices with values (i.e. (assoc q 0 1) would set the value of the front to 1)
Is there something like that in Clojure (unfortunately PersistentQueue doesn't fulfill Nr.3), or should I build it on top of vector?
There isn't a data structure in standard Clojure that will meet these requirements efficiently.
There was some talk on the Clojure-Dev mailing list about using RRB trees for vectors, which would be a great data structure for this:
https://groups.google.com/forum/?fromgroups=#!topic/clojure-dev/xnbtzTVEK9A
Not sure how far that has developed - but if you are interested in this kind of data structure then it is definitely worth taking a look at this.
If you do not require persistency of the data structure,
you could use java.util.LinkedList in your Clojure programs.
Example:
;;; Creation
user> (import 'java.util.LinkedList)
java.util.LinkedList
user> (def linked-list (LinkedList. [:a :b :c :d :e]))
#'user/linked-list
;;; Pop from the front
user> (.pop ^LinkedList linked-list)
:a
user> linked-list
#<LinkedList [:b, :c, :d, :e]>
;;; Push to the rear (constant time for a linked list)
user> (.addLast ^LinkedList linked-list :x)
nil
user> linked-list
#<LinkedList [:b, :c, :d, :e, :x]>
;;; Assoc (cf. (assoc linked-list 0 :y)); .set replaces and returns the old element, but indexed access is O(n)
user> (.set ^LinkedList linked-list 0 :y)
:b
user> linked-list
#<LinkedList [:y, :c, :d, :e, :x]>
You could use a sorted-map, but you'd have to implement the index part yourself.
For example, to push a value v, you could assoc it with the key produced by incrementing the last key in the map. To pop, you could dissoc the first key in the map.
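A minimal sketch of that idea, assuming integer keys that stay contiguous. The function names q-push, q-pop and q-assoc are made up for illustration. Index i is translated to the key (front key + i), which is the "index part" you have to implement yourself.

```clojure
(defn q-push
  "Adds v at the rear, under a key one larger than the current last key."
  [q v]
  (let [[last-k] (first (rseq q))]
    (assoc q (if last-k (inc last-k) 0) v)))

(defn q-pop
  "Removes the front element; returns q unchanged when it is empty."
  [q]
  (if-let [[k] (first q)]
    (dissoc q k)
    q))

(defn q-assoc
  "Replaces the element at position i, counted from the front."
  [q i v]
  (let [[front-k] (first q)
        k (when front-k (+ front-k i))]
    (if (and k (contains? q k))
      (assoc q k v)
      (throw (IndexOutOfBoundsException. (str "index " i))))))

(def q (-> (sorted-map) (q-push :a) (q-push :b) (q-push :c)))

(vals (q-pop q))                ;=> (:b :c)
(vals (q-assoc (q-pop q) 0 :x)) ;=> (:x :c)
```

All three operations are O(log n) on a sorted map, and the structure stays persistent.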
Sounds like you want a deque like Python's deque, except you might prefer the indexed access performance characteristics of the C++ std::deque<T>, whose documentation is somewhat harder to read.
Java ships with java.util.Deque implementations which you could just use, much like @tnoda's suggestion of java.util.LinkedList.
If you were rolling your own, the implementation is pretty straightforward for a non-persistent collection, and seems reasonably intuitive, to me at least, to implement against the array-mapped tries underlying Clojure's hash map and vector, or directly against vector initially if the details annoy you.
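As a first cut directly against vector, something like the sketch below might be enough. dq-push and dq-pop are made-up names; the trick is that subvec is constant time and its result still supports conj and assoc. One caveat I am sure of: subvec shares structure with the original vector and does no trimming, so popped elements stay reachable until you copy the result, for example with (into [] v).

```clojure
(defn dq-push
  "Pushes x to the rear."
  [v x]
  (conj v x))

(defn dq-pop
  "Drops the front element in constant time; throws on an empty vector."
  [v]
  (subvec v 1))

(def q (-> [:a :b :c]
           dq-pop         ; [:b :c]
           (dq-push :d)   ; [:b :c :d]
           (assoc 0 :x))) ; index 0 is now the new front
;; q => [:x :c :d]
```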