I want a Clojure data structure that:
pops from the front
pushes to the rear
lets me assoc indices with values (i.e. (assoc q 0 1) would set the value of the front to 1)
Is there something like that in Clojure (unfortunately, PersistentQueue doesn't fulfill requirement 3), or should I build it on top of vector?
There isn't a data structure in standard Clojure that will meet these requirements efficiently.
There was some talk on the Clojure-Dev mailing list about using RRB trees for vectors, which would be a great data structure for this:
https://groups.google.com/forum/?fromgroups=#!topic/clojure-dev/xnbtzTVEK9A
Not sure how far that has developed - but if you are interested in this kind of data structure then it is definitely worth taking a look at this.
If you do not need the data structure to be persistent, you could use java.util.LinkedList in your Clojure programs.
Example:
;;; Creation
user> (import 'java.util.LinkedList)
java.util.LinkedList
user> (def linked-list (LinkedList. [:a :b :c :d :e]))
#'user/linked-list
;;; Pop from the front
user> (.pop ^LinkedList linked-list)
:a
user> linked-list
#<LinkedList [:b, :c, :d, :e]>
;;; Push to the rear
user> (.addLast ^LinkedList linked-list :x)
nil
user> linked-list
#<LinkedList [:b, :c, :d, :e, :x]>
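;;; Assoc (cf. (assoc linked-list 0 :y)); .set replaces the element and
;;; returns the old one. Indexed access is O(n) for an arbitrary index.
user> (.set ^LinkedList linked-list 0 :y)
:b
user> linked-list
#<LinkedList [:y, :c, :d, :e, :x]>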
You could use a sorted-map, but you'd have to implement the index part yourself.
For example, to push a value v, you could assoc it with the key produced by incrementing the last key in the map. To pop, you could dissoc the first key in the map.
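A minimal sketch of that sorted-map idea, assuming integer keys that only ever grow. The helper names sm-push, sm-pop and sm-assoc-nth are made up for illustration. Because elements are only removed at the front and added at the rear, the keys stay contiguous, so index i maps to "first key + i":

```clojure
;; Hypothetical helpers; the queue is a sorted-map from increasing integers to values.
(defn sm-push
  "Adds v at the rear under a key one greater than the current last key."
  [q v]
  (let [last-entry (first (rseq q))                 ; nil when q is empty
        next-key   (if last-entry (inc (key last-entry)) 0)]
    (assoc q next-key v)))

(defn sm-pop
  "Removes the front element. Assumes q is not empty."
  [q]
  (dissoc q (key (first q))))

(defn sm-assoc-nth
  "Replaces the value at position i, counted from the front."
  [q i v]
  (let [k (+ (key (first q)) i)]
    (if (contains? q k)
      (assoc q k v)
      (throw (IndexOutOfBoundsException. (str "No element at index " i))))))

(def q (-> (sorted-map) (sm-push :a) (sm-push :b) (sm-push :c)))
(vals (sm-pop q))                       ;=> (:b :c)
(vals (sm-assoc-nth (sm-pop q) 0 :x))   ;=> (:x :c)
```

All three operations are logarithmic in the size of the map, at the cost of carrying the keys around.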
Sounds like you want a deque like Python's deque, except you might prefer the indexed-access performance characteristics of the C++ std::deque<T>, whose documentation is somewhat more obtuse.
Java ships with java.util.Deque implementations which you could just use, much like @tnoda's suggestion of java.util.LinkedList.
If you were rolling your own, the implementation is pretty straightforward for a non-persistent collection. A persistent one seems reasonably intuitive, to me at least, to implement against the tries underlying Clojure's hash map and vector (hash array mapped tries and bit-partitioned tries, respectively), or directly against vector initially if the details annoy you.
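As a rough sketch of the "directly against vector" route, the three requested operations can be written with conj, subvec and assoc. The names q-push, q-pop and q-set are hypothetical. Note that subvec is constant time but keeps a reference to the original vector, so popped elements are not released for garbage collection until the queue is copied into a fresh vector:

```clojure
;; Hypothetical helpers over a plain persistent vector.
(defn q-push [q v] (conj q v))       ; push to the rear
(defn q-pop  [q]   (subvec q 1))     ; pop from the front; throws on an empty queue
(defn q-set  [q i v] (assoc q i v))  ; replace the value at index i

(def q (-> [:a :b :c]
           q-pop
           (q-push :x)
           (q-set 0 :y)))
q ;=> [:y :c :x]
```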
Related
I am trying to implement the behavior of a stack in Clojure. Taking a cue from the implementation of frequencies, I created a transient vector which I am conj!ing elements to (a la "push"). My issue is that pop! removes elements from the end, and a few other fns (rest, drop) only work on lazy sequences.
I know I could accomplish this using loop/recur (or reversing and pop!ing), but I want to better understand why removing from the beginning of a transient vector isn't allowed. I read this; is it because the mutation is only O(1) when you're editing nodes at the end, whereas changing the first node would require copying the entire vector?
Your difficulty is not with transients: pop and peek always work at the same end of a collection as conj does:
the end of a vector or
the head of a list.
So ...
(= (pop (conj coll x)) coll)
and
(= (peek (conj coll x)) x)
... are true for any x for any coll that implements IPersistentStack:
Vectors and lists do.
Conses and ranges don't.
If you want to look at your stack both ways, use a vector, and use the (constant time) rseq to reverse it. You'd have to leave transients, though, as there is no rseq!. Mind you, (comp rseq persistent!) is still constant time.
By the way, rest and drop work on any sequence, lazy or not:
=> (rest [1 2 3])
(2 3)
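To make the stack behaviour described above concrete, here is a small sketch showing that conj, peek and pop agree on which end they use for both vectors and lists, and that a vector built through a transient can be read from the other end with rseq. The var names are only for illustration:

```clojure
;; Vector: conj, peek and pop all work at the end.
(def v-stack (conj [1 2] 3))     ;=> [1 2 3]
(peek v-stack)                   ;=> 3
(pop v-stack)                    ;=> [1 2]

;; List: conj, peek and pop all work at the head.
(def l-stack (conj '(1 2) 3))    ;=> (3 1 2)
(peek l-stack)                   ;=> 3
(pop l-stack)                    ;=> (1 2)

;; Build with a transient, then leave it and reverse in constant time.
(def built (persistent! (reduce conj! (transient []) [:a :b :c])))
(rseq built)                     ;=> (:c :b :a)
```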
Is there a Clojure predicate that means "collection, but not a map"?
Such a predicate is/would be valuable because there are many operations that can be performed on all collections except maps. For example (apply + ...) or (reduce + ...) can be used with vectors, lists, lazy sequences, and sets, but not maps, since the elements of a map in such a context end up as clojure.lang.MapEntrys. It's sets and maps that cause the problem with those predicates that I know of:
sequential? is true for vectors, lists, and lazy sequences, but it's false for both maps and sets. (seq? is similar but it's false for vectors.)
coll? and seqable? are true for both sets and maps, as well as for every other kind of collection I can think of.
Of course I can define such a predicate, e.g. like this:
(defn coll-but-not-map?
  [xs]
  (and (coll? xs)
       (not (map? xs))))
or like this:
(defn sequential-or-set?
  [xs]
  (or (sequential? xs)
      (set? xs)))
I'm wondering whether there's a built-in clojure.core (or contributed library) predicate that does the same thing.
This question is related to this one and this one but isn't answered by their answers. (If my question is a duplicate of one I haven't found, I'm happy to have it marked as such.)
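For reference, this is how the two candidate predicates from the question behave on a handful of sample values. The definitions are repeated so the snippet runs on its own, and the samples vector is just an illustrative choice, not an exhaustive check:

```clojure
(defn coll-but-not-map? [xs]
  (and (coll? xs) (not (map? xs))))

(defn sequential-or-set? [xs]
  (or (sequential? xs) (set? xs)))

;; vector, list, lazy seq, set, map, string, nil
(def samples [[1 2] '(1 2) (map inc [1 2]) #{1 2} {:a 1} "ab" nil])

(mapv coll-but-not-map? samples)
;=> [true true true true false false false]
(mapv sequential-or-set? samples)
;=> [true true true true false false false]
```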
For example (apply + ...) or (reduce + ...) can be used with vectors, lists, lazy sequences, and sets, but not maps
I don't think this has anything to do with collections. In your case, the problem is not with apply or reduce in general, but with the particular function +. (apply + [:a :b :c]) won't work either, even though we are using a vector here.
My point is that you are trying to solve a very domain-specific problem, which is why there is no generic solution in Clojure itself. So use any suitable predicate you can think of.
There's nothing that I've found or used that fits this description. I think your own predicate function is clear, simple, and easy to include in your code if you find it useful.
Maybe you are writing code that has to be very generic, but it's usually the case that a function both accepts and returns a consistent type of data. There are cases where this is not true, but it's usually the case that if a function can be the jack of all trades, it's doing too much.
Using your example -- it makes sense to add a vector of numbers, a list of numbers, or a set of numbers. But a map of numbers? It doesn't make sense, unless maybe it's the values contained in the map, and in this case, it's not reasonable for a single piece of code to be expected to handle adding both sequential data and associative data. The function should be handed something it expects, and it should return something consistent. This kind of reminds me of Stuart Sierra's blog post discussing consistency in this regard. Without more information I'm only guessing as to your use case, but it's something to consider.
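One way to read the point above, sketched with made-up data: the summing code stays the same for every collection of numbers, and the caller decides what "the numbers" are when the data happens to live in a map.

```clojure
;; The same reduction works for any collection whose elements are numbers.
(reduce + [1 2 3])      ;=> 6
(reduce + '(1 2 3))     ;=> 6
(reduce + #{1 2 3})     ;=> 6

;; For a map, the caller picks out the numbers explicitly.
(def scores {:a 1 :b 2})   ; hypothetical example data
(reduce + (vals scores))   ;=> 3
```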
I started with a csv-file. I have slurped it and added some structure with partition-by and sorted with sort-by. But now I'd like to add values to keep track of processing. But get-in and assoc-in don't like lists. And although I haven't added lists the partition-by seems to like them.
So, how to get out of this situation? Is there a way to transform all lists inside a structure to vectors or an alternative version of partition-by that isn't such a crybaby about lists or do I need to rethink my solution somehow? :-)
A simple example of a structure
(sort '(1 2 3 4 5))
If you want to transform the lists and then use the associative interface, clojure.walk has utilities that allow you to do that transformation on arbitrarily nested structures.
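(require 'clojure.walk)
;; seq? (unlike list?) also matches the seqs returned by partition-by and sort.
(let [my-nested-structure {:foo '(1 2 4) '(0) :bar :baz {42 '()}}]
  (clojure.walk/postwalk #(if (seq? %) (vec %) %) my-nested-structure))
;; => {:foo [1 2 4], [0] :bar, :baz {42 []}}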
postwalk and prewalk are effectively the same in this instance but the difference can matter if your replacement function adds/removes sub-entries.
The Specter library might also interest you: its transform, select, and related operations let you approach get-in-style jobs from a slightly different direction, largely independent of the concrete data structure.
If your structure is a list of lists called xs, then you can:
(mapv vec xs)
That will give you a vector of vectors. A vector is an associative data structure (unlike a list), so get-in and assoc-in will work. However if there are more meaningful keys than the position of an element in a vector, then you might prefer to work with maps, in which case you would have a collection (vector or list) of homogeneous maps, which is a common way of working with data.
And just for verification:
(mapv vec '((1 2 3) (4 5 6)))
;;=> [[1 2 3] [4 5 6]]
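Applied to the output of partition-by, a small sketch (with made-up input data and var names) of how get-in and assoc-in start working once the nested seqs have been turned into vectors:

```clojure
(def groups (partition-by odd? [1 1 2 2 3]))  ; a lazy seq of seqs
(def table  (mapv vec groups))                ;=> [[1 1] [2 2] [3]]

(get-in table [1 0])                          ;=> 2
(assoc-in table [0 1] :done)                  ;=> [[1 :done] [2 2] [3]]
```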
I know that Clojure's clojure.lang.IPersistentVector implements assoc, as in (assoc [0 1 2 3] 0 -1) ; => [-1 1 2 3]. I have also heard (as in this answer) that Clojure's vector doesn't implement dissoc, which might look like (dissoc [0 1 2 3] 0) ; => [1 2 3]. If this functionality is so easily reproducible using subvec, is there any real reason why it shouldn't be implemented in clojure.lang, clojure.core, or even contrib? If not, is there any reasoning behind that?
Dissoc doesn't make much sense for vectors for two reasons:
The meaning of dissoc is "remove a key". You can't remove a key from a vector without causing other side effects (e.g. shifting all subsequent values to a new index).
dissoc would perform relatively badly on vectors if it had to move all subsequent keys - roughly O(n) with quite a lot of GC. Clojure core generally avoids implementing operations that aren't efficient / don't make sense for a particular data structure.
Basically, if you find yourself wanting to do dissoc on a vector, you are probably using the wrong data structure. A persistent hashmap or set is probably a better choice.
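For completeness, the subvec-based workaround mentioned in the question might look like the sketch below. The name remove-at is hypothetical. It has the cost described above: every element after the removed index is added again, so the operation is O(n), and every later element ends up at a new index.

```clojure
(defn remove-at
  "Returns a vector like v without the element at index i."
  [v i]
  (into (subvec v 0 i) (subvec v (inc i))))

(remove-at [0 1 2 3] 0)  ;=> [1 2 3]
(remove-at [0 1 2 3] 2)  ;=> [0 1 3]
```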
If you want a data structure which works as a vector but supports cutting out and inserting elements or subsequences efficiently, then it is worth checking out RRB trees: https://github.com/clojure/core.rrb-vector
In Clojure, a map entry created within a macro is preserved...
(class (eval `(new clojure.lang.MapEntry :a 7)))
;=> clojure.lang.MapEntry
...but when passed in from the outside context, it collapses to a vector...
(class (eval `~(new clojure.lang.MapEntry :a 7)))
;=> clojure.lang.PersistentVector
This behavior is defined inside LispReader.syntaxQuote(Object form) condition if(form instanceof IPersistentCollection).
Does anyone know if this is intended behavior or something that will be fixed?
If you want to understand this behavior you need to dive into construction of Clojure sequences and collections.
Viewed as a sequence, every Clojure map is a sequence of entries, and each entry behaves like a two-element vector [key val].
Look closely: you're asking for the class of a MapEntry, which is itself a kind of vector (it implements clojure.lang.IPersistentVector). The classes for whole maps are different, e.g. clojure.lang.PersistentArrayMap or clojure.lang.PersistentHashMap, both of which implement the clojure.lang.IPersistentMap interface. A MapEntry is just one element, one part of the whole map. And, as I said, because each entry in a Clojure map is a vector, it is treated like any other vector form when evaluated, so the class of the evaluated MapEntry is a plain vector, as it should be.
Hope my explanation is understandable.
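The vector-like nature of map entries can be checked at the REPL. This is only a small sketch of the observable behaviour, with a made-up var name, and it says nothing about how the reader or compiler is implemented:

```clojure
(def entry (first {:a 7}))

(instance? clojure.lang.IMapEntry entry)  ;=> true
(vector? entry)                           ;=> true
(= entry [:a 7])                          ;=> true
(key entry)                               ;=> :a
(val entry)                               ;=> 7
```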