Why can't you RSeq an RSeq? - clojure

user=> (rseq [:a :b])
(:b :a)
user=> (rseq (rseq [:a :b]))
ClassCastException clojure.lang.APersistentVector$RSeq cannot be cast to
clojure.lang.Reversible clojure.core/rseq (core.clj:1532)
Why can't rseq accept the result of a previous call to rseq?
I read in the docstring that the argument has to be (actually, "can be") a vector or sorted-map, and the above shows that it can't be an RSeq, so I already know that. What I want to know is: is there a good reason for this restriction? Is it just an oversight, or does this restriction provide some important benefit?
Also, is there a convenient workaround for this, other than just never calling rseq? It's hard to know, when you return an RSeq from one function, whether some other function somewhere else might call rseq on it.
I'm asking because it's frustrating to see my code throw exceptions for such surprising reasons. If I knew why this made sense, I might be less likely to make this and similar kinds of errors.

You cannot call rseq on a seq, since you need an input collection with constant-time random access to fullfill rseq's constant-time performance characteristics, and seqs only provide efficient access (iteration) from the head down.
Calling rseq on the result of rseq cannot be special-cased to return the original collection since the original collection is never a seq. And if calling rseq on a RSeq would return something (seq coll), that won't make it straightforward to support (rseq (drop x (rseq coll))). It's probably those kinds of complications that kept the language implementers from supporting "recursive" rseq at all.
If you need a general reversal function, use reverse - which will be slower. If you can, you probably just want to keep a reference to (seq coll) and (rseq coll) if you need both.

Because rseq works only for the special reversible sequences. But the result of its application is an ordinary seq. You can always check if you can rseq a sequence with a reversible? predicate:
(defn reverse* [s]
(if (reversible? s)
(rseq s)
(reverse s)))
Why this fallback isn't in rseq (or reverse) function itself? The reason is that rseq should guarantee its execution time predictability, I guess.
If you really need to reverse the collection back later, you would better keep it as a vector for example: (rseq (vec (rseq [1 2 3])))

Related

how to spec a lazy-seq generating function?

I wish to use spec in my pre and post conditions of a generator function. A simplified example of what I wish to do is described below:
(defn positive-numbers
([]
{:post [(s/valid? (s/+ int?) %)]}
(positive-numbers 1))
([n]
{:post [(s/valid? (s/+ int?) %)]}
(lazy-seq (cons n (positive-numbers (inc n))))))
(->> (positive-numbers) (take 5))
However, defining the generator function like that seems to cause stack-overflow, the cause being that spec will eagerly try to evaluate the whole thing, -or something like that....
Is there another way of using spec to describe the :post result of a generator function like the one above (without causing stack-overflow)?
The theoretically correct answer is that in general you cannot check whether a lazy sequence matches a spec without realizing all of it.
In the case of your specific example of (s/+ int?), given a lazy sequence, how would one establish merely by observing the sequence whether all its elements are integers? However many elements you examine, the next one could always be a keyword.
This is the sort of thing that a type system like, say, core.typed may be able to prove, but a runtime-predicate-based assertion won't be able to check.
Now, in addition to s/+ and s/*, spec (as of Clojure 1.9.0-alpha14) also has a a combinator called s/every, whose docstring says this:
Note that 'every' does not do exhaustive checking, rather it samples *coll-check-limit* elements.
So we have e.g.
(s/valid? (s/* int?) (concat (range 1000) [:foo]))
;= false
but
(s/valid? (s/every int?) (concat (range 1000) [:foo]))
;= true
(with the default *coll-check-limit* value of 101).
This actually isn't an immediate fix to your example – plugging in s/every in place of s/+ won't work, because each recursive call will want to validate its own return value, which will involve realizing more of the sequence, which will involve more recursive calls etc. But you could factor out the sequence-building logic to a helper function with no postconditions and then have positive-numbers declare the postcondition and call that helper function:
(defn positive-numbers* [n]
(lazy-seq (cons n (positive-numbers* (inc n)))))
(defn positive-numbers [n]
{:post [(s/valid? (s/every int? :min-count 1) %)]}
(positive-numbers* n))
Note the caveats:
this will still realize a good chunk of your sequence, which may wreak havoc with your application's performance profile;
the only watertight guarantee here is that the prefix actually examined is as desired, if the seq has a weird item at position 123456, that will go unnoticed.
Because of (1), this is something that makes more sense as a test-only assertion. (2) may be acceptable – you'll still catch some silly typos and the documentation value of the spec is there anyway; if it isn't and you do want an absolutely watertight guarantee that your return type is as desired, then again, core.typed (perhaps used locally just for a handful of namespaces) may be the better bet.

What are side-effects in predicates and why are they bad?

I'm wondering what is considered to be a side-effect in predicates for fns like remove or filter. There seems to be a range of possibilities. Clearly, if the predicate writes to a file, this is a side-effect. But consider a situation like this:
(def *big-var-that-might-be-garbage-collected* ...)
(let [my-ref *big-var-that-might-be-garbage-collected*]
(defn my-pred
[x]
(some-operation-on my-ref x)))
Even if some-operation-on is merely a query that does not change state, the fact that my-pred retains a reference to *big... changes the state of the system in that the big var cannot be garbage collected. Is this also considered to be side-effect?
In my case, I'd like to write to a logging system in a predicate. Is this a side effect?
And why are side-effects in predicates discouraged exactly? Is it because filter and remove and their friends work lazily so that you cannot determine when the predicates are called (and - hence - when the side-effects happen)?
GC is not typically considered when evaluating if a function is pure or not, although many actions that make a function impure can have a GC effect.
Logging is a side effect, as is changing any state in the program or the world. A pure function takes data and returns data, without modifying anything else.
https://softwareengineering.stackexchange.com/questions/15269/why-are-side-effects-considered-evil-in-functional-programming covers why side effects are avoided in functional languages.
I found this link helpful
The problem is determining when, or even whether, the side-effects will occur on any given call to the function.
If you only care that the same inputs return the same answer, you are fine. Side-effects are dependent on how the function is executed.
For example,
(first (filter odd? (range 20)))
; 1
But if we arrange for odd? to print its argument as it goes:
(first (filter #(do (print %) (odd? %)) (range 20)))
It will print 012345678910111213141516171819 before returning 1!
The reason is that filter, where it can, deals with its sequence argument in chunks of 32 elements.
If we take the limit off the range:
(first (filter #(do (print %) (odd? %)) (range)))
... we get a full-size chunk printed: 012345678910111213141516171819012345678910111213141516171819202122232425262728293031
Just printing the argument is confusing. If the side effects are significant, things could go seriously awry.

What is the difference among the functions doall dorun doseq and for?

What is the difference between the functions doall, dorun, doseq, and for ?
I found some information scattered throughout the internet, but I think it would be better to centralize that information here.
dorun, doall, and doseq are all for forcing lazy sequences, presumably to get side effects.
dorun - don't hold whole seq in memory while forcing, return nil
doall - hold whole seq in memory while forcing (i.e. all of it) and return the seq
doseq - same as dorun, but gives you chance to do something with each element as it's forced; returns nil
for is different in that it's a list comprehension, and isn't related to forcing effects. doseq and for have the same binding syntax, which may be a source of confusion, but doseq always returns nil, and for returns a lazy seq.
You can see how dorun and doall relate to one another by looking at the (simplified) source code:
(defn dorun [coll]
(when (seq coll) (recur (next coll))))
(defn doall [coll] (dorun coll) coll)
dorun runs through the sequence, forgetting it as it goes,
ultimately returning nil.
doall returns its sequence argument, now realised by the dorun.
Similarly, we could implement doseq in terms of dorun and for:
(defmacro doseq [seq-exprs & body]
`(dorun (for ~seq-exprs ~#body)))
For some reason, performance perhaps, this is not done. The standard doseq is written out in full, imitating for.

Idiomatic no-op/"pass"

What's the (most) idiomatic Clojure representation of no-op? I.e.,
(def r (ref {}))
...
(let [der #r]
(match [(:a der) (:b der)]
[nil nil] (do (fill-in-a) (fill-in-b))
[_ nil] (fill-in-b)
[nil _] (fill-in-a)
[_ _] ????))
Python has pass. What should I be using in Clojure?
ETA: I ask mostly because I've run into places (cond, e.g.) where not supplying anything causes an error. I realize that "most" of the time, an equivalent of pass isn't needed, but when it is, I'd like to know what's the most Clojuric.
I see the keyword :default used in cases like this fairly commonly.
It has the nice property of being recognizable in the output and or logs. This way when you see a log line like: "process completed :default" it's obvious that nothing actually ran. This takes advantage of the fact that keywords are truthy in Clojure so the default will be counted as a success.
There are no "statements" in Clojure, but there are an infinite number of ways to "do nothing". An empty do block (do), literally indicates that one is "doing nothing" and evaluates to nil. Also, I agree with the comment that the question itself indicates that you are not using Clojure in an idiomatic way, regardless of this specific stylistic question.
The most analogous thing that I can think of in Clojure to a "statement that does nothing" from imperative programming would be a function that does nothing. There are a couple of built-ins that can help you here: identity is a single-arg function that simply returns its argument, and constantly is a higher-order function that accepts a value, and returns a function that will accept any number of arguments and return that value. Both are useful as placeholders in situations where you need to pass a function but don't want that function to actually do much of anything. A simple example:
(defn twizzle [x]
(let [f (cond (even? x) (partial * 4)
(= 0 (rem x 3)) (partial + 2)
:else identity)]
(f (inc x))))
Rewriting this function to "do nothing" in the default case, while possible, would require an awkward rewrite without the use of identity.

In clojure, why does assoc require arguments in addition to a map, but dissoc not?

In clojure,
(assoc {})
throws an arity exception, but
(dissoc {})
does not. Why? I would have expected either both of them to throw an exception, or both to make no changes when no keys or values are provided.
EDIT: I see a rationale for allowing these forms; it means we can apply assoc or dissoc to a possibly empty list of arguments. I just don't see why one would be allowed and the other not, and I'm curious as to whether there's a good reason for this that I'm missing.
I personally think the lack of 1-arity assoc is an oversight: whenever a trailing list of parameters is expected (& stuff), the function should normally be capable of working with zero parameters in order to make it possible to apply it to an empty list.
Clojure has plenty of other functions that work correctly with zero arguments, e.g. + and merge.
On the other hand, Clojure has other functions that don't accept zero trailing parameters, e.g. conj.
So the Clojure API is a bit inconsistent in this regard.....
This is not an authoritative answer, but is based on my testing and looking at ClojureDocs:
dissoc 's arity includes your being able to pass in one argument, a map. No key/value is removed from the map, in that case.
(def test-map {:account-no 12345678 :lname "Jones" :fnam "Fred"})
(dissoc test-map)
{:account-no 12345678, :lname "Jones", :fnam "Fred"}
assoc has no similar arity. That is calling assoc requires a map, key, and value.
Now why this was designed this way is a different matter, and if you do not receive an answer with that information -- I hope you do -- then I suggest offering a bounty or go on Clojure's Google Groups and ask that question.
Here is the source.
(defn dissoc
"dissoc[iate]. Returns a new map of the same (hashed/sorted) type,
that does not contain a mapping for key(s)."
{:added "1.0"
:static true}
([map] map)
([map key]
(. clojure.lang.RT (dissoc map key)))
([map key & ks]
(let [ret (dissoc map key)]
(if ks
(recur ret (first ks) (next ks))
ret))))