Is there an idiomatic way of determining if a LazySeq contains an element? As of Clojure 1.5 calling contains? throws an IllegalArgumentException:
IllegalArgumentException contains? not supported on type: clojure.lang.LazySeq
clojure.lang.RT.contains (RT.java:724)
Before 1.5, as far as I know, it always returned false.
I know that calling contains? on a LazySeq may never return as it can be infinite. But what if I know it isn't and don't care if it is evaluated eagerly?
What I came up with is:
(defn lazy-contains? [col key]
  (not (empty? (filter #(= key %) col))))
But it doesn't feel quite right. Is there a better way?
First, lazy seqs are not efficient for checking membership. Consider using a set instead of a lazy seq.
If a set is impractical, your solution isn't bad. A couple of possible improvements:
"Not empty" is a bit awkward. Just using seq is enough to get a nil-or-truthy value that your users can use in an if. You can wrap that in boolean if you want true or false.
Since you only care about the first match, you can use some instead of filter and seq.
A convenient way to write an equality predicate is with a literal set, like #{key}, though if key is nil (or false) this will always return nil whether the key is found or not.
All together that gives you:
(defn lazy-contains? [col key]
  (some #{key} col))
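The nil caveat mentioned above is easy to check at a REPL. The sketch below compares the set-based predicate with an explicit equality predicate; `lazy-contains-eq?` is a hypothetical name used only for this illustration.

```clojure
;; Set-as-predicate version, as in the answer above.
(defn lazy-contains? [coll key]
  (some #{key} coll))

;; Hypothetical variant (the name is an assumption): explicit equality,
;; coerced to a boolean, so nil and false keys are handled too.
(defn lazy-contains-eq? [coll key]
  (boolean (some #(= key %) coll)))

(lazy-contains? [1 2 3] 2)         ;=> 2 (truthy)
(lazy-contains? [1 nil 3] nil)     ;=> nil, although nil is present
(lazy-contains-eq? [1 nil 3] nil)  ;=> true
(lazy-contains-eq? (range) 5)      ;=> true; stops at the first match
```

The last call returns even though (range) is infinite, because some stops at the first truthy result. A search for a missing element in an infinite seq would still never return.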
If you use some instead of filter as in your example, you'll get an immediate return as soon as a value is found instead of forcing evaluation of the entire sequence.
(defn lazy-contains? [coll key]
  (boolean (some #(= % key) coll)))
Edit: If you don't coerce the result to a boolean, note that you'll get nil instead of false if the key isn't found.
I wish to use spec in my pre and post conditions of a generator function. A simplified example of what I wish to do is described below:
(defn positive-numbers
  ([]
   {:post [(s/valid? (s/+ int?) %)]}
   (positive-numbers 1))
  ([n]
   {:post [(s/valid? (s/+ int?) %)]}
   (lazy-seq (cons n (positive-numbers (inc n))))))
(->> (positive-numbers) (take 5))
However, defining the generator function like that seems to cause a stack overflow, presumably because spec eagerly tries to evaluate the whole sequence, or something like that.
Is there another way of using spec to describe the :post result of a generator function like the one above (without causing stack-overflow)?
The theoretically correct answer is that in general you cannot check whether a lazy sequence matches a spec without realizing all of it.
In the case of your specific example of (s/+ int?), given a lazy sequence, how would one establish merely by observing the sequence whether all its elements are integers? However many elements you examine, the next one could always be a keyword.
This is the sort of thing that a type system like, say, core.typed may be able to prove, but a runtime-predicate-based assertion won't be able to check.
Now, in addition to s/+ and s/*, spec (as of Clojure 1.9.0-alpha14) also has a combinator called s/every, whose docstring says this:
Note that 'every' does not do exhaustive checking, rather it samples *coll-check-limit* elements.
So we have e.g.
(s/valid? (s/* int?) (concat (range 1000) [:foo]))
;= false
but
(s/valid? (s/every int?) (concat (range 1000) [:foo]))
;= true
(with the default *coll-check-limit* value of 101).
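The effect of the limit can be seen by rebinding it. This is a minimal sketch, assuming spec is loaded from the clojure.spec.alpha namespace (the 1.9 alphas used clojure.spec); the names xs, sampled and exhaustive are illustrative only.

```clojure
;; Assumption: spec lives in clojure.spec.alpha in current releases.
(require '[clojure.spec.alpha :as s])

(def xs (concat (range 1000) [:foo]))

;; Default limit (101): only a prefix is sampled, so :foo goes unnoticed.
(def sampled (s/valid? (s/every int?) xs))

;; Raising the limit past the length of xs checks every element.
(def exhaustive
  (binding [s/*coll-check-limit* 2000]
    (s/valid? (s/every int?) xs)))

sampled     ;=> true
exhaustive  ;=> false
```

Raising the limit trades the cheap sampled check for a full traversal, so it only makes sense for sequences known to be finite.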
This actually isn't an immediate fix to your example – plugging in s/every in place of s/+ won't work, because each recursive call will want to validate its own return value, which will involve realizing more of the sequence, which will involve more recursive calls etc. But you could factor out the sequence-building logic to a helper function with no postconditions and then have positive-numbers declare the postcondition and call that helper function:
(defn positive-numbers* [n]
  (lazy-seq (cons n (positive-numbers* (inc n)))))
(defn positive-numbers [n]
  {:post [(s/valid? (s/every int? :min-count 1) %)]}
  (positive-numbers* n))
Note the caveats:
(1) this will still realize a good chunk of your sequence, which may wreak havoc with your application's performance profile;
(2) the only watertight guarantee here is that the prefix actually examined is as desired; if the seq has a weird item at position 123456, that will go unnoticed.
Because of (1), this is something that makes more sense as a test-only assertion. (2) may be acceptable – you'll still catch some silly typos and the documentation value of the spec is there anyway; if it isn't and you do want an absolutely watertight guarantee that your return type is as desired, then again, core.typed (perhaps used locally just for a handful of namespaces) may be the better bet.
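To get a feel for how much of the sequence the postcondition realizes, one can count the elements as they are computed. This is a rough sketch, assuming the clojure.spec.alpha namespace; the realized atom and the first-five name are additions for illustration and are not part of the original answer.

```clojure
(require '[clojure.spec.alpha :as s])

;; Illustrative counter: incremented each time an element is computed.
(def realized (atom 0))

(defn positive-numbers* [n]
  (lazy-seq
    (swap! realized inc)
    (cons n (positive-numbers* (inc n)))))

(defn positive-numbers [n]
  {:post [(s/valid? (s/every int? :min-count 1) %)]}
  (positive-numbers* n))

(def first-five (doall (take 5 (positive-numbers 1))))

first-five  ;=> (1 2 3 4 5)
;; @realized is now roughly s/*coll-check-limit* (101 by default), not 5.
```

The caller only asked for five elements, but the check has already forced about a hundred, which is why this fits better as a test-time assertion.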
Here's a use of the standard 'contains?' function in Clojure-
(contains? {:state "active", :course_n "law", :course_i "C0"} :state)
and it returns the expected
true
I used the following
Clojure: Idiomatic way to call contains? on a lazy sequence
as a guide for building a lazy-contains? as this is what I need for my present use-case.
The problem I'm facing is that for a map these alternatives are not returning the same answer, giving either a false or a nil response. I've tried looking at the source for contains? and it's slow going trying to understand what's happening so I can correct the lazy-contains? appropriately (for the record Clojure is essentially my first programming language, and my exposure to Java is very limited).
Any thoughts or ideas on how I might approach this? I tried every variant on the linked question I could.
Thanks in advance.
Edited to remove the error pointed out by @amalloy.
I think your problem is with the way that maps present themselves as sequences.
Given
(def data {:state "active", :course_n "law", :course_i "C0"})
then
(seq data)
;([:state "active"] [:course_i "C0"] [:course_n "law"])
... a sequence of key-value pairs.
So if we define (following @chouser)
(defn lazy-contains? [coll x]
  (some #(= x %) coll))
... then
(lazy-contains? data :state)
;nil
... a false result, whereas ...
(lazy-contains? data [:state "active"])
;true
This is what @Ankur was getting at, showing you a function treating a map as a sequence consistent with contains? on the map itself.
The standard contains? works with keyed/indexed collections - maps
or sets or vectors - where it tests for the presence of a key.
Our lazy-contains? works with anything sequable, including all the
standard collections, testing for the presence of a value.
Given the way that keyed/indexed collections present as sequences, these are bound to be inconsistent.
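The difference between key lookup and value search can be laid out side by side. The snippet below reuses the data map from above; the expressions are plain clojure.core calls.

```clojure
(def data {:state "active", :course_n "law", :course_i "C0"})

;; Key lookup: what contains? does for a map.
(contains? data :state)            ;=> true
(contains? data "active")          ;=> false ("active" is a value, not a key)

;; Linear search over the seq view; the elements are [key value] entries.
(some #(= :state (key %)) data)    ;=> true
(some #(= "active" (val %)) data)  ;=> true
(some #(= :state %) data)          ;=> nil (no entry equals the bare key)

;; Vectors: contains? tests indices, not values.
(contains? [:a :b] 0)              ;=> true
(contains? [:a :b] :a)             ;=> false
```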
You can try the below implementation (for maps only):
(defn lazy-contains? [col key]
  (some (fn [[k v]] (= k key)) col))
Remember, contains? is to check the existence of a key in a collection, in maps the key is obvious, in other supported collections (like vector) the key is the index.
A "lazy" implementation of contains? is undesirable when checking for the presence
of a key in a hash-map or of a value in a set
(contains? #{:foo} :foo) => true
(contains? {:foo :bar} :foo) => true
or of an index in a vector, array, or string.
(contains? [:foo] 0) => true
(contains? (int-array 7) 6) => true
(contains? "foo" 2) => true
Quoting from the contains? docstring:
'contains?' operates constant or logarithmic time; it will not
perform a linear search for a value.
some is a tool for linear searching. When searching for an element in a set or vector, its running time grows with the length of the input sequence, so in the worst case it is slower than contains? by a factor of that length, and it is slower in almost every case.
contains? can't be implemented "lazy" as it does not produce a sequence. However, some stops consuming a lazy sequence as soon as it has determined a return value.
(some zero? (range))
;; true
Notice that maps and sets are never sequential or lazy.
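When a finite lazy seq has to be queried repeatedly, one option consistent with the points above is to pour it into a set once and use contains? afterwards. A small sketch with made-up data (evens-seq and evens are illustrative names):

```clojure
;; A finite lazy seq of even numbers (illustrative data).
(def evens-seq (take 1000 (iterate #(+ % 2) 0)))

;; Pay the linear cost once...
(def evens (set evens-seq))

;; ...then each membership test is a hash lookup rather than a scan.
(contains? evens 10)    ;=> true
(contains? evens 11)    ;=> false

;; The linear alternative, for comparison.
(some #{10} evens-seq)  ;=> 10
```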
In clojure,
(assoc {})
throws an arity exception, but
(dissoc {})
does not. Why? I would have expected either both of them to throw an exception, or both to make no changes when no keys or values are provided.
EDIT: I see a rationale for allowing these forms; it means we can apply assoc or dissoc to a possibly empty list of arguments. I just don't see why one would be allowed and the other not, and I'm curious as to whether there's a good reason for this that I'm missing.
I personally think the lack of 1-arity assoc is an oversight: whenever a trailing list of parameters is expected (& stuff), the function should normally be capable of working with zero parameters in order to make it possible to apply it to an empty list.
Clojure has plenty of other functions that work correctly with zero arguments, e.g. + and merge.
On the other hand, Clojure has other functions that don't accept zero trailing parameters, e.g. conj.
So the Clojure API is a bit inconsistent in this regard.
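If the goal is to apply assoc to a possibly empty list of key/value pairs, a thin wrapper avoids depending on which arities assoc accepts. This is only a sketch; safe-assoc is a hypothetical helper, not part of clojure.core.

```clojure
;; Hypothetical helper (not in clojure.core): returns the map unchanged
;; when no key/value pairs are supplied.
(defn safe-assoc [m & kvs]
  (if (seq kvs)
    (apply assoc m kvs)
    m))

(apply dissoc {:a 1} [])          ;=> {:a 1}
(apply safe-assoc {:a 1} [])      ;=> {:a 1}
(apply safe-assoc {:a 1} [:b 2])  ;=> {:a 1, :b 2}
```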
This is not an authoritative answer, but is based on my testing and looking at ClojureDocs:
dissoc's arities include one that takes a single argument, a map. No key/value is removed from the map in that case.
(def test-map {:account-no 12345678 :lname "Jones" :fnam "Fred"})
(dissoc test-map)
{:account-no 12345678, :lname "Jones", :fnam "Fred"}
assoc has no similar arity. That is, calling assoc requires a map, a key, and a value.
Now why this was designed this way is a different matter, and if you do not receive an answer with that information (I hope you do), then I suggest offering a bounty or asking the question on Clojure's Google Groups.
Here is the source.
(defn dissoc
  "dissoc[iate]. Returns a new map of the same (hashed/sorted) type,
  that does not contain a mapping for key(s)."
  {:added "1.0"
   :static true}
  ([map] map)
  ([map key]
   (. clojure.lang.RT (dissoc map key)))
  ([map key & ks]
   (let [ret (dissoc map key)]
     (if ks
       (recur ret (first ks) (next ks))
       ret))))
I want to write a function that would return the boolean true if the given collection is not empty and false otherwise.
I could either do
defn ..
(boolean (seq coll))
or
defn ..
(not (empty? coll))
As I am new to clojure I was initially inclined to go with #2 (more readable), but the clojure api reference for empty? explicitly says use the idiom (seq coll) instead of (not (empty? coll)), maybe to avoid double negation.
I want to know what is the clojure way to check if a collection is non-empty and return a boolean true/false.
According to Joy of Clojure, nil punning with seq is idiomatic:
(defn print-seq [s]
  (when (seq s)
    (prn (first s))
    (recur (rest s))))
"...the use of seq as a terminating condition is the idiomatic way to test whether a sequence is empty. If we tested [in the above example] just s instead of (seq s), then the terminating condition wouldn't occur even for empty collections..."
The passage from empty?'s docstring which you mentioned means in particular that such a nonempty? function should never be necessary, or even particularly useful, because seq can always stand in for it in Boolean contexts, which in pure Clojure code it can.
If you feel compelled to write such a function nonetheless, I'll say that I like the first approach better. empty? is built on seq anyway, so there's no avoiding calling it; just casting the result to Boolean seems cleaner than two trips through not. For other options, see e.g. nil?, false? (I still prefer the cast).
Incidentally, why do you want to write this...? For calling a Java method with a boolean argument perhaps? In that case, I think the cast would express the intention nicely.
Update: An example to illustrate the latter point:
A simple Java class:
public class Foo {
    public static boolean foo(boolean arg) {
        return !arg;
    }
}
Some Clojure client code:
(Foo/foo nil)
; => NullPointerException
(Foo/foo (boolean nil))
; => true
In addition to Michal Marczyk's excellent answer, I'll point out that there is a specific not-empty function:
http://clojure.github.io/clojure/clojure.core-api.html#clojure.core/not-empty
but it doesn't do exactly what you ask for. (Though it will work in most situations).
Not-empty returns nil if the collection is empty, and the collection itself if the collection is not empty. For predicate tests, that will function well. If you actually need true and false values, then (not (empty? x)) is what you're after.
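The return values discussed in these answers can be compared directly at a REPL. The expressions below use only clojure.core functions:

```clojure
(seq [])                  ;=> nil
(seq [1 2])               ;=> (1 2)

(not-empty [])            ;=> nil
(not-empty [1 2])         ;=> [1 2] (the collection itself)

(boolean (seq []))        ;=> false
(boolean (seq [1 2]))     ;=> true
(boolean (not-empty ""))  ;=> false
```

In an if or when, the nil-or-truthy results of seq and not-empty behave the same as the booleans; the explicit coercion only matters when a real true or false is required, for example in Java interop.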
If you need a boolean, I think (comp boolean seq) has a nice ring to it.
Example usage:
((comp boolean seq) coll)
And if you need to store it as a fn for later:
(def not-empty' (comp boolean seq))
I tried the following in Clojure, expecting to have the class of a non-lazy sequence returned:
(.getClass (doall (take 3 (repeatedly rand))))
However, this still returns clojure.lang.LazySeq. My guess is that doall does evaluate the entire sequence, but returns the original sequence as it's still useful for memoization.
So what is the idiomatic means of creating a non-lazy sequence from a lazy one?
doall is all you need. Just because the seq has type LazySeq doesn't mean it has pending evaluation. Lazy seqs cache their results, so all you need to do is walk the lazy seq once (as doall does) in order to force it all, and thus render it non-lazy. seq does not force the entire collection to be evaluated.
This is to some degree a question of taxonomy. A lazy sequence is just one type of sequence, as is a list, vector, or map. So the answer is of course "it depends on what type of non-lazy sequence you want to get":
Take your pick from:
an ex-lazy (fully evaluated) lazy sequence (doall ... )
a list for sequential access (apply list (my-lazy-seq)) OR (into () ...) (note that into () reverses the order of the elements)
a vector for later random access (vec (my-lazy-seq))
a map or a set if you have some special purpose.
You can have whatever type of sequence best suits your needs.
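The options listed above look like this in practice. A short sketch with a small lazy seq; xs is an illustrative name.

```clojure
(def xs (map inc (range 3)))  ; a lazy seq of 1, 2, 3

(class (doall xs))  ;=> clojure.lang.LazySeq (same type, fully realized)
(vec xs)            ;=> [1 2 3]
(apply list xs)     ;=> (1 2 3)
(into () xs)        ;=> (3 2 1), since conj on a list prepends
(set xs)            ;=> #{1 2 3}
```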
This Rich guy seems to know his clojure and is absolutely right.
But I think this code snippet, using your example, might be a useful complement to this question:
=> (realized? (take 3 (repeatedly rand)))
false
=> (realized? (doall (take 3 (repeatedly rand))))
true
Indeed, the type has not changed, but the realization state has.
I stumbled on this blog post about doall not being recursive. For that, I found that the first comment in the post did the trick. Something along the lines of:
(use 'clojure.walk)
(postwalk identity nested-lazy-thing)
I found this useful in a unit test where I wanted to force evaluation of some nested applications of map to force an error condition.
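The difference between doall and the postwalk trick can be observed by counting how many inner elements get computed. This is a sketch under the assumption that the nested structure is a lazy seq of lazy seqs; counter, nested, before and after are illustrative names.

```clojure
(require '[clojure.walk :as walk])

;; Illustrative counter: incremented once per inner element computed.
(def counter (atom 0))

;; A lazy seq of two lazy seqs with three elements each.
(def nested
  (map (fn [i]
         (map (fn [j] (swap! counter inc) (* i j))
              (range 3)))
       (range 2)))

;; doall forces only the outer seq; the inner seqs stay unrealized.
(def before (do (doall nested) @counter))

;; postwalk visits every level and so forces the inner seqs too.
(def after (do (walk/postwalk identity nested) @counter))

before  ;=> 0
after   ;=> 6
```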
(.getClass (into '() (take 3 (repeatedly rand))))