Why does Clojure idiom prefer returning nil instead of an empty list, as in Scheme? - clojure

In a comment on another question, someone said that idiomatic Clojure prefers returning nil rather than an empty list, as one would in Scheme. Why is that?
For example,
(when (seq lat) ...)
instead of
(if (empty? lat)
  '()
  ...)

I can think of a few reasons:
Logical distinction - In Clojure nil means nothing / absence of value, whereas '() (the empty list) is a value; it just happens to be a value that is an empty list. It's quite often conceptually and logically useful to distinguish between the two.
Fit with JVM - The JVM object model supports null references, and quite a lot of Java APIs return null to mean "nothing" or "value not found". So to ensure easy JVM interoperability, it makes sense for Clojure to use nil in a similar way.
Laziness - The logic here is quite complicated, but my understanding is that using nil for "no list" works better with Clojure's lazy sequences. Since most of Clojure's core sequence functions are lazy, it makes sense for this usage to be standard. See http://clojure.org/lazy for some extra explanation.
"Falsiness" - It's convenient to use nil to mean "nothing" and also to mean "false" when writing conditional code that examines collections, so you can write code like (if (some-map :some-key) ...) to test whether a hashmap contains a (non-nil, non-false) value for a given key.
Performance - It's more efficient to test for nil than to examine a list to see if it is empty, so adopting this idiom as the standard can lead to faster idiomatic code.
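To make the "falsiness" point concrete, here is a small sketch. The contents of some-map are made up for illustration; only the (if (some-map :some-key) ...) pattern comes from the text above.

```clojure
;; Illustrative data: the keys and values are assumptions, not from the answer.
(def some-map {:a 1 :b nil :c false})

(if (some-map :a) :found :missing)   ;=> :found
(if (some-map :z) :found :missing)   ;=> :missing

;; Caveat: a key that maps to nil or false also takes the "missing" branch.
(if (some-map :b) :found :missing)   ;=> :missing
;; contains? checks for the key itself, regardless of the value.
(contains? some-map :b)              ;=> true

;; The empty list is truthy, while (seq '()) is nil and therefore falsey.
(if '() :truthy :falsy)              ;=> :truthy
(if (seq '()) :truthy :falsy)        ;=> :falsy
```

So the nil test is convenient, but when nil or false are legitimate values in the map, contains? is probably the safer check.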
Note that there are still some functions in Clojure that do return an empty list. An example is rest:
(rest [1])
=> ()
The question on rest vs. next goes into some detail on why this is.
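As a quick illustration of the difference, the sketch below compares rest and next at the REPL. The helper sum-all is a hypothetical name used only for this example.

```clojure
;; rest always returns a (possibly empty) sequence, never nil.
(rest [1])     ;=> ()
(rest nil)     ;=> ()

;; next returns nil once nothing is left, so it can drive a conditional.
(next [1])     ;=> nil
(next [1 2])   ;=> (2)

;; Hypothetical helper: sum a collection, using nil as the stop signal.
(defn sum-all [coll]
  (loop [s (seq coll)
         acc 0]
    (if s
      (recur (next s) (+ acc (first s)))
      acc)))

(sum-all [1 2 3])  ;=> 6
(sum-all [])       ;=> 0
```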

Also note that the union of collection types and nil forms a monoid, with concatenation as the monoid operation and nil as the monoid identity (zero). So nil keeps the empty-list semantics under concatenation while also representing a false or "missing" value.
Python is another language where common monoid identities represent false values: 0, empty list, empty tuple.
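A rough way to see the identity behaviour at the REPL is shown below. Note that concat returns a lazy sequence, so "the same" here means equal as sequences, not identical objects.

```clojure
;; nil behaves as the identity element for concatenation.
(concat nil [1 2])   ;=> (1 2)
(concat [1 2] nil)   ;=> (1 2)
(concat nil nil)     ;=> ()

;; Other collection functions also accept nil as "nothing to add".
(into [1 2] nil)     ;=> [1 2]
(conj nil 1)         ;=> (1)

;; Folding with concat, starting from nil as the zero.
(reduce concat nil [[1] nil [2 3]])  ;=> (1 2 3)
```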

From The Joy of Clojure
Because empty collections act like true in Boolean contexts, you need an idiom for testing whether there's anything in a collection to process. Thankfully, Clojure provides such a technique:
(seq [1 2 3])
;=> (1 2 3)
(seq [])
;=> nil
In other Lisps, like Common Lisp, the empty list is used to mean nil. This is known as nil punning and is only viable when the empty list is falsey. Returning nil here is Clojure's way of reintroducing nil punning.

Since I wrote the comment, I will write an answer. (skuro's answer provides all the information, but maybe a bit too much.)
First of all, I think the more important points should come first.
seq is just what everybody uses most of the time, but empty? is fine too; it's just (not (seq lat)).
In Clojure '() is truthy, so normally you want to return something falsey once the sequence is exhausted.
If only one branch of your if is important and the other returns false, '(), or something like that, why write that branch down at all? when has only one branch, which is especially good if you want side effects: you don't have to use do.
See this example:
(if false
  '()
  (do (println 1)
      (println 2)
      (println 3)))
Instead, you can write
(when true
  (println 1)
  (println 2)
  (println 3))
Not that different, but I think it's easier to read.
P.S.
Note that there are functions called if-not and when-not; they are often better than (if (not true) ...)
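For what it's worth, here is a small sketch of if-not and when-not next to the (if (not ...)) form. The name lat is borrowed from the question; its value here is just an assumption for the example.

```clojure
;; lat is an empty vector here purely for illustration.
(def lat [])

;; These two forms are equivalent; if-not saves the explicit not.
(if (not (seq lat)) :empty :non-empty)  ;=> :empty
(if-not (seq lat) :empty :non-empty)    ;=> :empty

;; when-not has a single branch and returns nil when the test is truthy.
(when-not (seq lat) :empty)             ;=> :empty
(when-not (seq [1 2]) :empty)           ;=> nil
```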

Related

In Clojure, how can I add support for common functions like empty? and count to my new type?

As I understand, Clojure makes it "easy" to solve the "expression problem".
But I can't find details on how to do this. How can I create a new type (e.g. with defrecord) that handles things like empty? and count?
The two examples empty? and count functions are part of Clojure's core and their implementations are driven by performance considerations, so they may not be the best examples for the solution of the expression problem. Anyway:
You can make empty? work by making seq work on your type, for example by implementing the Seqable interface.
You can make count work by implementing the Counted interface.
Example code:
(deftype Tuple [a b]
  clojure.lang.Counted
  (count [_] 2)
  clojure.lang.Seqable
  (seq [_] (list a b)))
(count (->Tuple 1 2)) ;=> 2
(empty? (->Tuple 1 2)) ;=> false
A more general solution for a new function would be either:
Creating a multimethod for your function. Now you need to write custom methods (via defmethod) for the supported types.
Creating a protocol that contains your function and making the types satisfy the protocol via extend-protocol or extend-type.
In either case you have the ability to create a default implementation and new implementations for new or existing types any time. Even during runtime!
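As a minimal sketch of both options, assuming a made-up protocol Describable, a made-up record Pair, and a made-up multimethod size (none of these names are from Clojure core or from the answer above):

```clojure
;; Hypothetical protocol with one function.
(defprotocol Describable
  (describe [x] "Return a short description of x."))

;; Hypothetical new type.
(defrecord Pair [a b])

;; Extend the protocol to the new type, to existing types, and to nil.
(extend-protocol Describable
  Pair
  (describe [p] (str "pair of " (:a p) " and " (:b p)))
  String
  (describe [s] (str "string of length " (count s)))
  nil
  (describe [_] "nothing")
  Object
  (describe [_] "something else"))

(describe (->Pair 1 2))  ;=> "pair of 1 and 2"
(describe "abc")         ;=> "string of length 3"
(describe nil)           ;=> "nothing"
(describe 42)            ;=> "something else"

;; The multimethod route: dispatch on class, with a default implementation.
(defmulti size class)
(defmethod size Pair [_] 2)
(defmethod size :default [x] (count x))

(size (->Pair 1 2))      ;=> 2
(size [1 2 3])           ;=> 3
```

The Object and :default cases play the role of the default implementation mentioned above; further extend-protocol or defmethod forms can be evaluated later without touching this code.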

Why does `(count nil)` return 0?

In Clojure, I find this surprising:
> (count nil)
0
I would expect a type error, as in this case:
> (count 77)
java.lang.UnsupportedOperationException: count not supported on this type: Long
since nil is not a list:
> (list? nil)
false
Does nil have a special status as an empty sequence?
From the official documentation:
count:
Returns the number of items in the collection. (count nil) returns
0. Also works on strings, arrays, and Java Collections and Maps
So this is the spec ;)
I imagine that it ensures that any mutating value having a "countable" type can be handled at runtime.
Indeed, any reference, referencing an allowed type (strings, arrays, and Java Collections and Maps) might target nil at some point.
There is an old Lisp tradition of conflating nil and the empty list.
Clojure doesn't adhere to that tradition, but on the page about differences from other Lisps
http://clojure.org/lisps
you can read
A big difference in Clojure, is sequences. Sequences are not specific
collections, esp. they are not necessarily concrete lists. When you
ask an empty collection for a sequence of its elements (by calling
seq) it returns nil, saying "I can't produce one". When you ask a
sequence on its last element for the rest it returns another logical
sequence. You can only tell if that sequence is empty by calling seq
on it in turn. This enables sequences and the sequence protocol to be
lazy.
Thus, to be able to chain seq calls with many other traditional sequence-processing functions (first, rest, etc.), you have to treat nil as a kind of empty list (this is just my understanding of the whole affair).
Does nil have a special status as an empty sequence?
Yes. This is called nil-punning - a Lisp tradition, as Freakhill says.
In Clojure, it only works in one direction:
If you supply nil where a sequence is expected, it turns itself
into the empty sequence. This works in general, not just for count.
For example,
(concat nil) ; => ()
(map inc nil) ; => ()
But if you supply an empty sequence where nil might be expected,
for example, as a logical false value, it does not convert to nil.
For example
(if () 1 2) ; => 1
(if nil 1 2) ; => 2
This page explains how Clojure, by departing from the traditional Lisp model, is able to better exploit lazy sequences.
Clojure has an abstraction-first design (these abstractions can be protocols, interfaces, or multimethods). That is to say, a function shouldn't generally target a specific datatype; rather, it should operate on some abstraction, and any datatype can implement that abstraction in order to be used by that function.
Functions in Clojure that work on ordered collections should target clojure.lang.ISeq. The complication here is that we also want to target native types like String, Array, or List, where we cannot add a supertype retroactively. The solution is to use seq to get an instance of clojure.lang.ISeq. It turns out that it is convenient to treat nil as an empty ISeq. This simplifies, for example, various linked-list representations, as it's natural to have a nil next element for the last element of the list, and thus to treat nil as an empty list.
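To illustrate, the following REPL-style sketch shows seq adapting a String and a java.util.ArrayList, and the core sequence functions accepting nil as if it were an empty sequence:

```clojure
;; seq adapts host types that know nothing about clojure.lang.ISeq.
(seq "ab")                           ;=> (\a \b)
(seq (java.util.ArrayList. [1 2]))   ;=> (1 2)

;; Empty things and nil all yield nil from seq.
(seq "")     ;=> nil
(seq [])     ;=> nil
(seq nil)    ;=> nil

;; Sequence functions treat nil like an empty sequence.
(first nil)    ;=> nil
(rest nil)     ;=> ()
(count nil)    ;=> 0
(map inc nil)  ;=> ()
```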

In clojure, why the type of an empty list is different from that of non-empty lists?

I want to judge if two values are of same type, but I found that the type of an empty list is clojure.lang.PersistentList$EmptyList rather than clojure.lang.PersistentList.
user=> (def la '())
#'user/la
user=> (def lb '(1 2))
#'user/lb
user=> (def t (map type [la lb]))
#'user/t
user=> t
(clojure.lang.PersistentList$EmptyList clojure.lang.PersistentList)
user=> (apply = t)
false
user=>
So, I'm wondering why is the type of an empty list different from that of non-empty lists and what's the correct way to tell if two things are of same type?
Don't rely on the concrete types of Clojure data structures. They are undocumented implementation details, and you have no guarantee that they won't change in future versions of Clojure.
It is much safer to rely on the abstractions (e.g. as defined by the IPersistentList or ISeq interfaces). These are much less likely to change in ways that might break your code. (My understanding is that Rich Hickey is very big on backwards compatibility when it comes to abstractions. If you depend on a concrete implementation, I believe he would say it's your own fault if things break.)
But even better, you should use functions in clojure.core such as seq? or list?, depending on exactly what it is you want to detect. Not only are these likely to maintain backwards compatibility for a long time, they also have a chance of working correctly on non-JVM versions of Clojure (e.g. ClojureScript).
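For instance, using the la and lb from the question, a sketch of checking against predicates and interfaces rather than concrete classes might look like this:

```clojure
(def la '())
(def lb '(1 2))

;; Concrete classes differ, as the question observed.
(= (type la) (type lb))   ;=> false

;; Predicates from clojure.core agree that both are lists and seqs.
(list? la)                ;=> true
(list? lb)                ;=> true
(seq? la)                 ;=> true

;; The same check expressed against the interface directly.
(every? #(instance? clojure.lang.IPersistentList %) [la lb])  ;=> true

;; A vector is not a list, but it is sequential.
(list? [1 2])             ;=> false
(sequential? [1 2])       ;=> true
```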

Are there variables in Clojure sequence comprehensions?

I'm reading Programming Clojure 2nd edition, and on page 49 it covers Clojure's for loop construct, which it says is actually a sequence comprehension.
The authors suggest the following code:
(defn indexed [coll] (map-indexed vector coll))
(defn index-filter [pred col]
  (when pred
    (for [[idx elt] (indexed col) :when (pred elt)] idx)))
(index-filter #{\a} "aba")
(0 2)
...is preferable to a Java-based imperative example, and the evidence given is that "by using higher-order functions...the functional index-of-any avoids all need for variables."
What are "idx", "elt" if they are not variables? Do they mean variables besides the accumulators?
Also, why #{\a} instead of "a"?
pred is a function, and #{\a} is a set containing the character a. In Clojure, a set is also a function: called with an argument, it returns that argument (a truthy value) if the set contains it, and nil otherwise. You could also use #(= % \a) or (fn [x] (= \a x)).
As the other answer implies, "no state was created in the making of this example." idx and elt function like variables, but are local only to the for sequence comprehension, so the code is more compact, not stateful, and arguably clearer (once you're used to sequence comprehensions, at least :-) ) -- perhaps the text is not optimally clear on this point.
Functional languages have no variables in the imperative sense. You need to distinguish between a variable and a value: idx is just a name bound to a concrete value, and you cannot reassign it (though you can rebind the name to another value).
The first parameter of index-filter is a predicate, that is, a function that returns true or false. #{\a} is a set data structure, but it can also be treated as a function. If you pass an element as the argument to the set, it returns that element (which acts like true) if the element is in the set, and nil (which acts like false) otherwise. So you can think of this set predicate as the anonymous function #(contains? #{\a} %), written more compactly.
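A short sketch of both points, the set acting as a predicate and a name being rebound rather than mutated. The local name get-x is made up for this example.

```clojure
;; A set called as a function returns the matching element, or nil.
(#{\a} \a)                  ;=> \a
(#{\a} \b)                  ;=> nil
(contains? #{\a} \a)        ;=> true

;; Caveat: a set containing false or nil cannot be used as a predicate
;; for those elements, because the returned element is itself falsey.
(#{false} false)            ;=> false
(contains? #{false} false)  ;=> true

;; Rebinding a name creates a new binding; it does not change the old one.
;; get-x (a hypothetical name) closes over the first x.
(let [x 1
      get-x (fn [] x)
      x 2]
  [(get-x) x])              ;=> [1 2]
```

A string such as "a" would not work in place of #{\a}, since strings are not callable as functions in Clojure.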

What is the correct "clojure way" to check if a collection is non empty

I want to write a function that would return the boolean true if the given collection is not empty and false otherwise.
I could either do
(defn nonempty? [coll]
  (boolean (seq coll)))
or
(defn nonempty? [coll]
  (not (empty? coll)))
As I am new to Clojure, I was initially inclined to go with #2 (more readable), but the Clojure API reference for empty? explicitly says to use the idiom (seq coll) instead of (not (empty? coll)), maybe to avoid double negation.
I want to know what is the clojure way to check if a collection is non-empty and return a boolean true/false.
According to Joy of Clojure, nil punning with seq is idiomatic:
(defn print-seq [s]
  (when (seq s)
    (prn (first s))
    (recur (rest s))))
"...the use of seq as a terminating condition is the idiomatic way to test whether a sequence is empty. If we tested [in the above example] just s instead of (seq s), then the terminating condition wouldn't occur even for empty collections..."
The passage from empty?'s docstring which you mentioned means in particular that such a nonempty? function should never be necessary, or even particularly useful, because seq can always stand in for it in Boolean contexts, which in pure Clojure code it can.
If you feel compelled to write such a function nonetheless, I'll say that I like the first approach better. empty? is built on seq anyway, so there's no avoiding calling it; just casting the result to Boolean seems cleaner than two trips through not. For other options, see e.g. nil?, false? (I still prefer the cast).
Incidentally, why do you want to write this...? For calling a Java method with a boolean argument perhaps? In that case, I think the cast would express the intention nicely.
Update: An example to illustrate the latter point:
A simple Java class:
public class Foo {
    public static boolean foo(boolean arg) {
        return !arg;
    }
}
Some Clojure client code:
(Foo/foo nil)
; => NullPointerException
(Foo/foo (boolean nil))
; => true
In addition to Michal Marczyk's excellent answer, I'll point out that there is a specific not-empty function:
http://clojure.github.io/clojure/clojure.core-api.html#clojure.core/not-empty
but it doesn't do exactly what you ask for. (Though it will work in most situations).
not-empty returns nil if the collection is empty, and the collection itself if it is not empty. For predicate tests, that works well. If you actually need true and false values, then (not (empty? x)) is what you're after.
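A few REPL-style examples of that behaviour, with boolean added where a strict true/false is wanted:

```clojure
;; not-empty returns the collection itself, or nil when it is empty.
(not-empty [])       ;=> nil
(not-empty [1 2])    ;=> [1 2]
(not-empty "")       ;=> nil

;; In a conditional, the result can be used directly.
(if (not-empty [1 2]) :has-items :no-items)  ;=> :has-items

;; Wrapping in boolean gives a strict true/false.
(boolean (not-empty {}))  ;=> false
(boolean (seq [1]))       ;=> true
(not (empty? [1]))        ;=> true
```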
If you need a boolean, I think (comp boolean seq) has a nice ring to it.
Example usage:
((comp boolean seq) coll)
And if you need to store it as a fn for later:
(def not-empty' (comp boolean seq))