Why does `(count nil)` return 0? - clojure

In Clojure, I find this surprising:
> (count nil)
0
I would expect a type error, as in this case:
> (count 77)
java.lang.UnsupportedOperationException: count not supported on this type: Long
since nil is not a list:
> (list? nil)
false
Does nil have a special status as an empty sequence?

From the official documentation:
count:
Returns the number of items in the collection. (count nil) returns
0. Also works on strings, arrays, and Java Collections and Maps
So this is the spec ;)
I imagine this ensures that any value of a "countable" type can still be handled at runtime when it happens to be absent.
Indeed, any reference to one of the allowed types (strings, arrays, and Java Collections and Maps) might be nil at some point.
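
A minimal sketch of that situation, assuming two hypothetical maps where the :tags entry may be missing. The names with-tags, without-tags and tag-count are made up for illustration:

```clojure
;; Hypothetical data: one map carries :tags, the other does not.
(def with-tags    {:name "a" :tags ["x" "y"]})
(def without-tags {:name "b"})

(defn tag-count
  "Number of tags on m. A missing :tags key yields nil, which counts as 0."
  [m]
  (count (:tags m)))

(tag-count with-tags)    ; => 2
(tag-count without-tags) ; => 0

;; The other types named in the docstring of count:
(count "abc")                         ; => 3
(count (int-array [1 2 3]))           ; => 3
(count (java.util.ArrayList. [1 2]))  ; => 2
(count (java.util.HashMap. {:a 1}))   ; => 1
```

Without the nil case, tag-count would need an explicit nil check before every call to count.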

There is an old Lisp tradition of conflating nil and the empty list.
Clojure doesn't adhere to it, but on the page about differences from other Lisps
http://clojure.org/lisps
you can read:
A big difference in Clojure, is sequences. Sequences are not specific
collections, esp. they are not necessarily concrete lists. When you
ask an empty collection for a sequence of its elements (by calling
seq) it returns nil, saying "I can't produce one". When you ask a
sequence on its last element for the rest it returns another logical
sequence. You can only tell if that sequence is empty by calling seq
on it in turn. This enables sequences and the sequence protocol to be
lazy.
Thus to be able to chain "seq" calls with many other traditional processing functions (first, rest etc.) you have to deal with nil as some kind of an empty list (this is just my understanding of the whole affair).
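
A small sketch of how this plays out in practice. The function sum-all is a made-up example, not something from the quoted page:

```clojure
(seq [])          ; => nil, "I can't produce one"
(rest [1])        ; => (), a logical (empty) sequence
(seq (rest [1]))  ; => nil, calling seq is how you find out it is empty
(first nil)       ; => nil
(rest nil)        ; => ()

(defn sum-all
  "Sums the numbers in coll, using seq to detect the end."
  [coll]
  (if-let [s (seq coll)]
    (+ (first s) (sum-all (rest s)))
    0))

(sum-all [1 2 3]) ; => 6
(sum-all [])      ; => 0
(sum-all nil)     ; => 0
```

Because seq maps both nil and every empty collection to nil, the same termination test covers all three calls.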

Does nil have a special status as an empty sequence?
Yes. This is called nil-punning - a Lisp tradition, as Freakhill says.
In Clojure, it only works in one direction:
If you supply nil where a sequence is expected, it turns itself
into the empty sequence. This works in general, not just for count.
For example,
(concat nil) ; => ()
(map inc nil) ; => ()
But if you supply an empty sequence where nil might be expected,
for example, as a logical false value, it does not convert to nil.
For example
(if () 1 2) ; => 1
(if nil 1 2) ; => 2
This page explains how Clojure, by departing from the traditional Lisp model, is able to better exploit lazy sequences.

Clojure has an abstraction first design (these abstractions can be protocols, interfaces, or multimethods). That is to say, that a function shouldn't generally target a specific datatype, but rather it should operate on some abstraction type, and let any datatype implement that abstraction in order to be used by that function.
Functions in Clojure that work on ordered collections should target clojure.lang.ISeq. The complication here is that we also want to target native types like String or Array or List, where we cannot add a supertype retroactively. Our solution is to use seq to get an instance of clojure.lang.ISeq. It turns out that it is convenient to treat nil as an empty ISeq. This simplifies, for example, various linked list representations, as it's natural to have a nil next element for the last element of the list, and thus to treat nil as an empty list.
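
As a rough illustration, here is a hypothetical helper, second-item, that is written against the sequence abstraction only. The explicit seq call is redundant (first and rest call it themselves) but makes the conversion visible:

```clojure
(defn second-item
  "Hypothetical helper: second element of anything seq can walk, or nil."
  [x]
  (first (rest (seq x))))

(second-item [1 2 3])                       ; => 2
(second-item "abc")                         ; => \b
(second-item (int-array [7 8]))             ; => 8
(second-item (java.util.ArrayList. [1 2]))  ; => 2
(second-item nil)                           ; => nil
```

The same function body serves a Clojure vector, a Java String, a Java array, a Java List, and nil.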

clojure's `into` in common lisp

Clojure has a handy (into to-coll from-coll) function that adds the elements of from-coll to to-coll, retaining to-coll's type.
How can this be implemented in Common Lisp?
A first attempt would be
(defun into (seq1 seq2)
  (concatenate (type-of seq1) seq1 seq2))
but this fails, since type-of includes the vector's length in its result, which rules out adding more elements (at least in SBCL). It still works for a list as the first argument
(while still failing for the empty list).
The question is: is it possible to write this kind of function without using generic methods and/or complex processing of the type-of result (e.g. removing the length for vectors/arrays)?
I'm okay with into acting as append (in contrast with Clojure, where the result of into depends on the target collection type). Let's call it concat-into.
In Clojure, you have a concrete idea (most of the time) of what kind that first collection is when you use into, because it changes the semantics: if it is a list, additional elements will be conjed onto the front, if it is a vector, they will be conjed to the back, if it is a map, you need to supply map entry designators (i. e. actual map entries or two-element vectors), sets are more flexible but also carry their own semantics. That's why I'd guess that using concatenate directly, explicitly supplying the type, is probably a good enough fit for many use cases.
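
For reference, a short sketch of those per-type semantics in Clojure itself:

```clojure
;; List target: each element is conjed onto the front, so order is reversed.
(into '(1 2) [3 4])        ; => (4 3 1 2)

;; Vector target: elements are conjed onto the back.
(into [1 2] '(3 4))        ; => [1 2 3 4]

;; Map target: the source must supply map entries or two-element vectors.
(into {:a 1} [[:b 2]])     ; => {:a 1, :b 2}

;; Set target: duplicates collapse.
(into #{1 2} [2 3])        ; => #{1 2 3}
```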
Other than that, I think that it could be useful to extend this functionality (Common Lisp only has a closed set of sequence types), but for that, it seems too obviously convenient to use generic functions to ignore. It is not trivial to provide a solution that is extensible, generic, and performant.
EDIT: To summarize: no, you can't get that behaviour with clever application of one or two “built-ins”, but you can certainly write an extensible and generic solution using generic functions.
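OK, the only thing I've come up with (besides generic methods) is this dead simple function:
(defun into (target source)
  ;; Build a result type that keeps the element type but drops the length.
  ;; (*) evaluates to 1, so the vector case names a rank-1 array type.
  (let ((target-type (etypecase target
                       (vector (list 'array (array-element-type target) (*)))
                       (list 'list))))
    (concatenate target-type target source)))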
CL-USER> (into (list 1 2 4) "asd")
;;=> (1 2 4 #\a #\s #\d)
CL-USER> (into #*0010 (list 1 1 0 0))
;;=> #*00101100
CL-USER> (into "asdasd" (list #\a #\b))
;;=> "asdasdab"
Also, a simple empty implementation:
(defun empty (target)
  (etypecase target
    ;; An empty, growable vector with the same element type as target.
    (vector (make-array 0
                        :element-type (array-element-type target)
                        :adjustable t :fill-pointer 0))
    ;; No body for the list clause, so it returns nil, the empty list.
    (list)))
The result indeed (as @Svante noted) doesn't have the exact type of the target, but rather is "a collection with the same element type as target". It doesn't conform to Clojure's protocol (where a list target should be prepended to).
I can't see where it is flawed (if it is), so it would be nice to hear about that. Anyway, as it was only for the sake of education, that will do.

Clojure basics: counting frequencies

I am learning Clojure, and I saw this bit of code online:
(count (filter #{42} coll))
And it does, as stated, count occurrences of the number 42 in coll. Is #{42} a function? The Clojure documentation on filter says that it should be, since the snippet works as advertised. I just have no idea how it works. If someone could clarify this for me, that would be great. My own solution to this same thing would have been:
(count (filter #(= %1 42) coll))
How come my filtering function has parentheses and the snippet I found online has curly braces around the filtering function (#(...) vs. #{...})?
=> #{42}
#{42}
Defines a set...
=> (type #{42})
clojure.lang.PersistentHashSet
=> (supers (type #{42}))
#{clojure.lang.IHashEq java.lang.Object clojure.lang.IFn ...}
Interestingly the set implements IFn so you can treat it like a function. The behaviour of the function is "if this item exists in the set, return it".
=> (#{2 3} 3)
3
=> (#{2 3} 4)
nil
Other collections such as map and vector stand in as functions in a similar fashion, retrieving by key or index as appropriate.
=> ({:x 23 :y 26} :y)
26
=> ([5 7 9] 1)
7
Sweet, no? :-)
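
Putting that back into the original question, a small sketch with a made-up input vector xs:

```clojure
(def xs [1 42 3 42 42])  ; hypothetical sample data

;; The set acts as the predicate: it returns 42 (truthy) or nil (falsey).
(count (filter #{42} xs))        ; => 3

;; The equivalent anonymous function from the question.
(count (filter #(= % 42) xs))    ; => 3

;; clojure.core/frequencies counts every distinct value in one pass.
(get (frequencies xs) 42 0)      ; => 3
(get (frequencies xs) 7 0)       ; => 0
```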
Yes, #{42} is a function,
because it's a set, and sets, amongst other capabilities, are
functions: they implement the clojure.lang.IFn interface.
Applied to any value in the set, they return it; applied to anything
else, they return nil.
So #{42} tests whether its argument is 42 (only nil and false are false, remember).
The Clojure way is to make everything a function that might usefully be one:
Sets work as a test for membership.
Maps work as key lookup.
Vectors work as index lookup.
Keywords work as lookup in the map argument.
This often saves you a get, allows you, as in the question, to pass naked data structures to higher order functions such as filter and map, and, in the case of keywords, allows you to move transparently between maps and records for holding your data.
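
A brief sketch of each of these uses. The Person record and the people vector are hypothetical, made up for this example:

```clojure
(defrecord Person [name age])  ; hypothetical record type

(def people [{:name "Ann" :age 30}   ; a plain map
             (->Person "Bob" 25)])   ; a record

;; Keyword as lookup function: works the same on maps and records.
(map :name people)               ; => ("Ann" "Bob")

;; Map as key lookup: missing keys give nil.
(map {:a 1 :b 2} [:a :b :c])     ; => (1 2 nil)

;; Vector as index lookup.
(map [10 20 30] [0 2])           ; => (10 30)

;; Set as membership test.
(filter #{:a :c} [:a :b :c :d])  ; => (:a :c)
```

One difference worth keeping in mind: a map returns nil for a missing key, whereas calling a vector with an out-of-range index throws an exception.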

Are there variables in Clojure sequence comprehensions?

I'm reading Programming Clojure 2nd edition, and on page 49 it covers Clojure's for loop construct, which it says is actually a sequence comprehension.
The authors suggest the following code:
(defn indexed [coll] (map-indexed vector coll))
(defn index-filter [pred col]
  (when pred
    (for [[idx elt] (indexed col) :when (pred elt)] idx)))
(index-filter #{\a} "aba")
(0 2)
...is preferable to a Java-based imperative example, and the evidence given is that "by using higher-order functions...the functional index-of-any avoids all need for variables."
What are "idx", "elt" if they are not variables? Do they mean variables besides the accumulators?
Also, why #{\a} instead of "a"?
pred is a function, and #{\a} is a set containing the character \a. In Clojure, a set is also a function: it returns its argument if the set contains it (a truthy value here) and nil otherwise. You could also use #(= % \a) or (fn [x] (= \a x)).
As the other answer implies, "no state was created in the making of this example." idx and elt function like variables, but are local only to the for sequence comprehension, so the code is more compact, not stateful, and arguably clearer (once you're used to sequence comprehensions, at least :-) ) -- perhaps the text is not optimally clear on this point.
Functional languages have no variables in the imperative sense. You need to distinguish between a variable and a value: idx is just a name bound to a concrete value, and you cannot reassign it (but you can rebind the name to another value).
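
A small sketch of the difference between rebinding and reassignment. The names rebinding-demo and show are made up for illustration:

```clojure
(def rebinding-demo
  (let [idx  1
        show (fn [] idx)   ; closes over the first binding of idx
        idx  (inc idx)]    ; a new binding that shadows the first one
    [(show) idx]))

rebinding-demo ; => [1 2]

;; In a comprehension, each iteration gets a fresh binding of idx.
(for [idx (range 3)] (* idx idx)) ; => (0 1 4)
```

If idx had been mutated in place, show would have returned 2. It returns 1 because the second idx is a different binding, not an update of the first.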
The first parameter of index-filter is a predicate, that is, a function whose result is treated as true or false. #{\a} is a set data structure, but it can also be treated as a function. If you pass an element as an argument to the set, it returns that element (which acts like true) if the element is in the set, and nil (which acts like false) otherwise. So you can think of this set predicate as an anonymous function that could be spelled out more explicitly as #(contains? #{\a} %).
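
The two spellings agree for most members but differ when the set contains nil or false. A short sketch, where has-a? is a made-up name:

```clojure
(def has-a? #{\a})  ; hypothetical name for the set used as a predicate

(has-a? \a)               ; => \a   (truthy)
(has-a? \b)               ; => nil  (falsey)
(contains? #{\a} \a)      ; => true
(contains? #{\a} \b)      ; => false

;; With a falsey member, the set returns the member itself, which is falsey.
(#{nil false} false)                           ; => false
(contains? #{nil false} false)                 ; => true
(filter #{false} [true false])                 ; => ()
(filter #(contains? #{false} %) [true false])  ; => (false)
```

So the bare set is a convenient predicate as long as its members are neither nil nor false.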

Why does Clojure idiom prefer to return nil instead of an empty list, as in Scheme?

From a comment on another question, someone is saying that Clojure idiom prefers to return nil rather than an empty list like in Scheme. Why is that?
Like,
(when (seq lat) ...)
instead of
(if (empty? lat)
'() ...)
I can think of a few reasons:
Logical distinction. In Clojure nil means nothing / absence of value, whereas '(), the empty list, is a value; it just happens to be a value that is an empty list. It's quite often conceptually and logically useful to distinguish between the two.
Fit with JVM - the JVM object model supports null references. And quite a lot of Java APIs return null to mean "nothing" or "value not found". So to ensure easy JVM interoperability, it makes sense for Clojure to use nil in a similar way.
Laziness - the logic here is quite complicated, but my understanding is that using nil for "no list" works better with Clojure's lazy sequences. As much of Clojure's sequence library is lazy, it makes sense for this usage to be standard. See http://clojure.org/lazy for some extra explanation.
"Falsiness" - It's convenient to use nil to mean "nothing" and also to mean "false" when writing conditional code that examines collections, so you can write code like (if (some-map :some-key) ...) to test if a hashmap contains a value for a given key.
Performance - It's more efficient to test for nil than to examine a list to see if it is empty, hence adopting this idiom as standard can lead to higher performance idiomatic code.
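
A sketch of the idiom from the question, plus one caveat about the "falsiness" point. The names double-all and settings are made up for illustration:

```clojure
(defn double-all
  "Doubles every number in lat. Returns nil when lat is empty or nil."
  [lat]
  (when (seq lat)
    (cons (* 2 (first lat))
          (double-all (rest lat)))))

(double-all [1 2 3]) ; => (2 4 6)
(double-all [])      ; => nil
(double-all nil)     ; => nil

;; Caveat: a truthiness test cannot tell a missing key from a stored
;; false (or nil) value. contains? can.
(def settings {:verbose false})  ; hypothetical map

(if (settings :verbose) :present :absent) ; => :absent
(contains? settings :verbose)             ; => true
```

The recursion works because cons accepts nil as its tail, so the nil returned for the empty case ends the list.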
Note that there are still some functions in Clojure that do return an empty list. An example is rest:
(rest [1])
=> ()
This question on rest vs. next goes into some detail on why this is.
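
For a quick side-by-side of the two functions:

```clojure
(rest [1])  ; => ()   an empty sequence, which is truthy
(next [1])  ; => nil  behaves like (seq (rest [1]))
(rest nil)  ; => ()
(next nil)  ; => nil

(if (rest [1]) :truthy :falsey) ; => :truthy
(if (next [1]) :truthy :falsey) ; => :falsey
```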
Also note that the union of collection types and nil form a monoid, with concatenation the monoid plus and nil the monoid zero. So nil keeps the empty list semantics under concatenation while also representing a false or "missing" value.
Python is another language where common monoid identities represent false values: 0, empty list, empty tuple.
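
The monoid laws can be checked informally with concat. This is only a sketch over a few sample values, not a proof:

```clojure
;; nil is a left and right identity for concat.
(concat nil [1 2])  ; => (1 2)
(concat [1 2] nil)  ; => (1 2)

;; Associativity on sample values.
(= (concat (concat [1] [2]) [3])
   (concat [1] (concat [2] [3]))) ; => true

;; nil works as the starting value of a fold.
(reduce concat nil [[1] nil [2 3]]) ; => (1 2 3)

;; Note that concat returns an empty sequence rather than nil itself,
;; so the identity holds up to calling seq on the result.
(concat nil nil)        ; => ()
(seq (concat nil nil))  ; => nil
```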
From The Joy of Clojure
Because empty collections act like true in Boolean contexts, you need an idiom for testing whether there's anything in a collection to process. Thankfully, Clojure provides such a technique:
(seq [1 2 3])
;=> (1 2 3)
(seq [])
;=> nil
In other Lisps, like Common Lisp, the empty list is used to mean nil. This is known as nil punning and is only viable when the empty list is falsey. Returning nil here is Clojure's way of reintroducing nil punning.
Since I wrote the comment, I will write an answer. (skuro's answer provides all the information, but maybe a bit too much.)
First of all, I think the more important things should come first.
seq is just what everybody uses most of the time, but empty? is fine too; it's just (not (seq lat)).
In Clojure '() is true, so normally you want to return something that's false when the sequence is finished.
If only one branch of your if is important and the other returns false/'() or something like that, why should you write that branch down? when has only one branch, which is especially good if you want side effects: you don't have to use do.
See this example:
(if false
  '()
  (do (println 1)
      (println 2)
      (println 3)))
you can write
(when true
  (println 1)
  (println 2)
  (println 3))
Not that different, but I think it's easier to read.
P.S.
Note that there are macros called if-not and when-not; they are often better than (if (not true) ...).
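
A short sketch of those forms applied to the empty-collection test. The function describe is a made-up example:

```clojure
(defn describe
  "Hypothetical helper: classifies coll by whether seq finds anything in it."
  [coll]
  (if-not (seq coll) :empty :non-empty))

(describe [])   ; => :empty
(describe nil)  ; => :empty
(describe [1])  ; => :non-empty

;; when-not evaluates its body only when the test is falsey.
(when-not (seq []) :nothing-to-do)  ; => :nothing-to-do
(when-not (seq [1]) :nothing-to-do) ; => nil

;; empty? agrees with (not (seq ...)) on these inputs.
(= (empty? []) (not (seq [])))   ; => true
(= (empty? [1]) (not (seq [1]))) ; => true
```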

Scheme and Clojure don't have the atom type predicate - is this by design?

Common LISP and Emacs LISP have the atom type predicate. Scheme and Clojure don't have it. http://hyperpolyglot.wikidot.com/lisp
Is there a design reason for this - or is it just not an essential function to include in the API?
In Clojure, the atom predicate isn't so important because Clojure emphasizes various other types of (immutable) data structures rather than focusing on cons cells / lists.
It could also cause confusion. How would you expect this function to behave when given a hashmap, a set or a vector for example? Or a Java object that represents some complex mutable data structure?
Also the name "atom" is used for something completely different - it's one of Clojure's core concurrency mechanisms to manage shared, synchronous, independent state.
Clojure has the coll? (collection?) function, which is (sort of) the inverse of atom?.
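
As a rough sketch, an atom? predicate could be defined on top of coll? if you wanted one. This is a hypothetical definition, not something in clojure.core, and whether strings or nil should count as atoms is a judgment call:

```clojure
(defn atom?
  "Hypothetical predicate: true for anything that is not a Clojure collection."
  [x]
  (not (coll? x)))

(atom? 42)      ; => true
(atom? :k)      ; => true
(atom? "str")   ; => true, strings are not Clojure collections
(atom? nil)     ; => true
(atom? [1 2])   ; => false
(atom? {})      ; => false
(atom? '())     ; => false

;; The unrelated concurrency primitive that already uses the name:
(def counter (atom 0))
(swap! counter inc)
@counter        ; => 1
```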
In the book The Little Schemer, atom? is defined as follows:
(define (atom? x)
  (and (not (pair? x))
       (not (null? x))))
Noting that null is not considered an atom, as other answers have suggested. In the mentioned book atom? is used heavily, in particular when writing procedures that deal with lists of lists.
In the entire IronScheme standard libraries which implement R6RS, I never needed such a function.
In summary:
It is useless
It is easy enough to write if you need it
Which pretty much follows Scheme's minimalistic approach.
In Scheme anything that is not a pair is an atom. As Scheme already defines the predicate pair?, the atom? predicate is not needed, as it is so trivial to define:
(define (atom? s)
  (not (pair? s)))
It's a trivial function:
(defun atom (x)
  (not (consp x)))
It is used in list processing, when the Lisp dialect uses conses to build lists. There are some 'Lisps' for which this is not the case or not central.
An atom is either a symbol, a character, a number, or null.
(define (atom? a)
  (or (symbol? a)
      (char? a)
      (number? a)
      (null? a)))
I think those are all the atoms that exist, if you find more add to the conditional expression. For example, if you think a string is an atom, add (string? a), :-). The absence of a definition for atom, allows you to define it the way you want. After all, Scheme does not know what an atom is.
In Lisp nil is an atom, so I've made null an atom. nil is also a list by simplification nil = (nil . nil), the same way the integral numbers are rational numbers by simplification, 2 = 2/1, 2 is an integral number, 2/1 is a rational number, as both are equals by simplification of the rational one; one says the integral number 2 is also a rational number. But the list predicate is already defined in Scheme, nothing to worry about.
About the question. As far as I am concerned, Scheme has predicates only for class types, and atom is not a class type; atom is an abstraction that incorporates several class types. Maybe that is the reason. But pair is not a class type either, though it does not incorporate several class types, and yet some may consider pair a class type.
Atom means that a certain thing is not a compound thing. One reason not to include such a predicate is when the language allows you to define atomic types, so the plethora of atoms can grow wider and wider, and such a predicate would make no sense. I don't know if Scheme allows for this. I can only say that Scheme predicates (the built-in ones) are all specific. You can ask, is this an apple?, is this an orange?; but you cannot ask is this a fruit?. :-). Well, you can, if you do it yourself. Despite what I said, Scheme has a general predicate number?, and number-specific predicates, integer?, rational?, real?; notwithstanding, number can be thought of as a class type (the other predicates refer to sub-types of number), whereas atom is not (at least in Scheme).
Note:
class types: types that belong to a certain class of things. Example:
number, integer, real, rational, character, procedure, list, vector, string, etc.