clojure's `into` in common lisp - clojure

Clojure has a handy (into to-coll from-coll) function that adds the elements of from-coll to to-coll while retaining to-coll's type.
How can this be implemented in Common Lisp?
A first attempt would be
(defun into (seq1 seq2)
  (concatenate (type-of seq1) seq1 seq2))
but this fails, because type-of includes the vector's length in its result, which rules out adding more elements (at least in SBCL). It still works when the first argument is a non-empty list, but fails for the empty list, whose type is NULL.
The question is: is it possible to write this kind of function without using generic functions and/or complex processing of the type-of result (e.g. removing the length for vectors/arrays)?
I'm okay with into acting as append (in contrast with Clojure, where the result of into depends on the target collection type). Let's call it concat-into.
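To make the failure concrete, here is a small sketch of what type-of reports and how an explicit, length-free type specifier behaves instead. The printed types are implementation-dependent; the comments describe what SBCL typically shows and are an assumption about that implementation.

```lisp
;; TYPE-OF may include the length: SBCL typically prints (SIMPLE-VECTOR 3) here,
;; so CONCATENATE cannot build a 5-element result of that type.
(print (type-of (vector 1 2 3)))

;; For the empty list SBCL reports NULL, a type that only contains NIL.
(print (type-of '()))

;; An explicit type without a length accepts any number of elements.
(print (concatenate 'vector (vector 1 2 3) (list 4 5))) ; prints #(1 2 3 4 5)
(print (concatenate 'list '() (vector 1 2)))            ; prints (1 2)
```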

In Clojure, you (most of the time) have a concrete idea of what kind of collection that first argument is when you use into, because it changes the semantics: if it is a list, additional elements are conjed onto the front; if it is a vector, they are conjed onto the back; if it is a map, you need to supply map entry designators (i.e. actual map entries or two-element vectors); sets are more flexible but also carry their own semantics. That's why I'd guess that using concatenate directly, explicitly supplying the type, is probably a good enough fit for many use cases.
Other than that, I think it could be useful to extend this functionality (Common Lisp only has a closed set of sequence types), but for that, generic functions seem too obviously convenient to ignore. It is not trivial to provide a solution that is extensible, generic, and performant.
EDIT: To summarize: no, you can't get that behaviour with clever application of one or two “built-ins”, but you can certainly write an extensible and generic solution using generic functions.
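A minimal sketch of what such a generic-function version might look like, assuming append-like semantics rather than Clojure's per-collection behaviour. The name concat-into is taken from the question and is not a standard function; further methods could be added for user-defined collection classes.

```lisp
;; CONCAT-INTO is a hypothetical name; it is not part of the standard.
(defgeneric concat-into (target source)
  (:documentation
   "Return a fresh sequence holding TARGET's elements followed by SOURCE's."))

(defmethod concat-into ((target list) (source sequence))
  ;; NIL is also a LIST, so the empty-list case is covered here.
  (concatenate 'list target source))

(defmethod concat-into ((target vector) (source sequence))
  ;; (VECTOR element-type) leaves the length open, unlike TYPE-OF's result.
  (concatenate (list 'vector (array-element-type target)) target source))

(print (concat-into (list 1 2) (vector 3 4))) ; prints (1 2 3 4)
(print (concat-into "ab" (list #\c)))         ; prints "abc"
```

Note that a specialized target such as a string only accepts elements of its own element type, so the source has to contain characters in that case.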

OK, the only thing I've come up with (besides generic methods) is this dead simple function:
(defun into (target source)
  (let ((target-type (etypecase target
                       ;; '(*) means one dimension of unspecified length
                       (vector (list 'array (array-element-type target) '(*)))
                       (list 'list))))
    (concatenate target-type target source)))
CL-USER> (into (list 1 2 4) "asd")
;;=> (1 2 4 #\a #\s #\d)
CL-USER> (into #*0010 (list 1 1 0 0))
;;=> #*00101100
CL-USER> (into "asdasd" (list #\a #\b))
;;=> "asdasdab"
Also, a simple empty implementation:
(defun empty (target)
  (etypecase target
    (vector (make-array 0
                        :element-type (array-element-type target)
                        :adjustable t :fill-pointer 0))
    (list)))
The result indeed (as @Svante noted) doesn't have the exact type of target, but rather is "a collection whose element type is the same as that of target". It doesn't conform to Clojure's protocol (where elements are prepended to a list target).
I can't see where it's flawed (if it is), so it would be nice to hear about that. Anyway, as it was only for the sake of education, that will do.

Related

In Clojure, how can I add support for common functions like empty? and count to my new type?

As I understand, Clojure makes it "easy" to solve the "expression problem".
But I can't find details how to do this. How can I create a new type (like defrecord) that handles things like empty? and count ?
The two examples empty? and count functions are part of Clojure's core and their implementations are driven by performance considerations, so they may not be the best examples for the solution of the expression problem. Anyway:
You can make empty? work by making seq work on your type, for example by implementing the Seqable interface.
You can make count work by implementing the Counted interface.
Example code:
(deftype Tuple [a b]
  clojure.lang.Counted
  (count [_] 2)
  clojure.lang.Seqable
  (seq [_] (list a b)))
(count (->Tuple 1 2)) ;=> 2
(empty? (->Tuple 1 2)) ;=> false
A more general solution for a new function would be either:
Creating a multimethod for your function. Now you need to write custom methods (via defmethod) for the supported types.
Creating a protocol that contains your function and making the types satisfy the protocol via extend-protocol or extend-type.
In either case you have the ability to create a default implementation and new implementations for new or existing types any time. Even during runtime!

In clojure, why the type of an empty list is different from that of non-empty lists?

I want to judge if two values are of same type, but I found that the type of an empty list is clojure.lang.PersistentList$EmptyList rather than clojure.lang.PersistentList.
user=> (def la '())
#'user/la
user=> (def lb '(1 2))
#'user/lb
user=> (def t (map type [la lb]))
#'user/t
user=> t
(clojure.lang.PersistentList$EmptyList clojure.lang.PersistentList)
user=> (apply = t)
false
user=>
So, I'm wondering why is the type of an empty list different from that of non-empty lists and what's the correct way to tell if two things are of same type?
Don't rely on the concrete types of Clojure data structures. They are undocumented implementation details, and you have no guarantee that they won't change in future versions of Clojure.
It is much safer to rely on the abstractions (e.g. as defined by the IPersistentList or ISeq interfaces). These are much less likely to change in ways that might break your code. My understanding is that Rich Hickey is very big on backwards compatibility when it comes to abstractions; if you depend on a concrete implementation, I believe he would say it's your own fault if things break.
But even better, you should use functions in clojure.core such as seq? or list?, depending on exactly what it is you want to detect. Not only are these likely to maintain backwards compatibility for a long time, they also have a chance of working correctly on non-JVM versions of Clojure (e.g. ClojureScript).

Why Clojure idiom prefer to return nil instead of empty list like Scheme?

From a comment on another question, someone is saying that Clojure idiom prefers to return nil rather than an empty list like in Scheme. Why is that?
Like,
(when (seq lat) ...)
instead of
(if (empty? lat)
  '() ...)
I can think of a few reasons:
Logical distinction. In Clojure nil means nothing / absence of value, whereas '() (the empty list) is a value; it just happens to be a value that is an empty list. It's quite often conceptually and logically useful to distinguish between the two.
Fit with JVM - the JVM object model supports null references. And quite a lot of Java APIs return null to mean "nothing" or "value not found". So to ensure easy JVM interoperability, it makes sense for Clojure to use nil in a similar way.
Laziness - the logic here is quite complicated, but my understanding is that using nil for "no list" works better with Clojure's lazy sequences. As Clojure is a lazy functional programming language by default, it makes sense for this usage to be standard. See http://clojure.org/lazy for some extra explanation.
"Falsiness" - It's convenient to use nil to mean "nothing" and also to mean "false" when writing conditional code that examines collections - so you can write code like (if (some-map :some-key) ....) to test if a hashmap contains a value for a given key.
Performance - It's more efficient to test for nil than to examine a list to see if it is empty, so adopting this idiom as standard can lead to higher-performance idiomatic code.
Note that there are still some functions in Clojure that do return an empty list. An example is rest:
(rest [1])
=> ()
This question on rest vs. next goes into some detail about why this is.
Also note that the union of collection types and nil forms a monoid, with concatenation as the monoid plus and nil as the monoid zero. So nil keeps the empty list semantics under concatenation while also representing a false or "missing" value.
Python is another language where common monoid identities represent false values: 0, empty list, empty tuple.
From The Joy of Clojure
Because empty collections act like true in Boolean contexts, you need an idiom for testing whether there's anything in a collection to process. Thankfully, Clojure provides such a technique:
(seq [1 2 3])
;=> (1 2 3)
(seq [])
;=> nil
In other Lisps, like Common Lisp, the empty list is used to mean nil. This is known as nil punning and is only viable when the empty list is falsey. Returning nil here is clojure's way of reintroducing nil punning.
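For comparison, nil punning on the Common Lisp side can be sketched like this. count-items is a hypothetical helper written only for this illustration.

```lisp
;; In Common Lisp the empty list and NIL are the same object, and it is false.
(print (eq '() nil))            ; prints T

;; COUNT-ITEMS is a hypothetical helper: the list itself serves as the test.
(defun count-items (items)
  (if items                     ; an empty list is false, so recursion stops
      (+ 1 (count-items (rest items)))
      0))

(print (count-items '(a b c)))  ; prints 3
(print (count-items '()))       ; prints 0
```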
Since I wrote the comment, I will write an answer. (The answer by skuro provides all the information, but maybe a bit too much.)
First of all, I think the more important things should come first.
seq is just what everybody uses most of the time, but empty? is fine too; it's just (not (seq lat)).
In Clojure '() is true, so normally you want to return something that is false when the sequence is finished.
If only one branch of your if matters and the other returns false, '(), or something like that, why write that branch down at all? when has only one branch, which is especially good if you want side effects: you don't have to use do.
See this example:
(if false
  '()
  (do (println 1)
      (println 2)
      (println 3)))
you can write
(when true
  (println 1)
  (println 2)
  (println 3))
Not that different, but I think it's easier to read.
P.S.
Note that there are functions called if-not and when-not; they are often better than (if (not true) ...)

common lisp cons creates a list from two symbols, clojure cons requires a seq to cons onto?

(Disclaimer - I'm aware of the significance of Seqs in Clojure)
In Common Lisp the cons function can be used to combine two symbols into a list:
(defparameter s 'x)
(defparameter l 'y)
(cons s l)
In Clojure you can only cons onto a sequence; cons hasn't been extended to work with two symbols. So you have to write:
(def s 'x)
(def l 'y)
(cons s (list l))
Is there a higher level pattern in Clojure that explains this difference between Common LISP and Clojure?
In Clojure, unlike traditional Lisps, lists are not the primary data structure. Data structures can implement the ISeq interface, which provides another view of the underlying data structure and allows the same functions to access elements in each. (Lists already implement this. seq? checks whether something implements ISeq: compare (seq? '(1 2)) and (seq? [1 2]).) Clojure simply acts differently (with good reason): when cons is used, a sequence (actually of type clojure.lang.Cons) constructed of a and (seq b) is returned, a being the first argument and b the second. Obviously, symbols don't and can't implement ISeq.
Clojure.org/sequences
Sequences screencast/talk by Rich Hickey. However, note that rest has changed, its previous behaviour is now in next, and lazy-cons has been replaced by lazy-seq and cons.
clojure.lang.RT
In Common Lisp CONS creates a so-called CONS cell, which is similar to a record with two slots: the 'car' and the 'cdr'.
You can put ANYTHING into those two slots of a cons cell.
Cons cells are used to build lists. But one can create all kinds of data structures with cons cells: trees, graphs, various types of specialized lists, ...
The implementations of Lisp are highly optimized to provide very efficient cons cells.
A Lisp list is just a common way of using cons cells (see Rainer's description). Clojure is best seen as not having cons cells (although something similar might hide under the hood). The Clojure cons is a misnomer, it should actually just be named prepend.
In Clojure the use of a two-element vector is preferred: [:a :b]. Under the hood such small vectors are implemented as Java arrays and are extremely simple and fast.
A shorthand for (cons :a '(:b)) (or (cons :a (cons :b nil))) is list: (list :a :b).
When you say
> (cons 'a 'b)
in Common Lisp you don't get a list but a dotted pair: (a . b), whereas the result of
> (cons 'a (cons 'b nil))
is (a . (b . nil)), which is the proper list (a b).
In the first case the cdr is not a list, since it is b and not nil, making the result an improper list. Proper lists must be terminated by nil. Therefore higher-order functions like mapcar and friends won't work on it, but we save a cons cell. I guess the designers of Clojure removed this feature because of the confusion it could cause.
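The difference between the dotted pair and the proper list is easy to check at the REPL. A short sketch (the variable names are made up for the example, and the printed symbols assume the default upcasing reader):

```lisp
(defparameter *pair*   (cons 'a 'b))            ; one cons cell
(defparameter *proper* (cons 'a (cons 'b nil))) ; two cons cells, ends in NIL

(print *pair*)                  ; prints (A . B)
(print *proper*)                ; prints (A B)
(print (listp (cdr *pair*)))    ; prints NIL, because the cdr is the symbol B
(print (listp (cdr *proper*)))  ; prints T

;; MAPCAR expects proper lists, so it is only applied to *PROPER* here.
(print (mapcar #'symbol-name *proper*)) ; prints ("A" "B")
```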

LISP very simple list question

I'm learning Lisp and I'm pretty new at this, so I was wondering...
If I do this:
(defparameter *list-1* (list 1 2))
(defparameter *list-2* (list 2 3))
(defparameter *list-3* (append *list-1* *list-2*))
And then
(setf (first *list-2*) 1)
*list-3*
I will get (1 2 1 3)
I know this is because append is going to "save resources": it creates a new list for the first chunk but just points to the second chunk, because if I do:
(setf (first *list-1*) 0)
*list-3*
I will get (1 2 1 3) instead of the more logical (0 2 1 3)
So my question is, what other cases are like this in lisp, how do you black belt lispers know how to deal with this stuff that is not intuitive or consistent?
One defensive tactic is to avoid sharing structure.
(defparameter *list-3* (append *list-1* *list-2* '()))
or
(defparameter *list-3* (append *list-1* (copy-list *list-2*)))
Now the structure of the new *list-3* is all new, and modifications to *list-3* won't affect *list-2* and vice versa.
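A small sketch that makes the sharing visible, using fresh variables (the names are made up for this example) so it can be run independently of the question's code:

```lisp
(defparameter *head* (list 1 2))
(defparameter *tail* (list 2 3))

;; APPEND copies *HEAD* but reuses *TAIL* as the end of the result.
(defparameter *shared* (append *head* *tail*))
;; Copying the last argument first gives a result with no shared structure.
(defparameter *fresh* (append *head* (copy-list *tail*)))

(print (eq (cddr *shared*) *tail*)) ; prints T, same cons cells
(print (eq (cddr *fresh*) *tail*))  ; prints NIL, a separate copy

(setf (first *tail*) 1)
(print *shared*)                    ; prints (1 2 1 3)
(print *fresh*)                     ; prints (1 2 2 3)
```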
The append function has to make a copy of its first argument, to avoid modifying existing data structures. As a result, you now have two list segments that look like (1 2 ...), but they're part of different lists.
In general, any list can be the tail of any other list, but you can't have a single list object that serves as the head of multiple lists.
You have to think of lists in terms of cons cells. When you define list 1 and list 2, it is like:
(defparameter *list-1* (cons 1 (cons 2 nil)))
(defparameter *list-2* (cons 2 (cons 3 nil)))
Then, when you append:
(defparameter *list-3* (cons 1 (cons 2 *list-2*)))
Basically, a cons cell consists of two parts; a value (the car), and a pointer (the cdr). Append is defined to not change the first list, so that is copied, but then the last cdr (normally nil) is changed to point at the second list, not a copy of the second list. If you were willing to destroy the first list, you would use nconc.
Try this:
(defparameter *list-3* (nconc *list-1* *list-2*))
Then observe the value of *list-1*, it is (1 2 2 3), just like *list-3*.
The general rule is that the non-destructive functions (append) won't destroy existing data, while the destructive functions (nconc) will. What a future destructive function does ((setf cdr)), though, is not the responsibility of the first non-destructive function.
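The contrast between the two can be sketched with fresh lists (the variable names are invented for the example):

```lisp
(defparameter *x* (list 1 2))
(defparameter *y* (list 3 4))

;; APPEND leaves its arguments untouched.
(defparameter *appended* (append *x* *y*))
(print *x*)                 ; prints (1 2)

;; NCONC rewrites the last cdr of *X* to point at *Y*.
(defparameter *joined* (nconc *x* *y*))
(print *x*)                 ; prints (1 2 3 4)
(print (eq *joined* *x*))   ; prints T, the first list itself was extended
```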
quote:
So my question is, what other cases are like this in lisp, how do you black belt lispers know how to deal with this stuff that is not intuitive or consistent?
I think that you are a bit harsh here with a subject that is quite a bit larger than you imagine. Lists are a rather elaborate concept in Lisp, and you need to understand that this is not some simple array. The standard provides a lot of functions to manipulate lists in every way. What you want in this case is:
(concatenate 'list *list-1* *list-2*)
So, why is there also append? Well, if you can omit copying the last list, and all symbols involved still return the correct data, this can be a significant performance boost in both calculating time and memory footprint. append is especially useful in a functional programming style which doesn't use side effects.
In lieu of further explanation about cons cells, destructive vs. nondestructive functions etc., I'll point you to a nice introduction: Practical Common Lisp, Ch. 12, and for a complete reference, the Common Lisp Hyperspec, look at Chapters 14 and 17.
So my question is, what other cases are like this in lisp, how do you black belt lispers know how to deal with this stuff that is not intuitive or consistent?
By reading the fine manual? The HyperSpec explicitly states:
[...] the list structure of each of lists except the last is copied. The last argument is not copied; it becomes the cdr of the final dotted pair of the concatenation of the preceding lists, or is returned directly if there are no preceding non-empty lists.
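The edge cases in that passage can be checked directly. A short sketch, with *tail* as a made-up variable name:

```lisp
(defparameter *tail* (list 1 2))

;; No preceding non-empty lists: the last argument is returned directly.
(print (eq (append '() *tail*) *tail*)) ; prints T
(print (eq (append *tail*) *tail*))     ; prints T

;; The last argument becomes the final cdr, so it need not even be a list.
(print (append '(1) 2))                 ; prints (1 . 2)
```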
Um, primarily we learn how it works, so that what we imagined to be inconsistent makes sense.
What you need to do is find some of the old-fashioned block and pointer diagrams, which I can't easily draw, but let's figure it out.
After the first defparameter, you've got list-1, which is
(1 . (2 . nil))
in dot notation; list-2 is
(2 . (3 . nil))