I'm learning Clojure and trying to understand reader, quoting, eval and homoiconicity by drawing parallels to Python's similar features.
In Python, one way to avoid (or postpone) evaluation is to wrap the expression between quotes, eg. '3 + 4'. You can evaluate this later using eval, eg. eval('3 + 4') yielding 7. (If you need to quote only Python values, you can use repr function instead of adding quotes manually.)
In Lisp you use quote or ' for quoting and eval for evaluating, eg. (eval '(+ 3 4)) yielding 7.
So in Python the "quoted" stuff is represented by a string, whereas in Lisp it's represented by a list which has quote as its first item.
My question, finally: why does Clojure allow (eval 3) although 3 is not quoted? Is it just the matter of Lisp style (trying to give an answer instead of error wherever possible) or are there some other reasons behind it? Is this behavior essential to Lisp or not?
The short answer would be that numbers (and strings, and keywords, for example) evaluate to themselves. Quoting instructs Lisp to pass along whatever follows the quote unevaluated (the reader turns '(+ 3 4) into (quote (+ 3 4)), and quote returns its argument as is). eval then gets that list as you wrote it, but without the quote, and then evaluates it (in the case of (eval '(+ 3 4)), eval will evaluate a function call (+) over two arguments).
What happens with that last expression is the following:
When you hit enter, the expression is evaluated. It contains a normal function call (eval) and some arguments.
The arguments are evaluated. The first argument contains a quote, which tells Lisp to produce what is after the quote (the actual (+ 3 4) list) without evaluating it.
There are no more arguments, and the actual function call is evaluated. This means calling the eval function with the list (+ 3 4) as argument.
The eval function does the same steps again, finding the normal function + and the arguments, and applies it, obtaining the result.
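The steps above can be checked at a Clojure REPL. This is only a small sketch, and the var names (self-evaluating, form, sum) are made up for illustration.

```clojure
;; Numbers, strings and keywords evaluate to themselves,
;; so eval hands them back unchanged.
(def self-evaluating [(eval 3) (eval "three") (eval :three)])
;; => [3 "three" :three]

;; quote suppresses evaluation: form holds a list, not the number 7.
(def form '(+ 3 4))
(list? form)   ;; => true
(first form)   ;; => the symbol +

;; eval receives that list and evaluates it as a function call.
(def sum (eval form))   ;; => 7
```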
Other answers have explained the mechanics, but I think the philosophical point is in the different ways lisp and python look at "code". In python, the only way to represent code is as a string, so of course attempting to evaluate a non-string will fail. Lisp has richer data structures for code: lists, numbers, symbols, and so forth. So the expression (+ 1 2) is a list, containing a symbol and two numbers. When evaluating a list, you must first evaluate each of its elements.
So, it's perfectly natural to need to evaluate a number in the ordinary course of running lisp code. To that end, numbers are defined to "evaluate to themselves", meaning they are the same after evaluation as they were before: just a number. The eval function applies the same rules to the bare "code snippet" 3 that the compiler would apply when compiling, say, the third element of a larger expression like (+ 5 3). For numbers, that means leaving it alone.
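To make that concrete, here is a short Clojure sketch that takes the expression (+ 5 3) apart as data and evaluates one element on its own. The names expr, element-types, third-element and total are only illustrative.

```clojure
;; The expression is an ordinary list holding a symbol and two numbers.
(def expr '(+ 5 3))
(def element-types (map class expr))
;; => (clojure.lang.Symbol java.lang.Long java.lang.Long)

;; Evaluating the third element on its own leaves it alone ...
(def third-element (eval (nth expr 2)))   ;; => 3

;; ... and evaluating the whole list applies + to the evaluated elements.
(def total (eval expr))                   ;; => 8
```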
What should 3 evaluate to? It makes the most sense that Lisp evaluates a number to itself. Would we want to require numbers to be quoted in code? That would not be very convenient and extremely problematic:
Instead of
(defun add-fourtytwo (n)
  (+ n 42))
we would have to write
(defun add-fourtytwo (n)
  (+ n '42))
Every number in code would need to be quoted. A missing quote would trigger an error. That's not something one would want to use.
As a side note, imagine what happens when you want to use eval in your code.
(defun example ()
  (eval 3))
Above would be wrong. Numbers would need to be quoted.
(defun example ()
  (eval '3))
The above would read fine, but would generate an error at runtime. Lisp evaluates '3 to the number 3. But then calling eval on the number would be an error, since numbers would need to be quoted.
So we would need to write:
(defun example ()
  (eval ''3))
That's not very useful...
Numbers have always been self-evaluating in Lisp history. But in earlier Lisp implementations some other data objects, like arrays, were not self-evaluating. Again, since this is a huge source of errors, Lisp dialects like Common Lisp have defined that all data types (other than lists and symbols) are self-evaluating.
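For comparison, here is how the three spellings behave in a Lisp where numbers are self-evaluating. The sketch uses Clojure rather than Common Lisp, and the var names are made up.

```clojure
;; All three calls yield 3: the extra quotes are harmless but unnecessary.
(def results [(eval 3) (eval '3) (eval ''3)])
;; => [3 3 3]

;; ''3 reads as (quote (quote 3)); evaluating it once yields the list (quote 3),
;; which eval then evaluates to 3.
(def quoted-once ''3)
(= quoted-once '(quote 3))   ;; => true
```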
To answer this question we need to look at the definition of eval in Lisp. For example, the CLHS gives this definition:
Syntax: eval form => result*
Arguments and Values:
form - a form.
results - the values yielded by the evaluation of form.
Where form is defined as:
1. any object meant to be evaluated.
2. a symbol, a compound form, or a self-evaluating object.
3. (for an operator, as in "<<operator>> form") a compound form having that operator as its first element. "A quote form is a constant form."
In your case the number 3 is a self-evaluating object: a form that is neither a symbol nor a cons is defined to be a self-evaluating object. I believe that for Clojure we can just replace cons by list in this definition.
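In Clojure only lists are interpreted by eval as calls (to functions, macros or special forms). Symbols are resolved to their values, vectors and maps evaluate to collections of their evaluated elements, and other objects such as numbers, strings and keywords evaluate to themselves.
'(+ 3 4) is equal to (list '+ 3 4). ' (transformed by the reader into the quote special form) just prevents evaluation of the given form. So in the expression (eval '(+ 3 4)), eval takes the list data structure (+ 3 4) as its argument.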
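A few REPL checks illustrate these points in Clojure. This is a sketch with invented var names, not anything from the CLHS.

```clojure
;; The quoted literal and the explicitly built list are the same data.
(def same-list? (= '(+ 3 4) (list '+ 3 4)))   ;; => true

;; A list is evaluated as a call.
(def called (eval (list '+ 3 4)))             ;; => 7

;; A vector is not a call: eval evaluates its elements and builds a new vector.
(def from-vector (eval '[(+ 1 2) :k "s"]))    ;; => [3 :k "s"]

;; Numbers, strings and keywords come back unchanged.
(def literals [(eval 3) (eval "s") (eval :k)])   ;; => [3 "s" :k]
```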
Macros do not evaluate their arguments until explicitly told to do so, however functions do. In the following code:
;; blah is defined first so that the body of foo can resolve it
(defn blah [xs] ;; xs is unquoted list, yet not evaluated
  (println xs)
  xs)
(defmacro foo [xs]
  (println xs (type xs)) ;; unquoted list
  (blah xs))
(foo (+ 1 2 3))
It seems that blah does not evaluate xs, since we still have the entire list: (+ 1 2 3) bound to xs in the body of blah.
I have basically just memorized this interaction between helper functions within macros and their evaluation of arguments, but to be honest it goes against what my instincts are (that xs would be evaluated before entering the body since function arguments are always evaluated).
My thinking was basically: "ok, in this macro body I have xs as the unevaluated list, but if I call a function with xs from within the macro it should evaluate that list".
Clearly I have an embarrassingly fundamental misunderstanding here of how things work. What am I missing in my interpretation? How does the evaluation actually occur?
EDIT
I've thought on this a bit more and it seems to me that maybe viewing macro arguments as "implicitly quoted" would solve some confusion on my part.
I think I just got mixed up in the various terminologies, but given that quoted forms are synonymous with unevaluated forms, and given macro arguments are unevaluated, they are implicitly quoted.
So in my above examples, saying xs is unquoted is somewhat misleading. For example, this macro:
(defmacro bluh [xs]
  `(+ 1 2 ~xs))
Is basically the same as the below macro (excluding namespacing on the symbols). Resolving xs in the call to list gives back an unevaluated (quoted?) list.
(defmacro bleh [xs]
  (list '+ '1 '2 xs)) ;; xs resolves to a quoted list (or effectively quoted)
Calling bleh (or bluh) is the same as saying:
(list '+ '1 '2 '(+ 1 2 3))
;; => (+ 1 2 (+ 1 2 3))
If xs did not resolve to a quoted list, then we would end up with:
(list '+ '1 '2 (+ 1 2 3))
;; => (+ 1 2 6)
So, in short, macro arguments are quoted.
I think part of my confusion came from thinking about syntax-quoted forms as templates with slots filled in, e.g. (+ 1 2 ~xs) I would mentally expand to (+ 1 2 (+ 1 2 3)), and seeing that (+ 1 2 3) was not quoted in that expansion, I found it confusing that function calls using xs (blah in the first example above) would not evaluate immediately to 6.
The template metaphor is helpful, but if I instead look at it as a shortcut for (list '+ '1 '2 xs) it becomes obvious that xs must be a quoted list, otherwise the expansion would include 6 and not the entire list.
I'm not sure why I found this so confusing... have I got this right or did I just go down the wrong path entirely?
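One way to check this reading is to ask Clojure for the expansions directly with macroexpand-1. The sketch below reuses bluh and bleh from above; the names bluh-expansion, bleh-expansion and bluh-value are just for illustration.

```clojure
(defmacro bluh [xs]
  `(+ 1 2 ~xs))

(defmacro bleh [xs]
  (list '+ '1 '2 xs))

;; In both cases the argument arrives as the unevaluated list (+ 1 2 3)
;; and is placed into the expansion as is.
(def bluh-expansion (macroexpand-1 '(bluh (+ 1 2 3))))
;; => (clojure.core/+ 1 2 (+ 1 2 3))
(def bleh-expansion (macroexpand-1 '(bleh (+ 1 2 3))))
;; => (+ 1 2 (+ 1 2 3))

;; Only when the expansion itself is evaluated does (+ 1 2 3) become 6.
(def bluh-value (bluh (+ 1 2 3)))   ;; => 9
```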
[This answer is an attempt to explain why macros and functions which don't evaluate their arguments are different things. I believe this applies to macros in Clojure but I am not an expert on Clojure. It's also much too long, sorry.]
I think you are confused between what Lisp calls macros and a construct which modern Lisps don't have but which used to be called FEXPRs.
There are two interesting, different, things you might want:
functions which, when called, do not immediately evaluate their arguments;
syntax transformers, which are called macros in Lisp.
I'll deal with them in order.
Functions which do not immediately evaluate their arguments
In a conventional Lisp, a form like (f x y ...), where f is a function, will:
establish that f is a function and not some special thing;
get the function corresponding to f and evaluate x, y, & the rest of the arguments in some order specified by the language (which may be 'in an unspecified order');
call f with the results of evaluating the arguments.
Step (1) is needed initially because f might be a special thing (like, say if, or quote), and it might be that the function definition is retrieved in (1) as well: all of this, as well as the order that things happen in in (2) is something the language needs to define (or, in the case of Scheme say, leave explicitly undefined).
This ordering, and in particular the ordering of (2) & (3) is known as applicative order or eager evaluation (I'll call it applicative order below).
But there are other possibilities. One such is that the arguments are not evaluated: the function is called, and only when the values of the arguments are needed are they evaluated. There are two approaches to doing this.
The first approach is to define the language so that all functions work this way. This is called lazy evaluation or normal order evaluation (I'll call it normal order below). In a normal order language function arguments are evaluated, by magic, at the point they are needed. If they are never needed then they may never be evaluated at all. So in such a language (I am inventing the syntax for function definition here so as not to commit to CL or Clojure or anything else):
(def foo (x y z)
  (if x y z))
Only one of y or z will be evaluated in a call to foo.
In a normal order language you don't need to explicitly care about when things get evaluated: the language makes sure that they are evaluated by the time they're needed.
Normal order languages seem like they'd be an obvious win, but they tend to be quite hard to work with, I think. There are two problems, one obvious and one less so:
side-effects happen in a less predictable order than they do in applicative order languages and may not happen at all, so people used to writing in an imperative style (which is most people) find them hard to deal with;
even side-effect-free code can behave differently than in an applicative order language.
The side-effect problem could be treated as a non-problem: we all know that code with side-effects is bad, right, so who cares about that? But even without side-effects things are different. For instance here's a definition of the Y combinator in a normal order language (this is kind of a very austere, normal order subset of Scheme):
(define Y
  ((λ (y)
     (λ (f)
       (f ((y y) f))))
   (λ (y)
     (λ (f)
       (f ((y y) f))))))
If you try to use this version of Y in an applicative order language -- like ordinary Scheme -- it will loop for ever. Here's the applicative order version of Y:
(define Y
  ((λ (y)
     (λ (f)
       (f (λ (x)
            (((y y) f) x)))))
   (λ (y)
     (λ (f)
       (f (λ (x)
            (((y y) f) x)))))))
You can see it's kind of the same, but there are extra λs in there which essentially 'lazify' the evaluation to stop it looping.
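Since Clojure is an applicative order language, the second version can be transliterated and tried out. This is just my own sketch of that transliteration; fact is a hypothetical test function that is not part of the discussion above.

```clojure
;; Applicative order Y: the inner (fn [x] ...) delays ((y y) f) until it is called.
(def Y
  ((fn [y]
     (fn [f]
       (f (fn [x]
            (((y y) f) x)))))
   (fn [y]
     (fn [f]
       (f (fn [x]
            (((y y) f) x)))))))

;; A factorial written without naming itself: self is supplied by Y.
(def fact
  (Y (fn [self]
       (fn [n]
         (if (zero? n)
           1
           (* n (self (dec n))))))))

(fact 5)   ;; => 120
```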
The second approach to normal order evaluation is to have a language which is mostly applicative order but in which there is some special mechanism for defining functions which don't evaluate their arguments. In this case there often would need to be some special mechanism for saying, in the body of the function, 'now I want the value of this argument'. Historically such things were called FEXPRs, and they existed in some very old Lisp implementations: Lisp 1.5 had them, and I think that both MACLISP & InterLisp had them as well.
In an applicative order language with FEXPRs, you need somehow to be able to say 'now I want to evaluate this thing', and I think this is the problem you are running up against: at what point does the thing decide to evaluate the arguments? Well, in a really old Lisp which is purely dynamically scoped there's a disgusting hack to do this: when defining a FEXPR you can just pass in the source of the argument and then, when you want its value, you just call EVAL on it. That's just a terrible implementation because it means that FEXPRs can never really be compiled properly, and you have to use dynamic scope so variables can never really be compiled away. But this is how some (all?) early implementations did it.
But this implementation of FEXPRs allows an amazing hack: if you have a FEXPR which has been given the source of its arguments, and you know that this is how FEXPRs work, then, well, it can manipulate that source before calling EVAL on it: it can call EVAL on something derived from the source instead. And, in fact, the 'source' it gets given doesn't even need to be strictly legal Lisp at all: it can be something which the FEXPR knows how to manipulate to make something that is. That means you can, all of a sudden, extend the syntax of the language in pretty general ways. But the cost of being able to do that is that you can't compile any of this: the syntax you construct has to be interpreted at runtime, and the transformation happens each time the FEXPR is called.
Syntax transformers: macros
So, rather than use FEXPRs, you can do something else: you could change the way that evaluation works so that, before anything else happens, there is a stage during which the code is walked over and possibly transformed into some other code (simpler code, perhaps). And this need happen only once: once the code has been transformed, then the resulting thing can be stashed somewhere, and the transformation doesn't need to happen again. So the process now looks like this:
code is read in and structure built from it;
this initial structure is possibly transformed into other structure;
(the resulting structure is possibly compiled);
the resulting structure, or the result of compiling it is evaluated, probably many times.
So now the process of evaluation is divided into several 'times', which don't overlap (or don't overlap for a particular definition):
read time is when the initial structure is built;
macroexpansion time is when it is transformed;
compile time (which may not happen) is when the resulting thing is compiled;
evaluation time is when it is evaluated.
Well, compilers for all languages probably do something like this: before actually turning your source code into something that the machine understands they will do all sorts of source-to-source transformations. But these things are in the guts of the compiler and are operating on some representation of the source which is idiosyncratic to that compiler and not defined by the language.
Lisp opens this process to users. The language has two features which make this possible:
the structure that is created from source code once it has been read is defined by the language and the language has a rich set of tools for manipulating this structure;
the structure created is rather 'low commitment' or austere -- it does not particularly predispose you to any interpretation in many cases.
As an example of the second point, consider (in "my.file"): that's a function call of a function called in, right? Well, maybe: (with-open-file (in "my.file") ...) almost certainly is not a function call, but binding in to a filehandle.
Because of these two features of the language (and in fact some others I won't go into) Lisp can do a wonderful thing: it can let users of the language write these syntax-transforming functions -- macros -- in portable Lisp.
The only thing that remains is to decide how these macros should be notated in source code. And the answer is the same way as functions are: when you define some macro m you use it just as (m ...) (some Lisps support more general things, such as CL's symbol macros). At macroexpansion time -- after the program is read but before it is (compiled and) run -- the system walks over the structure of the program looking for things which have macro definitions: when it finds them it calls the function corresponding to the macro with the source code specified by its arguments, and the macro returns some other chunk of source code, which gets walked in turn until there are no macros left (and yes, macros can expand to code involving other macros, and even to code involving themselves). Once this process is complete then the resulting code can be (compiled and) run.
So although macros look like function calls in the code, they are not just functions which don't evaluate their arguments, like FEXPRs were: instead they are functions which take a bit of Lisp source code and return another bit of Lisp source code: they're syntax transformers, or functions which operate on source code (syntax) and return other source code. Macros run at macroexpansion time which is properly before evaluation time (see above).
So, in fact macros are functions, written in Lisp, and the functions they call evaluate their arguments perfectly conventionally: everything is perfectly ordinary. But the arguments to macros are programs (or the syntax of programs represented as Lisp objects of some kind) and their results are (the syntax of) other programs. Macros are functions at the meta-level, if you like. So a macro is a function which computes (parts of) programs: those programs may later themselves be run (perhaps much later, perhaps never) at which point the evaluation rules will be applied to them. But at the point a macro is called what it's dealing with is just the syntax of programs, not evaluating parts of that syntax.
So, I think your mental model is that macros are something like FEXPRs in which case the 'how does the argument get evaluated' question is an obvious thing to ask. But they're not: they're functions which compute programs, and they run properly before the program they compute is run.
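The separation between macroexpansion time and evaluation time can be observed in Clojure with a macro that counts how often its body runs. This is a sketch only: expansions, counted-inc and add-one are hypothetical names, and I am not asserting how many times a given compiler expands a form, only that calling the compiled function does not expand it again.

```clojure
(def expansions (atom 0))

(defmacro counted-inc [x]
  (swap! expansions inc)   ; runs when the macro is expanded
  `(inc ~x))               ; the code that replaces the macro call

;; The macro call is expanded while add-one is being compiled.
(defn add-one [n]
  (counted-inc n))

(def after-compile @expansions)

;; Running the function evaluates the expansion, not the macro.
(def results [(add-one 1) (add-one 2) (add-one 3)])    ;; => [2 3 4]
(def unchanged? (= after-compile @expansions))          ;; => true
```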
Sorry this answer has been so long and rambling.
What happened to FEXPRs?
FEXPRs were always pretty problematic. For instance what should (apply f ...) do? Since f might be a FEXPR, but this can't generally be known until runtime it's quite hard to know what the right thing to do is.
So I think that two things happened:
in the cases where people really wanted normal order languages, they implemented those, and for those languages the evaluation rules dealt with the problems FEXPRs were trying to deal with;
in applicative order languages then if you want to not evaluate some argument you now do it by explicitly saying that using constructs such as delay to construct a 'promise' and force to force evaluation of a promise -- because the semantics of the languages improved it became possible to implement promises entirely in the language (CL does not have promises, but implementing them is essentially trivial).
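Clojure, for instance, has delay and force built in, so the explicit style looks roughly like the sketch below. The function choose and the atom evaluated are invented for the example.

```clojure
;; Records which branches were actually evaluated.
(def evaluated (atom []))

;; An ordinary function: its arguments are evaluated eagerly,
;; but here they are promises, which do no work until forced.
(defn choose [test then else]
  (if test (force then) (force else)))

(def result
  (choose true
          (delay (swap! evaluated conj :then) 1)
          (delay (swap! evaluated conj :else) 2)))
;; result     => 1
;; @evaluated => [:then]
```

Only the promise that was forced ran its body, which is the behaviour the foo example in the normal order section gets for free.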
Is the history I've described correct?
I don't know: I think it may be but it may also be a rational reconstruction. I certainly, in very old programs in very old Lisps, have seen FEXPRs being used the way I describe. I think Kent Pitman's paper, Special Forms in Lisp may have some of the history: I've read it in the past but had forgotten about it until just now.
A macro definition is a definition of a function that transforms code. The input for the macro function are the forms in the macro call. The return value of the macro function will be treated as code inserted where the macro form was. Clojure code is made of Clojure data structures (mostly lists, vectors, and maps).
In your foo macro, you define the macro function to return whatever blah did to your code. Since blah is (almost) the identity function, it just returns whatever was its input.
What is happening in your case is the following:
The string "(foo (+ 1 2 3))" is read, producing a nested list with two symbols and three integers: (foo (+ 1 2 3)).
The foo symbol is resolved to the macro foo.
The macro function foo is invoked with its argument xs bound to the list (+ 1 2 3).
The macro function (prints and then) calls the function blah with the list.
blah (prints and then) returns that list.
The macro function returns the list.
The macro is thus “expanded” to (+ 1 2 3).
The symbol + is resolved to the addition function.
The addition function is called with three arguments.
The addition function returns their sum.
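The same trace can be reproduced at the REPL with macroexpand-1. The sketch below uses stripped-down variants of foo and blah without the printing; expansion and foo-value are illustrative names.

```clojure
(defn blah [xs] xs)             ; ordinary function: receives the list as a value

(defmacro foo [xs] (blah xs))   ; macro body runs at expansion time

;; Expansion only: the macro function returns the list it was given.
(def expansion (macroexpand-1 '(foo (+ 1 2 3))))   ;; => (+ 1 2 3)

;; Full evaluation: the expansion (+ 1 2 3) is then evaluated as usual.
(def foo-value (foo (+ 1 2 3)))                     ;; => 6
```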
If you wanted the macro foo to expand to a call to blah, you need to return such a form. Clojure provides a templating convenience syntax using backquote, so that you do not have to use list etc. to build the code:
(defmacro foo [xs]
  `(blah ~xs))
which is like:
(defmacro foo [xs]
  (list 'blah xs))
In Clojure, we can use unquote-splicing ~@ to spread the list. For example
(macroexpand `(+ ~@'(1 2 3)))
expands to
(clojure.core/+ 1 2 3)
This is a useful feature in macros when rearranging the syntax. But is it possible to use unquote-splicing or a similar technique outside of a macro and without eval?
Here is the solution with eval
(eval `(+ ~@'(1 2 3))) ;-> 6
But I would rather do
(+ ~@'(1 2 3))
Which unfortunately throws an error
IllegalStateException Attempting to call unbound fn: #'clojure.core/unquote-splicing clojure.lang.Var$Unbound.throwArity (Var.java:43)
At first I thought apply would do it, and it is indeed the case with functions
(apply + '(1 2 3)) ; -> 6
However, this is not the case with macros or special forms. It's obvious with macros, as it's expanded before apply and must be first element in the form anyway. With special forms it's not so obvious though, but still makes sense as they aren't first class citizens as functions are. For example the following throws an error
(apply do ['(println "hello") '(println "world")]) ;-> error
Is the only way to "apply" a list to a special form at runtime to use unquote-splicing and eval?
Clojure has a simple model of how programs are loaded and executed. Slightly simplified, it goes something like this:
some source code is read from a text stream by the reader;
this is passed to the compiler one form at a time;
the compiler expands any macros it encounters;
for non-macros, the compiler applies various simple evaluation rules (special rules for special forms, literals evaluate to themselves, function calls are compiled as such etc.);
the compiled code is evaluated and possibly changes the compilation environment used by the following forms.
Syntax quote is a reader feature. It is replaced at read time by code that emits list structure:
;; note the ' at the start
user=> '`(+ ~@'(1 2 3))
(clojure.core/seq
 (clojure.core/concat (clojure.core/list (quote clojure.core/+)) (quote (1 2 3))))
It is only in the context of syntax-quoted blocks that the reader affords ~ and ~@ this special handling, and syntax-quoted blocks always produce forms that may call a handful of seq-building functions from clojure.core and are otherwise composed from quoted data.
This all happens as part of step 1 from the list above. So for syntax-quote to be useful as an apply-like mechanism, you'd need it to produce code in the right shape at that point in the process that would then look like the desired "apply result" in subsequent steps. As explained above, syntax-quote always produces code that creates list structure, and in particular it never returns unquoted expressions that look like unquoted dos or ifs etc., so that's impossible.
This isn't a problem, since the code transformations that are reasonable given the above execution model can be implemented using macros.
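As a sketch of what such a macro might look like for the do case, assuming the forms are available as a literal in the source (so at macroexpansion time rather than at runtime), one could write something like the following. The names do-forms, log and last-value are hypothetical.

```clojure
(def log (atom []))

;; Splices a literal collection of forms into a do block at expansion time.
(defmacro do-forms [forms]
  `(do ~@forms))

(def last-value
  (do-forms [(swap! log conj :hello)
             (swap! log conj :world)
             :done]))
;; last-value => :done
;; @log       => [:hello :world]

;; The expansion is an ordinary do form.
(def expansion (macroexpand-1 '(do-forms [(f) (g)])))   ;; => (do (f) (g))
```

If the list of forms only exists as a runtime value, there is no compile-time structure for a macro to work on, and that is the case where eval remains necessary.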
Incidentally, the macroexpand call is actually superfluous in your example, as the syntax-quoted form already is the same as its macroexpansion (as it should be, since + is not a macro):
user=> `(+ ~@'(1 2 3))
(clojure.core/+ 1 2 3)
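Put together as a small check (a sketch; spliced is just an illustrative name), the syntax-quoted form is plain list data, eval turns it into a call, and apply reaches the same result for an ordinary function without building code at all:

```clojure
;; Syntax-quote builds a list; nothing is called yet.
(def spliced `(+ ~@'(1 2 3)))
(= spliced '(clojure.core/+ 1 2 3))   ;; => true

;; Evaluating that list performs the call.
(eval spliced)                        ;; => 6

;; For functions, apply does the spreading at runtime.
(apply + '(1 2 3))                    ;; => 6
```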
Some literature says "the first subform of the following form..." or "to evaluate a form..." while other literature says "To evaluate an expression...", and most literature seems to use both terms. Are the two terms interchangeable? Is there a difference in meaning?
Summary
A form is Lisp code as data. An expression is data as text.
See the Glossary entries in the Common Lisp standard:
form
expression
Explanation
In Common Lisp form and expression have two different meanings and it is useful to understand the difference.
A form is an actual data object inside a running Lisp system. The form is valid input for the Lisp evaluator.
EVAL takes a form as an argument.
The syntax is:
eval form => result*
EVAL does not get textual input in the form of Lisp expressions. It gets forms, which are Lisp data: numbers, strings, symbols, programs as lists, ...
CL-USER 103 > (list '+ 1 2)
(+ 1 2)
Above constructs a Lisp form: here a list with the symbol + as the first element and the numbers 1 and 2 as the next elements. + names a function and the two numbers are the arguments. So it is a valid function call.
CL-USER 104 > (eval (list '+ 1 2))
3
Above gives the form (+ 1 2) as data objects to EVAL and computes a result. We can not see forms directly - we can let the Lisp system create printed representations for us.
The form is really a Lisp expression as a data object.
This is slightly unusual, since most programming languages are defined by describing textual input. Common Lisp describes data input to EVAL. Forms as data structures.
The following creates a Lisp form when evaluated:
"foo" ; strings evaluate to themselves
'foo ; that evaluates to a symbol, which then denotes a variable
123
(list '+ 1 2) ; evaluates to a list, which describes a function call
'(+ 1 2) ; evaluates to a list, which describes a function call
Example use:
CL-USER 105 > (defparameter foo 42)
FOO
CL-USER 106 > (eval 'foo)
42
The following do not create valid forms:
'(1 + 2) ; Lisp expects prefix form
(list 1 '+ 2) ; Lisp expects prefix form
'(defun foo 1 2) ; Lisp expects a parameter list as third element
Example:
CL-USER 107 > (eval '(1 + 2))
Error: Illegal argument in functor position: 1 in (1 + 2).
The term expression is then usually used for the textual version of a Lisp data object, which is not necessarily code. Expressions are read by the Lisp reader and created by the Lisp printer.
If you see Lisp data on your screen or a piece of paper, then it is an expression.
(1 + 2) ; is a valid expression in a text, `READ` can read it.
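The same split between text and data can be sketched in Clojure, where read-string plays the role of READ and pr-str that of the printer. This is an analogy under that assumption, not Common Lisp code, and the var names are invented.

```clojure
;; Text: an expression as it appears on screen.
(def text "(+ 1 2)")

;; Data: the reader turns the text into a list, which is a valid form.
(def form (read-string text))
(def value (eval form))        ;; => 3

;; The printer turns the data back into text.
(def printed (pr-str form))    ;; => "(+ 1 2)"

;; (1 + 2) can be read, so it is a fine expression,
;; but the resulting list is not a valid form: evaluating it fails.
(def infix (read-string "(1 + 2)"))
(def outcome
  (try
    (eval infix)
    (catch Exception _ :not-a-valid-form)))
;; => :not-a-valid-form
```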
The definitions and uses of these terms vary by Lisp dialect and community, so there is no clear answer to your question for Lisps in general.
For their use in Common Lisp, see Rainer's detailed answer. To give a short summary:
The HyperSpec entry for form:
form n. 1. any object meant to be evaluated. 2. a symbol, a compound
form, or a self-evaluating object. 3. (for an operator, as in
"<<operator>> form") a compound form having that operator as its
first element. "A quote form is a constant form."
The HyperSpec entry for expression:
expression n. 1. an object, often used to emphasize the use of the
object to encode or represent information in a specialized format,
such as program text. "The second expression in a let form is a list
of bindings." 2. the textual notation used to notate an object in a
source file. "The expression 'sample is equivalent to (quote
sample)."
So, according to the HyperSpec, expression is used for the (textual) representation, while form is used for Lisp objects to be evaluated. But, as I said above, this is only the definition of those terms in the context of the HyperSpec (and thus Common Lisp).
In Scheme, however, the R5RS doesn't mention form at all, and talks about expressions only. The R6RS even gives a definition that almost sounds like the exact opposite of the above:
At the purely syntactical level, both are forms, and form is the
general name for a syntactic part of a Scheme program.
(Talking about the difference between (define …) and (* …).)
This is by no means a scientific or standards-based answer, but the distinction that I have built up in my own head based on things I've heard is more along the lines of: an expression is a form which will be (or can be) evaluated in the final program.
So for instance, consider the form (lambda (x) (+ x 1)). It is a list of three elements: the symbol lambda, the list (x), and the list (+ x 1). All of those elements are forms, but only the last is an expression, because it is "intended" for evaluation; the first two forms are shuffled around by the macroexpander but never evaluated. The outermost form (lambda (x) (+ x 1)) is itself an expression as well.
This seems to me to be an interesting distinction, but it does mean it is context-sensitive: (x) is always a form, and may or may not be an expression depending on context.
I'm new to Lisp. I encountered 2 terms "list" and "S-expression". I just can't distinguish between them. Are they just synonyms in Lisp?
First, not all S-expressions represent lists; an expression such as foobar, representing a bare atom, is also considered an S-expression. As is the "cons cell" syntax, (car . cdr), used when the "cdr" part is not itself another list (or nil). The more familiar list expression, such as (a b c d), is just syntactic sugar for a chain of nested cons cells; that example expands to (a . (b . (c . (d . nil)))).
Second, the term "S-expression" refers to the syntax - (items like this (possibly nested)). Such an S-expression is the representation in Lisp source code of a list, but it's not technically a list itself. This distinction is the same as that between a sequence of decimal digits and their numeric value, or between a sequence of characters within quotation marks and the resulting string.
That is perhaps an overly technical distinction; programmers routinely refer to literal representations of values as though they were the values themselves. But with Lisp and lists, things get a little trickier because everything in a Lisp program is technically a list.
For example, consider this expression:
(+ 1 2)
The above is a straightforward S-expression which represents a flat list, consisting of the atoms +, 1, and 2.
However, within a Lisp program, such a list will be interpreted as a call to the + function with 1 and 2 as arguments. (Do note that it is the list, not the S-expression, that is so interpreted; the evaluator is handed lists that have been pre-parsed by the reader, not source code text.)
So while the above S-expression represents a list, it would only rarely be referred to as a "list" in the context of a Lisp program. Unless discussing macros, or the inner workings of the reader, or engaged in a metasyntactic discussion because of some other code-generation or parsing context, a typical Lisp programmer would instead treat the above as a numeric expression.
On the other hand, any of the following S-expressions likely would be referred to as "lists", because evaluating them as Lisp code would produce the list represented by the above literal S-expression as a runtime value:
'(+ 1 2)
(quote (+ 1 2))
(list '+ 1 2)
Of course, the equivalence of code and data is one of the cool things about Lisp, so the distinction is fluid. But my point is that while all of the above are S-expressions and lists, only some would be referred to as "lists" in casual Lisp-speak.
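In Clojure, for example, the three spellings can be compared directly. This is a small sketch with made-up var names; note that Clojure has no dotted-pair syntax, so only the list case carries over.

```clojure
;; Three S-expressions that all evaluate to the same three-element list.
(def a '(+ 1 2))
(def b (quote (+ 1 2)))
(def c (list '+ 1 2))

(= a b c)            ;; => true
(count a)            ;; => 3

;; Handing any of those lists to the evaluator treats it as the call (+ 1 2).
(map eval [a b c])   ;; => (3 3 3)
```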
S-expressions are a notation for data.
Historically an s-expression (short for symbolic expression) is described as:
symbols like FOO and BAR
cons cells with s-expressions as their first and second elements: ( expression-1 . expression-2 )
the list termination symbol NIL
and a convention to write lists: ( A . ( B . NIL ) ) is more simply written as the list (A B)
Note also that historically program text was written differently. An example for the function ASSOC.
assoc[x;y] =
eq[caar[y];x] -> cadar[y];
T -> assoc[x;cdr[y]]
Historically there existed also a mapping from these m-expressions (short for meta expressions) to s-expressions. Today most Lisp program code is written using s-expressions.
This is described here: McCarthy, Recursive Functions of Symbolic Expressions
Nowadays, in a Lisp such as Common Lisp, s-expressions have more syntax and can encode more data types:
Symbols: symbol123, |This is a symbol with spaces|
Numbers: 123, 1.0, 1/3, ...
Strings: "This is a string"
Characters: #\a, #\space
Vectors: #(a b c)
Conses and lists: ( a . b ), (a b c)
Comments: ; this is a comment, #| this is a comment |#
and more.
Lists
A list is a data structure. It consists of cons cells and a list end marker. In Lisp, lists are written using s-expression notation. Other notations for lists would be possible, but Lisp has settled on the s-expression syntax for writing them.
Side note: programs and forms
In a programming language like Common Lisp, the expressions of the programming language are not text, but data! This is different from many other programming languages. Expressions in the programming language Common Lisp are called Lisp forms.
For example, a function call is Lisp data: the call is a list with a function symbol as its first element and the arguments as its remaining elements.
We can write that as (sin 3.0). But it really is data. Data we can also construct.
The function to evaluate Lisp forms is called EVAL and it takes Lisp data, not program text or strings of program text. Thus you can construct programs using Lisp functions which return Lisp data: (EVAL (LIST 'SIN 3.0)) evaluates to 0.14112.
Since Lisp forms have a data representation, they are usually written using the external data representation of Lisp, which is s-expressions. In other words, Lisp forms, being Lisp data, are written externally as s-expressions.
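As a rough Python analogy for "EVAL takes data, not text", here is a tiny evaluator that works on nested lists built by ordinary functions. Everything in it is an assumption made for illustration: the ENV table, the evaluate function and the make_call helper are made-up names, symbols are modelled as strings, and only function calls and numbers are supported.

```python
import math
import operator

# Hypothetical global environment: symbol names mapped to Python callables.
ENV = {"sin": math.sin, "+": operator.add, "*": operator.mul}

def evaluate(form):
    if isinstance(form, list):
        # A list is a call: the first element names the function,
        # the remaining elements are argument forms to evaluate first.
        fn = ENV[form[0]]
        args = [evaluate(arg) for arg in form[1:]]
        return fn(*args)
    return form  # numbers evaluate to themselves

def make_call(name, *args):
    # Plays the role of (LIST 'SIN 3.0): builds a form as plain data.
    return [name, *args]

form = make_call("sin", 3.0)
print(form)                      # ['sin', 3.0]
print(round(evaluate(form), 5))  # 0.14112

nested = make_call("+", 1, make_call("*", 2, 3))
print(evaluate(nested))          # 7
```

Note that no string of program text appears anywhere: the form is constructed as a list and then evaluated, which is the part Python's own eval('3 + 4') cannot do directly.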
You should first understand the main Lisp feature: a program can be manipulated as data. Unlike other languages (like C or Java), where you write a program using special syntax ({, }, class, define, etc.), in Lisp you write code as (nested) lists (by the way, this lets you express abstract syntax trees directly). Once again: you write programs that look just like the language's data structures.
When you talk about it as data, you call it a "list", but when you talk about program code, you had better use the term "s-expression". Thus, technically they are similar, but they are used in different contexts. The only real place where these terms are mixed is metaprogramming (normally with macros).
Also note that an s-expression may consist of just a single atom (such as a number or a string).
A simple definition for an S-expression is
(define S-expression?
  (λ (object)
    (or (atom? object) (list? object))))
;; Where atom? is:
(define atom?
  (λ (object)
    (and (not (pair? object)) (not (null? object)))))
;; And list? is:
(define list?
  (λ (object)
    (let loop ((l1 object) (l2 object))
      (if (pair? l1)
          (let ((l1 (cdr l1)))
            (cond ((eq? l1 l2) #f)
                  ((pair? l1) (loop (cdr l1) (cdr l2)))
                  (else (null? l1))))
          (null? l1)))))
Both are written in a similar way: (blah blah blah), possibly nested, with one difference: lists are prefixed with an apostrophe.
On evaluation:
An S-expression returns some result (which may be an atom, a list, nil, or whatever)
Lists return Lists
If we need to, we can convert lists to s-exps and vice versa.
(eval '(blah blah blah)) => the list is treated as an s-exp and a result is returned.
(quote (blah blah blah)) => the s-exp is converted to a list, and the list is returned without being evaluated.
IAS:
If a list is treated as data, it is called a list; if it is treated as code, it is called an s-exp.
Quoting in Clojure results in non-evaluation. ':a and :a return the same result. What is the difference between ':a and :a? One is not evaluated and the other evaluates to itself... but is this the same as non-evaluation?
':a is shorthand for (quote :a).
(eval '(quote form)) returns form by definition. That is to say, if the Clojure function eval receives as its argument a list structure whose first element is the symbol quote, it returns the second element of said list structure without transforming it in any way (thus it is said that the quoted form is not evaluated). In other words, the behaviour eval dispatches to when its argument is a list structure of the form (quote foo) is that of returning foo unchanged, regardless of what it is.
When you write down the literal :a in your programme, it gets read in as the keyword :a; that is, the concrete piece of text :a gets converted to an in-memory data structure which happens to be called the :a keyword (Lisp being homoiconic means that occasionally it is hard to distinguish between the textual representation of Lisp data and the data itself, even when this would be useful for explanatory purposes...).
The in-memory data structure corresponding to the literal :a is a Java object which exposes a number of methods etc. and which has the interesting property that the function eval, when it receives this data object as an argument, returns it unchanged. In other words, the keyword's "evaluation to itself" which you ask about is just the behaviour eval dispatches to when passed in a keyword as an argument.
Thus when eval sees ':a, it treats it as a quoted form and returns the second part thereof, which happens to be :a. When, on the other hand, eval sees :a, it treats it as a keyword and returns it unchanged. The return value is the same in both cases (it's just the keyword :a); the evaluation process is slightly different.
Clojure semantics -- indeed Lisp semantics, for any dialect of Lisp -- are specified in terms of the values returned by and side-effects caused by the function eval when it receives various Lisp data structures as arguments. Thus the above explains what's actually meant to happen when you write down ':a or :a in your programme (code like (println :a) may get compiled into efficient bytecode which doesn't actually call the function eval, of course; but the semantics are always preserved, so that it still acts as if it was eval receiving a list structure containing the symbol println and the keyword :a).
The key idea here is that regardless of whether the form being evaluated is ':a or :a, the keyword data structure is constructed at read time; then when one of these forms is evaluated, that data structure is returned unchanged -- although for different reasons.
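The "same result, different reasons" point can be modelled with a small Python sketch. This is only an analogy and not Clojure's actual implementation: the Keyword class, the evaluate function and the rule strings are invented for illustration, and the symbol quote is modelled as the string "quote".

```python
class Keyword:
    """Stand-in for a Clojure keyword object such as :a."""
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return ":" + self.name


def evaluate(form):
    """Return (value, rule) so the rule that fired is visible."""
    if isinstance(form, Keyword):
        return form, "keyword evaluates to itself"
    if isinstance(form, list) and form and form[0] == "quote":
        # The second element is handed back untouched, whatever it is.
        return form[1], "quote returns its argument unevaluated"
    raise NotImplementedError("only keywords and quote forms are modelled here")


a = Keyword("a")       # built once, at "read time"
plain = a              # what the reader hands over for  :a
quoted = ["quote", a]  # what the reader hands over for ':a

value_plain, rule_plain = evaluate(plain)
value_quoted, rule_quoted = evaluate(quoted)

print(value_plain, "<-", rule_plain)
print(value_quoted, "<-", rule_quoted)
print(value_plain is value_quoted)  # True
```

Both calls hand back the very same object, but through different branches of the evaluator, which mirrors the distinction between "evaluates to itself" and "is not evaluated".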