How does Clojure evaluate code at compile time?

Here are two macros I wrote:
(defmacro hello [x] '(+ 1 2))
and
(defmacro hello [x] (eval '(+ 1 2)))
Macroexpanding the first one gives (+ 1 2), while macroexpanding the second gives the value 3. Does this mean the addition happened at compile time? How is that even possible? What if, instead of '(+ 1 2), I had written a call to a function that queries a database? Would it query the database at compile time?

A macro injects arbitrary code into the compiler. Usually, the purpose is to "pre-process" custom code like (1 + 2) into something Clojure understands like (+ 1 2). However, you could include anything (including DB access) into the compilation phase if you really want to. After all the compiler is just a piece of software running on a general-purpose computer. Since it is open-source, you could modify the compiler code directly to do anything.
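The infix "pre-processing" mentioned above can be sketched in a few lines. The macro name infix is made up for this illustration and only handles a single (left op right) triple:

```clojure
;; Hypothetical macro: rewrites (left op right) into (op left right).
;; The rewriting happens during macroexpansion, before the result is compiled.
(defmacro infix [[left op right]]
  (list op left right))

(macroexpand-1 '(infix (1 + 2)))   ;=> (+ 1 2)
(infix (1 + 2))                    ;=> 3
```

The macro body is ordinary Clojure code operating on the unevaluated form, which is why it could just as well do something unrelated to syntax.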
Using a macro is just a more convenient way of modifying/extending the base compiler code, that is optimized for extending the core Clojure language. However, macros are not limited to that use-case (if you really want to get crazy).
There is a similar ability in C++ via template metaprogramming, which is Turing-complete and runs inside the compiler. A famous example used the compiler to compute the first several prime numbers and emit them as "error" messages. See http://aszt.inf.elte.hu/~gsd/halado_cpp/ch06s04.html#Static-metaprogramming

A macro's body is executed at compile time. Its return value replaces the macro call in the code, and this new code is then compiled.
Thus your macro code:
(defmacro hello [x] (eval '(+ 1 2)))
will actually execute eval with the form '(+ 1 2) during compilation and the result value of that expression (3) will be returned as the result of the macro replacing its usage (e.g. (hello 0)).
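A small sketch can make the timing visible. The atom expansion-count and the names three and use-three are invented for this illustration. The side effect in the macro body runs while use-three is being compiled, not when it is called:

```clojure
(def expansion-count (atom 0))   ; hypothetical counter, for illustration only

(defmacro three []
  (swap! expansion-count inc)    ; side effect runs during macroexpansion
  (eval '(+ 1 2)))               ; evaluated by the compiler; the macro returns 3

(defn use-three [] (three))      ; the macro call is expanded while this defn is compiled

(def count-after-compile @expansion-count)

(use-three)                      ;=> 3
(use-three)                      ;=> 3
;; Calling use-three does not expand the macro again, so the counter
;; still equals count-after-compile at this point.
```

A database query placed where the swap! is would behave the same way: it would run whenever the macro call is expanded.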

Related

Use of eval inside macro (global vars unexpectedly resolved)

Recently I came across a use for eval within a macro, which I understand is a bit of a faux pas but let's ignore that for now. What I found surprising, was that eval was able to resolve global vars at macroexpansion time. Below is a contrived example, just to illustrate the situation I'm referring to:
(def list-of-things (range 10))
(defmacro force-eval [args]
(apply + (eval args)))
(macroexpand-1 '(force-eval list-of-things))
; => 45
I would have expected args to resolve to the symbol list-of-things inside force-eval, and then list-of-things to be evaluated resulting in an error due to it being unbound:
"unable to resolve symbol list-of-things in this context"
However, instead list-of-things is resolved to (range 10) and no error is thrown - the macroexpansion succeeds.
Contrast this with attempting to perform the same macroexpansion, but within a local binding context:
(defmacro force-eval [args]
(apply + (eval args)))
(let [list-of-things (range 10)]
(macroexpand-1 '(force-eval list-of-things)))
; => Unable to resolve symbol: list-of-things in this context
Note in the above examples I'm assuming list-of-things is not previously bound, e.g. a fresh REPL. One final example illustrates why this is important:
(defmacro force-eval [args]
(apply + (eval args)))
(def list-of-things (range 10 20))
(let [list-of-things (range 10)]
(macroexpand-1 '(force-eval list-of-things)))
; => 145
The above example shows that the locals are ignored, which is expected behavior for eval, but is a bit confusing when you are expecting the global to not be available at macroexpansion time either.
I seem to have a misunderstanding about what exactly is available at macroexpansion time. I had previously thought that essentially any binding, be it global or local, would not be available until runtime. Apparently this is an incorrect assumption. Is the answer to my confusion simply that global vars are available at macroexpansion time? Or am I missing some further nuance here?
Note: this related post closely describes a similar problem, but the focus there is more on how to avoid inappropriate use of eval. I'm mainly interested in understanding why eval works in the first example and by extension what's available to eval at macroexpansion time.

Of course, vars must be visible at compile time. That's where functions like first and + are stored. Without them, you couldn't do anything.
But keep in mind that you have to make sure to refer to them correctly. In the REPL, *ns* is bound to the namespace you are working in, so an unqualified symbol is looked up in the current namespace. If you are running a program through -main instead of the REPL, *ns* will generally not be bound to your namespace, and only properly qualified vars will be found. You can ensure that you qualify them correctly by using
`(force-eval list-of-things)
instead of
'(force-eval list-of-things)
Note I do not distinguish between global vars and non-global vars. All vars in Clojure are global. Local bindings are not called vars. They're called locals, or bindings, or variables, or some combination of those words.
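The difference between the two quoting styles can be checked directly. This is a minimal sketch; the names plain and qualified are only for illustration, and the namespace shown in the comment depends on where the code is read:

```clojure
(def list-of-things (range 10))

;; Plain quote leaves the symbol unqualified; it is looked up in *ns* when evaluated.
(def plain 'list-of-things)

;; Syntax quote resolves the symbol at read time and attaches a namespace to it.
(def qualified `list-of-things)

(namespace plain)       ;=> nil
(namespace qualified)   ; the namespace this code was read in, e.g. "user" at a fresh REPL
(eval qualified)        ; the value of the var, independent of the current *ns*
```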

Clojure is designed with an incremental compilation model. This is poorly documented.
In C and other traditional languages, source code must be compiled, then linked with pre-compiled libraries before the final result can be executed. Once execution begins, no changes to the code can occur until the program is terminated, when new source code can be compiled, linked, then executed. Java is normally used in this manner just like C.
With the Clojure REPL, you can start with zero source code in a live executing environment. You can call existing functions like (+ 2 3), or you can define new functions and variables on the fly (both global & local), and redefine existing functions. This is only possible because core Clojure is already available (i.e. clojure.core/+ etc is already "installed"), so you can combine these functions to define your own new functions.
The Clojure "compiler" works just like a giant REPL session. It reads and evaluates forms from your source code files one at a time, incrementally adding them to the global environment. Indeed, it is a design goal/requirement that the result of compiling and executing source code is identical to what would occur if you just pasted each entire source code file into the REPL (in proper dependency order).
Indeed, the simplest mental model for code execution in Clojure is to pretend it is an interpreter instead of a traditional compiler.
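One consequence of this form-by-form model, sketched below with made-up names, is that a macro can call a function defined earlier in the same file while it is being expanded, because that function has already been compiled and installed by the time the later form is read:

```clojure
;; Form 1: defined and immediately available to everything that follows.
(defn double-it [n] (* 2 n))

;; Form 2: the macro body calls double-it during macroexpansion.
(defmacro doubled-at-expansion [n]
  (double-it n))

;; Form 3: the macro call is replaced by the literal 42 before compilation.
(def answer (doubled-at-expansion 21))

(macroexpand-1 '(doubled-at-expansion 21))   ;=> 42
answer                                       ;=> 42
```

This sketch assumes the macro is given a literal number; passing a symbol would hand the symbol itself to double-it and fail.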

And eval in a macro makes no sense, because a macro already implicitly contains an eval at the very final step.
If you use macroexpand-1, you make visible how the code was manipulated in the macro before the implicit eval is invoked.
An eval in a macro is an anti-pattern which might indicate that you should use a function instead of a macro, and in your example this is exactly the case.
So your aim is to invoke something dynamically (at run time) in a macro. You can only do this through an eval applied to a macro call, or you should rather use a function.
(defmacro force-eval [args]
(apply + (eval args)))
;; What you actually mean is:
(defn force-eval [args]
(apply + args))
;; because a function in lisp evaluates its arguments
;; - before applying the function body.
;; That means: args in the function body is exactly
;; `(eval args)`!
(def list-of-things (range 10))
(let [lit-of-things (range 10 13)]
(force-eval list-of-things))
;; => 45
;; so this is exactly the behavior you wanted!
The point is, your construct is a "bad" example for a macro, because apply is a special function which allows you to dynamically rearrange function call structures, so it has some of the magic of macros inside it, but at run time.
With apply you can do quite some metaprogramming in cases where you just quote some of your input arguments.
(Try (force-eval '(1 2 3)): it returns 6, because apply puts + in front of the elements of (1 2 3), as if (+ 1 2 3) had been called.)
The second point: I am thinking of this answer I once gave, and this one, to a dynamic macro call problem in Common Lisp.
In short: when you have to control two levels of evaluation inside a macro (often when you want a macro to inject some code at run time into some code), you need to use eval when calling the macro and evaluate those parts of the macro call which should then be processed in the macro.

Trouble understanding a simple macro in Clojure - passing macro as stand-in for higher order function to map

I've been into Clojure lately and have avoided macros up until now, so this is my first exposure to them. I've been reading "Mastering Clojure Macros", and in Chapter 3, page 28, I encountered the following example:
user=> (defmacro square [x] `(* ~x ~x))
;=> #'user/square
user=> (map (fn [n] (square n)) (range 10))
;=> (0 1 4 9 16 25 36 49 64 81)
The context is that the author is explaining that while simply passing the square macro to map results in an error (can't take value of a macro), wrapping it in a function works because:
"when the anonymous function (fn [n] (square n)) gets compiled, the square expression gets macroexpanded, to (fn [n] (clojure.core/* n n)). And this is a perfectly reasonable function, so we don't have any problems with the compiler"
This makes sense to me if we assume the body of the function is evaluated before runtime (at compile, or "definition" time), thus expanding the macro ahead of runtime. However, I always thought that function bodies were not evaluated until runtime, and at compile time you would basically just have a function object with some knowledge of its lexical scope (but no knowledge of its body).
I'm clearly mixed up on the compile/runtime semantics here, but when I look at this sample I keep thinking that square won't be expanded until map forces its call, since it's in the body of the anonymous function, which I thought would be unevaluated until runtime. I know my thinking is wrong, because if that were the case, then n would be bound to each number in (range 10), and there wouldn't be an issue.
I know it's a pretty basic question, but macros are proving to be pretty tricky for me to fully wrap my head around at first exposure!

Generally speaking, function bodies aren't evaluated at compile time, but macro calls are always expanded at compile time, whether they appear inside a function body or not.
You can write a macro that expands to a function, but you still can't refer/pass the macro around as if it were a function:
(defmacro inc-macro [] `(fn [x#] (inc x#)))
=> #'user/inc-macro
(map (inc-macro) [1 2 3])
=> (2 3 4)

defmacro is expanded at compile time, so you can think of it as a function executed during compilation. This will replace every occurrence of the macro "call" with the code it "returns".
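To see this with the square macro from the question, the following sketch inspects the macro and its expansion; square-fn is a name introduced here for illustration:

```clojure
(defmacro square [x] `(* ~x ~x))

;; The var holds a function that is flagged as a macro.
(:macro (meta #'square))          ;=> true

;; What the compiler substitutes for the call, with n still an unevaluated symbol.
(macroexpand-1 '(square n))       ;=> (clojure.core/* n n)

;; By the time this fn is compiled, its body is already (clojure.core/* n n),
;; so at run time it is an ordinary function that can be passed to map.
(def square-fn (fn [n] (square n)))

(map square-fn (range 4))         ;=> (0 1 4 9)
```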

You may consider a macro as a syntax rule. For example, in Scheme, a macro is declared with the define-syntax form. There is a special step in the compiler that replaces all macro calls with their expansions before compiling the code. As a result, there won't be any square calls in your final code. Say, if you wrote something like
(def value (square 3))
the final version after expansion would be
(def value (clojure.core/* 3 3))
There is a special way to check what your macro call will look like after being expanded:
user=> (defmacro square [x] `(* ~x ~x))
#'user/square
user=> (macroexpand '(square 3))
(clojure.core/* 3 3)
That's why a macro is an ephemeral thing that lives only in the source code and not in the compiled version of it. That's why it cannot be passed as a value or referenced somehow.
The best rule regarding macros is: try to avoid them until you really need them in your work.

If there is no distinction between read time, compile time and runtime in Lisp, can someone give me some intuitive examples?

I was reading the blog post Revenge of the Nerds. It says (in the section on what made Lisp different):
The whole language there all the time. There is no real distinction between read-time, compile-time, and runtime. You can compile or run code while reading, read or run code while compiling, and read or compile code at runtime.
Running code at read-time lets users reprogram Lisp's syntax; running code at compile-time is the basis of macros; compiling at runtime is the basis of Lisp's use as an extension language in programs like Emacs; and reading at runtime enables programs to communicate using s-expressions, an idea recently reinvented as XML.
In order to understand this passage, I drew a statechart diagram:
I have two questions:
How should I understand that reading at runtime enables programs to communicate using s-expressions, an idea reinvented as XML?
What can we do when compiling at read time or reading at compile time?

XML lets you exchange data at runtime between programs (or between different invocations of the same program). The same goes for JSON, which is a subset of JavaScript and a little closer to Lisp in spirit. However, Common Lisp gives more control over how the different steps are executed; as explained in the quote, you can reuse the same tools as your Lisp environment instead of building a framework like other languages need to.
Basically, you print data to a file:
(with-open-file (out file :direction :output)
(write data :stream out :readably t))
... and you restore it later:
(with-open-file (in file) (read in))
You call that "serialization" or "marshalling" in other languages (and in fact, in some Lisp libraries).
The READ step can be customized: you can read data written in a custom syntax (JSON.parse accepts a reviver function, so it is a little bit similar; the Lisp reader works for normal code too). For example, the local-time library has a special syntax for dates that can be used to rebuild a date object from a stream.
In practice, this is a bit more complex because not all data has a simple readable form (how do you save a network connection?), but you can write forms that can restore the information when you load it (e.g. restore a connection). So Lisp allows you to customize READ and PRINT, with readtables and PRINT-OBJECT, but there is also LOAD-TIME-VALUE and MAKE-LOAD-FORM, which allows you to allocate objects and initialize them when loading code. All of this is already available in the language, but there are also libraries that make things even easier, like cl-conspack: you just store classes into files and load them back without having to define anything special (assuming you save all slots). This works well thanks to the meta-object protocol.
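For readers following the Clojure threads on this page, a rough analog of the print-then-read round trip looks like the sketch below. It is not the Common Lisp mechanism described above: it assumes plain data and uses clojure.edn, which reads data without evaluating anything. The names data, text and restored are illustrative.

```clojure
(require 'clojure.edn)

(def data {:name "example" :values [1 2 3] :tags #{:a :b}})

;; Print: produce a readable textual form of the data.
(def text (pr-str data))

;; Read: parse the text back into data; nothing is evaluated.
(def restored (clojure.edn/read-string text))

(= data restored)   ;=> true
```

In a real program the string would be written to a file or a socket between the two steps.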

Common Lisp
READ is a function which reads s-expressions and returns Lisp data.
CL-USER> (read)
(defun example (n) (if (zerop n) 1 (* (example (1- n)) n))) ; <- input
(DEFUN EXAMPLE (N) (IF (ZEROP N) 1 (* (EXAMPLE (1- N)) N))) ; <- output
The last result again:
CL-USER> *
(DEFUN EXAMPLE (N) (IF (ZEROP N) 1 (* (EXAMPLE (1- N)) N)))
Setting the variable code to the last result.
CL-USER> (setf code *)
(DEFUN EXAMPLE (N) (IF (ZEROP N) 1 (* (EXAMPLE (1- N)) N)))
What is the third element?
CL-USER> (third code)
(N)
We can evaluate this list, since it looks like valid Lisp code:
CL-USER> (eval code)
EXAMPLE
The function EXAMPLE has been defined. Let's get the function object:
CL-USER> (function example)
#<interpreted function EXAMPLE 21ADA5B2>
It's an interpreted function. We use a Lisp interpreter.
Let's use the function by mapping it over a list:
CL-USER> (mapcar (function example) '(1 2 3 4 5))
(1 2 6 24 120)
Let's compile the function:
CL-USER> (compile 'example)
EXAMPLE
NIL
NIL
The function has been compiled successfully. The compiler has no warnings and the function should now run much faster.
Let's use it again:
CL-USER> (mapcar (function example) '(1 2 3 4 5))
(1 2 6 24 120)
This is the same result, but probably computed much faster.
Since it is now compiled, let's disassemble the function:
CL-USER> (disassemble #'example)
0 : #xE3E06A03 : mvn tmp1, #12288
4 : #xE18D6626 : orr tmp1, sp, tmp1, lsr #12
8 : #xE5166030 : ldr tmp1, [tmp1, #-48]
... and a lot more lines of ARM assembler machine code
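The same read, inspect, eval sequence can be sketched in Clojure, the language of the other threads on this page. This is an analog under stated assumptions, not a translation of the session above: Clojure has no separate interpreter, so eval compiles the form, and there is no built-in counterpart to disassemble. The name example-var is introduced here for illustration.

```clojure
;; Read: turn text into data (a list), without evaluating it.
(def code
  (read-string "(defn example [n] (if (zero? n) 1 (* (example (dec n)) n)))"))

(first code)   ;=> defn
(nth code 2)   ;=> [n]

;; Eval: the list is compiled and run; defn returns the var it created.
(def example-var (eval code))

;; Vars can be called like functions.
(map example-var [1 2 3 4 5])   ;=> (1 2 6 24 120)
```

For text from an untrusted source, clojure.edn/read-string is the safer reader, since clojure.core/read-string can execute code at read time.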

Clojure, unquote splicing outside of syntax quote

In Clojure, we can use unquote splicing ~@ to spread a list. For example
(macroexpand `(+ ~@'(1 2 3)))
expands to
(clojure.core/+ 1 2 3)
This is a useful feature in macros when rearranging the syntax. But is it possible to use unquote splicing or a similar technique outside of a macro and without eval?
Here is the solution with eval
(eval `(+ ~@'(1 2 3))) ;-> 6
But I would rather do
(+ ~@'(1 2 3))
Which unfortunately throws an error
IllegalStateException Attempting to call unbound fn: #'clojure.core/unquote-splicing clojure.lang.Var$Unbound.throwArity (Var.java:43)
At first I thought apply would do it, and it is indeed the case with functions
(apply + '(1 2 3)) ; -> 6
However, this is not the case with macros or special forms. It's obvious with macros, as a macro is expanded before apply runs and must be the first element in the form anyway. With special forms it's not so obvious, but it still makes sense, as they aren't first-class citizens the way functions are. For example, the following throws an error
(apply do ['(println "hello") '(println "world")]) ;-> error
Is the only way to "apply" a list to a special form at runtime to use unquote splicing and eval?

Clojure has a simple model of how programs are loaded and executed. Slightly simplified, it goes something like this:
1. some source code is read from a text stream by the reader;
2. this is passed to the compiler one form at a time;
3. the compiler expands any macros it encounters;
4. for non-macros, the compiler applies various simple evaluation rules (special rules for special forms, literals evaluate to themselves, function calls are compiled as such etc.);
5. the compiled code is evaluated and possibly changes the compilation environment used by the following forms.
Syntax quote is a reader feature. It is replaced at read time by code that emits list structure:
;; note the ' at the start
user=> '`(+ ~@'(1 2 3))
(clojure.core/seq
(clojure.core/concat (clojure.core/list (quote clojure.core/+)) (quote (1 2 3))))
It is only in the context of syntax-quoted blocks that the reader affords ~ and ~@ this special handling, and syntax-quoted blocks always produce forms that may call a handful of seq-building functions from clojure.core and are otherwise composed from quoted data.
This all happens as part of step 1 from the list above. So for syntax-quote to be useful as an apply-like mechanism, you'd need it to produce code in the right shape at that point in the process that would then look like the desired "apply result" in subsequent steps. As explained above, syntax-quote always produces code that creates list structure, and in particular it never returns unquoted expressions that look like unquoted dos or ifs etc., so that's impossible.
This isn't a problem, since the code transformations that are reasonable given the above execution model can be implemented using macros.
Incidentally, the macroexpand call is actually superfluous in your example, as the syntax-quoted form already is the same as its macroexpansion (as it should be, since + is not a macro):
user=> `(+ ~@'(1 2 3))
(clojure.core/+ 1 2 3)
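As a sketch of the macro route mentioned above, the hypothetical apply-do below splices a literal vector of forms into a do. It only works when the forms are written out at the call site, since the macro sees them at compile time; a list that only exists at run time still needs eval.

```clojure
;; Hypothetical macro: `forms` is the unevaluated literal vector from the call site.
(defmacro apply-do [forms]
  `(do ~@forms))

(macroexpand-1 '(apply-do [(println "hello") (println "world")]))
;=> (do (println "hello") (println "world"))

;; do returns the value of its last form.
(apply-do [(+ 1 2) (* 3 4)])   ;=> 12
```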

Macros can't return anonymous functions? Clojure

I have the following macro:
(defmacro anon-mac [value]
#(+ value 1))
I would expect for this to behave as such:
((anon-mac 1) 1) ;=> 2
However I get this error:
IllegalArgumentException No matching ctor found for class user$anon_mac$fn__10767 clojure.lang.Reflector.invokeConstructor (Reflector.java:163)
What should I do to be able to have this macro return an anonymous function that works as I would expect?
The answer must be a macro, since my question concerns the ability of macros to return anonymous functions.
Why must the answer be a macro? In my case it is because I do not want this conversion to be called more than once at compile time wherever it is found. If I were to have this conversion in a for loop that called it 200 times, with a function the conversion would be run 200 times. However seeing as macros edit the code itself it would only be run once for that for loop.

I simply needed to syntax-quote the function while unquoting the inner variable, like so:
(defmacro anon-mac [value] `#(+ % ~value))
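An equivalent spelling with an explicit fn and an auto-gensym parameter is sketched below, followed by a variant that does some work once at expansion time, which seems to be the motivation in the question. The names expensive-conversion and anon-mac-precomputed are hypothetical stand-ins, and the variant assumes the macro is called with a literal number.

```clojure
;; The macro returns code for a function, not a function object.
(defmacro anon-mac [value]
  `(fn [x#] (+ x# ~value)))

((anon-mac 1) 1)   ;=> 2

;; Hypothetical stand-in for a costly computation.
(defn expensive-conversion [v] (* v 10))

(defmacro anon-mac-precomputed [value]
  (let [converted (expensive-conversion value)]   ; runs once, during expansion
    `(fn [x#] (+ x# ~converted))))                ; only the result ends up in the code

((anon-mac-precomputed 1) 1)   ;=> 11
```

The original version failed because its body returned an already-built function object, which the compiler then tried to embed in the generated code.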

Not sure what you are using it for, but if you wanted to go with a function, you might find partial helpful, as it provides the behavior that you are after.
(defn anon-partial [val] (partial + val))
((anon-partial 1) 1) ;;=> 2
There are also some helpful examples of partial at clojuredocs.org.
