Below is a simple Clojure app example created with lein new mw:
(ns mw.core
(:gen-class))
(def fs (atom {}))
(defmacro op []
(swap! fs assoc :macro-f "somevalue"))
(op)
(defn -main [& args]
(println #fs))
and in project.clj I have
:profiles {:uberjar {:aot [mw.core]}}
:main mw.core
When run in REPL, evaluating #fs returns {:macro-f somevalue}. But, running an uberjar yields {}. If I change op definition to defn instead of defmacro, then fs has again proper content when run from an uberjar. Why is that?
I vaguely realize that this has something to do with AOT compilation and the fact that macro-expansion occurs before the compilation phase, but clearly my understanding of these things is lacking.
I ran into this issue while trying to deploy an application that uses a very nice mixfix library, in which mixfix operators are defined using a global atom. It took me quite a long time to isolate the issue to the example presented above.
Any help will be greatly appreciated.
Thanks!
The real problem here is that your macro is incorrect. You forgot to add backquote character:
(defmacro op []
`(swap! fs assoc :macro-f "somevalue"))
; ^ syntax-quote ("backquote")
This operation is called syntax-quote and it's very important here, because macros in clojure modify your code during its compilation.
So, as a result you've got an impure macro, modifying fs atom whenever your code is compiled.
Since your macro doesn't produce any code, (op) call in your example does nothing at all (only it's compilation do). It appear to be working in REPL because compilation and execution is handled by the same clojure instance (see Timur's answer for details).
This is indeed related to the AOT, and the fact that some side effects are expected when a top-level code is executed - here at macro expansion time. The difference between the lein repl (or lein run) and the uberjar is in when exactly this happens.
When lein repl is executed, REPL starts and then loads the mw.core namespace automatically, if it is defined in project.clj, or one does it manually. When namespace is loaded, first the atom is defined, then macro is expanded and this expansion changes the value of the atom. All this happens in same runtime environment (in REPL process), and after the module is loaded, atom has an updated value in this REPL. Executing lein run will do pretty much the same - load namespace and then execute -main function in the same process.
And when lein uberjar is executed - same thing happens and this is the problem now. Compiler, in order to compile the clj file will first load it and evaluate the top level (I learned it myself from this SO answer). So the module is loaded, top level is evaluated, macro is expanded, reference value is changed and then, after compilation completes, the compiler process, the one where reference value just changed, ends. Now, when the uberjar is executed with java -jar this spawns the new process, with a compiled code, where the macro is already expended (so (op) is already "replaced" with the code the op macro generated, which is none in this case). Therefore, atom value is unchanged.
In my opinion, good fix would be to not rely on side effects in a macro.
If stick to the macro anyway, the way to make this idea work is to skip the AOT for the module where macro expansion happens and load it lazily from the main module (again, same solution as in the other SO answer I mentioned). For example:
project.clj:
; ...
:profiles {:uberjar {:aot [mw.main]}}) ; note, no `mw.core` here
; ...
main.clj:
(ns mw.main
(:gen-class))
(defn get-fs []
(require 'mw.core)
#(resolve 'mw.core/fs))
(defn -main [& args]
(println #(get-fs)))
core.clj:
(ns mw.core
(:gen-class))
(def fs (atom {}))
(defmacro op []
(swap! fs assoc :macro-f "somevalue"))
(op)
I'm not sure myself, however, if this solution is stable enough and that there are no edge cases. It does work though on this simple example.
Related
I've reduced my bigger problem to an artificial MVE (minimal viable example)
using file-io for illustration. My question concerns a certain wrapper macro
that I explain below; it does not concern better ways to use the file-io APIs;
I'm just using file-io to illustrate the macro problem in a small and easy
context. The wrapper macro tactic in my real problem is harder to show and
explain, but this MVE captures the gist of the problem.
Consider the following protocol:
(defprotocol Dumper
(dump [this]))
and an implementation over java.io.File
(extend-type java.io.File
Dumper
(dump [file]
(with-open [rdr (io/reader file)]
(doseq [line (line-seq rdr)]
(println line)))))
where we have done a (:use [clojure.java.io :as io]) to get the reader
function. I can use this as follows:
(defn -main
[& args]
(dump (io/file "resources/a_file.txt")))
Hello from a text file.
Now, I want to create another implementation of the protocol, this time over
java.lang.String. This implementation wraps the string, treating it as a
file-path string; creates a clojure.java.io/file; then calls into the other
implementation of the protocol:
(extend-type java.lang.String
Dumper
(dump [path-str] (-> path-str, io/file, dump)))
and call it like this:
(defn -main
[& args]
(dump (io/file "resources/a_file.txt"))
(dump "resources/a_file.txt"))
Hello from a text file.
Hello from a text file.
In my real problem, I have many functions in the protocol, and one
implementation just wraps the other in the manner shown. Notice that, in the
wrapper implementation, the method name, dump, is replicated. Let's eliminate
that replication with a macro (it's worth doing when the real protocol has many
methods):
(defmacro wrap-path-string [method]
`(~method [path-str] (-> path-str, io/file, ~method)))
(extend-type java.lang.String
Dumper
(wrap-path-string dump))
Oops, the compiler doesn't like it:
Exception in thread "main" java.lang.UnsupportedOperationException:
nth not supported on this type: Symbol, compiling:(wrapper_mve/core.clj:18:1)
at clojure.lang.Compiler.analyze(Compiler.java:6688)
at clojure.lang.Compiler.analyze(Compiler.java:6625)
at clojure.lang.Compiler$MapExpr.parse(Compiler.java:3072)
I tried macroexpand-all'ing and macroexpand-1'ing the macro calls (in CIDER,
difficult to replicate here), and it looks ok. I'm at a loss how to debug
deeper, but perhaps someone here can spot the problem.
Again, I know this MVE has better solutions with the file-io APIs, but I really
want to debug the macro, not find ways to avoid using it, because I need the
wrapper-macro tactic in my real problem.
I believe the problem is that extend-type is itself a macro, and macroexpansion begins with the outermost form (as opposed to function evaluation, which evaluates each argument before invoking the function). In this case the macroexpansion of extend-type is trying to treat the form (wrap-path-string dump) as a function body, and is expecting the second item to be an arg vector but finds the symbol dump.
If you want to go this route, I think you'll need to write a macro that will produce the desired expand-type form with all the function bodies already expanded in place.
I've set up small project:
project.clj:
(defproject testing-compilation "0.1.0-SNAPSHOT"
:dependencies [[org.clojure/clojure "1.8.0"]]
;; this is important!
:aot :all)
src/core.clj
(ns testing-compilation.core)
(def x (do
(println "Print during compilation?")
1))
Then when I do lein compile in project directory I'm seeing output from a print:
$ lein compile
Compiling testing-compilation.core
Print during compilation?
My question is: why clojure evaluates top level forms during AOT compilation? Shouldn't they be evaluated at program startup?
For reference, Common Lisp doesn't evaluate forms by default and provides ability to tune this behaviour. Anything similar in Clojure? If nothing, does Clojure documentation explicitly state such behaviour?
UPD: Forms are evaluated at startup as well.
After specifying a main namespace and writing main function that prints Hello, world!, I did this:
$ lein uberjar
Compiling testing-compilation.core
Print during compilation?
Created testing-compilation-0.1.0-SNAPSHOT.jar
Created testing-compilation-0.1.0-SNAPSHOT-standalone.jar
$ java -jar target/testing-compilation-0.1.0-SNAPSHOT-standalone.jar
Print during compilation?
Hello world!
The first part of the AOT process is to find the file containing the main namespace and load it by evaluating every expression from top to bottom.
Some of these expressions will be require expressions that will have the effect of loading other namespaces, which recursively load yet more namespaces.
Other will be defn expressions that will have the effect of firing up the compiler and producing class files. One class file is produced for each function.
Other expressions may do some calculation and then do things that produce class files, so it's important to give them a chance to run. Here's a made up example:
user> (let [precomputed-value (reduce + (range 5))]
(defn funfunfun [x]
(+ x precomputed-value)))
#'user/funfunfun
user> (funfunfun 4)
14
It is possible to design a lisp that would not evaluate top level forms at start or, as you mention, make it optional. In the case of Clojure it was decided to keep a single evaluation strategy across both AOT and "non AOT" loading so programs always run the same regardless of how they are compiled. These are personal design choices made by others so I can't speak to their motivations here.
Simple as the headline suggest:
Why do I have to
(require 'clojure.edn)
in order to use e.g.:
(clojure.edn/read-string "9")
And why can I immediately invoke:
(clojure.string/join (range 4) "-")
Clojure programs start at the top of the "main" namespace (often project-name.core) and evaluate each form from top to bottom. This happens when the program starts, and before any "main" functions are invoked.
When a require expression is evaluated it jumps over to that namespace and does the same thing there. If requires are encountered there it recurses down branches of those namespaces, recursively loading each namespace as required.
So If you don't explicitly state that your namespace requires another namespace, then you are at the mercy of the order that other namspaces you require load their dependencies. Sometimes it will work, and sometimes unrelated changes to the evaulation order of your distant dependencies will break your code.
So please, please, ... declare your own requirements!
in the repl (just starting clojure) I have the following ns loaded by default
user=> (pprint (map #(.getName %) (all-ns)))
(clojure.edn
clojure.core.server
clojure.java.io
clojure.java.shell
clojure.core.protocols
clojure.string
clojure.java.browse
clojure.pprint
clojure.uuid
clojure.core
clojure.main
user
clojure.java.javadoc
clojure.repl
clojure.walk
clojure.instant)
in whichever namespace you are in, clojure.edn seems not to be loaded
I'm trying to understand "Lieningen" behaviour when creating an uberjar. Following is the minimal example which reproduces the behaviour:
(ns my-stuff.core
(:gen-class))
(def some-var (throw (Exception. "boom!")))
(defn -main [& args]
(println some-var))
When this is executed with lein run it clearly fails with an Exception. However, I don't understand why executing lein uberjar also fails with an Exception from variable definition? Why executing lein uberjar attempts to evaluate the variable value? Is this speecific to uberjar task or am I missing something more substantial about Clojure or Leiningen?
In order to compile your namespace for the uberjar (if you have AOT turned on), the clojure compiler must load your namespace. This will always invoke all top-level side effects.
The best way to handle this is to never have side effects in top level code (whether inside or outside a def form), and have initialization functions to realize any start-up side effects needed.
A workaround can be to make a small namespace that uses introspection to load the rest of your code at runtime but not while compiling - using a function like this:
(defn -main
[]
(require 'my.primary.ns)
((resolve 'my.primary.ns/start)))
if that namespace is compiled, the jvm can find -main and run it, despite none of your other code being compiled. The runtime require will cause the Clojure compiler to load the rest of your code at runtime only, and resolve is needed so that -main will compile cleanly - it returns the var referenced, which then invokes your function when called.
I am working on an webapp that relies on a certain data file to be slurped at runtime. Without the datafile present I don't seem to be able to compile. Why is this?
This is in my core.clj
(def my-data (slurp "my-file.txt"))
Then when I try to compile:
$ lein ring war
I get this exception
Exception in thread "main" java.io.FileNotFoundException: my-file.txt (No such file or directory), compiling:(core.clj:24:28)
How can I compile my war? I don't need the file to be slurped or even check for existence at compile time. Thanks in advance!
[UPDATE]
This is not specific to war file packaging or ring, for example:
(ns slurp-test.core
(:gen-class))
(def x (slurp "/tmp/foo.txt"))
(defn -main [& args]
(println x))
Then:
$ lein uberjar
Compiling slurp-test.core
(ns slurp-test.core
Exception in thread "main" java.io.FileNotFoundException: /tmp/foo.txt (No such file or directory), compiling:(core.clj:4:8)
How can I fix this?
Compiling a Clojure source file involves evaluating all top-level forms. This is in fact strictly necessary to support the expected semantics -- most notably, macros couldn't work properly otherwise1.
If you AOT compile your code, top-level forms will be evaluated at compile time, and then again at run time as your compiled code is loaded.
For this reason, it is generally not a good idea to cause side effects in code living at top level. If an app requires initialization, it should be performed by a function (typically -main).
1 A macro is a function living in a Var marked as a macro (with :macro true in the Var's metadata; there's a setMacro method on clojure.lang.Var which adds this entry). Macros must clearly be available to the compiler, so they must be loaded at compile time. Furthermore, in computing an expansion, a macro function may want to call non-macro functions or otherwise make use of the values of arbitrary Vars resulting from evaluating any top-level code occurring before the point where the macro is invoked. Removing these capabilities would cripple the macro facility rather badly.