I am working on a webapp that relies on a certain data file being slurped at runtime. Without the data file present I don't seem to be able to compile. Why is this?
This is in my core.clj
(def my-data (slurp "my-file.txt"))
Then when I try to compile:
$ lein ring war
I get this exception
Exception in thread "main" java.io.FileNotFoundException: my-file.txt (No such file or directory), compiling:(core.clj:24:28)
How can I compile my war? I don't need the file to be slurped, or even checked for existence, at compile time. Thanks in advance!
[UPDATE]
This is not specific to war file packaging or ring, for example:
(ns slurp-test.core
  (:gen-class))
(def x (slurp "/tmp/foo.txt"))
(defn -main [& args]
  (println x))
Then:
$ lein uberjar
Compiling slurp-test.core
(ns slurp-test.core
Exception in thread "main" java.io.FileNotFoundException: /tmp/foo.txt (No such file or directory), compiling:(core.clj:4:8)
How can I fix this?
Compiling a Clojure source file involves evaluating all top-level forms. This is in fact strictly necessary to support the expected semantics -- most notably, macros couldn't work properly otherwise (see footnote 1).
If you AOT compile your code, top-level forms will be evaluated at compile time, and then again at run time as your compiled code is loaded.
For this reason, it is generally not a good idea to cause side effects in code living at top level. If an app requires initialization, it should be performed by a function (typically -main).
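As a minimal sketch of that advice applied to the question's code, the read can be wrapped in a delay so that nothing touches the file system until the value is first needed. The name data-path is introduced here only for illustration; delay is one option among several, and an explicit initialization function would work equally well.

```clojure
;; Hypothetical path, for illustration only.
(def data-path "my-file.txt")

;; delay captures the body without running it, so loading (or AOT-compiling)
;; this namespace performs no file I/O.
(def my-data (delay (slurp data-path)))

(defn -main [& args]
  ;; The file is read on the first deref; the result is cached afterwards.
  (println @my-data))
```

With this shape the file only has to exist when -main (or a request handler) first dereferences my-data, not when the war or uberjar is built.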
1 A macro is a function living in a Var marked as a macro (with :macro true in the Var's metadata; there's a setMacro method on clojure.lang.Var which adds this entry). Macros must clearly be available to the compiler, so they must be loaded at compile time. Furthermore, in computing an expansion, a macro function may want to call non-macro functions or otherwise make use of the values of arbitrary Vars resulting from evaluating any top-level code occurring before the point where the macro is invoked. Removing these capabilities would cripple the macro facility rather badly.
I've set up small project:
project.clj:
(defproject testing-compilation "0.1.0-SNAPSHOT"
:dependencies [[org.clojure/clojure "1.8.0"]]
;; this is important!
:aot :all)
src/core.clj
(ns testing-compilation.core)
(def x (do
         (println "Print during compilation?")
         1))
Then when I do lein compile in project directory I'm seeing output from a print:
$ lein compile
Compiling testing-compilation.core
Print during compilation?
My question is: why does Clojure evaluate top-level forms during AOT compilation? Shouldn't they be evaluated at program startup?
For reference, Common Lisp doesn't evaluate forms by default and provides the ability to tune this behaviour. Is there anything similar in Clojure? If not, does the Clojure documentation explicitly state this behaviour?
UPD: Forms are evaluated at startup as well.
After specifying a main namespace and writing a main function that prints Hello world!, I did this:
$ lein uberjar
Compiling testing-compilation.core
Print during compilation?
Created testing-compilation-0.1.0-SNAPSHOT.jar
Created testing-compilation-0.1.0-SNAPSHOT-standalone.jar
$ java -jar target/testing-compilation-0.1.0-SNAPSHOT-standalone.jar
Print during compilation?
Hello world!
The first part of the AOT process is to find the file containing the main namespace and load it by evaluating every expression from top to bottom.
Some of these expressions will be require expressions that will have the effect of loading other namespaces, which recursively load yet more namespaces.
Others will be defn expressions that will have the effect of firing up the compiler and producing class files. One class file is produced for each function.
Other expressions may do some calculation and then do things that produce class files, so it's important to give them a chance to run. Here's a made up example:
user> (let [precomputed-value (reduce + (range 5))]
        (defn funfunfun [x]
          (+ x precomputed-value)))
#'user/funfunfun
user> (funfunfun 4)
14
It is possible to design a lisp that would not evaluate top level forms at start or, as you mention, make it optional. In the case of Clojure it was decided to keep a single evaluation strategy across both AOT and "non AOT" loading so programs always run the same regardless of how they are compiled. These are personal design choices made by others so I can't speak to their motivations here.
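As far as I'm aware Clojure has no direct counterpart to Common Lisp's eval-when, but clojure.core does expose the dynamic var *compile-files*, which is true only while the compiler is writing class files. The following is a rough sketch of using it to skip a side effect during AOT compilation, based on the x var from the question; it is a workaround rather than a recommended pattern.

```clojure
;; *compile-files* is true while AOT compilation is writing class files
;; and false otherwise, including when the compiled code is later loaded.
(def x
  (when-not *compile-files*
    (println "Evaluated at load time, not during AOT compilation")
    1))
```

During lein compile the var would be bound to nil and nothing is printed; when the compiled namespace is loaded at startup the same form runs again with *compile-files* false, so x becomes 1.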
Below is a simple Clojure app example created with lein new mw:
(ns mw.core
  (:gen-class))
(def fs (atom {}))
(defmacro op []
  (swap! fs assoc :macro-f "somevalue"))
(op)
(defn -main [& args]
  (println @fs))
and in project.clj I have
:profiles {:uberjar {:aot [mw.core]}}
:main mw.core
When run in the REPL, evaluating @fs returns {:macro-f somevalue}. But running the uberjar yields {}. If I change the op definition to defn instead of defmacro, then fs again has the proper content when run from an uberjar. Why is that?
I vaguely realize that this has something to do with AOT compilation and the fact that macro-expansion occurs before the compilation phase, but clearly my understanding of these things is lacking.
I ran into this issue while trying to deploy an application that uses a very nice mixfix library, in which mixfix operators are defined using a global atom. It took me quite a long time to isolate the issue to the example presented above.
Any help will be greatly appreciated.
Thanks!
The real problem here is that your macro is incorrect. You forgot to add the backquote character:
(defmacro op []
  `(swap! fs assoc :macro-f "somevalue"))
; ^ syntax-quote ("backquote")
This operation is called syntax-quote and it's very important here, because macros in Clojure modify your code during its compilation.
So, as a result, you've got an impure macro that modifies the fs atom whenever your code is compiled.
Since your macro doesn't produce any code, the (op) call in your example does nothing at all at run time (only its compilation does). It appears to work in the REPL because compilation and execution are handled by the same Clojure instance (see Timur's answer for details).
This is indeed related to AOT, and to the fact that some side effects are expected when top-level code is executed - here at macro expansion time. The difference between lein repl (or lein run) and the uberjar is in when exactly this happens.
When lein repl is executed, the REPL starts and then loads the mw.core namespace automatically if it is defined in project.clj, or one does it manually. When the namespace is loaded, first the atom is defined, then the macro is expanded, and this expansion changes the value of the atom. All this happens in the same runtime environment (the REPL process), and after the module is loaded, the atom has an updated value in this REPL. Executing lein run will do pretty much the same - load the namespace and then execute the -main function in the same process.
When lein uberjar is executed, the same thing happens, and this is now the problem. The compiler, in order to compile the clj file, will first load it and evaluate the top level (I learned it myself from this SO answer). So the module is loaded, the top level is evaluated, the macro is expanded, the reference value is changed, and then, after compilation completes, the compiler process, the one where the reference value just changed, ends. Now, when the uberjar is executed with java -jar, this spawns a new process with the compiled code, where the macro is already expanded (so (op) has already been "replaced" with the code the op macro generated, which is none in this case). Therefore, the atom value is unchanged.
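The expansion-time side effect can be observed directly at a REPL. The sketch below reuses the fs atom and op macro from the question and only expands the call with macroexpand-1, without evaluating the result; the name expansion is introduced here for illustration.

```clojure
(def fs (atom {}))

;; The macro from the question: no syntax-quote, so swap! runs
;; while the macro call is being expanded.
(defmacro op []
  (swap! fs assoc :macro-f "somevalue"))

;; Expand the call without evaluating the expansion.
(def expansion (macroexpand-1 '(op)))

(println "expansion:" expansion)
(println "fs after expansion only:" @fs)
```

The atom is already populated after expansion alone, and the expansion itself is just the map that swap! returned. That literal is what ends up in the compiled class, so loading the class in a fresh JVM never calls swap!.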
In my opinion, a good fix would be to not rely on side effects in a macro.
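A minimal sketch of a side-effect-free version, assuming the same fs atom and op macro as in the question: with syntax-quote the macro only returns code, and the swap! happens when that code is evaluated at load time. The names expansion and fs-after-expansion are introduced here only to show the intermediate state.

```clojure
(def fs (atom {}))

;; With syntax-quote the macro only builds a form; nothing runs during expansion.
(defmacro op []
  `(swap! fs assoc :macro-f "somevalue"))

;; Expanding the call leaves the atom untouched.
(def expansion (macroexpand-1 '(op)))
(def fs-after-expansion @fs)

;; The side effect happens when the expanded form is evaluated,
;; which is every time the namespace is loaded.
(op)

(println "after expansion:" fs-after-expansion)
(println "after evaluation:" @fs)
```

Because the swap! call is now part of the compiled code, it should also run when the AOT-compiled namespace is loaded from an uberjar.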
If you stick with the macro anyway, the way to make this idea work is to skip AOT for the module where the macro expansion happens and load it lazily from the main module (again, the same solution as in the other SO answer I mentioned). For example:
project.clj:
; ...
:profiles {:uberjar {:aot [mw.main]}}) ; note, no `mw.core` here
; ...
main.clj:
(ns mw.main
  (:gen-class))
(defn get-fs []
  (require 'mw.core)
  @(resolve 'mw.core/fs))
(defn -main [& args]
  (println @(get-fs)))
core.clj:
(ns mw.core
  (:gen-class))
(def fs (atom {}))
(defmacro op []
  (swap! fs assoc :macro-f "somevalue"))
(op)
I'm not sure myself, however, if this solution is stable enough and that there are no edge cases. It does work though on this simple example.
I'm trying to understand Leiningen's behaviour when creating an uberjar. The following is a minimal example which reproduces the behaviour:
(ns my-stuff.core
  (:gen-class))
(def some-var (throw (Exception. "boom!")))
(defn -main [& args]
  (println some-var))
When this is executed with lein run it clearly fails with an Exception. However, I don't understand why executing lein uberjar also fails with an Exception from the variable definition. Why does executing lein uberjar attempt to evaluate the variable's value? Is this specific to the uberjar task, or am I missing something more substantial about Clojure or Leiningen?
In order to compile your namespace for the uberjar (if you have AOT turned on), the Clojure compiler must load your namespace. This will always invoke all top-level side effects.
The best way to handle this is to never have side effects in top level code (whether inside or outside a def form), and have initialization functions to realize any start-up side effects needed.
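One possible shape for such an initialization function is sketched below. The names state and init! and the "config.edn" default are hypothetical, chosen only for this example; the point is that the top level creates nothing but an empty atom, and everything with an external effect runs from -main.

```clojure
;; Creating an empty atom touches nothing outside the process,
;; so it is safe at top level.
(defonce state (atom nil))

(defn init!
  "Hypothetical start-up hook: performs the side effects that would
  otherwise sit in a top-level def."
  [config-path]
  (reset! state {:config-path config-path
                 :started-at  (System/currentTimeMillis)}))

(defn -main [& args]
  ;; "config.edn" is an assumed default, not something from the question.
  (init! (or (first args) "config.edn"))
  (println "Started with" (:config-path @state)))
```

Loading or AOT-compiling this code leaves state at nil; it is only populated once init! is called.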
A workaround can be to make a small namespace that uses introspection to load the rest of your code at runtime but not while compiling - using a function like this:
(defn -main
  []
  (require 'my.primary.ns)
  ((resolve 'my.primary.ns/start)))
If that namespace is compiled, the JVM can find -main and run it, despite none of your other code being compiled. The runtime require will cause the Clojure compiler to load the rest of your code at runtime only, and resolve is needed so that -main will compile cleanly - it returns the var referenced, which then invokes your function when called.
I'm reading in the API docs that the *file* variable should have a value of "the path of the file being evaluated, as a String". However, this feature seems broken in certain cases.
When I execute a file using lein exec, things work as expected:
$ cat test.clj
(println *file*)
$ lein exec test.clj
/path/to/test.clj
Yet when I run a test that contains a call to (println *file*), NO_SOURCE_PATH is printed instead of the file containing that line.
Why am I seeing this behavior, and how can I reliably access the path and filename of the file being evaluated?
*file* is set to the path of the file being compiled, so after your whole program is compiled it is no longer useful to look at the value of *file* (assuming no use of eval).
In your test.clj example, the println is executed while the file is still being compiled. If the reference to *file* is moved into a test or function, it will only be dereferenced at runtime after the value of *file* is no longer useful.
One option is to write a macro that stores the value of *file* when it is expanded, so that the result can be used later. For example, a file example.clj could have:
(defmacro source-file []
  *file*)
(defn foo [x]
  (println "Foo was defined in" (source-file) "and called with" x))
Then from the REPL or anywhere, (foo 42) would print:
Foo was defined in /home/chouser/example.clj and called with 42
Note that it doesn't matter which file source-file is defined in, only where it was expanded, that is the file where foo is defined. This works because it's when foo is compiled that source-file is run, and the return value of source-file which is just a string is then included in the compiled version of foo. The string is then of course available every time foo executes.
If this behaviour is surprising, it may help to consider what would have to happen in order for *file* to have a useful value inside every function at runtime. Its value would have to change for every function call and return, a substantial runtime overhead for a rarely-used feature.
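A related option, offered here as an alternative sketch rather than part of the answer above, is to read the location that def records in a var's metadata at compile time. The helper name source-of is made up for this example. The :file value is whatever *file* was when the var was defined, so it may be a classpath-relative path, and at a REPL it is a placeholder rather than a real file.

```clojure
(defn foo [x] x)

;; def stores the source location in the var's metadata when the form is compiled.
(defn source-of
  "Returns the :file and :line metadata of var v."
  [v]
  (select-keys (meta v) [:file :line]))

(println (source-of #'foo))
```

Like the macro approach, this captures the value at compile time, so it stays available after *file* itself has stopped being useful.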
I'm new to Clojure, but I read that when using AOT compilation a class is generated for each function. Wouldn't that mean a whole lot of classes that consume perm-gen space? Aren't there any issues with that? What about when AOT compilation is not used, but bytecode is generated on the fly?
Well, I think it doesn't matter if the class is loaded from disk or from memory, wrt PermGen space.
However, notice that the problem may not be as bad as you think: each function is compiled once. In particular, anonymous functions that look as if they are generated "on the fly" are compiled only once, and each evaluation of the enclosing code just creates a new instance of the already-compiled class (an instance is needed to store the lexical context).
So the following code leads to the creation of two classes (one for create-fn, one for lambda-fn), whatever the number of calls to create-fn will be at runtime:
(defn create-fn [n] (fn lambda-fn [x] (+ n x)))
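This can be checked at a REPL. The sketch below uses clojure.core's + for the addition so that it runs on its own, and the names add-1 and add-2 are introduced only for illustration.

```clojure
(defn create-fn [n] (fn lambda-fn [x] (+ n x)))

;; Two closures created by separate calls to create-fn.
(def add-1 (create-fn 1))
(def add-2 (create-fn 2))

;; Both closures are instances of the one class compiled for lambda-fn...
(println "same class?" (= (class add-1) (class add-2)))
;; ...but they are distinct objects, each holding its own value of n.
(println "same instance?" (identical? add-1 add-2))
```

So the number of generated classes depends on how many fn forms appear in the source, not on how many closures are created at runtime.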