Why does `lein uberjar` evaluate variables defined with `def`?

I'm trying to understand Leiningen's behaviour when creating an uberjar. The following minimal example reproduces the behaviour:
(ns my-stuff.core
  (:gen-class))
(def some-var (throw (Exception. "boom!")))
(defn -main [& args]
  (println some-var))
When this is executed with lein run, it clearly fails with an Exception. However, I don't understand why lein uberjar also fails with the Exception from the variable definition. Why does lein uberjar attempt to evaluate the variable's value? Is this specific to the uberjar task, or am I missing something more substantial about Clojure or Leiningen?

In order to compile your namespace for the uberjar (if you have AOT turned on), the Clojure compiler must load your namespace. Loading always runs every top-level form, including any side effects.
The best way to handle this is to never have side effects in top level code (whether inside or outside a def form), and have initialization functions to realize any start-up side effects needed.
A workaround can be to make a small namespace that uses introspection to load the rest of your code at runtime but not while compiling - using a function like this:
(defn -main
  []
  (require 'my.primary.ns)
  ((resolve 'my.primary.ns/start)))
If that namespace is compiled, the JVM can find -main and run it, despite none of your other code being compiled. The runtime require causes the Clojure compiler to load the rest of your code at runtime only. The resolve is needed so that -main compiles cleanly: it returns the var referenced, which invokes your function when called.
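As a minimal sketch of the advice above about keeping side effects out of top-level code, the failing def from the question could be wrapped in a delay, so that its body runs on first deref rather than when the namespace is loaded or AOT-compiled. Using delay here is an assumption for illustration, not something the answer prescribes; an explicit initialization function would work as well.

```clojure
(ns my-stuff.core
  (:gen-class))

;; `delay` defers evaluation of its body until the first deref,
;; so loading or AOT-compiling this namespace does not throw.
(def some-var
  (delay (throw (Exception. "boom!"))))

(defn -main [& args]
  ;; The exception is raised here, at run time, on first deref.
  (println @some-var))
```

With this change the compile step should no longer hit the exception; it surfaces only when -main actually dereferences some-var.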

Related

How do I prevent a ClassNotFoundException when calling Java interop from Clojure in a JNI-created thread?

I have a project that's using Clojure + JNI in order to access some operating-system specific APIs. The native code in this process runs some of its work on background threads, and when doing so occasionally calls a Clojure function. I quickly learned the importance of calling javaVm->AttachCurrentThread on these background threads in order to be able to call the Clojure function in the first place, but now I'm hitting another roadblock.
It looks like any Clojure function that uses the clojure.lang.RT library fails to run in this setup, with an error that says Exception in thread "Thread-3" java.lang.ClassNotFoundException: clojure/lang/RT. Good examples are the array functions (e.g. alength and friends), which are used to pass data back and forth between the native side and the Clojure side.
Edit: I've put together a condensed reproducible example in this repository: clojure-jni-threading-example. It appears that the error is related to type inference; the following Clojure excerpt leads to the error:
(defn alength-wrapper [x]
  (alength x))
(JNIClass/callFromJNI
  (reify JNIFunctionClass
    (callMe [this]
      (let [c (byte-array 6)]
        (println "c" (alength-wrapper c))))))
The error can be prevented by adding a type hint:
(defn alength-wrapper [x]
  (alength ^bytes x))
I also tried to reproduce this error by creating a separate thread in Clojure when calling these functions (e.g. using future), but of course that works. This suggests there may be some initialization step that I'm missing when using JNI.
How can I prevent this error from happening so that these Clojure functions can be called from JNI background threads, without needing type hints?

Why Clojure evaluates forms during AOT compilation?

I've set up a small project:
project.clj:
(defproject testing-compilation "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.8.0"]]
  ;; this is important!
  :aot :all)
src/core.clj
(ns testing-compilation.core)
(def x (do
         (println "Print during compilation?")
         1))
Then when I run lein compile in the project directory, I see the output of the println:
$ lein compile
Compiling testing-compilation.core
Print during compilation?
My question is: why does Clojure evaluate top-level forms during AOT compilation? Shouldn't they be evaluated at program startup?
For reference, Common Lisp doesn't evaluate forms by default and provides the ability to tune this behaviour. Is there anything similar in Clojure? If not, does the Clojure documentation explicitly state this behaviour?
UPD: Forms are evaluated at startup as well.
After specifying a main namespace and writing a main function that prints Hello, world!, I did this:
$ lein uberjar
Compiling testing-compilation.core
Print during compilation?
Created testing-compilation-0.1.0-SNAPSHOT.jar
Created testing-compilation-0.1.0-SNAPSHOT-standalone.jar
$ java -jar target/testing-compilation-0.1.0-SNAPSHOT-standalone.jar
Print during compilation?
Hello world!
The first part of the AOT process is to find the file containing the main namespace and load it by evaluating every expression from top to bottom.
Some of these expressions will be require expressions that will have the effect of loading other namespaces, which recursively load yet more namespaces.
Others will be defn expressions that have the effect of firing up the compiler and producing class files. One class file is produced for each function.
Other expressions may do some calculation and then do things that produce class files, so it's important to give them a chance to run. Here's a made-up example:
user> (let [precomputed-value (reduce + (range 5))]
        (defn funfunfun [x]
          (+ x precomputed-value)))
#'user/funfunfun
user> (funfunfun 4)
14
It is possible to design a lisp that would not evaluate top level forms at start or, as you mention, make it optional. In the case of Clojure it was decided to keep a single evaluation strategy across both AOT and "non AOT" loading so programs always run the same regardless of how they are compiled. These are personal design choices made by others so I can't speak to their motivations here.
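On the question of tuning this behaviour: Clojure has no direct counterpart to Common Lisp's eval-when, but clojure.core does expose the dynamic var *compile-files*, which is true while files are being AOT-compiled. The sketch below reuses the namespace from the question and assumes you only want to skip the side effect during compilation; the value of x is still computed in both cases.

```clojure
(ns testing-compilation.core)

(def x
  (do
    ;; *compile-files* is true only while the compiler is writing class files,
    ;; so this println is skipped during `lein compile` / `lein uberjar`
    ;; and still runs whenever the namespace is loaded at run time.
    (when-not *compile-files*
      (println "Print at load time only"))
    1))
```

This is a guard rather than a change in evaluation strategy: the top-level form is still evaluated during compilation, it just chooses not to perform the side effect.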

why some native clojure namespaces have to be required and some not

Simple as the headline suggests:
Why do I have to
(require 'clojure.edn)
in order to use e.g.:
(clojure.edn/read-string "9")
And why can I immediately invoke:
(clojure.string/join "-" (range 4))
Clojure programs start at the top of the "main" namespace (often project-name.core) and evaluate each form from top to bottom. This happens when the program starts, and before any "main" functions are invoked.
When a require expression is evaluated it jumps over to that namespace and does the same thing there. If requires are encountered there it recurses down branches of those namespaces, recursively loading each namespace as required.
So if you don't explicitly state that your namespace requires another namespace, then you are at the mercy of the order in which the other namespaces you require load their dependencies. Sometimes it will work, and sometimes unrelated changes to the evaluation order of your distant dependencies will break your code.
So please, please, ... declare your own requirements!
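A minimal sketch of declaring requirements explicitly, so the code does not depend on what happens to be loaded already. The namespace name example.core and the helper parse-and-join are hypothetical, chosen only for illustration.

```clojure
(ns example.core
  (:require [clojure.edn :as edn]
            [clojure.string :as str]))

(defn parse-and-join
  "Hypothetical helper: reads an EDN collection from `s` and joins its
  elements with dashes."
  [s]
  (str/join "-" (edn/read-string s)))

(parse-and-join "[0 1 2 3]")
;; => "0-1-2-3"
```

Both namespaces are named in the ns form even though clojure.string is often loaded already, so the file keeps working regardless of load order.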
In the REPL (just starting Clojure) I have the following namespaces loaded by default:
user=> (pprint (map #(.getName %) (all-ns)))
(clojure.edn
clojure.core.server
clojure.java.io
clojure.java.shell
clojure.core.protocols
clojure.string
clojure.java.browse
clojure.pprint
clojure.uuid
clojure.core
clojure.main
user
clojure.java.javadoc
clojure.repl
clojure.walk
clojure.instant)
in whichever namespace you are in, clojure.edn seems not to be loaded

Clojure macros weirdness when running jars

Below is a simple Clojure app example created with lein new mw:
(ns mw.core
  (:gen-class))
(def fs (atom {}))
(defmacro op []
  (swap! fs assoc :macro-f "somevalue"))
(op)
(defn -main [& args]
  (println @fs))
and in project.clj I have
:profiles {:uberjar {:aot [mw.core]}}
:main mw.core
When run in the REPL, evaluating @fs returns {:macro-f somevalue}. But running an uberjar yields {}. If I change the op definition to defn instead of defmacro, then fs again has the proper content when run from an uberjar. Why is that?
I vaguely realize that this has something to do with AOT compilation and the fact that macro-expansion occurs before the compilation phase, but clearly my understanding of these things is lacking.
I ran into this issue while trying to deploy an application that uses a very nice mixfix library, in which mixfix operators are defined using a global atom. It took me quite a long time to isolate the issue to the example presented above.
Any help will be greatly appreciated.
Thanks!
The real problem here is that your macro is incorrect. You forgot to add the backquote character:
(defmacro op []
  `(swap! fs assoc :macro-f "somevalue"))
; ^ syntax-quote ("backquote")
This operation is called syntax-quote, and it's very important here because macros in Clojure transform your code while it is being compiled.
Without it, you get an impure macro that modifies the fs atom whenever your code is compiled.
Since your macro doesn't produce any code, the (op) call in your example does nothing at all at run time (only its compilation does). It appears to work in the REPL because compilation and execution are handled by the same Clojure instance (see Timur's answer for details).
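The difference can be checked at a REPL with macroexpand-1. In this sketch, op-bad and op-good are hypothetical names for the unquoted and syntax-quoted versions of the macro, and they write to different keys so the two effects can be told apart.

```clojure
(def fs (atom {}))

;; No syntax-quote: the swap! runs while the macro is being expanded,
;; and its return value (a map) becomes the "expansion".
(defmacro op-bad []
  (swap! fs assoc :macro-f "somevalue"))

;; With syntax-quote: the macro returns a swap! form as code,
;; which only runs when the expanded code is evaluated.
(defmacro op-good []
  `(swap! fs assoc :macro-g "somevalue"))

(def bad-expansion  (macroexpand-1 '(op-bad)))   ; a map; fs was mutated during expansion
(def good-expansion (macroexpand-1 '(op-good)))  ; a list starting with clojure.core/swap!
```

After these forms run, fs contains :macro-f but not :macro-g, because the syntax-quoted form was only expanded and never evaluated.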
This is indeed related to AOT, and to the fact that side effects occur when top-level code is executed, here at macro expansion time. The difference between lein repl (or lein run) and the uberjar is in when exactly this happens.
When lein repl is executed, the REPL starts and then loads the mw.core namespace automatically if it is defined in project.clj, or one does it manually. When the namespace is loaded, first the atom is defined, then the macro is expanded, and this expansion changes the value of the atom. All this happens in the same runtime environment (the REPL process), and after the module is loaded, the atom has an updated value in this REPL. Executing lein run does pretty much the same: it loads the namespace and then executes the -main function in the same process.
When lein uberjar is executed, the same thing happens, and this is now the problem. In order to compile the clj file, the compiler first loads it and evaluates the top level (I learned this myself from this SO answer). So the module is loaded, the top level is evaluated, the macro is expanded, the atom's value is changed, and then, after compilation completes, the compiler process (the one where the value just changed) ends. When the uberjar is then executed with java -jar, this spawns a new process running the compiled code, where the macro is already expanded (so (op) has already been "replaced" with the code the op macro generated, which is none in this case). Therefore, the atom's value is unchanged.
In my opinion, a good fix would be to not rely on side effects in a macro.
If you stick with the macro anyway, the way to make this idea work is to skip AOT for the module where the macro expansion happens and load it lazily from the main module (again, the same solution as in the other SO answer I mentioned). For example:
project.clj:
; ...
:profiles {:uberjar {:aot [mw.main]}}) ; note, no `mw.core` here
; ...
main.clj:
(ns mw.main
  (:gen-class))
(defn get-fs []
  (require 'mw.core)
  @(resolve 'mw.core/fs))
(defn -main [& args]
  (println @(get-fs)))
core.clj:
(ns mw.core
  (:gen-class))
(def fs (atom {}))
(defmacro op []
  (swap! fs assoc :macro-f "somevalue"))
(op)
I'm not sure myself, however, if this solution is stable enough and that there are no edge cases. It does work though on this simple example.

Clojure: War Compilation Failure with Missing Data File Dependency

I am working on a webapp that relies on a certain data file being slurped at runtime. Without the data file present I don't seem to be able to compile. Why is this?
This is in my core.clj
(def my-data (slurp "my-file.txt"))
Then when I try to compile:
$ lein ring war
I get this exception
Exception in thread "main" java.io.FileNotFoundException: my-file.txt (No such file or directory), compiling:(core.clj:24:28)
How can I compile my war? I don't need the file to be slurped, or even checked for existence, at compile time. Thanks in advance!
[UPDATE]
This is not specific to war file packaging or ring, for example:
(ns slurp-test.core
  (:gen-class))
(def x (slurp "/tmp/foo.txt"))
(defn -main [& args]
  (println x))
Then:
$ lein uberjar
Compiling slurp-test.core
(ns slurp-test.core
Exception in thread "main" java.io.FileNotFoundException: /tmp/foo.txt (No such file or directory), compiling:(core.clj:4:8)
How can I fix this?
Compiling a Clojure source file involves evaluating all top-level forms. This is in fact strictly necessary to support the expected semantics -- most notably, macros couldn't work properly otherwise [1].
If you AOT compile your code, top-level forms will be evaluated at compile time, and then again at run time as your compiled code is loaded.
For this reason, it is generally not a good idea to cause side effects in code living at top level. If an app requires initialization, it should be performed by a function (typically -main).
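A minimal sketch of that advice applied to the slurp-test example from the question: the read moves into a function that -main calls, so nothing touches the file system when the namespace is loaded or compiled. The load-data name and the optional command-line path argument are assumptions added for illustration.

```clojure
(ns slurp-test.core
  (:gen-class))

(defn load-data
  "Reads the file at `path`. Only called from -main, so it never runs
  while the namespace is being loaded or AOT-compiled."
  [path]
  (slurp path))

(defn -main [& args]
  ;; "/tmp/foo.txt" is the path from the question; accepting an override
  ;; as the first argument is a hypothetical convenience.
  (let [path (or (first args) "/tmp/foo.txt")]
    (println (load-data path))))
```

With the read deferred like this, a missing file becomes a run-time error in -main rather than a compile-time failure.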
[1] A macro is a function living in a Var marked as a macro (with :macro true in the Var's metadata; there's a setMacro method on clojure.lang.Var which adds this entry). Macros must clearly be available to the compiler, so they must be loaded at compile time. Furthermore, in computing an expansion, a macro function may want to call non-macro functions or otherwise make use of the values of arbitrary Vars resulting from evaluating any top-level code occurring before the point where the macro is invoked. Removing these capabilities would cripple the macro facility rather badly.