Doesn't Clojure consume too much perm-gen space? - clojure

I'm new to Cojure, but I read that when using AOT compilation a class is generated for each function. Wouldn't that mean a whole lot of classes that consume perm-gen space? Aren't there any issues with that? What about when AOT compilation is not used, but bytecode is generated on the fly?

Well, I think it doesn't matter if the class is loaded from disk or from memory, wrt PermGen space.
However, notice that the problem may not be as bad as you think: each function is compiled once. Especially, anonymous functions which you can see here or there, generated "on the fly" are only compiled once, and each invocation of them just leads to the creation of new instances of those classes (an instance is needed to store the lexical context).
So the following code leads to the creation of two classes (one for create-fn, one for lambda-fn), whatever the number of calls to create-fn will be at runtime:
(defn create-fn [n] (fn lambda-fn [x] (add n x)))

Related

How do I prevent a ClassNotFoundException when calling Java interop from Clojure in a JNI-created thread?

I have a project that's using Clojure + JNI in order to access some operating-system specific APIs. The native code in this process runs some of its work on background threads, and when doing so occasionally calls a Clojure function. I quickly learned the importance of calling javaVm->AttachCurrentThread on these background threads in order to be able to call the Clojure function in the first place, but now I'm hitting another roadblock.
It looks like any Clojure function that uses the clojure.lang.RT library fails to run in this setup, with an error that says Exception in thread "Thread-3" java.lang.ClassNotFoundException: clojure/lang/RT. A great example are the array functions (e.g. alength and friends), which are used to pass data back and forth between the native side and Clojure side.
Edit: I've put together a condensed reproducible example in this repository: clojure-jni-threading-example. It appears that the error is related to type inference; the following Clojure excerpt leads to the error:
(defn alength-wrapper [x]
(alength x))
(JNIClass/callFromJNI
(reify JNIFunctionClass
(callMe [this]
(let [c (byte-array 6)
(println "c" (alength-wrapper c))))))
The error can be prevented by added a type hint:
(defn alength-wrapper [x]
(alength ^bytes x))
I also tried to reproduce this error by creating a separate thread in Clojure when calling these functions (e.g. using future), but of course that works. This suggests there may be some initialization step that I'm missing when using JNI.
How can I prevent this error from happening so that these Clojure functions can be called from JNI background threads, without needing type hints?

Why Clojure evaluates forms during AOT compilation?

I've set up small project:
project.clj:
(defproject testing-compilation "0.1.0-SNAPSHOT"
:dependencies [[org.clojure/clojure "1.8.0"]]
;; this is important!
:aot :all)
src/core.clj
(ns testing-compilation.core)
(def x (do
(println "Print during compilation?")
1))
Then when I do lein compile in project directory I'm seeing output from a print:
$ lein compile
Compiling testing-compilation.core
Print during compilation?
My question is: why clojure evaluates top level forms during AOT compilation? Shouldn't they be evaluated at program startup?
For reference, Common Lisp doesn't evaluate forms by default and provides ability to tune this behaviour. Anything similar in Clojure? If nothing, does Clojure documentation explicitly state such behaviour?
UPD: Forms are evaluated at startup as well.
After specifying a main namespace and writing main function that prints Hello, world!, I did this:
$ lein uberjar
Compiling testing-compilation.core
Print during compilation?
Created testing-compilation-0.1.0-SNAPSHOT.jar
Created testing-compilation-0.1.0-SNAPSHOT-standalone.jar
$ java -jar target/testing-compilation-0.1.0-SNAPSHOT-standalone.jar
Print during compilation?
Hello world!
The first part of the AOT process is to find the file containing the main namespace and load it by evaluating every expression from top to bottom.
Some of these expressions will be require expressions that will have the effect of loading other namespaces, which recursively load yet more namespaces.
Other will be defn expressions that will have the effect of firing up the compiler and producing class files. One class file is produced for each function.
Other expressions may do some calculation and then do things that produce class files, so it's important to give them a chance to run. Here's a made up example:
user> (let [precomputed-value (reduce + (range 5))]
(defn funfunfun [x]
(+ x precomputed-value)))
#'user/funfunfun
user> (funfunfun 4)
14
It is possible to design a lisp that would not evaluate top level forms at start or, as you mention, make it optional. In the case of Clojure it was decided to keep a single evaluation strategy across both AOT and "non AOT" loading so programs always run the same regardless of how they are compiled. These are personal design choices made by others so I can't speak to their motivations here.

Idiomatic Way to Set & Update Clojure Namespace Flags?

I have a ring/compojure based web API and I need to be able to optionally turn on and off caching (or any flag for that matter) depending on a startup flag or having a param passed in to the request.
I tried having the flag set as a dynamic var:
(def ^:dynamic *cache* true)
(defmacro cache [source record options & body]
`(let [cachekey# (gen-cachekey ~source ~record ~options)]
(if-let [cacheval# (if (and (:ttl ~source) ~*cache*) (mc/generic-get cachekey#) nil)]
cacheval#
(let [ret# (do ~#body)]
(if (and (:ttl ~source) ~*cache*) (mc/generic-set cachekey# ret# :ttl (:ttl ~source)))
ret#))))
...but that only allows me to update the flag within a binding block which isn't ideal to wrap every data fetching function, and it din't allow me to optionally set the flag on start up
I then tried to set the flag in an atom, which allowed me to set the flag on the start up, and easily update the flag if a certain param was passed to the request, but the update would be changing the flag for all threads and not just the specific request's flag.
What's the most idiomatic way to do something like this in Clojure?
Firstly, unquoting *cache* in your macro definition means that it's compile-time value will be included in the compiled output and rebinding it at runtime will have no effect. If you want the value to be looked up at runtime, you should not unquote *cache*.
As for the actual question: if you want the various data fetching functions to react to a cache setting, you'll need to communicate it to them somehow anyway. Additionally, there are the two separate concerns of (1) computing the relevant flag values, (2) making them available to the handler so that it can communicate them to the functions which care.
Computing flag values and making them available to the main handler
For decisions on a per-request basis, examining some incoming parameters and settings, you might want to use a piece of middleware which will determine the correct values of the various flags and assoc them onto the request map. This way handlers living downstream from this piece of middleware will be able to look them up in the request map without knowing how they were computed.
You can of course install multiple pieces of middleware, each responsible for computing a different set of flags.
If you do use middleware, you'll likely want it to handle the default values. In this case, the note about setting defaults at startup in the section on dynamic Vars below may not be relevant.
Finally, if the application-level (global, thread-independent) defaults might change at runtime (as a result of a "turn off all caching" request, perhaps), you can store these in Atoms.
Communicating flag values to functions which care
First approach: dynamic Vars
Once you do that, you'll have to communicate the flags to the functions which actually perform operations where the flags are relevant; here dynamic Vars and explicit arguments are the most natural options.
Using a dynamic Var means that you don't have to do it explicitly for every function call involving such functions; instead, you can do it once per request, say. Installing a default value at startup is quite possible too; for example, you could use alter-var-root for that. (Or you could simply define the initial value of the Var in terms of information obtained from the environment.)
NB. if you launch new threads within the scope of a binding block, they will not see the bindings installed by this binding block automatically -- you'll have to arrange for them to be transmitted. The bound-fn macro is useful for creating functions which handle this automatically; see (doc bound-fn) for details.
The idea of using a single map with all flags described below is relevant here too, if perhaps not equally necessary for reasonable convenience; in essence, you'd be using a single dynamic Var instead of many.
Second approach: explicit arguments and flag maps
The other natural option is simply to pass in any relevant flags to the functions which need them. If you pass all the flags in a map, you can just assemble all options relevant to a request in a single map and pass it in to all the flag-aware functions without caring which flags any given function needs (as each function will simply examine the map for the flags it cares about, disregarding the others).
With this approach, you'll likely want to split the data fetching functionality into a function to get the value from the cache, a function to get the value from the data store and a flag-aware function which calls one of the other two depending on the flag value. This way you can, for example, test them separately. (Although if the individual functions really are completely trivial, I'd say it's ok to create only the flag-taking version at first; just remember to factor out any pieces which become more complex in the course of development.)

Clojure: read-string on functions

Is there a way to use the reader with function values, e.g:
(read-string (pr-str +))
RuntimeException Unreadable form clojure.lang.Util.runtimeException
(Util.java:219)
?
As you might already know the output for (pr-str +) is not valid Clojure code that the reader can parse: "#<core$_PLUS_ clojure.core$_PLUS_#ff4805>". The output for function values when using the functions pr, prn, println and such, is intentionally wrapped around the #< reader macro that dispatches to the UnreadableReader, which throws the exception you are seeing.
For the example you provided you can use the print-dup function that works for basic serialization:
(defn string-fn [f]
(let [w (java.io.StringWriter.)]
(print-dup f w)
(str w)))
(let [plus (read-string (string-fn +))]
(plus 1 2))
The serialization done for the + function is actually generating the call to the class' constructor:
#=(clojure.core$_PLUS_. )
This only works of course if the class is already compiled in the Clojure environment where you are reading the string. If you serialized an anonymous function, saving it to a file and then reading it back in, when running a new REPL session, it will most likely not work since the class name for each anonymous function is different and depends on Clojure internals.
For arbitrary functions things get a lot more complicated. Sharing the source code might not even be enough, the function could rely on the usage of any number of other functions or vars that only exist in the source environment. If this is what you are thinking of doing, maybe considering other approaches to the problem you are trying to solve, will eliminate the need to serialize the value of arbitrary functions.
Hope it helps,
If you only need the name, can you just send the symbol with (name '+).
But generally speaking, it is a bad idea to use clojure read, if you want to read it back, as clojure's reader might execute some code in the process. Maybe have a look at the edn reader : clojure.edn/read-string
But maybe you just need to convert the string back to a symbol, in which case the (symbol name) function would be enough.

Clojure: War Compilation Failure with Missing Data File Dependency

I am working on an webapp that relies on a certain data file to be slurped at runtime. Without the datafile present I don't seem to be able to compile. Why is this?
This is in my core.clj
(def my-data (slurp "my-file.txt"))
Then when I try to compile:
$ lein ring war
I get this exception
Exception in thread "main" java.io.FileNotFoundException: my-file.txt (No such file or directory), compiling:(core.clj:24:28)
How can I compile my war? I don't need the file to be slurped or even check for existence at compile time. Thanks in advance!
[UPDATE]
This is not specific to war file packaging or ring, for example:
(ns slurp-test.core
(:gen-class))
(def x (slurp "/tmp/foo.txt"))
(defn -main [& args]
(println x))
Then:
$ lein uberjar
Compiling slurp-test.core
(ns slurp-test.core
Exception in thread "main" java.io.FileNotFoundException: /tmp/foo.txt (No such file or directory), compiling:(core.clj:4:8)
How can I fix this?
Compiling a Clojure source file involves evaluating all top-level forms. This is in fact strictly necessary to support the expected semantics -- most notably, macros couldn't work properly otherwise1.
If you AOT compile your code, top-level forms will be evaluated at compile time, and then again at run time as your compiled code is loaded.
For this reason, it is generally not a good idea to cause side effects in code living at top level. If an app requires initialization, it should be performed by a function (typically -main).
1 A macro is a function living in a Var marked as a macro (with :macro true in the Var's metadata; there's a setMacro method on clojure.lang.Var which adds this entry). Macros must clearly be available to the compiler, so they must be loaded at compile time. Furthermore, in computing an expansion, a macro function may want to call non-macro functions or otherwise make use of the values of arbitrary Vars resulting from evaluating any top-level code occurring before the point where the macro is invoked. Removing these capabilities would cripple the macro facility rather badly.