lein uberjar results in "Method code too large!" - clojure

I have a clojure project that runs fine using lein run, but lein uberjar results in
java.lang.RuntimeException: Method code too large!, compiling:(some_file.clj:1:1)
for the same project. I could trace it down to some_file.clj using a large vector of more than 2000 entries that is defined something like this:
(def x
[[1 :qwre :asdf]
[2 :zxcv :fafa]
...
])
When removing enough entries from that vector, lein uberjar compiles without issues. How can I make uberjar do its job, leaving all entries in the vector?
N.B. When changing x to a constant a la (def ^:const x, lein run will also throw a Method too large! error. And btw, that error occurs in the place where x is used, so just defining the constant is fine, if it is not being used.

There is a 64kb limit on the size of the method in Java. It seems that in your case the method that creates the big vector literal exceeds this limit.
Actually, you can check it by yourself, using a fancy library called clj-java-decompiler. Here's a short example using boot:
(set-env!
:dependencies '[[com.clojure-goes-fast/clj-java-decompiler "0.1.0"]])
(require '[clj-java-decompiler.core :as d])
(d/decompile-form
{}
[[1 :qwre :asdf]
[2 :zxcv :fafa]
[3 :zxcv :fafa]])
;; ==> prints:
;;
;; // Decompiling class: cjd__init
;; import clojure.lang.*;
;;
;; public class cjd__init
;; {
;; public static final AFn const__10;
;;
;; public static void load() {
;; const__10;
;; }
;;
;; public static void __init0() {
;; const__10 = (AFn)Tuple.create((Object)Tuple.create((Object)1L, (Object)RT.keyword((String)null, "qwre"), (Object)RT.keyword((String)null, "asdf")), (Object)Tuple.create((Object)2L, (Object)RT.keyword((String)null, "zxcv"), (Object)RT.keyword((String)null, "fafa")), (Object)Tuple.create((Object)3L, (Object)RT.keyword((String)null, "zxcv"), (Object)RT.keyword((String)null, "fafa")));
;; }
;;
;; static {
;; __init0();
;; Compiler.pushNSandLoader(RT.classForName("cjd__init").getClassLoader());
;; try {
;; load();
;; Var.popThreadBindings();
;; }
;; finally {
;; Var.popThreadBindings();
;; }
;; }
;; }
;;
As you can see, there's a method __init0 that creates the actual Java object for the vector literal. Given enough elements in the vector, size of the method can easily exceed the 64kb limit.
I think, in your case, the easiest solution to this problem will be to put your vector to the file and then slurp+read this file. Something, like this:
file vector.edn:
[[1 :qwre :asdf]
[2 :zxcv :fafa]
...
]
and then in your code:
(def x (clojure.edn/read-string (slurp "vector.edn"))

It turns out ahead-of-time-compilation was the difference between lein run and lein uberjar, as my project.clj contains the (default) line
:profiles {:uberjar {:aot :all}})
when I remove the :aot :all, uberjar finishes without complaining - but also without aot-compiling. So I guess I need to turn off aot, or figure out to limit it to exclude that vector.
But I'd like to be able to have this - albeit huge - vector "literal", and still compile everything. But perhaps factoring that huge vector out into a data file, which is read in at startup is not too bad an idea either. Then I'd be able to leave the :aot :all setting as is.

would it work in your case top wrap the value in a call to delay to cause it to be computed when it's first used?
there are limits on how big a method can be in a java classfile, and each top level form in Clojure (usually) produces a class.
To fix this problem it's useful to arrange to have the constants generated and stored either:
in memory by calculating them at runtime
in the resources directory where they can be read at runtime
don't generate the class file by not compiling it in advance, this generates the data at program start.
if you use a future or make a memoized function to get the data you will compute it at program start and store it in memory rather than the class file.
if you put it in the resources directory it will not be subject to the size limit, and can still be computed at compile time.
if you disable AOT compilation, then the class limit will never be hit becuase it will be computed at load time when the program starts.

Related

How to set a dynamic var before aot compile

I want to have a *flag* variable in a given namespace that is set to true only when :aot compiling.
Is there a way to do that?
Your issue is kind of complicated because the definition of Clojure's own dynamic nature, so there's no rough equivalent of C's #ifdef or some other mechanism that happens at compile time, but here's a workaround:
I created a Leiningen project with lein new app flagdemo. This trick detects when AOT is performed, as #Biped Phill mentioned above, using the dynamic var *compile-files*, and saves a resource in the classpath of the compiled code:
The code looks like this:
(ns flagdemo.core
(:gen-class))
(def flag-path "target/uberjar/classes/flag.edn")
(defn write-flag [val]
(try (spit flag-path (str val)) (catch Exception _)))
(defn read-flag []
(some-> (clojure.java.io/resource "flag.edn") slurp clojure.edn/read-string))
(write-flag false)
(when clojure.core/*compile-files*
(write-flag true))
(defn -main
[& args]
(println "Flag is" (read-flag)))
So, when the file loads using, say, lein run or when you load it in the REPL, it will try to write an EDN file with the value false.
When you compile the package using lein uberjar, it loads the namespace and finds that *compile-files* is defined, thus it saves an EDN file that is packaged with the JAR as a resource.
The function read-flag just tries to load the EDN file from the classpath.
It works like this:
$ lein clean
$ lein run
Flag is nil
$ lein uberjar
Compiling flagdemo.core
Created /tmp/flagdemo/target/uberjar/flagdemo-0.1.0-SNAPSHOT.jar
Created /tmp/flagdemo/target/uberjar/flagdemo-0.1.0-SNAPSHOT-standalone.jar
$ java -jar target/uberjar/flagdemo-0.1.0-SNAPSHOT-standalone.jar
Flag is true
Credit to #Biped Phill and #Denis Fuenzalida for explaining it to me.
I think this works as well:
(def ^:dynamic *static* false)
(when (and *compile-files* boot/*static-flag*)
(alter-var-root #'*static* true))
then in profiles.clj:
:injections [(require 'boot)
(intern 'boot '*static-flag* true)]
First you need to indicate that a variable will change over time:
(def ^:dynamic *the-answer* nil)
*the-answer*
;=> nil
The ^:dynamic bit will allow that variable to be rebound later:
(binding [*the-answer* 42] *the-answer*)
;=> 42
ADDENDUM 1
Here's what would happen if you didn't use ^:dynamic:
(def *the-answer* nil)
(binding [*the-answer* 42] *the-answer*)
;=> [...] Can't dynamically bind non-dynamic var [...]
ADDENDUM 2
These kind of variables are commonly referred to as "earmuffs". The naming convention is unenforced but strongly encouraged.
Here's what happen in the REPL when you use this naming without declaring a dynamic variable:
(def *the-answer* nil)
; Warning: *the-answer* not declared dynamic and thus is not dynamically rebindable, but its name suggests otherwise. Please either indicate ^:dynamic *the-answer* or change the name. (/tmp/form-init7760459636905875407.clj:1)

clojure.tools.namespace.repl/refresh not working

I have this clojure file:
(ns foo.core)
(def bar 1)
And this project.clj:
(defproject foo "version"
:dependencies [[org.clojure/clojure "1.6.0"]]
:main foo.core)
I open a terminal and run lein repl. Then I change the value of bar OUTSIDE the repl.
(def bar 1)
to
(def bar 2)
I change this value on the editor and don't forget to save the file.
Then I run the command in the repl (load-string "(clojure.tools.namespace.repl/refresh)")
I type bar in the repl and still get 1 instead of 2. However if you just run (clojure.tools.namespace.repl/refresh) and then query the value of bar you get 2. Why is this so? Why is the function load-string breaking it?
Apparently, specifying a :main in your project.clj forces the project to use AOT which breaks this.
Source: http://dev.clojure.org/jira/browse/TNS-27
It's not how your invoking refresh, it's when you invoke it.
refresh uses timestamps on files to see when it needs to reload them. This way it only reloads things it needs to. Because you didn't change the file after redefining a var through the repl, it skipped loading that file.
if I start with a file like this:
(ns bla.core)
(def bar 3)
and then call refresh the first time:
bla.core> (load-string"(clojure.tools.namespace.repl/refresh)")
:reloading (bla.core bla.core-test)
:ok
then redefine bar in the repl:
bla.core> (def bar :changed-from-repl)
#'bla.core/bar
and refresh again:
bla.core> (load-string "(clojure.tools.namespace.repl/refresh)")
:reloading ()
:ok
we can see that it didn't reload any namespaces.

Can reader tags be used with ClojureScript

In Clojure, adding custom reader tags is really simple
;; data_readers.clj (on classpath, eg. src/clj/)
{rd/qux datareaders.reader/my-reader}
;; Define a namespace containing the my-reader var:
(ns datareaders.reader)
(defn my-reader [x] 'y)
;; use special tag in other namespace. Readers have to be required first.
(require 'datareaders.reader)
(defn foo [x y]
(println #rd/qux x "prints y, not x due to reader tag."))
I am trying to achieve the same thing for ClojureScript but am getting an error that #rd/qux is not defined. I am using lein cljsbuild once to build the project. Is that a limitation of ClojureScript or is it that cljsbuild builds the project before the readers have been resolved? In that case, how can I force leiningen to load the readers namespace before cljsbuild is started?
EDIT: Note that this example intends to use reader tags within ClojureScript source code and not when reading auxilliary data via read-string.
This currently isn't possible, but will be as soon as #CLJS-1194 and #CLJS-1277 are fixed. Hopefully that will happen very soon.
If you wanted to do it, just rename data_readers.clj to data_readers.cljc and use conditional readers.
As an aside, what's your use case for this?
Both #CLJS-1194 and #CLJS-1277 are fixed, so this should work as expected.
The mechanism to add a custom reader tag in cljs is different. You have to call register-tag-parser! which takes a tag and a fn.
From the cljs reader tests:
(testing "Testing tag parsers"
(reader/register-tag-parser! 'foo identity)
(is (= [1 2] (reader/read-string "#foo [1 2]")))
Your example would be:
(cljs.reader/register-tag-parser! 'rd/qux (fn [x] 'y))
(defn foo [x y]
(println #rd/qux x "prints y, not x due to reader tag."))

clojure behavior of def

I have a lein project (using cascalog--but that's not particularly important). I'm trying to externalize properties like paths to files, so I end up with code that looks something like this:
(defn output-tap [path] (hfs-textline (str (get-prop :output-path-prefix) path) :sinkmode :replace))
(def some-cascalog-query
(<- [?f1 ?f2 ?f3]
((output-tap (get-prop :output-path)) ?line)
(tab-split ?line :> ?f1 ?f2 ?f3)))
In the example above, assume the function get-prop exists; it's just using standard java to read a property value (based off this example: loading configuration file in clojure as data structure).
Now I have a main method that loads the property values, e.g. something like:
(defn -main [& args] (do (load-props (first args)) (do-cascalog-stuff)))
But when I lein uberjar I get a compile time error saying:
Caused by: java.lang.IllegalArgumentException: Can not create a Path from an empty string
at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82)
at org.apache.hadoop.fs.Path.<init>(Path.java:90)
at cascading.tap.hadoop.Hfs.getPath(Hfs.java:343)
Are defs always compile time evaluated (rather than runtime evaluated)? Or am I misunderstanding this error?
So, you want the property lookup to occur at run-time? Then yes, you'll need to define some-cascalog-query as a function or macro. A bare def causes the expression to be evaluated when the code is loaded, not when the var is dereferenced.
This can be illustrated pretty simply in the REPL:
user=> (def foo (do (println "Hello, world!") 1))
Hello, world!
#'user/foo
user=> foo
1
From the documentation (emphasis mine):
(def symbol init?)
Creates and interns or locates a global var with the name of symbol and a namespace of the value of the current namespace (ns). If init is supplied, it is evaluated, and the root binding of the var is set to the resulting value.
that error looks like (get-prop :output-path) (get-prop :output-path-prefix) are is returning nothing which is getting wrapped into an empty string by str. perhaps the property is not being found?
does get-prop work as expected?
your understanding of defs is correct, they are are compile time, not (usually) runtime.

Setting Clojure "constants" at runtime

I have a Clojure program that I build as a JAR file using Maven. Embedded in the JAR Manifest is a build-version number, including the build timestamp.
I can easily read this at runtime from the JAR Manifest using the following code:
(defn set-version
"Set the version variable to the build number."
[]
(def version
(-> (str "jar:" (-> my.ns.name (.getProtectionDomain)
(.getCodeSource)
(.getLocation))
"!/META-INF/MANIFEST.MF")
(URL.)
(.openStream)
(Manifest.)
(.. getMainAttributes)
(.getValue "Build-number"))))
but I've been told that it is bad karma to use def inside defn.
What is the Clojure-idiomatic way to set a constant at runtime? I obviously do not have the build-version information to embed in my code as a def, but I would like it set once (and for all) from the main function when the program starts. It should then be available as a def to the rest of the running code.
UPDATE: BTW, Clojure has to be one of the coolest languages I have come across in quite a while. Kudos to Rich Hickey!
I still think the cleanest way is to use alter-var-root in the main method of your application.
(declare version)
(defn -main
[& args]
(alter-var-root #'version (constantly (-> ...)))
(do-stuff))
It declares the Var at compile time, sets its root value at runtime once, doesn't require deref and is not bound to the main thread. You didn't respond to this suggestion in your previous question. Did you try this approach?
You could use dynamic binding.
(declare *version*)
(defn start-my-program []
(binding [*version* (read-version-from-file)]
(main))
Now main and every function it calls will see the value of *version*.
While kotarak's solution works very well, here is an alternative approach: turn your code into a memoized function that returns the version. Like so:
(def get-version
(memoize
(fn []
(-> (str "jar:" (-> my.ns.name (.getProtectionDomain)
(.getCodeSource)
(.getLocation))
"!/META-INF/MANIFEST.MF")
(URL.)
(.openStream)
(Manifest.)
(.. getMainAttributes)
(.getValue "Build-number")))))
I hope i dont miss something this time.
If version is a constant, it's going to be defined one time and is not going to be changed you can simple remove the defn and keep the (def version ... ) alone. I suppose you dont want this for some reason.
If you want to change global variables in a fn i think the more idiomatic way is to use some of concurrency constructions to store the data and access and change it in a secure way
For example:
(def *version* (atom ""))
(defn set-version! [] (swap! *version* ...))