Where does the REPL put all the dynamically compiled class files? - clojure

If I type (fn []) in the REPL, it gives me back an anonymous function that is an instance of a generated class. Where can I find the byte representation of that class?

It's in memory. Even in non-REPL code (i.e. in a .clj file) you probably will never see the .class file; it'll just compile to bytecode and run it when necessary.
This is an excerpt from Michal Marczyk's excellent comment about .clj files:
Normally no actual .class files are produced, though you can ask for
them if you want (see (doc compile) and (doc *compile-files*)).
There's no reason to worry about this most of the time. Note that this
mode of operation is not particular to Clojure at all; Python does the
same thing, compiling .py files to Python bytecode and then running
it.
According to this thread, even manually requested compilation (via gen-class) isn't possible from the REPL, because gen-class looks for a .clj file to turn into a .class file.
P.S. To dump an object to a file (which I know isn't exactly what you're after), you can check out this site, which uses clojure.core/prn to serialise a class and then a java.io.FileWriter to write it to a file.
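As a rough sketch of the "you can ask for them" route: binding *compile-files* to true around eval should make the compiler also write the generated classes under *compile-path*. The scratch directory and the names out-dir, f and class-files are assumptions made for this illustration, and the directory has to exist before anything is compiled.

```clojure
(require '[clojure.java.io :as io])

;; Hypothetical scratch directory; it must exist before compiling.
(def out-dir
  (doto (io/file (System/getProperty "java.io.tmpdir")
                 (str "repl-classes-" (System/nanoTime)))
    (.mkdirs)))

;; Normally the class behind (fn []) only lives in memory,
;; loaded by Clojure's DynamicClassLoader.
(def f (fn []))
(println (.getName (class f)))
(println (instance? clojure.lang.DynamicClassLoader
                    (.getClassLoader (class f))))

;; With *compile-files* bound to true, evaluating a form also writes
;; the generated classes under *compile-path*.
(binding [*compile-files* true
          *compile-path* (.getPath out-dir)]
  (eval '(fn [])))

;; Collect whatever .class files ended up in the scratch directory.
(def class-files
  (filter #(.endsWith (.getName %) ".class") (file-seq out-dir)))

(doseq [file class-files]
  (println (.getName file) (.length file) "bytes"))
```

The files written this way can then be inspected with javap or a decompiler. The class names will differ from run to run because they contain generated ids.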

You can use Instrumentation to register a transformer to be called whenever a class is loaded or (re)defined. The Transformer receives the class name and bytes in class file format.

Related

How does Clojure's REPL maintain state?

I read online that Clojure uses the ASM library to generate JVM Bytecode, I also saw that Clojure has a REPL.
I assume each line of code executed by the REPL is compiled into a Java class using ASM and then that class is loaded to execute the code. If this is the case then each line would cause a new class file to be generated, so I'm not sure how local variables declared on one line could be shared with the lines which follow in the REPL.
Does anyone know how Clojure's REPL works? I tried reading the Clojure source code but I don't know much Clojure.
It's not "each line" that is compiled at a time, but "each form".
In the REPL, you are always in some namespace. You can change the current namespace of a REPL by using in-ns. In each namespace, there is a binding between symbols (loosely, "names") and Vars (loosely, a container that holds an immutable value). The "state" of the namespace is in the bindings of that namespace.
For example, if you evaluate the form (def a 17) in the current namespace, that will create a new (if it does not already exist) binding for the name a that points to a Var that contains the value 17. Now, you could later evaluate the form (+ a 25) in the same namespace. That will get the value of a in the namespace and add that to 25 to return 42.
The above is for symbols that are local to the namespace. These symbols are available to all forms evaluated in that namespace. (They also can be accessed from other namespaces, but I'll leave that out for now).
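The def example above can be tried directly at a REPL. This is only a small sketch; it assumes nothing beyond clojure.core and uses the same name a as in the text.

```clojure
;; Evaluating a def form interns a Var in the current namespace.
(def a 17)

;; A later form finds the same Var through the namespace mapping.
(println (+ a 25))                       ; prints 42

;; The symbol-to-Var mapping can be inspected directly.
(println (var? (ns-resolve *ns* 'a)))    ; prints true
(println (deref (ns-resolve *ns* 'a)))   ; prints 17

;; Re-evaluating def reuses the existing Var and gives it a new value,
;; which is how state carries over between separately compiled forms.
(def a 100)
(println (+ a 25))                       ; prints 125
```

Each form is compiled on its own, but they all go through the same namespace, so the classes generated for later forms simply look the Var up again.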
You might take a look at https://clojure.org/reference/evaluation if you have not already. The article at https://clojure.org/reference/vars might also be helpful.

Compile a C++ function inside a C++ program

Consider the following problem,
A C++ program may emit source of a C++ function, for example, say it will create a string with contents as below:
std::vector<std::shared_ptr<C>> get_ptr_vec()
{
    std::vector<std::shared_ptr<C>> vec;
    vec.push_back(std::shared_ptr<C>(new C(val1)));
    vec.push_back(std::shared_ptr<C>(new C(val2)));
    vec.push_back(std::shared_ptr<C>(new C(val3)));
    vec.push_back(std::shared_ptr<C>(new C(val4)));
    return vec;
}
The values of val1 etc. will be determined at runtime, when the program creates the string of the source above. This source will be written to a file, say get_ptr_vec.cpp.
Then another C++ program will need to read this source file, and compile it, and call the get_ptr_vec function and get the object it returns. Kind of like a JIT compiler.
Is there any way I can do this? One workaround I can think of is a script that compiles the file and builds it into a shared library, so that the second program can get the function through dlopen. However, is there any way to skip this and have the second program compile the file itself (without a call to system)? Note that the second program will not be able to see this source file at compile time. In fact, there will likely be thousands of such small source files emitted by the first program.
To give a little background, the first program will build a tree of expressions and serialize it with a postorder traversal. Each node of the tree will have a string representation written to the file. The second program will read the list of these serialized tree nodes and needs to turn this list of strings into a list of C++ objects (from which I can later reconstruct the tree).
I think the LLVM framework may have something to offer here. Can someone give me some pointers on this? Not necessarily a full answer, just somewhere for me to start.
You can compile your generated code with clang and emit LLVM bitcode (-emit-llvm flag). Then, statically link your program with the parts of LLVM that read bitcode files and JIT them. Finally, load the compiled bitcode and run the JIT on it, so that the functions become available in your program's address space.

Clojure: War Compilation Failure with Missing Data File Dependency

I am working on a webapp that relies on a certain data file being slurped at runtime. Without the data file present I don't seem to be able to compile. Why is this?
This is in my core.clj
(def my-data (slurp "my-file.txt"))
Then when I try to compile:
$ lein ring war
I get this exception
Exception in thread "main" java.io.FileNotFoundException: my-file.txt (No such file or directory), compiling:(core.clj:24:28)
How can I compile my war? I don't need the file to be slurped, or even checked for existence, at compile time. Thanks in advance!
[UPDATE]
This is not specific to war file packaging or ring, for example:
(ns slurp-test.core
  (:gen-class))
(def x (slurp "/tmp/foo.txt"))
(defn -main [& args]
  (println x))
Then:
$ lein uberjar
Compiling slurp-test.core
(ns slurp-test.core
Exception in thread "main" java.io.FileNotFoundException: /tmp/foo.txt (No such file or directory), compiling:(core.clj:4:8)
How can I fix this?
Compiling a Clojure source file involves evaluating all top-level forms. This is in fact strictly necessary to support the expected semantics -- most notably, macros couldn't work properly otherwise [1].
If you AOT compile your code, top-level forms will be evaluated at compile time, and then again at run time as your compiled code is loaded.
For this reason, it is generally not a good idea to cause side effects in code living at top level. If an app requires initialization, it should be performed by a function (typically -main).
[1] A macro is a function living in a Var marked as a macro (with :macro true in the Var's metadata; there's a setMacro method on clojure.lang.Var which adds this entry). Macros must clearly be available to the compiler, so they must be loaded at compile time. Furthermore, in computing an expansion, a macro function may want to call non-macro functions or otherwise make use of the values of arbitrary Vars resulting from evaluating any top-level code occurring before the point where the macro is invoked. Removing these capabilities would cripple the macro facility rather badly.
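One possible way to keep the side effect out of the top level, sketched against the slurp-test.core example from the question: wrap the slurp in a delay and dereference it inside -main. Using delay is an assumption of this sketch rather than something the answer prescribes; reading the file in a plain function called from -main would work just as well.

```clojure
(ns slurp-test.core
  (:gen-class))

;; delay wraps the side effect: nothing is read while this form is
;; evaluated, so compilation no longer needs the file to exist.
(def x (delay (slurp "/tmp/foo.txt")))

(defn -main [& args]
  ;; The file is read on the first deref, i.e. when the program runs.
  (println @x))
```

With this shape, evaluating the top-level forms at compile time only creates the delay object, and the FileNotFoundException can only occur when -main actually runs.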

Compile lua code, store bytecode then load and execute it

I'm trying to compile a lua script that calls some exported functions, save the resulting bytecode to a file and then load this bytecode and execute it, but I haven't found any example on how to do this. Is there any example available on how to do this? How can I do this?
Edit: I'm using Lua + Luabind (C++)
This is all very simple.
First, you load the Lua script without executing it. It does not matter if you have connected the Lua state with your exported functions; all you're doing is compiling the script file.
You could use luaL_loadfile, which uses C-standard library functions to read a file from disk and load it into the lua_State. Alternatively, you can load the file yourself into a string and use luaL_loadstring to load it into the lua_State.
Both of these functions will emit return values and compiler errors as per the documentation for lua_load.
If the compilation was successful, the lua_State now has the compiled Lua chunk as a Lua function at the top of the stack. To get the compiled binary, you must use the lua_dump function. It's rather complicated as it uses a callback interface to pass you data. See the documentation for details.
After that process, you have the compiled Lua byte code. Shove that into a file of your choice. Just remember: write it as binary, not with text translation.
When it comes time to load the byte code, all you need to do is... exactly what you did before. Well, almost. Lua has heuristics to detect whether a "string" it is given is Lua source or byte code. So yes, you can load byte code with luaL_loadfile just like before.
The difference is that you can't use luaL_loadstring with byte code. That function expects a NUL-terminated string, which is bad: byte code can have embedded NUL characters in it, which would screw everything up. So if you want to do the file IO yourself (because you're using a special filesystem or something), you have to use either lua_load directly, which uses a callback interface like lua_dump, or luaL_loadbuffer, which takes a pointer and an explicit length. So read up on how to use them.

Doesn't Clojure consume too much perm-gen space?

I'm new to Clojure, but I read that when using AOT compilation a class is generated for each function. Wouldn't that mean a whole lot of classes that consume perm-gen space? Aren't there any issues with that? What about when AOT compilation is not used, but bytecode is generated on the fly?
Well, I think it doesn't matter if the class is loaded from disk or from memory, wrt PermGen space.
However, notice that the problem may not be as bad as you think: each function is compiled once. In particular, anonymous functions, which may look as if they are generated "on the fly" wherever they appear, are only compiled once, and each evaluation of the fn form at runtime just creates a new instance of the same class (an instance is needed to store the lexical context).
So the following code leads to the creation of two classes (one for create-fn, one for lambda-fn), whatever the number of calls to create-fn will be at runtime:
(defn create-fn [n] (fn lambda-fn [x] (add n x)))
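A quick way to check this at a REPL is to compare the classes of two closures returned by create-fn. This is a sketch only: + is substituted for add (which is not defined in clojure.core) so that the snippet runs on its own, and add-one and add-ten are names made up for the illustration.

```clojure
;; + is used in place of add so the snippet is self-contained.
(defn create-fn [n] (fn lambda-fn [x] (+ n x)))

(def add-one (create-fn 1))
(def add-ten (create-fn 10))

;; Two distinct closure instances...
(println (identical? add-one add-ten))                  ; prints false
;; ...backed by one and the same generated class.
(println (identical? (class add-one) (class add-ten)))  ; prints true
(println (.getName (class add-one)))

;; Each instance carries its own captured value of n.
(println (add-one 5) (add-ten 5))                       ; prints 6 15
```

However many times create-fn is called, only new instances are allocated on the heap; no additional classes are defined.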