How is the persistent JVMs feature implemented in cake?

I'm trying to understand how cake implements its multiple-JVM approach. At a high level, I thought cake worked similarly to nailgun: a single JVM instance (one JVM process), where new "JVMs" for different projects were really just Clojure and the project's jars evaluated in a fresh classloader (along with different jar dependencies), which in my eyes is not a new JVM instance. From "What's the difference between Cake and Leiningen?", however, there is an implication that there are multiple JVMs (one for cake, and one per project), not just a single JVM instance.
If new JVM instances are created, where does the speedup come from? As I understand it, starting a new JVM means creating a new JVM process, which incurs the usual startup overhead.
If they are not, how are native dependencies added? As far as I know, the JVM only learns about native library locations from the java.library.path set on the command line before startup. The only way I know to circumvent this is the Sun/Oracle-implementation-specific hack shown below.
;; Sun/Oracle-specific hack: java.lang.ClassLoader caches the parsed library
;; path in the static field sys_paths, so nulling it forces the next
;; loadLibrary call to re-read java.library.path.
(let [field (.getDeclaredField ClassLoader "sys_paths")]
  (.setAccessible field true)
  (.set field nil nil) ; static field, so the target instance is nil
  (System/setProperty "java.library.path"
                      ;; use the platform separator (";" only works on Windows);
                      ;; native-paths is your seq of native library directories
                      (apply str (interpose java.io.File/pathSeparator native-paths))))

Cake has a Ruby script that starts up and manages the JVMs. Ruby doesn't carry the JVM's startup overhead, so the Ruby script can create the JVMs once and then, when you execute commands, delegate them to the running JVMs.
Two JVMs were originally necessary so that cake's own dependencies (the cake JVM) stayed separate from the project's dependencies (the bake JVM). Some commands, like cake repl, run in the bake JVM to take advantage of the project's classpath.
However, in the most recent version there is only a single JVM per project. This is possible by using different classloaders within the same JVM; the relevant library is classlojure.
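To give a sense of how that works, here is a minimal sketch using classlojure (assuming classlojure 0.6's classlojure.core namespace; the jar and source paths are illustrative placeholders):

(require '[classlojure.core :as cl])

;; Create an isolated classloader with its own copy of the Clojure runtime.
;; The URLs below stand in for the project's actual classpath.
(def project-cl
  (cl/classlojure "file:lib/clojure-1.2.0.jar" "file:classes/" "file:src/"))

;; Evaluate a form inside that classloader; its vars and classes are
;; invisible to the main (cake) classloader, and vice versa.
(cl/eval-in project-cl '(+ 1 2)) ;=> 3

Reloading the project after a classpath change then amounts to building a fresh classloader and discarding the old one, without restarting the JVM.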
Even in the two-JVM versions, the JVMs were persistent: they were spawned once and restarted only when absolutely necessary, such as when the classpath changed (when you add a new dependency or something similar). I'm not sure why you think this would mean incurring the JVM startup overhead every time a command is executed; the whole point is that most commands run instantly instead of each one starting a fresh JVM.

Raynes is correct. As of cake 0.6.0, there is one JVM per project. Cake runs in the main classloader and uses classlojure to load the project in a separate classloader and reload it when the classpath changes. We have discussed a global ~/.cake/config option to share a single JVM among all projects. It shouldn't be too hard to add this using classlojure. The main issue with this approach is how to keep cake task plugins separate. Perhaps the global cake project could run in the main classloader and each project could get two classloaders (one for cake and one for the project).
As for native dependencies, classlojure doesn't support adding them after the JVM starts up. A patch to add this functionality would be welcome as long as the native library path is local to a specific classloader and isn't shared among all classloaders in the same JVM.

Related

Is VisualVM instrumenting bytecode?

I'm a little confused: AFAIK VisualVM performs profiling and sampling, so does that mean it not only takes dumps (thread stacks + memory state) but also instruments the code?
The answer at https://stackoverflow.com/a/12130149/10894456 explains that profiling implies instrumentation. But does VisualVM do the instrumenting by itself, or does it need something prepared in advance (like a Java Agent)?
Yes, when you use the Profiler, VisualVM will instrument the bytecode as necessary. This can only be done via an Agent, so VisualVM includes such a Java Agent. When you are connected to a JVM on the same machine, it may use the Attach API to load the Agent into the target JVM dynamically. So in this use case, it doesn’t need additional preparation steps on the user’s side.
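For a rough illustration of that attach mechanism, here is a sketch in Clojure (the pid and agent-jar path are placeholders; this assumes a JDK where the com.sun.tools.attach API is available):

(import '[com.sun.tools.attach VirtualMachine])

;; Attach to a running JVM by process id and inject an agent jar into it.
;; This is the same kind of dynamic loading VisualVM uses for its profiler agent.
(let [vm (VirtualMachine/attach "12345")]
  (.loadAgent vm "/path/to/profiler-agent.jar")
  (.detach vm))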

Persistence of data for MSI installation

The MSI installation calls my (native/C++) custom action functions. Since the DLL is freshly loaded, and the MSIEXEC.EXE process is launched separately for each function (the callable actions, as specified in the MSI/WiX script), I cannot use any global data in my C/C++ code.
How (or Where) can I store some information about the installation going on?
I cannot use named objects (like shared memory), because the process that loads the DLL to call the action function exits afterwards, and the OS will not keep the named object alive.
I could store the information in an external file, but then how would I know (in the DLL's function):
When to delete the external file.
How to tell that a given call is the first call (an action scheduled Before="LaunchConditions" may help, but I'm not sure).
If I cannot delete the file, I cannot know whether the information is current or stale (i.e. left over from an earlier failed or successful MSI run).
I have heard of "temporary MSI tables", but I am not sure how to use them.
Preserve Settings: I am a little confused what your custom actions do, to be honest. However, it sounds like they preserve settings from an older application and setup version and put them back in place if the MSI fails to install properly?
Migration Suggestion (please seriously consider this option): Could you install your new MSI package and delete all shortcuts and access to the old application whilst leaving it installed instead? Your new application version installs to a new path and a new registry hive, and then you migrate all settings on first launch of the new application and then kick off the uninstall of the old application - somehow - or just leave it installed if that is acceptable? Are there COM servers in your old install? Other things that have global registration?
Custom Action Abstinence: The above is just a suggestion to avoid custom actions. There are many reasons to avoid custom actions (propaganda piece against custom actions). If you migrate settings on application launch, you avoid all the sequencing, conditioning, and impersonation issues, along with the technical issues you have already faced (there are many more) that come with custom action use. And crucially, you are in a familiar debugging context (application launch code) as opposed to the unfamiliar world of setups and their poor debuggability.
Preserving Settings & Data: With regards to saving data and settings in a running MSI instance, the built in mechanism is basically to set properties using Session.Property (COM / VBScript) or MsiSetProperty (Win32) calls. This allows you to preserve strings inside the MSI's Session object. Sort of global data.
Note that properties can only be set in immediate mode (custom actions that don't change the system), and sending the data to deferred-mode custom actions (which can make system changes) is quite involved, centering on the CustomActionData concept (more on deferred mode & CustomActionData).
Essentially you send a string to the deferred-mode custom action by means of a SetProperty custom action in immediate mode - typically a "home grown" delimited string that you construct in immediate mode and split back into pieces when you receive it in deferred mode. You could also use JSON strings or similar to make the transfer easier and more reliable by serializing and deserializing objects.
Alternatives?: This set-property approach is involved. Some people write to and from the registry during installation, or to a temp file (in the temp folder), and then clean up during the commit phase of the MSI, but I don't like this approach for several reasons. For one thing, commit custom actions might not run at all depending on policies on target systems (when rollback is disabled, no commit script is created - see the "Commit Execution" section), and it isn't best practice. Adding temporary rows is an interesting option that I have never spent much time on; I doubt you could easily use it to achieve what you need, although I don't know your requirements in detail. Quick sample. This RemoveFile example from WiX might be better.

How to use figwheel with a ring-handler that's a component?

I'd like to use figwheel to reload the frontend of an all-clojure project I'm playing with.
The backend serves a REST API and is organized as a bunch of components from which I create a system in my main function (I use duct to create the handler component). I want to pass state to my handlers using closures, but the only means of configuring figwheel to use my handler seems to be setting the :ring-handler key in project.clj, and that requires a handler defined in a namespace at lein startup time.
So - is there a way to configure figwheel during my component startup? I'm still very new at Clojure, so it's likely I'm missing something in plain sight.
Passing state as parameter to a ring handler? is a similar question, but the answer there involves binding the handler to a var at the top level of a namespace, which is what I'm trying to avoid.
Figwheel doesn't need to be configured as a handler. You can turn this code into a component that autobuilds while your server is up and running, and make it a dependency of your server component so that it starts first. Note that this isn't officially supported; running lein figwheel from the shell to boot a separate JVM is the conventional usage.
If you are using Stuart Sierra's component library, I'd recommend wrapping the ring handler inside a server component rather than via project.clj. Use this project or adapt the code snippet for a Jetty component; a sketch follows below.
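A minimal sketch of that idea, assuming ring and Stuart Sierra's component library (the WebServer name, the injected db component, and the inline handler are illustrative):

(ns example.server
  (:require [com.stuartsierra.component :as component]
            [ring.adapter.jetty :as jetty]))

;; A Jetty component that builds its handler at start time, closing over
;; whatever state (here a hypothetical `db` component) was injected into it.
(defrecord WebServer [port db server]
  component/Lifecycle
  (start [this]
    (let [handler (fn [req] {:status 200 :body (str "db: " db)})]
      (assoc this :server (jetty/run-jetty handler {:port port :join? false}))))
  (stop [this]
    (when server (.stop server))
    (assoc this :server nil)))

;; Usage: (component/start (component/using (map->WebServer {:port 3000}) [:db]))

Because the handler is created inside start, it can close over any dependency the system injects, with no top-level var required.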
Note that figwheel is a development tool, so in production you will most likely serve a compiled JS file built with e.g. lein-cljsbuild.
James Reeves made a component for figwheel here
Duct-Figwheel-Component
A component for the Figwheel development tool, designed to be used in the Duct framework (but can be used in any component-based system).
Installation
Add the following dependency to your project.clj:
[duct/figwheel-component "0.3.3"]
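The component can then be wired into your system map. A rough sketch based on the project's README (the duct.component.figwheel namespace and the option keys are taken from the README of that era and may differ in later versions; the paths are placeholders):

(require '[com.stuartsierra.component :as component]
         '[duct.component.figwheel :as figwheel])

;; Construct a figwheel server component; builds take cljsbuild-style options.
(def system
  (component/system-map
    :figwheel (figwheel/server
                {:builds [{:source-paths ["src"]
                           :build-options {:output-to "target/public/main.js"
                                           :output-dir "target/public"
                                           :optimizations :none}}]})))

;; (component/start system) starts figwheel along with the rest of the system.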

Web-app with clojure using hot-swapping of code

I'm thinking of writing a web app in Clojure that can update itself without restarting or losing state.
I've seen some articles where Clojure apps perform so-called hot-swapping of code, meaning they can update their own functions at runtime. Would this be safe to do on a web server?
Hot-swapping code is tricky to get right, if it is possible at all.
It depends on the changeset and the running application too.
Issues:
old vars may litter namespaces and cause subtle conflicts, bugs
redefinition of multiple vars is not atomic
There may be old vars in a namespace that would not exist after a restart of the application, but that will interfere if you just redefine some of the functions and keep the app running.
The other issue is atomicity: redefining multiple functions, i.e. changing multiple vars, is not atomic. If code in one namespace depends on code in another, reloading those namespaces with the new code is not atomic either; see the sketch below.
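A small example of the stale-var problem (the namespace and function names are illustrative):

(ns app.core)

(defn helper [x] (* 2 x))   ; later deleted from the source file
(defn handler [x] (helper x))

;; After removing `helper` from the file and reloading with
;; (require 'app.core :reload), (handler 2) still returns 4 because the
;; old #'helper var survives in the running image. A fresh JVM would
;; fail to compile instead, so the running app and the code on disk
;; have silently diverged.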
Generally, you are better off either
having a proxy hold the requests until your app restarts
spinning up a new app instance in parallel with the old version and using a proxy to switch over to the new version once it is ready to process requests
OTP applications in Erlang support this. Basically, the runtime spins up the new version of your application and starts sending requests to it, keeping the old version alive until it has finished processing its in-flight requests, and then shuts it down.

clojure rmi classpath problem

I am trying to use Clojure to implement a "plugin" for some vendor-supplied software.
Here is a little background on the vendor-supplied software. It expects me to implement a particular interface and then put the jar file containing that implementation into a directory on its server. When a client runs the software, my implementing class gets "sent" to the client from the server via RMI, and my implementation of the interface then runs on the client. The client doesn't have my jar file (or the Clojure jar file) in its classpath; only the server has those jar files. RMI seems to be smart enough to upload whatever dependencies are necessary.
I have successfully built a very simple implementation in Clojure and it seems to work. The problem is that I would like to be able to update my implementation on the client on the fly. I embedded a repl-server in my class and I can successfully connect to it. Just to be clear, the repl-server is running on the client and I am able to connect to the repl, getting a prompt "clojure.core=>". However, the repl seems to be quite crippled. If I enter (+ 1 1) I get the following error: "java.lang.ClassNotFoundException: clojure.lang.Numbers". If I enter (str "kent") I get "java.lang.NoClassDefFoundError: clojure/lang/AFunction". Most things I enter produce something similar. I can, however, do a simple def such as (def x 3), and x does get defined, so the REPL does seem to be running in some sense.
It seems like it might be a classpath problem, but I'm not sure why my compiled code, running on the client, has no classpath problem while the repl, running on the same client, can't find core classes.
Any ideas?
Thanks.
Kent.
First of all, would it be possible to distribute clojure.jar as part of your RMI client? Based on your description of the vendor software, I'm guessing the answer is no.
Second, are the contents of clojure.jar and your RMI object in the same jar file on the server, or is each in its own jar file?
It seems very likely that it's a classloader issue. In Clojure, each defined function generates its own class file, which Clojure then loads via a specific classloader. IIRC, each function is loaded by its own classloader instance so that the function can be garbage collected if it is redefined. Similarly, I think RMI uses its own classloader to load remote RMI objects over the network. So possibly the two classloaders interact badly; one thing to try is sketched below.
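An unverified sketch: point the REPL thread's context classloader at the loader that brought in your jar, so the embedded REPL resolves clojure.lang.* classes the same way your compiled code does (MyPlugin is a stand-in for any class from your own jar that RMI has already loaded):

;; Run this in the embedded repl-server's startup code, before
;; accepting connections.
(.setContextClassLoader (Thread/currentThread)
                        (.getClassLoader MyPlugin))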
Sorry I can't be of more help...
-- Lauri