Clojure optimization of Java interop

When working with existing Java classes I often get reflection warnings if I've done something incorrectly, e.g.
IllegalArgumentException No matching field found: gets for class java.lang.String  clojure.lang.Reflector.getInstanceField (Reflector.java:271)
Is Clojure doing reflection at runtime for each invocation of the given methods, or is this cached in some way? Would there be a speed benefit to moving the more involved Java interop into a related Java class?

Clojure will do reflection at runtime only if it can't infer the exact method to call based on the surrounding context; otherwise it emits code that will call the method directly. You can use type hints to provide the compiler with this context if needed. For instance:
user=> (set! *warn-on-reflection* true)
user=> (defn str-len [x] (.length x))
Reflection warning, NO_SOURCE_PATH:1:19 - reference to field length can't be resolved.
user=> (defn str-len-2 [^String x] (.length x))
user=> (str-len "abc") ; -> 3
user=> (str-len-2 "abc") ; -> 3
user=> (time (dotimes [_ 100000] (str-len "abc")))
"Elapsed time: 1581.163611 msecs"
user=> (time (dotimes [_ 100000] (str-len-2 "abc")))
"Elapsed time: 36.838201 msecs"
The first function will use reflection every time it's invoked; the second has similar performance to native Java code.
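Hints don't have to go on the parameter itself. As a minimal sketch (the function names greeting, greeting-length and upper are made up for illustration), a hint can also sit on the argument vector to describe the return value, or on a let-bound local:

```clojure
(set! *warn-on-reflection* true)

;; Return-type hint on the argument vector: callers learn the result is a String.
(defn greeting ^String [who]
  (str "hello, " who))

;; No hint needed here; the compiler takes the type from greeting's return hint.
(defn greeting-length [who]
  (.length (greeting who)))

;; Hint on a let-bound local instead of on the parameter.
(defn upper [x]
  (let [^String s x]
    (.toUpperCase s)))

(greeting-length "world") ; -> 12
(upper "abc")             ; -> "ABC"
```

None of these definitions should print a reflection warning, since the compiler can resolve each method call from the hints.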

The message in your question isn't a reflection warning; it is an exception thrown by a reflective call, which only indicates that reflection was being used.
You can use type hints to eliminate reflection. The *warn-on-reflection* flag (default false) optionally enables reflection warnings.
I find it convenient to use Leiningen's lein check utility, which attempts to compile every Clojure file in your project, with reflection warnings turned on. This will report reflection issues in your code, or in any code loaded from libraries.
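If you would rather have the warnings on for every Leiningen task and not only for lein check, the flag can be set in project.clj through :global-vars. A sketch, where the project name, version and dependency are placeholders:

```clojure
;; project.clj (fragment; project name and version are placeholders)
(defproject my-app "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.10.0"]]
  ;; Sets *warn-on-reflection* for the code Leiningen compiles or loads.
  :global-vars {*warn-on-reflection* true})
```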

Related

Reflection warning despite type hint to Java constructor in Clojure

The following code gives me a reflection warning in spite of the type hint.
(set! *warn-on-reflection* true)
(IllegalArgumentException.
^String (with-out-str (print "hi")))
The warning:
Reflection warning ...
call to java.lang.IllegalArgumentException ctor
can't be resolved.
The code has been extracted and simplified from a more complex example where pretty printing an arbitrary object is performed within the with-out-str. I am using Clojure 1.10.0.
This is CLJ-865. It's not specific to with-out-str: adding a type hint to any form which is a macro invocation will typically discard it. The typical workaround is the one in your answer: define a local saving the value, artificially introducing a non-macro form to annotate.
Carcigenicate's insight into the cause inspired me to try the following, which also works.
(let [m (with-out-str (print "hi"))]
(IllegalArgumentException.
^String m ))
I'm not sure of the cause, but I'll note it can be fixed with a call to str:
(IllegalArgumentException. (str (with-out-str (print "hi"))))
It seems to have something to do with try:
(set! *warn-on-reflection* true)
(IllegalArgumentException. ^String (try "" (finally "")))
Reflection warning, C:\Users\slomi\AppData\Local\Temp\form-init3916067866461493959.clj:3:1 - call to java.lang.IllegalArgumentException ctor can't be resolved.
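Along the same lines as the let workaround, the macro call can be moved into a small function that carries the return hint. This is only a sketch, and the helper name printed is made up for illustration. Because the hint sits on a plain function rather than on a macro invocation, there is no expansion step that could drop it:

```clojure
(set! *warn-on-reflection* true)

;; Hypothetical helper: the ^String return hint lives on the function,
;; not on the with-out-str macro form.
(defn printed ^String [x]
  (with-out-str (print x)))

;; The constructor call can now be resolved at compile time.
(.getMessage (IllegalArgumentException. (printed "hi"))) ; => "hi"
```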

Using let style destructuring for def

Is there a reasonable way to have multiple def statements happen with destructuring the same way that let does it? For example:
(let [[rtgs pcts] (->> (sort-by second row)
(apply map vector))]
.....)
What I want is something like:
(defs [rtgs pcts] (->> (sort-by second row)
(apply map vector)))
This comes up a lot in the REPL, notebooks and when debugging. Seriously feels like a missing feature so I'd like guidance on one of:
This exists already and I'm missing it
This is a bad idea because... (variable capture?, un-idiomatic?, Rich said so?)
It's just un-needed and I must be suffering from withdrawals from an evil language. (same as: don't mess up our language with your macros)
A super short experiment gives me something like:
(defmacro def2 [[name1 name2] form]
`(let [[ret1# ret2#] ~form]
(do (def ~name1 ret1#)
(def ~name2 ret2#))))
And this works as in:
(def2 [three five] ((juxt dec inc) 4))
three ;; => 3
five ;; => 5
Of course, an "industrial strength" version of that macro might add:
checking that number of names matches the number of inputs. (return from form)
recursive call to handle more names (can I do that in a macro like this?)
While I agree with Josh that you probably shouldn't have this running in production, I don't see any harm in having it as a convenience at the repl (in fact I think I'll copy this into my debug-repl kitchen-sink library).
I enjoy writing macros (although they're usually not needed) so I whipped up an implementation. It accepts any binding form, like in let.
(I wrote this specs-first, but if you're on clojure < 1.9.0-alpha17, you can just remove the spec stuff and it'll work the same.)
(ns macro-fun
(:require
[clojure.spec.alpha :as s]
[clojure.core.specs.alpha :as core-specs]))
(s/fdef syms-in-binding
:args (s/cat :b ::core-specs/binding-form)
:ret (s/coll-of simple-symbol? :kind vector?))
(defn syms-in-binding
"Returns a vector of all symbols in a binding form."
[b]
(letfn [(step [acc coll]
(reduce (fn [acc x]
(cond (coll? x) (step acc x)
(symbol? x) (conj acc x)
:else acc))
acc, coll))]
(if (symbol? b) [b] (step [] b))))
(s/fdef defs
:args (s/cat :binding ::core-specs/binding-form, :body any?))
(defmacro defs
"Like def, but can take a binding form instead of a symbol to
destructure the results of the body.
Doesn't support docstrings or other metadata."
[binding body]
`(let [~binding ~body]
~@(for [sym (syms-in-binding binding)]
`(def ~sym ~sym))))
;; Usage
(defs {:keys [foo bar]} {:foo 42 :bar 36})
foo ;=> 42
bar ;=> 36
(defs [a b [c d]] [1 2 [3 4]])
[a b c d] ;=> [1 2 3 4]
(defs baz 42)
baz ;=> 42
About your REPL-driven development comment:
I don't have any experience with IPython, but I'll give a brief explanation of my REPL workflow and you can maybe comment on how it compares with IPython.
I never use my REPL like a terminal, inputting a command and waiting for a reply. My editor (Emacs, but any Clojure editor should do) supports putting the cursor at the end of any s-expression and sending that to the REPL, "printing" the result after the cursor.
I usually have a comment block in the file where I start working, just typing whatever and evaluating it. Then, when I'm reasonably happy with a result, I pull it out of the "repl-area" and into the "real-code".
(ns stuff.core)
;; Real code is here.
;; I make sure that this part always basically works,
;; ie. doesn't blow up when I evaluate the whole file
(defn foo-fn [x]
,,,)
(comment
;; Random experiments.
;; I usually delete this when I'm done with a coding session,
;; but I copy some forms into tests.
;; Sometimes I leave it for posterity though,
;; if I think it explains something well.
(def some-data [,,,])
;; Trying out foo-fn, maybe copy this into a test when I'm done.
(foo-fn some-data)
;; Half-finished other stuff.
(defn bar-fn [x] ,,,)
(keys 42) ; I wonder what happens if...
)
You can see an example of this in the clojure core source code.
The number of defs that any piece of clojure will have will vary per project, but I'd say that in general, defs are not often the result of some computation, let alone the result of a computation that needs to be destructured. More often defs are the starting point for some later computation that will depend on this value.
Usually functions are better for computing a value; and if the computation is expensive, then you can memoize the function. If you feel you really need this functionality, then by all means, use your macro -- that's one of the selling points of Clojure, namely, extensibility! But in general, if you feel you need this construct, consider the possibility that you're relying too much on global state.
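As a rough sketch of the function-plus-memoize route, using the pipeline from the question (the names sorted-columns, sorted-columns-memo and summarize are made up for illustration):

```clojure
;; Hypothetical stand-in for the computation from the question.
(defn sorted-columns [row]
  (->> row
       (sort-by second)
       (apply map vector)))

;; memoize caches the result per argument, so repeated calls are cheap.
(def sorted-columns-memo (memoize sorted-columns))

;; Destructure with let at the point of use; no global defs are needed.
(defn summarize [row]
  (let [[rtgs pcts] (sorted-columns-memo row)]
    {:ratings rtgs :percents pcts}))

(summarize [[:a 0.3] [:b 0.1] [:c 0.2]])
;; => {:ratings [:b :c :a], :percents [0.1 0.2 0.3]}
```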
Just to give some real examples, I just referenced my main project at work, which is probably 2K-3K lines of clojure, in about 20 namespaces. We have about 20 defs, most of which are marked private and among them, none are actually computing anything. We have things like:
(def path-prefix "/some-path")
(def zk-conn (atom nil))
(def success? #{200})
(def compile* (clojure.core.memoize/ttl compiler {} ...))
(def ^:private nashorn-factory (NashornScriptEngineFactory.))
(def ^:private read-json (comp json/read-str ... ))
Defining functions (using comp and memoize), enumerations, state via atom -- but no real computation.
So I'd say, based on your bullet points above, this falls somewhere between 2 and 3: it's definitely not a common use case that's needed (you're the first person I've ever heard who wants this, so it's uncommon to me anyway); and the reason it's uncommon is because of what I said above, i.e., it may be a code smell that indicates reliance on too much global state, and hence, would not be very idiomatic.
One litmus test I have for much of my code is: if I pull this function out of this namespace and paste it into another, does it still work? Removing dependencies on external vars allows for easier testing and more modular code. Sometimes we need it though, so see what your requirements are and proceed accordingly. Best of luck!

Is there a good way to check return types when refactoring?

Currently when I'm refactoring in Clojure I tend to use the pattern:
(defn my-func [arg1 arg2]
(assert (= demo.core.Record1 (class arg1)) (str "Incorrect class for arg1: " (class arg1)))
(assert (= demo.core.Record1 (class arg2)) (str "Incorrect class for arg2: " (class arg2)))
...
That is, I find myself manually checking the return types in case a downstream part of the system modifies them to something I don't expect. (As in, if I refactor and get a stack-trace I don't expect, then I express my assumptions as invariants, and step forward from there).
In one sense, this is exactly the kind of invariant checking that Bertrand Meyer anticipated. (Author of Object Oriented Software Construction, and proponent of the idea of Design by Contract).
The challenge is that I don't find these out until run-time. It would be nice to find these out at compile-time - by simply stating what the function expects.
Now I know Clojure is essentially a dynamic language. (Whilst Clojure has a 'compiler' of sorts, we should expect the application of values to a function to only come to realisation at run-time.)
I just want a good pattern to make refactoring easier. (ie see all the flow-on effects of changing an argument to a function, without seeing it break on the first call, then moving onto the next, then moving onto the next breakage.)
My question is: Is there a good way to check return types when refactoring?
If I understand you right, prismatic/schema should be your choice.
https://github.com/plumatic/schema
(s/defn ^:always-validate my-func :- SomeResultClass
[arg1 :- demo.core.Record1
arg2 :- demo.core.Record1]
...)
You should just turn off all the validation before release, so it won't affect performance.
core.typed is nice, but as far as I remember, it requires you to annotate all your code, while schema lets you annotate only the critical parts.
You have a couple of options, only one of which is "compile" time:
Tests
As Clojure is a dynamic language, tests are absolutely essential. They are your safety net when refactoring. Even in statically typed languages tests are still of use.
Pre and Post Conditions
They allow you to verify your invariants by adding metadata to your functions such as in this example from Michael Fogus' blog:
(defn constrained-fn [f x]
{:pre [(pos? x)]
:post [(= % (* 2 x))]}
(f x))
(constrained-fn #(* 2 %) 2)
;=> 4
(constrained-fn #(float (* 2 %)) 2)
;=> 4.0
(constrained-fn #(* 3 %) 2)
;=> java.lang.Exception: Assert failed: (= % (* 2 x))
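Applied to the function from the question, the same mechanism might look like the sketch below. Record1 here is a stand-in defined only so the example runs, and the body and the :post check are assumptions, since the question doesn't show what my-func returns:

```clojure
;; Stand-in for demo.core.Record1 so the sketch is self-contained.
(defrecord Record1 [id])

(defn my-func [arg1 arg2]
  {:pre  [(instance? Record1 arg1)
          (instance? Record1 arg2)]
   :post [(number? %)]}          ; assumed return type, for illustration
  (+ (:id arg1) (:id arg2)))

(my-func (->Record1 1) (->Record1 2))
;=> 3
;; (my-func 1 2) fails the first :pre check with an assertion error
;; that names the failing form.
```

This replaces the hand-written assert lines with declarative checks, but it is still a run-time check.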
core.typed
core.typed is the only option in this list that will give you compile time checking. Your example would then be expressed like so:
(ann my-func (Fn [Record1 Record1 -> ResultType]))
(defn my-func [arg1 arg2]
...)
This comes at the expense of running core.typed as a separate action, possibly as part of your test suite.
Still in the realm of runtime validation and checking, there are further options such as bouncer and schema.

Attaching type hints to a Clojure delayed call

I'm attempting to put handles on Java objects which aren't available at compile time, but are available at runtime, in vars as follows:
(def component-manager (delay (SomeJavaObject/getHandle)))
(If a better mechanism than delays is available, this would be welcome).
When these objects are used, a reflection warning is generated. As sometimes these are fairly frequent, I've tried to avoid it with the following:
(def my-handle ^SomeJavaObject (delay (SomeJavaObject/getHandle)))
Unfortunately, the reflection warning is still generated in this case.
Modifying the references works:
(.foo ^SomeJavaObject @my-handle)
...but this uglifies the code substantially.
Wrapping in a macro which adds the type hints seems an obvious approach:
(def my-handle' (delay (SomeJavaObject/getHandle)))
(defmacro my-handle []
(with-meta '(deref my-handle')
{:tag SomeJavaObject}))
...and looks like it should do the right thing:
=> (set! *print-meta* true)
=> (macroexpand '(my-handle))
^SomeJavaObject (deref my-handle')
...but this doesn't hold true when the rubber hits the road:
=> (.foo (my-handle))
Reflection warning, NO_SOURCE_PATH:1 - reference to field foo can't be resolved.
What's the right way to do this?
I am not sure whether delay is the best choice here or whether there is a better way to manage those Java objects, but as far as the reflection warning is concerned, the code below solves it by wrapping access to the delay's value in a function with a type hint.
user=> (set! *warn-on-reflection* true)
true
user=> (def myobj (delay "hello world"))
#'user/myobj
user=> (defn ^String get-my-obj [] @myobj)
#'user/get-my-obj
user=> (.length (get-my-obj))
11
To make it even easier, you can create a macro that creates the delay object and also creates a type-hinted get-<delay object name> function to access it.
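A minimal sketch of such a macro, assuming the getter should be named get-<name>; the macro name def-delayed and its argument order are made up for illustration:

```clojure
(set! *warn-on-reflection* true)

(defmacro def-delayed
  "Hypothetical helper: defines `name` as a delay around `init`, plus a
  `get-<name>` function whose return value is hinted with `tag`."
  [name tag init]
  (let [getter (with-meta (symbol (str "get-" name)) {:tag tag})]
    `(do
       (def ~name (delay ~init))
       (defn ~getter [] (deref ~name)))))

(def-delayed myobj String "hello world")

(.length (get-myobj)) ;=> 11
```

The hint is attached to the getter's name, the same place the hand-written get-my-obj above puts it.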
I'd like to offer two different solutions:
1. delay
First off, I wanted to post an answer that shows how your problem can be recreated. (As worded, your question is abstract and cannot be run in a REPL.)
(set! *warn-on-reflection* true)
(def rt (delay (Runtime/getRuntime)))
(.availableProcessors @rt) ; Reflection warning
You mention that you don't like the following approach because you consider it verbose:
(.availableProcessors ^java.lang.Runtime @rt) ; no warning
I tried the following, but it did not solve the reflection warning:
(def rt (delay ^java.lang.Runtime (Runtime/getRuntime)))
(.availableProcessors @rt) ; Reflection warning
Conclusion: if you want to use a delay, Ankur's answer works great. To adapt it to this example:
(def delayed-runtime (delay (Runtime/getRuntime)))
(defn ^java.lang.Runtime get-runtime [] @delayed-runtime)
(.availableProcessors (get-runtime)) ; no warning
2. memoize
You might also consider using memoization, as it requires less code:
(def ^java.lang.Runtime get-runtime (memoize #(Runtime/getRuntime)))
(.availableProcessors (get-runtime)) ; no warning

Which Vars affect a Clojure function?

How do I programmatically figure out which Vars may affect the results of a function defined in Clojure?
Consider this definition of a Clojure function:
(def ^:dynamic *increment* 3)
(defn f [x]
(+ x *increment*))
This is a function of x, but also of *increment* (and also of clojure.core/+(1); but I'm less concerned with that). When writing tests for this function, I want to make sure that I control all relevant inputs, so I do something like this:
(assert (= (binding [*increment* 3] (f 1)) 4))
(assert (= (binding [*increment* -1] (f 1)) 0))
(Imagine that *increment* is a configuration value that someone might reasonably change; I don't want this function's tests to need changing when this happens.)
My question is: how do I write an assertion that the value of (f 1) can depend on *increment* but not on any other Var? Because I expect that one day someone will refactor some code and cause the function to be
(defn f [x]
(+ x *increment* *additional-increment*))
and neglect to update the test, and I would like to have the test fail even if *additional-increment* is zero.
This is of course a simplified example – in a large system, there can be lots of dynamic Vars, and they can get referenced through a long chain of function calls. The solution needs to work even if f calls g which calls h which references a Var. It would be great if it didn't claim that (with-out-str (prn "foo")) depends on *out*, but this is less important. If the code being analyzed calls eval or uses Java interop, of course all bets are off.
I can think of three categories of solutions:
Get the information from the compiler
I imagine the compiler does scan function definitions for the necessary information, because if I try to refer to a nonexistent Var, it throws:
user=> (defn g [x] (if true x (+ *foobar* x)))
CompilerException java.lang.RuntimeException: Unable to resolve symbol: *foobar* in this context, compiling:(NO_SOURCE_PATH:24)
Note that this happens at compile time, and regardless of whether the offending code will ever be executed. Thus the compiler should know what Vars are potentially referenced by the function, and I would like to have access to that information.
Parse the source code and walk the syntax tree, and record when a Var is referenced
Because code is data and all that. I suppose this means calling macroexpand and handling each Clojure primitive and every kind of syntax they take. This looks so much like a compilation phase that it would be great to be able to call parts of the compiler, or somehow add my own hooks to the compiler.
Instrument the Var mechanism, execute the test and see which Vars get accessed
Not as complete as the other methods (what if a Var is used in a branch of the code that my test fails to exercise?) but this would suffice. I imagine I would need to redefine def to produce something that acts like a Var but records its accesses somehow.
(1) Actually that particular function doesn't change if you rebind +; but in Clojure 1.2 you can bypass that optimization by making it (defn f [x] (+ x 0 *increment*)) and then you can have fun with (binding [+ -] (f 3)). In Clojure 1.3 attempting to rebind + throws an error.
Regarding your first point you could consider using the analyze library. With it you can quite easily figure out which dynamic vars are used in an expression:
user> (def ^:dynamic *increment* 3)
user> (def src '(defn f [x]
(+ x *increment*)))
user> (def env {:ns {:name 'user} :context :eval})
user> (->> (analyze-one env src)
expr-seq
(filter (op= :var))
(map :var)
(filter (comp :dynamic meta))
set)
#{#'user/*increment*}
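If pulling in a library is not an option, a much cruder version of the same idea can be sketched with core functions only. The function name dynamic-vars-in is made up, and this is not equivalent to a real analysis: it does not macroexpand, it does not know about locals that shadow a Var, and it does not follow calls into other functions.

```clojure
(def ^:dynamic *increment* 3)

(defn dynamic-vars-in
  "Rough sketch: returns the set of dynamic Vars named by symbols that
  appear anywhere in the quoted form, resolved in the current namespace."
  [form]
  (->> (tree-seq coll? seq form)       ; walk every nested node of the form
       (filter symbol?)
       (keep #(ns-resolve *ns* %))     ; unresolvable symbols (e.g. locals) drop out
       (filter var?)                   ; ignore classes
       (filter (comp :dynamic meta))
       set))

(dynamic-vars-in '(defn f [x] (+ x *increment*)))
;; returns a set containing only the *increment* Var
```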
I know that this doesn't answer your question, but wouldn't it be a lot less work to just provide two versions of a function where one version has no free variables, and the other version calls the first one with the appropriate top-level defines?
For example:
(def ^:dynamic *increment* 3)
(defn f
([x]
(f x *increment*))
([x y]
(+ x y)))
This way you can write all your tests against (f x y), which doesn't rely on any global state.
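A short sketch of what the tests could then look like, reusing the assertions from the question. The last check is optional and only confirms that the one-argument arity still picks up the binding:

```clojure
(def ^:dynamic *increment* 3)

(defn f
  ([x]
   (f x *increment*))
  ([x y]
   (+ x y)))

;; Tests against the two-argument arity need no binding at all.
(assert (= 4 (f 1 3)))
(assert (= 0 (f 1 -1)))

;; One check that the one-argument arity reads *increment*.
(assert (= 11 (binding [*increment* 10] (f 1))))
```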