Can I avoid a second symbol lookup in Clojure resolve? - clojure

I need to do a lot of string-to-symbol-value processing, but some of the strings will not resolve to symbols. I have written the following (example) code to test to see if a given string resolves to a known symbol, before dereferencing the var to get the symbol's value.
(if (resolve (symbol "A"))
  @(resolve (symbol "A")) ;; If the resolve of "A" works, deref it.
  false) ;; If "A" doesn't resolve to a var, return false.
Maybe I think about things too much. I'm concerned about having to do the
(resolve (symbol "A"))
twice many times. Is there a clever way I can perform this "test-for-nil-and-deref-it-if-not," only once? Maybe a "safe" deref function that doesn't generate a runtime error when dereferencing nil? Or should I just not worry about performance in this code? Maybe the compiler's clever?

You can try the when-let macro to only call deref when the result of resolve is present:
(when-let [resolved (resolve (symbol "A"))]
  @resolved)
Or a shorter version with some->:
(some-> "A" symbol resolve deref)
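As a minimal sketch, the some-> chain can be wrapped in a small helper. The names lookup-value and A below are made up for illustration. resolve looks symbols up relative to the current namespace (*ns*), so the example assumes it is run at the REPL or top level of the namespace where A is defined.

```clojure
(def A 42)

(defn lookup-value
  "Return the value of the var named by string s, or nil if s does not resolve.
  Hypothetical helper; resolve is called only once per lookup."
  [s]
  (some-> s symbol resolve deref))

(lookup-value "A")            ;=> 42
(lookup-value "no-such-var")  ;=> nil
```

Note that a var whose value is nil or false cannot be told apart from a missing var this way; if that matters, test the result of resolve itself rather than the dereferenced value.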


Reflection warning despite type hint to Java constructor in Clojure

The following code gives me a reflection warning in spite of the type hint.
(set! *warn-on-reflection* true)
(IllegalArgumentException.
^String (with-out-str (print "hi")))
The warning:
Reflection warning ...
call to java.lang.IllegalArgumentException ctor
can't be resolved.
The code has been extracted and simplified from a more complex example where pretty printing an arbitrary object is performed within the with-out-str. I am using Clojure 1.10.0.
This is CLJ-865. It's not specific to with-out-str: adding a type hint to any form which is a macro invocation will typically discard it. The typical workaround is the one in your answer: define a local saving the value, artificially introducing a non-macro form to annotate.
Carcigenicate's insight into the cause inspired me to try the following which also works.
(let [m (with-out-str (print "hi"))]
(IllegalArgumentException.
^String m ))
I'm not sure of the cause, but I'll note it can be fixed with a call to str:
(IllegalArgumentException. (str (with-out-str (print "hi"))))
It seems to be something to do with try?:
(set! *warn-on-reflection* true)
(IllegalArgumentException. ^String (try "" (finally "")))
Reflection warning, C:\Users\slomi\AppData\Local\Temp\form-init3916067866461493959.clj:3:1 - call to java.lang.IllegalArgumentException ctor can't be resolved.

Can I write this macro without using eval?

I'm trying to write a macro which will catch a compile time error in Clojure. Specifically, I would like to catch exceptions thrown when a protocol method, which has not been implemented for that datatype, is called and clojure.lang.Compiler$CompilerException is thrown.
So far I have:
(defmacro catch-compiler-error
[body]
(try
(eval body)
(catch Exception e e)))
But of course, I've been told that eval is evil and that you don't typically need to use it. Is there a way to implement this without using eval?
I'm inclined to believe that eval is appropriate here since I specifically want the code to be evaluated at runtime and not at compile time.
Macros are expanded at compile time. They don't need to eval code; rather, they assemble the code that will later be evaluated at runtime. In other words, if you want to make sure that the code passed to a macro is evaluated at runtime and not at compile time, that tells you that you absolutely should not eval it in the macro definition.
The name catch-compiler-error is a bit of a misnomer with that in mind; if the code that calls your macro has a compiler error (a missing parenthesis, perhaps), there's not really anything your macro can do to catch it. You could write a catch-runtime-error macro like this:
(defmacro catch-runtime-error
  [& body]
  `(try
     ~@body
     (catch Exception e#
       e#)))
Here's how this macro works:
1. Take in an arbitrary number of arguments and store them in a sequence called body.
2. Create a list with these elements:
   - The symbol try
   - All the expressions passed in as arguments
   - Another list with these elements:
     - The symbol catch
     - The symbol java.lang.Exception (the qualified version of Exception)
     - A unique new symbol, which we can refer to later as e#
     - That same symbol that we created earlier
This is a bit much to swallow all at once. Let's take a look at what it does with some actual code:
(macroexpand
'(catch-runtime-error
(/ 4 2)
(/ 1 0)))
As you can see, I'm not simply evaluating a form with your macro as its first element; that would both expand the macro and evaluate the result. I just want to do the expansion step, so I'm using macroexpand, which gives me this:
(try
(/ 4 2)
(/ 1 0)
(catch java.lang.Exception e__19785__auto__
e__19785__auto__))
This is indeed what we expected: a list containing the symbol try, our body expressions, and another list with the symbols catch and java.lang.Exception followed by two copies of a unique symbol.
You can check that this macro does what you want it to do by directly evaluating it:
(catch-runtime-error (/ 4 2) (/ 1 0))
;=> #error {
; :cause "Divide by zero"
; :via
; [{:type java.lang.ArithmeticException
; :message "Divide by zero"
; :at [clojure.lang.Numbers divide "Numbers.java" 158]}]
; :trace
; [[clojure.lang.Numbers divide "Numbers.java" 158]
; [clojure.lang.Numbers divide "Numbers.java" 3808]
; ,,,]}
Excellent. Let's try it with some protocols:
(defprotocol Foo
(foo [this]))
(defprotocol Bar
(bar [this]))
(defrecord Baz []
Foo
(foo [_] :qux))
(catch-runtime-error (foo (->Baz)))
;=> :qux
(catch-runtime-error (bar (->Baz)))
;=> #error {,,,}
However, as noted above, you simply can't catch a compiler error using a macro like this. You could write a macro that returns a chunk of code that will call eval on the rest of the code passed in, thus pushing compile time back to runtime:
(defmacro catch-error
  [& body]
  `(try
     (eval '(do ~@body))
     (catch Exception e#
       e#)))
Let's test the macroexpansion to make sure this works properly:
(macroexpand
'(catch-error
(foo (->Baz))
(foo (->Baz) nil)))
This expands to:
(try
(clojure.core/eval
'(do
(foo (->Baz))
(foo (->Baz) nil)))
(catch java.lang.Exception e__20408__auto__
e__20408__auto__))
Now we can catch even more errors, like IllegalArgumentExceptions caused by trying to pass an incorrect number of arguments:
(catch-error (bar (->Baz)))
;=> #error {,,,}
(catch-error (foo (->Baz) nil))
;=> #error {,,,}
However (and I want to make this very clear), don't do this. If you find yourself pushing compile time back to runtime just to try to catch these sorts of errors, you're almost certainly doing something wrong. You'd be much better off restructuring your project so that you don't have to do this.
I'm guessing you've already seen this question, which explains some of the pitfalls of eval pretty well. In Clojure specifically, you definitely shouldn't use it unless you completely understand the issues it raises regarding scope and context, in addition to the other problems discussed in that question.

Function symbol and local variables

The code below prints 10 as expected.
(def x 10)
(let [y 30] (eval (symbol "x")))
The code below generates an exception:
(let [y 20] (eval (symbol "y")))
Unable to resolve symbol: y in this context
which is expected (but confusing!). According to the documentation, symbols defined by let do not belong to any namespace and so can not be resolved through namespace mechanism.
So the question is: what is the equivalent of the function symbol for local variables?
Additionally:
I thought that Clojure compiler internally calls function symbol for each identifier to "intern" it, but as the example above shows it is not the case. Curious what the compiler is actually doing with the local identifiers. I assumed that when I enter x in REPL
x
it is essentially processed as this:
(deref (resolve (symbol "x")))
but again evidently it is not the case for local variables.
PS: The question "Symbols in Clojure" does not cover local variables.
All input to the clojure compiler is read, to form lists, symbols, keywords, numbers, and other readable data (eg. if you use a hash map literal, the compiler will get a hash map).
For example:
user=> (eval '(let [y 20] y))
20
In this case, we give the compiler a list starting with the symbol let (which resolves to a var holding a macro that wraps the special form let*).
When you ask "what should be an equivalent of function symbol for local variables", my first thought is that you are misunderstanding what the function symbol is for. The following is equivalent to my initial example:
user=> (eval (list (symbol "let") [(symbol "y") 20] (symbol "y")))
20
symbol is only incidentally useful for getting a var from a string. In fact this usage is usually a hack, and a sign you are doing something wrong. Its primary purpose is to construct input for the compiler. Writing a form that gets a binding from its lexical scope is best done by writing a function and having the user pass in the value to be used. History shows us that implicitly using locals in the caller's environment is messy and error-prone, and it is not a feature that Clojure explicitly supports (though there are definitely hacks that will work, which will be based on implementation details that are not guaranteed to behave properly in the next release of the language).
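One way to follow that advice, sketched under the assumption that the set of names to look up is known to the caller, is to pass the values in explicitly as a map keyed by symbol. The helper name lookup-binding is hypothetical.

```clojure
(defn lookup-binding
  "Look up the value registered under the name s in an explicit bindings map.
  Hypothetical helper: the caller supplies the map, so no eval is involved."
  [bindings s]
  (get bindings (symbol s)))

(let [y 20]
  ;; The caller decides which locals are visible by listing them here.
  (lookup-binding {'y y} "y"))
;=> 20
```

The map makes the dependency on the caller's locals explicit, instead of relying on the compiler's internal handling of lexical scope.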

Coordinating auto-gensym in nested syntax-quotes in Clojure

In Clojure, you need to use gensym to create symbols for internal use in your macros to keep them hygienic. However, sometimes you need to use the same symbol in nested syntax-quotes. For example, if I want to bind a value to a symbol with let and print it three times in an unrolled loop, I'd do
`(let [x# 1]
   ~@(repeat 3
             `(println x#)))
But that would produce
(clojure.core/let [x__2__auto__ 1]
(clojure.core/println x__1__auto__)
(clojure.core/println x__1__auto__)
(clojure.core/println x__1__auto__))
x# generates a different symbol in the let form than in the println forms nested within it - because they were created from different syntax-quotes.
To solve it, I can generate the symbol beforehand and inject it to the syntax-quotes:
(let [x (gensym)]
  `(let [~x 1]
     ~@(repeat 3
               `(println ~x))))
This will produce the correct result, with the same symbol everywhere needed:
(clojure.core/let [G__7 1]
(clojure.core/println G__7)
(clojure.core/println G__7)
(clojure.core/println G__7))
Now, while it does produce the right result, the code itself looks ugly and verbose. I don't like having to "declare" a symbol, and the injection syntax makes it look like it came from outside the macro, or was calculated somewhere within it. I want to be able to use the auto-gensym syntax, which makes it clear that those are macro-internal symbols.
So, is there any way to use auto-gensym with nested syntax-quotes and make them produce the same symbol?
Auto-gensym'd symbols are only valid within the syntax-quote that defines them, and they don't work in unquoted code because that is not part of the syntax-quote.
Here the symbol x# gets replaced by its gensym because it is within the scope of the syntax-quote:
core> `(let [x# 1] x#)
(clojure.core/let [x__1942__auto__ 1] x__1942__auto__)
And if you unquote it, it no longer gets translated into its gensym:
core> `(let [x# 1] ~@x#)
CompilerException java.lang.RuntimeException: Unable to resolve symbol: x# in this context, compiling:(NO_SOURCE_PATH:1)
Auto-gensyms are a very convenient shortcut within the syntax-quote; everywhere else you need to use gensym directly, as in your later example.
There are other ways to structure this macro so that auto-gensyms will work, though declaring gensymed symbols in a let at the top of a macro is very normal in Clojure and other Lisps as well.
Your method (calling gensym) is the right one.
However in some cases you can get by with a clever use of doto, -> or ->>. See:
`(let [x# 1]
   (doto x#
     ~@(repeat 3 `println)))
More generally you can do the following whenever faced with this situation:
(let [x `x#]
  `(let [~x 1]
     ~@(repeat 3
               `(println ~x))))
To be clear, you create the auto-gensym and bind it outside of the syntax-quoted form, and then inside any nested forms which require it you can just use syntax-unquote.
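For illustration, here is how the explicit gensym pattern might look inside a complete macro. The macro name print-thrice and the counter atom are invented for this sketch; it is one possible shape, not the only one.

```clojure
(defmacro print-thrice
  "Evaluate expr once, then print its value three times in an unrolled loop."
  [expr]
  (let [x (gensym "x")] ; one symbol shared by every nested syntax-quote
    `(let [~x ~expr]
       ~@(repeat 3 `(println ~x)))))

;; Because all three println forms refer to the same generated symbol,
;; the argument expression is evaluated only once.
(def counter (atom 0))
(print-thrice (swap! counter inc)) ; prints 1 three times
@counter ;=> 1
```

Binding the generated symbol in an outer let keeps the expansion hygienic while letting any number of nested syntax-quotes refer to it.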

Which Vars affect a Clojure function?

How do I programmatically figure out which Vars may affect the results of a function defined in Clojure?
Consider this definition of a Clojure function:
(def ^:dynamic *increment* 3)
(defn f [x]
(+ x *increment*))
This is a function of x, but also of *increment* (and also of clojure.core/+(1); but I'm less concerned with that). When writing tests for this function, I want to make sure that I control all relevant inputs, so I do something like this:
(assert (= (binding [*increment* 3] (f 1)) 4))
(assert (= (binding [*increment* -1] (f 1)) 0))
(Imagine that *increment* is a configuration value that someone might reasonably change; I don't want this function's tests to need changing when this happens.)
My question is: how do I write an assertion that the value of (f 1) can depend on *increment* but not on any other Var? Because I expect that one day someone will refactor some code and cause the function to be
(defn f [x]
(+ x *increment* *additional-increment*))
and neglect to update the test, and I would like to have the test fail even if *additional-increment* is zero.
This is of course a simplified example – in a large system, there can be lots of dynamic Vars, and they can get referenced through a long chain of function calls. The solution needs to work even if f calls g which calls h which references a Var. It would be great if it didn't claim that (with-out-str (prn "foo")) depends on *out*, but this is less important. If the code being analyzed calls eval or uses Java interop, of course all bets are off.
I can think of three categories of solutions:
Get the information from the compiler
I imagine the compiler does scan function definitions for the necessary information, because if I try to refer to a nonexistent Var, it throws:
user=> (defn g [x] (if true x (+ *foobar* x)))
CompilerException java.lang.RuntimeException: Unable to resolve symbol: *foobar* in this context, compiling:(NO_SOURCE_PATH:24)
Note that this happens at compile time, and regardless of whether the offending code will ever be executed. Thus the compiler should know what Vars are potentially referenced by the function, and I would like to have access to that information.
Parse the source code and walk the syntax tree, and record when a Var is referenced
Because code is data and all that. I suppose this means calling macroexpand and handling each Clojure primitive and every kind of syntax they take. This looks so much like a compilation phase that it would be great to be able to call parts of the compiler, or somehow add my own hooks to the compiler.
Instrument the Var mechanism, execute the test and see which Vars get accessed
Not as complete as the other methods (what if a Var is used in a branch of the code that my test fails to exercise?) but this would suffice. I imagine I would need to redefine def to produce something that acts like a Var but records its accesses somehow.
(1) Actually that particular function doesn't change if you rebind +; but in Clojure 1.2 you can bypass that optimization by making it (defn f [x] (+ x 0 *increment*)) and then you can have fun with (binding [+ -] (f 3)). In Clojure 1.3 attempting to rebind + throws an error.
Regarding your first point you could consider using the analyze library. With it you can quite easily figure out which dynamic vars are used in an expression:
user> (def ^:dynamic *increment* 3)
user> (def src '(defn f [x]
(+ x *increment*)))
user> (def env {:ns {:name 'user} :context :eval})
user> (->> (analyze-one env src)
expr-seq
(filter (op= :var))
(map :var)
(filter (comp :dynamic meta))
set)
#{#'user/*increment*}
I know that this doesn't answer your question, but wouldn't it be a lot less work to just provide two versions of a function where one version has no free variables, and the other version calls the first one with the appropriate top-level defines?
For example:
(def ^:dynamic *increment* 3)
(defn f
([x]
(f x *increment*))
([x y]
(+ x y)))
This way you can write all your tests against (f x y), which doesn't rely on any global state.
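A short sketch of what such tests could look like, repeating the two-arity definition so the example is self-contained:

```clojure
(def ^:dynamic *increment* 3)

(defn f
  ([x] (f x *increment*)) ; convenience arity: reads the dynamic var
  ([x y] (+ x y)))        ; pure arity: depends only on its arguments

;; Tests against the pure arity need no binding at all.
(assert (= 4 (f 1 3)))
(assert (= 0 (f 1 -1)))

;; The one-argument arity still honours a rebinding of *increment*.
(assert (= 11 (binding [*increment* 10] (f 1))))
```

This does not detect a newly introduced Var in the one-argument arity, but it confines the global dependency to a single thin wrapper that is easy to review.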