How is the set! function implemented in Clojure?

In examples, I see
(set! *unchecked-math* true)
and then operations are done. However, what exactly is the function set!, and how come it is allowed to mutate *unchecked-math*, which is a boolean?

set! is a special form (i.e. neither a function nor a macro) which sets the value of a thread-locally bound dynamic Var, or of a Java instance or static field.
set! is implemented in Java as part of the core language: Var.java on GitHub.
You should read up on Var and set! on clojure.org, as Ankur points out in his comment: http://clojure.org/vars#set

To explain why set! works on *unchecked-math*:
*unchecked-math* is a dynamic Var for which the compiler installs a thread-local binding before it actually starts compiling. It is this thread-local binding that is set to true by (set! *unchecked-math* true). *warn-on-reflection* works similarly.
The initial value of the compiler's binding is simply whatever is obtained by derefing the Var. In particular, if the compiler is called upon to compile code on a thread which already has its own bindings for the dynamic Vars relevant to the compilation process, the compiler will use the values of those bindings; that is, it will still install its own bindings, but it will use the current values.
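To make the thread-local point concrete, here is a minimal sketch using a hypothetical dynamic Var, *flag* (it is not part of Clojure). It assumes the code is evaluated on a thread that has no existing binding for *flag*. Under that assumption set! is rejected when only the root binding exists and accepted inside binding, which mirrors the situation the compiler creates for *unchecked-math*.

```clojure
;; *flag* is a hypothetical Var used only for illustration.
(def ^:dynamic *flag* false)

;; With only a root binding in place, set! is rejected.
(def root-attempt
  (try
    (set! *flag* true)
    (catch IllegalStateException _ :no-thread-binding)))

;; Inside binding there is a thread-local binding, so set! succeeds.
(def bound-attempt
  (binding [*flag* false]
    (set! *flag* true)
    *flag*))

;; After the binding block exits, the root value is visible again.
(def after-binding *flag*)
```

Here root-attempt ends up as :no-thread-binding, bound-attempt as true, and after-binding as false, since set! only ever touched the thread-local binding.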

Related

Stars in the symbols in clojure

Do stars have a specific meaning when used in defining symbols (such as in functions, bindings etc.)? Is it just a normal binding name when I define something like:
(def *clojure* "CLOJURE")
When I def this, I get the following in the REPL:
Warning: *clojure* not declared dynamic and thus is not dynamically rebindable, but its name suggests otherwise. Please either indicate ^:dynamic *clojure* or change the name.
Where can I learn more about the special characters and things like **?
By convention, variables with 'earmuffs' (i.e. enclosed by *s) are dynamic vars which can be rebound using binding and related functions e.g.
(def ^:dynamic *dyn*)
(binding [*dyn* "Hello world!"]
  (println *dyn*))
If you name a variable in this way without making it dynamic, you get the warning you're seeing.
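As a small sketch of what "dynamic" buys you, the following compares a dynamic Var with a plain one. The names *greeting*, plain-greeting and greet are made up for this example.

```clojure
;; Hypothetical Vars for illustration only.
(def ^:dynamic *greeting* "hi")
(def plain-greeting "hi")

(defn greet [] *greeting*)

;; binding changes what greet sees, but only within its dynamic scope.
(def inside (binding [*greeting* "hello"] (greet)))
(def outside (greet))

;; Trying to rebind a non-dynamic Var fails at runtime.
(def plain-attempt
  (try
    (binding [plain-greeting "hello"] plain-greeting)
    (catch IllegalStateException _ :not-dynamic)))
```

inside is "hello", outside is "hi", and plain-attempt is :not-dynamic because binding refuses to rebind a Var that was not declared ^:dynamic.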
Check this Clojure style guide. Earmuffs are just one type of convention, as has already been mentioned by @Lee.

What namespaces Clojure uses for def-ing

According to spec, def should intern the var in the current ns (i.e. *ns*). However, the following code does not look anything like it:
(ns namespace-b)
(defn def_something []
  (ns namespace-a)
  (println *ns*) ; prints namespace-a, as it should
  (def something 1))
(def_something)
(println namespace-b/something) ; prints 1
(println namespace-a/something) ; throws
What am I missing?
Notes:
defn is used just for clarity. Defining and running anonymous function works just as well.
I know that using def inside a function is probably not very idiomatic. However, this is just the extracted essence of a bigger problem I ran into.
The parser already interns the var to the current namespace at compile time, although it won't be bound immediately:
(defn dd [] (def x 0))
x ;; => #<Unbound Unbound: #'user/x>
The relevant piece of code can be found here, with the second parameter to lookupVar triggering the aforementioned interning for non-existing vars here.
The parser then generates an expression that references the previously created var, so the expression logic never leaves the current namespace.
TL;DR: def is something that the compiler handles in a special kind of way.
The key thing to understand about def is that it is a special form handled by the compiler, not a function. This means that it does not resolve the namespace or intern the var at runtime, but beforehand, while the code is being compiled.
If you call a function that calls def, that call to def was already resolved to use the namespace in which the function was defined. Similarly, if you call functions inside a function body, the functions to call are resolved at compile time within the namespace where that function was defined.
If you want to generally bind values to namespaces at runtime, you should use the function intern, which lets you explicitly set the namespace to mutate.
All this said, namespace mutation is just that: it is procedural, it is not thread safe, and it does not have the nice declarative semantics of the other options Clojure makes available. I would strongly suggest finding a way to express your solution that does not involve unsafe runtime mutation.
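A short sketch of both points, assuming it is evaluated form by form at the REPL or from a file. The names define-later, late-value and example.target are hypothetical. The first half shows that the var exists (unbound) as soon as the enclosing function is compiled; the second half uses intern to target a namespace explicitly at runtime.

```clojure
;; Compiling this defn already interns late-value in the current namespace.
(defn define-later [] (def late-value 0))

(def bound-before-call? (bound? #'late-value)) ; var exists but has no value yet
(define-later)
(def bound-after-call? (bound? #'late-value))  ; calling the function binds it

;; intern names the target namespace explicitly, independent of *ns*.
(create-ns 'example.target)
(intern 'example.target 'something 1)
(def looked-up @(ns-resolve 'example.target 'something))
```

bound-before-call? is false, bound-after-call? is true, and looked-up is 1.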

Idiomatic Way to Set & Update Clojure Namespace Flags?

I have a ring/compojure based web API and I need to be able to optionally turn on and off caching (or any flag for that matter) depending on a startup flag or having a param passed in to the request.
I tried having the flag set as a dynamic var:
(def ^:dynamic *cache* true)
(defmacro cache [source record options & body]
  `(let [cachekey# (gen-cachekey ~source ~record ~options)]
     (if-let [cacheval# (if (and (:ttl ~source) ~*cache*) (mc/generic-get cachekey#) nil)]
       cacheval#
       (let [ret# (do ~@body)]
         (if (and (:ttl ~source) ~*cache*) (mc/generic-set cachekey# ret# :ttl (:ttl ~source)))
         ret#))))
...but that only allows me to update the flag within a binding block, which isn't ideal because I would have to wrap every data fetching function, and it didn't allow me to optionally set the flag on startup.
I then tried to set the flag in an atom, which allowed me to set the flag on the start up, and easily update the flag if a certain param was passed to the request, but the update would be changing the flag for all threads and not just the specific request's flag.
What's the most idiomatic way to do something like this in Clojure?
Firstly, unquoting *cache* in your macro definition means that its compile-time value will be included in the compiled output, and rebinding it at runtime will have no effect. If you want the value to be looked up at runtime, you should not unquote *cache*.
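The difference can be seen in a stripped-down sketch. The two macros below are hypothetical and do nothing but report the flag; one unquotes *cache* and the other does not.

```clojure
(def ^:dynamic *cache* true)

;; Unquoted: the value of *cache* at macro-expansion time is baked into the code.
(defmacro cache-on-frozen? [] `(boolean ~*cache*))

;; Not unquoted: the expansion refers to the Var, which is read at runtime.
(defmacro cache-on-live? [] `(boolean *cache*))

(defn frozen [] (cache-on-frozen?))
(defn live [] (cache-on-live?))

(def results
  (binding [*cache* false]
    [(frozen) (live)]))
```

results is [true false]: the frozen version ignores the rebinding, while the live version sees it.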
As for the actual question: if you want the various data fetching functions to react to a cache setting, you'll need to communicate it to them somehow anyway. Additionally, there are the two separate concerns of (1) computing the relevant flag values, (2) making them available to the handler so that it can communicate them to the functions which care.
Computing flag values and making them available to the main handler
For decisions on a per-request basis, examining some incoming parameters and settings, you might want to use a piece of middleware which will determine the correct values of the various flags and assoc them onto the request map. This way handlers living downstream from this piece of middleware will be able to look them up in the request map without knowing how they were computed.
You can of course install multiple pieces of middleware, each responsible for computing a different set of flags.
If you do use middleware, you'll likely want it to handle the default values. In this case, the note about setting defaults at startup in the section on dynamic Vars below may not be relevant.
Finally, if the application-level (global, thread-independent) defaults might change at runtime (as a result of a "turn off all caching" request, perhaps), you can store these in Atoms.
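A minimal sketch of such middleware, written as a plain function so it needs no libraries. The names default-flags, wrap-flags, the :flags request key, the :cache? flag and the :cache request parameter are all assumptions made for this example, and it assumes :params has keyword keys.

```clojure
;; Application-wide defaults; changeable at runtime because they live in an atom.
(def default-flags (atom {:cache? true}))

(defn wrap-flags
  "Ring-style middleware: merges per-request overrides into the defaults
  and attaches the result to the request under :flags."
  [handler]
  (fn [request]
    (let [param (get-in request [:params :cache]) ; assumed parameter name
          flags (cond-> @default-flags
                  (some? param) (assoc :cache? (not= param "false")))]
      (handler (assoc request :flags flags)))))

;; A stand-in handler that just reports the flags it received.
(def app (wrap-flags (fn [request] (:flags request))))

(app {:params {}})               ; uses the defaults
(app {:params {:cache "false"}}) ; caching off for this request only
```

Because the override is stored on the request map rather than in shared state, it affects only the request that carried the parameter.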
Communicating flag values to functions which care
First approach: dynamic Vars
Once you do that, you'll have to communicate the flags to the functions which actually perform operations where the flags are relevant; here dynamic Vars and explicit arguments are the most natural options.
Using a dynamic Var means that you don't have to do it explicitly for every function call involving such functions; instead, you can do it once per request, say. Installing a default value at startup is quite possible too; for example, you could use alter-var-root for that. (Or you could simply define the initial value of the Var in terms of information obtained from the environment.)
NB. if you launch new threads within the scope of a binding block, they will not see the bindings installed by this binding block automatically -- you'll have to arrange for them to be transmitted. The bound-fn macro is useful for creating functions which handle this automatically; see (doc bound-fn) for details.
The idea of using a single map with all flags described below is relevant here too, if perhaps not equally necessary for reasonable convenience; in essence, you'd be using a single dynamic Var instead of many.
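The pieces mentioned in this section can be sketched together as follows. The fetch function and the run-on-new-thread helper are hypothetical, and the startup default is simply assumed to be "off".

```clojure
(def ^:dynamic *cache* true)

;; Startup default, e.g. derived from a command-line flag (assumed to be off here).
(alter-var-root #'*cache* (constantly false))

(defn fetch [] (if *cache* :from-cache :from-store))

(defn run-on-new-thread
  "Runs f on a fresh Thread and returns its result."
  [f]
  (let [result (promise)]
    (doto (Thread. ^Runnable (fn [] (deliver result (f))))
      (.start)
      (.join))
    @result))

(def results
  (binding [*cache* true]
    {:same-thread     (fetch)
     :plain-thread    (run-on-new-thread fetch)
     :bound-fn-thread (run-on-new-thread (bound-fn [] (fetch)))}))
```

Within the binding block, the same thread sees :from-cache, the plain new thread falls back to the root value and returns :from-store, and the bound-fn version carries the binding across and returns :from-cache.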
Second approach: explicit arguments and flag maps
The other natural option is simply to pass in any relevant flags to the functions which need them. If you pass all the flags in a map, you can just assemble all options relevant to a request in a single map and pass it in to all the flag-aware functions without caring which flags any given function needs (as each function will simply examine the map for the flags it cares about, disregarding the others).
With this approach, you'll likely want to split the data fetching functionality into a function to get the value from the cache, a function to get the value from the data store and a flag-aware function which calls one of the other two depending on the flag value. This way you can, for example, test them separately. (Although if the individual functions really are completely trivial, I'd say it's ok to create only the flag-taking version at first; just remember to factor out any pieces which become more complex in the course of development.)
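A sketch of that split, with an atom holding a map standing in for the cache and a plain map standing in for the data store. All names here (fetch-from-cache, fetch-from-store, fetch, :cache?) are illustrative assumptions.

```clojure
;; Two flag-unaware functions that can be tested on their own.
(defn fetch-from-cache [cache k] (get @cache k))
(defn fetch-from-store [store k] (get store k))

(defn fetch
  "Flag-aware entry point: reads only the :cache? flag from the flag map
  and ignores any other flags it contains."
  [{:keys [cache?]} cache store k]
  (if-some [hit (when cache? (fetch-from-cache cache k))]
    hit
    (let [v (fetch-from-store store k)]
      (when cache? (swap! cache assoc k v))
      v)))

(def cache (atom {}))
(def store {:a 1})

(fetch {:cache? false} cache store :a)               ; reads the store, cache untouched
(fetch {:cache? true :verbose? true} cache store :a) ; reads the store, then fills the cache
```

The same flag map can be handed to every flag-aware function; each one destructures only the keys it cares about.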

What's the use of ^:dynamic on a defonce?

Looking at clojure.test source code, I spotted the following:
(defonce ^:dynamic
^{:doc "True by default. If set to false, no test functions will
be created by deftest, set-test, or with-test. Use this to omit
tests when compiling or loading production code."
:added "1.1"}
*load-tests* true)
Is there any benefit or reason behind preventing redefinition (i.e. using defonce) of a var which is marked as ^:dynamic?
defonce doesn't prevent redefinition in general, but only when one reloads the file. This is typically useful when the var is maintaining some sort of state or context. I believe the usage of defonce here could be an artifact from development of the library, where the developer needs to reload the file many times during development while still wanting to retain the same value.
Since the var is not pointing to a ref, but a direct var, using ^:dynamic is the right choice. Now the code can use set! or binding to change the value in a thread-local way.
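The reload behaviour can be simulated at the REPL. This sketch uses a hypothetical Var, *load-flag*, rather than the real clojure.test/*load-tests*.

```clojure
(defonce ^:dynamic *load-flag* true)

;; Simulate a change made at the REPL during development.
(alter-var-root #'*load-flag* (constantly false))

;; Simulate reloading the file: defonce sees a root value and does nothing.
(defonce ^:dynamic *load-flag* true)
(def after-defonce *load-flag*)

;; A plain def, by contrast, replaces the root value.
(def ^:dynamic *load-flag* true)
(def after-def *load-flag*)

;; Because the Var is dynamic, binding still works on it.
(def inside-binding (binding [*load-flag* false] *load-flag*))
```

after-defonce is false (the earlier change survived the "reload"), after-def is true, and inside-binding is false.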

Transforming Lisp to C++

I am working on a toy language, based on Lisp (a very small subset of Scheme), that compiles to C++. I am trying to figure out how to represent a let expression:
(let ((var 10)
(test 12))
(+ 1 1)
var)
At first I thought I would execute all the expressions and then return the last one, but returning will kill my ability to nest let expressions. What would be the way to go for representing let?
Also, any resources on source-to-source transformation are appreciated. I have googled, but all I could find is the 90 min Scheme compiler.
One way to expand let is to treat it as a lambda:
((lambda (var test) (+ 1 1) var) 10 12)
Then, transform this to a function and a corresponding call in C++:
int lambda_1(int var, int test) {
1 + 1;
return var;
}
lambda_1(10, 12);
So in a larger context:
(display (let ((var 10)
(test 12))
(+ 1 1)
var))
becomes
display(lambda_1(10, 12));
There are a lot more details, such as needing to access lexical variables outside the let from within the let. Since C++ doesn't have lexically nested functions (unlike Pascal, for example), this will require additional implementation.
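The let-to-lambda rewrite described above is itself a small source-to-source transformation. As a sketch only, here it is written in Clojure, operating on the Scheme code as quoted list data. The function names let->lambda and expand-lets are made up, and the code assumes well-formed let forms whose bindings are all (name init) pairs.

```clojure
(require '[clojure.walk :as walk])

(defn let->lambda
  "Rewrites (let ((n1 e1) (n2 e2) ...) body...) into
  ((lambda (n1 n2 ...) body...) e1 e2 ...)."
  [[_ bindings & body]]
  (let [names (map first bindings)
        inits (map second bindings)]
    (cons (concat (list 'lambda names) body) inits)))

(defn expand-lets
  "Applies let->lambda to every let form in expr, innermost first,
  so nested lets are handled as well."
  [expr]
  (walk/postwalk
    (fn [form]
      (if (and (seq? form) (= 'let (first form)))
        (let->lambda form)
        form))
    expr))

(expand-lets '(display (let ((var 10) (test 12)) (+ 1 1) var)))
;; => (display ((lambda (var test) (+ 1 1) var) 10 12))
```

Since each let becomes an ordinary expression (a call), nesting needs no special treatment in the rewrite itself; the remaining work is compiling lambda and the call.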
I'll try to explain a naive approach to compiling nested
lambdas. Since Greg's explanation of expanding let into a lambda
is very good, I won't address let at all, I'll assume that let is
a derived form or macro and is expanded into a lambda form that is
called immediately.
Compiling Lisp or Scheme functions directly into C or C++ functions
will be tricky due to the issues other posters raised. Depending on
the approach, the resulting C or C++ won't be recognizable (or even
very readable).
I wrote a Lisp-to-C compiler after finishing Structure and Interpretation of Computer Programs (it's one of the final exercises, and actually I cheated and just wrote a translator from SICP byte code to C). The subset of C that it emitted didn't use C functions to handle Lisp functions at all. This is because the
register machine language in chapter 5 of SICP is really lower level
than C.
Assuming that you have some form of environments, which bind names to values, you can define the crux of function calling like this: extend the environment which the function was defined in with the formal parameters bound to the arguments, and then evaluate the body of the function in this new environment.
In SICP's compiler, the environment is held in a global variable, and there are other
global variables holding the argument list for a function call, as
well as the procedure object that is being called (the procedure object includes a pointer to the environment in which it was defined), and a label to jump to when a function returns.
Keep in mind that when you are compiling a lambda expression, there
are two syntactic components you know at compile-time: the formal
parameters and the body of the lambda.
When a function is compiled, the emitted code looks something like
this pseudocode:
some-label
*env* = definition env of *proc*
*env* = extend [formals] *argl* *env*
result of compiling [body]
...
jump *continue*
... where *env* and *argl* are the global variables holding the
environment and argument list, and extend is some function (this can
be a proper C++ function) that extends the environment *env* by
pairing up names in [formals] with values in *argl*.
Then, when the compiled code is run, and there is a call to this
lambda somewhere else in your code, the calling convention is to put
the result of evaluating the argument list into the *argl* variable, put the return label in the *continue* variable, and then jump to some-label.
In the case of nested lambdas, the emitted code would look something
like this:
some-label
*env* = definition env of *proc*
*env* = extend [formals-outer] *argl* *env*
another-label
*env* = definition env of *proc*
*env* = extend [formals-inner] *argl* *env*
result of compiling [body-inner]
...
jump *continue*
rest of result of compiling [body-outer]
... somewhere in here there might be a jump to another-label
jump *continue*
This is a bit difficult to explain, and I'm sure I've done a muddled
job of it. I can't think of a decent example that doesn't involve me basically sloppily describing the whole chapter 5 of SICP. Since I spent the time to write this answer, I'm going to post it, but I'm very sorry if it's hopelessly confusing.
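For readers who want something executable, the core idea (extend the definition environment with the formals bound to the arguments, then run the body in that environment) can be modelled in a few lines. This is a deliberately simplified sketch, not SICP's register machine: environments are immutable maps, a Clojure function of the environment stands in for the "result of compiling [body]", and all names (extend-env, make-procedure, call-procedure) are hypothetical.

```clojure
(defn extend-env
  "Returns env extended with each formal parameter bound to its argument."
  [formals argl env]
  (merge env (zipmap formals argl)))

(defn make-procedure
  "A procedure object remembers its formals, its body and its definition env."
  [formals body env]
  {:formals formals :body body :env env})

(defn call-procedure
  "Extends the procedure's definition env with the arguments, then runs the body."
  [proc argl]
  (let [env (extend-env (:formals proc) argl (:env proc))]
    ((:body proc) env)))

;; Models (lambda (x) (lambda (y) (+ x y))): the inner procedure captures
;; the environment in which it was created.
(def make-adder
  (make-procedure '[x]
                  (fn [outer-env]
                    (make-procedure '[y]
                                    (fn [inner-env]
                                      (+ (inner-env 'x) (inner-env 'y)))
                                    outer-env))
                  {}))

(def add-10 (call-procedure make-adder [10]))
(call-procedure add-10 [5]) ; => 15
```

The inner body can see x only because the inner procedure object carries the outer environment with it, which is the part that plain C or C++ functions do not give you for free.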
I strongly recommend SICP and Lisp in Small Pieces.
SICP covers metacircular interpretation for beginners, as well as a number of variants on the interpreter, and a byte code compiler which I have managed to obfuscate and mangle above. That's just the last two chapters; the first three chapters are just as good. It's a wonderful book. Absolutely read it if you haven't yet.
L.i.S.P includes a number of interpreters written in Scheme, a compiler to byte code and a compiler to C. I'm in the middle of it and can say with confidence it's a deep, rich book well worth the time of anyone interested in the implementation of Lisp. It may be a bit dated by this point, but for a beginner like me, it's still valuable. It's more advanced than SICP though, so beware. It includes a chapter in the middle on denotational semantics which went basically right over my head.
Some other notes:
Darius Bacon's self-hosting Lisp to C compiler
lambda lifting, which is a more advanced technique that I think Marc Feeley uses
If you're looking for tools to help with source-to-source translation, I'd recommend ANTLR. It is most excellent.
However, you'll need to think about how to translate from a loosely-typed language (Lisp) to a less-loosely-typed language (C). For example, in your question, what is the type of 10? A short? An int? A double?