What does the with-open macro do behind the scenes in Clojure?

Visually, with-open looks similar to let. I know with-open is for a different purpose but I cannot find a clear answer as to what with-open is doing. And, what is the first argument in with-open?
The documentation says this:
"bindings => [name init ...]
Evaluates body in a try expression with names bound to the values
of the inits, and a finally clause that calls (.close name) on each
name in reverse order."
I do not understand this. I would really appreciate if someone explained what with-open actually does?

What does it do?
A macroexpanded example, reformatted and with the explicit clojure.core/ prefixes removed:
(macroexpand-1
  '(with-open [reader (some-fn-that-creates-a-reader)]
     (read-stuff reader)))
=>
(let [reader (some-fn-that-creates-a-reader)]
  (try
    ;; This inner `with-open` has no bindings left, so it is a no-op like `(let [] ...)`: an artifact of the implementation.
    (with-open [] (read-stuff reader))
    (finally (. reader close))))
As you can see, it works like let, except that it wraps the body in a try form and closes the values bound in the binding vector in the finally form. Unlike let, with-open accepts only plain symbols as binding names, so destructuring is not allowed.
When should I use it?
When you need something to be closed at the end, regardless of whether the code in the body succeeded.
It's a common pattern for reading/writing files with an explicit reader/writer or for using other IO that needs to be opened and closed explicitly.
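To make the try/finally behaviour concrete, here is a minimal sketch. The `tracked` helper and the `:a`/`:b` labels are hypothetical, introduced only to record when each resource is closed. The body throws on purpose, to show that the resources are still closed, and in reverse binding order.

```clojure
(import 'java.io.Closeable)

(defn close-order
  "Opens two hypothetical resources with with-open, lets the body fail,
  and returns the labels in the order in which they were closed."
  []
  (let [log     (atom [])
        ;; tracked is an illustrative helper, not part of Clojure:
        ;; it builds a Closeable that records its label when closed.
        tracked (fn [label]
                  (reify Closeable
                    (close [_] (swap! log conj label))))]
    (try
      (with-open [^Closeable a (tracked :a)
                  ^Closeable b (tracked :b)]
        (throw (ex-info "body failed" {})))
      (catch clojure.lang.ExceptionInfo _
        nil))
    @log))

(close-order) ;; => [:b :a]
```

The last-opened resource is closed first, which matters when one resource wraps another (for example a buffered reader around a file stream).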

Related

In Clojure, why do you have to use parenthesis when def'ing a function and use cases of let

I'm starting to learn clojure and I've stumbled upon the following, when I found myself declaring a "sum" function (for learning purposes) I wrote the following code
(def sum (fn [& args] (apply + args)))
I understand that I defined the symbol sum as containing that fn, but why do I have to enclose the fn in parentheses? Isn't the compiler calling that function upon definition instead of when someone actually invokes it? Maybe it's just my imperative brain talking.
Also, what are the use cases of let? Sometimes I stumble on code that use it and other code that don't, for example on the Clojure site there's an exercise to use the OpenStream function from the Java Interop, I wrote the following code:
(defn http-get
  [url]
  (let [url-obj (java.net.URL. url)]
    (slurp (.openStream url-obj))))
(http-get "https://www.google.com")
whilst they wrote the following on the clojure site as an answer
(defn http-get [url]
  (slurp
    (.openStream
      (java.net.URL. url))))
Again, maybe it's just my imperative brain talking, the need to have a "variable" or an "object" to store something before using it, but I don't quite understand when I should use let and when I shouldn't.
To answer both of your questions:
1.
(def sum (fn [& args] (apply + args)))
Using def here is unusual; to define a function you would normally use defn. Since you used def, note that def binds a name to a value, and the value of a (fn ...) form is a function. The parentheses around fn evaluate the fn form, which creates the function; they do not call the function it creates. So you bound the name sum to the function produced by evaluating (fn [& args] (apply + args)).
You could have used the more traditional (defn sum [& args] (apply + args))
2.
While using let sometimes makes sense for readability (separating steps outside their nested use) it is sometimes required when you want to do something once and use it multiple times. It binds the result to a name within a specified context.
We can look at the following example and see that without let it becomes harder to write (function is for demonstration purposes):
(let [db-results (query "select * from table")] ;; note: query is not a pure function
  ;; do stuff with db-results
  (f db-results)
  ;; return db-results
  db-results)
This simply reuses, in multiple places, a return value (db-results) from a function that you usually only want to run once. So let can be used for style, as in the example you've given, but it's also very useful for reusing a value within some context.
Both def and defn define a global symbol, sort of like a global variable in Java, etc. Also, (defn xxx ...) is a (very common) shortcut for (def xxx (fn ...)). So, both versions will work exactly the same way when you run the program. Since the defn version is shorter and more explicit, that is what you will do 99% of the time.
Typing (let [xxx ...] ...) defines a local symbol, which cannot be seen by code outside of the let form, just like a local variable (block-scope) in Java, etc.
Just as in Java, whether to introduce a local variable like url-obj is optional. It will make no difference to the running program. You must answer the question, "Which version makes my code easier to read and understand?" This part is no different than Java.
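A small sketch of both points, under stated assumptions: `sum-a`, `sum-b`, `fake-query` and `query-count` are hypothetical names, and `fake-query` merely stands in for a side-effecting call such as a database query. It shows that the def and defn forms behave the same, and that let runs the expensive call once while its result is used twice.

```clojure
;; Two equivalent ways of defining the same function.
(def sum-a (fn [& args] (apply + args)))
(defn sum-b [& args] (apply + args))

;; Hypothetical stand-in for a side-effecting call such as a database query;
;; the counter records how many times it actually runs.
(def query-count (atom 0))

(defn fake-query []
  (swap! query-count inc)
  [1 2 3])

(defn summarize
  "Runs fake-query once and reuses its result twice via a let-bound local."
  []
  (let [rows (fake-query)]
    {:count (count rows)
     :total (apply sum-b rows)}))

(sum-a 1 2 3) ;; => 6
(sum-b 1 2 3) ;; => 6
(summarize)   ;; => {:count 3, :total 6}
@query-count  ;; => 1
```

Without the let, `(fake-query)` would have to appear twice in `summarize`, and the side effect would run twice.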

Use of ^ in clojure function parameter definition

(defn lines
  "Given an open reader, return a lazy sequence of lines"
  [^java.io.BufferedReader reader]
  (take-while identity (repeatedly #(.readLine reader))))
What does this line mean? -> [^java.io.BufferedReader reader]
Also, I know this is a dumb question, but can you show me the documentation where I could read about this myself, so that I don't have to ask it here? :)
You can find documentation here:
https://clojure.org/reference/java_interop#typehints
Clojure supports the use of type hints to assist the compiler in avoiding reflection in performance-critical areas of code. Normally, one should avoid the use of type hints until there is a known performance bottleneck. Type hints are metadata tags placed on symbols or expressions that are consumed by the compiler. They can be placed on function parameters, let-bound names, var names (when defined), and expressions:
(defn len [x]
  (.length x))
(defn len2 [^String x]
  (.length x))
...
Once a type hint has been placed on an identifier or expression, the compiler will try to resolve any calls to methods thereupon at compile time. In addition, the compiler will track the use of any return values and infer types for their use and so on, so very few hints are needed to get a fully compile-time resolved series of calls.
You should also check out:
https://clojure.org/guides/weird_characters
https://clojure.org/reference/reader
And never, ever fail to keep open a browser tab to The Clojure CheatSheet
You may also wish to review this answer.
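As a rough illustration of what the `^` does, the sketch below reads a hinted symbol and inspects its metadata, then exercises the `lines` function from the question. The `demo-lines` helper and the use of a `StringReader` in place of a real file are assumptions made for the demo.

```clojure
(import '(java.io BufferedReader StringReader))

;; The reader attaches ^Type as :tag metadata on the symbol that follows it.
(:tag (meta (read-string "^java.io.BufferedReader reader")))
;; => java.io.BufferedReader

(defn lines
  "Given an open reader, return a lazy sequence of lines."
  [^BufferedReader reader]
  (take-while identity (repeatedly #(.readLine reader))))

;; StringReader stands in for a real file here (an assumption for the demo).
(defn demo-lines []
  (with-open [rdr (BufferedReader. (StringReader. "first\nsecond"))]
    ;; doall realizes the lazy sequence before the reader is closed.
    (doall (lines rdr))))

(demo-lines) ;; => ("first" "second")
```

The hint does not change what the function returns; it only lets the compiler resolve `.readLine` at compile time instead of by reflection.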

Eval with local bindings function

I'm trying to write a function which takes a sequence of bindings and an expression and returns the result.
The sequence of bindings is formatted thus: ([:bind-type [bind-vec]] ... ) where bind-type is either :let or :letfn. For example:
([:let [a 10 b 20]] [:letfn [(foo [x] (inc x))]] ... )
And the expression is just a regular Clojure expression, e.g. (foo (+ a b)), so together this example pair of inputs would yield 31.
Currently I have this:
(defn wrap-bindings
  [[[bind-type bind-vec :as binding] & rest] expr]
  (if binding
    ;; let and letfn accept the binding vectors as given above;
    ;; letfn* expects a different name/fn layout.
    (let [bind-op (case bind-type :let 'let :letfn 'letfn)]
      `(~bind-op ~bind-vec ~(wrap-bindings rest expr)))
    expr))
(defn eval-with-bindings
  ([bindings expr]
   (eval (wrap-bindings bindings expr))))
I am not very experienced with Clojure and have been told that use of eval is generally bad practice. I do not believe that I can write this as a macro since the bindings and expression may only be given at run-time, so what I am asking is: is there a more idiomatic way of doing this?
eval is almost never the answer, though there are rare exceptions. In this case you meet the criteria because:
since the bindings and expression may only be given at run-time
You want arbitrary code to be supplied and run while the program is running.
The binding forms can take any data as their input, even data from elsewhere in the program.
So your existing approach using eval is appropriate given the constraints of the question, at least as I understand it. Perhaps the requirements can be changed so that the expressions are defined in advance, which would remove the need for eval; if not, I'd suggest keeping what you have.
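Because the wrapped form is plain data, it can be built and inspected before it is handed to eval. The following is a sketch under stated assumptions, not the asker's code: `bindings->form` is a hypothetical name, it builds the nesting with reduce, and it uses the public let and letfn macros as operators so that the binding vectors can be passed through unchanged.

```clojure
(defn bindings->form
  "Nests expr inside one binding form per [bind-type bind-vec] pair,
  with the first pair outermost. Returns plain data; nothing is evaluated."
  [bindings expr]
  (reduce (fn [body [bind-type bind-vec]]
            (let [op (case bind-type
                       :let   'let
                       :letfn 'letfn)]
              (list op bind-vec body)))
          expr
          (reverse bindings)))

(def example-form
  (bindings->form '([:let [a 10 b 20]]
                    [:letfn [(foo [x] (inc x))]])
                  '(foo (+ a b))))

example-form
;; => (let [a 10 b 20] (letfn [(foo [x] (inc x))] (foo (+ a b))))

(eval example-form) ;; => 31
```

Keeping form construction separate from evaluation makes the generated code easy to print and test. Since eval runs whatever it is given, the bindings and expression should come from a trusted source.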

Clojure style, anything like -> with doto semantics?

I'm starting to use clojure to test a java library. So my question is more "what is the clojure way of doing this".
I have a lot of code that looks like the following
...
(with-open [fs (create-filesystem)]
  (let [root (.getFile fs path)]
    (is (.isDirectory root))
    (is (.isComplete root)))
  (let [suc (.getFile fs (str path "/_SUCCESS"))]
    (is (.isFile suc))
    (is (.isComplete suc))))
Since this is a java object, I need to verify a set of properties are true on the object. I know about doto that will let me do things like
(doto (.getFile fs path)
  (.setPath path2)
  (.setName name2))
and -> will let me have a list of partial functions and have each result pass through each function. I've been thinking that something like -> that keeps passing the same object, as doto does, would help with these tests. Would something like this be a good way to do this, or am I not really doing this the Clojure way?
Thanks for your time!
You can use doto:
(doto (.getFile fs path) (.setPath path2) (.setName name2))
For a more idiomatic style, it is also worth learning the other threading macros such as cond->, as-> and some->.
There's no macro that does exactly what you want. Only .. and doto are object-specific macros.
Since your question is really about "the Clojure way", I recommend this style guide:
https://github.com/bbatsov/clojure-style-guide
Yes, you can use the .. macro for Java interop with threading semantics: each method is called on the result of the previous call.
Example:
(.. (SomeObject/newBuilder)
    (withId 1)
    (withFooEnabled true)
    (build))
There is nothing wrong with using a let as you have; it is a very direct and clear implementation of what you want to do. I think it is the best approach :)
But it is interesting to examine how you could achieve this with Clojure built in macros too:
(doto (.getFile fs path)
  (-> (.isDirectory) (is))
  (-> (.isComplete) (is)))
doto is special in that it propagates the initial value. It takes subsequent forms and inserts the initial value into the second position of the forms.
-> allows you to arrange forms inside out... which means you can combine it with doto such that the initial value is threaded through whatever layers of forms you would like to execute on it.
Now doto and -> are macros, so the syntax can be a little confusing... if you prefer to use functions then juxt is handy for calling multiple functions:
((juxt #(is (.isDirectory %)) #(is (.isComplete %))) (.getFile fs path))
But for this situation it's quite ugly :)
Coming back to your original question, you can of course also create your own syntax with a macro... I describe one approach here: http://timothypratley.blogspot.pt/2016/05/composing-test-assertions-in-pipeline.html which chains multiple assertions in a convenient way.
I think for the example you describe sticking with a let form is the best way as it is most obvious... the other forms are shown here to discuss the options available, some of which might be appropriate in other circumstances.
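For a runnable version of the doto plus -> combination, here is a sketch that substitutes java.io.File and the JVM temp directory for the question's filesystem object (an assumption, since that library is not shown). `check-temp-dir` and `dir-checks` are hypothetical names.

```clojure
(require '[clojure.test :refer [is]])
(import 'java.io.File)

(defn check-temp-dir
  "Asserts two properties of the JVM temp directory and returns the File.
  java.io.File stands in for the question's file object (an assumption)."
  []
  (doto (File. (System/getProperty "java.io.tmpdir"))
    ;; doto inserts the File as the first argument of ->, so each line
    ;; expands to (is (.exists file)) and (is (.isDirectory file)).
    (-> (.exists) (is))
    (-> (.isDirectory) (is))))

;; The same checks as plain functions collected with juxt.
(def dir-checks
  (juxt #(.exists ^File %) #(.isDirectory ^File %)))

(dir-checks (check-temp-dir)) ;; => [true true] when the temp directory exists
```

Because doto returns the original object, the result of `check-temp-dir` can be passed straight on to further checks.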

Best Practice for globals in clojure, (refs vs alter-var-root)?

I've found myself using the following idiom lately in clojure code.
(def *some-global-var* (ref {}))
(defn get-global-var []
  @*some-global-var*)
(defn update-global-var [val]
  (dosync (ref-set *some-global-var* val)))
Most of the time this isn't even multi-threaded code that might need the transactional semantics that refs give you. It just feels like refs are for more than threaded code, and are basically for any global that requires mutability. Is there a better practice for this? I could try to refactor the code to just use binding or let, but that can get particularly tricky for some applications.
I always use an atom rather than a ref when I see this kind of pattern - if you don't need transactions, just a shared mutable storage location, then atoms seem to be the way to go.
e.g. for a mutable map of key/value pairs I would use:
(def state (atom {}))
(defn get-state [key]
  (@state key))
(defn update-state [key val]
  (swap! state assoc key val))
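A short usage sketch of the same idea, with hypothetical names (`app-state`, `record-visit!`, `visits`): swap! applies a function to the current value and stores the result, so an update that depends on the old value needs no dosync.

```clojure
;; Hypothetical global state holding a single counter.
(def app-state (atom {:visits 0}))

(defn record-visit!
  "Atomically increments the :visits counter and returns the new map."
  []
  (swap! app-state update :visits inc))

(defn visits
  "Reads the current counter by dereferencing the atom."
  []
  (:visits @app-state))

(record-visit!) ;; => {:visits 1}
(record-visit!) ;; => {:visits 2}
(visits)        ;; => 2
```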
Your functions have side effects. Calling them twice with the same inputs may give different return values depending on the current value of *some-global-var*. This makes things difficult to test and reason about, especially once you have more than one of these global vars floating around.
People calling your functions may not even know that your functions are depending on the value of the global var, without inspecting the source. What if they forget to initialize the global var? It's easy to forget. What if you have two sets of code both trying to use a library that relies on these global vars? They are probably going to step all over each other, unless you use binding. You also add overheads every time you access data from a ref.
If you write your code side-effect free, these problems go away. A function stands on its own. It's easy to test: pass it some inputs, inspect the outputs, they'll always be the same. It's easy to see what inputs a function depends on: they're all in the argument list. And now your code is thread-safe. And probably runs faster.
It's tricky to think about code this way if you're used to the "mutate a bunch of objects/memory" style of programming, but once you get the hang of it, it becomes relatively straightforward to organize your programs this way. Your code generally ends up as simple as or simpler than the global-mutation version of the same code.
Here's a highly contrived example:
(def *address-book* (ref {}))
(defn add [name addr]
  (dosync (alter *address-book* assoc name addr)))
(defn report []
  (doseq [[name addr] @*address-book*]
    (println name ":" addr)))
(defn do-some-stuff []
  (add "Brian" "123 Bovine University Blvd.")
  (add "Roger" "456 Main St.")
  (report))
Looking at do-some-stuff in isolation, what the heck is it doing? There are a lot of things happening implicitly. Down this path lies spaghetti. An arguably better version:
(defn make-address-book [] {})
(defn add [addr-book name addr]
  (assoc addr-book name addr))
(defn report [addr-book]
  (doseq [[name addr] addr-book]
    (println name ":" addr)))
(defn do-some-stuff []
  (let [addr-book (make-address-book)]
    (-> addr-book
        (add "Brian" "123 Bovine University Blvd.")
        (add "Roger" "456 Main St.")
        (report))))
Now it's clear what do-some-stuff is doing, even in isolation. You can have as many address books floating around as you want. Multiple threads could have their own. You can use this code from multiple namespaces safely. You can't forget to initialize the address book, because you pass it as an argument. You can test report easily: just pass the desired "mock" address book in and see what it prints. You don't have to care about any global state or anything but the function you're testing at the moment.
If you don't need to coordinate updates to a data structure from multiple threads, there's usually no need to use refs or global vars.
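To illustrate the claim that report is easy to test once it takes the address book as an argument, here is one possible sketch. `report-lines` is a hypothetical helper; it captures the printed output with with-out-str and compares it as a set of lines so that map ordering does not matter.

```clojure
(require '[clojure.string :as str])

(defn add [addr-book name addr]
  (assoc addr-book name addr))

(defn report [addr-book]
  (doseq [[name addr] addr-book]
    (println name ":" addr)))

(defn report-lines
  "Hypothetical test helper: captures what report prints as a set of lines."
  [addr-book]
  (set (str/split-lines (with-out-str (report addr-book)))))

(report-lines (-> {}
                  (add "Brian" "123 Bovine University Blvd.")
                  (add "Roger" "456 Main St.")))
;; => #{"Brian : 123 Bovine University Blvd." "Roger : 456 Main St."}
```

No global needs to be initialized or reset between tests; each call builds its own address book.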