How Might Tracing A Function Affect What It Does?

I defined a function play like so:
(let [play-switch (fn [c]
                    (condp = c
                      :scales (play-scale)
                      :intervals (play-interval)
                      :chords (play-chord)
                      :inversions (play-inversion)
                      (print "play error, invalid :practice_mode state")))]
  (def ^:dynamic play
    "if no argument, evaluates the appropriate play-<mode> function, based on the current state of @opts.
     if one argument, does the corresponding play-<mode> function."
    (fn
      ([] (play-switch (get @opts :practice_mode)))
      ([method] (play-switch method)))))
When I call play in my application like (play), its intended side effect does not occur. However, when I call it like (clojure.tools.trace/dotrace [play] (play)), the side effect does occur. How could tracing a function affect what it does?

Tracing a function can force the realization of otherwise unrealized lazy sequences.
Such problems are very commonly instances of "the dreaded lazy bug", where the side effects (a.k.a. the work) of one of the functions are produced inside a lazy sequence. This has the frustrating effect that if you call the function directly from the REPL, the side effects happen as the result is printed; if you call it with trace, the lazy sequence is realized when trace prints the result; but in the normal case the side effects never happen, because the lazy sequence is never realized.
Put calls to doall (if you need the result) or dorun (if you don't need the result) around each side-effecting lazy sequence.
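For example, if play-scale were building its sound with map, the fix would look like this (a minimal sketch; play-note and scale-notes are hypothetical names, not from the question):
(defn play-scale []
  ;; map alone would return an unrealized lazy seq and nothing would play;
  ;; dorun forces every step and discards the results.
  (dorun (map play-note scale-notes)))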

Related

Clojure - how macroexpansion works inside of the "some" function

Just when I thought I had a pretty good handle on macros, I came across the source for some which looked a bit odd to me at first glance.
(defn some
  [pred coll]
  (when (seq coll)
    (or (pred (first coll)) (recur pred (next coll)))))
My first instinct was that seems like it would be stack consuming, but then I remembered: "No, dummy, or is a macro so it would simply expand into a ton of nested ifs".
However, mulling it over a bit more, I ended up thinking myself into a corner. At expansion time the function source would look like this:
(defn some
  [pred coll]
  (when (seq coll)
    (let [or__4469__auto__ (pred (first coll))]
      (if or__4469__auto__
        or__4469__auto__
        (recur pred (next coll))))))
Now what's got me confused is that final recur call. I've always thought that macroexpansion occurs prior to runtime, yet here you have to actually call the already-expanded code at runtime in order for the second macroexp .... wait a second, I think I just figured it out.
There is no second macroexpansion, there are no nested if blocks, only the one if block. The call to recur just keeps rebinding pred and coll but the same single block above keeps testing for truth until it finds it, or the collection runs out and nil is returned.
Can someone confirm if this is a correct interpretation? I had initially confused myself thinking that there would be an interleaving of macroexpansion and runtime wherein at runtime the call to recur would somehow result in a new macro call, which didn't make sense since macroexpansion must occur prior to runtime. Now I think I see where my confusion was, there is only ever one macro expansion and the resulting code is used over and over in a loop.
To start with, note that the body of any function is an implicit recur target, just like a loop expression. Also, recur works just like a recursive function call, except it does not use up the stack, because of a compiler trick (that is why loop & recur are "special forms" - they don't follow the rules of normal functions).
Also, remember that when is a macro that expands into an if expression.
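You can check that expansion at the REPL (formatting may differ slightly across Clojure versions):
user=> (macroexpand '(when (seq coll) :body))
(if (seq coll) (do :body))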
Having said all that, you did reach the correct conclusion.
There are two modes of recursion going on here:
The or macro is implicitly recursive at expansion time: its sequence of argument forms provokes it into generating a tree of if forms.
The some function is explicitly recursive at runtime: it walks the single sequence passed as its final argument. The fact that this recursion is expressed with recur is irrelevant to the expansion.
Every argument to the or macro beyond the first generates a nested if form. For example, ...
=> (clojure.walk/macroexpand-all '(or a b c))
(let* [or__5501__auto__ a]
  (if or__5501__auto__ or__5501__auto__
    (let* [or__5501__auto__ b]
      (if or__5501__auto__ or__5501__auto__ c))))
You have two arguments to or, so one if form. As Alan Thompson's excellent answer points out, the surrounding when unwraps into another if form.
You can have as many nested if forms as you like: the leaves of the if tree, all of them, are in tail position. Hence all immediate recursive calls there are recur-able. If the recur call were not in tail position, it would fail to compile.
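You can see the compiler enforce this (a minimal check; the exact message format varies by version):
user=> (fn [x] (inc (recur x)))
CompilerException java.lang.UnsupportedOperationException: Can only recur from tail position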

What are side-effects in predicates and why are they bad?

I'm wondering what is considered to be a side-effect in predicates for fns like remove or filter. There seems to be a range of possibilities. Clearly, if the predicate writes to a file, this is a side-effect. But consider a situation like this:
(def *big-var-that-might-be-garbage-collected* ...)
(let [my-ref *big-var-that-might-be-garbage-collected*]
  (defn my-pred
    [x]
    (some-operation-on my-ref x)))
Even if some-operation-on is merely a query that does not change state, the fact that my-pred retains a reference to *big... changes the state of the system, in that the big var cannot be garbage collected. Is this also considered to be a side-effect?
In my case, I'd like to write to a logging system in a predicate. Is this a side effect?
And why are side-effects in predicates discouraged exactly? Is it because filter and remove and their friends work lazily so that you cannot determine when the predicates are called (and - hence - when the side-effects happen)?
GC is not typically considered when evaluating if a function is pure or not, although many actions that make a function impure can have a GC effect.
Logging is a side effect, as is changing any state in the program or the world. A pure function takes data and returns data, without modifying anything else.
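As an illustration (hypothetical predicates, not from the question):
;; pure: data in, data out
(defn big? [x]
  (> x 100))

;; impure: same return value, but it also changes the world
(defn noisy-big? [x]
  (println "checking" x) ; logging is a side effect
  (> x 100))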
https://softwareengineering.stackexchange.com/questions/15269/why-are-side-effects-considered-evil-in-functional-programming covers why side effects are avoided in functional languages; I found that link helpful.
The problem is determining when, or even whether, the side-effects will occur on any given call to the function.
If you only care that the same inputs return the same answer, you are fine. Side-effects are dependent on how the function is executed.
For example,
(first (filter odd? (range 20)))
; 1
But if we arrange for odd? to print its argument as it goes:
(first (filter #(do (print %) (odd? %)) (range 20)))
It will print 012345678910111213141516171819 before returning 1!
The reason is that filter, where it can, deals with its sequence argument in chunks of 32 elements.
If we take the limit off the range:
(first (filter #(do (print %) (odd? %)) (range)))
... we get a full-size chunk printed: 012345678910111213141516171819202122232425262728293031
Just printing the argument is confusing. If the side effects are significant, things could go seriously awry.
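The flip side: with an unchunked source such as a list, only as many elements are tested as laziness requires (a small sketch of the contrast):
(first (filter #(do (print %) (odd? %)) (apply list (range 20))))
; prints 01, then returns 1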

Side effect optimized out

I am new to Clojure, and at some point I ran into a problem.
I have such code in my program:
(let [ ... ]
  (map (fn [[v f]] (do-side-effect v f)) {:v1 f1, :v2 f2})
  (do-the-job ...))
This do-side-effect can be, for example, println or another side-effecting function like intern. The problem is that the side effect doesn't happen.
But if I change the line to
(println (map (fn [[v f]] (do-side-effect v f)) {:v1 f1, :v2 f2}))
then everything is OK.
So the last idea I came to is that Clojure just optimizes out the map, because it thinks the result is useless since I don't use it.
If this is actually what happens, how can I show Clojure that this form can have side effects, to prevent the compiler from optimizing it out?
And if it's a bug, how can I find where the bug is?
map is lazy. It is not meant to be used directly for side effects, and it only produces values when they are consumed.
You can use dorun to force the values to be realized even if you are not consuming them, or use doseq instead of map. doseq is intended for side effects, and unlike map it won't spend time constructing a result sequence you will never access.
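Applied to the code above (a sketch reusing the question's do-side-effect, f1, and f2):
;; idiomatic: doseq is eager and exists for side effects
(doseq [[v f] {:v1 f1, :v2 f2}]
  (do-side-effect v f))

;; or keep map and force it with dorun, discarding the results
(dorun (map (fn [[v f]] (do-side-effect v f)) {:v1 f1, :v2 f2}))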

couldn't use for loop in go block of core.async?

I'm new to the Clojure core.async library, and I'm trying to understand it through experimenting.
But when I tried:
(let [i (async/chan)] (async/go (doall (for [r [1 2 3]] (async/>! i r)))))
it gives me a very strange exception:
CompilerException java.lang.IllegalArgumentException: No method in multimethod '-item-to-ssa' for dispatch value: :fn
and I tried another code:
(let [i (async/chan)] (async/go (doseq [r [1 2 3]] (async/>! i r))))
there is no compiler exception at all.
I'm totally confused. What happened?
So the Clojure go-block stops translation at function boundaries, for many reasons, but the biggest is simplicity. This is most commonly seen when constructing a lazy seq:
(go (lazy-seq (<! c)))
Gets compiled into something like this:
(go (clojure.lang.LazySeq. (fn [] (<! c))))
Now let's think about this real quick... what should this return? Presumably what you wanted was a lazy seq containing the value taken from c. But <! needs to translate the remaining code of the function into a callback, while LazySeq expects its function to be synchronous. There really isn't a way around this limitation.
So back to your question: if you macroexpand for, you'll see that it doesn't actually loop; instead it expands into a bunch of code that eventually calls lazy-seq, so parking ops don't work inside its body. doseq (and dotimes), however, are backed by loop/recur, and so those will work perfectly fine.
There are a few other places where this might trip you up, with-bindings being one example. Basically, if a macro sticks your core.async parking operations into a nested function, you'll get this error.
My suggestion then is to keep the body of your go blocks as simple as possible. Write pure functions, and then treat the body of go blocks as the places to do IO.
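A sketch of that shape, reusing the question's channel i (the map call stands in for whatever pure work you need):
(let [i (async/chan)]
  (async/go
    ;; pure computation happens in ordinary, synchronous code...
    (doseq [r (map inc [1 2 3])]
      ;; ...while the parking op sits directly in the go body
      (async/>! i r))))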
------------ EDIT -------------
By "stops translation at function boundaries", I mean this: the go block takes its body and translates it into a state machine. Each call to <!, >!, or alts! (and a few others) is considered a state-machine transition where the execution of the block can pause. At each of those points the machine is turned into a callback and attached to the channel. When this macro reaches a fn form, it stops translating. So you can only make calls to <! from inside a go block, not from inside a function nested in the go block.
This is part of the magic of core.async. Without the go macro, core.async code would look a lot like callback hell in other languages.

Delayed evaluation in Clojure

I'm having some trouble understanding how the delay macro works in Clojure. It doesn't seem to do what I expect it to do (that is: delay evaluation), as you can see in this code sample:
; returns the current time
(defn get-timestamp [] (System/currentTimeMillis))
; var should contain the current timestamp after calling "force"
(def current-time (delay (get-timestamp)))
However, evaluating current-time in the REPL appears to immediately evaluate the expression, even without having called force:
user=> current-time
#<Delay@19b5217: 1276376485859>
user=> (force current-time)
1276376485859
Why was the evaluation of get-timestamp not delayed until the first force call?
The printed representation of various objects which appears at the REPL is the product of a multimethod called print-method. It resides in the file core_print.clj in Clojure's sources, which constitutes part of what goes in the clojure.core namespace.
The problem here is that for objects implementing clojure.lang.IDeref -- the Java interface for things deref / @ can operate on -- print-method includes the value behind the object in the printed representation. To this end, it needs to deref the object, and although special provisions are made for printing failed Agents and pending Futures, Delays are always forced.
Actually I'm inclined to consider this a bug, or at best a situation in need of an improvement. As a workaround for now, take extra care not to print unforced delays.
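If you need to inspect a delay without forcing it, realized? (added in Clojure 1.3) reports its status safely; a small sketch:
user=> (def current-time (delay (System/currentTimeMillis)))
#'user/current-time
user=> (realized? current-time) ; does not force the delay
false
user=> (force current-time)     ; first force evaluates and caches the result
1276376485859
user=> (realized? current-time)
true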