Simple macro doesn't seem to execute given forms - clojure

I'm using this basic macro as a building block for other timing macros:
(defmacro time-pure
  "Evaluates expr and returns the time it took.
  Modified the native time macro to return the time taken."
  [expr]
  `(let [start# (current-nano-timestamp)
         ret# ~expr]
     (/ (double (- (current-nano-timestamp) start#)) 1000000.0)))
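(current-nano-timestamp isn't shown here; presumably it's a thin wrapper over System/nanoTime, something like:)
(defn current-nano-timestamp []
  (System/nanoTime)) ; assumed helper, not part of the original question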
I've tested this, and used it in other macros, so I know it works fine.
My problem can be described by the following snippet:
(defmacro time-each [& exprs]
  `(mapv #(time-pure %) '~exprs))
I would expect this to hand each expression to time-pure, which executes and times it, returning the elapsed time for each. When I test it however, it finishes instantly:
(time-each
  (Thread/sleep 500)
  (Thread/sleep 1000))
[0.036571 0.0]
I'm confused by this since I know that (time-pure (Thread/sleep 1000)) will take around a second to return, and all this macro does is delegate to time-pure.
What's causing this? I really have no idea how to properly debug a macro. I used macroexpand-1 to inspect the generated code:
(clojure.pprint/pprint
  (macroexpand-1
    '(time-each
       (Thread/sleep 500)
       (Thread/sleep 1000))))
(clojure.core/mapv
 (fn*
  [p1__1451__1452__auto__]
  (helpers.general-helpers/time-pure p1__1451__1452__auto__))
 '((Thread/sleep 500) (Thread/sleep 1000)))
But nothing really stands out to me.
What's going on here?
(Note, this is a dupe of a question I posted a few minutes ago. I realized the case I was showing was convoluted, so I made up a better example.)

Solution
I think what happens here is that your mapv executes after compile time, i.e. at run time, which is why it can't just "paste" the code list in as an s-expression. I think a better approach would be to keep the mapv out of the syntax quoting, so the vector of time-pure forms is built at expansion time:
(defmacro each-time-mine [& exprs]
  (mapv #(list `time-pure %) exprs)) ; builds [(time-pure expr1) (time-pure expr2) ...] at expansion time
(each-time-mine
  (Thread/sleep 500)
  (Thread/sleep 1000))
[501.580465 1001.196752]
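Expanding the call shows why this works - the macro emits a vector of unevaluated time-pure forms, which the compiler then expands and evaluates normally (namespace prefixes will vary):
(macroexpand-1 '(each-time-mine (Thread/sleep 500) (Thread/sleep 1000)))
;; => [(helpers.general-helpers/time-pure (Thread/sleep 500))
;;     (helpers.general-helpers/time-pure (Thread/sleep 1000))]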
Original
Although eval is usually frowned upon, it seems to solve the problem in this situation:
(defmacro time-pure
  [expr]
  `(let [start# (current-nano-timestamp)
         ret# (eval ~expr)]
     (/ (double (- (current-nano-timestamp) start#)) 1000000.0)))
(defmacro time-each [& exprs]
  `(mapv #(time-pure %) '~exprs))

(time-each
  (Thread/sleep 500)
  (Thread/sleep 1000))
[501.249249 1001.242522]
What happens is that mapv sees the s-expressions as plain lists, and at the moment time-pure wants to execute one, it simply binds ret# to the list itself. So that list needs to be evaled.
There might be better ways to achieve this though.

(defmacro time-each [& exprs]
  `(list ~@(for [expr exprs]
             `(time-pure ~expr))))
You need to be careful not to evaluate an expression outside of a time-pure context, as is done in Arthur's answer, which evaluates each expression and then calls time-pure on the (very fast) operation of "looking at the result". Rather, wrap time-pure around each expression before evaluating it.

"I'm still trying to don't wrap my head around macros."
It takes a while. :) One thing that helped me was macro-expanding:
(macroexpand-1 '(time-each (+ 2 2) (+ 3 3)))
That produces:
(clojure.core/mapv
 (fn* [p1__24013__24014__auto__]
      (rieclj.core/time-pure p1__24013__24014__auto__))
 (quote ((+ 2 2) (+ 3 3))))
The alarming thing there is that time-pure is being passed a quoted list, so it is all just symbols at run time, not macroexpansion time.
Then another thing that helped me was realizing that the beauty of Lispy macros is that they are normal functions that just run at a different time (while the compiler is expanding the source, not at run time). But because they are normal functions we can probe them with print statements just as we probe normal code.
I have modified time-pure to print out its macroexpansion-time argument expr and, in the generated code, to print out the evaluated input, which we now suspect is a list of symbols, '(+ 2 2) in the first case.
(defmacro time-pure
  "Evaluates expr and returns the time it took.
  Modified the native time macro to return the time taken."
  [expr]
  (prn :time-pure-sees expr)
  `(let [start# (now)
         ret# ~expr]
     (prn :ret-is ret# (type (first ret#)))
     (/ (double (- (now) start#)) 1000000.0)))
I print the type of the first element of ret# to drive home that + is a symbol, not a function. Evaluating:
(time-each (+ 2 2) (+ 3 3))
yields:
:time-pure-sees p1__24013__24014__auto__
:ret-is (+ 2 2) clojure.lang.Symbol
:ret-is (+ 3 3) clojure.lang.Symbol
Seeing (+ 2 2) might make you think all is well, but the key is that the right-hand side is '(+ 2 2), so ret# gets bound to that symbolic expression instead of the intended computation.
Again, printing the type of the first element hopefully makes clear that time-pure was working on a list of symbols at runtime.
Sorry if all that was confusing, but the moral is simple: use macroexpand-1 and embedded print statements when macros mess with your head. In time you will internalize the two different times, macroexpansion and run.

Related

Clojure: when to use memoize and when to use delay/force?

I've just started learning Clojure and am trying to understand the difference between two approaches which at first sight seem identical.
(def my-delay2 (delay (do
                        (println "did some work")
                        100)))
so.core=> (force my-delay2)
did some work
100
so.core=> (force my-delay2)
100
(defn vanilla-func [] (println "did some work") 100)
(def func1 (memoize vanilla-func))
so.core=> (func1)
did some work
100
so.core=> (func1)
100
Both approaches do some sort of function memoization. What am I missing?
I've tried to find the explanation on https://clojuredocs.org/clojure.core/delay & https://clojuredocs.org/clojure.core/memoize but couldn't.
delay holds one result and you have to deref to get it.
memoize is an unbounded cache that caches the result depending on the input arguments. E.g.
user=> (def myinc (memoize (fn [x] (println x) (inc x))))
#'user/myinc
user=> (myinc 1)
1
2
user=> (myinc 1)
2
In your (argument-less) example the only difference is that you can use the result directly (no deref needed).
Classic use-cases for delay are things needed later that would block or delay startup, or "hiding" top-level defs from the compiler (e.g. ones that have side effects).
memoize is a classic cache and is best used if the calculation is expensive and the set of input arguments is not excessive. There are other caching options in the Clojure world that allow better configuration (e.g. they are not unbounded).
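For instance, a minimal sketch using the clojure.core.memoize contrib library (one of those "other caching options"; assumes org.clojure/core.memoize is on the classpath):
(require '[clojure.core.memoize :as memo])

;; keep only the 3 most recently used results (LRU eviction)
(def cached-inc
  (memo/lru (fn [x] (println "computing" x) (inc x))
            :lru/threshold 3))

;; or: cached entries expire 5000 ms after being written
(def cached-inc-ttl
  (memo/ttl (fn [x] (inc x)) :ttl/threshold 5000))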

How to implement a parallel logical or with early termination in Clojure

I would like to define a predicate that, taking as input some predicates
with corresponding inputs (they could be given as a lazy sequence of calls),
runs them in parallel and computes the logical or of the results,
in such a way that, the moment a predicate call terminates returning true,
the whole computation also terminates (returning true).
Apart from saving time, this would also help avoid non-termination in some cases (some predicate calls may not terminate).
Actually, interpreting non-termination as a third undefined value,
this predicate simulates the or operation in Kleene's K3 logic
(the join in the initial centered Kleene algebra).
Something similar is presented here for the Haskell family.
Is there any (preferably simple) way to do this in Clojure?
EDIT: I decided to add some clarifications after reading the comments.
(a) First of all, what happens after the thread pool gets exhausted is of less importance. I think creating a thread pool large enough for our needs is a reasonable convention.
(b) The most crucial requirement is that the predicate calls start running in parallel and, once a predicate call terminates returning true, all the other threads running get interrupted. The intended behavior is that:
if there is a predicate call returning true: the parallel or returns true
else if there is a predicate call that does not terminate: the parallel or does not terminate
else: the parallel or returns false
In other words, it behaves like the join in the 3-element lattice given by false<undefined<true, with undefined representing non-termination.
(c) The parallel or should be able to take as input many predicates and many predicate-inputs (each one corresponding to a predicate). But it would be even better if it took as input a lazy sequence. Then, naming the parallel or pany (for "parallel any"), we could have calls like the following:
(pany (map (comp eval list) predicates inputs))
(pany (map (comp eval list) predicates (repeat input)))
(pany (map (comp eval list) (repeat predicate) inputs)) which is equivalent to (pany (map predicate (unchunk inputs)))
As a final remark, I think that it is quite natural to ask for things like pany, a dual pall or a mechanism for building such early-terminating parallel reductions to be easily implementable or even built-in in a parallelism-oriented language like Clojure.
I will define our predicates in terms of a reducing function. Practically, we could reimplement all of the Clojure iteration functions to support this parallel operation, but I'll just use reduce as an example.
I'll define a computation function. I'll just use the same one twice, but nothing stops you from having many. The function counts as "true" once it accumulates more than 1000.
(defn computor [acc val]
  (let [new (+' acc val)]
    (if (> new 1000) (reduced new) new)))

(reduce computor 0 (range))
;; => 1035

(reduce computor 0 (range Long/MIN_VALUE 0))
;; never returns - this is a proxy for a non-returning computation

;; wrap these up in a form suitable for application of reduction
(def predicates [[computor 0 (range)]
                 [computor 0 (range Long/MIN_VALUE 0)]])
Now let's get to the meat of this. I want to take a step in each computation, and if one of the computations completes, I want to return it. In actual fact one step at a time using pmap is very slow - the units of work are too small to be worth threading. Here I've changed things to do 1000 iterations of each unit of work before moving on. You'd probably tune this based on your workload and the cost of a step.
(defn p-or-reducer* [reductions]
  (let [splits (map #(split-at 1000 %) reductions) ;; do at least 1000 iterations per cycle
        complete (some #(if (empty? (second %)) (last (first %))) splits)]
    (or complete (recur (map second splits)))))
I then wrap this in a driver.
(defn p-or [s]
  (p-or-reducer* (map #(apply reductions %) s)))

(p-or predicates)
;; => 1035
Where to insert the CPU parallelism? s/map/pmap/ in p-or-reducer* should do it. I suggest just parallelising the first operation, as this will drive the reducing sequences to compute.
(defn p-or-reducer* [reductions]
  (let [splits (pmap #(split-at 1000 %) reductions) ;; do at least 1000 iterations per cycle
        complete (some #(if (empty? (second %)) (last (first %))) splits)]
    (or complete (recur (map second splits)))))

(def parallelism-tester (conj (vec (repeat 40000 [computor 0 (range Long/MIN_VALUE 0)]))
                              [computor 0 (range)]))

(p-or parallelism-tester) ;; terminates even though the first 40K predicates will not
It's extremely hard to define a performant generic version of this. Without knowing the cost per iteration, an efficient parallelism strategy is hard to derive - if one iteration takes 10s then we'd probably take a single step at a time. If it takes 100ns then we need to take many steps at a time.
Would you consider adopting core.async to handle parallel tasks with async/go or async/thread, and get early return with async/alts!?
For example, to turn the core or function from serial into parallel, we can create a macro (I called it por) that wraps the input forms (or predicates) in async/thread and then does a select (in the socket-select style) over the resulting channels with async/alts!!:
(defmacro por [& fns]
  `(let [[v# c#] (async/alts!!
                   [~@(for [f fns]
                        (list `async/thread f))])]
     v#))
(time
  (por (do (println "running a") (Thread/sleep 30) :a)
       (do (println "running b") (Thread/sleep 20) :b)
       (do (println "running c") (Thread/sleep 10) :c)))
;; running a
;; running b
;; running c
;; "Elapsed time: 11.919169 msecs"
;; => :c
In comparison with the original or (which runs in serial):
(time
  (or (do (println "running a") (Thread/sleep 30) :a)
      (do (println "running b") (Thread/sleep 20) :b)
      (do (println "running c") (Thread/sleep 10) :c)))
;; running a
;; "Elapsed time: 31.642506 msecs"
;; => :a
Note that async/alts!! only returns early; the losing threads keep running in the background, so this alone does not interrupt them as requirement (b) asks.

Using let style destructuring for def

Is there a reasonable way to have multiple def statements happen with destructuring, the same way that let does it? For example:
(let [[rtgs pcts] (->> (sort-by second row)
                       (apply map vector))]
  .....)
What I want is something like:
(defs [rtgs pcts] (->> (sort-by second row)
                       (apply map vector)))
This comes up a lot in the REPL, notebooks and when debugging. Seriously feels like a missing feature so I'd like guidance on one of:
This exists already and I'm missing it
This is a bad idea because... (variable capture?, un-idiomatic?, Rich said so?)
It's just un-needed and I must be suffering from withdrawals from an evil language. (same as: don't mess up our language with your macros)
A super short experiment gives me something like:
(defmacro def2 [[name1 name2] form]
  `(let [[ret1# ret2#] ~form]
     (do (def ~name1 ret1#)
         (def ~name2 ret2#))))
And this works as in:
(def2 [three five] ((juxt dec inc) 4))
three ;; => 3
five ;; => 5
Of course, an "industrial strength" version of that macro might add:
checking that the number of names matches the number of values returned from form
a recursive call to handle more names (can I do that in a macro like this?)
While I agree with Josh that you probably shouldn't have this running in production, I don't see any harm in having it as a convenience at the repl (in fact I think I'll copy this into my debug-repl kitchen-sink library).
I enjoy writing macros (although they're usually not needed) so I whipped up an implementation. It accepts any binding form, like in let.
(I wrote this specs-first, but if you're on clojure < 1.9.0-alpha17, you can just remove the spec stuff and it'll work the same.)
(ns macro-fun
  (:require
    [clojure.spec.alpha :as s]
    [clojure.core.specs.alpha :as core-specs]))

(s/fdef syms-in-binding
  :args (s/cat :b ::core-specs/binding-form)
  :ret (s/coll-of simple-symbol? :kind vector?))

(defn syms-in-binding
  "Returns a vector of all symbols in a binding form."
  [b]
  (letfn [(step [acc coll]
            (reduce (fn [acc x]
                      (cond (coll? x) (step acc x)
                            (symbol? x) (conj acc x)
                            :else acc))
                    acc, coll))]
    (if (symbol? b) [b] (step [] b))))

(s/fdef defs
  :args (s/cat :binding ::core-specs/binding-form, :body any?))

(defmacro defs
  "Like def, but can take a binding form instead of a symbol to
  destructure the results of the body.
  Doesn't support docstrings or other metadata."
  [binding body]
  `(let [~binding ~body]
     ~@(for [sym (syms-in-binding binding)]
         `(def ~sym ~sym))))
;; Usage
(defs {:keys [foo bar]} {:foo 42 :bar 36})
foo ;=> 42
bar ;=> 36
(defs [a b [c d]] [1 2 [3 4]])
[a b c d] ;=> [1 2 3 4]
(defs baz 42)
baz ;=> 42
About your REPL-driven development comment:
I don't have any experience with Ipython, but I'll give a brief explanation of my REPL workflow and you can maybe comment about any comparisons/contrasts with Ipython.
I never use my repl like a terminal, inputting a command and waiting for a reply. My editor supports (emacs, but any clojure editor should do) putting the cursor at the end of any s-expression and sending that to the repl, "printing" the result after the cursor.
I usually have a comment block in the file where I start working, just typing whatever and evaluating it. Then, when I'm reasonably happy with a result, I pull it out of the "repl-area" and into the "real-code".
(ns stuff.core)
;; Real code is here.
;; I make sure that this part always basically works,
;; ie. doesn't blow up when I evaluate the whole file
(defn foo-fn [x]
,,,)
(comment
;; Random experiments.
;; I usually delete this when I'm done with a coding session,
;; but I copy some forms into tests.
;; Sometimes I leave it for posterity though,
;; if I think it explains something well.
(def some-data [,,,])
;; Trying out foo-fn, maybe copy this into a test when I'm done.
(foo-fn some-data)
;; Half-finished other stuff.
(defn bar-fn [x] ,,,)
(keys 42) ; I wonder what happens if...
)
You can see an example of this in the clojure core source code.
The number of defs that any piece of clojure will have will vary per project, but I'd say that in general, defs are not often the result of some computation, let alone the result of a computation that needs to be destructured. More often defs are the starting point for some later computation that will depend on this value.
Usually functions are better for computing a value; and if the computation is expensive, then you can memoize the function. If you feel you really need this functionality, then by all means, use your macro -- that's one of the sellings points of clojure, namely, extensibility! But in general, if you feel you need this construct, consider the possibility that you're relying too much on global state.
Just to give some real examples, I just referenced my main project at work, which is probably 2K-3K lines of clojure, in about 20 namespaces. We have about 20 defs, most of which are marked private and among them, none are actually computing anything. We have things like:
(def path-prefix "/some-path")
(def zk-conn (atom nil))
(def success? #{200})
(def compile* (clojure.core.memoize/ttl compiler {} ...))
(def ^:private nashorn-factory (NashornScriptEngineFactory.))
(def ^:private read-json (comp json/read-str ... ))
Defining functions (using comp and memoize), enumerations, state via atom -- but no real computation.
So I'd say, based on your bullet points above, this falls somewhere between 2 and 3: it's definitely not a common use case that's needed (you're the first person I've ever heard who wants this, so it's uncommon to me anyway); and the reason it's uncommon is because of what I said above, i.e., it may be a code smell that indicates reliance on too much global state, and hence, would not be very idiomatic.
One litmus test I have for much of my code is: if I pull this function out of this namespace and paste it into another, does it still work? Removing dependencies on external vars allows for easier testing and more modular code. Sometimes we need it though, so see what your requirements are and proceed accordingly. Best of luck!

How do I pass a variadic macro to a function?

I am attempting to learn Clojure and am having trouble with macros. I have a simple macro:
(defmacro add [& x] `(apply + (list ~@x)))
and then I have a function that can call multiple operations (add, subtract, multiply, and divide). I get these from stdin but for simplicity I have created a variable holding a list of numbers:
(defn do-op
  [op]
  (def numbers (list 1 2 3))
  (apply op numbers))
In order to call do-op with the appropriate operation I need to pass the macro to do-op in an anonymous function:
(do-op #(add %&))
When I run this code using arguments passed from stdin I get a confusing error:
Cannot cast clojure.lang.ChunkedCons to java.lang.Number
When I execute this in a REPL I get:
Cannot cast clojure.lang.PersistentList to java.lang.Number
I assume that this has something to do with my lack of understanding of how these variadic arguments are getting handled but I am thoroughly stumped. What am I missing?
You can never ever pass a macro as if it was a function, since it doesn't exist after the code is compiled and ready to run.
When you have:
(do-op #(add %&))                ; which is shorthand for
(do-op (fn [& rest] (add rest))) ; then you expand the macro
(do-op (fn [& rest] (apply + (list (UNQUOTE-SPLICE 'rest))))) ; how do you unsplice the symbol rest into separate elements?
The short answer is that rest can never be unspliced, and thus you are using the macro wrong. However, it seems you are just making an alias for +, which can be done by defining a function or just binding the original:
(defn add [& args] (apply + args)) ; making a function that calls + with its arguments
(def add +) ; or just make a binding pointing to the same function as + points to
(do-op add) ; ==> 6
You should keep to using functions as your abstraction. When you need to write something very verbose, and you cannot make it a function because the arguments would be evaluated too early, you have a candidate for a macro. A macro translates a short way of writing something into the longer, more verbose form, without knowing the actual data types behind the variable symbols, since the compiler won't know what they are when expanding the macro. Also, a macro gets expanded once, while the function that contained the macro call might get executed many times.
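A small illustration of that last point (traced-inc is a made-up macro; the println in its body runs at expansion time, not at run time):
(defmacro traced-inc [x]
  (println "expanding!") ; runs once, when the macro is expanded
  `(inc ~x))

(defn f [n] (traced-inc n)) ; prints "expanding!" while f is being compiled
(f 1) ; => 2, no print
(f 2) ; => 3, no print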
EDIT
In your pastebin code I can confirm you never ever need a macro:
(cond
  (= 1 option) (do-op #(add %&))
  ...)
Should just be
(cond
  (= 1 option) (do-op +)
  ...)
No need to wrap it or make a macro for this. Pass the function as-is and it will do what you want.
Macros run at compile time and transform code data, not runtime data.
There is simply no way to generically apply a form evaluating to a sequence (like the symbol numbers) to a macro, because the form needs to be evaluated first, and that can only happen at runtime (after compile time).
This is the full answer. Hopefully it helps you along in understanding macros in Clojure.
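A tiny demonstration of this point (show-arg is a made-up macro for illustration):
(defmacro show-arg [x]
  (prn :macro-sees x (class x)) ; runs at expansion time
  x)

(def numbers (list 1 2 3))

(show-arg numbers)
;; :macro-sees numbers clojure.lang.Symbol
;; => (1 2 3)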
If I understand what you are thinking, you should do it more like this:
(ns clj.core
  (:require
    [tupelo.core :as t]))
(t/refer-tupelo)
(defn add [& x]
  (apply + x))

(def numbers [1 2 3])

(defn do-op
  [op]
  (apply op numbers))

(spyx (do-op add))

(defn do-op-2
  [op vals]
  (apply op vals))

(spyx (do-op-2 add numbers))
to get result:
> lein run
(do-op add) => 6
(do-op-2 add numbers) => 6
Usually we try to avoid having functions access global values like numbers. With do-op-2 we pass the vector of numbers into the function instead.
Note: for (spyx ...) to work, your project.clj must include:
:dependencies [[tupelo "0.9.9"]
               ,,,]
Update
Of course, you don't really need do-op, you could do:
(defn add-all [nums-vec]
(apply + nums-vec))
(spyx (add-all numbers))
;=> (add-all numbers) => 6

Could Clojure do without let?

I find I very rarely use let in Clojure. For some reason I took a dislike to it when I started learning and have avoided using it ever since. It feels like the flow has stopped when let comes along. I was wondering, do you think we could do without it altogether?
You can replace any occurrence of (let [a1 b1 a2 b2 ...] ...) by ((fn [a1 a2 ...] ...) b1 b2 ...), so yes, we could. (Strictly speaking, because let binds sequentially - b2 may refer to a1 - the faithful translation nests one fn per binding.) I am using let a lot though, and I'd rather not do without it.
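For instance, a let with a dependent binding and its faithful nested-fn equivalent:
(let [x 2
      y (* x 3)]
  (+ x y))
;; => 8

;; one fn per binding, so y's init expression can see x:
((fn [x]
   ((fn [y] (+ x y))
    (* x 3)))
 2)
;; => 8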
Let offers a few benefits. First, it allows value binding in a functional context. Second, it confers readability benefits. So while technically one could do away with it (in the sense that you could still program without it), the language would be impoverished without such a valuable tool.
One of the nice things about let is that it helps formalize a common (mathematical) way of specifying a computation, in which you introduce convenient bindings and then give a simplified formula as the result. It's clear the bindings only apply to that "scope", and its tie-in with a more mathematical formulation is useful, especially for more functionally minded programmers.
It's not a coincidence that let blocks occur in other languages like Haskell.
Let is indispensable to me in preventing multiple execution in macros. This version:
(defmacro print-and-run [s-exp]
  `(do (println "running " (quote ~s-exp) "produced " ~s-exp)
       ~s-exp))
would run s-exp twice, which is not what we want:
(defmacro print-and-run [s-exp]
  `(let [result# ~s-exp]
     (println "running " (quote ~s-exp) "produced " result#)
     result#))
fixes this by binding the result of the expression to a name and referring to that result twice.
Because a macro returns an expression that will become part of another expression (macros are functions that produce s-expressions), it needs to create local bindings to prevent multiple execution and to avoid symbol capture.
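Expanding the fixed macro shows the single evaluation (gensym names will differ):
(macroexpand-1 '(print-and-run (Thread/sleep 100)))
;; => (clojure.core/let [result__123__auto__ (Thread/sleep 100)]
;;      (clojure.core/println "running " (quote (Thread/sleep 100))
;;                            "produced " result__123__auto__)
;;      result__123__auto__)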
I think I understand your question; correct me if I'm wrong. Sometimes "let" is used in an imperative programming style. For example,
... (let [x (...)
          y (...x...)
          z (...x...y...)]
      ...x...y...z...) ...
This pattern comes from imperative languages:
... { x = ...;
      y = ...x...;
      ...x...y...; } ...
You avoid this style, and that's why you also avoid "let", don't you?
For some problems the imperative style reduces the amount of code. Furthermore, it is sometimes more efficient to write in Java or C.
Also in some cases "let" just holds values of subexpressions regardless of evaluation order. For example,
(... (let [a (...)
           b (...)]
       (...a...b...a...b...)) ...) ;; still fp style
There are at least two important use cases for let-bindings:
First, using let properly can make your code clearer and shorter. If you have an expression that you use more than once, binding it in a let is very nice. Here's a portion of the standard function map that uses let:
...
(let [s1 (seq c1) s2 (seq c2)]
  (when (and s1 s2)
    (cons (f (first s1) (first s2))
          (map f (rest s1) (rest s2)))))
...
Even if you use an expression only once, it can still be helpful (to future readers of the code) to give it a semantically meaningful name.
Second, as Arthur mentioned, if you want to use the value of an expression more than once, but only want it evaluated once, you can't simply type out the entire expression twice: you need some kind of binding. This would be merely wasteful if you have a pure expression:
user=> (* (+ 3 2) (+ 3 2))
25
but actually changes the meaning of the program if the expression has side-effects:
user=> (* (+ 3 (do (println "hi") 2))
          (+ 3 (do (println "hi") 2)))
hi
hi
25
user=> (let [x (+ 3 (do (println "hi") 2))]
         (* x x))
hi
25
Stumbled upon this recently so ran some timings:
(testing "Repeat vs Let vs Fn"
(let [start (System/currentTimeMillis)]
(dotimes [x 1000000]
(* (+ 3 2) (+ 3 2)))
(prn (- (System/currentTimeMillis) start)))
(let [start (System/currentTimeMillis)
n (+ 3 2)]
(dotimes [x 1000000]
(* n n))
(prn (- (System/currentTimeMillis) start)))
(let [start (System/currentTimeMillis)]
(dotimes [x 1000000]
((fn [x] (* x x)) (+ 3 2)))
(prn (- (System/currentTimeMillis) start)))))
Output
Testing Repeat vs Let vs Fn
116
18
60
'let' wins over the 'pure' functional version here. (Micro-benchmarks like this are sensitive to JIT warm-up; a tool such as Criterium gives more dependable numbers.)