I am writing a custom transducer as an exercise, but I am surprised to see that its 0-arity init function is not called.
Why?
Is it related to which aggregation function I am using? If so, which ones would call the init function, and why don't the others?
(defn inc-xf [xf]
"inc-xf should be equivalent to (map inc)"
(fn
;; init
([]
(println "init") ;; <- this is not called (???)
(xf))
;; step
([result input]
(println "step" result input)
(xf result (inc input)))
;; completion
([result]
(println "completion" result)
(xf result))))
(transduce inc-xf
+
100
[5 5 5])
If you look at the implementation of transduce you can see what happens.
(defn transduce
;; To get the init value, (f) is used instead of ((xform f))
([xform f coll] (transduce xform f (f) coll))
([xform f init coll]
,,,))
Why, however, is more difficult to answer.
Implementing the zero arity is part of the requirements for a transducer, but it is never actually used in any transducing context in clojure.core. On the mailing list there has been a post asking the same question as you, along with a proposed implementation of transduce that actually uses the init arity. The Jira ticket was declined, however, with the explanation:
Rich asked me to decline the ticket because the init arity of the xform should not be involved in the reducing function accumulation.
- Alex Miller
Why, then, is the init arity part of the contract for a transducer if it's not used anywhere? ¯\_(ツ)_/¯
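One way to confirm this behaviour is to count how often the zero arity actually runs. The sketch below uses a hypothetical counting-inc-xf (same shape as inc-xf from the question, with a counter instead of println) and a hypothetical atom init-calls; neither name comes from clojure.core.

```clojure
;; Hypothetical counter: how many times the transducer's init arity ran.
(def init-calls (atom 0))

(defn counting-inc-xf
  "Same shape as inc-xf from the question; counts init calls instead of printing."
  [rf]
  (fn
    ([] (swap! init-calls inc) (rf))
    ([result] (rf result))
    ([result input] (rf result (inc input)))))

;; Explicit init value: the init arity is not needed.
(def with-init (transduce counting-inc-xf + 100 [5 5 5]))

;; No init value: transduce calls (+) itself, not ((counting-inc-xf +)).
(def without-init (transduce counting-inc-xf + [5 5 5]))

;; into supplies the target collection as the init value.
(def via-into (into [] counting-inc-xf [5 5 5]))

(def calls-after-core-fns @init-calls)

;; Only a direct call reaches the init arity.
(def direct-init ((counting-inc-xf +)))
(def calls-after-direct @init-calls)
```

Here with-init is 118, without-init is 18, via-into is [6 6 6], and the counter stays at 0 until the direct call, after which it is 1.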
Given that :post takes a form that gets evaluated later (e.g. {:post [(= 10 %)]}). How could one dynamically pass a 'pre-made' vector of functions to :post?
For example:
(def my-post-validator
[predicate1 predicate2 predicate3])
(defn foo [x]
{:post my-post-validator}
x)
this throws a syntax error
Don't know how to create ISeq from: clojure.lang.Symbol
With my fuzzy understanding, it's because defn is a macro, and the thing that allows the % syntax in :post is that it's quoted internally..?
I thought maybe I could then use a macro to pass a 'literal' of what I wanted evaluated
(defmacro my-post-cond [spec]
'[(assert spec %) (predicate2 %) (predicate n)])
example:
(defn foo [x]
{:post (my-post-cond :what/ever)}
x)
However, this attempt gives the error:
Can't take value of a macro
Is there a way to pass a vector of things to :post rather than having to define it inline?
You can't pass a vector of predefined predicates, but you can combine multiple predicates under a single name and use that name in :post:
(defn my-post-cond [spec val]
(and
;; Not sure if this is exactly what you want,
;; given that `val` becomes an assert message.
(assert spec val)
(predicate2 val)
;; You used `n` - I assume it was supposed to be `%`.
(predicate val)))
(defn foo [x]
{:post [(my-post-cond :what/ever %)]}
x)
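If the goal is to reuse a pre-made collection of predicates, one option (a sketch, not the only way) is to fold them into a single function with every-pred and call that function inside :post. The names my-post-validator (redefined here as a function rather than a vector) and double-it are illustrative only.

```clojure
;; Combine the individual predicates into one predicate function.
(def my-post-validator (every-pred integer? pos? even?))

(defn double-it [x]
  ;; :post still needs a literal vector of forms; % is the return value.
  {:post [(my-post-validator %)]}
  (* 2 x))

(def ok-result (double-it 3))

;; A failing postcondition throws AssertionError (when *assert* is true).
(def failure
  (try
    (double-it -1)
    (catch AssertionError _ :post-failed)))
```

The vector after :post is read by the defn macro at expansion time, which is why it has to be written inline, while the function it calls can be defined anywhere.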
I started off as a fan of pre- and post-conditions, but I've changed over the years.
For simple things, I prefer to use Plumatic Schema to not only test inputs & outputs, but to document them as well.
For more complicated tests & verifications, I just put in an explicit assert or similar. I also wrote a helper function in the Tupelo library to reduce repetition, etc when debugging or verifying return values:
(ns tst.demo.core
(:use tupelo.core tupelo.test))
(defn oddly
"Transforms its input. Throws if result is not odd"
[x]
(let [answer (-> x (* 3) (+ 2))]
(with-result answer
(newline)
(println :given x)
(assert (odd? answer))
(println :returning answer))))
(dotest
(is= 5 (oddly 1))
(throws? (oddly 2)))
with result
------------------------------------
Clojure 1.10.3 Java 11.0.11
------------------------------------
Testing tst.demo.core
:given 1
:returning 5
:given 2
Ran 2 tests containing 2 assertions.
0 failures, 0 errors.
Passed all tests
So with either the println or assert, the returned value is easy to see. If it fails the assert, an Exception is thrown as normal.
This Clojure code on GitHub refers to the Unified Update Model.
Could you explain how the fixing function works well with the Unified Update Model?
(defn fixing
"A version of fix that fits better with the unified update model: instead of multiple clauses,
additional args to the transform function are permitted. For example,
(swap! my-atom fixing map? update-in [k] inc)"
[x pred transform & args]
(if ((as-fn pred) x)
(apply (as-fn transform) x args)
x))
The unified update model is the API convention whereby objects of various reference types can be updated in a consistent way, albeit with specialized functions:
;; Atoms
(swap! the-atom f …)
;; Agents
(send the-agent f …)
(send-off the-agent f …)
;; send-via takes an additional initial argument, but otherwise
;; follows the same convention (and of course it's an exact match
;; when partially applied – (partial send-via some-executor))
(send-via executor the-agent f …)
;; Refs
(dosync
(alter the-ref f …)
(commute the-ref f…))
In each case f is the function that should be used to update the value held by the Atom / Agent / Ref, … are additional arguments, if any (e.g. (swap! the-atom f 1 2 3)), and the result of the call is that the reference will atomically assume the value (f old-value …) – although exactly when that will happen with respect to the swap! / alter / send / … call depends on the reference type in question and the update function used.
Here's an example:
(def a (atom 0))
(swap! a - 5)
@a
;= -5
Vars are generally not meant to be used for the same purposes that one might use the above-mentioned reference types for, but they also have an update function with the same contract:
(alter-var-root #'the-var f …)
Finally, the update and update-in functions are worth mentioning in this connection; in effect, they extend the unified update model convention to values – now of course, values are immutable, so calling update or update-in does not result in any object being visibly changed, but the return value is produced similarly as the result of applying an update function to a preexisting value and possibly some extra arguments:
(update {:foo 1} :foo inc)
;= {:foo 2}
The fixing function quoted in the question works better than fix (defined in the same namespace) in the context of UUM update calls, because it can be passed multiple arguments in a way that meshes well with how UUM update functions like swap! work, whereas with fix you'd have to use an anonymous function:
;; the example from the docstring of fixing
(swap! my-atom fixing map? update-in [k] inc)
;; modified to use fix
(swap! my-atom fix map? #(update-in % [k] inc))
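To make the argument threading concrete, here is a minimal sketch with a simplified stand-in called fixing* (a hypothetical name). It assumes pred and transform are already plain functions, so the library's as-fn coercion is left out.

```clojure
;; Simplified stand-in for fixing: pred and transform must be plain functions
;; here, since the library's as-fn coercion is left out.
(defn fixing* [x pred transform & args]
  (if (pred x)
    (apply transform x args)
    x))

;; swap! passes the old value first, then the extra arguments, so this
;; evaluates (fixing* old-value map? update-in [:k] inc).
(def my-atom (atom {:k 1}))
(swap! my-atom fixing* map? update-in [:k] inc)

;; When the predicate fails, the value is left untouched.
(def not-a-map (atom 5))
(swap! not-a-map fixing* map? update-in [:k] inc)
```

After these calls my-atom holds {:k 2} and not-a-map still holds 5.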
I wrote a specialized function construct, which under the hood is really just a Clojure function. So basically I have a function that makes (similar to fn) and a function that calls my specialized functions (similar to CL's funcall).
My constructor assigns metadata (at compile-time) so I could distinguish between "my" functions and other/normal Clojure functions.
What I want to do is to make a macro that lets users write code as if my functions were normal functions. It would do so by walking over the code, and in function calls, when the callee is a specialized function, it would change the call so it would use my caller (and also inject some extra information). For example:
(defmacro my-fn [args-vector & body] ...)
(defmacro my-funcall [myfn & args] ...)
(defmacro with-my-fns [& body] ...)
(with-my-fns
123
(first [1 2 3])
((my-fn [x y] (+ x y)) 10 20))
; should yield:
(do
123
(first [1 2 3])
(my-funcall (my-fn [x y] (+ x y)) 10 20))
I run into problems in lexical environments. For example:
(with-my-fns
(let [myf (my-fn [x y] (+ x y))]
(myf)))
In this case, when the macro I want to write (i.e. with-my-fns) encounters (myf), it sees myf as a symbol, and I don't have access to the metadata. It's also not a Var so I can't resolve it.
I care to know because otherwise I'll have to put checks on almost every single function call at runtime. Note that I don't really care if my metadata on the values are actual Clojure metadata; if it's possible with the type-system and whatnot it's just as good.
P.S. I initially wanted to just ask about lexical environments, but maybe there are more pitfalls I should be aware of where my approach would fail? (or maybe even the above is actually an XY problem? I'd welcome suggestions).
As @OlegTheCat already pointed out in the comment section, the idea of using metadata does not work.
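The reason can be seen from inside a macro: at expansion time a local such as myf is only a symbol, and the implicit &env map tells the macro that the symbol is a local binding, not what value (or metadata) it will hold at runtime. A small sketch, using a hypothetical helper macro local-binding?:

```clojure
;; Hypothetical helper macro: true when sym names a local binding
;; at the point of expansion. &env maps local symbols to compiler
;; binding objects; the runtime values do not exist yet.
(defmacro local-binding? [sym]
  (contains? &env sym))

(def seen-in-let
  (let [myf (with-meta (fn [x y] (+ x y)) {:special true})]
    ;; The macro can tell that myf is a local, but the {:special true}
    ;; metadata lives on the runtime value, which it cannot inspect.
    (local-binding? myf)))

(def seen-at-top-level (local-binding? map))
```

seen-in-let is true and seen-at-top-level is false, which is about as much as a macro can learn about a local at compile time.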
However I might have a solution you can live with:
(ns cl-myfn.core)
(defprotocol MyCallable
(call [this magic args]))
(extend-protocol MyCallable
;; a clojure function implements IFn
;; we use this knowledge to simply call it
;; and ignore the magic
clojure.lang.IFn
(call [this _magic args]
(apply this args)))
(deftype MyFun [myFun]
MyCallable
;; this is our magic type
;; for now it only adds the magic as first argument
;; you may add all the checks here
(call [this magic args]
(apply (.myFun this) magic args)))
;;turn this into a macro if you want more syntactic sugar
(defn make-myfun [fun]
(MyFun. fun))
(defmacro with-myfuns [magic & funs]
`(do ~@(map (fn [f#]
;; if f# is a sequence it is treated as a function call
(if (seq? f#)
(let [[fun# & args#] f#]
`(call ~fun# ~magic [~@args#]))
;; if f# is nonsequential it is left alone
f#))
funs)))
(let [my-prn (make-myfun prn)]
(with-myfuns :a-kind-of-magic
123
[1 2 3]
(prn :hello)
(my-prn 123)))
;; for your convenience: the macro-expansion
(let [my-prn (make-myfun prn)]
(prn (macroexpand-1 '(with-myfuns :a-kind-of-magic
123
[1 2 3]
(prn :hello)
(my-prn 123)))))
the output:
:hello
:a-kind-of-magic 123
(do 123 [1 2 3] (cl-myfn.core/call prn :a-kind-of-magic [:hello]) (cl-myfn.core/call my-prn :a-kind-of-magic [123]))
I have a requirement for a function that when called with particular input args executes a supplied function g, but only after another supplied function f has finished executing with the same input args. There is also a requirement that when the function is called multiple times with the same input args, f is only executed once on the first call, and the other calls wait for this to complete, then execute g directly.
Edit: The solution should work when run in parallel on different threads, and should also use threads efficiently. E.g. blocking should be on a per input basis rather than the whole function.
My first attempt at the function is as follows:
(defn dependent-func
([f g]
(let [mem (atom {})]
(fn [& args]
(->> (get (locking mem
(swap! mem (fn [latch-map args]
(if (contains? latch-map args)
latch-map
(let [new-latch (CountDownLatch. 1)
new-latch-map (assoc latch-map args new-latch)]
(->> (Thread. #(do (apply f args)
(.countDown new-latch))))
(.start))
new-latch-map))) args)) args)
(.await))
(apply g args)))))
This appears to meet my requirements, and awaits on f are on a per input basis, so I'm relatively happy with that. Initially I had hoped to just use swap! to do the mem updating but unfortunately swap! explicitly states that the function in the swap! could be called multiple times (I have seen this in testing). As a result of this I ended up having to lock on mem when updating which is really ugly.
I am sure there must be a cleaner way of doing this that leverages Clojure's concurrency mechanisms better than I have, but so far I've been unable to find it.
Any advice would be greatly appreciated.
Thanks,
Matt.
Clojure's combination of future, promise, and deliver is well suited to starting a process and having several threads wait for it to finish.
Future is used to start a thread in the background (it can do more, though in this example I didn't need it to)
Promise is used to immediately return an object that will contain the answer once it is ready.
Deliver is used to supply the promised answer once it is ready.
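As a minimal sketch of how the three pieces fit together (the names answer, producer, waiters, and results are illustrative and not part of the solution itself):

```clojure
;; A promise is created immediately; it has no value yet.
(def answer (promise))

;; A background thread computes the value and delivers it once.
(def producer
  (future
    (Thread/sleep 50)
    (deliver answer 42)))

;; Several threads block on the same promise until it is delivered.
(def waiters (doall (repeatedly 3 #(future @answer))))

;; Each waiter sees the same delivered value.
(def results (mapv deref waiters))
```

results ends up as [42 42 42]: every waiter blocks until the single deliver happens and then reads the same value.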
I'll also split the waiting part into its own function to make the code easier to follow, and so I can use the built-in memoize function.
This question is a very good example of when to use promise and deliver rather than simply a future.
Because we are going to use memoize where it's not safe to run the function twice,
we need to be careful that two calls don't enter memoize at exactly the same
time. So we are going to lock only the moment we enter memoize, not the duration
of the memoized function.
hello.core> (def lock [])
#'hello.core/lock
This function will always return the same promise object every time it is called
with a given f and set of arguments. We still need to make memoize safe by wrapping it
in a function that does the locking (you could also use an agent for this):
hello.core> (def wait-for-function-helper
(memoize (fn [f args]
(let [answer (promise)]
(println "waiting for function " f " with args" args)
(future (deliver answer (apply f args)))
answer))))
#'hello.core/wait-for-function-helper
hello.core> (defn wait-for-function [& args]
(locking lock
(apply wait-for-function-helper args)))
#'hello.core/wait-for-function
and now we write the actual dependent-func function that uses the safely memoized,
future producing, wait-for-function function.
hello.core> (defn dependent-func [f g & args]
@(wait-for-function f args)
(apply g args))
#'hello.core/dependent-func
and define a slow operation to see it in action:
hello.core> (defn slow-f-1 [x]
(println "starting slow-f-1")
(Thread/sleep 10000)
(println "finishing slow-f-1")
(dec x))
#'hello.core/slow-f-1
and to test it we want to start two of the same function at exactly the same time.
hello.core> (do (future
(println "first" (dependent-func slow-f-1 inc 4)))
(future
(println "second" (dependent-func slow-f-1 inc 4))))
waiting for function
#object[clojure.core$future_call$reify__6736 0x40534083 {:status :pending, :val nil}] with args (4)
#object[hello.core$slow_f_1 0x4f9b3396 hello.core$slow_f_1@4f9b3396]
starting slow-f-1
finishing slow-f-1
second
first
5
5
and if we call it again we see that slow-f-1 only ever ran once:
hello.core> (do (future
(println "first" (dependent-func slow-f-1 inc 4)))
(future
(println "second" (dependent-func slow-f-1 inc 4))))
#object[clojure.core$future_call$reify__6736 0x3935ea29 {:status :pending, :val nil}]
first 5
second 5
Something like this is a much simpler answer to your question:
(defn waiter
[f g & args]
(let [f-result (f args)
g-result (g args) ]
(println (format "waiter: f-result=%d g-result=%d" f-result g-result))))
(defn my-f
[args]
(let [result (apply + args)]
(println "my-f running:" result)
result))
; change your orig prob a bit, and define/use my-f-memo instead of the original my-f
(def my-f-memo (memoize my-f))
(defn my-g
[args]
(let [result (apply * args)]
(println "my-g running:" result)
result))
(waiter my-f-memo my-g 2 3 4)
(waiter my-f-memo my-g 2 3 4)
> lein run
my-f running: 9
my-g running: 24
waiter: f-result=9 g-result=24
my-g running: 24
waiter: f-result=9 g-result=24
main - enter
If you change the problem statement a bit and pass in a memoized version of your first function f, the solution is much easier.
Just calling the functions in sequence in a (let [...]...) form enforces the completion of the first before the execution of the 2nd function.
Also, you could force the waiter function to do the memoization of f for you, but it would be a bit more work to manually simulate what memoize already does.
Update: The original problem didn't explicitly imply it needed to work in a concurrent environment. If multiple threads are an issue, just change the definition of waiter to be:
(defn waiter
[f g & args]
(let [f-result (locking f (f args))
g-result (g args) ]
(println (format "waiter: f-result=%d g-result=%d" f-result g-result))))
There is little purpose to starting up a thread to run f on, if the very next thing you will do is wait for that thread to complete. You might as well just run f on the current thread. In that case, your problem decomposes nicely into two subproblems:
1. How to memoize the call to f without risking concurrent execution like the standard memoize does.
2. How to return a lambda that uses that memoized function and then calls g.
Let's solve these in reverse order, by first assuming (my-memoize f) works as you need it to, and then later writing it:
(defn dependent-func [f g]
(let [f' (my-memoize f)]
(fn [& args]
(apply f' args)
(apply g args))))
Very simple with a competent memoize, right? Now, to implement memoize there are a few things you can do. You could use locking, as you did, and I think that's pretty reasonable, since you explicitly want to prevent concurrent execution; once you throw out the thread-launching business it is very easy as well:
(defn my-memoize [f]
(let [memo (atom {})]
(fn [& args]
(locking memo
(if (contains? @memo args)
(get @memo args)
(get (swap! memo assoc args (apply f args)) args))))))
Or you can reinvent locking yourself, by storing a delay in the atom and then having each call dereference it instead:
(defn my-memoize [f]
(let [memo (atom {})]
(fn [& args]
(-> memo
(swap! update-in [args]
(fn [v]
(or v (delay (apply f args)))))
(get args)
(deref)))))
It's readable and "clever", because it does everything in a swap!, and I felt quite smug back when I figured this out the first time, but later I realized that this is just hijacking the mutex in Delay.deref() to accomplish locking, so honestly I think you might as well just use locking to make it clear there is a lock.
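A rough way to check the "f runs only once" property of the delay-based variant is to hit it from several threads at once. In this sketch memoize-once is just the delay-based my-memoize under a different (hypothetical) name, and calls, slow-inc, and results are illustrative helpers.

```clojure
;; Same approach as the delay-based my-memoize, renamed for this check.
(defn memoize-once [f]
  (let [memo (atom {})]
    (fn [& args]
      (-> memo
          ;; swap! may retry, but it only ever creates delays; the one that
          ;; ends up in the map is the only one that gets dereferenced.
          (swap! update-in [args] (fn [v] (or v (delay (apply f args)))))
          (get args)
          (deref)))))

;; Counts how many times the wrapped function body actually runs.
(def calls (atom 0))

(def slow-inc
  (memoize-once
    (fn [x]
      (swap! calls inc)
      (Thread/sleep 50)
      (inc x))))

;; Eight threads ask for the same argument at roughly the same time.
(def results
  (->> (repeatedly 8 #(future (slow-inc 1)))
       doall
       (mapv deref)))
```

All eight results are 2, and calls ends at 1, since every thread dereferences the single delay stored for that argument list.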
Is there a way to generically get metadata for arguments to a function in clojure? The answer posted in this question does not, actually, work in general:
user> (defn foo "informative dox!" [] 1)
#'user/foo
user> (defmacro get-docs [func] `(:doc (meta (var ~func))))
#'user/get-docs
user> (get-docs foo)
"informative dox!"
user> (get-docs (identity foo))
; Evaluation aborted.
user> (defn process-docs [f] (let [docs (get-docs f)] (reverse docs)))
; Evaluation aborted.
The second-to-last line doesn't work because you can't call var on the list (identity foo), and the last line doesn't even compile because the compiler complains about being unable to resolve f.
Most of the solutions for this problem I've found rely on the idea that you have access to the symbol in the function's definition, or something like that, so that you can do something like (resolve 'f) or (var f). But I want something that I can use on the argument to a function, where you don't know that information.
Essentially, I'd like an expression I can put in place of the question marks below to get the metadata of #'map:
(let [x map] (??? x))
It's a mouthful, though possible:
(let [x map]
(:doc (meta (second (first (filter #(and (var? (second %))
(= x (var-get (second %))))
(ns-map *ns*)))))))
produces the desired result:
"Returns a lazy sequence consisting of the result of applying f to the
set of first items of each coll, followed by applying f to the set
of second items in each coll, until any one of the colls is
exhausted. Any remaining items in other colls are ignored. Function
f should accept number-of-colls arguments."
Under the hood, namespaces are essentially maps of names to vars, and the vars contain functions. You can search the contents of these vars for the one that matches the function you are seeking, then take the associated var and get the metadata from it. Note that this expression only searches the mappings of the current namespace (*ns*).
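The same search can be wrapped in a function and widened to every loaded namespace. This is only a sketch: var-for-value is a hypothetical helper, it compares by identity, and it returns nil for values that no var holds (anonymous functions, for instance).

```clojure
(defn var-for-value
  "Hypothetical helper: first var, in any loaded namespace, whose value is x."
  [x]
  (first
    (for [n (all-ns)
          v (vals (ns-interns n))
          ;; skip unbound vars, then compare the var's value by identity
          :when (and (bound? v) (identical? x (var-get v)))]
      v)))

(def found (let [x map] (var-for-value x)))
(def found-name (:name (meta found)))
(def found-doc (:doc (meta found)))
```

In a plain REPL session found should be the var for clojure.core/map, so found-name is the symbol map and found-doc is its docstring. The scan is linear in the number of interned vars, so it is better suited to tooling and debugging than to hot code paths.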