Conditional elements in -> / ->> pipelines - clojure

Given a ->> pipeline like so:
(defn my-fn []
(->> (get-data)
(do-foo)
(do-bar)
(do-baz)))
I wish to make the various stages conditional.
The first way of writing this that came to mind was as such:
(defn my-fn [{:keys [foo bar baz]}]
(->> (get-data)
(if foo (do-foo) identity)
(if bar (do-bar) identity)
(if baz (do-baz) identity)))
However, as the ->> macro attempts to insert into the if form, this not only looks unfortunate in terms of performance (having the noop identity calls), but actually fails to compile.
What would an appropriate, reasonably DRY way of writing this be?

Modern Clojure (that is, as of 1.5) supports a variety of options for conditional threading, but you probably want cond->>.
conditional threading with cond-> and cond->>
Clojure offers cond-> and cond->>, which each ask for a set of pairs: a test and an expression if that test evaluates to true. These are quite similar to cond but don't stop at the first true test.
(cond->> 5
true inc
false inc
nil inc)
=> 6
Your specific example is probably best written like so:
(defn my-fn [{:keys [foo bar baz]}]
(cond->> (get-data)
foo (do-foo)
bar (do-bar)
baz (do-baz)))
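As a quick sanity check, here is a minimal runnable sketch of that function. The bodies of get-data, do-foo, do-bar and do-baz are hypothetical stand-ins (the question does not show them), chosen only so that each stage's effect is visible:

```clojure
;; Hypothetical stand-ins for the question's functions (assumptions, not the asker's code)
(defn get-data [] [1 2 3 4])
(defn do-foo [xs] (map inc xs))      ; increment every element
(defn do-bar [xs] (filter even? xs)) ; keep even elements
(defn do-baz [xs] (reduce + xs))     ; sum the elements

(defn my-fn [{:keys [foo bar baz]}]
  (cond->> (get-data)
    foo (do-foo)
    bar (do-bar)
    baz (do-baz)))

(my-fn {})                     ; no stage runs     => [1 2 3 4]
(my-fn {:foo true})            ; only do-foo       => (2 3 4 5)
(my-fn {:foo true :bar true})  ; do-foo, do-bar    => (2 4)
(my-fn {:foo true :baz true})  ; do-foo, do-baz    => 14
```

A stage whose flag is nil or false is skipped entirely, so no identity call is needed.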
as->
It's worth mentioning as-> because it is perhaps the most versatile threading macro. It gives you a name to refer to the "thing" being threaded through the forms. For instance:
(as-> 0 n
(inc n)
(if false
(inc n)
n))
=> 1
(as-> 0 n
(inc n)
(if true
(inc n)
n))
=> 2
This gives substantial flexibility when working with a mix of functions that require threading the expression at different points in the parameter list (that is, switching from -> to ->> syntax). One should avoid the use of extraneous named variables in the interest of code readability, but oftentimes this is the clearest and simplest way to express a process.
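To illustrate the point about mixed argument positions, here is a small sketch in which the threaded value appears first in one step, last in another, and nested in a third. The input map and the steps are made up for the example:

```clojure
(as-> {:xs [1 2 3]} v
  (:xs v)             ; value as the only argument    => [1 2 3]
  (map inc v)         ; value in last position        => (2 3 4)
  (conj (vec v) 10)   ; value nested inside a form    => [2 3 4 10]
  (reduce + v))       ; value in last position again
;; => 19
```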

This works too:
(defn my-fn [{:keys [foo bar baz]}]
(->> (get-data)
(#(if foo (do-foo %) %))
(#(if bar (do-bar %) %))
(#(if baz (do-baz %) %)) ))

You might be interested in these macros https://github.com/pallet/thread-expr

You can fix the compilation problem (though not the aesthetic one) by having the if form decide which function to run, wrapped in another set of ( ), like so:
(defn my-fn [{:keys [foo bar baz]}]
(->> (get-data)
((if foo do-foo identity))
((if bar do-bar identity))
((if baz do-baz identity))))
This expands into a series of calls like this:
; choose function and call it
( (if foo do-foo identity) args )
( (if bar do-bar identity) args )
( (if baz do-baz identity) args )
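The difference between the two spellings can be checked with macroexpand-1. The symbols data, foo and do-foo below are just quoted placeholders, so nothing needs to be defined to run this:

```clojure
;; The original attempt: ->> appends the threaded value to the if form itself,
;; which gives if a fourth argument and is rejected by the compiler.
(macroexpand-1 '(->> data (if foo (do-foo) identity)))
;; => (if foo (do-foo) identity data)

;; With the extra parentheses, the if form sits in function position
;; and the threaded value becomes its argument.
(macroexpand-1 '(->> data ((if foo do-foo identity))))
;; => ((if foo do-foo identity) data)
```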

A more general way to go if you need this sort of thing often:
(defn comp-opt [& flag-fns]
(->> flag-fns
(partition 2)
(filter first)
(map second)
reverse ; comp applies right to left, so reverse to keep the listed order
(apply comp)))
(defn my-fn [{:keys [foo bar baz]}]
(map (comp-opt foo do-foo
bar do-bar
baz do-baz)
(get-data)))
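One detail worth keeping in mind with any comp-based helper is the order in which comp applies its arguments. A small sketch with throwaway functions:

```clojure
((comp inc #(* 2 %)) 5)   ; rightmost runs first: doubles, then increments => 11
((comp #(* 2 %) inc) 5)   ; increments first, then doubles                 => 12
((apply comp []) 5)       ; comp of no functions is identity               => 5
```

The last line is why a helper like this still works when every flag is false: the filtered list of functions is empty and the result is identity.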

Related

create new variables in a long function in clojure

So I recently learned that I cannot modify parameters in a Clojure function.
I have a big function that takes in a list, then does about 5 different steps to modify it and then returns the output.
But what I'm doing now is
(defn modify [my-list other params]
(if (nil? my-list)
my-list
(let [running my-list]
(def running (filter #(> (count %) 1) my-list))
(def running (adjust-values running))
; a whole bunch of other code with conditions that depend on other parameters and that modify the running list
(if (< (count other) (count my-list))
(def running (something))
(def running (somethingelse)))
(def my-map (convert-list-to-map my-list))
(loop [] .... ) loop through map to do some operation
(def my-list (convert-map-to-list my-map)
running
)))
This doesn't seem correct, but I basically tried writing the code as I'd do in Python cuz I wasn't sure how else to do it. How is it done in Clojure?
Instead of using def inside the modify function, you can have a let with multiple bindings. Actually, def is typically only used at the top-level to define things once and is not meant to be a mechanism to allow for mutability. Here is an example of a let with multiple bindings, which is similar to introducing local variables except that you cannot change them:
(defn modify2 [my-list other params]
(if (nil? my-list)
my-list
(let [a (filter #(> (count %) 1) my-list)
b (adjust-values a)
c (adjust-values1 other b)
d (adjust-values2 params c)]
d)))
Here we introduce new names a, b, c and d for each partial result of the final computation. But it is OK to just have a single binding that gets rebound on each line, that is, you could have a single binding running that gets rebound:
(defn modify2 [my-list other params]
(if (nil? my-list)
my-list
(let [running (filter #(> (count %) 1) my-list)
running (adjust-values running)
running (adjust-values1 other running)
running (adjust-values2 params running)]
running)))
Which one you prefer is a matter of style and taste; there are upsides and downsides to either approach. The let form to introduce new bindings is a powerful construct, but for this specific example where we have a pipeline of steps, we can use the ->> macro that will generate the code for us. So we would instead write
(defn modify3 [my-list other params]
(if (nil? my-list)
my-list
(->> my-list
(filter #(> (count %) 1))
adjust-values
(adjust-values1 other)
(adjust-values2 params))))
It takes the first macro argument and then passes it in as the last parameter to the function call on the following line. Then the result of that line goes in as a last parameter to the line that follows and so on. If a function call just takes a single argument as is the case for adjust-values in the example above, we don't need to surround it with parentheses. See also the similar -> macro.
To see which code is generated by the ->>, we can use macroexpand:
(macroexpand '(->> my-list
(filter #(> (count %) 1))
adjust-values
(adjust-values1 other)
(adjust-values2 params)))
;; => (adjust-values2 params (adjust-values1 other (adjust-values (filter (fn* [p1__7109#] (> (count p1__7109#) 1)) my-list))))
Added: Summary
If your computation has a pipeline structure, the -> and ->> macros can be used to express that computation concisely. However, if your computation has a more general shape, where a later step needs more than one earlier result, you will want to use let to associate symbols with results of sub-expressions, so that you can use those symbols in subsequent expressions inside the let form.
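As a sketch of a computation that does not fit a straight pipeline, consider a case where the last step needs two earlier results at once. The function name stats is made up for illustration:

```clojure
(defn stats [xs]
  (let [n    (count xs)        ; first intermediate result
        sum  (reduce + xs)     ; second intermediate result
        mean (/ sum n)]        ; needs both n and sum, so threading does not fit
    {:n n :sum sum :mean mean}))

(stats [1 2 3 4])
;; => {:n 4, :sum 10, :mean 5/2}
```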
Yes, this is the way Clojure was designed: as a functional language with
immutable data (of course with fallbacks to what the host offers if you
want or need it).
So if you want to modify data in consecutive steps, then you can either
chain the calls (looks nicer with the threading macros). E.g.
(->> my-list
(filter #(> (count %) 1))
(adjust-values))
This is the same as:
(adjust-values
(filter #(> (count %) 1)
my-list))
If you prefer to do that in steps (e.g. you want to print intermediate
results or you need them), you can have multiple bindings in the let. E.g.
(let [running my-list
filtered (filter #(> (count %) 1) running)
adjusted (adjust-values filtered)
running ((if (< (count other) (count adjusted)) something somethingelse))
my-map (convert-list-to-map my-list)
transformed-map (loop [] .... )
result (convert-map-to-list transformed-map)]
result)
This returns the adjusted values and holds on to all the things in
between (this does nothing right now with the intermediate results, just an example).
An aside: never ever def inside other forms unless you know what you
are doing. def defines top-level vars in a namespace; it's not a way
to define mutable variables you can bang on iteratively like you might
be used to from other languages.
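A small sketch of what goes wrong with def inside a function; the names set-total! and total are invented for the example:

```clojure
(defn set-total! [xs]
  (def total (reduce + xs))  ; interns/rebinds the namespace-level var #'total
  :done)

(set-total! [1 2 3])
total  ;; => 6

(set-total! [10])
total  ;; => 10, shared by every caller; nothing here is local to the function
```

Every call overwrites the same global var, which is rarely what a local "variable" is meant to do.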

Dynamic symbols in Clojure Macro/Special Form

I have a question regarding how to define functions/macros which call other macros or special forms but where one of the symbols passed in needs to be dynamic.
The simplest version of this question is described below:
We can define variables using def
(def x 0)
But what if we wanted the name x to be determined programmatically so that we could do the equivalent of?
(let [a 'b]
(our-def a 3)) => user/b
We could try to define a function
(defn def2 [sym val]
(def sym val))
However it does not do what we want
(def2 'y 1) => #'user/sym
At first it seems like a macro works (even though it seems like it would be unnecessary)
(defmacro def3 [sym val]
`(def ~sym ~val))
(def3 z 2) => user/z
but it is just superficial, because we're really right back where we started with regular def.
(let [a 'b]
(def3 a 3)) => user/a
I can do it if I use eval, but it doesn't seem like eval should be necessary
(defn def4 [sym val]
(eval `(def ~sym ~val)))
(let [a 'b]
(def4 a 4)) => user/b
If there are other built-in commands that could achieve this particular example, they are not really what I am looking for since def is just to show a particular example. There are macros more complicated than def that I might want to call and not have to worry about how they were internally implemented.
First: The right way to do this is to use a macro whose name starts with def..., since this is how people have been doing defs and it is of little surprise to the user.
To answer your question: Use intern:
(def foo 'bar)
(intern *ns* foo :hi)
(pr bar) ;; => :hi
(intern *ns* foo :hi2)
(pr bar) ;; => :hi2
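Wrapped in a plain function, intern gives the behaviour the question asks for without eval. The name def-by-name is hypothetical:

```clojure
(defn def-by-name
  "Creates (or rebinds) a var named by the symbol sym in the current namespace."
  [sym val]
  (intern *ns* sym val))

(let [a 'b]
  (def-by-name a 3))   ; the var is named b, not a

@(resolve 'b)
;; => 3
```

Because intern is an ordinary function, its arguments are evaluated, so the symbol held in a is what names the var.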
If you want to use macros do this:
(def z 'aa)
(defmacro def3 [sym val]
`(def ~(eval sym) ~val))
(def3 z 2)
(pr aa) ;; => 2
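To see why the plain macro from the question defines user/a rather than user/b, it can help to look at its expansion. A macro receives its arguments unevaluated, so it sees the symbol a itself, not the value bound to a at run time. The name def3-plain below is just a label for the question's first def3:

```clojure
(defmacro def3-plain [sym val]
  `(def ~sym ~val))

;; The macro is handed the literal symbol a, whatever a may be bound to later.
(macroexpand-1 '(def3-plain a 3))
;; => (def a 3)
```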

Why in this example calling (f arg) and calling the body of f explicitly yields different results?

First, I have no experience with CS and Clojure is my first language, so pardon if the following problem has a solution, that is immediately apparent for a programmer.
The summary of the question is as follows: one needs to create atoms at will with unknown yet symbols at unknown times. My approach revolves around a) storing temporarily the names of the atoms as strings in an atom itself; b) changing those strings to symbols with a function; c) using a function to add and create new atoms. The problem pertains to step "c": calling the function does not create new atoms, but using its body does create them.
All steps taken in the REPL are below (comments follow code blocks):
user=> (def atom-pool
#_=> (atom ["a1" "a2"]))
#'user/atom-pool
'atom-pool is the atom that stores intermediate to-be atoms as strings.
user=> (defn atom-symbols []
#_=> (mapv symbol (deref atom-pool)))
#'user/atom-symbols
user=> (defmacro populate-atoms []
#_=> (let [qs (vec (remove #(resolve %) (atom-symbols)))]
#_=> `(do ~@(for [s qs]
#_=> `(def ~s (atom #{}))))))
#'user/populate-atoms
'populate-atoms is the macro, that defines those atoms. Note, the purpose of (remove #(resolve %) (atom-symbols)) is to create only yet non-existing atoms. 'atom-symbols reads 'atom-pool and turns its content to symbols.
user=> (for [s ['a1 'a2 'a-new]]
#_=> (resolve s))
(nil nil nil)
Here it is confirmed that there are no 'a1', 'a2', 'a-new' atoms as of yet.
user=> (defn new-atom [a]
#_=> (do
#_=> (swap! atom-pool conj a)
#_=> (populate-atoms)))
#'user/new-atom
'new-atom is the function that first adds the new to-be atom as a string to 'atom-pool. Then 'populate-atoms creates all the atoms from the 'atom-symbols function.
user=> (for [s ['a1 'a2 'a-new]]
#_=> (resolve s))
(#'user/a1 #'user/a2 nil)
Here we see that 'a1 'a2 were created as clojure.lang.Var$Unbound just by defining a function, why?
user=> (new-atom "a-new")
#'user/a2
user=> (for [s ['a1 'a2 'a-new]]
#_=> (resolve s))
(#'user/a1 #'user/a2 nil)
Calling (new-atom "a-new") did not create the 'a-new atom!
user=> (do
#_=> (swap! atom-pool conj "a-new")
#_=> (populate-atoms))
#'user/a-new
user=> (for [s ['a1 'a2 'a-new]]
#_=> (resolve s))
(#'user/a1 #'user/a2 #'user/a-new)
user=>
Here we see that resorting explicitly to 'new-atom's body did create the 'a-new atom. 'a-new is a type of clojure.lang.Atom, but 'a1 and 'a2 were skipped due to already being present in the namespace as clojure.lang.Var$Unbound.
Appreciate any help how to make it work!
EDIT: Note, this is an example. In my project the 'atom-pool is actually a collection of maps (atom with maps). Those maps have keys {:name val}. If a new map is added, then I create a corresponding atom for this map by parsing its :name key.
"The summary of the question is as follows: one needs to create atoms at will with unknown yet symbols at unknown times. "
This sounds like a solution looking for a problem. I would generally suggest you try another way of achieving whatever the actual functionality is without generating vars at runtime, but if you must, you should use intern and leave out the macro stuff.
You cannot solve this with macros since macros are expanded at compile time, meaning that in
(defn new-atom [a]
(do
(swap! atom-pool conj a)
(populate-atoms)))
populate-atoms is expanded only once; when the (defn new-atom ...) form is compiled, but you're attempting to change its expansion when new-atom is called (which necessarily happens later).
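The compile-time nature of expansion can be observed directly. In this sketch (all names are made up), the macro body bumps a counter as a side effect, and the counter does not move when the function that uses the macro is called:

```clojure
(def expansions (atom 0))

(defmacro counted []
  (swap! expansions inc)   ; runs during macro expansion, i.e. at compile time
  :expanded)

(defn f [] (counted))      ; the macro is expanded here, while f is compiled

(def after-compile @expansions)
(f) (f) (f)
(= after-compile @expansions)
;; => true: calling f triggers no further expansion
```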
@JoostDiepenmaat is right about why populate-atoms is not behaving as expected. You simply cannot do this using macros, and it is generally best to avoid generating vars at runtime. A better solution would be to define your atom-pool as a map of keywords to atoms:
(def atom-pool
(atom {:a1 (atom #{}) :a2 (atom #{})}))
Then you don't need atom-symbols or populate-atoms because you're not dealing with vars at compile-time, but typical data structures at run-time. Your new-atom function could look like this:
(defn new-atom [kw]
(swap! atom-pool assoc kw (atom #{})))
EDIT: If you don't want your new-atom function to override existing atoms which might contain actual data instead of just #{}, you can check first to see if the atom exists in the atom-pool:
(defn new-atom [kw]
(when-not (kw @atom-pool)
(swap! atom-pool assoc kw (atom #{}))))
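A short usage sketch of this data-driven approach, starting from an empty pool (an assumption made to keep the example small):

```clojure
(def atom-pool (atom {}))

(defn new-atom [kw]
  (when-not (kw @atom-pool)
    (swap! atom-pool assoc kw (atom #{}))))

(new-atom :a1)                    ; creates the atom for :a1
(swap! (:a1 @atom-pool) conj 1)   ; use it like any other atom
(new-atom :a1)                    ; already present, so this is a no-op

@(:a1 @atom-pool)
;; => #{1}
```

Looking an atom up is a plain map access, so names can be computed at run time without touching vars.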
I've already submitted one answer to this question, and I think that that answer is better, but here is a radically different approach based on eval:
(def atom-pool (atom ["a1" "a2"]))
(defn new-atom! [name]
(load-string (format "(def %s (atom #{}))" name)))
(defn populate-atoms! []
(doseq [x @atom-pool]
(new-atom! x)))
format builds up a string where %s is substituted with the name you're passing in. load-string reads the resulting string, such as "(def a1 (atom #{}))", in as a data structure and evals it (this is equivalent to (eval (read-string "(def ...)"))).
Of course, then we're stuck with the problem of only defining atoms that don't already exist. We could change our new-atom! function so that it only creates an atom if it doesn't already exist:
(defn new-atom! [name]
(when-not (resolve (symbol name))
(load-string (format "(def %s (atom #{}))" name))))
The Clojure community seems to be against using eval in most cases, as it is usually not needed (macros or functions will do what you want in 99% of cases*), and eval can be potentially unsafe, especially if user input is involved -- see Brian Carper's answer to this question.
*After attempting to solve this particular problem using macros, I came to the conclusion that it either cannot be done without relying on eval, or my macro-writing skills just aren't good enough to get the job done with a macro!
At any rate, I still think my other answer is a better solution here -- generally when you're getting way down into the nuts & bolts of writing macros or using eval, there is probably a simpler approach that doesn't involve metaprogramming.

Clojure: issues passing a bound variable by doseq to another function

I am really not sure what is the problem here. I started to experience this "issue" with this kind of code:
First I defined this string with some metadata:
(def ^{:meta-attr ["foo" "bar"]
:meta-attr2 "some value"} foo "some value")
Then I created the following two functions:
(defn second-fn [values]
(for [x values] (println x)))
(defn first-fn [value]
(doseq [[meta-key meta-val] (seq (meta value))]
(if (= meta-key :meta-attr)
(second-fn meta-val))))
Now when I run this command in the REPL:
(first-fn #'foo)
I am getting nil.
However, if I change second-fn for:
(defn second-fn [values]
(println values))
And if I run that command again, I am getting this in the REPL:
user> (first-fn #'foo)
[foo bar]
nil
What I was expecting to get in the REPL with the first version of my function is the following:
user> (first-fn #'foo)
foo
bar
nil
But somehow, I think there is something I don't get that is related to the bound variable by doseq.
Here is another set of functions that has exactly the same behavior:
(defn test-2 [values]
; (println values))
(for [x values] (println x)))
(defn test-1 [values]
(doseq [x values]
(test-2 x)))
(test-1 [["1.1" "1.2"] ["2"] ["3"]])
I think I am missing some Clojure knowledge to understand what is going on here. Why does it look fine when I println or pprint the value in the second function, while the for does not work?
Update and Final Thoughts
As answered for this question, the problem has to do with the laziness of for. Let's take the simplest example to illustrate what is going on.
(defn test-2 [values]
(for [x values] (println x)))
(defn test-1 [values]
(doseq [x values]
(test-2 x)))
What happens here is that the doseq in test-1 is eager: it walks values immediately and calls test-2 on each element, discarding whatever test-2 returns.
doseq is generally the right choice when the body is an impure function that is called only for its side effects.
Then, when test-2 is called, the for creates a lazy-seq. That means that the sequence exists, but that it is never realized (so, no step has been computed yet). As written, nothing is printed by these two functions, since none of the values returned by the for are ever realized.
If we want to keep both the doseq and the for, then we have to make sure that the for gets realized in test-2. We can do it this way:
(defn test-2 [values]
(doall (for [x values] (println x))))
(defn test-1 [values]
(doseq [x values]
(test-2 x)))
What doall does here is force the full realization of the sequence returned by the for. That way, we end up with the expected result.
Other functions that consume the sequence also force realization. count realizes all of it, while first realizes only the first element (or the first chunk, for chunked sequences such as those built from vectors):
(defn test-2 [values]
(first (for [x values] (println x))))
(defn test-2 [values]
(count (for [x values] (println x))))
Neither of these makes much sense in practice, but both illustrate how consuming the lazy-seq returned by the for forces its realization.
Additionally, we could have simply used two doseq like this:
(defn test-2 [values]
(doseq [x values] (println x)))
(defn test-1 [values]
(doseq [x values]
(test-2 x)))
That way, we don't use any lazy-seq, so we don't have to realize anything since nothing is evaluated lazily.
for is lazy, while doseq is eager.
for is "functional" (values) and doseq is "imperative" (side-effects).
In other words, you should not be using for in second-fn, since you seem to be concerned only with side-effects. What you are actually doing there is building a lazy sequence (which, it seems, is never realized).
See Difference between doseq and for in Clojure for further info.
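The difference can be made visible by capturing printed output. with-out-str is used here only to collect what gets printed; it discards the return value of its body, so the lazy sequence produced by for is never consumed. print is used instead of println just to keep the captured strings short:

```clojure
;; for alone: the lazy seq is built but never realized, so nothing is printed
(with-out-str (for [x [1 2 3]] (print x)))
;; => ""

;; doall forces realization, so the side effects happen
(with-out-str (doall (for [x [1 2 3]] (print x))))
;; => "123"

;; doseq is eager and meant for side effects
(with-out-str (doseq [x [1 2 3]] (print x)))
;; => "123"
```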

What is the difference between the reader monad and a partial function in Clojure?

Leonardo Borges has put together a fantastic presentation on Monads in Clojure. In it he describes the reader monad in Clojure using the following code:
;; Reader Monad
(def reader-m
{:return (fn [a]
(fn [_] a))
:bind (fn [m k]
(fn [r]
((k (m r)) r)))})
(defn ask [] identity)
(defn asks [f]
(fn [env]
(f env)))
(defn connect-to-db []
(do-m reader-m
[db-uri (asks :db-uri)]
(prn (format "Connected to db at %s" db-uri))))
(defn connect-to-api []
(do-m reader-m
[api-key (asks :api-key)
env (ask)]
(prn (format "Connected to api with key %s" api-key))))
(defn run-app []
(do-m reader-m
[_ (connect-to-db)
_ (connect-to-api)]
(prn "Done.")))
((run-app) {:db-uri "user:passwd@host/dbname" :api-key "AF167"})
;; "Connected to db at user:passwd@host/dbname"
;; "Connected to api with key AF167"
;; "Done."
The benefit of this is that you're reading values from the environment in a purely functional way.
But this approach looks very similar to the partial function in Clojure. Consider the following code:
user=> (def hundred-times (partial * 100))
#'user/hundred-times
user=> (hundred-times 5)
500
user=> (hundred-times 4 5 6)
12000
My question is: What is the difference between the reader monad and a partial function in Clojure?
The reader monad is a set of rules we can apply to cleanly compose readers. You could use partial to make a reader, but it doesn't really give us a way to put them together.
For example, say you wanted a reader that doubled the value it read. You might use partial to define it:
(def doubler
(partial * 2))
You might also want a reader that added one to whatever value it read:
(def plus-oner
(partial + 1))
Now, suppose you wanted to combine these guys in a single reader that adds their results. You'll probably end up with something like this:
(defn super-reader
[env]
(let [x (doubler env)
y (plus-oner env)]
(+ x y)))
Notice that you have to explicitly forward the environment to those readers. Total bummer, right? Using the rules provided by the reader monad, we can get much cleaner composition:
(def super-reader
(do-m reader-m
[x doubler
y plus-oner]
(+ x y)))
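Since do-m itself is not shown here, the following sketch spells out by hand what that composition amounts to, using only the :bind and :return functions from reader-m as given in the question. This is an assumption about how the do notation would desugar, not the presentation's actual macro:

```clojure
(def reader-m
  {:return (fn [a] (fn [_] a))
   :bind   (fn [m k] (fn [r] ((k (m r)) r)))})

(def doubler   (partial * 2))
(def plus-oner (partial + 1))

;; Hand-written equivalent of binding x to doubler and y to plus-oner,
;; then returning (+ x y). The environment is never mentioned explicitly.
(def super-reader
  (let [{:keys [bind return]} reader-m]
    (bind doubler
          (fn [x]
            (bind plus-oner
                  (fn [y]
                    (return (+ x y))))))))

(super-reader 10)
;; => 31   (20 from doubler plus 11 from plus-oner)
```

The environment is threaded by bind alone, which is the part that partial by itself does not provide.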
You can use partial to "do" the reader monad. Turn let into a do-reader by doing syntactic transformation on let with partial application of the environment on the right-hand side.
(defmacro do-reader
[bindings & body]
(let [env (gensym 'env_)
partial-env (fn [f] (list `(partial ~f ~env)))
bindings* (mapv #(%1 %2) (cycle [identity partial-env]) bindings)]
`(fn [~env] (let ~bindings* ~@body))))
Then do-reader is to the reader monad as let is to the identity monad (relationship discussed here).
Indeed, since only the "do notation" application of the reader monad was used in Beyamor's answer to your reader monad in Clojure question, the same examples will work as is with m/domonad Reader replaced with do-reader as above.
But, for the sake of variety I'll modify the first example to be just a bit more Clojurish with the environment map and take advantage of the fact that keywords can act as functions.
(def sample-bindings {:count 3, :one 1, :b 2})
(def ask identity)
(def calc-is-count-correct?
(do-reader [binding-count :count
bindings ask]
(= binding-count (count bindings))))
(calc-is-count-correct? sample-bindings)
;=> true
Second example
(defn local [modify reader] (comp reader modify))
(def calc-content-len
(do-reader [content ask]
(count content)))
(def calc-modified-content-len
(local #(str "Prefix " %) calc-content-len))
(calc-content-len "12345")
;=> 5
(calc-modified-content-len "12345")
;=> 12
Note that since we built on let, we still have destructuring at our disposal. Silly example:
(def example1
(do-reader [a :foo
b :bar]
(+ a b)))
(example1 {:foo 2 :bar 40 :baz 800})
;=> 42
(def example2
(do-reader [[a b] (juxt :foo :bar)]
(+ a b)))
(example2 {:foo 2 :bar 40 :baz 800})
;=> 42
So, in Clojure, you can indeed get the functionality of the do notation of the reader monad without introducing monads proper. Analogous to doing a ReaderT transform on the identity monad, we can do a syntactic transformation on let. As you surmised, one way to do so is with partial application of the environment.
Perhaps more Clojurish would be to define a reader-> and reader->> to syntactically insert the environment as the second and last argument respectively. I'll leave those as an exercise for the reader for now.
One take-away from this is that while types and type-classes in Haskell have a lot of benefits and the monad structure is a useful idea, not having the constraints of the type system in Clojure allows us to treat data and programs in the same way and do arbitrary transformations to our programs to implement syntax and control as we see fit.