"Updating" a map/data structure in consecutive function calls in Clojure - clojure

I want to "update" a map through multiple calls of functions, in Clojure. The idea is expressed as below:
(defn foo1
[a-map]
(assoc a-map :key1 "value1"))
(defn foo2
[a-map]
(assoc a-map :key2 "value2"))
(defn foo3
[a-map]
(assoc a-map :key3 "value3"))
(defn -main
[]
(let [a-map {}]
(do (foo1 a-map)
(foo2 a-map)
(foo3 a-map)
a-map)))
Apparently this piece of code is wrong because the a-map is not updated outside the scope of subroutines. It's written like this simply because it's clearer as compared to my current "correct" implementation. The expected result is:
{:key1 "value1" :key2 "value2" :key3 "value3"}
My question is, what is the best way to fulfil the task in the clojure way.
I have considered nesting a few let expressions where each let can hold the result of the updated a-map. I also considered using a loop-recur structure where the a-map is passed as a binding. But both approaches look untidy.
Any advice would be appreciated.
Cheers
EDIT: Adding a constraint to the question. The functions foo1, foo2, foo3 actually takes more parameters than only the a-map. And some of these parameters can only be determined by processing the arguments passed to -main.

assoc returns the updated map so you can chain your calls using the -> threading macro:
(let [a-map (-> {}
foo1
foo2
foo3)] ...)
or you could use comp:
(let [a-map ((comp foo3 foo2 foo1) {})]...)

Related

Conditional "assignment" in functional programming

I am programming something that doesn't have side-effects, but my code is not very readable.
Consider the following piece of code:
(let [csv_data (if header_row (cons header_row data_rows) data_rows)]
)
I'm trying to use csv_data in a block of code. What is a clean way of conditioning on the presence of a header_row? I've looked at if-let, but couldn't see how that could help here.
I have run into similar situations with functional for-loops as well where I'm binding the result to a local variable, and the code looks like a pile of expressions.
Do I really have to create a separate helper function in so many cases?
What am I missing here?
Use the cond->> macro
(let [csv_data (cond->> data_rows
header_row (cons header-row)]
)
It works like the regular ->> macro, but before each threading form a test expression has to be placed that determines whether the threading form will be used.
There is also cond->. Read more about threading macros here: Official threading macros guide
First, don't use underscore, prefer dashes.
Second, there is nothing wrong with a little helper function; after all, this seems to be a requirement for handling your particular data format.
Third, if you can change your data so that you can skip those decisions and have a uniform representation for all corner cases, this is even better. A header row contains a different kind of data (column names?), so you might prefer to keep them separate:
(let [csv {:header header :rows rows}]
...)
Or maybe at some point you could have "headers" and "rows" be of the same type: sequences of rows. Then you can concat them directly.
The ensure-x idiom is a very common way to normalize your data:
(defn ensure-list [data]
(and data (list data)))
For example:
user=> (ensure-list "something")
("something")
user=> (ensure-list ())
(())
user=> (ensure-list nil)
nil
And thus:
(let [csv (concat (ensure-list header) rows)]
...)
i would propose an utility macro. Something like this:
(defmacro update-when [check val-to-update f & params]
`(if-let [x# ~check]
(~f x# ~val-to-update ~#params)
~val-to-update))
user> (let [header-row :header
data-rows [:data1 :data2]]
(let [csv-data (update-when header-row data-rows cons)]
csv-data))
;;=> (:header :data1 :data2)
user> (let [header-row nil
data-rows [:data1 :data2]]
(let [csv-data (update-when header-row data-rows cons)]
csv-data))
;;=> [:data1 :data2]
it is quite universal, and lets you fulfill more complex tasks then just simple consing. Like for example you want to reverse some coll if check is trueish, and concat another list...
user> (let [header-row :header
data-rows [:data1 :data2]]
(let [csv-data (update-when header-row data-rows
(fn [h d & params] (apply concat (reverse d) params))
[1 2 3] ['a 'b 'c])]
csv-data))
;;=> (:data2 :data1 1 2 3 a b c)
update
as noticed by #amalloy , this macro should be a function:
(defn update-when [check val-to-update f & params]
(if check
(apply f check val-to-update params)
val-to-update))
After thinking about the "cost" of a one-line helper function in the namespace I've came up with a local function instead:
(let [merge_header_fn (fn [header_row data_rows]
(if header_row
(cons header_row data_rows)
data_rows))
csv_data (merge_header_fn header_row data_rows) ]
...
<use csv_data>
...
)
Unless someone can suggest a more elegant way of handling this, I will keep this as an answer.

Why in this example calling (f arg) and calling the body of f explicitly yields different results?

First, I have no experience with CS and Clojure is my first language, so pardon if the following problem has a solution, that is immediately apparent for a programmer.
The summary of the question is as follows: one needs to create atoms at will with unknown yet symbols at unknown times. My approach revolves around a) storing temporarily the names of the atoms as strings in an atom itself; b) changing those strings to symbols with a function; c) using a function to add and create new atoms. The problem pertains to step "c": calling the function does not create new atoms, but using its body does create them.
All steps taken in the REPL are below (comments follow code blocks):
user=> (def atom-pool
#_=> (atom ["a1" "a2"]))
#'user/atom-pool
'atom-pool is the atom that stores intermediate to-be atoms as strings.
user=> (defn atom-symbols []
#_=> (mapv symbol (deref atom-pool)))
#'user/atom-symbols
user=> (defmacro populate-atoms []
#_=> (let [qs (vec (remove #(resolve %) (atom-symbols)))]
#_=> `(do ~#(for [s qs]
#_=> `(def ~s (atom #{}))))))
#'user/populate-atoms
'populate-atoms is the macro, that defines those atoms. Note, the purpose of (remove #(resolve %) (atom-symbols)) is to create only yet non-existing atoms. 'atom-symbols reads 'atom-pool and turns its content to symbols.
user=> (for [s ['a1 'a2 'a-new]]
#_=> (resolve s))
(nil nil nil)
Here it is confirmed that there are no 'a1', 'a2', 'a-new' atoms as of yet.
user=> (defn new-atom [a]
#_=> (do
#_=> (swap! atom-pool conj a)
#_=> (populate-atoms)))
#'user/new-atom
'new-atom is the function, that first adds new to-be atom as string to `atom-pool. Then 'populate-atoms creates all the atoms from 'atom-symbols function.
user=> (for [s ['a1 'a2 'a-new]]
#_=> (resolve s))
(#'user/a1 #'user/a2 nil)
Here we see that 'a1 'a2 were created as clojure.lang.Var$Unbound just by defining a function, why?
user=> (new-atom "a-new")
#'user/a2
user=> (for [s ['a1 'a2 'a-new]]
#_=> (resolve s))
(#'user/a1 #'user/a2 nil)
Calling (new-atom "a-new") did not create the 'a-new atom!
user=> (do
#_=> (swap! atom-pool conj "a-new")
#_=> (populate-atoms))
#'user/a-new
user=> (for [s ['a1 'a2 'a-new]]
#_=> (resolve s))
(#'user/a1 #'user/a2 #'user/a-new)
user=>
Here we see that resorting explicitly to 'new-atom's body did create the 'a-new atom. 'a-new is a type of clojure.lang.Atom, but 'a1 and 'a2 were skipped due to already being present in the namespace as clojure.lang.Var$Unbound.
Appreciate any help how to make it work!
EDIT: Note, this is an example. In my project the 'atom-pool is actually a collection of maps (atom with maps). Those maps have keys {:name val}. If a new map is added, then I create a corresponding atom for this map by parsing its :name key.
"The summary of the question is as follows: one needs to create atoms at will with unknown yet symbols at unknown times. "
This sounds like a solution looking for a problem. I would generally suggest you try another way of achieving whatever the actual functionality is without generating vars at runtime, but if you must, you should use intern and leave out the macro stuff.
You cannot solve this with macros since macros are expanded at compile time, meaning that in
(defn new-atom [a]
(do
(swap! atom-pool conj a)
(populate-atoms)))
populate-atoms is expanded only once; when the (defn new-atom ...) form is compiled, but you're attempting to change its expansion when new-atom is called (which necessarily happens later).
#JoostDiepenmaat is right about why populate-atoms is not behaving as expected. You simply cannot do this using macros, and it is generally best to avoid generating vars at runtime. A better solution would be to define your atom-pool as a map of keywords to atoms:
(def atom-pool
(atom {:a1 (atom #{}) :a2 (atom #{})}))
Then you don't need atom-symbols or populate-atoms because you're not dealing with vars at compile-time, but typical data structures at run-time. Your new-atom function could look like this:
(defn new-atom [kw]
(swap! atom-pool assoc kw (atom #{})))
EDIT: If you don't want your new-atom function to override existing atoms which might contain actual data instead of just #{}, you can check first to see if the atom exists in the atom-pool:
(defn new-atom [kw]
(when-not (kw #atom-pool)
(swap! atom-pool assoc kw (atom #{}))))
I've already submitted one answer to this question, and I think that that answer is better, but here is a radically different approach based on eval:
(def atom-pool (atom ["a1" "a2"]))
(defn new-atom! [name]
(load-string (format "(def %s (atom #{}))" name)))
(defn populate-atoms! []
(doseq [x atom-pool]
(new-atom x)))
format builds up a string where %s is substituted with the name you're passing in. load-string reads the resulting string (def "name" (atom #{})) in as a data structure and evals it (this is equivalent to (eval (read-string "(def ...)
Of course, then we're stuck with the problem of only defining atoms that don't already exist. We could change the our new-atom! function to make it so that we only create an atom if it doesn't already exist:
(defn new-atom! [name]
(when-not (resolve (symbol name))
(load-string (format "(def %s (atom #{}))" name name))))
The Clojure community seems to be against using eval in most cases, as it is usually not needed (macros or functions will do what you want in 99% of cases*), and eval can be potentially unsafe, especially if user input is involved -- see Brian Carper's answer to this question.
*After attempting to solve this particular problem using macros, I came to the conclusion that it either cannot be done without relying on eval, or my macro-writing skills just aren't good enough to get the job done with a macro!
At any rate, I still think my other answer is a better solution here -- generally when you're getting way down into the nuts & bolts of writing macros or using eval, there is probably a simpler approach that doesn't involve metaprogramming.

Side effects in Lisp/Clojure

My question is about structuring lisp code with side effects. The particular example I have in mind comes from Clojure, but I think it can apply to any lisp.
In this case, I am interacting with an existing library that requires some functions to be called in a particular order. The final function call creates the value I need for the rest of the procedure.
The code looks like this:
(defn foo []
(let [_ procedure-with-side-effect
__ another-procedure-with-side-effect
value procedure-that-creates-the-value]
(do-something value)))
This works and everything is great, except I think the let block looks hideous. Is there a better way to do this?
If you don't need the intermediate values of the function calls, you can just put a bunch of function calls in the body of the defn:
(defn foo []
(procedure-with-side-effect)
(another-procedure-with-side-effect)
(do-something (procedure-that-creates-the-value)))
While this is the best for this code, there are other options. You can also put any number of function calls in the body of a let:
(let [val 3]
(fun-call-1)
(fun-call-2)
(fun-call-3 val))
And if you don't want to bind any values, you can use do:
(do (fun-call-1)
(fun-call-2)
(fun-call-3))
In Lisp every function body is a ordered set of forms. The value(s) of the last form will be returned. If the procedures don't use intermediate result values as arguments, a LET is not necessary. If the procedure-that-creates-the-value does not need to be documented by naming a variable, the LET binding for its value is also not necessary.
So in Lisp the code is just this:
(defun foo ()
(procedure-with-side-effect)
(another-procedure-with-side-effect)
(do-something (procedure-that-creates-the-value)))
I'm not super experienced, but I'd do it this way:
(defn foo []
(procedure-with-side-effect)
(another-procedure-with-side-effect)
(let [value (procedure-that-creates-the-value)]
(do-something value)))
or
(defn foo []
(procedure-with-side-effect)
(another-procedure-with-side-effect)
(-> (procedure-that-creates-the-value)
do-something))
or
(defn foo []
(procedure-with-side-effect)
(another-procedure-with-side-effect)
(do-something (procedure-that-creates-the-value)))
Edit: defn expressions are wrapped with an implicit do.

Conditional elements in -> / ->> pipelines

Given a ->> pipeline like so:
(defn my-fn []
(->> (get-data)
(do-foo)
(do-bar)
(do-baz)))
I wish to make the various stages conditional.
The first way of writing this that came to mind was as such:
(defn my-fn [{:keys [foo bar baz]}]
(->> (get-data)
(if foo (do-foo) identity)
(if bar (do-bar) identity)
(if baz (do-baz) identity))
However, as the ->> macro attempts to insert into the if form, this not only looks unfortunate in terms of performance (having the noop identity calls), but actually fails to compile.
What would an appropriate, reasonably DRY way of writing this be?
Modern Clojure (that is, as of 1.5) supports a variety of options for conditional threading, but you probably want cond->>.
conditional threading with cond-> and cond->>
Clojure offers cond-> and cond->>, which each ask for a set of pairs: a test and an expression if that test evaluates to true. These are quite similar to cond but don't stop at the first true test.
(cond->> 5
true inc
false inc
nil inc)
=> 6
Your specific example is probably best written like so:
(defn my-fn [{:keys [foo bar baz]}]
(cond->> (get-data)
foo (do-foo)
bar (do-bar)
baz (do-baz)))
as->
It's worth mentioning as-> because it is perhaps the most versatile threading macro. It gives you a name to refer to the "thing" being threaded through the forms. For instance:
(as-> 0 n
(inc n)
(if false
(inc n)
n))
=> 1
(as-> 0 n
(inc n)
(if true
(inc n)
n))
=> 2
This gives substantial flexibility when working with a mix of functions that require threading the expression at different points in the parameter list (that is, switching from -> to ->> syntax). One should avoid the use of extraneous named variables in the interest of code readability, but oftentimes this is the clearest and simplest way to express a process.
This works too:
(defn my-fn [{:keys [foo bar baz]}]
(->> (get-data)
(#(if foo (do-foo %) %))
(#(if bar (do-bar %) %))
(#(if baz (do-baz %) %)) ))
You might be interested in these macros https://github.com/pallet/thread-expr
you can fix the compiling part (though not the ascetic part by having the if statement decide which function to run by putting another set of ( ) as so:
(defn my-fn [{:keys [foo bar baz]}]
(->> (get-data)
((if foo do-foo identity))
((if bar do-bar identity))
((if baz do-baz identity)))
would expand into a series of calls like this:
; choose function and call it
( (if foo do-foo identity) args )
( (if bar do-bar identity) args )
( (if baz do-baz identity) args )
A more general way to go if you need this sort of thing often:
(defn comp-opt [& flag-fns]
(->> flag-fns
(partition 2)
(filter first)
(map second)
(apply comp)))
(defn my-fn [{:keys [foo bar baz]}]
(map (comp-opt foo do-foo
bar do-bar
baz do-baz)
(get-data)))

In Clojure, how to define a variable named by a string?

Given a list of names for variables, I want to set those variables to an expression.
I tried this:
(doall (for [x ["a" "b" "c"]] (def (symbol x) 666)))
...but this yields the error
java.lang.Exception: First argument to def must be a Symbol
Can anyone show me the right way to accomplish this, please?
Clojure's "intern" function is for this purpose:
(doseq [x ["a" "b" "c"]]
(intern *ns* (symbol x) 666))
(doall (for [x ["a" "b" "c"]] (eval `(def ~(symbol x) 666))))
In response to your comment:
There are no macros involved here. eval is a function that takes a list and returns the result of executing that list as code. ` and ~ are shortcuts to create a partially-quoted list.
` means the contents of the following lists shall be quoted unless preceded by a ~
~ the following list is a function call that shall be executed, not quoted.
So ``(def ~(symbol x) 666)is the list containing the symboldef, followed by the result of executingsymbol xfollowed by the number of the beast. I could as well have written(eval (list 'def (symbol x) 666))` to achieve the same effect.
Updated to take Stuart Sierra's comment (mentioning clojure.core/intern) into account.
Using eval here is fine, but it may be interesting to know that it is not necessary, regardless of whether the Vars are known to exist already. In fact, if they are known to exist, then I think the alter-var-root solution below is cleaner; if they might not exist, then I wouldn't insist on my alternative proposition being much cleaner, but it seems to make for the shortest code (if we disregard the overhead of three lines for a function definition), so I'll just post it for your consideration.
If the Var is known to exist:
(alter-var-root (resolve (symbol "foo")) (constantly new-value))
So you could do
(dorun
(map #(-> %1 symbol resolve (alter-var-root %2))
["x" "y" "z"]
[value-for-x value-for-y value-for z]))
(If the same value was to be used for all Vars, you could use (repeat value) for the final argument to map or just put it in the anonymous function.)
If the Vars might need to be created, then you can actually write a function to do this (once again, I wouldn't necessarily claim this to be cleaner than eval, but anyway -- just for the interest of it):
(defn create-var
;; I used clojure.lang.Var/intern in the original answer,
;; but as Stuart Sierra has pointed out in a comment,
;; a Clojure built-in is available to accomplish the same
;; thing
([sym] (intern *ns* sym))
([sym val] (intern *ns* sym val)))
Note that if a Var turns out to have already been interned with the given name in the given namespace, then this changes nothing in the single argument case or just resets the Var to the given new value in the two argument case. With this, you can solve the original problem like so:
(dorun (map #(create-var (symbol %) 666) ["x" "y" "z"]))
Some additional examples:
user> (create-var 'bar (fn [_] :bar))
#'user/bar
user> (bar :foo)
:bar
user> (create-var 'baz)
#'user/baz
user> baz
; Evaluation aborted. ; java.lang.IllegalStateException:
; Var user/baz is unbound.
; It does exist, though!
;; if you really wanted to do things like this, you'd
;; actually use the clojure.contrib.with-ns/with-ns macro
user> (binding [*ns* (the-ns 'quux)]
(create-var 'foobar 5))
#'quux/foobar
user> quux/foobar
5
Evaluation rules for normal function calls are to evaluate all the items of the list, and call the first item in the list as a function with the rest of the items in the list as parameters.
But you can't make any assumptions about the evaluation rules for special forms or macros. A special form or the code produced by a macro call could evaluate all the arguments, or never evaluate them, or evaluate them multiple times, or evaluate some arguments and not others. def is a special form, and it doesn't evaluate its first argument. If it did, it couldn't work. Evaluating the foo in (def foo 123) would result in a "no such var 'foo'" error most of the time (if foo was already defined, you probably wouldn't be defining it yourself).
I'm not sure what you're using this for, but it doesn't seem very idiomatic. Using def anywhere but at the toplevel of your program usually means you're doing something wrong.
(Note: doall + for = doseq.)