So I've got something where I want to, within a try block, add various data to some data object, and then in case an exception is thrown, save that record with the error and all data fields retrieved before the exception. In Java this'd be easy. Even if you go with some kind of immutable type of record, you can do:
MyRecord record = new MyRecord();
try {
record = record.withA(dangerouslyGetA());
record = record.withB(dangerouslyGetB());
record = record.withC(dangerouslyGetC());
} catch (Exception ex) {
record = record.withError(ex);
}
save(record);
So if it bombs at step C, then it'll save record with A, B and the error.
I can't figure out any straightforward way to do this in Clojure. If you put a try around a let then you have to assign the record's "updates" to new variables each, and thus they aren't in scope in the catch expression. And even if they were, you wouldn't know which one to use.
I guess I could put a try/catch/let around each expression, but that's a lot more code than the Java version and requires duplicating the save statement everywhere. My understanding was that Clojure was great for its terseness and easy avoidance of duplication, so something makes me think this is the wrong way to go.
Certainly this is a fairly common need and has a simple idiomatic solution, right?
I think wrapping every single statement is actually the most idiomatic solution to this. And if you don't want to write too much, you can construct a macro to add exception handling to your single steps.
(defmacro with-error->
[error-fn value & forms]
(if-not (seq forms)
value
`(let [ef# ~error-fn
v# ~value]
(try
(with-error-> ef# (-> v# ~(first forms)) ~#(rest forms))
(catch Exception ex# (ef# v# ex#))))))
This behaves like -> but will call error-fn on the current value (and the exception) if the catch block is invoked:
(with-error-> #(assoc % :error %2) {}
(assoc :x 0)
(assoc :y 1)
(assoc :z (throw (Exception. "oops.")))
(assoc :a :i-should-not-be-reached))
;; => {:error #<Exception java.lang.Exception: oops.>, :y 1, :x 0}
Of course, you could always do it with mutable state, e.g. an atom, but I don't think you should if you can achieve the same with a little bit of macro-fu.
Related
Suppose I've got a function (remove-bad-nodes g) that returns a sequence like this:
[updated-g bad-nodes]
where updated-g is a graph with its bad nodes removed, and bad-nodes is a collection containing the removed nodes.
As an argument to a function or inside a let, I could destructure it like this:
(let [[g bads] (remove-bad-nodes g)]
...)
but that only defines local variables. How could I do that in the REPL, so that in future commands I can refer to the updated graph as g and the removed nodes as bads? The first thing that comes to mind is this:
(def [g bads] (remove-bad-nodes g)
but that doesn't work, because def needs its first argument to be a Symbol.
Note that I'm not asking why def doesn't have syntax like let; there's already a question about that. I'm wondering what is a convenient, practical way to work in the REPL with functions that return "multiple values". If there's some reason why in normal Clojure practice there's no need to destructure in the REPL, because you do something else instead, explaining that might make a useful answer. I've been running into this a lot lately, which is why I'm asking. Usually, but not always, these functions return an updated version of something along with some other information. In side-effecting code, the function would modify the object and return only one value (the removed nodes, in the example), but obviously that's not the Clojurely way to do it.
I think the way to work with such functions in the repl is just to not def your intermediate results unless they are particularly interesting; for interesting-enough intermediate results it's not a big hassle to either def them to a single name, or to write multiple defs inside a destructuring form.
For example, instead of
(def [x y] (foo))
(def [a b] (bar x y))
you could write
(let [[x y] (foo),
[x' y'] (bar x y)])
(def a x') ; or maybe leave this out if `a` isn't that interesting
(def b y'))
A nice side effect of this is that the code you write while playing around in the repl will look much more similar to the code you will one day add to your source file, where you will surely not be defing things over and over, but rather destructuring them, passing them to functions, and so on. It will be easier to adapt the information you learned at the repl into a real program.
There's nothing unique about destructuring w/r/t the REPL. The answer to your question is essentially the same as this question. I think your options are:
let:
(let [[light burnt just-right] (classify-toasts (make-lots-of-toast))]
(prn light burnt just-right))
def the individual values:
(def result (classify-toasts (make-lots-of-toast)))
(def light (nth result 0))
(def burnt (nth result 1))
(def just-right (nth result 2))
Or write a macro to do that def work for you.
You could also consider a different representation if your function is always returning a 3-tuple/vector e.g. you could alternatively return a map from classify-toasts:
{:light 1, :burnt 2, :just-right 3}
And then when you need one of those values, destructure the map using the keywords wherever you need:
(:light the-map) => 1
Observe:
user=> (def results [1 2 3])
#'user/results
user=> (let [[light burnt just-right] results] (def light light) (def burnt burnt) (def just-right just-right))
#'user/just-right
user=> light
1
user=> burnt
2
user=> just-right
3
I've made the same dumb mistake many many times:
(defrecord Record [field-name])
(let [field (:feld-name (->Record 1))] ; Whoops!
(+ 1 field))
Since I misspelled the field name keyword, this will cause a NPE.
The "obvious" solution to this would be to have defrecord emit namespaced keywords instead, since then, especially when working in a different file, the IDE will be able to immediately show what keywords are available as soon as I type ::n/.
I could probably with some creativity create a macro that wraps defrecord that creates the keywords for me, but this seems like overkill.
Is there a way to have defrecord emit namespaced field accessors, or is there any other good way to avoid this problem?
Because defrecords compile to java classes and fields on a java class don't have a concept of namespaces, I don't think there's a good way to have defrecord emit namespaced keywords.
One alternative, if the code is not performance sensitive and doesn't need to implement any protocols and similar, is to just use maps.
Another is, like Alan Thompson's solution, to make a safe-get funtion. The prismatic/plumbing util library also has an implementation of this.
(defn safe-get [m k]
(let [ret (get m k ::not-found)]
(if (= ::not-found ret)
(throw (ex-info "Key not found: " {:map m, :key k}))
ret)))
(defrecord x [foo])
(safe-get (->x 1) :foo) ;=> 1
(safe-get (->x 1) :fo) ;=>
;; 1. Unhandled clojure.lang.ExceptionInfo
;; Key not found:
;; {:map {:foo 1}, :key :fo}
I feel your pain. Thankfully I have a solution that saves me many times/week that I've been using a couple of years. It is the grab function from the Tupelo library. It does not provide the type of IDE integration you are hoping for, but it does provide fail-fast typo-detection, so you always be notified the very first time you try to use the non-existant key. Another benefit is that you'll get a stacktrace showing the line number with the misspelled keyword, not the line number (possibly far, far away) where the nil value causes a NPE.
It also works equally well for both records & plain-old maps (my usual use-case).
From the README:
Map Value Lookup
Maps are convenient, especially when keywords are used as functions to look up a value in a map. Unfortunately, attempting to look up a non-existent keyword in a map will return nil. While sometimes convenient, this means that a simple typo in the keyword name will silently return corrupted data (i.e. nil) instead of the desired value.
Instead, use the function grab for keyword/map lookup:
(grab k m)
"A fail-fast version of keyword/map lookup. When invoked as (grab :the-key the-map),
returns the value associated with :the-key as for (clojure.core/get the-map :the-key).
Throws an Exception if :the-key is not present in the-map."
(def sidekicks {:batman "robin" :clark "lois"})
(grab :batman sidekicks)
;=> "robin"
(grab :spiderman m)
;=> IllegalArgumentException Key not present in map:
map : {:batman "robin", :clark "lois"}
keys: [:spiderman]
The function grab should also be used in place of clojure.core/get. Simply reverse the order of arguments to match the "keyword-first, map-second" convention.
For looking up values in nested maps, the function fetch-in replaces clojure.core/get-in:
(fetch-in m ks)
"A fail-fast version of clojure.core/get-in. When invoked as (fetch-in the-map keys-vec),
returns the value associated with keys-vec as for (clojure.core/get-in the-map keys-vec).
Throws an Exception if the path keys-vec is not present in the-map."
(def my-map {:a 1
:b {:c 3}})
(fetch-in my-map [:b :c])
3
(fetch-in my-map [:b :z])
;=> IllegalArgumentException Key seq not present in map:
;=> map : {:b {:c 3}, :a 1}
;=> keys: [:b :z]
Your other option, using records, is to use the Java-interop style of accessor:
(.field-name myrec)
Since Clojure defrecord compiles into a simple Java class, your IDE may be able to recognize these names more easily. YMMV
What is the simplest way to trigger a side-effecting function to be called only when an atom's value changes?
If I were using a ref, I think I could just do this:
(defn transform-item [x] ...)
(defn do-side-effect-on-change [] nil)
(def my-ref (ref ...))
(when (dosync (let [old-value #my-ref
_ (alter! my-ref transform-item)
new-value #my-ref]
(not= old-value new-value)))
(do-side-effect-on-change))
But this seems seems a bit roundabout, since I'm using a ref even though I am not trying to coordinate changes across multiple refs. Essentially I am using it just to conveniently access the old and new value within a successful transaction.
I feel like I should be able to use an atom instead. Is there a solution simpler than this?
(def my-atom (atom ...))
(let [watch-key ::side-effect-watch
watch-fn (fn [_ _ old-value new-value]
(when (not= old-value new-value)
(do-side-effect-on-change)))]
(add-watch my-atom watch-key watch-fn)
(swap! my-atom transform-item)
(remove-watch watch-key))
This also seems roundabout, because I am adding and removing the watch around every call to swap!. But I need this, because I don't want a watch hanging around that causes the side-effecting function to be triggered when other code modifies the atom.
It is important that the side-effecting function be called exactly once per mutation to the atom, and only when the transform function transform-item actually returns a new value. Sometimes it will return the old value, yielding new change.
(when (not= #a (swap! a transform))
(do-side-effect))
But you should be very clear about what concurrency semantics you need. For example another thread may modify the atom between reading it and swapping it:
a = 1
Thread 1 reads a as 1
Thread 2 modifies a to 2
Thread 1 swaps a from 2 to 2
Thread 1 determines 1 != 2 and calls do-side-effect
It is not clear to me from the question whether this is desirable or not desirable. If you do not want this behavior, then an atom just will not do the job unless you introduce concurrency control with a lock.
Seeing as you started with a ref and asked about an atom, I think you have probably given some thought to concurrency already. It seems like from your description the ref approach is better:
(when (dosync (not= #r (alter r transform))
(do-side-effect))
Is there a reason you don't like your ref solution?
If the answer is "because I don't have concurrency" Then I would encourage you to use a ref anyway. There isn't really a downside to it, and it makes your semantics explicit. IMO programs tend to grow and to a point where concurrency exists, and Clojure is really great at being explicit about what should happen when it exists. (For example oh I'm just calculating stuff, oh I'm just exposing this stuff as a web service now, oh now I'm concurrent).
In any case, bear in mind that functions like alter and swap! return the value, so you can make use of this for concise expressions.
I'm running into the same situation and just come up 2 solutions.
state field :changed?
Keeping a meanless :changed mark in atom to track swap function. And take the return value of swap! to see if things changed. For example:
(defn data (atom {:value 0 :changed? false}))
(let [{changed? :changed?} (swap! data (fn [data] (if (change?)
{:value 1 :changed? true}
{:value 0 :change? false})))]
(when changed? (do-your-task)))
exception based
You can throw an Exception in swap function, and catch it outside:
(try
(swap! data (fn [d] (if (changed?) d2 (ex-info "unchanged" {})))
(do-your-task)
(catch Exception _
))
I've found myself using the following idiom lately in clojure code.
(def *some-global-var* (ref {}))
(defn get-global-var []
#*global-var*)
(defn update-global-var [val]
(dosync (ref-set *global-var* val)))
Most of the time this isn't even multi-threaded code that might need the transactional semantics that refs give you. It just feels like refs are for more than threaded code but basically for any global that requires immutability. Is there a better practice for this? I could try to refactor the code to just use binding or let but that can get particularly tricky for some applications.
I always use an atom rather than a ref when I see this kind of pattern - if you don't need transactions, just a shared mutable storage location, then atoms seem to be the way to go.
e.g. for a mutable map of key/value pairs I would use:
(def state (atom {}))
(defn get-state [key]
(#state key))
(defn update-state [key val]
(swap! state assoc key val))
Your functions have side effects. Calling them twice with the same inputs may give different return values depending on the current value of *some-global-var*. This makes things difficult to test and reason about, especially once you have more than one of these global vars floating around.
People calling your functions may not even know that your functions are depending on the value of the global var, without inspecting the source. What if they forget to initialize the global var? It's easy to forget. What if you have two sets of code both trying to use a library that relies on these global vars? They are probably going to step all over each other, unless you use binding. You also add overheads every time you access data from a ref.
If you write your code side-effect free, these problems go away. A function stands on its own. It's easy to test: pass it some inputs, inspect the outputs, they'll always be the same. It's easy to see what inputs a function depends on: they're all in the argument list. And now your code is thread-safe. And probably runs faster.
It's tricky to think about code this way if you're used to the "mutate a bunch of objects/memory" style of programming, but once you get the hang of it, it becomes relatively straightforward to organize your programs this way. Your code generally ends up as simple as or simpler than the global-mutation version of the same code.
Here's a highly contrived example:
(def *address-book* (ref {}))
(defn add [name addr]
(dosync (alter *address-book* assoc name addr)))
(defn report []
(doseq [[name addr] #*address-book*]
(println name ":" addr)))
(defn do-some-stuff []
(add "Brian" "123 Bovine University Blvd.")
(add "Roger" "456 Main St.")
(report))
Looking at do-some-stuff in isolation, what the heck is it doing? There are a lot of things happening implicitly. Down this path lies spaghetti. An arguably better version:
(defn make-address-book [] {})
(defn add [addr-book name addr]
(assoc addr-book name addr))
(defn report [addr-book]
(doseq [[name addr] addr-book]
(println name ":" addr)))
(defn do-some-stuff []
(let [addr-book (make-address-book)]
(-> addr-book
(add "Brian" "123 Bovine University Blvd.")
(add "Roger" "456 Main St.")
(report))))
Now it's clear what do-some-stuff is doing, even in isolation. You can have as many address books floating around as you want. Multiple threads could have their own. You can use this code from multiple namespaces safely. You can't forget to initialize the address book, because you pass it as an argument. You can test report easily: just pass the desired "mock" address book in and see what it prints. You don't have to care about any global state or anything but the function you're testing at the moment.
If you don't need to coordinate updates to a data structure from multiple threads, there's usually no need to use refs or global vars.
I'm looking for an easy and safe way to parse a map, and only a map, from a string supplied by an untrusted source. The map contains keywords and numbers. What are the security concerns of using read to do this?
read is by default totally unsafe, it allows arbitrary code execution. Try (read-string "#=(println \"hello\")") as an example.
You can make it safer by binding *read-eval* to false. This will cause an exception to be triggered if there #= notation is used. For example:
(binding [*read-eval* false] (read-string "#=(println \"hello\")"))
Finally, depending on how you are using it there is a potential denial of service attack by supplying a large number of keywords (:foo, :bar). Keywords are interned and never freed so if enough are used the process will run out of memory. There's some discussion about that on the clojure-dev list.
If you want to be safe I think you basically need to parse it by hand without doing an eval. Here is an example of one way to do it:
(apply hash-map
(map #(%1 %2)
(cycle [#(keyword (apply str (drop 1 %)))
#(Integer/parseInt %)])
(string/split ":a 23 :b 32 :c 32" #" ")))
Depending on the types of numbers you want to support and how much error checking you want to do you will want to modify the two functions that are being cycled over to process every map every other value to a keyword or number.