Redefining a let'd variable in Clojure loop - clojure

OK. I've been tinkering with Clojure and I continually run into the same problem. Let's take this little fragment of code:
(let [x 128]
(while (> x 1)
(do
(println x)
(def x (/ x 2)))))
Now I expect this to print out a sequence starting with 128 as so:
128
64
32
16
8
4
2
Instead, it's an infinite loop, printing 128 over and over. Clearly my intended side effect isn't working.
So how am I supposed to redefine the value of x in a loop like this? I realize this may not be Lisp like (I could use an anonymous function that recurses on it's self, perhaps), but if I don't figure out how to set variable like this, I'm going to go mad.
My other guess would be to use set!, but that gives "Invalid assignment target", since I'm not in a binding form.
Please, enlighten me on how this is supposed to work.

def defines a toplevel var, even if you use it in a function or inner loop of some code. What you get in let are not vars. Per the documentation for let:
Locals created with let are not variables. Once created their values never change!
(Emphasis not mine.) You don't need mutable state for your example here; you could use loop and recur.
(loop [x 128]
(when (> x 1)
(println x)
(recur (/ x 2))))
If you wanted to be fancy you could avoid the explicit loop entirely.
(let [xs (take-while #(> % 1) (iterate #(/ % 2) 128))]
(doseq [x xs] (println x)))
If you really wanted to use mutable state, an atom might work.
(let [x (atom 128)]
(while (> #x 1)
(println #x)
(swap! x #(/ %1 2))))
(You don't need a do; while wraps its body in an explicit one for you.) If you really, really wanted to do this with vars you'd have to do something horrible like this.
(with-local-vars [x 128]
(while (> (var-get x) 1)
(println (var-get x))
(var-set x (/ (var-get x) 2))))
But this is very ugly and it's not idiomatic Clojure at all. To use Clojure effectively you should try to stop thinking in terms of mutable state. It will definitely drive you crazy trying to write Clojure code in a non-functional style. After a while you may find it to be a pleasant surprise how seldom you actually need mutable variables.

Vars (that's what you get when you "def" something) aren't meant to be reassigned (but can be):
user=> (def k 1)
#'user/k
user=> k
1
There's nothing stopping you from doing:
user=> (def k 2)
#'user/k
user=> k
2
If you want a thread-local settable "place" you can use "binding" and "set!":
user=> (def j) ; this var is still unbound (no value)
#'user/j
user=> j
java.lang.IllegalStateException: Var user/j is unbound. (NO_SOURCE_FILE:0)
user=> (binding [j 0] j)
0
So then you can write a loop like this:
user=> (binding [j 0]
(while (< j 10)
(println j)
(set! j (inc j))))
0
1
2
3
4
5
6
7
8
9
nil
But I think this is pretty unidiomatic.

If you think that having mutable local variables in pure functions would be a nice convenient feature that does no harm, because the function still remains pure, you might be interested in this mailing list discussion where Rich Hickey explains his reasons for removing them from the language. Why not mutable locals?
Relevant part:
If locals were variables, i.e. mutable, then closures could close over
mutable state, and, given that closures can escape (without some extra
prohibition on same), the result would be thread-unsafe. And people
would certainly do so, e.g. closure-based pseudo-objects. The result
would be a huge hole in Clojure's approach.
Without mutable locals, people are forced to use recur, a functional
looping construct. While this may seem strange at first, it is just as
succinct as loops with mutation, and the resulting patterns can be
reused elsewhere in Clojure, i.e. recur, reduce, alter, commute etc
are all (logically) very similar. Even though I could detect and
prevent mutating closures from escaping, I decided to keep it this way
for consistency. Even in the smallest context, non-mutating loops are
easier to understand and debug than mutating ones. In any case, Vars
are available for use when appropriate.
The majority of the subsequent posts concerns implementing a with-local-vars macro ;)

You could more idiomatically use iterate and take-while instead,
user> (->> 128
(iterate #(/ % 2))
(take-while (partial < 1)))
(128 64 32 16 8 4 2)
user>

Related

Functional alternative to "let"

I find myself writing a lot of clojure in this manner:
(defn my-fun [input]
(let [result1 (some-complicated-procedure input)
result2 (some-other-procedure result1)]
(do-something-with-results result1 result2)))
This let statement seems very... imperative. Which I don't like. In principal, I could be writing the same function like this:
(defn my-fun [input]
(do-something-with-results (some-complicated-procedure input)
(some-other-procedure (some-complicated-procedure input)))))
The problem with this is that it involves recomputation of some-complicated-procedure, which may be arbitrarily expensive. Also you can imagine that some-complicated-procedure is actually a series of nested function calls, and then I either have to write a whole new function, or risk that changes in the first invocation don't get applied to the second:
E.g. this works, but I have to have an extra shallow, top-level function that makes it hard to do a mental stack trace:
(defn some-complicated-procedure [input] (lots (of (nested (operations input)))))
(defn my-fun [input]
(do-something-with-results (some-complicated-procedure input)
(some-other-procedure (some-complicated-procedure input)))))
E.g. this is dangerous because refactoring is hard:
(defn my-fun [input]
(do-something-with-results (lots (of (nested (operations (mistake input))))) ; oops made a change here that wasn't applied to the other nested calls
(some-other-procedure (lots (of (nested (operations input))))))))
Given these tradeoffs, I feel like I don't have any alternatives to writing long, imperative let statements, but when I do, I cant shake the feeling that I'm not writing idiomatic clojure. Is there a way I can address the computation and code cleanliness problems raised above and write idiomatic clojure? Are imperitive-ish let statements idiomatic?
The kind of let statements you describe might remind you of imperative code, but there is nothing imperative about them. Haskell has similar statements for binding names to values within bodies, too.
If your situation really needs a bigger hammer, there are some bigger hammers that you can either use or take for inspiration. The following two libraries offer some kind of binding form (akin to let) with a localized memoization of results, so as to perform only the necessary steps and reuse their results if needed again: Plumatic Plumbing, specifically the Graph part; and Zach Tellman's Manifold, whose let-flow form furthermore orchestrates asynchronous steps to wait for the necessary inputs to become available, and to run in parallel when possible. Even if you decide to maintain your present course, their docs make good reading, and the code of Manifold itself is educational.
I recently had this same question when I looked at this code I wrote
(let [user-symbols (map :symbol states)
duplicates (for [[id freq] (frequencies user-symbols) :when (> freq 1)] id)]
(do-something-with duplicates))
You'll note that map and for are lazy and will not be executed until do-something-with is executed. It's also possible that not all (or even not any) of the states will be mapped or the frequencies calculated. It depends on what do-something-with actually requests of the sequence returned by for. This is very much functional and idiomatic functional programming.
i guess the simplest approach to keep it functional would be to have a pass-through state to accumulate the intermediate results. something like this:
(defn with-state [res-key f state]
(assoc state res-key (f state)))
user> (with-state :res (comp inc :init) {:init 10})
;;=> {:init 10, :res 11}
so you can move on to something like this:
(->> {:init 100}
(with-state :inc'd (comp inc :init))
(with-state :inc-doubled (comp (partial * 2) :inc'd))
(with-state :inc-doubled-squared (comp #(* % %) :inc-doubled))
(with-state :summarized (fn [st] (apply + (vals st)))))
;;=> {:init 100,
;; :inc'd 101,
;; :inc-doubled 202,
;; :inc-doubled-squared 40804,
;; :summarized 41207}
The let form is a perfectly functional construct and can be seen as syntactic sugar for calls to anonymous functions. We can easily write a recursive macro to implement our own version of let:
(defmacro my-let [bindings body]
(if (empty? bindings)
body
`((fn [~(first bindings)]
(my-let ~(rest (rest bindings)) ~body))
~(second bindings))))
Here is an example of calling it:
(my-let [a 3
b (+ a 1)]
(* a b))
;; => 12
And here is a macroexpand-all called on the above expression, that reveal how we implement my-let using anonymous functions:
(clojure.walk/macroexpand-all '(my-let [a 3
b (+ a 1)]
(* a b)))
;; => ((fn* ([a] ((fn* ([b] (* a b))) (+ a 1)))) 3)
Note that the expansion doesn't rely on let and that the bound symbols become parameter names in the anonymous functions.
As others write, let is actually perfectly functional, but at times it can feel imperative. It's better to become fully comfortable with it.
You might, however, want to kick the tires of my little library tl;dr that lets you write code like for example
(compute
(+ a b c)
where
a (f b)
c (+ 100 b))

Using let style destructuring for def

Is there a reasonable way to have multiple def statements happen with destructing the same way that let does it? For Example:
(let [[rtgs pcts] (->> (sort-by second row)
(apply map vector))]
.....)
What I want is something like:
(defs [rtgs pcts] (->> (sort-by second row)
(apply map vector)))
This comes up a lot in the REPL, notebooks and when debugging. Seriously feels like a missing feature so I'd like guidance on one of:
This exists already and I'm missing it
This is a bad idea because... (variable capture?, un-idiomatic?, Rich said so?)
It's just un-needed and I must be suffering from withdrawals from an evil language. (same as: don't mess up our language with your macros)
A super short experiment give me something like:
(defmacro def2 [[name1 name2] form]
`(let [[ret1# ret2#] ~form]
(do (def ~name1 ret1#)
(def ~name2 ret2#))))
And this works as in:
(def2 [three five] ((juxt dec inc) 4))
three ;; => 3
five ;; => 5
Of course and "industrial strength" version of that macro might be:
checking that number of names matches the number of inputs. (return from form)
recursive call to handle more names (can I do that in a macro like this?)
While I agree with Josh that you probably shouldn't have this running in production, I don't see any harm in having it as a convenience at the repl (in fact I think I'll copy this into my debug-repl kitchen-sink library).
I enjoy writing macros (although they're usually not needed) so I whipped up an implementation. It accepts any binding form, like in let.
(I wrote this specs-first, but if you're on clojure < 1.9.0-alpha17, you can just remove the spec stuff and it'll work the same.)
(ns macro-fun
(:require
[clojure.spec.alpha :as s]
[clojure.core.specs.alpha :as core-specs]))
(s/fdef syms-in-binding
:args (s/cat :b ::core-specs/binding-form)
:ret (s/coll-of simple-symbol? :kind vector?))
(defn syms-in-binding
"Returns a vector of all symbols in a binding form."
[b]
(letfn [(step [acc coll]
(reduce (fn [acc x]
(cond (coll? x) (step acc x)
(symbol? x) (conj acc x)
:else acc))
acc, coll))]
(if (symbol? b) [b] (step [] b))))
(s/fdef defs
:args (s/cat :binding ::core-specs/binding-form, :body any?))
(defmacro defs
"Like def, but can take a binding form instead of a symbol to
destructure the results of the body.
Doesn't support docstrings or other metadata."
[binding body]
`(let [~binding ~body]
~#(for [sym (syms-in-binding binding)]
`(def ~sym ~sym))))
;; Usage
(defs {:keys [foo bar]} {:foo 42 :bar 36})
foo ;=> 42
bar ;=> 36
(defs [a b [c d]] [1 2 [3 4]])
[a b c d] ;=> [1 2 3 4]
(defs baz 42)
baz ;=> 42
About your REPL-driven development comment:
I don't have any experience with Ipython, but I'll give a brief explanation of my REPL workflow and you can maybe comment about any comparisons/contrasts with Ipython.
I never use my repl like a terminal, inputting a command and waiting for a reply. My editor supports (emacs, but any clojure editor should do) putting the cursor at the end of any s-expression and sending that to the repl, "printing" the result after the cursor.
I usually have a comment block in the file where I start working, just typing whatever and evaluating it. Then, when I'm reasonably happy with a result, I pull it out of the "repl-area" and into the "real-code".
(ns stuff.core)
;; Real code is here.
;; I make sure that this part always basically works,
;; ie. doesn't blow up when I evaluate the whole file
(defn foo-fn [x]
,,,)
(comment
;; Random experiments.
;; I usually delete this when I'm done with a coding session,
;; but I copy some forms into tests.
;; Sometimes I leave it for posterity though,
;; if I think it explains something well.
(def some-data [,,,])
;; Trying out foo-fn, maybe copy this into a test when I'm done.
(foo-fn some-data)
;; Half-finished other stuff.
(defn bar-fn [x] ,,,)
(keys 42) ; I wonder what happens if...
)
You can see an example of this in the clojure core source code.
The number of defs that any piece of clojure will have will vary per project, but I'd say that in general, defs are not often the result of some computation, let alone the result of a computation that needs to be destructured. More often defs are the starting point for some later computation that will depend on this value.
Usually functions are better for computing a value; and if the computation is expensive, then you can memoize the function. If you feel you really need this functionality, then by all means, use your macro -- that's one of the sellings points of clojure, namely, extensibility! But in general, if you feel you need this construct, consider the possibility that you're relying too much on global state.
Just to give some real examples, I just referenced my main project at work, which is probably 2K-3K lines of clojure, in about 20 namespaces. We have about 20 defs, most of which are marked private and among them, none are actually computing anything. We have things like:
(def path-prefix "/some-path")
(def zk-conn (atom nil))
(def success? #{200})
(def compile* (clojure.core.memoize/ttl compiler {} ...)))
(def ^:private nashorn-factory (NashornScriptEngineFactory.))
(def ^:private read-json (comp json/read-str ... ))
Defining functions (using comp and memoize), enumerations, state via atom -- but no real computation.
So I'd say, based on your bullet points above, this falls somewhere between 2 and 3: it's definitely not a common use case that's needed (you're the first person I've ever heard who wants this, so it's uncommon to me anyway); and the reason it's uncommon is because of what I said above, i.e., it may be a code smell that indicates reliance on too much global state, and hence, would not be very idiomatic.
One litmus test I have for much of my code is: if I pull this function out of this namespace and paste it into another, does it still work? Removing dependencies on external vars allows for easier testing and more modular code. Sometimes we need it though, so see what your requirements are and proceed accordingly. Best of luck!

Could Clojure do without let?

I find I very rarely use let in Clojure. For some reason I took a dislike to it when I started learning and have avoided using it ever since. It feels like the flow has stopped when let comes along. I was wondering, do you think we could do without it altogether ?
You can replace any occurrence of (let [a1 b1 a2 b2...] ...) by ((fn [a1 a2 ...] ...) b1 b2 ...) so yes, we could. I am using let a lot though, and I'd rather not do without it.
Let offers a few benefits. First, it allows value binding in a functional context. Second, it confers readability benefits. So while technically, one could do away with it (in the sense that you could still program without it), the language would be impoverished without a valuable tool.
One of the nice things about let is that it helps formalize a common (mathematical) way of specifying a computation, in which you introduce convenient bindings and then a simplified formula as a result. It's clear the bindings only apply to that "scope" and it's tie in with a more mathematical formulation is useful, especially for more functional programmers.
It's not a coincidence that let blocks occur in other languages like Haskell.
Let is indispensable to me in preventing multiple execution in macros:
(defmacro print-and-run [s-exp]
`(do (println "running " (quote ~s-exp) "produced " ~s-exp)
s-exp))
would run s-exp twice, which is not what we want:
(defmacro print-and-run [s-exp]
`(let [result# s-exp]
(do (println "running " (quote ~s-exp) "produced " result#)
result#))
fixes this by binding the result of the expression to a name and referring to that result twice.
because the macro is returning an expression that will become part of another expression (macros are function that produce s-expressions) they need to produce local bindings to prevent multiple execution and avoid symbol capture.
I think I understand your question. Correct me if it's wrong. Some times "let" is used for imperative programming style. For example,
... (let [x (...)
y (...x...)
z (...x...y...)
....x...y...z...] ...
This pattern comes from imperative languages:
... { x = ...;
y = ...x...;
...x...y...;} ...
You avoid this style and that's why you also avoid "let", don't you?
In some problems imperative style reduces amount of code. Furthermore, some times It's more efficient to write in java or c.
Also in some cases "let" just holds values of subexpressions regardless of evaluation order. For example,
(... (let [a (...)
b (...)...]
(...a...b...a...b...) ;; still fp style
There are at least two important use cases for let-bindings:
First, using let properly can make your code clearer and shorter. If you have an expression that you use more than once, binding it in a let is very nice. Here's a portion of the standard function map that uses let:
...
(let [s1 (seq c1) s2 (seq c2)]
(when (and s1 s2)
(cons (f (first s1) (first s2))
(map f (rest s1) (rest s2)))))))
...
Even if you use an expression only once, it can still be helpful (to future readers of the code) to give it a semantically meaningful name.
Second, as Arthur mentioned, if you want to use the value of an expression more than once, but only want it evaluated once, you can't simply type out the entire expression twice: you need some kind of binding. This would be merely wasteful if you have a pure expression:
user=> (* (+ 3 2) (+ 3 2))
25
but actually changes the meaning of the program if the expression has side-effects:
user=> (* (+ 3 (do (println "hi") 2))
(+ 3 (do (println "hi") 2)))
hi
hi
25
user=> (let [x (+ 3 (do (println "hi") 2))]
(* x x))
hi
25
Stumbled upon this recently so ran some timings:
(testing "Repeat vs Let vs Fn"
(let [start (System/currentTimeMillis)]
(dotimes [x 1000000]
(* (+ 3 2) (+ 3 2)))
(prn (- (System/currentTimeMillis) start)))
(let [start (System/currentTimeMillis)
n (+ 3 2)]
(dotimes [x 1000000]
(* n n))
(prn (- (System/currentTimeMillis) start)))
(let [start (System/currentTimeMillis)]
(dotimes [x 1000000]
((fn [x] (* x x)) (+ 3 2)))
(prn (- (System/currentTimeMillis) start)))))
Output
Testing Repeat vs Let vs Fn
116
18
60
'let' wins over 'pure' functional.

How to break out from nested doseqs

I have a question regarding nested doseq loops. In the start function, once I find an answer I set the atom to true, so that the outer loop validation with :while fails. However it seems that it doesn't break it, and the loops keep on going. What's wrong with it?
I am also quite confused with the usage of atoms, refs, agents (Why do they have different names for the update functions when then the mechanism is almost the same?) etc.
Is it okay to use an atom in this situation as a flag? Obviously I need a a variable like object to store a state.
(def pentagonal-list (map (fn [a] (/ (* a (dec (* 3 a))) 2)) (iterate inc 1)))
(def found (atom false))
(defn pentagonal? [a]
(let [y (/ (inc (Math/sqrt (inc (* 24 a)))) 6)
x (mod (* 10 y) 10)]
(if (zero? x)
true
false)))
(defn both-pent? [a b]
(let [sum (+ b a)
diff (- a b)]
(if (and (pentagonal? sum) (pentagonal? diff))
true
false)))
(defn start []
(doseq [x pentagonal-list :while (false? #found)]
(doseq [y pentagonal-list :while (<= y x)]
(if (both-pent? x y)
(do
(reset! found true)
(println (- x y)))))))
Even once the atom is set to true, your version can't stop running until the inner doseq finishes (until y > x). It will terminate the outer loop once the inner loop finishes though. It does terminate eventually when I run it. Not sure what you're seeing.
You don't need two doseqs to do this. One doseq can handle two seqs at once.
user> (doseq [x (range 0 2) y (range 3 6)] (prn [x y]))
[0 3]
[0 4]
[0 5]
[1 3]
[1 4]
[1 5]
(The same is true of for.) There is no mechanism for "breaking out" of nested doseqs that I know of, except throw/catch, but that's rather un-idiomatic. You don't need atoms or doseq for this at all though.
(def answers (filter (fn [[x y]] (both-pent? x y))
(for [x pentagonal-list
y pentagonal-list :while (<= y x)]
[x y])))
Your code is very imperative in style. "Loop over these lists, then test the values, then print something, then stop looping." Using atoms for control like this is not very idiomatic in Clojure.
A more functional way is to take a seq (pentagonal-list) and wrap it in functions that turn it into other seqs until you get a seq that gives you what you want. First I use for to turn two copies of this seqs into one seq of pairs where y <= x. Then I use filter to turn that seq into one that filters out the values we don't care about.
filter and for are lazy, so this will stop running once it finds the first valid value, if one is all you want. This returns the two numbers you want, and then you can subtract them.
(apply - (first answers))
Or you can further wrap the function in another map to calculate the differences for you.
(def answers2 (map #(apply - %) answers))
(first answers2)
There are some advantages to programming functionally in this way. The seq is cached (if you hold onto the head like I do here), so once a value is calculated, it remembers it and you can access it instantly from then on. Your version can't be run again without resetting the atom, and then would have to re-calculate everything. With my version you can (take 5 answers) to get the first 5 results, or map over the result to do other things if you want. You can doseq over it and print the values. Etc. etc.
I'm sure there are other (maybe better) ways to do this still without using an atom. You should usually avoid mutating references unless it's 100% necessary in Clojure.
The function names for altering atoms/agents/refs are different probably because the mechanics are not the same. Refs are synchronous and coordinated via transactions. Agents are asynchronous. Atoms are synchronous and uncoordinated. They all kind of "change a reference", and probably some kind of super-function or macro could wrap them all in one name, but that would obscure the fact that they're doing drastically different things under the hood. Explaining the differences fully is probably beyond the scope of an SO post to explain, but http://clojure.org fully explains all of the nuances of the differences.

Common programming mistakes for Clojure developers to avoid [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
What are some common mistakes made by Clojure developers, and how can we avoid them?
For example; newcomers to Clojure think that the contains? function works the same as java.util.Collection#contains. However, contains? will only work similarly when used with indexed collections like maps and sets and you're looking for a given key:
(contains? {:a 1 :b 2} :b)
;=> true
(contains? {:a 1 :b 2} 2)
;=> false
(contains? #{:a 1 :b 2} :b)
;=> true
When used with numerically indexed collections (vectors, arrays) contains? only checks that the given element is within the valid range of indexes (zero-based):
(contains? [1 2 3 4] 4)
;=> false
(contains? [1 2 3 4] 0)
;=> true
If given a list, contains? will never return true.
Literal Octals
At one point I was reading in a matrix which used leading zeros to maintain proper rows and columns. Mathematically this is correct, since leading zero obviously don't alter the underlying value. Attempts to define a var with this matrix, however, would fail mysteriously with:
java.lang.NumberFormatException: Invalid number: 08
which totally baffled me. The reason is that Clojure treats literal integer values with leading zeros as octals, and there is no number 08 in octal.
I should also mention that Clojure supports traditional Java hexadecimal values via the 0x prefix. You can also use any base between 2 and 36 by using the "base+r+value" notation, such as 2r101010 or 36r16 which are 42 base ten.
Trying to return literals in an anonymous function literal
This works:
user> (defn foo [key val]
{key val})
#'user/foo
user> (foo :a 1)
{:a 1}
so I believed this would also work:
(#({%1 %2}) :a 1)
but it fails with:
java.lang.IllegalArgumentException: Wrong number of args passed to: PersistentArrayMap
because the #() reader macro gets expanded to
(fn [%1 %2] ({%1 %2}))
with the map literal wrapped in parenthesis. Since it's the first element, it's treated as a function (which a literal map actually is), but no required arguments (such as a key) are provided. In summary, the anonymous function literal does not expand to
(fn [%1 %2] {%1 %2}) ; notice the lack of parenthesis
and so you can't have any literal value ([], :a, 4, %) as the body of the anonymous function.
Two solutions have been given in the comments. Brian Carper suggests using sequence implementation constructors (array-map, hash-set, vector) like so:
(#(array-map %1 %2) :a 1)
while Dan shows that you can use the identity function to unwrap the outer parenthesis:
(#(identity {%1 %2}) :a 1)
Brian's suggestion actually brings me to my next mistake...
Thinking that hash-map or array-map determine the unchanging concrete map implementation
Consider the following:
user> (class (hash-map))
clojure.lang.PersistentArrayMap
user> (class (hash-map :a 1))
clojure.lang.PersistentHashMap
user> (class (assoc (apply array-map (range 2000)) :a :1))
clojure.lang.PersistentHashMap
While you generally won't have to worry about the concrete implementation of a Clojure map, you should know that functions which grow a map - like assoc or conj - can take a PersistentArrayMap and return a PersistentHashMap, which performs faster for larger maps.
Using a function as the recursion point rather than a loop to provide initial bindings
When I started out, I wrote a lot of functions like this:
; Project Euler #3
(defn p3
([] (p3 775147 600851475143 3))
([i n times]
(if (and (divides? i n) (fast-prime? i times)) i
(recur (dec i) n times))))
When in fact loop would have been more concise and idiomatic for this particular function:
; Elapsed time: 387 msecs
(defn p3 [] {:post [(= % 6857)]}
(loop [i 775147 n 600851475143 times 3]
(if (and (divides? i n) (fast-prime? i times)) i
(recur (dec i) n times))))
Notice that I replaced the empty argument, "default constructor" function body (p3 775147 600851475143 3) with a loop + initial binding. The recur now rebinds the loop bindings (instead of the fn parameters) and jumps back to the recursion point (loop, instead of fn).
Referencing "phantom" vars
I'm speaking about the type of var you might define using the REPL - during your exploratory programming - then unknowingly reference in your source. Everything works fine until you reload the namespace (perhaps by closing your editor) and later discover a bunch of unbound symbols referenced throughout your code. This also happens frequently when you're refactoring, moving a var from one namespace to another.
Treating the for list comprehension like an imperative for loop
Essentially you're creating a lazy list based on existing lists rather than simply performing a controlled loop. Clojure's doseq is actually more analogous to imperative foreach looping constructs.
One example of how they're different is the ability to filter which elements they iterate over using arbitrary predicates:
user> (for [n '(1 2 3 4) :when (even? n)] n)
(2 4)
user> (for [n '(4 3 2 1) :while (even? n)] n)
(4)
Another way they're different is that they can operate on infinite lazy sequences:
user> (take 5 (for [x (iterate inc 0) :when (> (* x x) 3)] (* 2 x)))
(4 6 8 10 12)
They also can handle more than one binding expression, iterating over the rightmost expression first and working its way left:
user> (for [x '(1 2 3) y '(\a \b \c)] (str x y))
("1a" "1b" "1c" "2a" "2b" "2c" "3a" "3b" "3c")
There's also no break or continue to exit prematurely.
Overuse of structs
I come from an OOPish background so when I started Clojure my brain was still thinking in terms of objects. I found myself modeling everything as a struct because its grouping of "members", however loose, made me feel comfortable. In reality, structs should mostly be considered an optimization; Clojure will share the keys and some lookup information to conserve memory. You can further optimize them by defining accessors to speed up the key lookup process.
Overall you don't gain anything from using a struct over a map except for performance, so the added complexity might not be worth it.
Using unsugared BigDecimal constructors
I needed a lot of BigDecimals and was writing ugly code like this:
(let [foo (BigDecimal. "1") bar (BigDecimal. "42.42") baz (BigDecimal. "24.24")]
when in fact Clojure supports BigDecimal literals by appending M to the number:
(= (BigDecimal. "42.42") 42.42M) ; true
Using the sugared version cuts out a lot of the bloat. In the comments, twils mentioned that you can also use the bigdec and bigint functions to be more explicit, yet remain concise.
Using the Java package naming conversions for namespaces
This isn't actually a mistake per se, but rather something that goes against the idiomatic structure and naming of a typical Clojure project. My first substantial Clojure project had namespace declarations - and corresponding folder structures - like this:
(ns com.14clouds.myapp.repository)
which bloated up my fully-qualified function references:
(com.14clouds.myapp.repository/load-by-name "foo")
To complicate things even more, I used a standard Maven directory structure:
|-- src/
| |-- main/
| | |-- java/
| | |-- clojure/
| | |-- resources/
| |-- test/
...
which is more complex than the "standard" Clojure structure of:
|-- src/
|-- test/
|-- resources/
which is the default of Leiningen projects and Clojure itself.
Maps utilize Java's equals() rather than Clojure's = for key matching
Originally reported by chouser on IRC, this usage of Java's equals() leads to some unintuitive results:
user> (= (int 1) (long 1))
true
user> ({(int 1) :found} (int 1) :not-found)
:found
user> ({(int 1) :found} (long 1) :not-found)
:not-found
Since both Integer and Long instances of 1 are printed the same by default, it can be difficult to detect why your map isn't returning any values. This is especially true when you pass your key through a function which, perhaps unbeknownst to you, returns a long.
It should be noted that using Java's equals() instead of Clojure's = is essential for maps to conform to the java.util.Map interface.
I'm using Programming Clojure by Stuart Halloway, Practical Clojure by Luke VanderHart, and the help of countless Clojure hackers on IRC and the mailing list to help along my answers.
Forgetting to force evaluation of lazy seqs
Lazy seqs aren't evaluated unless you ask them to be evaluated. You might expect this to print something, but it doesn't.
user=> (defn foo [] (map println [:foo :bar]) nil)
#'user/foo
user=> (foo)
nil
The map is never evaluated, it's silently discarded, because it's lazy. You have to use one of doseq, dorun, doall etc. to force evaluation of lazy sequences for side-effects.
user=> (defn foo [] (doseq [x [:foo :bar]] (println x)) nil)
#'user/foo
user=> (foo)
:foo
:bar
nil
user=> (defn foo [] (dorun (map println [:foo :bar])) nil)
#'user/foo
user=> (foo)
:foo
:bar
nil
Using a bare map at the REPL kind of looks like it works, but it only works because the REPL forces evaluation of lazy seqs itself. This can make the bug even harder to notice, because your code works at the REPL and doesn't work from a source file or inside a function.
user=> (map println [:foo :bar])
(:foo
:bar
nil nil)
I'm a Clojure noob. More advanced users may have more interesting problems.
trying to print infinite lazy sequences.
I knew what I was doing with my lazy sequences, but for debugging purposes I inserted some print/prn/pr calls, temporarily having forgotten what it is I was printing. Funny, why's my PC all hung up?
trying to program Clojure imperatively.
There is some temptation to create a whole lot of refs or atoms and write code that constantly mucks with their state. This can be done, but it's not a good fit. It may also have poor performance, and rarely benefit from multiple cores.
trying to program Clojure 100% functionally.
A flip side to this: Some algorithms really do want a bit of mutable state. Religiously avoiding mutable state at all costs may result in slow or awkward algorithms. It takes judgement and a bit of experience to make the decision.
trying to do too much in Java.
Because it's so easy to reach out to Java, it's sometimes tempting to use Clojure as a scripting language wrapper around Java. Certainly you'll need to do exactly this when using Java library functionality, but there's little sense in (e.g.) maintaining data structures in Java, or using Java data types such as collections for which there are good equivalents in Clojure.
Lots of things already mentioned. I'll just add one more.
Clojure if treats Java Boolean objects always as true even if it's value is false. So if you have a java land function that returns a java Boolean value, make sure you do not check it directly
(if java-bool "Yes" "No")
but rather
(if (boolean java-bool) "Yes" "No").
I got burned by this with clojure.contrib.sql library that returns database boolean fields as java Boolean objects.
Keeping your head in loops.
You risk running out of memory if you loop over the elements of a potentially very large, or infinite, lazy sequence while keeping a reference to the first element.
Forgetting there's no TCO.
Regular tail-calls consume stack space, and they will overflow if you're not careful. Clojure has 'recur and 'trampoline to handle many of the cases where optimized tail-calls would be used in other languages, but these techniques have to be intentionally applied.
Not-quite-lazy sequences.
You may build a lazy sequence with 'lazy-seq or 'lazy-cons (or by building upon higher level lazy APIs), but if you wrap it in 'vec or pass it through some other function that realizes the sequence, then it will no longer be lazy. Both the stack and the heap can be overflown by this.
Putting mutable things in refs.
You can technically do it, but only the object reference in the ref itself is governed by the STM - not the referred object and its fields (unless they are immutable and point to other refs). So whenever possible, prefer to only immutable objects in refs. Same thing goes for atoms.
using loop ... recur to process sequences when map will do.
(defn work [data]
(do-stuff (first data))
(recur (rest data)))
vs.
(map do-stuff data)
The map function (in the latest branch) uses chunked sequences and many other optimizations. Also, because this function is frequently run, the Hotspot JIT usually has it optimized and ready to go with out any "warm up time".
Collection types have different behaviors for some operations:
user=> (conj '(1 2 3) 4)
(4 1 2 3) ;; new element at the front
user=> (conj [1 2 3] 4)
[1 2 3 4] ;; new element at the back
user=> (into '(3 4) (list 5 6 7))
(7 6 5 3 4)
user=> (into [3 4] (list 5 6 7))
[3 4 5 6 7]
Working with strings can be confusing (I still don't quite get them). Specifically, strings are not the same as sequences of characters, even though sequence functions work on them:
user=> (filter #(> (int %) 96) "abcdABCDefghEFGH")
(\a \b \c \d \e \f \g \h)
To get a string back out, you'd need to do:
user=> (apply str (filter #(> (int %) 96) "abcdABCDefghEFGH"))
"abcdefgh"
too many parantheses, especially with void java method call inside which results in NPE:
public void foo() {}
((.foo))
results in NPE from outer parantheses because inner parantheses evaluate to nil.
public int bar() { return 5; }
((.bar))
results in the easier to debug:
java.lang.Integer cannot be cast to clojure.lang.IFn
[Thrown class java.lang.ClassCastException]