clojure.typed basics

I've started playing with the seemingly quite impressive clojure.typed library, but very shortly after I run into problems, even when trying to apply it to simple functions. Does anyone have experience with the library?
Problem 1
(typed/ann square [Double -> Double])
(defn square "Square of"
  [num]
  (* num num))
Type Error (clojure_study/ideas/swarm/vector_algebra.clj:15:3) Return type of static method clojure.lang.Numbers/multiply is java.lang.Long, expected java.lang.Double.
Problem 2
(typed/defalias CartesianVector '{:x Double :y Double})
(typed/ann v+ [CartesianVector * -> CartesianVector])
(defn v+ "Sum vector of vectors"
  [& vectors]
  (apply merge-with + vectors))
Type Error (clojure_study/ideas/swarm/vector_algebra.clj:28:3) Bad arguments to polymorphic function in apply
in: (apply merge-with + vectors)
Problem 3
(typed/ann v- [CartesianVector CartesianVector -> CartesianVector])
(defn v- "Diff vector of vectors"
  [v1 v2]
  (merge-with - v1 v2))
Type Error (clojure_study/ideas/swarm/vector_algebra.clj:33:3) Polymorphic function merge-with could not be applied to arguments:
Polymorphic Variables:
k
v
Thanks for any help offered.

Your question is now 3 years old, so this may not be much help, but I was using Typed Clojure in a big production codebase around the same time and have some experience with it. Also, the answers that weavejester provided in your Reddit thread on the topic are pretty much spot-on, so I'm just going to re-summarize them here to save future visitors the inconvenience of having to click another link.
In general your approach is correct at a high level but you're running into areas in which core.typed simply didn't (and maybe still doesn't) know how to behave smartly.
Here's what's going on:
Problem 1
This should probably be considered a bug in core.typed, because there is a function signature supporting Double as a return type. You can circumvent it by using java.lang.Number or clojure.core.typed/Num instead, both of which encompass both Long and Double.
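For example, annotating square with the broader numeric alias should type-check (a sketch; t here stands for clojure.core.typed, as in the snippets below):

(t/ann square [t/Num -> t/Num])
(defn square "Square of"
  [num]
  (* num num))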
Problem 2
This is just a syntax error - that's not how you specify maps to core.typed. You should be using an HMap instead:
(t/defalias CartesianVector
  (t/HMap :mandatory {:x t/Num, :y t/Num} :complete? true))
Problem 3
Unfortunately core.typed cannot infer that merge-with (a core function), when applied to two maps of the same type, will return a map of the same type. This is a limitation of the type checker. You can get around it by rewriting your function to merge explicitly rather than relying on merge-with:
(defn v-
  [{x1 :x, y1 :y} {x2 :x, y2 :y}]
  {:x (- x1 x2), :y (- y1 y2)})
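The same workaround should apply to v+ from Problem 2, though I haven't verified this sketch against the checker:

(t/ann v+ [CartesianVector * -> CartesianVector])
(defn v+ "Sum vector of vectors"
  [& vectors]
  (reduce (fn [{x1 :x, y1 :y} {x2 :x, y2 :y}]
            {:x (+ x1 x2), :y (+ y1 y2)})
          {:x 0, :y 0}
          vectors))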

Related

How can you destructure in the REPL?

Suppose I've got a function (remove-bad-nodes g) that returns a sequence like this:
[updated-g bad-nodes]
where updated-g is a graph with its bad nodes removed, and bad-nodes is a collection containing the removed nodes.
As an argument to a function or inside a let, I could destructure it like this:
(let [[g bads] (remove-bad-nodes g)]
  ...)
but that only defines local variables. How could I do that in the REPL, so that in future commands I can refer to the updated graph as g and the removed nodes as bads? The first thing that comes to mind is this:
(def [g bads] (remove-bad-nodes g))
but that doesn't work, because def needs its first argument to be a Symbol.
Note that I'm not asking why def doesn't have syntax like let; there's already a question about that. I'm wondering what is a convenient, practical way to work in the REPL with functions that return "multiple values". If there's some reason why in normal Clojure practice there's no need to destructure in the REPL, because you do something else instead, explaining that might make a useful answer. I've been running into this a lot lately, which is why I'm asking. Usually, but not always, these functions return an updated version of something along with some other information. In side-effecting code, the function would modify the object and return only one value (the removed nodes, in the example), but obviously that's not the Clojurely way to do it.
I think the way to work with such functions in the repl is just to not def your intermediate results unless they are particularly interesting; for interesting-enough intermediate results it's not a big hassle to either def them to a single name, or to write multiple defs inside a destructuring form.
For example, instead of
(def [x y] (foo))
(def [a b] (bar x y))
you could write
(let [[x y] (foo),
      [x' y'] (bar x y)]
  (def a x') ; or maybe leave this out if `a` isn't that interesting
  (def b y'))
A nice side effect of this is that the code you write while playing around in the repl will look much more similar to the code you will one day add to your source file, where you will surely not be defing things over and over, but rather destructuring them, passing them to functions, and so on. It will be easier to adapt the information you learned at the repl into a real program.
There's nothing unique about destructuring w/r/t the REPL. The answer to your question is essentially the same as this question. I think your options are:
let:
(let [[light burnt just-right] (classify-toasts (make-lots-of-toast))]
  (prn light burnt just-right))
def the individual values:
(def result (classify-toasts (make-lots-of-toast)))
(def light (nth result 0))
(def burnt (nth result 1))
(def just-right (nth result 2))
Or write a macro to do that def work for you.
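Such a macro isn't part of core; here is a minimal sketch (hypothetical name def+, vector destructuring only):

(defmacro def+
  "def each symbol bound by a vector destructuring form."
  [binding-form expr]
  (let [names (remove #{'&} (filter symbol? (flatten binding-form)))]
    `(let [~binding-form ~expr]
       ~@(map (fn [n] `(def ~n ~n)) names))))

; (def+ [g bads] (remove-bad-nodes g))
; => defines #'g and #'bads, like the let-plus-defs shown below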
You could also consider a different representation if your function is always returning a 3-tuple/vector e.g. you could alternatively return a map from classify-toasts:
{:light 1, :burnt 2, :just-right 3}
And then when you need one of those values, destructure the map using the keywords wherever you need:
(:light the-map) => 1
Observe:
user=> (def results [1 2 3])
#'user/results
user=> (let [[light burnt just-right] results] (def light light) (def burnt burnt) (def just-right just-right))
#'user/just-right
user=> light
1
user=> burnt
2
user=> just-right
3

Functionality of update vs update-in for non-nested structures

I was looking over some Quil examples, and noticed that 2 different authors (for the "Hyper" and "Equilibrium" examples) used:
(update-in s [:x] + dx vx)
instead of simply
(update s :x + dx vx)
Is there a reason for this? If s was a deeply nested structure, ya, it would make sense. In both cases though, the list of keys only has 1 entry, so to me, the 2 snippets above seem equivalent:
(let [dx 1
      vx 2
      s {:x 5}]
  (println (update-in s [:x] + dx vx))
  (println (update s :x + dx vx)))
{:x 8}
{:x 8}
Except that update-in will probably have a little bit more overhead.
The only reason I could think of is that if they make the state nested in the future, it will ease the transition. For such a simple example, though, that seems unlikely, especially given there are magic constants everywhere.
Is there any reason to use update-in over update when the structure isn't nested?
There is no reason to use update-in for a non-nested structure and update is preferred.
If you look at the source code, you will see they both use assoc under the covers. There is no reason to prefer one over the other except for style and code clarity, taking nearby and related code into account.
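Simplified sketches of the two (renamed with a trailing * to avoid clobbering core, extra arities omitted) show the shared assoc core:

(defn update* [m k f & args]
  (assoc m k (apply f (get m k) args)))

(defn update-in* [m [k & ks] f & args]
  (if ks
    (assoc m k (apply update-in* (get m k) ks f args))
    (assoc m k (apply f (get m k) args))))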
Also, update was not added until Clojure 1.7, which may play a role in the choice.
P.S. If you are ever looking for the missing function dissoc-in you can find it in the Tupelo library.

How does Clojure ^:const work?

I'm trying to understand what ^:const does in clojure. This is what the dev docs say. http://dev.clojure.org/display/doc/1.3
(def constants
  {:pi 3.14
   :e 2.71})
(def ^:const pi (:pi constants))
(def ^:const e (:e constants))
The overhead of looking up :e and :pi in the map happens at compile time, as (:pi constants) and (:e constants) are evaluated when their parent def forms are evaluated.
This is confusing, since the metadata is on the var bound to the symbol pi and the var bound to the symbol e, yet the sentence above says it speeds up the map lookups, not the var lookups.
Can someone explain what ^:const is doing and the rationale behind using it? How does it compare to using a giant let block, or to using macros like (pi) and (e)?
That looks like a bad example to me, since the stuff about map-lookup just confuses the issue.
A more realistic example would be:
(def pi 3.14)
(defn circ [r] (* 2 pi r))
In this case, the body of circ is compiled into code that dereferences pi at runtime (by calling Var.getRawRoot) each time circ is called.
(def ^:const pi 3.14)
(defn circ2 [r] (* 2 pi r))
In this case, circ2 is compiled into exactly the same code as if it had been written like this:
(defn circ2 [r] (* 2 3.14 r))
That is, the call to Var.getRawRoot is skipped, which saves a little bit of time. Here is a quick measurement, where circ is the first version above, and circ2 is the second:
user> (time (dotimes [_ 1e5] (circ 1)))
"Elapsed time: 16.864154 msecs"
user> (time (dotimes [_ 1e5] (circ2 1)))
"Elapsed time: 6.854782 msecs"
Besides the efficiency aspect described above, there is a safety aspect that is also useful. Consider the following code:
(def two 2)
(defn times2 [x] (* two x))
(assert (= 4 (times2 2))) ; Expected result
(def two 3) ; Ooops! The value of the "constant" changed
(assert (= 6 (times2 2))) ; Used the new (incorrect) value
(def ^:const const-two 2)
(defn times2 [x] (* const-two x))
(assert (= 4 (times2 2))) ; Still works
(def const-two 3) ; No effect!
(assert (= 3 const-two)) ; It did change...
(assert (= 4 (times2 2))) ; ...but the function did not.
So, by using the ^:const metadata when defining vars, the vars are effectively "inlined" into every place they are used. Any subsequent changes to the var, therefore, do not affect any code where the "old" value has already been inlined.
The use of ^:const also serves a documentation function. When one reads (def ^:const pi 3.14159), it tells the reader that the var pi is not ever intended to change, and that it is simply a convenient (& hopefully descriptive) name for the value 3.14159.
Having said all the above, note that I never use ^:const in my code, since it is deceptive and provides "false assurance" that a var will never change. The problem is that ^:const implies one cannot redefine a var, but as we saw with const-two it does not prevent the var from being changed. Instead, ^:const hides the fact that the var has a new value, since const-two has been copied/inlined (at compile-time) to each place of use before the var is changed (at run-time).
A much better solution would be to throw an Exception upon attempting to change a ^:const var.
In the example docs, they are trying to show that if you def a var to be the result of looking something up in a map without using const, the lookup happens when the class loads. So you pay that cost once each time you run the program (not at every lookup, just when the class loads), plus the cost of looking up the value in the var each time it is read.
If you instead make it const, the compiler will perform the map lookup at compile time and emit a simple Java final variable, so you pay the lookup cost only once, at the time you compile the program.
This is a contrived example, because one map lookup at class load time and some var lookups at runtime are basically nothing, but it illustrates the point: some work happens at compile time, some at load time, and the rest... well, the rest of the time.
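A sketch of that distinction, reusing the constants map from the question (pi-load and pi-inline are hypothetical names):

(def pi-load (:pi constants))   ; map lookup runs once, at namespace load;
                                ; readers still dereference the var each time
(def ^:const pi-inline (:pi constants)) ; map lookup runs at compile time;
                                        ; code reading pi-inline gets 3.14 inlined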

Clojure: reduce vs. apply

I understand the conceptual difference between reduce and apply:
(reduce + (list 1 2 3 4 5))
; translates to: (+ (+ (+ (+ 1 2) 3) 4) 5)
(apply + (list 1 2 3 4 5))
; translates to: (+ 1 2 3 4 5)
However, which one is more idiomatic Clojure? Does it make much difference one way or the other? From my (limited) performance testing, it seems reduce is a bit faster.
reduce and apply are of course only equivalent (in terms of the ultimate result returned) for associative functions which need to see all their arguments in the variable-arity case. When they are result-wise equivalent, I'd say that apply is always perfectly idiomatic, while reduce is equivalent -- and might shave off a fraction of a blink of an eye -- in a lot of the common cases. What follows is my rationale for believing this.
+ is itself implemented in terms of reduce for the variable-arity case (more than 2 arguments). Indeed, this seems like an immensely sensible "default" way to go for any variable-arity, associative function: reduce has the potential to perform some optimisations to speed things up -- perhaps through something like internal-reduce, a 1.2 novelty recently disabled in master, but hopefully to be reintroduced in the future -- which it would be silly to replicate in every function which might benefit from them in the vararg case. In such common cases, apply will just add a little overhead. (Note it's nothing to be really worried about.)
On the other hand, a complex function might take advantage of some optimisation opportunities which aren't general enough to be built into reduce; then apply would let you take advantage of those while reduce might actually slow you down. A good example of the latter scenario occurring in practice is provided by str: it uses a StringBuilder internally and will benefit significantly from the use of apply rather than reduce.
So, I'd say use apply when in doubt; and if you happen to know that it's not buying you anything over reduce (and that this is unlikely to change very soon), feel free to use reduce to shave off that diminutive unnecessary overhead if you feel like it.
For newbies looking at this answer,
be careful, they are not the same:
(apply hash-map [:a 5 :b 6])
;= {:a 5, :b 6}
(reduce hash-map [:a 5 :b 6])
;= {{{:a 5} :b} 6}
It doesn't make a difference in this case, because + is a special case that can apply to any number of arguments. Reduce is a way to apply a function that expects a fixed number of arguments (2) to an arbitrarily long list of arguments.
Opinions vary- In the greater Lisp world, reduce is definitely considered more idiomatic. First, there is the variadic issues already discussed. Also, some Common Lisp compilers will actually fail when apply is applied against very long lists because of how they handle argument lists.
Amongst Clojurists in my circle, though, using apply in this case seems more common. I find it easier to grok and prefer it also.
I normally find myself preferring reduce when acting on any kind of collection - it performs well, and is a pretty useful function in general.
The main reason I would use apply is if the parameters mean different things in different positions, or if you have a couple of initial parameters but want to get the rest from a collection, e.g.
(apply + 1 2 other-number-list)
In this specific case I prefer reduce because it's more readable: when I read
(reduce + some-numbers)
I know immediately that you're turning a sequence into a value.
With apply I have to consider which function is being applied: "ah, it's the + function, so I'm getting... a single number". Slightly less straightforward.
When using a simple function like +, it really doesn't matter which one you use.
In general, the idea is that reduce is an accumulating operation. You present the current accumulation value and one new value to your accumulating function. The result of the function is the cumulative value for the next iteration. So, your iterations look like:
cum-val[i+1] = F( cum-val[i], input-val[i] ) ; please forgive the java-like syntax!
For apply, the idea is that you are attempting to call a function expecting a number of scalar arguments, but they are currently in a collection and need to be pulled out. So, instead of saying:
(def vals [val1 val2 val3])
(some-fn (vals 0) (vals 1) (vals 2))
we can say:
(apply some-fn vals)
and it is converted to be equivalent to:
(some-fn val1 val2 val3)
So, using "apply" is like "removing the parentheses" around the sequence.
A bit late on the topic, but I did a simple experiment after reading this example. Here are the results from my REPL; I can't deduce much from them, but it seems there is some sort of caching kicking in between reduce and apply.
user=> (time (reduce + (range 1e3)))
"Elapsed time: 5.543 msecs"
499500
user=> (time (apply + (range 1e3)))
"Elapsed time: 5.263 msecs"
499500
user=> (time (apply + (range 1e4)))
"Elapsed time: 19.721 msecs"
49995000
user=> (time (reduce + (range 1e4)))
"Elapsed time: 1.409 msecs"
49995000
user=> (time (reduce + (range 1e5)))
"Elapsed time: 17.524 msecs"
4999950000
user=> (time (apply + (range 1e5)))
"Elapsed time: 11.548 msecs"
4999950000
Looking at the source code of Clojure's reduce, it's pretty clean recursion with internal-reduce; I didn't find anything on the implementation of apply, though. Clojure's implementation of + for the apply case internally invokes reduce, which gets cached by the REPL, which seems to explain the 4th call. Can someone clarify what's really happening here?
The beauty of apply is that the given function (+ in this case) can be applied to an argument list formed by prepending any intervening arguments to a final collection. reduce is an abstraction for processing collection items by applying the function to each, and doesn't work with the variable-args case:
(apply + 1 2 3 [3 4])
=> 13
(reduce + 1 2 3 [3 4])
ArityException Wrong number of args (5) passed to: core/reduce clojure.lang.AFn.throwArity (AFn.java:429)
A bit late, but...
In this case, there is not a big difference. But in general they are not equivalent. Furthermore, reduce can be more performant. Why?
reduce checks whether a collection or type implements the clojure.lang.IReduce interface. That means a type knows how to provide its values to the reducing function in the most performant way.
reduce can be stopped prematurely by returning a Reduced value.
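A quick illustration, short-circuiting a reduction over an infinite range with reduced:

(reduce (fn [acc x]
          (if (> acc 100)
            (reduced acc) ; stop immediately and return acc
            (+ acc x)))
        0
        (range))
;=> 105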
apply, on the other hand, is invoked via applyToHelper, which dispatches to the right arity by counting the args and unpacking the values from the collection.
Is it a big performance impact? Probably not.
My opinion is as others already pointed out. Use reduce if you want to semantically "reduce" a collection to a single value. Otherwise use apply.

Common programming mistakes for Clojure developers to avoid [closed]

What are some common mistakes made by Clojure developers, and how can we avoid them?
For example, newcomers to Clojure think that the contains? function works the same as java.util.Collection#contains. However, contains? only works similarly when used with indexed collections like maps and sets, where you're looking for a given key:
(contains? {:a 1 :b 2} :b)
;=> true
(contains? {:a 1 :b 2} 2)
;=> false
(contains? #{:a 1 :b 2} :b)
;=> true
When used with numerically indexed collections (vectors, arrays), contains? only checks that the given key is within the valid range of indexes (zero-based):
(contains? [1 2 3 4] 4)
;=> false
(contains? [1 2 3 4] 0)
;=> true
If given a list, contains? will never return true.
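For example (note that this behavior has since changed: in later Clojure versions, 1.5+ if I remember correctly, contains? throws for unsupported types instead of returning false):

(contains? '(1 2 3) 1)
;=> false
;; recent Clojure instead throws:
;; IllegalArgumentException contains? not supported on type: clojure.lang.PersistentList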
Literal Octals
At one point I was reading in a matrix which used leading zeros to maintain proper rows and columns. Mathematically this is correct, since leading zeros obviously don't alter the underlying value. Attempts to define a var with this matrix, however, would fail mysteriously with:
java.lang.NumberFormatException: Invalid number: 08
which totally baffled me. The reason is that Clojure treats literal integer values with leading zeros as octals, and there is no number 08 in octal.
I should also mention that Clojure supports traditional Java hexadecimal values via the 0x prefix. You can also use any base between 2 and 36 by using the "base+r+value" notation, such as 2r101010 or 36r16 which are 42 base ten.
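A few REPL examples of the literal syntaxes mentioned:

user=> 0x2A      ; hexadecimal
42
user=> 2r101010  ; base 2
42
user=> 36r16     ; base 36
42
user=> 010       ; leading zero: octal
8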
Trying to return literals in an anonymous function literal
This works:
user> (defn foo [key val]
{key val})
#'user/foo
user> (foo :a 1)
{:a 1}
so I believed this would also work:
(#({%1 %2}) :a 1)
but it fails with:
java.lang.IllegalArgumentException: Wrong number of args passed to: PersistentArrayMap
because the #() reader macro gets expanded to
(fn [%1 %2] ({%1 %2}))
with the map literal wrapped in parenthesis. Since it's the first element, it's treated as a function (which a literal map actually is), but no required arguments (such as a key) are provided. In summary, the anonymous function literal does not expand to
(fn [%1 %2] {%1 %2}) ; notice the lack of parenthesis
and so you can't have any literal value ([], :a, 4, %) as the body of the anonymous function.
Two solutions have been given in the comments. Brian Carper suggests using sequence implementation constructors (array-map, hash-set, vector) like so:
(#(array-map %1 %2) :a 1)
while Dan shows that you can use the identity function to unwrap the outer parenthesis:
(#(identity {%1 %2}) :a 1)
Brian's suggestion actually brings me to my next mistake...
Thinking that hash-map or array-map determine the unchanging concrete map implementation
Consider the following:
user> (class (hash-map))
clojure.lang.PersistentArrayMap
user> (class (hash-map :a 1))
clojure.lang.PersistentHashMap
user> (class (assoc (apply array-map (range 2000)) :a :1))
clojure.lang.PersistentHashMap
While you generally won't have to worry about the concrete implementation of a Clojure map, you should know that functions which grow a map - like assoc or conj - can take a PersistentArrayMap and return a PersistentHashMap, which performs faster for larger maps.
Using a function as the recursion point rather than a loop to provide initial bindings
When I started out, I wrote a lot of functions like this:
; Project Euler #3
(defn p3
  ([] (p3 775147 600851475143 3))
  ([i n times]
   (if (and (divides? i n) (fast-prime? i times)) i
       (recur (dec i) n times))))
When in fact loop would have been more concise and idiomatic for this particular function:
; Elapsed time: 387 msecs
(defn p3 [] {:post [(= % 6857)]}
  (loop [i 775147 n 600851475143 times 3]
    (if (and (divides? i n) (fast-prime? i times)) i
        (recur (dec i) n times))))
Notice that I replaced the zero-argument "default constructor" function body, (p3 775147 600851475143 3), with a loop plus initial bindings. The recur now rebinds the loop bindings (instead of the fn parameters) and jumps back to the recursion point (the loop, instead of the fn).
Referencing "phantom" vars
I'm speaking about the type of var you might define using the REPL - during your exploratory programming - then unknowingly reference in your source. Everything works fine until you reload the namespace (perhaps by closing your editor) and later discover a bunch of unbound symbols referenced throughout your code. This also happens frequently when you're refactoring, moving a var from one namespace to another.
Treating the for list comprehension like an imperative for loop
Essentially you're creating a lazy list based on existing lists rather than simply performing a controlled loop. Clojure's doseq is actually more analogous to imperative foreach looping constructs.
One example of how they're different is the ability to filter which elements they iterate over using arbitrary predicates:
user> (for [n '(1 2 3 4) :when (even? n)] n)
(2 4)
user> (for [n '(4 3 2 1) :while (even? n)] n)
(4)
Another way they're different is that they can operate on infinite lazy sequences:
user> (take 5 (for [x (iterate inc 0) :when (> (* x x) 3)] (* 2 x)))
(4 6 8 10 12)
They can also handle more than one binding expression, iterating over the rightmost expression first and working their way left:
user> (for [x '(1 2 3) y '(\a \b \c)] (str x y))
("1a" "1b" "1c" "2a" "2b" "2c" "3a" "3b" "3c")
There's also no break or continue to exit prematurely.
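For side effects, doseq accepts the same binding modifiers but evaluates eagerly and returns nil:

user> (doseq [n '(1 2 3 4) :when (even? n)] (println n))
2
4
nil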
Overuse of structs
I come from an OOPish background so when I started Clojure my brain was still thinking in terms of objects. I found myself modeling everything as a struct because its grouping of "members", however loose, made me feel comfortable. In reality, structs should mostly be considered an optimization; Clojure will share the keys and some lookup information to conserve memory. You can further optimize them by defining accessors to speed up the key lookup process.
Overall you don't gain anything from using a struct over a map except for performance, so the added complexity might not be worth it.
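For reference, here is the struct machinery being discussed - a minimal sketch with a hypothetical point struct (records via defrecord have since superseded structs for most uses):

(defstruct point :x :y)        ; define the struct basis
(def p (struct point 1 2))     ; instantiate
(def x-of (accessor point :x)) ; accessor: a faster lookup fn for :x
(x-of p) ;=> 1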
Using unsugared BigDecimal constructors
I needed a lot of BigDecimals and was writing ugly code like this:
(let [foo (BigDecimal. "1") bar (BigDecimal. "42.42") baz (BigDecimal. "24.24")] ...)
when in fact Clojure supports BigDecimal literals by appending M to the number:
(= (BigDecimal. "42.42") 42.42M) ; true
Using the sugared version cuts out a lot of the bloat. In the comments, twils mentioned that you can also use the bigdec and bigint functions to be more explicit, yet remain concise.
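Those look like this (the N suffix appears in Clojure 1.3+, where bigint returns clojure.lang.BigInt):

user=> (bigdec 42.42)
42.42M
user=> (bigint 42)
42N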
Using the Java package naming conventions for namespaces
This isn't actually a mistake per se, but rather something that goes against the idiomatic structure and naming of a typical Clojure project. My first substantial Clojure project had namespace declarations - and corresponding folder structures - like this:
(ns com.14clouds.myapp.repository)
which bloated up my fully-qualified function references:
(com.14clouds.myapp.repository/load-by-name "foo")
To complicate things even more, I used a standard Maven directory structure:
|-- src/
| |-- main/
| | |-- java/
| | |-- clojure/
| | |-- resources/
| |-- test/
...
which is more complex than the "standard" Clojure structure of:
|-- src/
|-- test/
|-- resources/
which is the default of Leiningen projects and Clojure itself.
Maps utilize Java's equals() rather than Clojure's = for key matching
Originally reported by chouser on IRC, this usage of Java's equals() leads to some unintuitive results:
user> (= (int 1) (long 1))
true
user> ({(int 1) :found} (int 1) :not-found)
:found
user> ({(int 1) :found} (long 1) :not-found)
:not-found
Since both Integer and Long instances of 1 are printed the same by default, it can be difficult to detect why your map isn't returning any values. This is especially true when you pass your key through a function which, perhaps unbeknownst to you, returns a long.
It should be noted that using Java's equals() instead of Clojure's = is essential for maps to conform to the java.util.Map interface.
I'm using Programming Clojure by Stuart Halloway, Practical Clojure by Luke VanderHart, and the help of countless Clojure hackers on IRC and the mailing list to help along my answers.
Forgetting to force evaluation of lazy seqs
Lazy seqs aren't evaluated unless you ask them to be evaluated. You might expect this to print something, but it doesn't.
user=> (defn foo [] (map println [:foo :bar]) nil)
#'user/foo
user=> (foo)
nil
The map is never evaluated, it's silently discarded, because it's lazy. You have to use one of doseq, dorun, doall etc. to force evaluation of lazy sequences for side-effects.
user=> (defn foo [] (doseq [x [:foo :bar]] (println x)) nil)
#'user/foo
user=> (foo)
:foo
:bar
nil
user=> (defn foo [] (dorun (map println [:foo :bar])) nil)
#'user/foo
user=> (foo)
:foo
:bar
nil
Using a bare map at the REPL kind of looks like it works, but it only works because the REPL forces evaluation of lazy seqs itself. This can make the bug even harder to notice, because your code works at the REPL and doesn't work from a source file or inside a function.
user=> (map println [:foo :bar])
(:foo
:bar
nil nil)
I'm a Clojure noob. More advanced users may have more interesting problems.
trying to print infinite lazy sequences.
I knew what I was doing with my lazy sequences, but for debugging purposes I inserted some print/prn/pr calls, temporarily having forgotten what it is I was printing. Funny, why's my PC all hung up?
trying to program Clojure imperatively.
There is some temptation to create a whole lot of refs or atoms and write code that constantly mucks with their state. This can be done, but it's not a good fit. It may also perform poorly, and will rarely benefit from multiple cores.
trying to program Clojure 100% functionally.
A flip side to this: Some algorithms really do want a bit of mutable state. Religiously avoiding mutable state at all costs may result in slow or awkward algorithms. It takes judgement and a bit of experience to make the decision.
trying to do too much in Java.
Because it's so easy to reach out to Java, it's sometimes tempting to use Clojure as a scripting language wrapper around Java. Certainly you'll need to do exactly this when using Java library functionality, but there's little sense in (e.g.) maintaining data structures in Java, or using Java data types such as collections for which there are good equivalents in Clojure.
Lots of things already mentioned. I'll just add one more.
Clojure's if treats a Java Boolean object as true even if its value is false (only the canonical Boolean/FALSE is falsy). So if you have a Java-land function that returns a Java Boolean value, make sure you do not check it directly
(if java-bool "Yes" "No")
but rather
(if (boolean java-bool) "Yes" "No").
I got burned by this with clojure.contrib.sql library that returns database boolean fields as java Boolean objects.
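A quick REPL demonstration of the trap, constructing a non-canonical Boolean directly:

user=> (if (Boolean. false) "Yes" "No")
"Yes"
user=> (if (boolean (Boolean. false)) "Yes" "No")
"No"

Only nil and the canonical Boolean/FALSE are falsy, so the freshly boxed false is truthy.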
Keeping your head in loops.
You risk running out of memory if you loop over the elements of a potentially very large, or infinite, lazy sequence while keeping a reference to the first element.
Forgetting there's no TCO.
Regular tail-calls consume stack space, and they will overflow if you're not careful. Clojure has 'recur and 'trampoline to handle many of the cases where optimized tail-calls would be used in other languages, but these techniques have to be intentionally applied.
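A minimal illustration with a hypothetical pair of count-down functions:

(defn count-down [n]
  (when (pos? n)
    (recur (dec n))))  ; recur reuses the current frame
(count-down 1000000)   ; fine: constant stack space

(defn count-down-bad [n]
  (when (pos? n)
    (count-down-bad (dec n)))) ; a real call: one stack frame per iteration
; (count-down-bad 1000000) => StackOverflowError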
Not-quite-lazy sequences.
You may build a lazy sequence with 'lazy-seq or 'lazy-cons (or by building upon higher-level lazy APIs), but if you wrap it in 'vec or pass it through some other function that realizes the sequence, then it will no longer be lazy. Both the stack and the heap can be overflowed by this.
Putting mutable things in refs.
You can technically do it, but only the object reference in the ref itself is governed by the STM - not the referred object and its fields (unless they are immutable and point to other refs). So whenever possible, prefer to put only immutable objects in refs. The same goes for atoms.
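A sketch of the distinction (hypothetical vars good and bad):

(def good (ref {:count 0}))            ; immutable value: the STM can manage it
(def bad  (ref (java.util.HashMap.)))  ; mutable object: the STM only guards
                                       ; the reference, not the map's contents
(dosync (alter good assoc :count 1))   ; coordinated and retry-safe
(.put @bad :count 1)                   ; mutation outside any transaction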
using loop ... recur to process sequences when map will do.
(defn work [data]
  (when (seq data)
    (do-stuff (first data))
    (recur (rest data))))
vs.
(map do-stuff data)
The map function (in the latest branch) uses chunked sequences and many other optimizations. Also, because this function is frequently run, the HotSpot JIT usually has it optimized and ready to go without any "warm up time".
Collection types have different behaviors for some operations:
user=> (conj '(1 2 3) 4)
(4 1 2 3) ;; new element at the front
user=> (conj [1 2 3] 4)
[1 2 3 4] ;; new element at the back
user=> (into '(3 4) (list 5 6 7))
(7 6 5 3 4)
user=> (into [3 4] (list 5 6 7))
[3 4 5 6 7]
Working with strings can be confusing (I still don't quite get them). Specifically, strings are not the same as sequences of characters, even though sequence functions work on them:
user=> (filter #(> (int %) 96) "abcdABCDefghEFGH")
(\a \b \c \d \e \f \g \h)
To get a string back out, you'd need to do:
user=> (apply str (filter #(> (int %) 96) "abcdABCDefghEFGH"))
"abcdefgh"
too many parentheses, especially around a void Java method call (on some object obj here), which results in an NPE:
public void foo() {}
((.foo obj))
results in an NPE from the outer parentheses, because the inner parentheses evaluate to nil.
public int bar() { return 5; }
((.bar obj))
results in the easier to debug:
java.lang.Integer cannot be cast to clojure.lang.IFn
[Thrown class java.lang.ClassCastException]