Today I've seen some references to tying the knot and circular data structures. I've been out reading some answers, and the solutions seem to involve using a ref to point back to the head of the list. One particular SO question showed a Haskell example, but I don't know Haskell well enough to know if the example was using a the Haskell equivalent of a ref.
Is there a way to make a Clojure data structure circular without using a ref or similar construct?
Thanks.
I straightforwardly translated the Haskell example into Clojure:
user> (def alternates
(letfn [(x [] (lazy-seq (cons 0 (y))))
(y [] (lazy-seq (cons 1 (x))))]
(x)))
#'user/alternates
user> (take 7 alternates)
(0 1 0 1 0 1 0)
It works as expected. However I prefer the cycle function to mutually recursive functions using letfn:
user> (take 7 (cycle [0 1]))
(0 1 0 1 0 1 0)
It's impossible to create a circular reference using the standard Clojure immutable data structures and standard Clojure functions. This is because whichever object is created second can never be added to the (immutable) object which is created first.
However there are are multiple ways that you can create a circular data structure if you are willing to use a bit of trickery:
Refs, atoms etc.
Mutable deftypes
Java objects
Reflection trickery
Lazy sequences
In general however, circular data structures are best avoided in Clojure.
I've used this before:
;; circular list operations
(defn rotate
([cl] (conj (into [](rest cl)) (first cl)))
([cl n] (nth (iterate rotate cl) (mod n (count cl)))))
The output is a vector but the input can be any sequence.
Related
For me as, a new Clojurian, some core functions seem rather counter-intuitive and confusing when it comes to arguments order/position, here's an example:
> (nthrest (range 10) 5)
=> (5 6 7 8 9)
> (take-last 5 (range 10))
=> (5 6 7 8 9)
Perhaps there is some rule/logic behind it that I don't see yet?
I refuse to believe that the Clojure core team made so many brilliant technical decisions and forgot about consistency in function naming/argument ordering.
Or should I just remember it as it is?
Thanks
Slightly offtopic:
rand&rand-int VS random-sample - another example where function naming seems inconsistent but that's a rather rarely used function so it's not a big deal.
There is an FAQ on Clojure.org for this question: https://clojure.org/guides/faq#arg_order
What are the rules of thumb for arg order in core functions?
Primary collection operands come first. That way one can write → and its ilk, and their position is independent of whether or not they have variable arity parameters. There is a tradition of this in OO languages and Common Lisp (slot-value, aref, elt).
One way to think about sequences is that they are read from the left, and fed from the right:
<- [1 2 3 4]
Most of the sequence functions consume and produce sequences. So one way to visualize that is as a chain:
map <- filter <- [1 2 3 4]
and one way to think about many of the seq functions is that they are parameterized in some way:
(map f) <- (filter pred) <- [1 2 3 4]
So, sequence functions take their source(s) last, and any other parameters before them, and partial allows for direct parameterization as above. There is a tradition of this in functional languages and Lisps.
Note that this is not the same as taking the primary operand last. Some sequence functions have more than one source (concat, interleave). When sequence functions are variadic, it is usually in their sources.
Adapted from comments by Rich Hickey.
Functions that work with seqs usually has the actual seq as last argument.
(map, filter, remote etc.)
Accessing and "changing" individual elements takes a collection as first element: conj, assoc, get, update
That way, you can use the (->>) macro with a collection consistenly,
as well as create transducers consistently.
Only rarely one has to resort to (as->) to change argument order. And if you have to do so, it might be an opportunity to check if your own functions follow that convention.
For some functions (especially functions that are "seq in, seq out"), the args are ordered so that one can use partial as follows:
(ns tst.demo.core
(:use tupelo.core tupelo.test))
(dotest
(let [dozen (range 12)
odds-1 (filterv odd? dozen)
filter-odd (partial filterv odd?)
odds-2 (filter-odd dozen) ]
(is= odds-1 odds-2
[1 3 5 7 9 11])))
For other functions, Clojure often follows the ordering of "biggest-first", or "most-important-first" (usually these have the same result). Thus, we see examples like:
(get <map> <key>)
(get <map> <key> <default-val>)
This also shows that any optional values must, by definition, be last (in order to use "rest" args). This is common in most languages (e.g. Java).
For the record, I really dislike using partial functions, since they have user-defined names (at best) or are used inline (more common). Consider this code:
(let [dozen (range 12)
odds (filterv odd? dozen)
evens-1 (mapv (partial + 1) odds)
evens-2 (mapv #(+ 1 %) odds)
add-1 (fn [arg] (+ 1 arg))
evens-3 (mapv add-1 odds)]
(is= evens-1 evens-2 evens-3
[2 4 6 8 10 12]))
Also
I personally find it really annoying trying to parse out code using partial as with evens-1, especially for the case of user-defined functions, or even standard functions that are not as simple as +.
This is especially so if the partial is used with 2 or more args.
For the 1-arg case, the function literal seen for evens-2 is much more readable to me.
If 2 or more args are present, please make a named function (either local, as shown for evens-3), or a regular (defn some-fn ...) global function.
I am trying to learn the programming language Clojure. I have a hard time understanding it though. I would like to try to implement something as simple as this (for example):
int[] array = new int[10];
array[0] =1;
int n = 10;
for(int i = 0; i <= n; i++){
array[i] = 0;
}
return array[n];
Since I am new to Clojure, I don't even understand how to implement this. It would be super helpful if someone could explain or give a similar example of how arrays/for-loops works in Clojure. I've tried to do some research, but as for my understanding Clojure doesn't really have for-loops.
The Clojure way to write the Java for loop you wrote is to do consider why you're looping in the first place. There are many options for porting a Java loop to Clojure. Choosing among them depends on what your goal is.
Ways to Make Ten Zeroes
As Carcigenicate posted, if you need ten zeros in a collection, consider:
(repeat 10 0)
That returns a sequence – a lazy one. Sequences are one of Clojure's central abstractions. If instead the ten zeros need to be accessible by index, put them in a vector with:
(vec (repeat 10 0))
or
(into [] (repeat 10 0))
Or, you could just write the vector literal directly in your code:
[0 0 0 0 0 0 0 0 0 0]
And if you specifically need a Java array for some reason, then you can do that with to-array:
(to-array (repeat 10 0))
But remember the advice from the Clojure reference docs on Java interop:
Clojure supports the creation, reading and modification of Java arrays [but] it is recommended that you limit use of arrays to interop with Java libraries that require them as arguments or use them as return values.
Those docs list some functions for working with Java arrays primarily when they're required for Java interop or "to support mutation or higher performance operations". In nearly all cases Clojurists just use a vector.
Looping in Clojure
What if you're doing something other than producing ten zeros? The Clojure way to "loop" depends on what you need.
You may need recursion, for which Clojure has loop/recur:
(loop [x 10]
(when-not (= x 0)
(recur (- x 2))))
You may need to calculate a value for every value in some collection(s):
(for [x coll])
(calculate-y x))
You might need to iterate over multiple collections, similar to nested loops in Java:
(for [x ['a 'b 'c]
y [1 2 3]]
(foo x y))
If you just need to produce a side effect some number of times, repeatedly is your jam:
(repeatedly 10 some-fn)
If you need to produce a side effect for each value in a collection, try doseq:
(doseq [x coll]
(do-some-side-effect! x))
If you need to produce a side effect for a range of integers, you could use doseq like this:
(doseq [x (range 10)]
(do-something! x))
...but dotimes is like doseq with a built-in range:
(dotimes [x 9]
(do-something! x))
Even more common than those loopy constructs are Clojure functions which produce a value for every element in a collection, such as map and its relatives, or which iterate over collections either for special purposes (like filter and remove) or to create some new value or collection (like reduce).
I think that last point in #Lee's answer should be heavily emphasized.
Unless you absolutely need arrays for performance reasons, you should really be using vectors here. That entire chunk of code suddenly becomes trivial once you start using native Clojure structures:
; Create 10 0s, then put them in a vector
(vec (repeat 10 0))
You can even skip the call to vec if you're ok using a lazy sequence instead of a strict vector.
It should also be noted though that pre-initializing the elements to 0 is unnecessary. Unlike arrays, vectors are variable length; they can be added to and expanded after creation. It's much cleaner to just have an empty vector and add elements to it as needed.
You can create and modify arrays in clojure, your example would look like:
(defn example []
(let [arr (int-array 10)
n 10]
(aset arr 0 1)
(doseq [i (range (inc n))]
(aset arr i 0))
(aget arr n)))
Be aware that this example will throw an exception since array[i] is out of bounds when i = 10.
You will usually use vectors in preference to arrays since they are immutable.
You can get a good start here:
Brave Clojure
The Clojure CheatSheet
I've started to learn clojure but I'm having trouble wrapping my mind around certain concepts. For instance, what I want to do here is to take this function and convert it so that it calls get-origlabels lazily.
(defn get-all-origlabels []
(set (flatten (map get-origlabels (range *song-count*)))))
My first attempt used recursion but blew up the stack (song-count is about 10,000). I couldn't figure out how to do it with tail recursion.
get-origlabels returns a set each time it is called, but values are often repeated between calls. What the get-origlabels function actually does is read a file (a different file for each value from 0 to song-count-1) and return the words stored in it in a set.
Any pointers would be greatly appreciated!
Thanks!
-Philip
You can get what you want by using mapcat. I believe putting it into an actual Clojure set is going to de-lazify it, as demonstrated by the fact that (take 10 (set (iterate inc 0))) tries to realize the whole set before taking 10.
(distinct (mapcat get-origlabels (range *song-count*)))
This will give you a lazy sequence. You can verify that by doing something like, starting with an infinite sequence:
(->> (iterate inc 0)
(mapcat vector)
(distinct)
(take 10))
You'll end up with a seq, rather than a set, but since it sounds like you really want laziness here, I think that's for the best.
This may have better stack behavior
(defn get-all-origlabels []
(reduce (fn (s x) (union s (get-origlabels x))) ${} (range *song-count*)))
I'd probably use something like:
(into #{} (mapcat get-origlabels (range *song-count*)))
In general, "into" is very helpful when constructing Clojure data structures. I have this mental image of a conveyor belt (a sequence) dropping a bunch of random objects into a large bucket (the target collection).
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
Which lesser-known but useful features of Clojure do you find yourselves using? Feel free to share little tricks and idioms, but try to restrict yourselves to Core and Contrib.
I found some really interesting information in answers to these similar questions:
Hidden features of Haskell
Hidden features of Python
Hidden features of Java
Hidden features of C
Hidden features of Perl
Hidden features of Ruby
There are many more "Hidden feature" questions for other languages, so I thought it would be nice to have one for Clojure, too.
Clojure has an immutable, persistent queue datatype, PersistentQueue, but it doesn't (yet?) have literal reader syntax or Clojure wrapper functions, so you have to create one via a Java call. Queues conj (push) onto the rear and pop from the front with good performance.
user> (-> (clojure.lang.PersistentQueue/EMPTY)
(conj 1 2 3)
pop)
(2 3)
Lists conj onto the front and pop from the front. Vectors conj onto the rear and pop from the rear. So queues are sometimes exactly what you need.
user> (-> ()
(conj 1 2 3)
pop)
(2 1)
user> (-> []
(conj 1 2 3)
pop)
[1 2]
(defn foo [a & [b c]] ...)
You can destructure the rest argument.
Update:
The latest commit to the git repo (29389970bcd41998359681d9a4a20ee391a1e07c) has made it possible to perform associative destructuring like so:
(defn foo [a & {b :b c :c}] ...)
The obvious use of this is for keyword arguments. Note that this approach prevents mixing keyword arguments with rest arguments (not that that's something one's likely to need very often).
(defn foo [a & {:keys [b c] :or {b "val1" c "val2"}] ...)
If you want default values for keyword arguments.
The read-eval reader macro: #=
(read-string "#=(println \"hello\")")
This macro can present a security risk if read is used on user input (which is perhaps a bad idea on its own). You can turn this macro off by setting *read-eval* to false.
You can apply functions to infinite argument sequences. For example
(apply concat (repeat '(1 2 3)))
produces a lazy sequence of 1,2,3,1,2,3... Of course for this to work the function also has to be lazy with respect to its argument list.
From the increasingly good ClojureDocs site an idiom using juxt
http://clojuredocs.org/clojure_core/clojure.core/juxt
;juxt is useful for forking result data to multiple termination functions
(->> "some text to print and save to a file"
((juxt
println
(partial spit "useful information.txt"))))
As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
What are some common mistakes made by Clojure developers, and how can we avoid them?
For example; newcomers to Clojure think that the contains? function works the same as java.util.Collection#contains. However, contains? will only work similarly when used with indexed collections like maps and sets and you're looking for a given key:
(contains? {:a 1 :b 2} :b)
;=> true
(contains? {:a 1 :b 2} 2)
;=> false
(contains? #{:a 1 :b 2} :b)
;=> true
When used with numerically indexed collections (vectors, arrays) contains? only checks that the given element is within the valid range of indexes (zero-based):
(contains? [1 2 3 4] 4)
;=> false
(contains? [1 2 3 4] 0)
;=> true
If given a list, contains? will never return true.
Literal Octals
At one point I was reading in a matrix which used leading zeros to maintain proper rows and columns. Mathematically this is correct, since leading zero obviously don't alter the underlying value. Attempts to define a var with this matrix, however, would fail mysteriously with:
java.lang.NumberFormatException: Invalid number: 08
which totally baffled me. The reason is that Clojure treats literal integer values with leading zeros as octals, and there is no number 08 in octal.
I should also mention that Clojure supports traditional Java hexadecimal values via the 0x prefix. You can also use any base between 2 and 36 by using the "base+r+value" notation, such as 2r101010 or 36r16 which are 42 base ten.
Trying to return literals in an anonymous function literal
This works:
user> (defn foo [key val]
{key val})
#'user/foo
user> (foo :a 1)
{:a 1}
so I believed this would also work:
(#({%1 %2}) :a 1)
but it fails with:
java.lang.IllegalArgumentException: Wrong number of args passed to: PersistentArrayMap
because the #() reader macro gets expanded to
(fn [%1 %2] ({%1 %2}))
with the map literal wrapped in parenthesis. Since it's the first element, it's treated as a function (which a literal map actually is), but no required arguments (such as a key) are provided. In summary, the anonymous function literal does not expand to
(fn [%1 %2] {%1 %2}) ; notice the lack of parenthesis
and so you can't have any literal value ([], :a, 4, %) as the body of the anonymous function.
Two solutions have been given in the comments. Brian Carper suggests using sequence implementation constructors (array-map, hash-set, vector) like so:
(#(array-map %1 %2) :a 1)
while Dan shows that you can use the identity function to unwrap the outer parenthesis:
(#(identity {%1 %2}) :a 1)
Brian's suggestion actually brings me to my next mistake...
Thinking that hash-map or array-map determine the unchanging concrete map implementation
Consider the following:
user> (class (hash-map))
clojure.lang.PersistentArrayMap
user> (class (hash-map :a 1))
clojure.lang.PersistentHashMap
user> (class (assoc (apply array-map (range 2000)) :a :1))
clojure.lang.PersistentHashMap
While you generally won't have to worry about the concrete implementation of a Clojure map, you should know that functions which grow a map - like assoc or conj - can take a PersistentArrayMap and return a PersistentHashMap, which performs faster for larger maps.
Using a function as the recursion point rather than a loop to provide initial bindings
When I started out, I wrote a lot of functions like this:
; Project Euler #3
(defn p3
([] (p3 775147 600851475143 3))
([i n times]
(if (and (divides? i n) (fast-prime? i times)) i
(recur (dec i) n times))))
When in fact loop would have been more concise and idiomatic for this particular function:
; Elapsed time: 387 msecs
(defn p3 [] {:post [(= % 6857)]}
(loop [i 775147 n 600851475143 times 3]
(if (and (divides? i n) (fast-prime? i times)) i
(recur (dec i) n times))))
Notice that I replaced the empty argument, "default constructor" function body (p3 775147 600851475143 3) with a loop + initial binding. The recur now rebinds the loop bindings (instead of the fn parameters) and jumps back to the recursion point (loop, instead of fn).
Referencing "phantom" vars
I'm speaking about the type of var you might define using the REPL - during your exploratory programming - then unknowingly reference in your source. Everything works fine until you reload the namespace (perhaps by closing your editor) and later discover a bunch of unbound symbols referenced throughout your code. This also happens frequently when you're refactoring, moving a var from one namespace to another.
Treating the for list comprehension like an imperative for loop
Essentially you're creating a lazy list based on existing lists rather than simply performing a controlled loop. Clojure's doseq is actually more analogous to imperative foreach looping constructs.
One example of how they're different is the ability to filter which elements they iterate over using arbitrary predicates:
user> (for [n '(1 2 3 4) :when (even? n)] n)
(2 4)
user> (for [n '(4 3 2 1) :while (even? n)] n)
(4)
Another way they're different is that they can operate on infinite lazy sequences:
user> (take 5 (for [x (iterate inc 0) :when (> (* x x) 3)] (* 2 x)))
(4 6 8 10 12)
They also can handle more than one binding expression, iterating over the rightmost expression first and working its way left:
user> (for [x '(1 2 3) y '(\a \b \c)] (str x y))
("1a" "1b" "1c" "2a" "2b" "2c" "3a" "3b" "3c")
There's also no break or continue to exit prematurely.
Overuse of structs
I come from an OOPish background so when I started Clojure my brain was still thinking in terms of objects. I found myself modeling everything as a struct because its grouping of "members", however loose, made me feel comfortable. In reality, structs should mostly be considered an optimization; Clojure will share the keys and some lookup information to conserve memory. You can further optimize them by defining accessors to speed up the key lookup process.
Overall you don't gain anything from using a struct over a map except for performance, so the added complexity might not be worth it.
Using unsugared BigDecimal constructors
I needed a lot of BigDecimals and was writing ugly code like this:
(let [foo (BigDecimal. "1") bar (BigDecimal. "42.42") baz (BigDecimal. "24.24")]
when in fact Clojure supports BigDecimal literals by appending M to the number:
(= (BigDecimal. "42.42") 42.42M) ; true
Using the sugared version cuts out a lot of the bloat. In the comments, twils mentioned that you can also use the bigdec and bigint functions to be more explicit, yet remain concise.
Using the Java package naming conversions for namespaces
This isn't actually a mistake per se, but rather something that goes against the idiomatic structure and naming of a typical Clojure project. My first substantial Clojure project had namespace declarations - and corresponding folder structures - like this:
(ns com.14clouds.myapp.repository)
which bloated up my fully-qualified function references:
(com.14clouds.myapp.repository/load-by-name "foo")
To complicate things even more, I used a standard Maven directory structure:
|-- src/
| |-- main/
| | |-- java/
| | |-- clojure/
| | |-- resources/
| |-- test/
...
which is more complex than the "standard" Clojure structure of:
|-- src/
|-- test/
|-- resources/
which is the default of Leiningen projects and Clojure itself.
Maps utilize Java's equals() rather than Clojure's = for key matching
Originally reported by chouser on IRC, this usage of Java's equals() leads to some unintuitive results:
user> (= (int 1) (long 1))
true
user> ({(int 1) :found} (int 1) :not-found)
:found
user> ({(int 1) :found} (long 1) :not-found)
:not-found
Since both Integer and Long instances of 1 are printed the same by default, it can be difficult to detect why your map isn't returning any values. This is especially true when you pass your key through a function which, perhaps unbeknownst to you, returns a long.
It should be noted that using Java's equals() instead of Clojure's = is essential for maps to conform to the java.util.Map interface.
I'm using Programming Clojure by Stuart Halloway, Practical Clojure by Luke VanderHart, and the help of countless Clojure hackers on IRC and the mailing list to help along my answers.
Forgetting to force evaluation of lazy seqs
Lazy seqs aren't evaluated unless you ask them to be evaluated. You might expect this to print something, but it doesn't.
user=> (defn foo [] (map println [:foo :bar]) nil)
#'user/foo
user=> (foo)
nil
The map is never evaluated, it's silently discarded, because it's lazy. You have to use one of doseq, dorun, doall etc. to force evaluation of lazy sequences for side-effects.
user=> (defn foo [] (doseq [x [:foo :bar]] (println x)) nil)
#'user/foo
user=> (foo)
:foo
:bar
nil
user=> (defn foo [] (dorun (map println [:foo :bar])) nil)
#'user/foo
user=> (foo)
:foo
:bar
nil
Using a bare map at the REPL kind of looks like it works, but it only works because the REPL forces evaluation of lazy seqs itself. This can make the bug even harder to notice, because your code works at the REPL and doesn't work from a source file or inside a function.
user=> (map println [:foo :bar])
(:foo
:bar
nil nil)
I'm a Clojure noob. More advanced users may have more interesting problems.
trying to print infinite lazy sequences.
I knew what I was doing with my lazy sequences, but for debugging purposes I inserted some print/prn/pr calls, temporarily having forgotten what it is I was printing. Funny, why's my PC all hung up?
trying to program Clojure imperatively.
There is some temptation to create a whole lot of refs or atoms and write code that constantly mucks with their state. This can be done, but it's not a good fit. It may also have poor performance, and rarely benefit from multiple cores.
trying to program Clojure 100% functionally.
A flip side to this: Some algorithms really do want a bit of mutable state. Religiously avoiding mutable state at all costs may result in slow or awkward algorithms. It takes judgement and a bit of experience to make the decision.
trying to do too much in Java.
Because it's so easy to reach out to Java, it's sometimes tempting to use Clojure as a scripting language wrapper around Java. Certainly you'll need to do exactly this when using Java library functionality, but there's little sense in (e.g.) maintaining data structures in Java, or using Java data types such as collections for which there are good equivalents in Clojure.
Lots of things already mentioned. I'll just add one more.
Clojure if treats Java Boolean objects always as true even if it's value is false. So if you have a java land function that returns a java Boolean value, make sure you do not check it directly
(if java-bool "Yes" "No")
but rather
(if (boolean java-bool) "Yes" "No").
I got burned by this with clojure.contrib.sql library that returns database boolean fields as java Boolean objects.
Keeping your head in loops.
You risk running out of memory if you loop over the elements of a potentially very large, or infinite, lazy sequence while keeping a reference to the first element.
Forgetting there's no TCO.
Regular tail-calls consume stack space, and they will overflow if you're not careful. Clojure has 'recur and 'trampoline to handle many of the cases where optimized tail-calls would be used in other languages, but these techniques have to be intentionally applied.
Not-quite-lazy sequences.
You may build a lazy sequence with 'lazy-seq or 'lazy-cons (or by building upon higher level lazy APIs), but if you wrap it in 'vec or pass it through some other function that realizes the sequence, then it will no longer be lazy. Both the stack and the heap can be overflown by this.
Putting mutable things in refs.
You can technically do it, but only the object reference in the ref itself is governed by the STM - not the referred object and its fields (unless they are immutable and point to other refs). So whenever possible, prefer to only immutable objects in refs. Same thing goes for atoms.
using loop ... recur to process sequences when map will do.
(defn work [data]
(do-stuff (first data))
(recur (rest data)))
vs.
(map do-stuff data)
The map function (in the latest branch) uses chunked sequences and many other optimizations. Also, because this function is frequently run, the Hotspot JIT usually has it optimized and ready to go with out any "warm up time".
Collection types have different behaviors for some operations:
user=> (conj '(1 2 3) 4)
(4 1 2 3) ;; new element at the front
user=> (conj [1 2 3] 4)
[1 2 3 4] ;; new element at the back
user=> (into '(3 4) (list 5 6 7))
(7 6 5 3 4)
user=> (into [3 4] (list 5 6 7))
[3 4 5 6 7]
Working with strings can be confusing (I still don't quite get them). Specifically, strings are not the same as sequences of characters, even though sequence functions work on them:
user=> (filter #(> (int %) 96) "abcdABCDefghEFGH")
(\a \b \c \d \e \f \g \h)
To get a string back out, you'd need to do:
user=> (apply str (filter #(> (int %) 96) "abcdABCDefghEFGH"))
"abcdefgh"
too many parantheses, especially with void java method call inside which results in NPE:
public void foo() {}
((.foo))
results in NPE from outer parantheses because inner parantheses evaluate to nil.
public int bar() { return 5; }
((.bar))
results in the easier to debug:
java.lang.Integer cannot be cast to clojure.lang.IFn
[Thrown class java.lang.ClassCastException]