Namespace qualified record field accessors - clojure

I've made the same dumb mistake many many times:
(defrecord Record [field-name])
(let [field (:feld-name (->Record 1))] ; Whoops!
(+ 1 field))
Since I misspelled the field name keyword, this will cause a NPE.
The "obvious" solution to this would be to have defrecord emit namespaced keywords instead, since then, especially when working in a different file, the IDE will be able to immediately show what keywords are available as soon as I type ::n/.
I could probably with some creativity create a macro that wraps defrecord that creates the keywords for me, but this seems like overkill.
Is there a way to have defrecord emit namespaced field accessors, or is there any other good way to avoid this problem?

Because defrecords compile to java classes and fields on a java class don't have a concept of namespaces, I don't think there's a good way to have defrecord emit namespaced keywords.
One alternative, if the code is not performance sensitive and doesn't need to implement any protocols and similar, is to just use maps.
Another is, like Alan Thompson's solution, to make a safe-get funtion. The prismatic/plumbing util library also has an implementation of this.
(defn safe-get [m k]
(let [ret (get m k ::not-found)]
(if (= ::not-found ret)
(throw (ex-info "Key not found: " {:map m, :key k}))
ret)))
(defrecord x [foo])
(safe-get (->x 1) :foo) ;=> 1
(safe-get (->x 1) :fo) ;=>
;; 1. Unhandled clojure.lang.ExceptionInfo
;; Key not found:
;; {:map {:foo 1}, :key :fo}

I feel your pain. Thankfully I have a solution that saves me many times/week that I've been using a couple of years. It is the grab function from the Tupelo library. It does not provide the type of IDE integration you are hoping for, but it does provide fail-fast typo-detection, so you always be notified the very first time you try to use the non-existant key. Another benefit is that you'll get a stacktrace showing the line number with the misspelled keyword, not the line number (possibly far, far away) where the nil value causes a NPE.
It also works equally well for both records & plain-old maps (my usual use-case).
From the README:
Map Value Lookup
Maps are convenient, especially when keywords are used as functions to look up a value in a map. Unfortunately, attempting to look up a non-existent keyword in a map will return nil. While sometimes convenient, this means that a simple typo in the keyword name will silently return corrupted data (i.e. nil) instead of the desired value.
Instead, use the function grab for keyword/map lookup:
(grab k m)
"A fail-fast version of keyword/map lookup. When invoked as (grab :the-key the-map),
returns the value associated with :the-key as for (clojure.core/get the-map :the-key).
Throws an Exception if :the-key is not present in the-map."
(def sidekicks {:batman "robin" :clark "lois"})
(grab :batman sidekicks)
;=> "robin"
(grab :spiderman m)
;=> IllegalArgumentException Key not present in map:
map : {:batman "robin", :clark "lois"}
keys: [:spiderman]
The function grab should also be used in place of clojure.core/get. Simply reverse the order of arguments to match the "keyword-first, map-second" convention.
For looking up values in nested maps, the function fetch-in replaces clojure.core/get-in:
(fetch-in m ks)
"A fail-fast version of clojure.core/get-in. When invoked as (fetch-in the-map keys-vec),
returns the value associated with keys-vec as for (clojure.core/get-in the-map keys-vec).
Throws an Exception if the path keys-vec is not present in the-map."
(def my-map {:a 1
:b {:c 3}})
(fetch-in my-map [:b :c])
3
(fetch-in my-map [:b :z])
;=> IllegalArgumentException Key seq not present in map:
;=> map : {:b {:c 3}, :a 1}
;=> keys: [:b :z]
Your other option, using records, is to use the Java-interop style of accessor:
(.field-name myrec)
Since Clojure defrecord compiles into a simple Java class, your IDE may be able to recognize these names more easily. YMMV

Related

Create a map entry in Clojure

What is the built-in Clojure way (if any), to create a single map entry?
In other words, I would like something like (map-entry key value). In other words, the result should be more or less equivalent to (first {key value}).
Remarks:
Of course, I already tried googling, and only found map-entry? However, this document has no linked resources.
I know that (first {1 2}) returns [1 2], which seems a vector. However:
(class (first {1 2}))
; --> clojure.lang.MapEntry
(class [1 2])
; --> clojure.lang.PersistentVector
I checked in the source code, and I'm aware that both MapEntry and PersistentVector extend APersistentVector (so MapEntry is more-or-less also a vector). However, the question is still, whether I can create a MapEntry instance from Clojure code.
Last, but not least: "no, there is no built in way to do that in Clojure" is also a valid answer (which I strongly suspect is the case, just want to make sure that I did not accidentally miss something).
"no, there is no built in way to do that in Clojure" is also a valid answer
Yeah, unfortunately that's the answer. I'd say the best you can do is define a map-entry function yourself:
(defn map-entry [k v]
(clojure.lang.MapEntry/create k v))
Just specify a class name as follows
(clojure.lang.MapEntry. "key" "val")
or import the class to instantiate by a short name
(import (clojure.lang MapEntry))
(MapEntry. "key" "val")
As Rich Hickey says here: "I make no promises about the continued existence of MapEntry. Please don't use it." You should not attempt to directly instantiate an implementation class such clojure.lang.MapEntry. It's better to just use:
(defn map-entry [k v] (first {k v}))

In clojure, why does assoc require arguments in addition to a map, but dissoc not?

In clojure,
(assoc {})
throws an arity exception, but
(dissoc {})
does not. Why? I would have expected either both of them to throw an exception, or both to make no changes when no keys or values are provided.
EDIT: I see a rationale for allowing these forms; it means we can apply assoc or dissoc to a possibly empty list of arguments. I just don't see why one would be allowed and the other not, and I'm curious as to whether there's a good reason for this that I'm missing.
I personally think the lack of 1-arity assoc is an oversight: whenever a trailing list of parameters is expected (& stuff), the function should normally be capable of working with zero parameters in order to make it possible to apply it to an empty list.
Clojure has plenty of other functions that work correctly with zero arguments, e.g. + and merge.
On the other hand, Clojure has other functions that don't accept zero trailing parameters, e.g. conj.
So the Clojure API is a bit inconsistent in this regard.....
This is not an authoritative answer, but is based on my testing and looking at ClojureDocs:
dissoc 's arity includes your being able to pass in one argument, a map. No key/value is removed from the map, in that case.
(def test-map {:account-no 12345678 :lname "Jones" :fnam "Fred"})
(dissoc test-map)
{:account-no 12345678, :lname "Jones", :fnam "Fred"}
assoc has no similar arity. That is calling assoc requires a map, key, and value.
Now why this was designed this way is a different matter, and if you do not receive an answer with that information -- I hope you do -- then I suggest offering a bounty or go on Clojure's Google Groups and ask that question.
Here is the source.
(defn dissoc
"dissoc[iate]. Returns a new map of the same (hashed/sorted) type,
that does not contain a mapping for key(s)."
{:added "1.0"
:static true}
([map] map)
([map key]
(. clojure.lang.RT (dissoc map key)))
([map key & ks]
(let [ret (dissoc map key)]
(if ks
(recur ret (first ks) (next ks))
ret))))

What functions in Clojure core preserve meta?

Clojure meta is only preserved if a function takes care to do so and the Clojure core functions do not globally preserve meta. The general rule of thumb I've heard is that collection functions like conj, assoc, etc are supposed to preserve meta but sequence functions like map, filter, take, etc do not preserve meta.
Is there a list somewhere of what functions DO preserve meta?
It's all about the types. Sequence functions act like they call seq on their argument and thus doesn't always return the same type of object. Collection functions and type-specific functions doesn't call seq and return an object of the same type as what was given to them. It kind of make them give the illusion of returning the same object (this might be the reasoning for this behavior) even if that's really not the case. We could say that the rule of thumb is that a functions preserve the meta when it preserve the type.
user> (meta (seq (with-meta (list 1) {:a 1})))
{:a 1}
user> (meta (seq (with-meta (vector 1) {:a 1})))
nil
Be sure to be aware of when laziness is involved tough:
user> (type (list 1))
clojure.lang.PersistentList
user> (type (map identity (list 1)))
clojure.lang.LazySeq
user> (meta (seq (with-meta (map identity (list 1)) {:a 1})))
nil
For a list of functions that preserves meta on collection, see the data structures page. The ones not preserving meta are under the sequences page, with the exception of when they return an object of the same type.
Under the hood I'm not quite sure about the details since laziness and chunked sequence has been added, but you can look at the cons, seq and seqFrom methods from the RT class. The functions not preserving meta-data go through these methods. While the collection functions end up using methods specific to their types.

Common programming mistakes for Clojure developers to avoid [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
What are some common mistakes made by Clojure developers, and how can we avoid them?
For example; newcomers to Clojure think that the contains? function works the same as java.util.Collection#contains. However, contains? will only work similarly when used with indexed collections like maps and sets and you're looking for a given key:
(contains? {:a 1 :b 2} :b)
;=> true
(contains? {:a 1 :b 2} 2)
;=> false
(contains? #{:a 1 :b 2} :b)
;=> true
When used with numerically indexed collections (vectors, arrays) contains? only checks that the given element is within the valid range of indexes (zero-based):
(contains? [1 2 3 4] 4)
;=> false
(contains? [1 2 3 4] 0)
;=> true
If given a list, contains? will never return true.
Literal Octals
At one point I was reading in a matrix which used leading zeros to maintain proper rows and columns. Mathematically this is correct, since leading zero obviously don't alter the underlying value. Attempts to define a var with this matrix, however, would fail mysteriously with:
java.lang.NumberFormatException: Invalid number: 08
which totally baffled me. The reason is that Clojure treats literal integer values with leading zeros as octals, and there is no number 08 in octal.
I should also mention that Clojure supports traditional Java hexadecimal values via the 0x prefix. You can also use any base between 2 and 36 by using the "base+r+value" notation, such as 2r101010 or 36r16 which are 42 base ten.
Trying to return literals in an anonymous function literal
This works:
user> (defn foo [key val]
{key val})
#'user/foo
user> (foo :a 1)
{:a 1}
so I believed this would also work:
(#({%1 %2}) :a 1)
but it fails with:
java.lang.IllegalArgumentException: Wrong number of args passed to: PersistentArrayMap
because the #() reader macro gets expanded to
(fn [%1 %2] ({%1 %2}))
with the map literal wrapped in parenthesis. Since it's the first element, it's treated as a function (which a literal map actually is), but no required arguments (such as a key) are provided. In summary, the anonymous function literal does not expand to
(fn [%1 %2] {%1 %2}) ; notice the lack of parenthesis
and so you can't have any literal value ([], :a, 4, %) as the body of the anonymous function.
Two solutions have been given in the comments. Brian Carper suggests using sequence implementation constructors (array-map, hash-set, vector) like so:
(#(array-map %1 %2) :a 1)
while Dan shows that you can use the identity function to unwrap the outer parenthesis:
(#(identity {%1 %2}) :a 1)
Brian's suggestion actually brings me to my next mistake...
Thinking that hash-map or array-map determine the unchanging concrete map implementation
Consider the following:
user> (class (hash-map))
clojure.lang.PersistentArrayMap
user> (class (hash-map :a 1))
clojure.lang.PersistentHashMap
user> (class (assoc (apply array-map (range 2000)) :a :1))
clojure.lang.PersistentHashMap
While you generally won't have to worry about the concrete implementation of a Clojure map, you should know that functions which grow a map - like assoc or conj - can take a PersistentArrayMap and return a PersistentHashMap, which performs faster for larger maps.
Using a function as the recursion point rather than a loop to provide initial bindings
When I started out, I wrote a lot of functions like this:
; Project Euler #3
(defn p3
([] (p3 775147 600851475143 3))
([i n times]
(if (and (divides? i n) (fast-prime? i times)) i
(recur (dec i) n times))))
When in fact loop would have been more concise and idiomatic for this particular function:
; Elapsed time: 387 msecs
(defn p3 [] {:post [(= % 6857)]}
(loop [i 775147 n 600851475143 times 3]
(if (and (divides? i n) (fast-prime? i times)) i
(recur (dec i) n times))))
Notice that I replaced the empty argument, "default constructor" function body (p3 775147 600851475143 3) with a loop + initial binding. The recur now rebinds the loop bindings (instead of the fn parameters) and jumps back to the recursion point (loop, instead of fn).
Referencing "phantom" vars
I'm speaking about the type of var you might define using the REPL - during your exploratory programming - then unknowingly reference in your source. Everything works fine until you reload the namespace (perhaps by closing your editor) and later discover a bunch of unbound symbols referenced throughout your code. This also happens frequently when you're refactoring, moving a var from one namespace to another.
Treating the for list comprehension like an imperative for loop
Essentially you're creating a lazy list based on existing lists rather than simply performing a controlled loop. Clojure's doseq is actually more analogous to imperative foreach looping constructs.
One example of how they're different is the ability to filter which elements they iterate over using arbitrary predicates:
user> (for [n '(1 2 3 4) :when (even? n)] n)
(2 4)
user> (for [n '(4 3 2 1) :while (even? n)] n)
(4)
Another way they're different is that they can operate on infinite lazy sequences:
user> (take 5 (for [x (iterate inc 0) :when (> (* x x) 3)] (* 2 x)))
(4 6 8 10 12)
They also can handle more than one binding expression, iterating over the rightmost expression first and working its way left:
user> (for [x '(1 2 3) y '(\a \b \c)] (str x y))
("1a" "1b" "1c" "2a" "2b" "2c" "3a" "3b" "3c")
There's also no break or continue to exit prematurely.
Overuse of structs
I come from an OOPish background so when I started Clojure my brain was still thinking in terms of objects. I found myself modeling everything as a struct because its grouping of "members", however loose, made me feel comfortable. In reality, structs should mostly be considered an optimization; Clojure will share the keys and some lookup information to conserve memory. You can further optimize them by defining accessors to speed up the key lookup process.
Overall you don't gain anything from using a struct over a map except for performance, so the added complexity might not be worth it.
Using unsugared BigDecimal constructors
I needed a lot of BigDecimals and was writing ugly code like this:
(let [foo (BigDecimal. "1") bar (BigDecimal. "42.42") baz (BigDecimal. "24.24")]
when in fact Clojure supports BigDecimal literals by appending M to the number:
(= (BigDecimal. "42.42") 42.42M) ; true
Using the sugared version cuts out a lot of the bloat. In the comments, twils mentioned that you can also use the bigdec and bigint functions to be more explicit, yet remain concise.
Using the Java package naming conversions for namespaces
This isn't actually a mistake per se, but rather something that goes against the idiomatic structure and naming of a typical Clojure project. My first substantial Clojure project had namespace declarations - and corresponding folder structures - like this:
(ns com.14clouds.myapp.repository)
which bloated up my fully-qualified function references:
(com.14clouds.myapp.repository/load-by-name "foo")
To complicate things even more, I used a standard Maven directory structure:
|-- src/
| |-- main/
| | |-- java/
| | |-- clojure/
| | |-- resources/
| |-- test/
...
which is more complex than the "standard" Clojure structure of:
|-- src/
|-- test/
|-- resources/
which is the default of Leiningen projects and Clojure itself.
Maps utilize Java's equals() rather than Clojure's = for key matching
Originally reported by chouser on IRC, this usage of Java's equals() leads to some unintuitive results:
user> (= (int 1) (long 1))
true
user> ({(int 1) :found} (int 1) :not-found)
:found
user> ({(int 1) :found} (long 1) :not-found)
:not-found
Since both Integer and Long instances of 1 are printed the same by default, it can be difficult to detect why your map isn't returning any values. This is especially true when you pass your key through a function which, perhaps unbeknownst to you, returns a long.
It should be noted that using Java's equals() instead of Clojure's = is essential for maps to conform to the java.util.Map interface.
I'm using Programming Clojure by Stuart Halloway, Practical Clojure by Luke VanderHart, and the help of countless Clojure hackers on IRC and the mailing list to help along my answers.
Forgetting to force evaluation of lazy seqs
Lazy seqs aren't evaluated unless you ask them to be evaluated. You might expect this to print something, but it doesn't.
user=> (defn foo [] (map println [:foo :bar]) nil)
#'user/foo
user=> (foo)
nil
The map is never evaluated, it's silently discarded, because it's lazy. You have to use one of doseq, dorun, doall etc. to force evaluation of lazy sequences for side-effects.
user=> (defn foo [] (doseq [x [:foo :bar]] (println x)) nil)
#'user/foo
user=> (foo)
:foo
:bar
nil
user=> (defn foo [] (dorun (map println [:foo :bar])) nil)
#'user/foo
user=> (foo)
:foo
:bar
nil
Using a bare map at the REPL kind of looks like it works, but it only works because the REPL forces evaluation of lazy seqs itself. This can make the bug even harder to notice, because your code works at the REPL and doesn't work from a source file or inside a function.
user=> (map println [:foo :bar])
(:foo
:bar
nil nil)
I'm a Clojure noob. More advanced users may have more interesting problems.
trying to print infinite lazy sequences.
I knew what I was doing with my lazy sequences, but for debugging purposes I inserted some print/prn/pr calls, temporarily having forgotten what it is I was printing. Funny, why's my PC all hung up?
trying to program Clojure imperatively.
There is some temptation to create a whole lot of refs or atoms and write code that constantly mucks with their state. This can be done, but it's not a good fit. It may also have poor performance, and rarely benefit from multiple cores.
trying to program Clojure 100% functionally.
A flip side to this: Some algorithms really do want a bit of mutable state. Religiously avoiding mutable state at all costs may result in slow or awkward algorithms. It takes judgement and a bit of experience to make the decision.
trying to do too much in Java.
Because it's so easy to reach out to Java, it's sometimes tempting to use Clojure as a scripting language wrapper around Java. Certainly you'll need to do exactly this when using Java library functionality, but there's little sense in (e.g.) maintaining data structures in Java, or using Java data types such as collections for which there are good equivalents in Clojure.
Lots of things already mentioned. I'll just add one more.
Clojure if treats Java Boolean objects always as true even if it's value is false. So if you have a java land function that returns a java Boolean value, make sure you do not check it directly
(if java-bool "Yes" "No")
but rather
(if (boolean java-bool) "Yes" "No").
I got burned by this with clojure.contrib.sql library that returns database boolean fields as java Boolean objects.
Keeping your head in loops.
You risk running out of memory if you loop over the elements of a potentially very large, or infinite, lazy sequence while keeping a reference to the first element.
Forgetting there's no TCO.
Regular tail-calls consume stack space, and they will overflow if you're not careful. Clojure has 'recur and 'trampoline to handle many of the cases where optimized tail-calls would be used in other languages, but these techniques have to be intentionally applied.
Not-quite-lazy sequences.
You may build a lazy sequence with 'lazy-seq or 'lazy-cons (or by building upon higher level lazy APIs), but if you wrap it in 'vec or pass it through some other function that realizes the sequence, then it will no longer be lazy. Both the stack and the heap can be overflown by this.
Putting mutable things in refs.
You can technically do it, but only the object reference in the ref itself is governed by the STM - not the referred object and its fields (unless they are immutable and point to other refs). So whenever possible, prefer to only immutable objects in refs. Same thing goes for atoms.
using loop ... recur to process sequences when map will do.
(defn work [data]
(do-stuff (first data))
(recur (rest data)))
vs.
(map do-stuff data)
The map function (in the latest branch) uses chunked sequences and many other optimizations. Also, because this function is frequently run, the Hotspot JIT usually has it optimized and ready to go with out any "warm up time".
Collection types have different behaviors for some operations:
user=> (conj '(1 2 3) 4)
(4 1 2 3) ;; new element at the front
user=> (conj [1 2 3] 4)
[1 2 3 4] ;; new element at the back
user=> (into '(3 4) (list 5 6 7))
(7 6 5 3 4)
user=> (into [3 4] (list 5 6 7))
[3 4 5 6 7]
Working with strings can be confusing (I still don't quite get them). Specifically, strings are not the same as sequences of characters, even though sequence functions work on them:
user=> (filter #(> (int %) 96) "abcdABCDefghEFGH")
(\a \b \c \d \e \f \g \h)
To get a string back out, you'd need to do:
user=> (apply str (filter #(> (int %) 96) "abcdABCDefghEFGH"))
"abcdefgh"
too many parantheses, especially with void java method call inside which results in NPE:
public void foo() {}
((.foo))
results in NPE from outer parantheses because inner parantheses evaluate to nil.
public int bar() { return 5; }
((.bar))
results in the easier to debug:
java.lang.Integer cannot be cast to clojure.lang.IFn
[Thrown class java.lang.ClassCastException]

safely parsing maps in clojure

I'm looking for an easy and safe way to parse a map, and only a map, from a string supplied by an untrusted source. The map contains keywords and numbers. What are the security concerns of using read to do this?
read is by default totally unsafe, it allows arbitrary code execution. Try (read-string "#=(println \"hello\")") as an example.
You can make it safer by binding *read-eval* to false. This will cause an exception to be triggered if there #= notation is used. For example:
(binding [*read-eval* false] (read-string "#=(println \"hello\")"))
Finally, depending on how you are using it there is a potential denial of service attack by supplying a large number of keywords (:foo, :bar). Keywords are interned and never freed so if enough are used the process will run out of memory. There's some discussion about that on the clojure-dev list.
If you want to be safe I think you basically need to parse it by hand without doing an eval. Here is an example of one way to do it:
(apply hash-map
(map #(%1 %2)
(cycle [#(keyword (apply str (drop 1 %)))
#(Integer/parseInt %)])
(string/split ":a 23 :b 32 :c 32" #" ")))
Depending on the types of numbers you want to support and how much error checking you want to do you will want to modify the two functions that are being cycled over to process every map every other value to a keyword or number.