I'm starting to use clojure to test a java library. So my question is more "what is the clojure way of doing this".
I have a lot of code that looks like the following
...
(with-open [fs (create-filesystem)]
(let [root (.getFile fs path)]
(is (.isDirectory root))
(is (.isComplete root)))
(let [suc (.getFile fs (str path "/_SUCCESS"))]
(is (.isFile suc))
(is (.isComplete suc))))
Since this is a java object, I need to verify a set of properties are true on the object. I know about doto that will let me do things like
(doto (.getFile fs path)
(.setPath path2)
(.setName name2))
and --> will let me have a list of partial functions and have each result pass through each function. Been thinking that something like --> but keeps passing the same object like doto would help with these tests. Would something like this be a good way to do this, or am I not really doing this the clojure way?
Thanks for your time!
You can use doto:
(doto (.getFile fs path) (.setPath path2) (.setPath name2))
You may learn about other threading macros like cond->, as->, some-> etc... for "the Clojure way" purpose.
There's no macro that does exactly what you wanted. Only .. and doto are object-specific macros.
Because your question is more "the Clojure way" thing, I recommend this:
https://github.com/bbatsov/clojure-style-guide
Yes, you can use the .. macro for java interop with threading semantics.
Example:
(.. (SomeObject/newBuilder)
(withId 1)
(withFooEnabled true)
(build))
There is nothing wrong with using a let as you have; it is a very direct and clear implementation of what you want to do. I think it is the best approach :)
But it is interesting to examine how you could achieve this with Clojure built in macros too:
(doto (.getFile fs path)
(-> (.isDirectory) (is))
(-> (.isComplete) (is)))
doto is special in that it propagates the initial value. It takes subsequent forms and inserts the initial value into the second position of the forms.
-> allows you to arrange forms inside out... which means you can combine it with doto such that the initial value is threaded through whatever layers of forms you would like to execute on it.
Now doto and -> are macros, so the syntax can be a little confusing... if you prefer to use functions then juxt is handy for calling multiple functions:
((juxt #(is (.isDirectory %)) #(is (.isComplete))) (.getFile fs path))
But for this situation it's quite ugly :)
Coming back to your original question, you can of course also create your own syntax with a macro... I describe one approach here: http://timothypratley.blogspot.pt/2016/05/composing-test-assertions-in-pipeline.html which chains multiple assertions in a convenient way.
I think for the example you describe sticking with a let form is the best way as it is most obvious... the other forms are shown here to discuss the options available, some of which might be appropriate in other circumstances.
Related
Visually, with-open looks similar to let. I know with-open is for a different purpose but I cannot find a clear answer as to what with-open is doing. And, what is the first argument in with-open?
The documentation says this:
"bindings => [name init ...]
Evaluates body in a try expression with names bound to the values
of the inits, and a finally clause that calls (.close name) on each
name in reverse order."
I do not understand this. I would really appreciate if someone explained what with-open actually does?
what does it do?
A macroexpanded example, with some formatting and after removing unnecessary explicit usages of clojure.core/...:
(macroexpand-1
'(with-open [reader (some-fn-that-creates-a-reader)]
(read-stuff reader)))
=>
(let [reader (some-fn-that-creates-a-reader)]
(try
;; This `with-let` is just a no-op, like `(let [] ...)` - an artifact of the implementation.
(with-open [] (read-stuff reader))
(finally (. reader close))))
As you can see, it's exactly the same as let, but it wraps the body in a try form and closes the values provided in the binding vector in the finally form.
when should I use it?
When you need for something to be closed at the end, regardless of whether the code in the body was successful or not.
It's a common pattern for reading/writing files with an explicit reader/writer or for using other IO that needs to be opened and closed explicitly.
I'm quite new to functional programming and clojure. I like it.
I'd like to know what the community thinks about following fn-naming approach:
Is it a feasible way to go with such naming, or is it for some reason something to avoid?
Example:
(defn coll<spec>?
"Checks wether the given thing is a collection of elements of ::spec-type"
[spec-type thing]
(and (coll? thing)
(every? #(spec/valid? spec-type %)
thing)))
(defn coll<spec>?nil
"Checks if the given thing is a collection of elements of ::spec-type or nil"
[spec-type thing]
(or (nil? thing)
(coll<spec>? spec-type thing)))
now ... somewhere else I'm using partial applications in clojure defrecrod/specs ...
similar to this:
; somewhere inside a defrecord
^{:spec (partial p?/coll<spec>?nil ::widgets)} widgets component-widgets
now, I created this for colls, sets, hash-maps with a generic-like form for specs (internal application of spec/valid? ...) and with a direct appication of predicates and respectively a <p> instead of the <spec>
currently I'm discussing with my colleagues wether this is a decent and meaningful and useful approach - since it is at least valid to name function like this in clojure - or wether the community thinks this is rather a no-go.
I'd very much like your educated opinions on that.
It's a weird naming convention, but a lot of projects use weird naming conventions because their authors believe they help. I imagine if I read your codebase, variable naming conventions would probably not be the thing that surprises me most. This is not an insult: every project has some stuff that surprises me.
But I don't see why you need these specific functions at all. As Sean Corfield says in a comment, there are good spec combinator functions to build fancier specs out of simpler ones. Instead of writing coll<spec>?, you could just combine s/valid? with s/coll-of:
(s/valid? (s/coll-of spec-type) thing)
Likewise you can use s/or and nil? to come up with coll<spec>?nil.
(s/valid? (s/or nil? (s/coll-of spec-type) thing))
This is obviously more to write at each call site than a mere coll<spec>?nil, so you may dismiss my advice. But the point is you can extract functions for defining these specs, and use those specs.
(defn nil-or-coll-of [spec]
(s/or nil? (s/coll-of spec)))
(s/valid? (nil-or-coll-of spec-type) thing)
Importantly, this means you could pass the result of nil-or-coll-of to some other function that expects a spec, perhaps to build a larger spec out of it, or to try to conform it. You can't do that if all your functions have s/valid? baked into them.
Think in terms of composing general functions, not defining a new function from scratch for each thing you want to do.
I want to 1) create a list of symbols with the function below; then 2) create atoms with these symbols/names so that the atoms can be modified from other functions. This is the function to generate symbols/names:
(defn genVars [ dist ]
(let [ nms (map str (range dist)) neigs (map #(apply str "neig" %) nms) ]
(doseq [ v neigs ]
(intern *ns* (symbol v) [ ] ))
))
If dist=3, then 3 symbols, neig0, ... neig2 are created each bound with an empty vector. If it is possible to functionally create atoms with these symbols so that they are accessible from other functions. Any help is much appreciated, even if there are other ways to accomplish this.
your function seems to be correct, just wrap the value in the intern call with atom call. Also I would rather use dotimes.
user>
(defn gen-atoms [amount prefix]
(dotimes [i amount]
(intern *ns* (symbol (str prefix i)) (atom []))))
#'user/gen-atoms
user> (gen-atoms 2 "x")
nil
user> x0
#atom[[] 0x30f1a7b]
user> x1
#atom[[] 0x2149efef]
The desire to generate names suggests you would be better served by a single map instead:
(def neighbours (atom (make-neighbours)))
Where the definition of make-neigbours might look something like this:
(defn make-neighbours []
(into {} (for [i (range 10)]
[(str "neig" i) {:age i}])))
Where the other namespace would look values up using something like:
(get-in #data/neighbours ["neig0" :age])
Idiomatic Clojure tends to avoid creating many named global vars, preferring instead to collocating state into one or a few vars governed by Clojure's concurrency primitives (atom/ref/agent). I encourage you to think about whether your problem can be solved with a single atom in this way instead of requiring defining multiple vars.
Having said that, if you really really need multiple atoms, consider storing them all in a single map var instead of creating many global vars. Personally, I have never encountered a situation where creating many atoms was better than a single big atom (so I would be interested to hear about situations where this would be important).
If you really really need many vars, be aware that defining vars inside a function is actually bad style (https://github.com/bbatsov/clojure-style-guide#dont-def-vars-inside-fns). With good reason too! The beauty of using functions and data comes from the purity of the functions. def inside a function is particularly nasty as it is not only a side-effect, but is an potentially execution flow altering side-effect.
Of course yes there is a way to achieve it, as another answer points out.
Where it comes to defining things that goes beyond def and defn, there is quite a lot of precedence to using macros. For example defroutes from compojure, defschema from Schema, deftest from clojure.test. Generally anything that is a convenience form for creating vars. You could use a macro solution to create defs for your atoms:
(defmacro defneighbours [n]
`(do
~#(for [sym (for [i (range n)]
(symbol (str "neig" i)))]
`(def ~sym (atom {}))))
In my opinion this is actually less offensive than a functional version, only because it is creating global defs. It is a little more obvious about creating global defs by using the regular def syntax. But I only bring it up as a strawman, because this is still bad.
The reason functions and data work best is because they compose.
There are tangible considerations that make a single atom governing state very convenient. You can iterate over all neighbors conveniently, you can add new ones dynamically. Also you can do things like concatenating neighbors with other neighbors etc. Basically there are lots of function/data abstractions that you lock yourself out of if you create many global vars.
This is the reason that macros are generally considered useful for syntactic tricks, but best avoided in favor of functions and data. And it has a real impact on the flexibility of your code. For example going back to compojure; the macro syntax is actually very limiting, and for that reason I prefer not to use defroutes at all.
In summary:
Don't make lots of global defs if you can avoid it.
Prefer 1 atom over many atoms where possible.
Don't def inside a function.
Macros are best avoided in favor of functions and data.
Regardless of these guidelines, it is always good to explore what is possible, and I can't know your circumstances, so above all I hope you overcome your immediate problem and find Clojure a pleasant language to use.
I have a defrecord called a bag. It behaves like a list of item to count. This is sometimes called a frequency or a census. I want to be able to do the following
(def b (bag/create [:k 1 :k2 3])
(keys bag)
=> (:k :k1)
I tried the following:
(defrecord MapBag [state]
Bag
(put-n [self item n]
(let [new-n (+ n (count self item))]
(MapBag. (assoc state item new-n))))
;... some stuff
java.util.Map
(getKeys [self] (keys state)) ;TODO TEST
Object
(toString [self]
(str ("Bag: " (:state self)))))
When I try to require it in a repl I get:
java.lang.ClassFormatError: Duplicate interface name in class file compile__stub/techne/bag/MapBag (bag.clj:12)
What is going on? How do I get a keys function on my bag? Also am I going about this the correct way by assuming clojure's keys function eventually calls getKeys on the map that is its argument?
Defrecord automatically makes sure that any record it defines participates in the ipersistentmap interface. So you can call keys on it without doing anything.
So you can define a record, and instantiate and call keys like this:
user> (defrecord rec [k1 k2])
user.rec
user> (def a-rec (rec. 1 2))
#'user/a-rec
user> (keys a-rec)
(:k1 :k2)
Your error message indicates that one of your declarations is duplicating an interface that defrecord gives you for free. I think it might actually be both.
Is there some reason why you cant just use a plain vanilla map for your purposes? With clojure, you often want to use plain vanilla data structures when you can.
Edit: if for whatever reason you don't want the ipersistentmap included, look into deftype.
Rob's answer is of course correct; I'm posting this one in response to the OP's comment on it -- perhaps it might be helpful in implementing the required functionality with deftype.
I have once written an implementation of a "default map" for Clojure, which acts just like a regular map except it returns a fixed default value when asked about a key not present inside it. The code is in this Gist.
I'm not sure if it will suit your use case directly, although you can use it to do things like
user> (:earth (assoc (DefaultMap. 0 {}) :earth 8000000000))
8000000000
user> (:mars (assoc (DefaultMap. 0 {}) :earth 8000000000))
0
More importantly, it should give you an idea of what's involved in writing this sort of thing with deftype.
Then again, it's based on clojure.core/emit-defrecord, so you might look at that part of Clojure's sources instead... It's doing a lot of things which you won't have to (because it's a function for preparing macro expansions -- there's lots of syntax-quoting and the like inside it which you have to strip away from it to use the code directly), but it is certainly the highest quality source of information possible. Here's a direct link to that point in the source for the 1.2.0 release of Clojure.
Update:
One more thing I realised might be important. If you rely on a special map-like type for implementing this sort of thing, the client might merge it into a regular map and lose the "defaulting" functionality (and indeed any other special functionality) in the process. As long as the "map-likeness" illusion maintained by your type is complete enough for it to be used as a regular map, passed to Clojure's standard function etc., I think there might not be a way around that.
So, at some level the client will probably have to know that there's some "magic" involved; if they get correct answers to queries like (:mars {...}) (with no :mars in the {...}), they'll have to remember not to merge this into a regular map (merge-ing the other way around would work fine).
I've found myself using the following idiom lately in clojure code.
(def *some-global-var* (ref {}))
(defn get-global-var []
#*global-var*)
(defn update-global-var [val]
(dosync (ref-set *global-var* val)))
Most of the time this isn't even multi-threaded code that might need the transactional semantics that refs give you. It just feels like refs are for more than threaded code but basically for any global that requires immutability. Is there a better practice for this? I could try to refactor the code to just use binding or let but that can get particularly tricky for some applications.
I always use an atom rather than a ref when I see this kind of pattern - if you don't need transactions, just a shared mutable storage location, then atoms seem to be the way to go.
e.g. for a mutable map of key/value pairs I would use:
(def state (atom {}))
(defn get-state [key]
(#state key))
(defn update-state [key val]
(swap! state assoc key val))
Your functions have side effects. Calling them twice with the same inputs may give different return values depending on the current value of *some-global-var*. This makes things difficult to test and reason about, especially once you have more than one of these global vars floating around.
People calling your functions may not even know that your functions are depending on the value of the global var, without inspecting the source. What if they forget to initialize the global var? It's easy to forget. What if you have two sets of code both trying to use a library that relies on these global vars? They are probably going to step all over each other, unless you use binding. You also add overheads every time you access data from a ref.
If you write your code side-effect free, these problems go away. A function stands on its own. It's easy to test: pass it some inputs, inspect the outputs, they'll always be the same. It's easy to see what inputs a function depends on: they're all in the argument list. And now your code is thread-safe. And probably runs faster.
It's tricky to think about code this way if you're used to the "mutate a bunch of objects/memory" style of programming, but once you get the hang of it, it becomes relatively straightforward to organize your programs this way. Your code generally ends up as simple as or simpler than the global-mutation version of the same code.
Here's a highly contrived example:
(def *address-book* (ref {}))
(defn add [name addr]
(dosync (alter *address-book* assoc name addr)))
(defn report []
(doseq [[name addr] #*address-book*]
(println name ":" addr)))
(defn do-some-stuff []
(add "Brian" "123 Bovine University Blvd.")
(add "Roger" "456 Main St.")
(report))
Looking at do-some-stuff in isolation, what the heck is it doing? There are a lot of things happening implicitly. Down this path lies spaghetti. An arguably better version:
(defn make-address-book [] {})
(defn add [addr-book name addr]
(assoc addr-book name addr))
(defn report [addr-book]
(doseq [[name addr] addr-book]
(println name ":" addr)))
(defn do-some-stuff []
(let [addr-book (make-address-book)]
(-> addr-book
(add "Brian" "123 Bovine University Blvd.")
(add "Roger" "456 Main St.")
(report))))
Now it's clear what do-some-stuff is doing, even in isolation. You can have as many address books floating around as you want. Multiple threads could have their own. You can use this code from multiple namespaces safely. You can't forget to initialize the address book, because you pass it as an argument. You can test report easily: just pass the desired "mock" address book in and see what it prints. You don't have to care about any global state or anything but the function you're testing at the moment.
If you don't need to coordinate updates to a data structure from multiple threads, there's usually no need to use refs or global vars.