case or defmulti for closed-world, value-based dispatch? - clojure

Suppose I have a closed world of valid dispatch keys; in my concrete example, it's the type of nybbles. There are two obvious ways to define an operation of some parameters that behaves differently based on a nybble argument:
Using case, e.g.:
(defn read-arg [arg-mode mem cursor]
(case arg-mode
0x0 [:imm 0]
0x1 [:imm (read-fwd peek-word8 mem cursor)]
;; And so on
0xf [:ram (read-fwd peek-word32 mem cursor)]))
Using defmulti/defmethod:
(defmulti read-arg (fn [arg-mode mem cursor] arg-mode))
(defmethod read-arg 0x0 [_ mem cursor] [:imm 0])
(defmethod read-arg 0x1 [_ mem cursor] [:imm (read-fwd peek-word8 mem cursor)])
;; And so on
(defmethod read-arg 0xf [_ mem cursor] [:ram (read-fwd peek-word32 mem cursor)])
Which is considered nicer style for Clojure? Would the answer be different if the dispatch was done on symbols instead of nybbles?

I would expect the dispatch itself to be so much faster in this case using case as to render any stylistic considerations moot – dispatching based on an integer value in a narrow range is the best case for case, really – and then I actually think that case is also better style (this really does look like closed-world dispatch, so it's good to use a mechanism that signals that).
As a further comment, it seems like the case "result expressions" will be fairly short here, which is good. If not, then it might be worth factoring them out into their own functions to prevent the method performing the dispatch from growing too large. (The JIT will then be able to inline just those portions it finds are worth inlining.)
As for whether I'd feel different if symbols where involved, that depends on a number of factors – performance (is it important? is there a big difference in benchmarks?), number and size of branches (it's good to keep overall method size small – for performance and readability reasons – so the ability to define each branch in a separate defmethod can be convenient) etc.

Related

Clojure pipe collection one by one

How in Clojure process collections like in Java streams - one by one thru all the functions instead of evaluating all the elements in all the stack frame. Also I would describe it as Unix pipes (next program pulls chunk by chunk from previous one).
As far as I understand your question, you may want to look into two things.
First, understand the sequence abstraction. This is a way of looking at collections which consumes them one by one and lazily. It is an important Clojure idiom and you'll meet well known functions like map, filter, reduce, and many more. Also the macro ->>, which was already mentioned in a comment, will be important.
After that, when you want to dig deeper, you probably want to look into transducers and reducers. In a grossly oversimplifying summary, they allow you combine several lazy functions into one function and then process a collection with less laziness, less memory consumption, more performance, and possibly on several threads. I consider these to be advanced topics, though. Maybe the sequences are already what you were looking for.
Here is a simple example from ClojureDocs.org
;; Use of `->` (the "thread-first" macro) can help make code
;; more readable by removing nesting. It can be especially
;; useful when using host methods:
;; Arguably a bit cumbersome to read:
user=> (first (.split (.replace (.toUpperCase "a b c d") "A" "X") " "))
"X"
;; Perhaps easier to read:
user=> (-> "a b c d"
.toUpperCase
(.replace "A" "X")
(.split " ")
first)
"X"
As always, don't forget the Clojure CheatSheet or Clojure for the Brave and True.

purpose of clojure reduced function

What is the purpose of the clojure reduced function (added in clojure 1.5, https://clojure.github.io/clojure/clojure.core-api.html#clojure.core/reduced)
I can't find any examples for it. The doc says:
Wraps x in a way such that a reduce will terminate with the value x.
There is also a reduced? which is acquainted to it
Returns true if x is the result of a call to reduced
When I try it out, e.g with (reduce + (reduced 100)), I get an error instead of 100. Also why would I reduce something when I know the result in advance? Since it was added there is likely a reason, but googling for clojure reduced only contains reduce results.
reduced allows you to short circuit a reduction:
(reduce (fn [acc x]
(if (> acc 10)
(reduced acc)
(+ acc x)))
0
(range 100))
;= 15
(NB. the edge case with (reduced 0) passed in as the initial value doesn't work as of Clojure 1.6.)
This is useful, because reduce-based looping is both very elegant and very performant (so much so that reduce-based loops are not infrequently more performant than the "natural" replacements based on loop/recur), so it's good to make this pattern as broadly applicable as possible. The ability to short circuit reduce vastly increases the range of possible applications.
As for reduced?, I find it useful primarily when implementing reduce logic for new data structures; in regular code, I let reduce perform its own reduced? checks where appropriate.

Is immutability in clojure different than pass-by-value?

I'm just getting started with Clojure and I have no fp experience but the first thing that I've noticed is a heavy emphasis on immutability. I'm a bit confused by the emphasis, however. It looks like you can re-def global variables easily, essentially giving you a way to change state. The most significant difference that I can see is that function arguments are passed by value and can't be re-def(ined) within the function. Here's a repl snippet that shows what I mean:
towers.core=> (def a "The initial string")
#'towers.core/a
towers.core=> a
"The initial string"
towers.core=> (defn mod_a [aStr]
#_=> (prn aStr)
#_=> (prn a)
#_=> (def aStr "A different string")
#_=> (def a "A More Different string")
#_=> (prn aStr)
#_=> (prn a))
#'towers.core/mod_a
towers.core=> a
"The initial string"
towers.core=> (mod_a a)
"The initial string"
"The initial string"
"The initial string"
"A More Different string"
nil
towers.core=> a
"A More Different string"
If I begin my understanding of immutability in clojure by thinking of it as pass-by-value, what am I missing?
Call-by-value and immutability are two entirely distinct concepts. Indeed, one of the advantages of variable immutability is that such variables could be passed by name or reference without any effect on programme behaviour.
In short: don't think of them as linked.
generally very little is "def"d in a clojure script/class, it's mostly used for generating values that are used outside of the class. instead values are created in let bindings as you need them in your methods.
def is used to define vars, as stated in Clojure Programming:
top level functions and values are all stored in vars, which are
defined within the current namespace using the def special form or one
of its derivatives.
Your use of def inside a function isn't making a local variable, it's creating a new global var, and you're effectively replacing the old reference with a new one each time.
When you move onto using let, you'll see how immutability works, for instance using things like seqs which can be used over without penalty of something else having also read them (like an iteration over a list would in java for instance), e.g.
(let [myseq (seq [1 2 3 4 5])
f (first myseq)
s (second myseq)
sum (reduce + myseq)]
(println f s sum))
;; 1 2 15
As you can see, it doesn't matter that (first myseq) has "taken" an item from the sequence. because the sequence myseq is immutable, it's still the same, and unaffected by the operations on it. Also, notice that there isn't a single def in the code above, the assignment happened in the let bindings where the values myseq, f, s and sum were created (and are immutable within the rest of the sexp).
Yes, immutability is different from pass-by-value, and you've missed a couple of important details of what's going on in your examples:
value mutation versus variable re-binding. Your code exemplifies re-binding, but doesn't actually mutate values.
shadowing. Your local aStr shadows your global aStr, so you can't see the global one -- although it's still there -- so there's no difference between the effects of (def a ...) and (def aStr ...) here. You can verify that the global is created after running your function.
A final point: Clojure doesn't force you to be purely functional -- it has escape hatches, and it's up to you to use them responsibly. Rebinding variables is one of those escape hatches.
just a note that technically Java, and by extension Clojure (on the JVM) is strictly pass by value. In many cases the thing passed is a reference to a structure that others may be reading, though because it is immutable nobody will be changing out from under you. The important point being that mutability and immutability happen after you pass the reference to something so, and Marcin points out they really are distinct.
I think of much of the immutability in Clojure as residing in (most of) the built-in data structure types and (most of) the functions that allow manipulating ... uh, no, modifying ... no, really, constructing new data structures from them. There are array-like things, but you can't modify them, there are lists, but you can't modify them, there are hash maps, but you can't modify them, etc., and the standard tools for using them actually create new data structures even when they look, to a novice, as if they're performing in-place modifications. And all of that does add up to a big difference.

Use cases for metadata in Clojure [duplicate]

How have you used metadata in your Clojure program?
I saw one example from Programming Clojure:
(defn shout [#^{:tag String} message] (.toUpperCase message))
;; Clojure casts message to String and then calls the method.
What are some uses? This form of programming is completely new to me.
Docstrings are stored as metadata under the :doc key. This is probably the number 1 most apparent use of metadata.
Return and parameter types can be optionally tagged with metadata to improve performance by avoiding the overhead of reflecting on the types at runtime. These are also known as "type hints." #^String is a type hint.
Storing things "under the hood" for use by the compiler, such as the arglist of a function, the line number where a var has been defined, or whether a var holds a reference to a macro. These are usually automatically added by the compiler and normally don't need to be manipulated directly by the user.
Creating simple testcases as part of a function definition:
(defn #^{:test (fn [] (assert true))} something [] nil)
(test #'something)
If you are reading Programming Clojure, then Chapter 2 provides a good intro to metadata. Figure 2.3 provides a good summary of common metadata.
For diversity some answer, which does not concentrate on interaction with the language itself:
You can also eg. track the source of some data. Unchecked input is marked as :tainted. A validator might check things and then set the status to :clean. Code doing security relevant things might then barf on :tainted and only accept :cleaned input.
Meta Data was extremely useful for me for purposes of typing. I'm talking not just about type hints, but about complete custom type system. Simplest example - overloading of print-method for structs (or any other var):
(defstruct my-struct :foo :bar :baz)
(defn make-my-struct [foo bar baz]
(with-meta (struct-map my-struct :foo foo :bar baz :baz baz)
{:type ::my-struct}))
(defmethod print-method
[my-struct writer]
(print-method ...))
In general, together with Clojure validation capabilities it may increase safety and, at the same time, flexibility of your code very very much (though it will take some more time to do actual coding).
For more ideas on typing see types-api.
metadata is used by the compiler extensively for things like storing the type of an object.
you use this when you give type hints
(defn foo [ #^String stringy] ....
I have used it for things like storing the amount of padding that was added to a number. Its intended for information that is 'orthogonal' to the data and should not be considered when deciding if you values are the same.

Best Practice for globals in clojure, (refs vs alter-var-root)?

I've found myself using the following idiom lately in clojure code.
(def *some-global-var* (ref {}))
(defn get-global-var []
#*global-var*)
(defn update-global-var [val]
(dosync (ref-set *global-var* val)))
Most of the time this isn't even multi-threaded code that might need the transactional semantics that refs give you. It just feels like refs are for more than threaded code but basically for any global that requires immutability. Is there a better practice for this? I could try to refactor the code to just use binding or let but that can get particularly tricky for some applications.
I always use an atom rather than a ref when I see this kind of pattern - if you don't need transactions, just a shared mutable storage location, then atoms seem to be the way to go.
e.g. for a mutable map of key/value pairs I would use:
(def state (atom {}))
(defn get-state [key]
(#state key))
(defn update-state [key val]
(swap! state assoc key val))
Your functions have side effects. Calling them twice with the same inputs may give different return values depending on the current value of *some-global-var*. This makes things difficult to test and reason about, especially once you have more than one of these global vars floating around.
People calling your functions may not even know that your functions are depending on the value of the global var, without inspecting the source. What if they forget to initialize the global var? It's easy to forget. What if you have two sets of code both trying to use a library that relies on these global vars? They are probably going to step all over each other, unless you use binding. You also add overheads every time you access data from a ref.
If you write your code side-effect free, these problems go away. A function stands on its own. It's easy to test: pass it some inputs, inspect the outputs, they'll always be the same. It's easy to see what inputs a function depends on: they're all in the argument list. And now your code is thread-safe. And probably runs faster.
It's tricky to think about code this way if you're used to the "mutate a bunch of objects/memory" style of programming, but once you get the hang of it, it becomes relatively straightforward to organize your programs this way. Your code generally ends up as simple as or simpler than the global-mutation version of the same code.
Here's a highly contrived example:
(def *address-book* (ref {}))
(defn add [name addr]
(dosync (alter *address-book* assoc name addr)))
(defn report []
(doseq [[name addr] #*address-book*]
(println name ":" addr)))
(defn do-some-stuff []
(add "Brian" "123 Bovine University Blvd.")
(add "Roger" "456 Main St.")
(report))
Looking at do-some-stuff in isolation, what the heck is it doing? There are a lot of things happening implicitly. Down this path lies spaghetti. An arguably better version:
(defn make-address-book [] {})
(defn add [addr-book name addr]
(assoc addr-book name addr))
(defn report [addr-book]
(doseq [[name addr] addr-book]
(println name ":" addr)))
(defn do-some-stuff []
(let [addr-book (make-address-book)]
(-> addr-book
(add "Brian" "123 Bovine University Blvd.")
(add "Roger" "456 Main St.")
(report))))
Now it's clear what do-some-stuff is doing, even in isolation. You can have as many address books floating around as you want. Multiple threads could have their own. You can use this code from multiple namespaces safely. You can't forget to initialize the address book, because you pass it as an argument. You can test report easily: just pass the desired "mock" address book in and see what it prints. You don't have to care about any global state or anything but the function you're testing at the moment.
If you don't need to coordinate updates to a data structure from multiple threads, there's usually no need to use refs or global vars.