Is immutability in clojure different than pass-by-value? - clojure

I'm just getting started with Clojure and I have no fp experience but the first thing that I've noticed is a heavy emphasis on immutability. I'm a bit confused by the emphasis, however. It looks like you can re-def global variables easily, essentially giving you a way to change state. The most significant difference that I can see is that function arguments are passed by value and can't be re-def(ined) within the function. Here's a repl snippet that shows what I mean:
towers.core=> (def a "The initial string")
#'towers.core/a
towers.core=> a
"The initial string"
towers.core=> (defn mod_a [aStr]
#_=> (prn aStr)
#_=> (prn a)
#_=> (def aStr "A different string")
#_=> (def a "A More Different string")
#_=> (prn aStr)
#_=> (prn a))
#'towers.core/mod_a
towers.core=> a
"The initial string"
towers.core=> (mod_a a)
"The initial string"
"The initial string"
"The initial string"
"A More Different string"
nil
towers.core=> a
"A More Different string"
If I begin my understanding of immutability in clojure by thinking of it as pass-by-value, what am I missing?

Call-by-value and immutability are two entirely distinct concepts. Indeed, one of the advantages of variable immutability is that such variables could be passed by name or reference without any effect on programme behaviour.
In short: don't think of them as linked.

Generally, very little is def'd in a Clojure script or namespace; def is mostly used for values that need to be visible outside it. Instead, values are created in let bindings as you need them in your functions.
def is used to define vars, as stated in Clojure Programming:
top level functions and values are all stored in vars, which are
defined within the current namespace using the def special form or one
of its derivatives.
Your use of def inside a function isn't making a local variable, it's creating a new global var, and you're effectively replacing the old reference with a new one each time.
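As a minimal sketch of that behaviour (set-greeting! and greeting are hypothetical names, not from the question), the following shows that a def inside a function body targets a namespace-level var rather than a local:

```clojure
;; Hypothetical names, for illustration only.
(defn set-greeting! []
  ;; def always targets a namespace-level var, even inside a function body
  (def greeting "hello"))

;; Compiling the defn already interned the var, but it has no value yet.
(bound? #'greeting) ;=> false

(set-greeting!)     ; running the function gives the global var its value
greeting            ;=> "hello"
```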
Once you move on to using let, you'll see how immutability works. For instance, a seq can be read any number of times without one reader consuming it for the others (as iterating over a list with an iterator would in Java), e.g.
(let [myseq (seq [1 2 3 4 5])
f (first myseq)
s (second myseq)
sum (reduce + myseq)]
(println f s sum))
;; 1 2 15
As you can see, it doesn't matter that (first myseq) has "taken" an item from the sequence. because the sequence myseq is immutable, it's still the same, and unaffected by the operations on it. Also, notice that there isn't a single def in the code above, the assignment happened in the let bindings where the values myseq, f, s and sum were created (and are immutable within the rest of the sexp).

Yes, immutability is different from pass-by-value, and you've missed a couple of important details of what's going on in your examples:
value mutation versus variable re-binding. Your code exemplifies re-binding, but doesn't actually mutate values.
shadowing. Your local aStr shadows your global aStr, so you can't see the global one -- although it's still there -- so there's no difference between the effects of (def a ...) and (def aStr ...) here. You can verify that the global is created after running your function.
A final point: Clojure doesn't force you to be purely functional -- it has escape hatches, and it's up to you to use them responsibly. Rebinding variables is one of those escape hatches.
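The difference between re-binding and mutation can be sketched roughly as follows. The names v and old-v are made up for illustration; the point is that re-defining v does not touch the vector that old-v still refers to.

```clojure
(def v [1 2 3])
(def old-v v)        ; a second name for the same immutable vector

(def v (conj v 4))   ; re-binds the var v to a new vector

old-v ;=> [1 2 3]    ; the original value was never mutated
v     ;=> [1 2 3 4]
```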

Just a note that technically Java, and by extension Clojure (on the JVM), is strictly pass-by-value. In many cases the thing passed is a reference to a structure that others may be reading, but because the structure is immutable nobody will change it out from under you. The important point is that mutability and immutability concern what happens after you pass the reference to something, so, as Marcin points out, they really are distinct concepts.
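A small sketch of this, with hypothetical names (data, same-object?, add-item): the function receives a reference to the very same vector, yet nothing it does can alter what the caller sees.

```clojure
(def data [1 2 3])

(defn same-object? [x]
  ;; identical? compares references, not contents
  (identical? x data))

(defn add-item [coll]
  ;; conj returns a new vector; coll itself is left alone
  (conj coll 4))

(same-object? data) ;=> true, no copy was made when passing the argument
(add-item data)     ;=> [1 2 3 4]
data                ;=> [1 2 3]
```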

I think of much of the immutability in Clojure as residing in (most of) the built-in data structure types and (most of) the functions that allow manipulating ... uh, no, modifying ... no, really, constructing new data structures from them. There are array-like things, but you can't modify them, there are lists, but you can't modify them, there are hash maps, but you can't modify them, etc., and the standard tools for using them actually create new data structures even when they look, to a novice, as if they're performing in-place modifications. And all of that does add up to a big difference.
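For instance, here is a quick sketch with a map, a vector and a list (the names m, v and l are only for illustration). Each call looks like an in-place update but returns a new structure:

```clojure
(def m {:a 1})
(def v [1 2 3])
(def l '(1 2 3))

(assoc m :b 2)  ;=> {:a 1, :b 2}
(dissoc m :a)   ;=> {}
(assoc v 0 :x)  ;=> [:x 2 3]
(conj l 0)      ;=> (0 1 2 3)

;; The originals are exactly as they were defined.
[m v l]         ;=> [{:a 1} [1 2 3] (1 2 3)]
```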

Related

Why can I change immutable variables in Clojure?

I come from the JavaScript world, where const is used to declare immutable variables.
The definition of an immutable variable is explained in the same way in Clojure.
However, this is allowed:
(def cheese "I like cheese")
...
...
(def cheese "Actually, I changed my mind")
When I run this, the REPL gives me "Actually, I changed my mind".
In JS, it will throw an error because a const cannot be changed.
I would appreciate it if someone explained where my understanding of immutable variables is incorrect in the clojure world?
Thanks
To be precise, Clojure has immutable values, not immutable variables. After all, the name Var is shorthand for "variable".
Imagine the number 5. You never need to worry about who "owns" it, or that someone might change its definition. Also, there can be many copies of that number used for many purposes in many parts of your program. Clojure extends this idea to collection values such as the vector [1 2 3] or the map {:first "Joe" :last "Cool"}.
Having said that, in Clojure a Var is normally used for a global "constant" value that is never changed (although it could). Using a Clojure Atom (global or local) is normal for values that do change. There are many other options (functions like reduce have an internal accumulator, for example).
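A minimal sketch of using an Atom for a value that does change (visits is a made-up name): the atom is the mutable reference, while the numbers it holds are ordinary immutable values.

```clojure
(def visits (atom 0))

(swap! visits inc)   ;=> 1,  applies a function to the current value
(swap! visits + 10)  ;=> 11, extra arguments are passed to the function
@visits              ;=> 11, deref reads the current value
(reset! visits 0)    ;=> 0,  replaces the value outright
```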
This list of documentation sources is a good place to start, esp the books "Getting Clojure" and "Brave Clojure".
As Alan mentions, Clojure has immutable values, not immutable variables.
When you execute
(def x 42)
what happens is that a Clojure Var is created, the Var is bound to the name (aka symbol) x, and the immutable value 42 is placed inside the Var. A Var is a container of values. Typically, only one value is ever placed in a Var. But, as in your example, there can be different immutable values placed inside the Var at different times.
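This can be checked at the REPL. In the sketch below, x-var is a hypothetical extra name used only to hold on to the Var object, so that we can see that a second def reuses the same container and only swaps the value inside it.

```clojure
(def x 42)
(def x-var #'x)          ; #'x is the Var itself, not the value inside it

(var-get #'x)            ;=> 42
@#'x                     ;=> 42, deref also reads the current value

(def x 43)               ; a different immutable value goes into the Var
(var-get #'x)            ;=> 43
(identical? x-var #'x)   ;=> true, still the same container
```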
Reading Clojure Vars and the Global Environment might be helpful.

Is there a canonical way to name variables that would otherwise cause name collisions?

Say I'd like to name a variable seq, referring to some kind of sequence.
But seq is already a function in clojure.core, so if I try to name my variable seq, the existing meaning of seq will be overwritten.
Is there a canonical way in Clojure to name a variable that would otherwise have a name collision with a default variable?
(e.g., in this case, my-seq could be used, but I don't know whether that would be standard as far as style goes)
There is no "standard" way of naming things (see the quote and the related joke).
If it is a function of only one thing, I often just name it arg. Sometimes, people use abbreviations like x for a single thing and xs for a sequence, list, or vector of things.
For small code fragments, abbreviating to the first letter of the "long" name is often sufficient. For example, when looping over a map, each MapEntry is often accessed as:
(for [[k v] some-map] ; destructure key/val into k & v
...)
Other times, you may prefix it with a letter like aseq or the-seq.
Another trick I often use is to add a descriptive suffix like
name-in
name-full
name-first
(yes, there is a Clojure function name).
Note that if you did name it seq, you would create a local variable that shadowed the clojure.core/seq function (it would not be "overwritten"). I often just "let it slide" if the scope of the shadowing is limited and the name in question is clear & appropriate (key and val are often victims of this practice). For name, I would also probably just ignore the shadowing of clojure.core/name, since I rarely use that function.
Note that you can shadow your own local variables. This is often handy to coerce data in to a specific format:
(defn foo
  [items]
  ; assume we need a sorted vector with no duplicates
  (let [items (vec (sort (set items)))]
    ...))
By shadowing the original items argument, we ensure the data is in the desired form without needing to come up with two good, descriptive names. When this technique doesn't quite fit, I often fall back to the suffix trick and just name them items-in and items or similar.
Sometimes a suffix indicating type is valuable, when multiple representations are required. For example:
items
items-set
items-vec
type-str
type-kw
type-sym
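As a rough illustration of the type-suffix convention (summarize is a hypothetical function, not from the answer), each local name says which representation of the data it holds:

```clojure
(defn summarize [items]
  (let [items-set (set items)               ; duplicates removed
        items-vec (vec (sort items-set))]   ; sorted, indexable
    {:distinct (count items-set)
     :smallest (first items-vec)}))

(summarize [3 1 3 2]) ;=> {:distinct 3, :smallest 1}
```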
There are many other possibilities. The main point is to make it clear to the reader what is happening, and to avoid creating booby traps for the unaware.
When in doubt, add a few more letters so it is obvious to a new reader what is happening.
You won't override clojure.core/seq. You will simply be shadowing the var seq with your local bindings or vars. One can always use the fully qualified name to reach the core seq.
Example:
;; shadow core seq
(def seq [1 2 3])
WARNING: seq already refers to: #'clojure.core/seq in namespace: user, being replaced by: #'user/seq
=> #'user/seq
;; local binding
(defn print-something [seq]
(prn seq)
(prn (var seq)))
=> #'user/print-something
;; use fully qualified name
(clojure.core/seq "abc")
=> (\a \b \c)
(print-something "a")
"a"
#'user/seq
=> nil
(prn seq)
[1 2 3]
=> nil
(var seq)
=> #'user/seq
But it's not a clean practice to shadow clojure.core vars, as it can lead to buggy code; it does more harm than good. I usually name vars based on code context, like employee-id-seq, url-seq, etc. Sometimes it's okay to use short names like x or s if the usage scope is limited. You can also look at the clojure.core implementation to find more examples.
A good guide: https://github.com/bbatsov/clojure-style-guide#idiomatic-names
I also recommend the clj-kondo plugin.

Creating clojure atoms with a function

I want to 1) create a list of symbols with the function below; then 2) create atoms with these symbols/names so that the atoms can be modified from other functions. This is the function to generate symbols/names:
(defn genVars [ dist ]
(let [ nms (map str (range dist)) neigs (map #(apply str "neig" %) nms) ]
(doseq [ v neigs ]
(intern *ns* (symbol v) [ ] ))
))
If dist=3, then three symbols, neig0 ... neig2, are created, each bound to an empty vector. Is it possible to create atoms with these symbols from a function, so that they are accessible from other functions? Any help is much appreciated, even if there are other ways to accomplish this.
Your function seems to be correct; just wrap the value in the intern call with an atom call. Also, I would rather use dotimes.
user> (defn gen-atoms [amount prefix]
        (dotimes [i amount]
          (intern *ns* (symbol (str prefix i)) (atom []))))
#'user/gen-atoms
user> (gen-atoms 2 "x")
nil
user> x0
#atom[[] 0x30f1a7b]
user> x1
#atom[[] 0x2149efef]
The desire to generate names suggests you would be better served by a single map instead:
(def neighbours (atom (make-neighbours)))
Where the definition of make-neighbours might look something like this:
(defn make-neighbours []
(into {} (for [i (range 10)]
[(str "neig" i) {:age i}])))
Where the other namespace would look values up using something like:
(get-in @data/neighbours ["neig0" :age])
Idiomatic Clojure tends to avoid creating many named global vars, preferring instead to collocate state in one or a few vars governed by Clojure's concurrency primitives (atom/ref/agent). I encourage you to think about whether your problem can be solved with a single atom in this way instead of requiring multiple vars.
Having said that, if you really really need multiple atoms, consider storing them all in a single map var instead of creating many global vars. Personally, I have never encountered a situation where creating many atoms was better than a single big atom (so I would be interested to hear about situations where this would be important).
If you really, really need many vars, be aware that defining vars inside a function is considered bad style (https://github.com/bbatsov/clojure-style-guide#dont-def-vars-inside-fns). With good reason, too! The beauty of using functions and data comes from the purity of the functions. def inside a function is particularly nasty, as it is not only a side effect but one that can alter execution flow.
Of course yes there is a way to achieve it, as another answer points out.
When it comes to defining things in ways that go beyond def and defn, there is quite a lot of precedent for using macros. For example, defroutes from Compojure, defschema from Schema, and deftest from clojure.test. Generally, anything that is a convenience form for creating vars. You could use a macro solution to create defs for your atoms:
(defmacro defneighbours [n]
  `(do
     ~@(for [sym (for [i (range n)]
                   (symbol (str "neig" i)))]
         `(def ~sym (atom {})))))
In my opinion this is actually less offensive than a functional version, only because it is creating global defs. It is a little more obvious about creating global defs by using the regular def syntax. But I only bring it up as a strawman, because this is still bad.
The reason functions and data work best is because they compose.
There are tangible considerations that make a single atom governing state very convenient. You can iterate over all neighbors conveniently, you can add new ones dynamically. Also you can do things like concatenating neighbors with other neighbors etc. Basically there are lots of function/data abstractions that you lock yourself out of if you create many global vars.
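A sketch of what that convenience looks like in practice, assuming a single atom holding a map shaped like the one from make-neighbours above (the literal contents here are made up):

```clojure
(def neighbours
  (atom {"neig0" {:age 0}
         "neig1" {:age 1}}))

;; Add a new neighbour dynamically; no new global var is needed.
(swap! neighbours assoc "neig2" {:age 2})

;; Update one neighbour in place (inside the atom).
(swap! neighbours update-in ["neig0" :age] inc)

;; Iterate over all of them with ordinary map functions.
(sort (keys @neighbours))            ;=> ("neig0" "neig1" "neig2")
(get-in @neighbours ["neig0" :age])  ;=> 1
```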
This is the reason that macros are generally considered useful for syntactic tricks, but best avoided in favor of functions and data. And it has a real impact on the flexibility of your code. For example going back to compojure; the macro syntax is actually very limiting, and for that reason I prefer not to use defroutes at all.
In summary:
Don't make lots of global defs if you can avoid it.
Prefer 1 atom over many atoms where possible.
Don't def inside a function.
Macros are best avoided in favor of functions and data.
Regardless of these guidelines, it is always good to explore what is possible, and I can't know your circumstances, so above all I hope you overcome your immediate problem and find Clojure a pleasant language to use.

Questions about Vars Clojure

I'm new to Clojure and I read that it is a functional language. I read that Clojure doesn't have variables, yet I still find (def n 5); what's the difference between that and a variable?
I can change the value of the var afterwards, so is it really that different from a variable? I don't understand the difference.
Assuming that by variable you mean a reference to a mutable storage location, I guess the main difference (depending on which language you compare against) is that dynamically rebinding a var in Clojure happens on a per-thread basis.
But the long answer is that you don't usually use a var in Clojure unless you really need a reference to a mutable storage location.
Clojure favors immutability and programming using values instead of references.
You can watch Rich Hickey's talk about values.
A summary would be: when you're programming in Clojure, what you have are values, not references to locations that may change (maybe even be changed by another thread).
So.
(let [a 1
      _ (println a) ; => prints 1
      a 2
      _ (println a)]) ; => prints 2
Even if you get the illusion of "changing a" in that code, you're not changing the "old" a; you just have a new value. (If someone had captured the first binding, they would still see the value 1.)
In fact, you can see that sequence of bindings as nested function calls in which a new a shadows the old one in scope; it is not the same "variable" at all.
((fn [a]
   (println a) ; => prints 1
   ((fn [a]
      (println a)) ; => prints 2
    2))
 1)
Nonetheless, if you need mutable storage with potentially many threads accessing it, Clojure gives you vars, atoms, refs, etc.
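The per-thread nature of dynamic rebinding mentioned earlier can be sketched like this. The names *level*, level and seen are hypothetical; a plain java.lang.Thread is used on purpose because, unlike future, it does not carry the caller's bindings along.

```clojure
(def ^:dynamic *level* 1)

(defn level [] *level*)

(binding [*level* 2]
  (level))              ;=> 2, visible only on this thread, inside binding

(level)                 ;=> 1, the root value is untouched

;; Another thread started inside the binding still sees the root value.
(def seen (promise))
(binding [*level* 2]
  (doto (Thread. #(deliver seen (level)))
    (.start)
    (.join)))
@seen                   ;=> 1
```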
It is not true that Clojure does not have variables, i.e. changeable references. They are, however, not used to store and look up intermediate results during computations that can be modeled as pure mathematical functions.
The concept of immutability is that of dealing with concrete values instead of references that you or others can change. Just like 1 is a value that you can't change, in Clojure the vector [3 2] is a value that you also can't change. E.g. if your algorithm is required to append 1 to that vector, it needs to create a new vector, leaving the old one intact, while in imperative languages you could just "change" the vector, potentially breaking everything that relies on it. The takeaway of immutability is that you don't have to worry about that anymore, and your code becomes less error prone.
Clojure implements immutable data structures in such a way that newly created values efficiently reuse most of the memory of the values they are based on. They provide nearly the same performance characteristics as their mutable counterparts, both for reading and for writing (i.e. creating new versions). You may want to read more about that here, and Rich Hickey does some excellent explaining in this conversation with Brian Beckmann.
Think of def as defining a constant. It can be changed by calling def again, but you should not do it.
The closest things to variables are agents, which are thread safe.
(def counter (agent 0))
(send counter inc)
(await counter) ; send is asynchronous, so wait for the action to finish
@counter
;;=> 1
You can also access a mutable field of a Java object.
Create an instance:
(def object (ClassName.))
Get the value:
(.fieldName object)
Set the value:
(set! (.fieldName object) 5)
The "whole" point of not having variables is to make programs automatically thread safe. Thread errors "always" come down to the same thing: thread 1 says that the variable a is 1, thread 2 says that a is 2, and after that something fails. This is also the reason to use pure functions: no variables, "no" thread problems.
See also this question:Clojure differences between Ref, Var, Agent, Atom, with examples and this one Clojure: vars, atoms, and refs (oh my).
Read the words in quotation marks above as meaning "in 80% of cases or more", not 100%.

How do I get core clojure functions to work with my defrecords

I have a defrecord called a bag. It behaves like a list of item to count. This is sometimes called a frequency or a census. I want to be able to do the following
(def b (bag/create [:k 1 :k2 3]))
(keys b)
=> (:k :k2)
I tried the following:
(defrecord MapBag [state]
Bag
(put-n [self item n]
(let [new-n (+ n (count self item))]
(MapBag. (assoc state item new-n))))
;... some stuff
java.util.Map
(getKeys [self] (keys state)) ;TODO TEST
Object
(toString [self]
(str "Bag: " (:state self))))
When I try to require it in a repl I get:
java.lang.ClassFormatError: Duplicate interface name in class file compile__stub/techne/bag/MapBag (bag.clj:12)
What is going on? How do I get a keys function on my bag? Also am I going about this the correct way by assuming clojure's keys function eventually calls getKeys on the map that is its argument?
defrecord automatically makes sure that any record it defines implements the IPersistentMap interface. So you can call keys on it without doing anything.
So you can define a record, and instantiate and call keys like this:
user> (defrecord rec [k1 k2])
user.rec
user> (def a-rec (rec. 1 2))
#'user/a-rec
user> (keys a-rec)
(:k1 :k2)
Your error message indicates that one of your declarations is duplicating an interface that defrecord gives you for free. I think it might actually be both.
Is there some reason why you can't just use a plain vanilla map for your purposes? With Clojure, you often want to use plain vanilla data structures when you can.
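As a rough sketch of the plain-map approach, assuming a bag is just a map from item to count (this put-n is a hypothetical stand-in for the protocol method in the question, not the asker's implementation):

```clojure
(defn put-n [bag item n]
  ;; fnil supplies 0 when the item is not in the bag yet
  (update bag item (fnil + 0) n))

(def b (-> {}
           (put-n :k 1)
           (put-n :k2 3)
           (put-n :k 2)))

b        ;=> {:k 3, :k2 3}
(keys b) ;=> (:k :k2)

;; clojure.core/frequencies builds the same kind of map from a collection.
(frequencies [:k :k2 :k]) ;=> {:k 2, :k2 1}
```

Because the bag is an ordinary map, keys, vals, merge-with and the rest of the core map functions work on it without any extra code.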
Edit: if for whatever reason you don't want IPersistentMap included, look into deftype.
Rob's answer is of course correct; I'm posting this one in response to the OP's comment on it -- perhaps it might be helpful in implementing the required functionality with deftype.
I have once written an implementation of a "default map" for Clojure, which acts just like a regular map except it returns a fixed default value when asked about a key not present inside it. The code is in this Gist.
I'm not sure if it will suit your use case directly, although you can use it to do things like
user> (:earth (assoc (DefaultMap. 0 {}) :earth 8000000000))
8000000000
user> (:mars (assoc (DefaultMap. 0 {}) :earth 8000000000))
0
More importantly, it should give you an idea of what's involved in writing this sort of thing with deftype.
Then again, it's based on clojure.core/emit-defrecord, so you might look at that part of Clojure's sources instead... It's doing a lot of things which you won't have to (because it's a function for preparing macro expansions -- there's lots of syntax-quoting and the like inside it which you have to strip away from it to use the code directly), but it is certainly the highest quality source of information possible. Here's a direct link to that point in the source for the 1.2.0 release of Clojure.
Update:
One more thing I realised might be important. If you rely on a special map-like type for implementing this sort of thing, the client might merge it into a regular map and lose the "defaulting" functionality (and indeed any other special functionality) in the process. As long as the "map-likeness" illusion maintained by your type is complete enough for it to be used as a regular map, passed to Clojure's standard functions etc., I think there might not be a way around that.
So, at some level the client will probably have to know that there's some "magic" involved; if they get correct answers to queries like (:mars {...}) (with no :mars in the {...}), they'll have to remember not to merge this into a regular map (merge-ing the other way around would work fine).
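The same asymmetry can be observed with an ordinary record, used here only as a stand-in for the custom map type (Planet and earth are hypothetical names, and this says nothing about the Gist's actual code). merge builds on its first argument, so the result keeps that argument's type:

```clojure
(defrecord Planet [name])

(def earth (->Planet "Earth"))

;; Merging a plain map into the special type keeps the special type.
(instance? Planet (merge earth {:population 8000000000}))  ;=> true

;; Merging the special type into a plain map yields a plain map.
(instance? Planet (merge {:population 8000000000} earth))  ;=> false
```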