What is the rationale for Symbols in Clojure being bound to an underlying object and having an optional separate value? Perhaps I am missing something elementary, but it would be great if someone could point out the why.
General intro:
Symbols in any Lisp are used as identifiers. If you're going to refer to the value of a variable, say, you need to have a way of naming it; that's what symbols are for. Remember that all Lisp code gets translated at read time to Lisp data structures; identifiers must also be represented by some data structure and it happens to be the symbol. Upon encountering a symbol, eval dispatches to some kind of a "name lookup" operation.
Moving from Lisp generalities to Clojure particulars, the behaviour of the Clojure eval / compiler is that upon encountering a symbol, it takes it to be a name for either a let-introduced local variable or function parameter or the name of an entry in a namespace. Actually only non-namespace-qualified symbols may be used in the first capacity (meaning symbols of the form foo and not some-namespace/foo).
A roughly sketched example:
For a non-namespace-qualified symbol foo, if a let binding / function parameter of name foo is found, the symbol evaluates to its value. If not, the symbol gets transformed to the form *ns*/foo (*ns* denotes the current namespace) and an attempt is made to look up a corresponding entry in *ns*; if there is such an entry, its value is returned, if not, an exception is thrown.
Note that a symbol like identity, when used in namespace quux, will be resolved to clojure.core/identity through an intermediate step in which an entry under quux/identity is discovered; this will normally refer to clojure.core/identity. That's an implementation detail one doesn't think of when coding intuitively, but which I find impossible not to mention when trying to explain this.
A symbol which is already namespace-qualified (something like a zip/root in a namespace which refers to clojure.zip without use'ing it) will be looked up in the appropriate namespace.
There's some added complexity with macros (which can only occur in operator position), but it's not really something relevant to the behaviour of symbols themselves.
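The lookup order described above can be sketched at the REPL. This is only an illustration; demo-value and lookup-demo are hypothetical names that do not come from the discussion.

```clojure
;; Hypothetical names, used only to illustrate the lookup order.
(def demo-value :from-var)

(defn lookup-demo []
  ;; The let-bound local shadows the namespace entry of the same name.
  (let [demo-value :from-local]
    demo-value))

(lookup-demo)        ; => :from-local
demo-value           ; => :from-var (no local in scope, so the Var is used)

;; An unqualified core name maps to the Var in clojure.core.
(resolve 'identity)  ; => #'clojure.core/identity
```

Note that resolve only consults the namespace mappings; it knows nothing about locals, which are handled by the compiler.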
Vars vs Symbols:
Note that in Clojure, symbols are not themselves storage locations -- Vars are. So when I say in the above that a symbol gets looked up in a namespace, what I mean is that eval looks up the Var named by the symbol resolved to its namespace-qualified form and then takes the value of that. The special form var (often abbreviated to #') modifies this behaviour so that the Var object itself is returned. The mechanics of symbol-to-Var resolution are unchanged, though.
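A small sketch of the difference between evaluating a symbol and asking for the Var it names (answer is a hypothetical name picked for the example):

```clojure
(def answer 42)            ; hypothetical Var, for illustration only

answer                     ; => 42, the value held by the Var
(var answer)               ; the Var object itself
(= (var answer) #'answer)  ; => true, #' is shorthand for the var special form
(deref #'answer)           ; => 42
@#'answer                  ; => 42, @ is shorthand for deref
(var? #'answer)            ; => true
(var? answer)              ; => false, this is just the number 42
```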
Closing remarks:
Note that all this means that symbols are only "bound" to objects in the sense that eval, when evaluating a symbol, goes on to look for some further object. The symbol itself has no "slot" or "field" for an object to be bound to it; any impression that a symbol is "bound" to some object is due to eval's workings. This is a Clojure characteristic, as in some Lisps symbols do themselves act as storage locations.
Finally, one can use the usual quoting mechanism to prevent the evaluation of a symbol: in 'foo, the symbol foo will not be evaluated (so no name lookup of any sort will be performed); it'll be returned unchanged instead.
In response to OP's comment: Try this for fun:
(defmacro symbol?? [x]
  (if (symbol? x)
    true
    false))
(def s 1)
(symbol? s)
; => false
(symbol?? s)
; => true
(symbol? 's)
; => true
(symbol?? 's)
; => false
The last one explained: 's is shorthand for (quote s); this is a list structure, not a symbol. A macro operates on its arguments passed in directly, without being evaluated; so in (symbol?? 's) it actually sees the (quote s) list structure, which is of course not itself a symbol -- although when passed to eval, it would evaluate to one.
There may be some confusion here from the different usages of the term "symbol" in Common Lisp and in Clojure.
In Common Lisp, a "symbol" is a location in memory, a place where data can be stored. The "value" of a symbol is the data stored at that location in memory.
In Clojure, a "symbol" is just a name. It has no value.
When the Clojure compiler encounters a symbol, it tries to resolve it as:
a Java class name (if the symbol contains a dot)
a local (as with "let" or function parameters)
a Var in the current Namespace
a Var referred from another Namespace
The Var, as a previous poster pointed out, represents a storage location.
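Two of these cases can be observed with resolve, which looks a symbol up in the current namespace's mappings. This is a rough sketch rather than a description of the compiler's internals; locals are not visible to resolve, so that case is not shown.

```clojure
;; A dotted symbol naming a Java class resolves to the class itself.
(class? (resolve 'java.util.Date))  ; => true

;; A symbol referred from another namespace (here clojure.core) resolves to a Var.
(var? (resolve 'map))               ; => true

;; The symbol itself carries only a name, and optionally a namespace part.
(name 'clojure.core/map)            ; => "map"
(namespace 'clojure.core/map)       ; => "clojure.core"
```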
There are good reasons why Clojure separates Vars from Symbols. First, it avoids the annoyance of Common Lisp's automatically-interned symbols, which can "pollute" a package with unwanted symbols.
Secondly, Clojure Vars have special semantics with regard to concurrency. A Var has exactly one "root binding" visible to all threads. (When you type "def" you are setting the root binding of a Var.) Changes to a Var made within a thread (using "set!" or "binding") are thread-local: they are visible only to that thread and to work it hands off through binding-conveying constructs such as "future".
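A minimal single-threaded sketch of root versus thread-local bindings, assuming a hypothetical dynamic Var named *level*:

```clojure
(def ^:dynamic *level* :root)   ; def sets the root binding (hypothetical Var)

(defn current-level [] *level*)

;; binding installs a thread-local value for the dynamic extent of its body.
(binding [*level* :rebound]
  (current-level))              ; => :rebound

;; set! changes the thread-local binding, never the root.
(binding [*level* :rebound]
  (set! *level* :changed)
  (current-level))              ; => :changed

;; Once the binding form exits, the root binding is visible again.
(current-level)                 ; => :root
```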
I read online that Clojure uses the ASM library to generate JVM bytecode. I also saw that Clojure has a REPL.
I assume each line of code executed by the REPL is compiled into a Java class using ASM and then that class is loaded to execute the code. If this is the case then each line would cause a new class file to be generated, so I'm not sure how local variables declared on one line could be shared with the lines which follow in the REPL.
Does anyone know how Clojure's REPL works? I tried reading the Clojure source code but I don't know much Clojure.
It's not "each line" that is compiled at a time, but "each form".
In the REPL, you are always in some namespace. You can change the current namespace of a REPL by using in-ns. In each namespace, there is a binding between symbols (loosely, "names") and Vars (loosely, a container that holds an immutable value). The "state" of the namespace is in the bindings of that namespace.
For example, if you evaluate the form (def a 17) in the current namespace, that will create a new (if it does not already exist) binding for the name a that points to a Var that contains the value 17. Now, you could later evaluate the form (+ a 25) in the same namespace. That will get the value of a in the namespace and add that to 25 to return 42.
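Entered form by form at the REPL, that sequence might look like the following sketch. The last two forms are additions for illustration and show that the state lives in the Var rather than in any compiled class.

```clojure
(def a 17)        ; creates (or reuses) the Var named a in the current namespace
(+ a 25)          ; => 42, a later form reads the Var's current value

(resolve 'a)      ; the Var that the symbol a maps to in the current namespace

;; Re-evaluating def updates the same Var, so later forms see the new value.
(def a 100)
(+ a 25)          ; => 125
```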
The above is for symbols that are local to the namespace. These symbols are available to all forms evaluated in that namespace. (They also can be accessed from other namespaces, but I'll leave that out for now).
You might take a look at https://clojure.org/reference/evaluation if you have not already. The article at https://clojure.org/reference/vars might also be helpful.
There are different symbols for statements (processes) and function calls in flowcharts. When I have a statement which assigns the return value of a function to a variable, how can I show it in a flowchart? Should I show it as a process or a function (i.e. a plain rectangle or a rectangle with stripes)?
q = myFunction(x,y);
From Flowchart Symbols Defined:
A Predefined Process symbol is a marker for another process step or series of process flow steps that are formally defined elsewhere. This shape commonly depicts sub-processes (or subroutines in programming flowcharts). If the sub-process is considered "known" but not actually defined in a process procedure, work instruction, or some other process flowchart or documentation, then it is best not to use this symbol since it implies a formally defined process.
Given
q = myFunction(x,y);
Use a Predefined Process symbol, if myFunction is formally defined elsewhere; otherwise use a Process symbol.
Do stars have a specific meaning when used in defining symbols (such as in functions, bindings etc.)? Is it just a normal binding name when I define something like:
(def *clojure* "CLOJURE")
When I def this, I get the following in the REPL:
Warning: *clojure* not declared dynamic and thus is not dynamically rebindable, but its name suggests otherwise. Please either indicate ^:dynamic *clojure* or change the name.
Where can I learn more about the special characters and things like **?
By convention, variables with 'earmuffs' (i.e. enclosed by *s) are dynamic vars which can be rebound using binding and related functions e.g.
(def ^:dynamic *dyn*)
(binding [*dyn* "Hello world!"]
  (println *dyn*))
If you name a variable in this way without making it dynamic, you get the warning you're seeing.
Check this Clojure style guide. Earmuffs are just one type of convention, as has already been mentioned by @Lee.
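To see why the dynamic marker matters, here is a small sketch that tries to rebind a Var that was not declared dynamic. plain-setting is a hypothetical name, deliberately written without earmuffs so that the def itself produces no warning.

```clojure
(def plain-setting :root)   ; hypothetical Var, NOT declared ^:dynamic

(def outcome
  (try
    ;; binding refuses to rebind a non-dynamic Var and throws at runtime.
    (binding [plain-setting :rebound]
      plain-setting)
    (catch IllegalStateException _
      :not-rebindable)))

outcome                     ; => :not-rebindable
```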
In Clojure, if you call a function before its definition, e.g.
(foo (bar 'a))
(defn bar [] ...)
it is not compiled. One should add
(declare bar)
before (foo (bar 'a)). Why is Clojure designed this way? In most languages other than C/C++ (Java, Python, PHP, Scala, Haskell, even other Lisps), and especially in dynamically typed languages, no function declaration is needed: a function definition can be placed either before or after a call. I find this uncomfortable to use.
Clojure does single-pass compilation (I'm simplifying; read the following links):
https://news.ycombinator.com/item?id=2467359
https://news.ycombinator.com/item?id=2466912
So it seems logical that if the compiler reads the source only once, from top to bottom, it cannot safely resolve a forward reference to something it has not yet seen.
To quote Rich (first link):
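The usual workaround is declare, which interns the Var ahead of its definition. A short sketch using two hypothetical mutually recursive functions:

```clojure
;; Without this line, compiling my-even? fails because my-odd? cannot be resolved.
(declare my-odd?)

(defn my-even? [n]
  (if (zero? n) true (my-odd? (dec n))))

(defn my-odd? [n]
  (if (zero? n) false (my-even? (dec n))))

(my-even? 10)  ; => true
(my-odd? 7)    ; => true
```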
But, what should happen here, when the compiler has never before seen
bar?
`(defn foo [] (bar))`
or in CL:
`(defun foo () (bar))`
CL happily compiles it, and if bar is never defined, a runtime error will occur. Ok, but, what reified thing
(symbol) did it use for bar during compilation? The symbol it interned
when the form was read. So, what happens when you get the runtime
error and realize that bar is defined in another package you forgot to
import. You try to import other-package and, BAM!, another error -
conflict, other-package:bar conflicts with read-in-package:bar. Then
you go learn about uninterning.
In Clojure, the form doesn't compile,
you get a message, and no var is interned for bar. You require
other-namespace and continue.
I vastly prefer this experience, and so
made these tradeoffs. Many other benefits came about from using a
non-interning reader, and interning only on definition/declaration.
I'm not inclined to give them up, nor the benefits mentioned earlier,
in order to support circular reference.
In Clojure I would like to write a function which I can call like this:
(function undefined-symbol-which-means-something-else)
Is there any way of writing such a function without resorting to ', :, or using a macro?
If the symbol is undefined, this is always going to give you an error. This is because Clojure will try to resolve the symbol before calling the function, and fail.
Some options to consider (in my personal order of preference...):
Use a keyword (i.e. ":my-keyword") - this is what they were designed for after all! You never need to pre-define keywords. Also this is probably the most idiomatic way of doing things.
Use a regular string as a parameter. You can always convert this into a symbol later if you need to with (symbol "somename")
If function is a macro rather than a function, then you could theoretically achieve something like what you want by reinterpreting the symbol on the fly. This works because macro expansion happens before evaluation.
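The three options might look like this in practice. describe and describe-sym are hypothetical names, and this is only a sketch of the idea rather than a recommended API.

```clojure
;; A plain function: its argument is evaluated before the call,
;; so it has to be something that evaluates without a definition.
(defn describe [x]
  (str "got " (name x)))

(describe :undefined-thing)            ; => "got undefined-thing" (keyword)
(describe (symbol "undefined-thing"))  ; => "got undefined-thing" (string converted to a symbol)

;; A macro receives the unevaluated symbol and can quote it on the caller's behalf.
(defmacro describe-sym [sym]
  `(describe '~sym))

(describe-sym undefined-thing)         ; => "got undefined-thing"
```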