In Clojure documentation the term smap is occasionally used. For example, in the docstring of clojure.core/replace:
Given a map of replacement pairs and a vector/collection, returns a
vector/seq with any elements = a key in smap replaced with the
corresponding val in smap.
What is meant by the term "smap"?
There's no special meaning. It's just referring to the name of the parameter of the function:
(replace smap coll)
It seems plausible that "smap" stands for "substitution map".
They might have prefixed it with an "s" so it doesn't shadow the built-in map function used inside replace, or because they considered the use case of the map parameter to be specific enough to warrant a more specific name.
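A minimal sketch of how the smap parameter is used in practice. The map contents and collections below are made up for illustration; only replace itself comes from clojure.core.

```clojure
;; smap is simply the map of replacement pairs passed as the first argument.
(def smap {:a 1, :b 2})          ; hypothetical substitution map

;; Given a vector, replace returns a vector.
(replace smap [:a :b :c :a])
;; => [1 2 :c 1]

;; Given any other collection, it returns a seq.
(replace smap '(:a :x))
;; => (1 :x)
```

Elements with no matching key in smap (here :c and :x) are left untouched.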
What's the most efficient way to parse natural language?
Let "strings" be a map<string, void (*func)(int,char**)> containing strings such as:
Set the alarm for *.
Call *.
Get me an * at * for *.
and their corresponding functions. Now suppose "input" is a string containing a sentence like:
Call David.
How can I implement a function such as parse that takes the "input" and matches it to one of the strings in the map, then calls its corresponding function, passing it argc and argv containing all the wildcard entries (* in strings)? What's the most efficient way to implement such a function?
Not sure why this question got a downvote. It's well-posed and non-trivial.
There are plenty of academic approaches to parsing, which are mostly needed for degenerate grammars. "natural language" is perhaps not a well-defined term, and natural languages do have some ambiguity, but such constrained subsets are not problematic.
In this specific example, we see that the different production rules (map entries) are not mutually ambiguous. In fact, the first token is sufficient for disambiguation. And since a std::map is sorted, we can do an efficient O(log N) search for that token.
Hence, we only need to derive the substitutions. Again, we'll ignore the degenerate cases. Nobody is going to bother with "Get me an at at at for at.", even though it parses unambiguously.
Instead, for substitutions you simply collect tokens until you get the expected next token. Get me an * at * for *. means that the first * gets all tokens up to at, the second * collects tokens up to for, and the final * gets all remaining tokens.
You see that no backtracking is needed. If parsing fails, there simply is no match.
I am relatively new to Clojure and can't quite wrap my mind around the difference between reader macros and regular macros, and why the difference is significant.
In what situations would you use one over the other and why?
Reader macros change the syntax of the language (for example, @foo turns into (deref foo)) in ways that normal macros can't (a normal macro wouldn't be able to get rid of the parentheses, so you'd have to do something like (@ foo)). It's called a reader macro because it's implemented in the read pass of the REPL (check out the source).
As a Clojure developer, you'll only create regular macros, but you'll use plenty of reader macros, without necessarily considering them explicitly.
The full list of reader macros is here: https://clojure.org/reference/reader and includes common things like @, ', and #{}.
Clojure (unlike some other Lisps) doesn't support user-defined reader macros, but there is some extensibility built into the reader via tagged literals (e.g. #inst or #uuid).
tl;dr
Macros [normal macros] are expanded during evaluation (E of REPL), tied to symbols, operate on lisp objects, and appear in the first, or "function", part of a form. Clojure, and all lisps, allow defining new macros.
Reader macros run during reading, prior to evaluation, are single characters, operate on a character string, prior to all the lisp objects being emitted from the reader, and are not restricted to being in the first, or "function", part of a form. Clojure, unlike some other lisps, does not allow defining new reader macros, short of editing the Clojure compiler itself.
more words:
Normal non-reader macros, or just "macros", operate on lisp objects. Consider:
(and 1 (b :x))
The and macro will be called with two values, one value is 1 and the other is a list consisting of the symbol b (not the value of b) and the keyword :x. Everything the and macro is dealing with is already a lisp (Clojure) value.
Macro expansion only happens when the macro is at the beginning of a list. (and 1 2) expands the and macro. (list and) returns an error, "Can't take value of a macro"
In Clojure, a reader macro is a single character that changes how the reader, the part responsible for turning a text stream into Lisp objects, operates. The dispatch for Clojure's Lisp reader is in LispReader.java. As stated by Alejandro C., Clojure does not support adding reader macros.
Reader macros are one character. (I do not know if that is true for all lisps, but Clojure's current implementation only supports single character reader macros.)
Reader macros can exist at any point in the form. Consider (conj [] 'a). If the ' macro were a normal macro, the tick would need to become a Lisp object, so the code would be a list of the symbol conj, an empty vector, the symbol ' and finally the symbol a. But then the evaluation rules would require that ' be evaluated by itself. Instead the reader, upon seeing the ', wraps the complete s-expression that follows with quote, so that the value returned to the evaluator is a list of conj, an empty vector, and a list of quote followed by a. Now quote is the head of a list and can change the evaluation rules for what it quotes.
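The distinction can be observed at the REPL. The following is a small sketch using read-string to run only the reader, and macroexpand-1 to run only macro expansion; foo is just an unresolved symbol used for illustration.

```clojure
;; The reader alone turns reader-macro characters into ordinary lists.
(read-string "'a")
;; => (quote a)

(read-string "@foo")
;; => (clojure.core/deref foo)

(read-string "(conj [] 'a)")
;; => (conj [] (quote a))

;; A regular macro is expanded later and works on the data the reader produced.
(macroexpand-1 '(when true 1))
;; => (if true (do 1))
```

Note that read-string never evaluates anything here, which is why foo does not need to be defined.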
In short, reader macros are a low-level feature. That's why there are so few of them (just #, quoting and a few more). Having too many reader rules would turn any language into a mess.
A regular macro is a tool that is widely used in Clojure. As a developer, you are welcome to write your own regular macros, but not reader macros unless you are a core Clojure developer.
You may always use your own tagged literals as a substitute for reader rules; for example, #inst "2017" will give you a Date instance, and so forth.
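A short sketch of tagged literals. The built-in #inst and #uuid tags are part of Clojure; the my/point tag and its reader function are hypothetical and only illustrate how a custom tag can be supplied to clojure.edn/read-string through the :readers option.

```clojure
(require '[clojure.edn :as edn])

;; Built-in tagged literals are read into ordinary values.
(class #inst "2017")                                   ; java.util.Date
(class #uuid "123e4567-e89b-12d3-a456-426614174000")   ; java.util.UUID

;; Hypothetical custom tag: the reader function receives the already-read form.
(def readers {'my/point (fn [[x y]] {:x x :y y})})

(edn/read-string {:readers readers} "#my/point [1 2]")
;; => {:x 1, :y 2}
```

Unlike a reader macro, a tagged literal cannot introduce new syntax; it only post-processes a form the reader already understands.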
I'm trying to figure out if there is a macro in Elixir similar to delay in Clojure, to get a lazy expression/variable that can be evaluated later.
The use case is a default value for Map.get/3, since the default value comes from a database call, I'd prefer it to be called only when it's needed.
Elixir's macros could be used to write a simple wrapper function for conditional evaluation. I've put one example in the following gist, though there may be a better/smarter way.
https://gist.github.com/parroty/98a68f2e8a735434bd60
"Generic" laziness is a bit of a tough nut to crack because it's a fairly broad question. Streams allow laziness for enumerables but I'm not sure what laziness for an expression would mean. For example what would a lazy form of x = 1 + 2 be? When would it be evaluated?
The thought that comes to mind for a lazy form of an expression is a procedure expression:
def x, do: 1 + 2
Because the value of x wouldn't be calculated until the expression is actually invoked (as far as I know). I'm sure others will correct me if I'm wrong on that point. But I don't think that's what you want.
Maybe you want to rephrase your question--leaving out streams and lazy evaluation of enumerated values.
One way to do this would be using processes. For example, the map could be wrapped in a process like a GenServer or an Agent, where the default value is evaluated lazily.
The default value can be a function which makes the expensive call. If Map.get/3 isn't being used to return functions you can check if the value is a function and invoke it if it is returned. Like so:
def default_value do
  expensive_db_call()
end

def get_something(dict, key) do
  # Pass a reference to the function, not its result, so the expensive call is deferred
  case Map.get(dict, key, &default_value/0) do
    value when is_function(value) ->
      value.() # invoke the default function and return the result of the call
    value ->
      value # key must have existed, return value
  end
end
Of course if the map contains functions this type of solution probably won't work.
Also check Elixir's Stream module. While I don't know that it would help solve your particular problem it does allow for lazy evaluation. From the documentation:
Streams are composable, lazy enumerables. Any enumerable that generates items one by one during enumeration is called a stream. For example, Elixir’s Range is a stream:
More information is available in the Stream documentation.
Map.get_lazy and Keyword.get_lazy hold off on generating the default until it is needed. Links to the documentation are below:
https://hexdocs.pm/elixir/Map.html#get_lazy/3
https://hexdocs.pm/elixir/Keyword.html#get_lazy/3
You can wrap it in an anonymous function, then it will be evaluated when the function is called:
iex()> lazy = fn -> :os.list_env_vars() end
#Function<45.79398840/0 in :erl_eval.expr/5>
iex()> lazy.()
I found this line of Clojure code: @(d/transact conn schema-tx). It's a Datomic statement that creates a database schema. I couldn't find anything relevant on Google due to difficulties searching for characters like "@".
What does the 'at' sign mean before the first parenthesis?
This is the deref macro character. What you're looking for in the context of Datomic is at:
http://docs.datomic.com/transactions.html
under Processing Transactions:
In Clojure, you can also use the deref method or @ to get a
transaction's result.
For more on deref in Clojure, see:
http://clojuredocs.org/clojure_core/clojure.core/deref
Here is a useful overview of Clojure default syntax and "sugar" (i.e. macro definitions).
http://java.ociweb.com/mark/clojure/article.html#Overview
There you'll find explanations of the number sign #, which indicates a regex or hash set among other things, the caret ^, which is for metadata, and, among many more, the "at sign" @. It is a sugar form for dereferencing, which means you get the real value the reference is pointing to.
Clojure has three reference types: Refs, Atoms and Agents.
http://clojure-doc.org/articles/language/concurrency_and_parallelism.html#clojure-reference-types
Your term @(d/transact conn schema-tx) seems to deliver a reference to an atom, and with the at sign @ you deref it and thus get the value this reference points to.
BTW, you'll find results with search engines if you look e.g. for "Clojure at sign". But it needs some patience ;-)
The @ is equivalent to deref in Clojure. transact returns a future which you deref to get the result. deref/@ will block until the transaction completes, aborts or times out.
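The blocking behavior can be sketched without Datomic. In the example below, a plain Clojure future stands in for the value returned by d/transact; the names tx-result and counter and the map contents are hypothetical.

```clojure
;; A future standing in for the result of a transaction (assumption: no Datomic here).
(def tx-result (future (Thread/sleep 50) {:status :ok}))

;; @x is read as (deref x); both block until the future has a value.
@tx-result
;; => {:status :ok}
(deref tx-result)
;; => {:status :ok}

;; deref also accepts a timeout in milliseconds and a fallback value.
(deref (promise) 10 :timed-out)
;; => :timed-out

;; The same syntax works for the other reference types, such as atoms.
(def counter (atom 41))
(swap! counter inc)
@counter
;; => 42
```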
What are the best practices for defining constants in Clojure in terms of style, conventions, efficiency, etc.
For example, is this right?
(def *PI* 3.14)
Questions:
Should constants be capitalized in Clojure?
Stylistically, should they have the asterisk (*) character on one or both sides?
Any computational efficiency considerations I should be aware of?
I don't think there are any hard and fast rules. I usually don't give them any special treatment at all. In a functional language, there is less of a distinction between a constant and any other value, because things are more often pure.
The asterisks on both sides are called "ear muffs" in Clojure. They are usually used to indicate a "special" var, or a var that will be dynamically rebound using binding later. Vars like *out* and *in*, which are occasionally rebound to different streams by users, are examples.
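A brief sketch of what ear muffs conventionally signal. The var *greeting* and the function greet are hypothetical; ^:dynamic, binding and with-out-str are part of clojure.core.

```clojure
;; Ear muffs mark a var that is meant to be rebound with binding.
(def ^:dynamic *greeting* "hello")

(defn greet [] *greeting*)

(greet)
;; => "hello"

;; The new value is visible only within the dynamic scope of binding.
(binding [*greeting* "hi"]
  (greet))
;; => "hi"

;; with-out-str works by rebinding *out* to a string writer.
(with-out-str (print "captured"))
;; => "captured"
```

A plain constant such as pi is never rebound this way, which is why ear muffs would be misleading on it.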
Personally, I would just name it pi. I don't think I've ever seen people give constants special names in Clojure.
EDIT: Mister Carper just pointed out that he himself capitalizes constants in his code because it's a convention in other languages. I guess this goes to show that there are at least some people who do that.
I did a quick glance through the coding standards but didn't find anything about it. This leads me to conclude that it's really up to you whether or not you capitalize them. I don't think anyone will slap you for it in the long run.
On the computational efficiency front you should know there is no such thing as a global constant in Clojure. What you have above is a var, and every time you reference it, it does a lookup. Even if you don't put earmuffs on it, vars can always be rebound, so the value could always change, so they are always looked up in a table. For performance critical loops this is most decidedly non-optimal.
There are some options, like putting a let block around your critical loops and binding the value of any "constant" vars there so that they are not looked up on each iteration. Or creating a no-arg macro so that the constant value is compiled into the code. Or you could create a Java class with a static member.
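Two of these approaches can be sketched as follows. The names pi, tau and the two functions are hypothetical. The ^:const metadata is not mentioned in this answer; it is the :const attribute referred to elsewhere in this thread, which asks the compiler to inline the value at each use site.

```clojure
(def pi 3.14159)             ; ordinary var, looked up when referenced
(def ^:const tau 6.28318)    ; :const var, value inlined at compile time

;; Option 1: capture the var's value once in a let, outside the hot loop.
(defn circumferences [radii]
  (let [p pi]
    (mapv #(* 2 p %) radii)))

;; Option 2: reference a ^:const var directly; no var lookup happens at runtime.
(defn circumferences-const [radii]
  (mapv #(* tau %) radii))

(circumferences [1 2])
(circumferences-const [1 2])
```

Whether either option makes a measurable difference depends on the loop, so it is worth profiling before restructuring code around it.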
See this post, and the following discussion about constants for more info:
http://groups.google.com/group/clojure/msg/78abddaee41c1227
The earmuffs are a way of denoting that a given symbol will have its own thread-local binding at some point. As such, it does not make sense to apply the earmuffs to your Pi constant.
*clojure-version* is an example of a constant in Clojure, and it's entirely in lower-case.
Don't use a special notation for constants; everything is assumed a constant unless specified otherwise.
See http://dev.clojure.org/display/community/Library+Coding+Standards
Clojure has a variety of literals such as:
3.14159
:point
{:x 0
:y 1}
[1 2 3 4]
#{:a :b :c}
The literals are constant. As far as I know, there is no way to define new literals. If you want to use a new constant, you can effectively generate a literal in the code at compile-time:
(defmacro *PI* [] 3.14159265358979323)
(prn (*PI*))
In Common Lisp, there's a convention of naming constants with plus signs (+my-constant+), and in Scheme, by prefixing with a dollar sign ($my-constant); see this page. Any such convention conflicts with the official Clojure coding standards, linked in other answers, but maybe it would be reasonable to want to distinguish regular vars from those defined with the :const attribute.
I think there's an advantage to giving non-function variables of any kind some sort of distinguishing feature. Suppose that aside from variables defined to hold functions, you typically only use local names defined by function parameters, let, etc. If you nevertheless occasionally define a non-function variable using def, then when its name appears in a function definition in the same file, it looks to the eye like a local variable. If the function is complex, you may spend several seconds looking for the name definition within the function. Adding a distinguishing feature like earmuffs or plus signs or all uppercase, as appropriate to the variable's use, makes it obvious that the variable's definition is somewhere else.
In addition, there are good reasons to give special constants like pi a special name, so no one has to wonder whether pi means, say, "print-index", or the i-th pizza, or "preserved interface". Of course I think those variables should have more informative names, but lots of people use cryptic, short variable names, and I end up reading their code. I shouldn't have to wonder whether pi means pi, so something like PI might make sense. No one would think that's a run-of-the-mill variable in Clojure.
According to the "Practical Clojure" book, it should be named *pi*