Is there any guideline to help clojure programmer to determine in what situation should be built in macros over function? [duplicate] - clojure

This question already has answers here:
Where do you use macros in clojure where functions wont work
(2 answers)
Closed 4 months ago.
When I was reading a book "On Lisp", I find that there is an interesting topic about macros (chapter 7 in the book), convert the example to Clojure and execute in REPL (Clojure). Both of them are almost identical, I still don't know when the price of coding should be built in macros when I can build that as a function. Is there any guideline to help programmer to determine in what situation should be built in macros over function?
Example in Macros
(defmacro nif [expr pos zero neg]
`(case (Integer/signum ~expr)
1 ~pos
0 ~zero
-1 ~neg))
(map #(nif % 'p 'z 'n) [0 2.5 -8])
=> (z p n)
Example in function
(defn nif2 [expr pos zero neg]
(case (Integer/signum expr)
1 pos
0 zero
-1 neg))
(map #(nif2 % 'p 'z 'n) [0 2.5 -8])
=> (z p n)

Please note that the new symbol nif is a language extension. The implementation of nif is a compiler extension which adds a layer of pre-processing of the source code AST before normal compilation resumes.
This is only useful if you cannot already do something in Clojure with either a built-in or user-defined function. For example, the macro clojure.core/when is a simplified version of clojure.core/if (see the source code), and is not possible to write as a function.
Also, note that functions can do things that macros cannot; i.e. you can compose functions, but you cannot compose macros.
I haven't seen that book, but for this example there is zero benefit to writing the macro. It is just an academic exercise to prove you could do it.
However, note that you cannot use map with a macro, only a function. The example only works at all because you wrap the macro with a function:
#(nif % 'p 'z 'n)
which is the same as the anonymous function
(fn [val]
(nif val 'p 'z 'n))
Normally, you'd just write it as
(defn val->sign
[val]
(case (Integer/signum val)
1 :p
0 :z
-1 :n))
and invoke it as
(mapv val->sign [0 2.5 -8])
;=> [:z :p :n]
This example also has zero benefit to allowing custom symbols are the return values. It is more natural in Clojure to just use pre-defined keywords for the return value, as shown above.

Related

Clojure Core function argument positions seem rather confusing. What's the logic behind it?

For me as, a new Clojurian, some core functions seem rather counter-intuitive and confusing when it comes to arguments order/position, here's an example:
> (nthrest (range 10) 5)
=> (5 6 7 8 9)
> (take-last 5 (range 10))
=> (5 6 7 8 9)
Perhaps there is some rule/logic behind it that I don't see yet?
I refuse to believe that the Clojure core team made so many brilliant technical decisions and forgot about consistency in function naming/argument ordering.
Or should I just remember it as it is?
Thanks
Slightly offtopic:
rand&rand-int VS random-sample - another example where function naming seems inconsistent but that's a rather rarely used function so it's not a big deal.
There is an FAQ on Clojure.org for this question: https://clojure.org/guides/faq#arg_order
What are the rules of thumb for arg order in core functions?
Primary collection operands come first. That way one can write → and its ilk, and their position is independent of whether or not they have variable arity parameters. There is a tradition of this in OO languages and Common Lisp (slot-value, aref, elt).
One way to think about sequences is that they are read from the left, and fed from the right:
<- [1 2 3 4]
Most of the sequence functions consume and produce sequences. So one way to visualize that is as a chain:
map <- filter <- [1 2 3 4]
and one way to think about many of the seq functions is that they are parameterized in some way:
(map f) <- (filter pred) <- [1 2 3 4]
So, sequence functions take their source(s) last, and any other parameters before them, and partial allows for direct parameterization as above. There is a tradition of this in functional languages and Lisps.
Note that this is not the same as taking the primary operand last. Some sequence functions have more than one source (concat, interleave). When sequence functions are variadic, it is usually in their sources.
Adapted from comments by Rich Hickey.
Functions that work with seqs usually has the actual seq as last argument.
(map, filter, remote etc.)
Accessing and "changing" individual elements takes a collection as first element: conj, assoc, get, update
That way, you can use the (->>) macro with a collection consistenly,
as well as create transducers consistently.
Only rarely one has to resort to (as->) to change argument order. And if you have to do so, it might be an opportunity to check if your own functions follow that convention.
For some functions (especially functions that are "seq in, seq out"), the args are ordered so that one can use partial as follows:
(ns tst.demo.core
(:use tupelo.core tupelo.test))
(dotest
(let [dozen (range 12)
odds-1 (filterv odd? dozen)
filter-odd (partial filterv odd?)
odds-2 (filter-odd dozen) ]
(is= odds-1 odds-2
[1 3 5 7 9 11])))
For other functions, Clojure often follows the ordering of "biggest-first", or "most-important-first" (usually these have the same result). Thus, we see examples like:
(get <map> <key>)
(get <map> <key> <default-val>)
This also shows that any optional values must, by definition, be last (in order to use "rest" args). This is common in most languages (e.g. Java).
For the record, I really dislike using partial functions, since they have user-defined names (at best) or are used inline (more common). Consider this code:
(let [dozen (range 12)
odds (filterv odd? dozen)
evens-1 (mapv (partial + 1) odds)
evens-2 (mapv #(+ 1 %) odds)
add-1 (fn [arg] (+ 1 arg))
evens-3 (mapv add-1 odds)]
(is= evens-1 evens-2 evens-3
[2 4 6 8 10 12]))
Also
I personally find it really annoying trying to parse out code using partial as with evens-1, especially for the case of user-defined functions, or even standard functions that are not as simple as +.
This is especially so if the partial is used with 2 or more args.
For the 1-arg case, the function literal seen for evens-2 is much more readable to me.
If 2 or more args are present, please make a named function (either local, as shown for evens-3), or a regular (defn some-fn ...) global function.

Why does Lisp allow replacement of math operators in let?

I know in Scheme I can write this:
(let ((+ *)) (+ 2 3)) => 6
As well as this, in Clojure:
(let [+ *] (+ 2 3)) => 6
I know this can work, but it feels so weird. I think in any language, the math operators are predefined. C++ and Scala can do operator overloading, but this doesn't seem to be that.
Doesn't this cause confusion? Why does Lisp allow this?
This is not a general Lisp feature.
In Common Lisp the effects of binding a core language function is undefined. This means the developer should not expect that it works in portable code. An implementation may also signal a warning or an error.
For example the SBCL compiler will signal this error:
; caught ERROR:
; Lock on package COMMON-LISP violated when
; binding + as a local function while
; in package COMMON-LISP-USER.
; See also:
; The SBCL Manual, Node "Package Locks"
; The ANSI Standard, Section 11.1.2.1.2
; (DEFUN FOO (X Y)
; (FLET ((+ (X Y)
; (* X Y)))
; (+ X Y)))
We can have our own + in Common Lisp, but it then has to be in a different package (= symbol namespace):
(defpackage "MYLISP"
(:use "CL")
(:shadow CL:+))
(in-package "MYLISP")
(defun foo (a b)
(flet ((+ (x y)
(* x y)))
(+ a b)))
Disclaimer: This is from a Clojure point of view.
+ is just another function. You can pass it around and write sum with it, have partial application, read docs about it, ...:
user=> (apply + [1 2 3])
6
user=> (reduce + [1 2 3])
6
user=> (map (partial + 10) [1 2 3])
(11 12 13)
user=> `+
clojure.core/+
user=> (doc +)
-------------------------
clojure.core/+
([] [x] [x y] [x y & more])
Returns the sum of nums. (+) returns 0. Does not auto-promote
longs, will throw on overflow. See also: +'
So you can have many + in different namespaces. The core one get's "use"-ed for you by default, but you can simply write your own. You can write your own DSL:
user=> (defn + [s] (re-pattern (str s "+")))
WARNING: + already refers to: #'clojure.core/+ in namespace: user, being replaced by: #'user/+
#'user/+
user=> (+ "\\d")
#"\d+"
user=> (re-find (+ "\\d") "666")
"666"
It's not special form, it's nothing different from any other function. So with that established, why should it not be allowed to be overriden?
In Scheme you are making a local binding, shadowing whatever is higher, With let. Since + and * are just variables that just happen to evaluate to procedures you are just giving old procedures other variable names.
(let ((+ *))
+)
; ==> #<procedure:*> (non standard visualization of a procedure)
In Scheme there are no reserved words. If you look at other languages the list of reserved words are quite high. Thus in Scheme you can do this:
(define (test v)
(define let 10) ; from here you cannot use let in this scope
(define define (+ let v)) ; from here you cannot use define to define stuff
define) ; this is the variable, not the special form
;; here let and define goes out of scope and the special forms are OK again
(define define +) ; from here you cannot use top level define
(define 5 6)
; ==> 11
THe really nice thing about this is that if you choose a name and the next version of the standard happens to use the same name for something similar, but not compatible, your code will not break. In other languages I have worked with a new version might introduce conflicts.
R6RS makes it even easier
From R6RS we have libraries. That means that we have full control over what top level forms we get from the standard into our programs. You have several ways to do it:
#!r6rs
(import (rename (except (rnrs base) +) (* +)))
(+ 10 20)
; ==> 200
This is also OK.
#!r6rs
(import (except (rnrs base) +))
(define + *)
(+ 10 20)
; ==> 200 guaranteed
And finally:
#!r6rs
(import (rnrs base)) ; imports both * and +
(define + *) ; defines + as an alias to *
(+ 10 20)
; ==> 200 guaranteed
Other languages does this too:
JavaScript is perhaps the most obvious:
parseFloat = parseInt;
parseFloat("4.5")
// ==> 4
But you cannot touch their operators. They are reserved because the language needs to do a lot of stuff for the operator precedence. Just like Scheme JS is nice language for duck typing.
Mainstream Lisp dialects do not have reserved tokens for infix operations. There is no categorical difference between +, expt, format or open-file: they are all just symbols.
A Lisp proram which performs (let ((+ 3)) ...) is spiritually very similar to a C program which does something like { int sqrt = 42; ... }. There is a sqrt function in the standard C library, and since C has a single namespace (it's a Lisp-1), that sqrt is now shadowed.
What we can't do in C is { int + = 42; ...} which is because + is an operator token. An identifier is called for, so there is a syntax error. We also can't do { struct interface *if = get_interface(...); } because if is a reserved keyword and not an identifier, even though it looks like one. Lisps tend not to have reserved keywords, but some dialects have certain symbols or categories of symbols that can't be bound as variables. In ANSI Common Lisp, we can't use nil or t as variables. (Specifically, those symbols nil and t that come from the common-lisp package). This annoys some programmers, because they'd like a t variable for "time" or "type". Also, symbols from the keyword package, usually appearing with a leading colon, cannot be bound as variables. The reason is that all these symbols are self-evaluating. nil, t and the keyword symbols evaluate to themselves, and so do not act as variables to denote another value.
The reason we allow this in lisp is that all bindings are done with lexical scope, which is a concept that comes from lambda calculus.
lambda calculus is a simplified system for managing variable binding. In lambda calculus the rules for things like
(lambda (x) (lambda (y) y))
and
(lambda (x) (lambda (y) x))
and even
(lambda (x) (lambda (x) x))
are carefully specified.
In lisp LET can be thought of as syntactic sugar for a lambda expression, for example your expression (let ([+ x]) (+ 2 3)) is equivalent to ((lambda (+) (+ 2 3)) x) which according to lambda calculus simplifies down to (x 2 3).
In summary, lisp is based on uniformly applying a very simple and clear model (called lambda calculus). If it seems strange at first, that's because most other programming languages don't have such consistency or base their variable binding on a mathematical model.
Scheme's philosophy is to impose minimal restriction such that to give maximal power to programmer.
A reason to allow such things is that in Scheme you can embed other languages and in other languages you want to use the * operator with different semantics.
For example, if you implement a language to represent regular expressions you want to give the * the semantics of the algebraic kleene operator and write programs like this one
(* (+ "abc" "def") )
to represent a language that contain words like this one
empty
abc
abcabc
abcdef
def
defdef
defabc
....
Starting from the main language, untyped lambda calculus, it is possible to create a language in which you can redefine absolutely everything apart from the lambda symbol. This is the model of computation scheme is build on.
It's not weird because in lisp there are no operators except functions and special forms like let or if, that can be builtin or created as macros. So here + is not an operator, but a function that is assigned to symbol + that is adding its arguments (in scheme and clojure you can say that it's just variable that hold function for adding numbers), the same * is not multiplication operator but asterisk symbol that is multiplying its arguments, so this is just convenient notation that it use + symbol it could be add or sum but + is shorter and similar as in other languages.
This is one of this mind bending concepts when you found it for the first time, like functions as arguments and return values of other functions.
If you use very basic Lisp and lambda calculus you don't even need numbers and + operators in base language. You can create numbers from functions and plus and minus functions using same trick and assign them to symbols + and - (see Church encoding)
Why does Lisp allow rebinding of math operators?
for consistency and
because it can be useful.
Doesn't this cause confusion?
No.
Most programming languages, following traditional algebraic notation, have special syntax for the elementary arithmetic functions of addition and subtraction and so on. Rules of priority and association make some function calls implicit. This syntax makes the expressions easier to read at the price of consistency.
LISPs tip the see-saw the other way, preferring consistency over legibility.
Consistency
In Clojure (the Lisp I know), the math operators (+, -, *, ... ) are nothing special.
Their names are just ordinary symbols.
They are core functions like any others.
So of course you can replace them.
Usefulness
Why would you want to override the core arithmetic operators? For example, the units2 library redefines them to accept dimensioned quantities as well as plain numbers.
Clojure algebra is harder to read.
All operators are prefix.
All operator applications are explicit - no priorities.
If you are determined to have infix operators with priorities, you can do it. Incanter does so: here are some examples and here is the source code.

Trouble understanding a simple macro in Clojure - passing macro as stand-in for higher order function to map

I've been into Clojure lately and have avoided macros up until now, so this is my first exposure to them. I've been reading "Mastering Clojure Macros", and on Chapter 3, page 28, I encountered the following example:
user=> (defmacro square [x] `(* ~x ~x))
;=> #'user/square
user=> (map (fn [n] (square n)) (range 10))
;=> (0 1 4 9 16 25 36 49 64 81)
The context is the author is explaining that while simply passing the square macro to map results in an error (can't take value of a macro), wrapping it in a function works because:
when the anonymous function (fn [n] (square n)) gets compiled, the
square expression gets macroexpanded, to (fn [n] (clojure.core/* n
n)). And this is a perfectly reasonable function, so we don’t have any
problems with the compiler
This makes sense to me if we assume the body of the function is evaluated before runtime (at compile, or "definition" time) thus expanding the macro ahead of runtime. However, I always thought that function bodys were not evaluated until runtime, and at compile time you would basically just have a function object with some knowledge of it's lexical scope (but no knowledge of its body).
I'm clearly mixed up on the compile/runtime semantics here, but when I look at this sample I keep thinking that square won't be expanded until map forces it's call, since it's in the body of the anonymous function, which I thought would be unevaluated until runtime. I know my thinking is wrong, because if that was the case, then n would be bound to each number in (range 10), and there wouldn't be an issue.
I know it's a pretty basic question, but macros are proving to be pretty tricky for me to fully wrap my head around at first exposure!
Generally speaking function bodies aren't evaluated at compile time, but macros are always evaluated at compile time because they're always expanded at compile time whether inside a function or not.
You can write a macro that expands to a function, but you still can't refer/pass the macro around as if it were a function:
(defmacro inc-macro [] `(fn [x#] (inc x#)))
=> #'user/inc-macro
(map (inc-macro) [1 2 3])
=> (2 3 4)
defmacro is expanded at compile time, so you can think of it as a function executed during compilation. This will replace every occurrence of the macro "call" with the code it "returns".
You may consider a macros as a syntax rule. For example, in Scheme, a macros is declared with define-syntax form. There is a special step in compiler that substitutes all the macros calls into their content before compiling the code. As a result, there won't be any square calls in your final code. Say, if you wrote something like
(def value (square 3))
the final version after expansion would be
(def value (clojure.core/* 3 3))
There is a special way to check what will be the body of your macros after being expanded:
user=> (defmacro square [x] `(* ~x ~x))
#'user/square
user=> (macroexpand '(square 3))
(clojure.core/* 3 3)
That's why a macros is an ephemeral thing that lives only in source code but not in the compiled version of it. That's why it cannot be passed as a value or referenced somehow.
The best rule regarding macroses is: try to avoid them until you really need them in your work.

Clojure, unquote slicing outside of syntax quote

In clojure, we can use unquote slicing ~# to spread the list. For example
(macroexpand `(+ ~#'(1 2 3)))
expands to
(clojure.core/+ 1 2 3)
This is useful feature in macros when rearranging the syntax. But is it possible to use unquote slicing or familiar technique outside of macro and without eval?
Here is the solution with eval
(eval `(+ ~#'(1 2 3))) ;-> 6
But I would rather do
(+ ~#'(1 2 3))
Which unfortunately throws an error
IllegalStateException Attempting to call unbound fn: #'clojure.core/unquote-splicing clojure.lang.Var$Unbound.throwArity (Var.java:43)
At first I thought apply would do it, and it is indeed the case with functions
(apply + '(1 2 3)) ; -> 6
However, this is not the case with macros or special forms. It's obvious with macros, as it's expanded before apply and must be first element in the form anyway. With special forms it's not so obvious though, but still makes sense as they aren't first class citizens as functions are. For example the following throws an error
(apply do ['(println "hello") '(println "world")]) ;-> error
Is the only way to "apply" list to special form at runtime to use unquote slicing and eval?
Clojure has a simple model of how programs are loaded and executed. Slightly simplified, it goes something like this:
some source code is read from a text stream by the reader;
this is passed to the compiler one form at a time;
the compiler expands any macros it encounters;
for non-macros, the compiler applies various simple evaluation rules (special rules for special forms, literals evaluate to themselves, function calls are compiled as such etc.);
the compiled code is evaluated and possibly changes the compilation environment used by the following forms.
Syntax quote is a reader feature. It is replaced at read time by code that emits list structure:
;; note the ' at the start
user=> '`(+ ~#'(1 2 3))
(clojure.core/seq
(clojure.core/concat (clojure.core/list (quote clojure.core/+)) (quote (1 2 3))))
It is only in the context of syntax-quoted blocks that the reader affords ~ and ~# this special handling, and syntax-quoted blocks always produce forms that may call a handful of seq-building functions from clojure.core and are otherwise composed from quoted data.
This all happens as part of step 1 from the list above. So for syntax-quote to be useful as an apply-like mechanism, you'd need it to produce code in the right shape at that point in the process that would then look like the desired "apply result" in subsequent steps. As explained above, syntax-quote always produces code that creates list structure, and in particular it never returns unquoted expressions that look like unquoted dos or ifs etc., so that's impossible.
This isn't a problem, since the code transformations that are reasonable given the above execution model can be implemented using macros.
Incidentally, the macroexpand call is actually superfluous in your example, as the syntax-quoted form already is the same as its macroexpansion (as it should be, since + is not a macro):
user=> `(+ ~#'(1 2 3))
(clojure.core/+ 1 2 3)

Clojure: Apply or to a list

This question similar to say, In clojure, how to apply 'and' to a list?, but solution doesn't apply to my question. every? function returns just a boolean value.
or is a macro, so this code is invalid:
(apply or [nil 10 20]) ; I want to get 10 here, as if 'or was a function.
Using not-every? I will get just boolean value, but I want to preserve semantics of or - the "true" element should be a value of the expression. What is idiomatic way to do this?
My idea so far:
(first (drop-while not [nil 10 20])) ; = 10
you might be able to use some for this:
(some identity [nil 10 20]) ; = 10
Note that this differs from or if it fails
(some identity [false]) ; = nil
(or false) ; = false
A simple macro:
(defmacro apply-or
[coll]
`(or ~#coll))
Or even more abstract
(defmacro apply-macro
[macro coll]
`(~macro ~#coll))
EDIT: Since you complained about that not working in runtime here comes a version of apply-macro that works in runtime. Compare it with answers posted here: In clojure, how to apply a macro to a list?
(defmacro apply-macro-in-runtime
[m c]
`(eval (concat '(~m) ~c)))
Notice that this version utilizes that parameters are passed unevaluated (m is not evaluated, if this was a function, it would throw because a macro doesn't have a value) it uses concat to build a list containing of a list with the quoted macro-name and whatever the evaluation of form c (for coll) returns so that c as (range 5) would be fully evaluated to the list that range returns. Finally it uses eval to evaluate the expression.
Clarification: That obviously uses eval which causes overhead. But notice that eval was also used in the answer linked above.
Also this does not work with large sequences due to the recursive definition of or. It is just good to know that it is possible.
Also for runtime sequences it makes obviously more sense to use some and every?.