What is ACTUALLY happening with parenthesis '()' in Clojure? - clojure

I'm looking for the technical answer here. How is Clojure interpreting these symbols? My current working understanding is that the opening paren '(' is a kind of call that applies the succeeding operator to the operands, while the closing paren ')' is a terminator that wraps up the previous evaluation and returns the final value generated (whether function or value).
Any and all details on the truth here would be appreciated. I'm looking to go deep here as well as seeing/knowing every level of abstraction along the way. It bugs me to know that I may have some imaginal thinking going on currently.

(foo x1 x2)
is the syntax for calling a special form or var (function or macro).
So the compiler will analyze the form (foo x1 x2) and check whether foo is a special form (if, try, let*, etc.). If it is not, the symbol is resolved to a var in the context of the current namespace. If that var is a macro, macroexpansion happens; otherwise the call is treated as a normal function call.
To prevent treating (foo x1 x2) as a function call you can quote the expression: '(foo x1 x2) and then it will just remain a list of symbols.
More info:
https://clojure.org/reference/special_forms
https://clojure.org/reference/macros

You are trying to do too much at once when you say
'(' is a kind of call that calls the succeeding operator on the operands while the closing paren ')' is a terminate that wraps up the previous evaluation and returns the final value generated
The Clojure evaluation model does not assign semantics to characters directly. Instead, evaluation of a Clojure program goes through two broad phases:
First, read the characters in the source file according to the language's lexical rules, yielding a Clojure data structure
Second, evaluate that data structure, according to the language's evaluation rules, yielding a value
So when we write an expression like (+ (* 4 5) 2), what happens? The reader matches up parentheses to create lists, and yields as its result a list of three elements: the symbol +, another list (containing the symbol * and the numbers 4 and 5), and the number 2.
Next we move to evaluate that expression. Notice, crucially, that at this point there is no trace of parentheses. The textual source of the program is no longer material. We're evaluating a list. Of course, if we printed that list, conventionally we would surround it with parentheses, but that does not concern the evaluator. How do we evaluate this list? Well, the evaluation rule for lists is to, first, evaluate each of the components, and then invoke the first component as a function, passing the remaining components as arguments [1]. So our list of pending tasks is:
1. Evaluate +
2. Evaluate (* 4 5)
3. Evaluate 2
4. Invoke the result of (1), passing the results of (2) and (3) as arguments
(1), of course, evaluates to the addition function, (2) evaluates (in a similar manner) to the number 20, and (3) evaluates to the number 2 (since numbers evaluate to themselves). Thus, (4) becomes "Invoke the addition function, passing the numbers 20 and 2 as arguments". Of course, the final result is 22.
[1] The rule is actually more complicated than this, because of macros, but for functions this suffices.
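The two phases described above can be sketched outside Clojure as well. What follows is only a rough C++ illustration, not how Clojure's reader or compiler is actually implemented: Form, read_form, eval_form and read_and_eval are made-up names, only integers and the operators + and * are handled, and the operator is looked up by name instead of being evaluated to a function object. The point is that read_form is the only place that ever sees '(' and ')'; eval_form works purely on the resulting data structure.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// A form is a number, a symbol, or a list of forms. Once text has been read
// into this structure, no parenthesis characters remain in it.
struct Form {
    enum class Kind { Number, Symbol, List };
    Kind kind = Kind::List;
    long number = 0;          // used when kind == Number
    std::string symbol;       // used when kind == Symbol
    std::vector<Form> items;  // used when kind == List
};

void skip_spaces(const std::string& text, std::size_t& pos) {
    while (pos < text.size() && std::isspace(static_cast<unsigned char>(text[pos]))) ++pos;
}

// Phase 1: read characters into a Form. `pos` advances past the consumed text.
Form read_form(const std::string& text, std::size_t& pos) {
    skip_spaces(text, pos);
    if (pos >= text.size()) throw std::runtime_error("unexpected end of input");

    Form form;
    if (text[pos] == '(') {
        ++pos;  // '(' only tells the reader that a list starts here
        form.kind = Form::Kind::List;
        while (true) {
            skip_spaces(text, pos);
            if (pos >= text.size()) throw std::runtime_error("missing ')'");
            if (text[pos] == ')') { ++pos; break; }  // ')' ends the list
            form.items.push_back(read_form(text, pos));
        }
        return form;
    }
    if (text[pos] == ')') throw std::runtime_error("unexpected ')'");

    // An atom runs until whitespace or a parenthesis.
    std::size_t start = pos;
    while (pos < text.size() && !std::isspace(static_cast<unsigned char>(text[pos])) &&
           text[pos] != '(' && text[pos] != ')') {
        ++pos;
    }
    std::string token = text.substr(start, pos - start);
    if (std::isdigit(static_cast<unsigned char>(token[0]))) {
        form.kind = Form::Kind::Number;
        form.number = std::stol(token);
    } else {
        form.kind = Form::Kind::Symbol;
        form.symbol = token;
    }
    return form;
}

// Phase 2: evaluate the data structure produced by the reader.
long eval_form(const Form& form) {
    switch (form.kind) {
    case Form::Kind::Number:
        return form.number;  // numbers evaluate to themselves
    case Form::Kind::Symbol:
        throw std::runtime_error("symbol used as a value: " + form.symbol);
    case Form::Kind::List:
        break;
    }
    if (form.items.empty() || form.items[0].kind != Form::Kind::Symbol)
        throw std::runtime_error("expected an operator symbol");
    const std::string& op = form.items[0].symbol;
    if (op != "+" && op != "*") throw std::runtime_error("unknown operator: " + op);

    long result = (op == "+") ? 0 : 1;
    for (std::size_t i = 1; i < form.items.size(); ++i) {
        long arg = eval_form(form.items[i]);  // nested lists are evaluated first
        result = (op == "+") ? result + arg : result * arg;
    }
    return result;
}

long read_and_eval(const std::string& text) {
    std::size_t pos = 0;
    return eval_form(read_form(text, pos));
}

// Usage sketch: the expression from the answer goes through both phases.
void check_example() {
    assert(read_and_eval("(+ (* 4 5) 2)") == 22);
}
```

Macros and special forms are left out entirely, which is exactly the simplification the footnote mentions.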

The other 2 answers are great. The simple summary is:
foo( x1, x2 ) // Java function call
(foo x1 x2) // Clojure function call
In both cases, the compiler will evaluate any nested function calls in x1 or x2 before calling foo on the resulting values.

Related

Why doesn't a user defined conditional work?

I am currently studying OCaml, and have a question about a user-defined if-then such as:
let cond (c,t,e) =
  match c with
  | true -> t
  | false -> e
When used in a factorial function:
let rec fact n =
  cond (n=0,1, n * fact (n-1))
Intuitively, it seems to be correct, but I know it will throw a stack overflow error. Can someone explain to me why this is, and how this user-defined if-then differs from the builtin if-then?
Basically, your user-defined conditional is not lazily evaluated. Before the actual match takes place, OCaml evaluates both expressions you pass, for the true and the false case.
Example:
Let's suppose we try to evaluate fact 2.
The return value is the expression cond (2=0,1, 2 * fact (2-1)). Before the 3-tuple is passed to cond, it has to be fully evaluated. To do that, OCaml has to evaluate fact (2-1).
Now we evaluate fact 1. The return value is cond (1=0,1, 1 * fact (1-1)). Again, we need to know the value of fact (1-1), so we compute it recursively.
We evaluate fact 0. Here the problem starts to show. The return value is cond (0=0,1, 0 * fact (0-1)), but in order to evaluate the function cond we first have to evaluate its arguments - the 3-tuple. This makes us evaluate fact (0-1)!
Then, we are evaluating fact -1...
... fact -2 ... fact -3 ... and the stack overflows :)
The built-in if-then evaluates its arguments lazily: first, it checks whether the condition is true or false, then it accordingly chooses only one branch to evaluate - this behavior is called lazy evaluation.
Actually, OCaml has the lazy keyword and Lazy.force, which you could use to avoid this undesirable behavior, but it is probably better just to stick to the traditional if.
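The same eager-versus-lazy difference can be reproduced in other strict languages. Here is a small C++ sketch of the idea (not a translation of the OCaml code; cond_eager, cond_lazy, expensive_else and fact are hypothetical names picked for the illustration). The eager version receives already-computed values, so both branches have run before it is even entered. The lazy version receives thunks and calls only one of them, which is roughly what wrapping the branches in lazy and forcing one of them would achieve in OCaml.

```cpp
#include <cassert>
#include <functional>

// Counts calls so the difference in evaluation is observable.
int else_branch_calls = 0;

long expensive_else() {
    ++else_branch_calls;
    return -1;
}

// Eager version: the caller computes both branch values before this
// function starts, just like the tuple passed to the OCaml cond.
long cond_eager(bool c, long t, long e) { return c ? t : e; }

// Lazy version: branches arrive as thunks and only the chosen one is called.
long cond_lazy(bool c, const std::function<long()>& t, const std::function<long()>& e) {
    return c ? t() : e();
}

// Factorial on top of the lazy conditional terminates, because the recursive
// call sits inside a thunk that is skipped when n == 0. Writing it with
// cond_eager instead would recurse without end, as in the question.
long fact(long n) {
    return cond_lazy(n == 0,
                     [] { return 1L; },
                     [n] { return n * fact(n - 1); });
}

// Usage sketch.
void check_example() {
    else_branch_calls = 0;
    cond_eager(true, 1, expensive_else());  // else branch runs anyway
    assert(else_branch_calls == 1);
    cond_lazy(true, [] { return 1L; }, [] { return expensive_else(); });
    assert(else_branch_calls == 1);         // unchanged: thunk never called
    assert(fact(5) == 120);
}
```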

why are common-lisp functions unbound when I evaluate them individually

why does
(floor 4.5)
return 4 and 0.5 but
floor
gives an error:
The variable FLOOR is unbound.
[Condition of type UNBOUND-VARIABLE]
Note: I come from a clojure background
How would I be able to access the actual floor procedure?
If you use parentheses, as in your first example (floor ...), Common Lisp reads it as a list and, because it is unquoted, evaluates it. The first form in an evaluated list must be a function name, a macro name or a special form.
In your second example, you did not use parentheses, so it is not treated as a list, therefore CL tries to interpret it as a variable (variables and functions are in different namespaces).
Try typing (floor) and you'll get a different error message (invalid number of arguments).
You can access the function namespace by typing
#'floor
or
(function floor)
(these are essentially the same).
#'floor
Common Lisp keeps variables and functions in different namespaces.

How to concatenate list values in OCaml

If I have a function
let rec function n =
  if n<0 then []
  else n-2 @ function n-2 ;;
I get an error saying that the expression function n-2 is a list of int but it is expecting an int.
How do I concatenate the values to return all the n-2 values above zero as a list?
I cannot use the List module to fold.
Thanks
Your title asks how to concatenate lists, but your question seems rather different.
To concatenate lists, you can use the @ operator. In many cases, code that depends on this operator is slower than it needs to be (something to keep in mind for later :-).
Here are some things I see wrong with the code you give:
a. You can't name a function function, because function is a keyword in OCaml.
b. If you use the @ operator, you should have lists on both sides of it. As near as I can see, the thing on the left in your code is not a list.
c. Function calls have higher precedence than infix operators. So myfun n - 2 is parsed as (myfun n) - 2. You probably want something closer to myfun (n - 2).
Even with these changes, your code seems to generate a list of integers that are 2 apart, which isn't what you say you want. However, I can't understand what the function is actually supposed to return.
It seems like you are not concatenating lists but prepending an int to a list. That is done with the :: operator. So your code would look like this (with myfun standing in for your function's name, since fun is a keyword too):
else (n-2) :: myfun (n-2)
Although I could see this function possibly not producing the desired output if you put in negative numbers. For example if you pass through n = 1, n-2 will evaluate to -1 which is less than zero.

What's the general rules to placing parenthesis in OCaml?

I always have trouble placing parentheses in OCaml. I would rather leave them out, but sometimes I get an error when I do.
for example, let's say I have two functions:
let f_a x y = x+y and let f_b x = x+1.
If I write f_a 3 f_b 4, it does not work; I have to write f_a 3 (f_b 4).
But if I write f_a 3 * f_b 4, it is perfectly fine.
Another example: if I write f_a x y::[], it is fine too, and I don't need to add parentheses like this: (f_a x y)::[].
I also find that I don't need parentheses for elements inside a tuple: (f_a 1 2, f_b 3) is fine.
So can anyone teach me the general rules for deciding when to use parentheses and when not to?
Here's a table explaining the precedence and associativity of certain expressions in OCaml.
As you can see from there, function application is left-associative, meaning that f_a 3 f_b 4 is interpreted as (((f_a) 3) f_b) 4. However, multiplication (*) has lower precedence than function application, which means that f_a 3 * f_b 4 is interpreted as (f_a 3) * (f_b 4) (first applying functions, afterwards multiplication).
Last, :: has a lower precedence than function application, so f_a x y::[] first applies the function and afterwards prepends the result to the empty list (i.e. "consumes" ::[]). This means that f_a x y::[] is seen as (f_a x y)::[].
Unfortunately, I could not deduce a simple rule of thumb, but I always remember that "function application has a quite high precedence and is left-associative". This works quite well for me.
Parentheses are simply to group things so they are evaluated as one whereas without parentheses they may or may not be due to operator precedence. Function application has higher precedence than all normal operators. You can see the OCaml precedence table by going here and then scrolling up a little.

Comma operator in a conditional

I have read in a lot of places but I really can't understand the specified behavior in conditionals.
I understand that in assignments it evaluates the first operand, discards the result, then evaluates the second operand.
But in this code, what is it supposed to do?
CPartFile* partfile = (CPartFile*)lParam;
ASSERT( partfile != NULL );
bool bDeleted = false;
if (partfile,bDeleted)
    partfile->PerformFileCompleteEnd(wParam);
Is the partfile in the if an unnecessary operand, or does it have any meaning?
In this case, it is an unnecessary expression, and can be deleted without changing the meaning of the code.
The comma operator evaluates its first operand, discards the result, then evaluates its second operand and yields that value as the result of the whole expression.
So partfile,bDeleted would evaluate whatever partfile would, discard that result, then evaluate and return bDeleted.
It's useful if you need to evaluate something which has a side-effect (for example, calling a method). In this case, though, it's useless.
For more information, see Wikipedia: Comma operator
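To make the "evaluate, discard, then use the right-hand side" behavior visible, here is a minimal sketch with a left operand that does have a side effect. The names touch, touch_count and comma_condition are hypothetical and not part of the code in the question.

```cpp
#include <cassert>

int touch_count = 0;

// Hypothetical helper with an observable side effect.
int touch() {
    ++touch_count;
    return 42;
}

// Mirrors `if (partfile, bDeleted)`: the left operand is evaluated and its
// value (42, which would be "true") is discarded; only `right` picks the branch.
bool comma_condition(bool right) {
    if (touch(), right) {
        return true;
    }
    return false;
}

// Usage sketch.
void check_example() {
    touch_count = 0;
    assert(!comma_condition(false));  // branch not taken although touch() returned 42
    assert(touch_count == 1);         // but touch() did run
    assert(comma_condition(true));
    assert(touch_count == 2);
}
```

In the question's code the left operand is a plain pointer variable, so evaluating it does nothing observable, which is why it can be removed.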
bool bDeleted = false;
if (partfile,bDeleted)
    partfile->PerformFileCompleteEnd(wParam);
Here, the if statement evaluates partfile,bDeleted, but bDeleted is always false, so the guarded statement never runs. The key question is "what's that all about?". The probable answer is that someone temporarily wanted to prevent the partfile->PerformFileCompleteEnd(wParam); statement from running, perhaps because it was causing some problem or they wanted to ensure later code reported errors properly if that step wasn't performed. So that they'd remember how the code used to be, they left the old "if (partfile)" logic there, but added a hardcoded bDeleted variable to document that the partfile->Perform... logic had effectively been "deleted" from the program.
A better way to temporarily disable such code is probably...
#if 0
if (partfile)
    partfile->PerformFileCompleteEnd(wParam);
#endif
...though sometimes I try to document the reasoning too...
#ifndef DONT_BYPASS_FILE_COMPLETE_PROCESSING_DURING_DEBUGGING
if (partfile)
    partfile->PerformFileCompleteEnd(wParam);
#endif
...or...
if (partfile, !"FIXME remove this after debugging")
    partfile->PerformFileCompleteEnd(wParam);
The best choice depends on your tool set and existing habits (e.g. some editors highlight "FIXME" and "TODO" in reverse video so it's hard to miss or grey out #if 0 blocks; you might have particular strings your source-control checkin warns about; preprocessor defines only in debug vs release builds can prevent accidental distribution etc.).
partfile is evaluated, then bDeleted is evaluated and used as the test. Since evaluation of partfile does not have any side effects, removing it from the conditional has no effect.
The comma operator is a rather obscure feature of C/C++. It should not be confused with the comma in declarations (e.g. int x, y;) nor with the comma that separates function call arguments (e.g. func(x, y)).
The comma operator has one single purpose: to give the programmer a guaranteed order of evaluation of an expression. For almost every operator in C/C++, the order in which the operands are evaluated is unspecified. If I write
result = x + y;
where x and y are subexpressions, then either x or y can be evaluated first. I cannot know which, it's up to the compiler. If you however write
result = x, y;
the order of evaluation is guaranteed by the standard: left first.
Of course, the uses of this in real world applications are quite limited...
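One detail worth spelling out about result = x, y; is that assignment binds tighter than the comma operator, so the statement parses as (result = x), y; and result receives x, not y. The left-to-right order still holds. A short sketch, using made-up helpers eval_x and eval_y that record when they run:

```cpp
#include <cassert>
#include <string>

std::string eval_order;  // records which operand ran, in order

int eval_x() { eval_order.push_back('x'); return 1; }
int eval_y() { eval_order.push_back('y'); return 2; }

int assign_without_parens() {
    int result = 0;
    result = eval_x(), eval_y();    // parsed as (result = eval_x()), eval_y();
    return result;                  // holds the value of eval_x()
}

int assign_with_parens() {
    int result = 0;
    result = (eval_x(), eval_y());  // the comma expression yields eval_y()
    return result;
}

// Usage sketch.
void check_example() {
    eval_order.clear();
    assert(assign_without_parens() == 1);
    assert(eval_order == "xy");     // left operand first in both cases
    eval_order.clear();
    assert(assign_with_parens() == 2);
    assert(eval_order == "xy");
}
```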