What's it for ';;' in OCaml? - ocaml

This is my simple OCaml code to print out a merged list.
let rec merge cmp x y = match x, y with
| [], l -> l
| l, [] -> l
| hx::tx, hy::ty ->
if cmp hx hy
then hx :: merge cmp tx (hy :: ty)
else hy :: merge cmp (hx :: tx) ty
let rec print_list = function
| [] -> ()
| e::l -> print_int e ; print_string " " ; print_list l
;;
print_list (merge ( <= ) [1;2;3] [4;5;6]) ;;
I copied the print_list function from Print a List in OCaml, but I had to add ';;' after the function's implementation in order to avoid this error message.
File "merge.ml", line 11, characters 47-57:
Error: This function has type int list -> unit
It is applied to too many arguments; maybe you forgot a `;'.
My question is why ';;' is needed for print_list while merge is not?

The ;; is, in essence, a way of marking the end of an expression. It's not necessary in source code (though you can use it if you like). It is useful in the toplevel (the OCaml REPL) to cause the evaluation of what you've typed so far. Without such a symbol, the toplevel has no way to know whether you're going to type more of the expression later.
In your case, it marks the end of the expression representing the body of the function print_list. Without this marker, the following symbols look like they're part of the same expression, which leads to a type error.
For top-level expressions in OCaml files, I prefer to write something like the following:
let () = print_list (merge ( <= ) [1;2;3] [4;5;6])
If you code this way you don't need to use the ;; token in your source files.

This is an expansion of Jeffrey's answer.
As you know, when doing language parsing, a program has to break the flow in manageable lexical elements, and expect these so called lexemes (or tokens) to follow certain syntactic rules, allowing to regroup lexemes in larger units of meaning.
In many languages, the largest syntactic element is the statement, which diversifies in instructions and definitions. In these same languages, the structure of the grammar requires a special lexeme to indicate the end of some of these units, usually either instructions or statements. Others use instead a lexeme to separate instructions (rather than end each of them), but it's basically the same idea.
In OCaml, the grammar follows patterns which, within the algorithm used to parse the language, permits to elide such instruction terminator (or separator) in various circumstances. The keyword let for instance, is always necessary to introduce a new definition. This can be used to detect the end of the preceding statement at the outermost level of program statements (the top level).
However, you can easily see the problem it induces: the interactive version of Ocaml always need the beginning of a new statement to be able to figure out the previous one, and a user would never be able to provide statements one by one! The ;; special token was thus defined(*) to indicate the end of a top level statement.
(*): I actually seem to recall that the token was originally in OCaml ancestors, but then made optional when OCaml was devised; however, don't quote me on that.

Related

Absolute value function Ocaml

I'm taking a look to this programming language "Ocaml" and I have some troubles because I read the official ocaml documentation but I don't uderstand how to use :
";" and ";;" and "in" specially inside the definition of functions.
This is my code :
let abs_val value : int -> int =
let abs_ret = ref 0 ;
if value >= 0
then abs_ret := value
else abs_ret := -value ;
let return : int = abs_ret
;;
print_int abs_val -12
Compiled with "ocamlc" it said :
File "first_program.ml", line 7, characters 2-4:
7 | ;;
^^
Error: Syntax error
And it sounds so weird for me because official ocaml's doc says that when function definition ends I must use ";;".
I noticed that after the definition of abs_val VisualStudio Code ,when I go on a newline, put automatically the cursor to 2 spaces on the right, not at the beginning of the line.
I'm new in ocaml so I don't know if this is common or not but for me sounds like if something is missing, and probably it is :)
P.S. : I know that an abs function already exists but I'm doing this to learn.
Update :
let abs_val value =
let abs_ret = ref 0 in
if value >= 0
then abs_ret := value
else abs_ret := -value in
let return : int = abs_ret;
;;
print_int abs_val -12
Am I closer right?
Sometimes it happens the syntax error is not here but above. Did you closed your previous function with ;; ? Also, what is this return ? Use the functional paradigm of the language. You use variables and references and force the type checking. I'm not sure how do you want to use this code but in a general way, try to let OCaml determine the type of your functions rather than telling the compiler what is the signature. Plus, your code shouldn't be using that much references. For instance the following :
let abs_val value =
if value < 0 then
-value
else
value
Will work perfectly and not mess up things with reference. If you wish to use references, I suggest you learn more about the functional paradigm of OCaml before going deeper into its imperative possibilities.
Your syntax error is a result of having let with no matching in.
This is a very common error when learning OCaml syntax. There are two separate uses of let in OCaml. At the top level of a module, you use let to define a symbol (a function or a value) that is one of the elements of the module. So in the following:
module M = struct
let f x = x * 2
end
The let defines a function named M.f.
Similarly your code uses let this way to define abs_val.
In other cases (not at the top level of a module), let is used only as part of the let ... in expression that looks like this:
let v = exp1 in exp2
This essentially defines a local variable v with the value exp1 that can be used in the body of exp2.
All your other uses of let (except the initial definition of abs_val) are of this second kind. However, none of them have in, so they are all syntactically incorrect.
You need to fix up these problems before you can make progress with this function. You can fix the first one, for example, by changing the first semicolon (;) to in.
As #SDAChess points out, you have a second problem with the return value of your function. There is no special return keyword in OCaml that's used to return the value of a function. A function in OCaml is just a set of nested function calls, and the value of the function is the value returned by the outermost call.

let rec for values and let rec for functions in ocaml

What is the best intuition for why the first definition would be refused, while the second one would be accepted ?
let rec a = b (* This kind of expression is not allowed as right-hand side of `let rec' *)
and b x = a x
let rec a x = b x (* oki doki *)
and b x = a x
Is it linked to the 2 reduction approaches : one rule for every function substitution (and a Rec delimiter) VS one rule per function definition (and lambda lifting) ?
Verifying that recursive definitions are valid is a very hard thing to do.
Basically, you want to avoid patterns in this kind of form:
let rec x = x
In the case where every left-hand side of the definitions are function declarations, you know it is going to be fine. At worst, you are creating an infinite loop, but at least you are creating a value. But the x = x case does not produce anything and has no semantic altogether.
Now, in your specific case, you are indeed creating functions (that loop indefinitely), but checking that you are is actually harder. To avoid writing a code that would attempt exhaustive checking, the OCaml developers decided to go with a much easier algorithm.
You can have an outlook of the rules here. Here is an excerpt (emphasis mine):
It will be accepted if each one of expr1 … exprn is statically constructive with respect to name1 … namen, is not immediately linked to any of name1 … namen, and is not an array constructor whose arguments have abstract type.
As you can see, direct recursive variable binding is not permitted.
This is not a final rule though, as there are improvements to that part of the compiler pending release. I haven't tested if your example passes with it, but some day your code might be accepted.

OCaml update value through iteration of a list

I was wondering how I can update the value of a variable through the iteration of a list. For example, let's say I want to keep track of the number of variables of a list. I could do something like
let list = [1;2;3;4;5]
let length = 0 in
let getCount elmt =
length = length+1 in
List.iter getCount list
but I get the error This expression has type 'a -> bool which makes sense because at length = length+1 I am comparing using =. How should I update the value of length?
EDIT:
I tried
let wordMap =
let getCount word =
StringMap.add word (1000) wordMap in
List.fold_left getCount StringMap.empty wordlist;;
but it doesn't know what wordMap is in getCount function...
#PatJ gives a good discussion. But in real code you would just use a fold. The purpose of a fold is precisely what you ask for, to maintain some state (of any type you like) while traversing a list.
Learning to think in terms of folds is a basic skill in functional programming, so it's worth learning.
Once you're good at folds, you can decide on a case-by-case basis whether you need mutable state. In almost all cases you don't.
(You can definitely use a fold to accumulate a map while traversing a list.)
You have several ways to do this.
The simpler way (often preferred by beginners coming from the imperative world) is to use a reference. A reference is a variable you can legally mutate.
let length l =
let count = ref 0 in
let getCount _ = (* _ means we ignore the argument *)
count := !count + 1
in
List.iter getCount l;
!count
As you can see in here, !count returns the value currently in the reference and := allows you to do the imperative update.
But you should not write that code
Yeah, I'm using bold, this is how serious I am about it. Basically, you should avoid using references when you can rely on pure functional programing. That is, when there are no side-effects.
So how do you modify a variable when you are not allowed to? That's where recursion comes in. Check this:
let rec length l =
match l with
| [] -> 0
| _::tl -> 1 + length tl
In that code, we no longer have a count variable. Don't worry, we'll get it back soon. But you can see that just by calling length again, we can assign a new value tl to the argument l. Yet it is pure and considered a better practice.
Well, almost.
The last code has the problem of recursion: each call will add (useless) data to the stack and, after being through the list, will do the additions. We don't want that.
However, function calls can be optimized if they are tail calls. As Wikipedia can explain to you:
a tail call is a subroutine call performed as the final action of a
procedure.
In the later code, the recursive call to length isn't a tail call as + is the final action of the function. The usual trick is to use an accumulator to store the intermediate results. Let's call it count.
let rec length_iterator count l =
match l with
| [] -> count
| _::tl -> length_iterator (count+1) tl
in
let length l = length_iterator 0 l
And now we have a neat, pure, and easy-to-optimize code that calculates the length of your list.
So, to answer the question as stated in the title: iterate with a (tail-)recursive function and have the updatable variables as arguments of this function.
If the goal is to get the length of the list, just use the function provided by List Module: List.length. Otherwise, variables in OCaml are never mutable and what you're trying to do is illegal in OCaml and not functional at all. But if you really have to update a value, consider using ref(for more info: http://www.cs.cornell.edu/courses/cs3110/2011sp/recitations/rec10.htm).

Printing list in without sequence operator

Want to print list without using sequence operator:-
This is my program, I want to make this program without ";" operator and without using "let" for assigning variables !
let rec print_row = function
[]->()
|h::t -> print_int h; Printf.printf(" "); print_row t;;
You want to implement a generic approach for applying functions over any type of lists. You can go about this in the following manner using List.map:
print_string (String.concat " " (List.map string_of_int [1;2;3;4]));;
You can learn more about List.map here
This looks like a school assignment, so I'll just make some suggestions.
If you're trying to reformulate the function into more idiomatic OCaml, one thing to think of is the functions in the List module. In particular, you could write a function that does what you want to do for each element of the list, then look in the List module for a way to call a function for every element of a list.
If you just want to eliminate the ; operator (which is pretty unlikely I realize), you can always rewrite a; b as let _ = a in b.

How does pattern matching works with lists

I just start learning haskell and pattern matching. I just don't understand how it implemented, is the [] and (x:_) evaluates to the different type and the function implementations for this pattern recognized because of polymorphism, or I just wrong and there is another technique used.
head' :: [a] -> a
head' [] = error "Can't call head on an empty list, dummy!"
head' (x:_) = x
Or lets consider this pattern matching function:
tell :: (Show a) => [a] -> String
tell [] = "The list is empty"
tell (x:[]) = "The list has one element: " ++ show x
tell (x:y:[]) = "The list has two elements: " ++ show x ++ " and " ++ show y
tell (x:y:_) = "This list is long. The first two elements are: " ++ show x ++ " and " ++ show y
I think I wrong because each list with different number of elements can't have deferent type. Can you explain me how haskell knows which pattern is correct for some function implementation? I understand how it works logically but not deep. Explain it to me please.
There's a bit of a level of indirection between the syntax in Haskell and the implementation at runtime. The list syntax that you see in most source is actually a bit of sugar for what is actually a fairly regular data type. The following two lines are equivalent:
data List a = Nil | Cons a (List a)
data [a] = [] | a : [a] -- pseudocode
So when you type in [a,b,c] this translates into the internal representation as (a : (b : (c : []))).
The pattern matching you'll see at the top level bindings in are also a bit of syntactic sugar ( sans some minor details ) for case statements which deconstruct the data types onto the right hand side of the case of the pattern match. The _ symbol is a builtin for the wildcard pattern which matches any expression ( so long as the pattern is well-typed ) but does add a variable to the RHS scope.
head :: [a] -> a
head x = case x of
(a : _) -> a
I think I wrong because each list with different number of elements can't have deferent type. Can you explain me how haskell knows which pattern is correct for some function implementation?
So to answer your question [] and (x:_) are both of the same type (namely [a] or List a in our example above). The list datatype is also recursive so all recursive combinations of Cons'ing values of type a have the same type, namely [a].
When the compiler typechecks the case statement it runs along each of the LHS columns and ensures that all the types are equivalent or can be made equivalent using a process called unification. The same is done on the right hand side of the cases.
Haskell "knows" which pattern is correct by trying each branch sequentially and unpacking the matched value to see if it has the same number of cons elements as the pattern and then proceeding to the branch which does, or failing with a pattern match error if none match. The same process is done for any pattern matching on data types, not just lists.
There are two separate sets of checks:
in compile time (before the program is run) the compiler simply checks that the types of the cases conform to the function's signature and that the calls to the function conform to the function's signature.
In runtime, when we have access to the actual input, a similar, but different, set of checks is executed: these checks take the * actual* input (not its type) at hand and determine which of the cases matches it.
So, to answer your question: the decision is made in compile time where we have not only the type but also the actual value.