What is <cycle> in data? - ocaml

(I use OCaml version 4.02.3)
I defined a type self
# type self = Self of self;;
type self = Self of self
and its instance s
# let rec s = Self s;;
val s : self = Self <cycle>
Since OCaml is a strict language, I expected that defining s would fall into infinite recursion. But the interpreter said s has a value and it is Self <cycle>.
I also applied a function to s.
# let f (s: self) = 1;;
val f : self -> int = <fun>
# f s;;
- : int = 1
It seems s is not evaluated before the function application (as in a non-strict language).
How does OCaml deal with cyclic data like s? Is Self <cycle> a normal form?

OCaml is indeed an eager language; however, s is a perfectly valid and fully evaluated term that happens to contain a cycle. For instance, this code yields the expected result:
let f (Self (Self x)) = x;;
f s == s;;
More precisely, a constructor with n arguments is represented in memory as a boxed block laid out like this:
⋅———————————————————————————————————————————————⋅
| header | field[0] | field[1] | ⋯ | field[n-1] |
⋅———————————————————————————————————————————————⋅
The header contains metadata whereas field[k] is an OCaml value, i.e. either an integer or a pointer. In the case of s, Self has only one argument, and thus only one field field[0]. The value of field[0] is then simply a pointer towards the start of the block. The term s is thus perfectly representable in OCaml.
Moreover, the toplevel printer is able to detect this kind of cycle and prints <cycle> to avoid falling into an infinite recursion when printing the value of s. Here, <cycle>, like <abstr> or <fun>, just stands for a kind of value that the toplevel printer cannot print.
Note, however, that cyclic values will trigger infinite recursion in many situations. For instance, f s = s, where (=) is the structural equality and not the physical one (i.e. (==)), triggers such a recursion. Another example would be
let rec ones = 1 :: ones;; (* prints [1;<cycle>] *)
let twos = List.map ((+) 1) ones;; (* falls in an infinite recursion *)
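As a minimal sketch of the points above, the cycle can be observed safely by comparing with physical equality and by traversing only a bounded prefix of a cyclic list. The helper names unwrap and take are illustrative assumptions, not part of the original answer:

```ocaml
type self = Self of self

(* A fully evaluated value whose single field points back to itself. *)
let rec s = Self s

(* Hypothetical helper: strip one constructor. *)
let unwrap (Self x) = x

(* A cyclic list: one cons cell whose tail is the cell itself. *)
let rec ones = 1 :: ones

(* Hypothetical helper: read at most n elements, so it terminates on cycles. *)
let rec take n l =
  match l with
  | x :: rest when n > 0 -> x :: take (n - 1) rest
  | _ -> []

let () =
  (* Physical equality compares addresses and never follows the cycle. *)
  assert (unwrap s == s);
  assert (List.tl ones == ones);
  (* Only three cells are visited, so this structural comparison is finite. *)
  assert (take 3 ones = [1; 1; 1])
```

Any function that walks the whole structure (List.map, List.length, structural equality) would still loop on s or ones; bounding the traversal is what makes take safe here.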

Related

When should extensible variant types be used in OCaml?

I took a course on OCaml before extensible variant types were introduced, and I don't know much about them. I have several questions:
(This question was deleted because it attracted a "not answerable objectively" close vote.)
What are the low-level consequences of using EVTs, such as performance, memory representation, and (un-)marshaling?
Note that my question is about extensible variant type specifically, unlike the question suggested as identical to this one (that question was asked prior to the introduction of EVTs!).
Extensible variants are quite different from standard variants in terms of
runtime behavior.
In particular, extension constructors are runtime values that live inside
the module where they were defined. For instance, in
type t = ..
module M = struct
type t += A
end
open M
the line type t += A defines a new extension constructor value A and adds it to the
existing extension constructors of M at runtime.
By contrast, classical variants do not really exist at runtime.
It is possible to observe this difference by noticing that I can use
a mli-only compilation unit for classical variants:
(* classical.mli *)
type t = A
(* main.ml *)
let x = Classical.A
and then compile main.ml with
ocamlopt classical.mli main.ml
without trouble because no value is involved in the Classical module.
With extensible variants, by contrast, this is not possible. If I have
(* ext.mli *)
type t = ..
type t += A
(* main.ml *)
let x = Ext.A
then the command
ocamlopt ext.mli main.ml
fails with
Error: Required module `Ext' is unavailable
because the runtime value for the extension constructor Ext.A is missing.
You can also peek at both the name and the id of the extension constructor
using the Obj module. (In recent OCaml releases the same information is exposed as Obj.Extension_constructor.name and Obj.Extension_constructor.id.)
let a = [%extension_constructor A];;
Obj.extension_name a;;
- : string = "M.A"
Obj.extension_id a;;
- : int = 144
(This id is quite brittle and its value is not particularly meaningful.)
An important point is that extension constructors are distinguished using their
memory location. Consequently, constructors with n arguments are implemented
as blocks with n+1 fields where the first hidden field is the extension
constructor:
type t += B of int
let x = B 0;;
Here, x contains two fields, and not one:
Obj.size (Obj.repr x);;
- : int = 2
And the first field is the extension constructor B:
Obj.field (Obj.repr x) 0 == Obj.repr [%extension_constructor B];;
- : bool = true
The previous statement also works for n=0: extensible variants are never
represented as a tagged integer, unlike classical variants.
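The representation claims above can be checked in one self-contained sketch. The type names t and classic and the constructor K are illustrative assumptions, and the Obj module is used only for inspection:

```ocaml
type t = ..
type t += A | B of int

(* An ordinary variant, for comparison. *)
type classic = K

let () =
  (* A constant constructor of an ordinary variant is an immediate integer. *)
  assert (Obj.is_int (Obj.repr K));
  (* A constant extension constructor is a block, not an immediate integer. *)
  assert (Obj.is_block (Obj.repr A));
  (* B 0 holds the hidden constructor field plus its single argument. *)
  assert (Obj.size (Obj.repr (B 0)) = 2);
  (* The hidden first field is physically the extension constructor B. *)
  assert (Obj.field (Obj.repr (B 0)) 0 == Obj.repr [%extension_constructor B])
```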
Since marshalling does not preserve physical equality, values of an extensible
sum type cannot be marshalled without losing their identity. For instance, doing
a round trip with
let round_trip (x:'a):'a = Marshal.from_string (Marshal.to_string x []) 0
then testing the result with
type t += C
let is_c = function
| C -> true
| _ -> false
leads to a failure:
is_c (round_trip C);;
- : bool = false
because the round trip allocated a new block when reading the marshalled value.
This is the same problem that already existed with exceptions, since exceptions
are extensible variants.
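Put together as a single compilation unit, the marshalling experiment reads as follows. This is only a sketch that assembles the fragments above; the assertions encode the behaviour the answer describes:

```ocaml
type t = ..
type t += C

(* Serialize a value to a string and read it back. *)
let round_trip (x : 'a) : 'a =
  Marshal.from_string (Marshal.to_string x []) 0

let is_c = function
  | C -> true
  | _ -> false

let () =
  (* The original constructor value is recognised. *)
  assert (is_c C);
  (* The unmarshalled copy lives in a fresh block, so the physical
     comparison done by the pattern match no longer succeeds. *)
  assert (not (is_c (round_trip C)))
```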
This also means that pattern matching on extensible types is quite different
at runtime. For instance, if I define a simple variant
type s = A of int | B of int
and define a function f as
let f = function
| A n | B n -> n
the compiler is smart enough to optimize this function to simply accessing the
first field of the argument.
You can check with ocamlc -dlambda that the function above is represented in
the Lambda intermediate representation as:
(function param/1008 (field 0 param/1008))
However, with extensible variants, not only do we need a default pattern
type e = ..
type e += A of int | B of int
let g = function
| A n | B n -> n
| _ -> 0
but we also need to compare the argument with each extension constructor in the
match, leading to a more complex Lambda IR for the match
(function param/1009
(catch
(if (== (field 0 param/1009) A/1003) (exit 1 (field 1 param/1009))
(if (== (field 0 param/1009) B/1004) (exit 1 (field 1 param/1009))
0))
with (1 n/1007) n/1007))
Finally, to conclude with an actual example of extensible variants,
in OCaml 4.08, the Format module replaced its string-based user-defined tags
with extensible variants.
This means that defining new tags looks like this:
First, we start with the actual definition of the new tags
type t = Format.stag = ..
type Format.stag += Warning | Error
Then the translation functions for those new tags are
let mark_open_stag tag =
match tag with
| Error -> "\x1b[31m" (* aka print the content of the tag in red *)
| Warning -> "\x1b[35m" (* ... in purple *)
| _ -> ""
let mark_close_stag _tag =
"\x1b[0m" (*reset *)
Installing the new tag is then done with
let enable ppf =
Format.pp_set_tags ppf true;
Format.pp_set_mark_tags ppf true;
Format.pp_set_formatter_stag_functions ppf
{ (Format.pp_get_formatter_stag_functions ppf ()) with
mark_open_stag; mark_close_stag }
With some helper functions, printing with those new tags can be done with
Format.printf "This message is %a.@." error "important";
Format.printf "This one %a.@." warning "not so much"
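The helper functions error and warning are not shown in the answer. One possible sketch, assuming they simply wrap a string in the corresponding semantic tag, is given below; the name with_stag is hypothetical, and the tag declaration is repeated only so that the block compiles on its own:

```ocaml
type Format.stag += Warning | Error

(* Hypothetical helper: print a string inside the given semantic tag. *)
let with_stag stag ppf s =
  Format.pp_open_stag ppf stag;
  Format.pp_print_string ppf s;
  Format.pp_close_stag ppf ()

(* Printers usable with %a: Format.formatter -> string -> unit. *)
let error ppf s = with_stag Error ppf s
let warning ppf s = with_stag Warning ppf s
```

With this shape, the markers returned by mark_open_stag and mark_close_stag are emitted around the string whenever tag marking is enabled on the formatter, and nothing extra is printed otherwise.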
Compared with string tags, there are a few advantages:
less room for a spelling mistake
no need to serialize/deserialize potentially complex data
no mix-up between different extension constructors with the same name:
chaining multiple user-defined mark_open_stag functions is thus safe,
since each function can only recognise its own extension constructors.

Function that converts a sequence to a list in OCaml

I want to convert a sequence to a list using List.init. I want at each step to retrieve the i th value of s.
let to_list s =
let n = length s in
List.init n
(fun _i ->
match s () with
| Nil -> assert false
| Cons (a, sr) -> a)
This is giving me a list initialized with the first element of s only. Is it possible in OCaml to initialize the list with all the values of s?
It may help to study the definition of List.init.
There are two variations depending on the size of the list: a tail recursive one, init_tailrec_aux, whose result is in reverse order, and a basic one, init_aux. They have identical results, so we need only look at init_aux:
let rec init_aux i n f =
if i >= n then []
else
let r = f i in
r :: init_aux (i+1) n f
This function recursively increments a counter i until it reaches a limit n. For each value of the counter that is strictly less than the limit, it adds the value given by f i to the head of the list being produced.
The question now is, what does your anonymous function do when called with different values of i?
let f_anon =
(fun _i -> match s () with
|Nil -> assert false
|Cons(a, sr) -> a)
Regardless of _i, it always gives the head of the sequence produced by s (), and if s () always returns the same node, then f_anon 0 = f_anon 1 = f_anon 2 = f_anon 3 = hd (s ()).
Jeffrey Scofield's answer describes a technique for giving a different value at each _i, and I agree with his suggestion that List.init is not the best solution for this problem.
The essence of the problem is that you're not saving sr, which would let you retrieve the next element of the sequence.
However, the slightly larger problem is that List.init passes only an int as an argument to the initialization function. So even if you did keep track of sr, there's no way it can be passed to your initialization function.
You can do what you want using the impure parts of OCaml. E.g., you could save sr in a global reference variable at each step and retrieve it in the next call to the initialization function. However, this is really quite a cumbersome way to produce your list.
I would suggest not using List.init. You can write a straightforward recursive function to do what you want. (If you care about tail recursion, you can write a slightly less straightforward function.)
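A minimal sketch of such a recursive function, assuming s has the standard library's Seq.t type (which matches the Nil and Cons constructors used in the question). This is the tail-recursive variant: it threads the rest of the sequence through each call and reverses the accumulator at the end.

```ocaml
let to_list (s : 'a Seq.t) : 'a list =
  (* acc holds the elements seen so far, in reverse order. *)
  let rec go acc s =
    match s () with
    | Seq.Nil -> List.rev acc
    | Seq.Cons (x, rest) -> go (x :: acc) rest
  in
  go [] s

let () =
  assert (to_list (List.to_seq [1; 2; 3]) = [1; 2; 3])
```

Each element of the sequence is forced exactly once, so the cost is linear in the length of the sequence, and no separate call to length is needed.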
Using a recursive function will increase the complexity, so I think that initializing the list (or array) directly at the corresponding length will be better. But I don't really know how to get a different value at each _i, as Jeffrey Scofield said. I am not really familiar with OCaml, especially sequences, so I have some difficulties doing that :(

How do I write a function to create a circular version of a list in OCaml?

It's possible to create infinite, circular lists using let rec, without needing to resort to mutable references:
let rec xs = 1 :: 0 :: xs ;;
But can I use this same technique to write a function that receives a finite list and returns an infinite, circular version of it? I tried writing
let rec cycle xs =
let rec result = go xs and
go = function
| [] -> result
| (y::ys) -> y :: go ys in
result
;;
But got the following error
Error: This kind of expression is not allowed as right-hand side of `let rec'
Your code has two problems:
result = go xs is not a legal right-hand side for let rec.
The function tries to create a loop through computation, which falls into an infinite loop and causes a stack overflow.
The above code is rejected by the compiler because you cannot write an expression which may cause recursive computation in the right-hand side of let rec (see Limitations of let rec in OCaml).
Even if you fix that issue you still have a problem: cycle never finishes the job:
let rec cycle xs =
  let rec go = function
    | [] -> go xs
    | y::ys -> y :: go ys
  in
  go xs;;
cycle [1;2];;
cycle [1;2] fails due to stack overflow.
In OCaml, let rec can define a looped structure only when its definition is "static" and does not perform any computation. let rec xs = 1 :: 0 :: xs is such an example: (::) is not a function but a constructor, which purely constructs the data structure. On the other hand, cycle executes code to create a structure dynamically, and that structure is infinite. I am afraid that you cannot write a function like cycle in OCaml.
If you want to introduce loops in data, as cycle tries to, what you can do is use a lazy structure to prevent an immediate infinite loop, like Haskell's lazy lists, or use mutation to close the loop by substitution. OCaml's lists are neither lazy nor mutable, therefore you cannot write a function that dynamically constructs looped lists.
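As a sketch of the lazy alternative, the standard library's Seq.t type (a function that produces the next node on demand) can stand in for a lazy list. This swaps the list result for a sequence, so it is not a drop-in replacement for the cycle in the question, and the helper take is an illustrative assumption:

```ocaml
(* Repeat the elements of a non-empty list forever, on demand. *)
let cycle (xs : 'a list) : 'a Seq.t =
  if xs = [] then invalid_arg "cycle";
  let rec go rest () =
    match rest with
    | [] -> go xs ()                    (* wrap around to the start *)
    | y :: ys -> Seq.Cons (y, go ys)    (* go ys is not run until forced *)
  in
  go xs

(* Hypothetical helper: force only the first n elements. *)
let rec take n s =
  if n <= 0 then []
  else
    match s () with
    | Seq.Nil -> []
    | Seq.Cons (x, rest) -> x :: take (n - 1) rest

let () =
  assert (take 5 (cycle [1; 2]) = [1; 2; 1; 2; 1])
```

Nothing is computed until a consumer forces a node, which is why building the infinite sequence returns immediately instead of overflowing the stack.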
If you do not mind using black magic, you could try this code:
let cycle l =
  if l = [] then invalid_arg "cycle" else
  let l' = List.map (fun x -> x) l in (* copy the list *)
  let rec aux = function
    | [] -> assert false
    | [_] as lst -> (* find the last cons cell *)
      (* and set the last pointer to the beginning of the list *)
      Obj.set_field (Obj.repr lst) 1 (Obj.repr l')
    | _::t -> aux t
  in aux l'; l'
Please be aware that using the Obj module is highly discouraged. On the other hand, there are industrial-strength programs and libraries (Coq, Jane Street's Core, Batteries included) that are known to use this sort of forbidden art.
camlspotter's answer is good enough already. I just want to add several more points here.
First of all, a function that receives a finite list and returns an infinite, circular version of it can be written so that it compiles; it is just that if you actually call the function, it overflows the stack and never returns.
A simple version of what you were trying to do is like this:
let rec circle1 xs = List.rev_append (List.rev xs) (circle1 xs)
val circle1 : 'a list -> 'a list = <fun>
It can be compiled and theoretically it is correct. On [1;2;3], it is supposed to generate [1;2;3;1;2;3;1;2;3;1;2;3;...].
However, of course, it will fail because its run is endless and eventually overflows the stack.
So why does let rec circle2 = 1::2::3::circle2 work?
Let's see what will happen if you do it.
First, circle2 is a value and it is a list. Once OCaml has this information, it can create a static address for circle2 with the memory representation of a list.
The memory's real value is 1::2::3::circle2, which actually is Node (1, Node (2, Node (3, circle2))), i.e., a Node with int 1 and the address of a Node with int 2 and the address of a Node with int 3 and the address of circle2. But we already know circle2's address, right? So OCaml just puts circle2's address there.
Everything will work.
Also, through this example, we can see that an infinite circular list defined like this only costs a bounded amount of memory. It does not generate a real infinite list that consumes all memory; instead, when a cycle finishes, it just jumps "back" to the head of the list.
Let's then go back to the example of circle1. circle1 is a function; yes, it has an address, but we do not need or want it. What we want is the address of the function application circle1 xs. Unlike circle2, it is a function application, which means we need to compute something to get the address. So,
OCaml will do List.rev xs, then try to get the address of circle1 xs, then repeat, and repeat.
OK, then why do we sometimes get Error: This kind of expression is not allowed as right-hand side of 'let rec'?
From http://caml.inria.fr/pub/docs/manual-ocaml/extn.html#s%3aletrecvalues
the let rec binding construct, in addition to the definition of
recursive functions, also supports a certain class of recursive
definitions of non-functional values, such as
let rec name1 = 1 :: name2 and name2 = 2 :: name1 in expr which
binds name1 to the cyclic list 1::2::1::2::…, and name2 to the cyclic
list 2::1::2::1::… Informally, the class of accepted definitions
consists of those definitions where the defined names occur only
inside function bodies or as argument to a data constructor.
If you use let rec to define a binding, say let rec name, this name can only occur in either a function body or as an argument to a data constructor.
In the previous two examples, circle1 is in a function body (let rec circle1 = fun xs -> ...) and circle2 is in a data constructor.
If you do let rec circle = circle, it will give an error as circle is not in either of the two allowed cases. let rec x = let y = x in y won't do either, because again, x is not in a constructor or a function.
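The two accepted cases and the two rejected ones can be put side by side in a small sketch. The function count_down is an illustrative assumption; name1 and name2 come from the manual excerpt quoted above:

```ocaml
(* Accepted: the defined names occur only as constructor arguments. *)
let rec name1 = 1 :: name2
and name2 = 2 :: name1

(* Accepted: the defined name occurs only inside a function body. *)
let rec count_down n = if n <= 0 then [] else n :: count_down (n - 1)

(* Rejected by the compiler if uncommented:
   let rec circle = circle
   let rec x = let y = x in y *)

let () =
  assert (List.hd name1 = 1);
  (* The two cons cells point at each other, forming the cycle. *)
  assert (List.tl name1 == name2);
  assert (List.tl name2 == name1);
  assert (count_down 3 = [3; 2; 1])
```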
Here is also a clear explanation:
https://realworldocaml.org/v1/en/html/imperative-programming-1.html
Section Limitations of let rec

SML: How can I pass a function a list and return the list with all negative reals removed?

Here's what I've got so far...
fun positive l1 = positive(l1,[],[])
| positive (l1, p, n) =
if hd(l1) < 0
then positive(tl(l1), p, n @ [hd(l1])
else if hd(l1) >= 0
then positive(tl(l1), p @ [hd(l1)], n)
else if null (h1(l1))
then p
Yes, this is for my educational purposes. I'm taking an ML class in college and we had to write a program that would return the biggest integer in a list and I want to go above and beyond that to see if I can remove the positives from it as well.
Also, if possible, can anyone point me to a decent ML book or primer? Our class text doesn't explain things well at all.
You fail to mention that your code doesn't type.
Your first function clause just has the variable l1, which is used in the recursive call. However, there it is used as the first element of the triple that is given as the argument. This doesn't really go hand in hand with the Hindley–Milner type system that SML uses. This is perhaps better seen by the following informal thoughts:
Let's start by assuming that l1 has the type 'a, and thus the function must take arguments of that type and return something unknown: 'a -> .... However, on the right-hand side you create an argument (l1, [], []) which must have the type 'a * 'b list * 'c list. But since it is passed as an argument to the function, that must also mean that 'a is equal to 'a * 'b list * 'c list, which clearly is not the case.
Clearly this was not your original intent. It seems that your intent was to have a function that takes a list as argument, and then at the same time have a recursive helper function, which takes two extra accumulation arguments, namely a list of positive and a list of negative numbers in the original list.
To do this, you at least need to give your helper function another name, such that its definition won't rebind the definition of the original function.
Then you have some options, as to which scope this helper function should be in. In general if it doesn't make any sense to be calling this helper function other than from the "main" function, then it should not be placed in a scope outside the "main" function. This can be done using a let binding like this:
fun positive xs =
let
fun positive' ys p n = ...
in
positive' xs [] []
end
This way the helper function positive' can't be called outside of the positive function.
With this taken care of, there are some more issues with your original code.
Since you are only returning the list of positive integers, there is no need to keep track of the
negative ones.
You should be using pattern matching to decompose the list elements. This way you eliminate the
use of taking the head and tail of the list, and also the need to verify whether there actually is
a head and tail in the list.
fun foo [] = ... (* input list is empty *)
| foo (x::xs) = ... (* x is now the head, and xs is the tail *)
You should not use the append operator (@), whenever you can avoid it (which you always can).
The problem is that it has a terrible running time when you have a huge list on the left hand
side and a small list on the right hand side (which is often the case for the right hand side, as
it is mostly used to append a single element). Thus it should in general be considered bad
practice to use it.
However there exists a very simple solution to this, which is to always concatenate the element
in front of the list (constructing the list in reverse order), and then just reversing the list
when returning it as the last thing (making it in expected order):
fun foo [] acc = rev acc
| foo (x::xs) acc = foo xs (x::acc)
Given these small notes, we end up with a function that looks something like this
fun positive xs =
let
fun positive' [] p = rev p
| positive' (y::ys) p =
if y < 0 then
positive' ys p
else
positive' ys (y :: p)
in
positive' xs []
end
Have you learned about List.filter? It might be appropriate here - it takes a function (which is a predicate) of type 'a -> bool and a list of type 'a list, and returns a list consisting of only the elements for which the predicate evaluates to true. For example:
List.filter (fn x => Real.>= (x, 0.0)) [1.0, 4.5, ~3.4, 42.0, ~9.0]
Your existing code won't work because you're comparing against an integer using the int version of <. The code hd(l1) < 0 will work over a list of int, not a list of real. Numeric literals are not automatically coerced by Standard ML. One must explicitly write 0.0, and use Real.< (hd(l1), 0.0) for your test.
If you don't want to use filter from the standard library, you could consider how one might implement filter yourself. Here's one way:
fun filter f [] = []
| filter f (h::t) =
if f h
then h :: filter f t
else filter f t

Recursive function that returns all values in list (In OCaml)

I need a function that recursively returns (not prints) all values in a list with each iteration. However, every time I try programming this my function returns a list instead.
let rec elements list = match list with
| [] -> []
| h::t -> h; elements t;;
I need to use each element each time it is returned in another function that I wrote, so I need these elements one at a time, but I can't figure this part out. Any help would be appreciated.
Your function is equivalent to :
let rec elements list =
match list with
| [] -> []
| h :: t -> elements t
This happens because a ; b evaluates a (and discards the result) and then evaluates and returns b. Obviously, this is in turn equivalent to:
let elements (list : 'a list) = []
This is not a very useful function.
Before you try solving this, however, please understand that Objective Caml functions can only return one value. Returning more than one value is impossible.
There are ways to work around this limitation. One solution is to pack all the values you wish to return into a single value: a tuple or a list, usually. So, if you need to return an arbitrary number of elements, you would pack them together into a list and have the calling code process that list:
let my_function () = [ 1 ; 2; 3; 4 ] in (* Return four values *)
List.iter print_int (my_function ()) (* Print four values *)
Another less frequent solution is to provide a function and call it on every result:
let my_function action =
action 1 ;
action 2 ;
action 3 ;
action 4
in
my_function print_int
This is less flexible, but arguably faster, than returning a list : lists can be filtered, sorted, stored...
Your question is kind of confusing - you want a function that returns all the values in a list. Well the easiest way of returning a variable number of values is using a list! Are you perhaps trying to emulate Python generators? OCaml doesn't have anything similar to yield, but instead usually accomplishes the same by "passing" a function to the value (using iter, fold or map).
What you have currently written is equivalent to this in Python:
def elements(list):
    if(len(list) == 0):
        return []
    else:
        list[0]
        return elements(list[1:])
If you are trying to do this:
def elements(list):
    if(len(list) > 0):
        yield list[0]
        # this part is pretty silly but elements returns a generator
        for e in elements(list[1:]):
            yield e
for x in elements([1,2,3,4,5]):
    dosomething(x)
The equivalent in OCaml would be like this:
List.iter dosomething [1;2;3;4;5]
If you are trying to determine if list a is a subset of list b (as I've gathered from your comments), then you can take advantage of List.mem and List.for_all:
List.for_all (fun x -> List.mem x b) a
fun x -> List.mem x b defines a function that returns true if the value x is equal to any element in (is a member of) b. List.for_all takes a function that returns a bool (in our case, the membership function we just defined) and a list. It applies that function to each element in the list. If that function returns true for every value in the list, then for_all returns true.
So what we have done is: for all elements in a, check if they are a member of b. If you are interested in how to write these functions yourself, then I suggest reading the source of list.ml, which (assuming *nix) is probably located in /usr/local/lib/ocaml or /usr/lib/ocaml.
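Wrapped up as a named function, the subset check described above might look like this. The name is_subset is an illustrative assumption, and membership uses structural equality, as List.mem does:

```ocaml
(* true when every element of a also appears in b *)
let is_subset a b = List.for_all (fun x -> List.mem x b) a

let () =
  assert (is_subset [1; 3] [1; 2; 3]);
  assert (not (is_subset [1; 4] [1; 2; 3]));
  (* The empty list is a subset of any list. *)
  assert (is_subset [] [1])
```

Each List.mem call scans b from the start, so the check takes time proportional to the product of the two lengths, which is usually fine for small lists.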