OCaml Signature with multiple types - ocaml

I would like to represent some scalar value (e.g. integers or strings)
by either it's real value or by some NA value and later store them
in a collection (e.g. a list). The purpose is to handle missing values.
To do this, I have implemented a signature
module type Scalar = sig
type t
type v = Value of t | NA
end
Now I have some polymorphic Vector type in mind that contains Scalars. Basically, some of the following
module Make_vector(S: Scalar) = struct
type t = S.v list
... rest of the functor ...
end
However, I cannot get this to work. I would like to do something like
module Int_vector = Make_vector(
struct
type t = int
end
)
module Str_vector = Make_vector(
struct
type t = string
end
)
... and so on for some types.
I have not yet worked a lot with OCaml so maybe this is not the right way. Any advises on how to realize such a polymorphic Scalar with a sum type?
The compiler always responds with the following message:
The parameter cannot be eliminated in the result type.
Please bind the argument to a module identifier.
Before, I have tried to implement Scalar as a sum type but ran into
complexity issues when realizing some features due to huge match clauses. Another (imo not so nice) option would be to use option. Is this a better strategy?

As far as I can see, you are structuring v as an input type to your functor, but you really want it to be an output type. Then when you apply the functor, you supply only the type t but not v. My suggestion is to move the definition of v into your implementation of Make_vector.

What are you trying to do exactly with modules / functors? Why simple 'a option list is not good enough? You can have functions operating on it, e.g.
let rec count_missing ?acc:(acc=0) = function
| None::tail -> count_missing ~acc:(acc+1) tail
| _::tail -> count_missing ~acc tail
| [] -> acc ;;
val count_missing : ?acc:int -> 'a option list -> int = <fun>
count_missing [None; Some 1; None; Some 2] ;;
- : int = 2
count_missing [Some "foo"; None; Some "bar"] ;;
- : int = 1

Related

When should extensible variant types be used in OCaml?

I took a course on OCaml before extensible variant types were introduced, and I don't know much about them. I have several questions:
(This question was deleted because it attracted a "not answerable objectively" close vote.)
What are the low-level consequences of using EVTs, such as performance, memory representation, and (un-)marshaling?
Note that my question is about extensible variant type specifically, unlike the question suggested as identical to this one (that question was asked prior to the introduction of EVTs!).
Extensible variants are quite different from standard variants in term of
runtime behavior.
In particular, extension constructors are runtime values that lives inside
the module where they were defined. For instance, in
type t = ..
module M = struct
type t +=A
end
open M
the second line define a new extension constructor value A and add it to the
existing extension constructors of M at runtime.
Contrarily, classical variants do not really exist at runtime.
It is possible to observe this difference by noticing that I can use
a mli-only compilation unit for classical variants:
(* classical.mli *)
type t = A
(* main.ml *)
let x = Classical.A
and then compile main.ml with
ocamlopt classical.mli main.ml
without troubles because there are no value involved in the Classical module.
Contrarily with extensible variants, this is not possible. If I have
(* ext.mli *)
type t = ..
type t+=A
(* main.ml *)
let x = Ext.A
then the command
ocamlopt ext.mli main.ml
fails with
Error: Required module `Ext' is unavailable
because the runtime value for the extension constructor Ext.A is missing.
You can also peek at both the name and the id of the extension constructor
using the Obj module to see those values
let a = [%extension_constructor A]
Obj.extension_name a;;
: string = "M.A"
Obj.extension_id a;;
: int = 144
(This id is quite brittle and its value it not particurlarly meaningful.)
An important point is that extension constructor are distinguished using their
memory location. Consequently, constructors with n arguments are implemented
as block with n+1 arguments where the first hidden argument is the extension
constructor:
type t += B of int
let x = B 0;;
Here, x contains two fields, and not one:
Obj.size (Obj.repr x);;
: int = 2
And the first field is the extension constructor B:
Obj.field (Obj.repr x) 0 == Obj.repr [%extension_constructor B];;
: bool = true
The previous statement also works for n=0: extensible variants are never
represented as a tagged integer, contrarily to classical variants.
Since marshalling does not preserve physical equality, it means that extensible
sum type cannot be marshalled without losing their identity. For instance, doing
a round trip with
let round_trip (x:'a):'a = Marshall.from_string (Marshall.to_string x []) 0
then testing the result with
type t += C
let is_c = function
| C -> true
| _ -> false
leads to a failure:
is_c (round_trip C)
: bool = false
because the round-trip allocated a new block when reading the marshalled value
This is the same problem which already existed with exceptions, since exceptions
are extensible variants.
This also means that pattern-matching on extensible type is quite different
at runtime. For instance, if I define a simple variant
type s = A of int | B of int
and define a function f as
let f = function
| A n | B n -> n
the compiler is smart enough to optimize this function to simply accessing the
the first field of the argument.
You can check with ocamlc -dlambda that the function above is represented in
the Lambda intermediary representation as:
(function param/1008 (field 0 param/1008)))
However, with extensible variants, not only we need a default pattern
type e = ..
type e += A of n | B of n
let g = function
| A n | B n -> n
| _ -> 0
but we also need to compare the argument with each extension constructor in the
match leading to a more complex lambda IR for the match
(function param/1009
(catch
(if (== (field 0 param/1009) A/1003) (exit 1 (field 1 param/1009))
(if (== (field 0 param/1009) B/1004) (exit 1 (field 1 param/1009))
0))
with (1 n/1007) n/1007)))
Finally, to conclude with an actual example of extensible variants,
in OCaml 4.08, the Format module replaced its string-based user-defined tags
with extensible variants.
This means that defining new tags looks like this:
First, we start with the actual definition of the new tags
type t = Format.stag = ..
type Format.stag += Warning | Error
Then the translation functions for those new tags are
let mark_open_stag tag =
match tag with
| Error -> "\x1b[31m" (* aka print the content of the tag in red *)
| Warning -> "\x1b[35m" (* ... in purple *)
| _ -> ""
let mark_close_stag _tag =
"\x1b[0m" (*reset *)
Installing the new tag is then done with
let enable ppf =
Format.pp_set_tags ppf true;
Format.pp_set_mark_tags ppf true;
Format.pp_set_formatter_stag_functions ppf
{ (Format.pp_get_formatter_stag_functions ppf ()) with
mark_open_stag; mark_close_stag }
With some helper function, printing with those new tags can be done with
Format.printf "This message is %a.#." error "important"
Format.printf "This one %a.#." warning "not so much"
Compared with string tags, there are few advantages:
less room for a spelling mistake
no need to serialize/deserialize potentially complex data
no mix up between different extension constructor with the same name.
chaining multiple user-defined mark_open_stag function is thus safe:
each function can only recognise their own extension constructors.

Walking OCaml tuples of arbitrary depth

I'm trying to understand better the OCaml type inference. I created this example:
let rec f t = match t with
| (l,r) -> (f l)+(f r)
| _ -> 1
and I want to apply it on any binary tuple (pair) with nested pairs, to obtain the total number of leafs. Example: f ((1,2),3)
The function f refuses to compile, because a contradiction in types at (f l): "This expression has type 'a but an expression was expected of type 'a * 'b".
Question: 'a being any type, could not also be a pair, or else be handled by the _ case? Is any method to walk tuples of arbitrary depth without converting them to other data structures, such as variants?
PS: In C++ I would solve this kind of problem by creating two template functions "f", one to handle tuples and one other types.
There is a way to do this, although I wouldn't recommend it to a new user due to the resulting complexities. You should get used to writing regular OCaml first.
That said, you can walk arbitrary types in a generic way by capturing the necessary structure as a GADT. For this simple problem it is quite easy:
type 'a ty =
| Pair : 'a ty * 'b ty -> ('a * 'b) ty
| Other : 'a ty
let rec count_leaves : type a . a -> a ty -> int =
fun a ty ->
match ty with
| Pair (ta, tb) -> count_leaves (fst a) ta + count_leaves (snd a) tb
| Other -> 1
Notice how the pattern matching on the a ty here corresponds to the pattern matching on values in your (poorly typed) example function.
More useful functions could be written with a more complete type representation, although the machinery becomes heavy and complicated once arbitrary tuples, records, sum types, etc have to be supported.
Any combination of tuples will have a value shape completely described by it's type (because there is no "choice" in the type structure) - hence the "number of leaves" question can be answered completely statically at compile-time. Once you have a function operating on such type - this function is fixed to operate on that specific type (and shape) only.
If you want to build a tree that can have different shapes (but same type - hence can be handled by same function) - you need to add variants to the mix, i.e. classic type 'a tree = Leaf of 'a | Node of 'a tree * 'a tree, or any other type that describes value with some dynamic "choice" of shape.

How SML achieve abstraction? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 8 years ago.
Improve this question
I am new in SML and it is my first time to learn a functional language. I suppose there is abstraction of SML. I have not found perfect explanation of how to achieve abstraction in SML. Does anyone can offer a explanation?
Generally speaking, there are at least two forms of "abstraction" in programming:
Abstracting the client (parameterisation)
Abstracting the implementation (encapsulation)
(If you care, these correspond to universal and existential quantification in logic and type theory.)
In ML, parameterisation can be done on two levels. Either in the small, using functions (abstraction over values) and polymorphism (abstraction over types). Note in particular that functions are first-class, so you can parameterise one function over another. For example:
fun map f [] = []
| map f (x::xs) = f x :: map f xs
abstracts list transformation over the transforming function f as well as the element types.
In the large, parameterisation can be done using the module system: a functor abstracts a whole module over another module (i.e., over both values and types). For example, you could also write the map function as a functor:
functor Mapper(type t; type u; val f : t -> u) =
struct
fun map [] = []
| map (x::xs) = f x :: map xs
end
But usually you use functors for mass abstraction, i.e., in cases where there is more than just a single function you need to parameterise.
Encapsulation is also achieved by using modules. Specifically, by sealing them, i.e., hiding details of their types behind a signature. For example, here is a (naive) implementation of integer sets:
signature INT_SET =
sig
type set
val empty : set
val add : int * set -> set
val mem : int * set -> bool
end
structure IntSet :> INT_SET = (* ':>' hides the implementation of type set *)
struct
type set = int list
val empty = []
fun add(x, s) = x::s
fun mem(x, s) = List.exists (fn y => y = x) s
end
Outside the structure IntSet, its type set is fully abstract, i.e., it cannot be interchanged with lists. That is the purpose of the so-called sealing operator :> for modules.
Both forms of abstraction can occur together. For example, in ML one would usually implement a set as a functor:
signature ORD =
sig
type t
val compare : t * t -> order
end
signature SET =
sig
type elem
type set
val empty : set
val add : elem * set -> set
val mem : elem * set -> bool
end
functor Set(Elem : ORD) :> SET where type elem = Elem.t =
struct
type elem = Elem.t
datatype set = Empty | Branch of set * elem * set
val empty = Empty
fun add(x, Empty) = Branch(Empty, x, Empty)
| add(x, Branch(l, y, r)) =
case Elem.compare(x, y) of
LESS => Branch(add(x, l), y, r)
| EQUAL => Branch(l, y, r)
| GREATER => Branch(l, y, add(x, r))
fun mem(x, Empty) = false
| mem(x, Branch(l, y, r)) =
case Elem.compare(x, y) of
LESS => mem(x, l)
| EQUAL => true
| GREATER => mem(x, r)
end
This implementation of sets works for any type for which an ordering function can be provided. Unlike the naive implementation before, it also uses a more efficient search tree as its implementation. However, that is not observable outside, because the type's implementation is again hidden.
SML programs frequently are build on a descriptive types for the problem at hand. The language then uses pattern matching to figure out what case your are working with.
datatype Shape = Circle of real | Rectangle of real*real | Square of real
val a = Circle(0.2)
val b = Square(1.3)
val c = Rectangle(4.0,2.0)
fun area (Circle(r)) = 3.14 * r * r
| area (Square(s)) = s * s
| area (Rectangle(b,h)) = b * h
Does this help to explain a little about sml?
In SML you can define "abstractions" by means of using a combination of things like algebraic data types and signatures.
Algebraic data types let you define new types specific to the problem domain and signatures let you provide functionality/behavior around those types and provide a convenient way to implement information hiding and extensibility and reusability.
Combining this things you can create "abstractions" whose implementation details are hidden from you and that you simply understand through their public interfaces (whatever the signature expose).

Accepting only one variant of sum type as OCaml function parameter

I have a large sum type originating in existing code. Let's say it looks like this:
type some_type =
| Variant1 of int
| Variant2 of int * string
Although both Variant1 and Variant2 are used elsewhere, I have a specific function that only operates on Variant2:
let print_the_string x =
match x with
| Variant2(a,s) -> print_string s; ()
| _ -> raise (Failure "this will never happen"); ()
Since this helper function is only called from one other place, it is easy to show that it will always be called with an input of Variant2, never with an input of Variant1.
Let's say the call looks like this:
let () =
print_the_string (Variant2(1, "hello\n"))
If Variant1 and Variant2 were separate types, I would expect OCaml to infer the type Variant2 -> () for print_the_string, however, since they are both variants of the same sum type, OCaml infers the signature some_type -> ().
When I encounter a program that throws an exception with a message like "this will never happen," I usually assume the original programmer did something wrong.
The current solution works, but it means that a mistake in the program would be caught at runtime, not as a compiler error as would be preferable.
Ideally, I'd like to be able to annotate the function like this:
let print_the_string (x : some_type.Variant2) =
But, of course, that's not allowed.
Question: Is there a way to cause a compiler error in any case where Variant1 was passed to print_the_string?
A related question was asked here, but nlucarioni and Thomas's answers simply address cleaner ways to handle incorrect calls. My goal is to have the program fail more obviously, not less.
Update: I'm accepting gallais's solution as, after playing with it, it seems like the cleanest way to implement something like this. Unfortunately, without a very messy wrapper, I don't believe any of the solutions work in the case where I cannot modify the original definition of some_type.
There is not enough information in your post to decide whether what follows could be useful for you. This approach is based on propagating an invariant and will play nicely if your code is invariant-respecting. Basically, if you do not have functions of type some_type -> some_type which turn values using Variant2 as their head constructor into ones constructed using Variant1 then you should be fine with this approach. Otherwise it gets pretty annoying pretty quickly.
Here we are going to encode the invariant "is built using Variant2" into the type by using
phantom types and defining some_type as a GADT. We start by declaring types whose sole purpose is to play the role of tags.
type variant2
type variantNot2
Now, we can use these types to record which constructor was used to produce a value of some_type. This is the GADT syntax in Ocaml; it's just slightly different from the ADT one in the sense that we can declare what the return type of a constructor is and different constructors can have different return types.
type _ some_type =
| Variant1 : int -> variantNot2 some_type
| Variant2 : int * string -> variant2 some_type
One could also throw in a couple of extra constructors as long as their signature records the fact their are not Variant2. I won't deal with them henceforth but you can try to extend the definitions given below so that they'll work well with these extra constructors. You can even add a print_the_second_int which will only take Variant3 and Variant4 as inputs to check that you get the idea behind this.
| Variant3 : int * int -> variantNot2 some_type
| Variant4 : float * int -> variantNot2 some_type
Now, the type of print_the_string can be extremely precise: we are only interested in elements of some_type which have been built using the constructor Variant2. In other words, the input of print_the_string should have type variant2 some_type. And the compiler can check statically that Variant2 is the only constructor possible for values of that type.
let print_the_string (x : variant2 some_type) : unit =
match x with Variant2 (_, s) -> print_string s
Ok. But what if we have a value of type 'a some_type because it was handed over to us by a client; we built it tossing a coin; etc.? Well, there's no magic there: if you want to use print_the_string, you need to make sure that this value has been built using a Variant2 constructor. You can either try to cast the value to a variant2 some_type one (but this may fail, hence the use of the option type):
let fromVariant2 : type a. a some_type -> (variant2 some_type) option = function
| Variant2 _ as x -> Some x
| Variant1 _ -> None
Or (even better!) decide in which realm the value lives:
type ('a, 'b) either = | Left of 'a | Right of 'b
let em : type a. a some_type -> (variant2 some_type, variantNot2 some_type) either =
fun x -> match x with
| Variant1 _ -> Right x
| Variant2 _ -> Left x
My solution would be to have print_the_string : int * string -> unit, since the Variant2 part provides no information you should drop it.
The type inference works toward inferring types (obviously) not values of types. But you can do what you propose with polymorphic variants. Although, I agree with Thomash.
type v1 = [ `Variant1 of int ]
type v2 = [ `Variant2 of int * string ]
let print_the_string (`Variant1 x) = ()
Gallais provided an excellent, but long answer, so I've decided to add a shorter version.
If you have a variant type and would like to add functions that works only on a subset of variants, then you can use GADTS. Consider the example:
open Core.Std
type _ t =
| Int: int -> int t
| Str: string -> string t
let str s = Str s
let uppercase (Str s) = Str (String.uppercase s)
Function uppercase has type string t -> string t and accepts only string version of type t, so you can deconstruct the variant just in place. Function str has type string -> string t, so that the return type carries in itself an information (a witness type) that the only possible variant, produced from this function is Str. So when you have a value that has such type, you can easily deconstruct it without using explicit pattern-matching, since it becomes irrefutable, i.e., it can't fail.

meaning of warning and type in ML

fun a(list) =
let
val num = length(hd(list))
fun inner(list) =
if num = length(hd(list)) then
if tl(list) = nil then true
else inner(tl(list))
else false
in
if length(hd(list))-1 = length(tl(list)) then inner(tl(list))
else false
end;
this is ml code and I got this warning and type.
stdIn:6.16 Warning: calling polyEqual
val a = fn : ''a list list -> bool
I don't understand about the warning. why it appear and the type. ''a why it has two '? ''?
what is the difference between 'a list list and ''a list list?
Excerpted from ML Hints:
Warning: calling polyEqual [may occur] whenever you use = to
compare two values with polymorphic type.
For example, fun eq(x,y) = (x = y); will cause this warning to be
generated, because x and y will have polymorphic type ''a. This
is perfectly fine and you may ignore the warning. It is not reporting
any kind of semantic error or type error in your code. The compiler
reports the warning because there can be a slight ineffeciency in how
ML tests whether two values of a polymorphic type are equal. In
particular, to perform the equality test, the run-time system must
first determine what types of values you are currently using and then
determine whether the values are equal. The first part (checking the
run-time types) can make the = test slightly slower than if the
types are known ahead of time (such as when we test 3 = 4 and know
that the = test is being applied to integers). However, that is not
something most users of ML ever need to worry about...
To answer your second question,
why it has two '? ''? what is the difference between 'a list list and
''a list list?
''a is the same as 'a, but requires it to be an equality type. An equality type in SML is a type that can be compared using =. Non-equality types cannot be compared using =. When you create a datatype, you can specify whether it is an equality type or not.
dict=val a =[("a",[1,2]),("b",[2,3])] ;
here is the code which has implementation of look up in dictionary
fun look key [] = []
| look key ((a,b)::xs) = if (key =a ) then b else look key xs ;
which gives output as
test1.sml:8.36 Warning: calling polyEqual
It is because it does not know what are the types which are compared so
the below code says that both are string type .
fun look (key:string) [] = []
| look (key:string) ((a:string,b)::xs) = if (key =a ) then b else look (key:string) xs ;