Consider something.ml:
type some_type =
| This
| That
Then, I can implement main.ml like this:
let x = Something.This
I want to create something.mli and keep the same functionality in main.ml. My first attempt was to write something.mli as:
type some_type
I thought that would make the variant constructors publicly available, but it didn't and now main.ml doesn't compile. Is there a way to expose the variant constructors in the .mli?
The .mli file defines the interface of a module all on its own; the .ml file is not used at all when compiling it. You can even have a .mli file for a pack made out of multiple .ml files. The compiler never magically pulls something from the .ml file into the interface.
Now, same as in the .ml file, there are three ways to specify a type in the .mli file:
1) As an abstract type. Nothing is exposed of the type:
# type some_type;;
type some_type
# let v = This;;
Error: Unbound constructor This
# let to_int = function This -> 1 | That -> 2;;
Error: Unbound constructor This
This hides the details of the type from the outside, allowing the module to change the type later without breaking any client code. It is also used for phantom types, which have no values, and for external values that are not OCaml values (see the chapter on interfacing C with OCaml in the manual).
2) As public type. The structure of the type is exposed and values can be created:
# type some_type = This | That;;
type some_type = This | That
# let v = This;;
val v : some_type = This
# let to_int = function This -> 1 | That -> 2;;
val to_int : some_type -> int = <fun>
This is the opposite of the first case. Everything is made public.
But there is a third option:
3) As a private type. The structure of the type is exposed but values can not be created:
# type some_type = private This | That;;
type some_type = private This | That
# let v = This;;
Error: Cannot create values of the private type some_type
# let to_int = function This -> 1 | That -> 2;;
val to_int : some_type -> int = <fun>
This is somewhere between 1 and 2. The use case for this is when you need to control the construction of values. For example, consider a type that holds small non-negative integers less than 100. You would write:
# let make x =
if x < 0 || x >= 100
then raise (Invalid_argument "Out of range")
else x;;
val make : int -> int = <fun>
You then write the .mli file as:
type t = private int;;
val make : int -> t;;
This ensures that values of type t can only be constructed using the make function. Anything expecting a type t will only accept a value of type t constructed by make. On the other hand, a value of type t can be used wherever an int is expected after an explicit coercion such as (x :> int). The latter would not be possible with an abstract type.
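Put together in one place, the small-integer example might look like the sketch below. The module name Small is an assumption made for illustration, and the inline signature stands in for the .mli file.

```ocaml
(* Hypothetical module name; the sig ... end part plays the role of the .mli *)
module Small : sig
  type t = private int
  val make : int -> t
end = struct
  type t = int
  let make x =
    if x < 0 || x >= 100 then invalid_arg "Out of range" else x
end

(* Values of type Small.t can only come from Small.make *)
let five = Small.make 5

(* Reading a Small.t as an int needs an explicit coercion *)
let six = (five :> int) + 1

(* let bad : Small.t = 5   is rejected: an int is not a Small.t *)
```

The coercion has no runtime cost here, since a Small.t is represented as a plain int.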
The something.mli file gives the interface for the something.ml file. So anything you want to be visible in the interface has to be defined in something.mli.
Since you want This and That to be visible, they have to be defined in something.mli.
For your small example, something.mli would contain exactly what you show for something.ml above:
type some_type = This | That
In a more realistic example, of course, the interface would contain much less than the implementation. In particular it just has the types of public functions and not the code.
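The same rule can be tried out in a single file by using an inline signature in place of something.mli. This is only a sketch of the idea; the to_int function is a hypothetical client added for illustration.

```ocaml
(* The signature repeats the full type definition, constructors included *)
module Something : sig
  type some_type = This | That
end = struct
  type some_type = This | That
end

(* Client code can now build and match on the constructors *)
let x = Something.This

let to_int = function
  | Something.This -> 1
  | Something.That -> 2
```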
I took a course on OCaml before extensible variant types were introduced, and I don't know much about them. I have several questions:
(This question was deleted because it attracted a "not answerable objectively" close vote.)
What are the low-level consequences of using EVTs, such as performance, memory representation, and (un-)marshaling?
Note that my question is about extensible variant type specifically, unlike the question suggested as identical to this one (that question was asked prior to the introduction of EVTs!).
Extensible variants are quite different from standard variants in terms of
runtime behavior.
In particular, extension constructors are runtime values that live inside
the module where they were defined. For instance, in
type t = ..
module M = struct
type t += A
end
open M
the definition type t += A defines a new extension constructor value A and adds it to the
existing extension constructors of M at runtime.
By contrast, classical variants do not really exist at runtime.
It is possible to observe this difference by noticing that I can use
a mli-only compilation unit for classical variants:
(* classical.mli *)
type t = A
(* main.ml *)
let x = Classical.A
and then compile main.ml with
ocamlopt classical.mli main.ml
without trouble because there are no values involved in the Classical module.
By contrast, with extensible variants this is not possible. If I have
(* ext.mli *)
type t = ..
type t += A
(* main.ml *)
let x = Ext.A
then the command
ocamlopt ext.mli main.ml
fails with
Error: Required module `Ext' is unavailable
because the runtime value for the extension constructor Ext.A is missing.
You can also peek at both the name and the id of the extension constructor
using the Obj module to see those values
let a = [%extension_constructor A]
Obj.extension_name a;;
: string = "M.A"
Obj.extension_id a;;
: int = 144
(This id is quite brittle and its value is not particularly meaningful.)
An important point is that extension constructors are distinguished using their
memory location. Consequently, constructors with n arguments are implemented
as blocks with n+1 fields where the first hidden field is the extension
constructor:
type t += B of int
let x = B 0;;
Here, x contains two fields, and not one:
Obj.size (Obj.repr x);;
: int = 2
And the first field is the extension constructor B:
Obj.field (Obj.repr x) 0 == Obj.repr [%extension_constructor B];;
: bool = true
The previous statement also works for n=0: extensible variants are never
represented as a tagged integer, contrarily to classical variants.
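The representation difference can be checked side by side with a classical variant. The sketch below uses only the Obj module from the standard library; the type and constructor names are made up for illustration, and Obj should of course stay out of production code.

```ocaml
(* Hypothetical types, one classical and one extensible *)
type classic = K0 | K1 of int
type ext = ..
type ext += E0 | E1 of int

(* A constant classical constructor is an immediate integer ... *)
let classic_constant_is_immediate = Obj.is_int (Obj.repr K0)

(* ... while a constant extension constructor is a heap block *)
let ext_constant_is_block = Obj.is_block (Obj.repr E0)

(* One argument: one field for the classical variant, two for the extensible one *)
let classic_size = Obj.size (Obj.repr (K1 0))
let ext_size = Obj.size (Obj.repr (E1 0))

(* The extra first field is the extension constructor itself *)
let first_field_is_constructor =
  Obj.field (Obj.repr (E1 0)) 0 == Obj.repr [%extension_constructor E1]
```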
Since marshalling does not preserve physical equality, extensible
sum types cannot be marshalled without losing their identity. For instance, doing
a round trip with
let round_trip (x:'a):'a = Marshal.from_string (Marshal.to_string x []) 0
then testing the result with
type t += C
let is_c = function
| C -> true
| _ -> false
leads to a failure:
is_c (round_trip C)
: bool = false
because the round-trip allocated a new block when reading the marshalled value.
This is the same problem which already existed with exceptions, since exceptions
are extensible variants.
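The exception case can be reproduced with the standard library alone. In this sketch the exception name Stop and the helper is_stop are hypothetical; round_trip is the same marshalling round trip as above, written against the Marshal module.

```ocaml
(* Hypothetical exception used only for this demonstration *)
exception Stop

(* Serialize a value to a string and read it back *)
let round_trip (x : 'a) : 'a =
  Marshal.from_string (Marshal.to_string x []) 0

let is_stop = function
  | Stop -> true
  | _ -> false

(* The original value is recognised by the pattern ... *)
let before = is_stop Stop

(* ... but the unmarshalled copy is a different block and no longer matches *)
let after = is_stop (round_trip Stop)
```

This is why exceptions read back from a marshalled value should not be matched against their constructors.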
This also means that pattern-matching on extensible type is quite different
at runtime. For instance, if I define a simple variant
type s = A of int | B of int
and define a function f as
let f = function
| A n | B n -> n
the compiler is smart enough to optimize this function to simply accessing the
first field of the argument.
You can check with ocamlc -dlambda that the function above is represented in
the Lambda intermediary representation as:
(function param/1008 (field 0 param/1008))
However, with extensible variants, not only do we need a default pattern
type e = ..
type e += A of int | B of int
let g = function
| A n | B n -> n
| _ -> 0
but we also need to compare the argument with each extension constructor in the
match leading to a more complex lambda IR for the match
(function param/1009
(catch
(if (== (field 0 param/1009) A/1003) (exit 1 (field 1 param/1009))
(if (== (field 0 param/1009) B/1004) (exit 1 (field 1 param/1009))
0))
with (1 n/1007) n/1007))
Finally, to conclude with an actual example of extensible variants,
in OCaml 4.08, the Format module replaced its string-based user-defined tags
with extensible variants.
This means that defining new tags looks like this:
First, we start with the actual definition of the new tags
type t = Format.stag = ..
type Format.stag += Warning | Error
Then the translation functions for those new tags are
let mark_open_stag tag =
match tag with
| Error -> "\x1b[31m" (* aka print the content of the tag in red *)
| Warning -> "\x1b[35m" (* ... in purple *)
| _ -> ""
let mark_close_stag _tag =
"\x1b[0m" (*reset *)
Installing the new tag is then done with
let enable ppf =
Format.pp_set_tags ppf true;
Format.pp_set_mark_tags ppf true;
Format.pp_set_formatter_stag_functions ppf
{ (Format.pp_get_formatter_stag_functions ppf ()) with
mark_open_stag; mark_close_stag }
With some helper functions (error and warning, not shown here), printing with those new tags can be done with
Format.printf "This message is %a.@." error "important"
Format.printf "This one %a.@." warning "not so much"
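The helper functions error and warning are not shown in the answer. One possible way to write them, assuming OCaml 4.08 or later, is sketched below; with_stag and render are hypothetical names, and the output goes to a buffer only so that the result can be inspected.

```ocaml
type Format.stag += Warning | Error

let mark_open_stag = function
  | Error -> "\x1b[31m"    (* red *)
  | Warning -> "\x1b[35m"  (* purple *)
  | _ -> ""

let mark_close_stag _ = "\x1b[0m"  (* reset *)

let enable ppf =
  Format.pp_set_mark_tags ppf true;
  Format.pp_set_formatter_stag_functions ppf
    { (Format.pp_get_formatter_stag_functions ppf ()) with
      Format.mark_open_stag = mark_open_stag;
      mark_close_stag = mark_close_stag }

(* Hypothetical helper: print a string inside the given semantic tag *)
let with_stag stag ppf s =
  Format.pp_open_stag ppf stag;
  Format.pp_print_string ppf s;
  Format.pp_close_stag ppf ()

let error ppf s = with_stag Error ppf s
let warning ppf s = with_stag Warning ppf s

(* Render one message into a string so the markers are visible *)
let render () =
  let buf = Buffer.create 64 in
  let ppf = Format.formatter_of_buffer buf in
  enable ppf;
  Format.fprintf ppf "This message is %a." error "important";
  Format.pp_print_flush ppf ();
  Buffer.contents buf
```

On a terminal that understands ANSI escape codes, the word between the two markers shows up in red.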
Compared with string tags, there are a few advantages:
- less room for a spelling mistake
- no need to serialize/deserialize potentially complex data
- no mix-up between different extension constructors with the same name;
  chaining multiple user-defined mark_open_stag functions is thus safe:
  each function can only recognise its own extension constructors.
I am trying to extend a functor in OCaml.
For example, assume the following functor X:
module type X = functor (A : ModuleA) -> I with type t := A.t
I am trying to create a similar functor Y that also accepts A : ModuleA but returns an extended version of I.
I am trying something like:
module type Y = functor (A : ModuleA) ->
sig
include X(A)
val blah : A.t -> int
end
But I get a syntax error on this.
I am trying to extend the resulting signature from X with more functions. Is this possible in OCaml? What am I doing wrong?
Thanks!
EDIT:
I guess my question is: why don't functors behave the same way for modules and module types?
The functor X above returns a module type (or at least that's how I read that expression). If this expression is allowed, then why does OCaml forbid extending the resulting module type?
Unfortunately, to my knowledge this is not possible. You will have to do
module type Y = functor (A : ModuleA) ->
sig
include I with type t := A.t
val blah : A.t -> int
end
Hopefully someone else can elaborate on why the feature you were trying to use is not implemented. Possibly there is a good reason.
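For completeness, here is a self-contained sketch of the include I with type t := A.t form. The question does not show ModuleA or I, so the definitions below, as well as Make_y and Int_a, are assumptions made up for the example.

```ocaml
(* Hypothetical stand-ins for the question's ModuleA and I *)
module type ModuleA = sig
  type t
  val to_int : t -> int
end

module type I = sig
  type t
  val id : t -> t
end

(* The extended functor type: everything from I, plus blah *)
module type Y = functor (A : ModuleA) -> sig
  include I with type t := A.t
  val blah : A.t -> int
end

(* A hypothetical functor implementing Y *)
module Make_y : Y = functor (A : ModuleA) -> struct
  let id x = x
  let blah = A.to_int
end

module Int_a = struct
  type t = int
  let to_int x = x
end

module M = Make_y (Int_a)

let three = M.blah (M.id 3)
```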
EDIT:
If you already have a module XX of type X (an instance), you can do
module type Y = functor (A : ModuleA) ->
sig
include module type of XX(A)
val blah : A.t -> int
end
I am currently working with OCaml, and I want to create some types which are somehow secured, in the sense that I want to select only those instances which satisfy some properties.
The way that I found to achieve that is to encapsulate my type in a module, making it private, and defining the constructors in such a way that they check whether the object that they are trying to make satisfies these properties. As my code is a bit long, I want to split it into different modules, but my types are mutually recursive, so I am using recursive modules. I end up in the following situation (I simplified a lot so that it becomes readable):
module rec A
: sig
type t = private int list
val secured_cons : int -> t -> t
end
= struct
type t = int list
let cons (i:int) (x:t) : t = i::x
let secured_cons i x : t = B.checker i x; cons i x
end
and B
: sig
val checker : int -> A.t -> unit
end
= struct
let checker i x = ()
end
But this code is rejected, with the following error message :
Characters 226-227:
let secured_cons i x = B.checker i x; cons i x
^
Error: This expression has type A.t but an expression was expected of type
t = int list
This looks very weird to me, because as we are in the context of A, the two types t and A.t are supposed to be equal. From my understanding, what happens is that inside A, the type t is considered to be a synonym for int list, whereas outside A, the signature tells us that it is private, so it is just a copy of this type, with a coercion A.t :> int list. The entire point is that there is no coercion the other way around, which is exactly why I want to use private type abbreviations.
But in my case I am inside the module A, so I would like to use this extra information to say that my type t should coerce to A.t
Does anyone have a better explanation of why this error is happening, and how I could avoid it? (I have thought of switching to abstract types, but I get exactly the same error)
I have found a way to solve this issue. I am posting it here in case anyone else ever encounters the same problem.
We just have to explicitly specify which types and coercions we expect from the system. Here is my example, slightly modified so that it is correct:
module rec A
: sig
type t = private int list
val secured_cons : int -> t -> t
end
= struct
type t = int list
let cons (i:int) (x:t) : t = i::x
let secured_cons i (x:A.t) = B.checker i x; cons i (x :> t)
end
and B
: sig
val checker : int -> A.t -> unit
end
= struct
let checker i x = ()
end
It might look silly to write let secured_cons i (x:A.t) inside the module A itself, but as far as I understand it, it is the only way to tell the system that it should go out of the module to check the signature, and use the same type as the signature (so here a private type) instead of the internal type t, which is still a synonym for int list.
I had trickier cases, but this idea could be adapted to each of them, and helped me solve them all.
Still, I am not entirely sure of what is happening, and if anyone has clearer explanations, I would be very thankful.
Your error occurs because, when B.checker is invoked, x is inferred as an A.t because of the signature of B.
You can see that easily if you explicitly type the secured_cons function :
let secured_cons i (x:t) : t = B.checker i x; cons i x
which now produces the symmetrical error:
let secured_cons i (x:t) = B.checker i x; cons i x
^
Error: This expression has type t = int list
but an expression was expected of type A.t
In fact, in my opinion you have a real design problem here. If you want the module B to check the values produced by the module A, then unsurprisingly B must inspect the type A.t in some way. Having that type private makes it impossible.
From what I understand, you have three options:
- remove private
- add a browse or getter function that allows the B module to access the content of the values of type A.t
- the way I would do this: put the checking function into the module A
I'd be glad to hear what more experienced users have to say about this, but here is my take on it.
I, as a developer, usually give a lot of importance to the semantics of a code. In your case, the B module is specifically used by the A module, and it has no other goal than that.
Thus, sticking to a nested module (even if it makes your code a bit longer) would be the way to go as far as I am concerned. There is no point in exposing the B module. Below is the refactored example to illustrate.
module A : sig
type t
val secured_cons : int -> t -> t
end = struct
type t = int list
module B : sig
val checker : int -> t -> unit
end = struct
let checker i x = ()
end
let cons i x = i::x
let secured_cons i x = B.checker i x; cons i x
end
And here is the signature of the module as given by utop:
module A : sig type t val secured_cons : int -> t -> t end
which is perfect in my sense because it only shows the interface to your module, and nothing of its implementation.
As a side-note, if you wanted to expose the signature of the B module (to give it to a functor, for example), just move it to the signature of the A module, as follows:
module A : sig
type t
val secured_cons : int -> t -> t
module B : sig
val checker : int -> t -> unit
end
end = struct
type t = int list
module B = struct
let checker i x = ()
end
let cons i x = i::x
let secured_cons i x = B.checker i x; cons i x
end;;
Here is the signature of the module as given by utop:
module A :
sig
type t
val secured_cons : int -> t -> t
module B : sig val checker : int -> t -> unit end
end
Still I am not entirely sure of what is happening, and if anyone has clearer explanations, I would be very thankful
A private type abbreviation of the form type u = private t declares a type u that is distinct from the implementation type t. It is the same as declaring an abstract type, with the following two exceptions:
- the compiler knows that u is implemented as t, which opens an avenue for optimizations; this, however, does not mean that the typechecker considers them the same: for the typechecker they are distinct.
- the typechecker allows a coercion from type u to type t.
So, from the typechecker's perspective, these two types are distinct. As always in OCaml's type discipline, all coercions must be made explicit, and a subtype is not equal to a supertype unless it is coerced. In your case, the typechecker is trying to unify type A.t = private int list with type int list; since A.t and int list are distinct types, the unification is rejected. It is, however, allowed to coerce A.t to int list (but not vice versa).
It might look silly to write let secured_cons i (x:A.t) inside the module A itself
You don't need to write it (at least in your simple example). Just using x :> t is enough.
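A version of the example that relies only on the coercion might look like the sketch below. The nil function and the body of checker are assumptions added so that the code can actually be exercised; they are not part of the original question.

```ocaml
module rec A : sig
  type t = private int list
  val nil : unit -> t                 (* hypothetical: a way to get a first value *)
  val secured_cons : int -> t -> t
end = struct
  type t = int list
  let nil () : t = []
  let cons (i : int) (x : t) : t = i :: x
  (* x is inferred as A.t from B.checker, then coerced to the local t *)
  let secured_cons i x = B.checker i x; cons i (x :> t)
end

and B : sig
  val checker : int -> A.t -> unit
end = struct
  (* Hypothetical check: refuse negative elements *)
  let checker i (_ : A.t) = if i < 0 then invalid_arg "negative element"
end

let v = A.secured_cons 2 (A.secured_cons 1 (A.nil ()))

(* Outside of A, reading the list also goes through a coercion *)
let as_list = (v :> int list)
```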
I would like to represent some scalar value (e.g. integers or strings)
by either its real value or by some NA value and later store them
in a collection (e.g. a list). The purpose is to handle missing values.
To do this, I have implemented a signature
module type Scalar = sig
type t
type v = Value of t | NA
end
Now I have some polymorphic Vector type in mind that contains Scalars. Basically, some of the following
module Make_vector(S: Scalar) = struct
type t = S.v list
... rest of the functor ...
end
However, I cannot get this to work. I would like to do something like
module Int_vector = Make_vector(
struct
type t = int
end
)
module Str_vector = Make_vector(
struct
type t = string
end
)
... and so on for some types.
I have not yet worked a lot with OCaml so maybe this is not the right way. Any advice on how to realize such a polymorphic Scalar with a sum type?
The compiler always responds with the following message:
The parameter cannot be eliminated in the result type.
Please bind the argument to a module identifier.
Before, I have tried to implement Scalar as a sum type but ran into
complexity issues when realizing some features due to huge match clauses. Another (imo not so nice) option would be to use option. Is this a better strategy?
As far as I can see, you are structuring v as an input type to your functor, but you really want it to be an output type. Then when you apply the functor, you supply only the type t but not v. My suggestion is to move the definition of v into your implementation of Make_vector.
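A minimal sketch of that suggestion follows. The count_missing function and the Int_scalar module are hypothetical additions; the argument is bound to a module identifier, as the compiler message asks, rather than passed as an anonymous structure.

```ocaml
module type Scalar = sig
  type t
end

module Make_vector (S : Scalar) = struct
  (* v is now produced by the functor instead of being required from S *)
  type v = Value of S.t | NA
  type t = v list

  (* Hypothetical operation: count the missing entries *)
  let count_missing (xs : t) =
    List.fold_left
      (fun n x -> match x with NA -> n + 1 | Value _ -> n)
      0 xs
end

(* Bind the argument to a named module before applying the functor *)
module Int_scalar = struct type t = int end
module Int_vector = Make_vector (Int_scalar)

module Str_scalar = struct type t = string end
module Str_vector = Make_vector (Str_scalar)

let n_missing = Int_vector.(count_missing [Value 1; NA; Value 3; NA])
let s_missing = Str_vector.(count_missing [Value "foo"; NA])
```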
What exactly are you trying to do with modules / functors? Why is a simple 'a option list not good enough? You can have functions operating on it, e.g.
let rec count_missing ?acc:(acc=0) = function
| None::tail -> count_missing ~acc:(acc+1) tail
| _::tail -> count_missing ~acc tail
| [] -> acc ;;
val count_missing : ?acc:int -> 'a option list -> int = <fun>
count_missing [None; Some 1; None; Some 2] ;;
- : int = 2
count_missing [Some "foo"; None; Some "bar"] ;;
- : int = 1
I have a large sum type originating in existing code. Let's say it looks like this:
type some_type =
| Variant1 of int
| Variant2 of int * string
Although both Variant1 and Variant2 are used elsewhere, I have a specific function that only operates on Variant2:
let print_the_string x =
match x with
| Variant2(a,s) -> print_string s; ()
| _ -> raise (Failure "this will never happen"); ()
Since this helper function is only called from one other place, it is easy to show that it will always be called with an input of Variant2, never with an input of Variant1.
Let's say the call looks like this:
let () =
print_the_string (Variant2(1, "hello\n"))
If Variant1 and Variant2 were separate types, I would expect OCaml to infer the type Variant2 -> unit for print_the_string; however, since they are both variants of the same sum type, OCaml infers the signature some_type -> unit.
When I encounter a program that throws an exception with a message like "this will never happen," I usually assume the original programmer did something wrong.
The current solution works, but it means that a mistake in the program would be caught at runtime, not as a compiler error as would be preferable.
Ideally, I'd like to be able to annotate the function like this:
let print_the_string (x : some_type.Variant2) =
But, of course, that's not allowed.
Question: Is there a way to cause a compiler error in any case where Variant1 was passed to print_the_string?
A related question was asked here, but nlucarioni and Thomas's answers simply address cleaner ways to handle incorrect calls. My goal is to have the program fail more obviously, not less.
Update: I'm accepting gallais's solution as, after playing with it, it seems like the cleanest way to implement something like this. Unfortunately, without a very messy wrapper, I don't believe any of the solutions work in the case where I cannot modify the original definition of some_type.
There is not enough information in your post to decide whether what follows could be useful for you. This approach is based on propagating an invariant and will play nicely if your code is invariant-respecting. Basically, if you do not have functions of type some_type -> some_type which turn values using Variant2 as their head constructor into ones constructed using Variant1 then you should be fine with this approach. Otherwise it gets pretty annoying pretty quickly.
Here we are going to encode the invariant "is built using Variant2" into the type by using
phantom types and defining some_type as a GADT. We start by declaring types whose sole purpose is to play the role of tags.
type variant2
type variantNot2
Now, we can use these types to record which constructor was used to produce a value of some_type. This is the GADT syntax in OCaml; it's just slightly different from the ADT one in the sense that we can declare what the return type of a constructor is and different constructors can have different return types.
type _ some_type =
| Variant1 : int -> variantNot2 some_type
| Variant2 : int * string -> variant2 some_type
One could also throw in a couple of extra constructors as long as their signature records the fact that they are not Variant2. I won't deal with them henceforth but you can try to extend the definitions given below so that they'll work well with these extra constructors. You can even add a print_the_second_int which will only take Variant3 and Variant4 as inputs to check that you get the idea behind this.
| Variant3 : int * int -> variantNot2 some_type
| Variant4 : float * int -> variantNot2 some_type
Now, the type of print_the_string can be extremely precise: we are only interested in elements of some_type which have been built using the constructor Variant2. In other words, the input of print_the_string should have type variant2 some_type. And the compiler can check statically that Variant2 is the only constructor possible for values of that type.
let print_the_string (x : variant2 some_type) : unit =
match x with Variant2 (_, s) -> print_string s
Ok. But what if we have a value of type 'a some_type because it was handed over to us by a client; we built it tossing a coin; etc.? Well, there's no magic there: if you want to use print_the_string, you need to make sure that this value has been built using a Variant2 constructor. You can either try to cast the value to a variant2 some_type one (but this may fail, hence the use of the option type):
let fromVariant2 : type a. a some_type -> (variant2 some_type) option = function
| Variant2 _ as x -> Some x
| Variant1 _ -> None
Or (even better!) decide in which realm the value lives:
type ('a, 'b) either = | Left of 'a | Right of 'b
let em : type a. a some_type -> (variant2 some_type, variantNot2 some_type) either =
fun x -> match x with
| Variant1 _ -> Right x
| Variant2 _ -> Left x
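To see how a caller might use em, here is a self-contained sketch that repeats the definitions above. The functions the_string and describe are hypothetical; the_string returns the string instead of printing it so that the result is easy to check.

```ocaml
type variant2
type variantNot2

type _ some_type =
  | Variant1 : int -> variantNot2 some_type
  | Variant2 : int * string -> variant2 some_type

type ('a, 'b) either = Left of 'a | Right of 'b

let em : type a. a some_type -> (variant2 some_type, variantNot2 some_type) either =
  fun x -> match x with
  | Variant1 _ -> Right x
  | Variant2 _ -> Left x

(* Only values built with Variant2 are accepted here *)
let the_string (x : variant2 some_type) : string =
  match x with Variant2 (_, s) -> s

(* Hypothetical caller that receives a value with an unknown tag *)
let describe : type a. a some_type -> string = fun x ->
  match em x with
  | Left v2 -> the_string v2
  | Right _ -> "<no string>"

(* the_string (Variant1 3)   would be rejected by the typechecker *)
```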
My solution would be to have print_the_string : int * string -> unit, since the Variant2 part provides no information you should drop it.
The type inference works toward inferring types (obviously) not values of types. But you can do what you propose with polymorphic variants. Although, I agree with Thomash.
type v1 = [ `Variant1 of int ]
type v2 = [ `Variant2 of int * string ]
let print_the_string (`Variant2 (_, s) : v2) = print_string s
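A slightly fuller sketch of the polymorphic variant approach, including how a value of the combined type can be narrowed before the call, could look as follows. The names the_string and describe are hypothetical, and some_type is redefined here as the union of the two smaller types.

```ocaml
type v1 = [ `Variant1 of int ]
type v2 = [ `Variant2 of int * string ]
type some_type = [ v1 | v2 ]

(* Accepts only the `Variant2 case; the pattern is exhaustive for v2 *)
let the_string (`Variant2 (_, s) : v2) = s

(* A value of the full type has to be narrowed with a #v2 pattern first *)
let describe (x : some_type) =
  match x with
  | #v2 as y -> the_string y
  | #v1 -> "<no string>"

(* the_string (`Variant1 3)   would be rejected by the typechecker *)
```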
Gallais provided an excellent, but long answer, so I've decided to add a shorter version.
If you have a variant type and would like to add functions that work only on a subset of variants, then you can use GADTs. Consider the example:
open Core.Std
type _ t =
| Int: int -> int t
| Str: string -> string t
let str s = Str s
let uppercase (Str s) = Str (String.uppercase s)
Function uppercase has type string t -> string t and accepts only the string version of type t, so you can deconstruct the variant in place. Function str has type string -> string t, so the return type carries a piece of information (a witness type) that the only possible variant produced by this function is Str. So when you have a value of that type, you can deconstruct it without a full pattern match, since the pattern becomes irrefutable, i.e., it can't fail.
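The same idea can be tried without Core. The sketch below is an adaptation that assumes only the standard library, so String.uppercase_ascii replaces Core's String.uppercase; the contents function is a hypothetical addition used to read the result back.

```ocaml
type _ t =
  | Int : int -> int t
  | Str : string -> string t

let str s = Str s

(* The pattern Str s is irrefutable for a string t *)
let uppercase (Str s) = Str (String.uppercase_ascii s)

(* Hypothetical accessor, also defined only on the string case *)
let contents (Str s) = s

let shout = contents (uppercase (str "hello"))

(* uppercase (Int 3)   would be rejected by the typechecker *)
```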