I'm trying to define an exception in OCaml that accepts a pair of lists (a tuple) as its argument. However, the following doesn't work:
# exception Foo of string list * string list;;
exception Foo of string list * string list
# let bar = (["a"], ["b"; "c"; "d"]);;
val bar : string list * string list = (["a"], ["b"; "c"; "d"])
# raise(Foo bar);;
Error: The constructor Foo expects 2 argument(s),
but is applied here to 1 argument(s)
However, if I do this, it works
# raise (Foo (["a"], ["b"; "c"; "d"]));;
Exception: Foo (["a"], ["b"; "c"; "d"]).
What's the deal? Thanks!
You're looking at this wrong (though I won't blame you: it's pretty surprising at first). It may seem to you that constructors follow the syntax Name of type where the type part follows normal type syntax (which lets it contain tuples).
In reality, tuples and constructors follow the exact same syntax: a constructor is merely a tuple with a name in front of it:
tuple/constructor == [name of] type [* type] [* type] ...
So, the * in a constructor definition is not part of the tuple syntax; it is part of the constructor syntax. You're literally defining a constructor as being "this name, followed by N arguments" as opposed to "this name, followed by an argument which is a tuple".
The reason behind this subtle difference in behavior is one of performance. Right now, tuples and constructors are represented in memory as such:
[TYPE] [POINTER] [POINTER] [POINTER]
This is a fairly compact and efficient representation. If the multiple arguments of a constructor could indeed be accessed as a tuple, this would require the runtime to represent that tuple independently from the constructor (in order for it to be independently addressable) and so it would look like this:
[TYPE] [POINTER]
|
v
[TYPE] [POINTER] [POINTER] [POINTER]
This would use marginally more memory, require twice as many allocations when using a constructor, and reduce the performance of pattern-matching tuples (because of an additional dereference). In order to retain maximum performance, the name of type * type is represented using the first pattern, and you need to explicitly type name of (type * type) to cut off the * from the of and thus fall back on the second pattern.
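The two layouts can be observed from within OCaml, at least on the standard native and bytecode runtimes. The following is only a sketch: the type and constructor names are made up for illustration, and `Obj` is an unsafe, implementation-specific module that should not appear in ordinary code.

```ocaml
(* Hypothetical type, used only to compare the two layouts. *)
type t =
  | Flat of int * int      (* two arguments, stored directly in the block *)
  | Boxed of (int * int)   (* one argument: a pointer to a separate tuple *)

(* Obj.size returns the number of fields in a heap block. *)
let flat_fields = Obj.size (Obj.repr (Flat (1, 2)))
let boxed_fields = Obj.size (Obj.repr (Boxed (1, 2)))
```

Here `flat_fields` is 2 (both integers live in the constructor's block) and `boxed_fields` is 1 (a single pointer to the tuple).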
Note that both patterns are accessed through the same pattern-matching and construction syntax: name (arg,arg). This means that type inference cannot deduce the pattern based on usage. This is no problem for normal constructors, which are always defined in a type definition, but it causes polymorphic variants (which need no preliminary definition) to automatically fall back on the second version.
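As a small illustration of that last point, a polymorphic variant tag accepts an existing tuple as its argument, unlike an ordinary constructor declared with `of int * int`. The tag name and bindings below are made up for the example.

```ocaml
(* A tuple built separately from the variant. *)
let pair = (1, 2)

(* Accepted: the argument of a polymorphic variant tag is a real tuple. *)
let v = `Pair pair

(* The usual constructor-style pattern still works on it. *)
let sum = function `Pair (a, b) -> a + b

let total = sum v
```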
Additional reading on the memory representation of types here.
In this respect, OCaml's exception constructors are just like ordinary constructors:
Constr(1,2,3) is a special syntactic construct in which no triple occurs. On the other hand, a triple occurs in Constr((1,2,3)). The implementation matches this behavior, with Constr(1,2,3) being allocated as a single block, and Constr((1,2,3)) as two blocks, one containing a pointer to the other (the triple). In the run-time representation of Constr(1,2,3) there is no triple to get a pointer to, and if you need one you have to allocate a fresh one.
Note: Constr(((1,2,3))) is equivalent to Constr((1,2,3)). In Constr(((1,2,3))), the middle parentheses are interpreted as going around the expression (1,2,3), and parentheses around an expression are forgotten in the abstract syntax tree.
Foo is a constructor of an exception with 2 parameters. You'd have to decompose the tuple and pass each part into it.
# exception Foo of string list * string list;;
exception Foo of string list * string list
# let bar = (["a"], ["b"; "c"; "d"]);;
val bar : string list * string list = (["a"], ["b"; "c"; "d"])
# let a, b = bar in raise (Foo (a, b));;
Exception: Foo (["a"], ["b"; "c"; "d"]).
If you wish to use a tuple as the single parameter, you must define the exception using parens and pass the tuple in.
# exception Foo of (string list * string list);;
exception Foo of (string list * string list)
# let bar = (["a"], ["b"; "c"; "d"]);;
val bar : string list * string list = (["a"], ["b"; "c"; "d"])
# raise (Foo bar);;
Exception: Foo (["a"], ["b"; "c"; "d"]).
I am new to OCaml and such languages.
I have been experimenting with Map and ended up with:
type thing = ThingA | ThingB
module ThingMap = Map.Make(String)
let things = [
("a", ThingA);
("b", ThingB)
] |> List.to_seq
|> ThingMap.of_seq
So far so good, I've got a map of strings to things.
Then I was reading that Map.Make(String) only enforces the type of the keys, the values could be any type.
I found this recipe for enforcing that values should be of thing type:
let m: thing ThingMap.t ref = ref ThingMap.empty
The recipe seems useful but I don't understand how it works.
Three questions:
what does m represent? it has type thing ThingMap.t ref but that doesn't mean much to me... is it useful for anything? can I just rename it to _?
does declaring this in a module enforce the constraint for all instances of ThingMap in any module? (is "instances" the right word?)
what's up with the ref and empty? how does this work?
Update
After experimenting a bit more with the benefit of @Jeffrey Scofield's helpful answer I can see that this recipe is not needed in my case (initialising an immutable Map from a list of key-value pairs).
First I observe that I get a type error if subsequent elements of the list do not have same type as the first element:
let things = [
("a", ThingA);
("b", 123)
] |> List.to_seq
|> ThingMap.of_seq
Error: This expression has type int but an expression was expected of type
thing
Cool, but that's not enough because if the "wrong" element is in the first position I get the "wrong" type error for the constraint I intended:
let things = [
("a", 123);
("b", ThingB)
] |> List.to_seq
|> ThingMap.of_seq
Error: This expression has type thing but an expression was expected of type
int
I can fix this by understanding that the part after the colon in let m: thing ThingMap.t ... is a type signature, so I can get the result I want by adding a signature to things:
let things: thing ThingMap.t = [
("a", 123);
("b", ThingB)
] |> List.to_seq
|> ThingMap.of_seq
Error: This expression has type int but an expression was expected of type
thing
Here the first item has wrong type, but I get the type error I wanted.
m is a global variable (a name) that is declared by the let. This is not a sweeping declaration that all values of type ThingMap.t contain values of type thing. It's just a declaration of one such map (accessed through a reference). You could later use ThingMap.t with values of any type.
# let m : thing ThingMap.t ref = ref ThingMap.empty;;
val m : thing ThingMap.t ref = {contents = <abstr>}
# let n : int ThingMap.t ref = ref ThingMap.empty;;
val n : int ThingMap.t ref = {contents = <abstr>}
Here I declared a value m, a reference to a map from strings to things. Then I declared a value n, a reference to a map from strings to ints.
The left side of let is a pattern. You can certainly use _ for this pattern but then you would have no name by which to refer to your map.
This declaration doesn't enforce any constraints on anything except the specific variable m.
If you don't know what references are, you should learn about them. In essence a reference is a mutable cell that always contains a value of one type, but it can be modified to contain different values of the type. (On the other hand, it's good when starting with OCaml to try to code without using references.)
ThingMap.empty is a map with no keys (and hence no values).
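To make the roles of ref and empty concrete, here is a minimal sketch of how such a reference might be used. The type and module definitions repeat the ones from the question; the keys and the name `size` are illustrative.

```ocaml
type thing = ThingA | ThingB

module ThingMap = Map.Make (String)

(* A mutable cell that initially holds the empty map. *)
let m : thing ThingMap.t ref = ref ThingMap.empty

(* ThingMap.add returns a new map; := stores it back into the cell,
   and !m reads the cell's current contents. *)
let () = m := ThingMap.add "a" ThingA !m
let () = m := ThingMap.add "b" ThingB !m

let size = ThingMap.cardinal !m
```

The maps themselves stay immutable; only the cell m changes which map it points to.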
If you really want to control the values used in ThingMap, you can define it in a module that specifies types for the functions that access the map. Once you've defined ThingMap.t as a visible type, there's no way to change its meaning to make it more restricted.
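One way this could look is sketched below. The module name `ThingStore` and its small set of functions are hypothetical; the idea is only that the signature hides the underlying map type, so callers can store nothing but thing values.

```ocaml
type thing = ThingA | ThingB

module StringMap = Map.Make (String)

(* Hypothetical wrapper: t is abstract, so the only way to build or
   query a store is through functions whose types mention thing. *)
module ThingStore : sig
  type t
  val empty : t
  val add : string -> thing -> t -> t
  val find_opt : string -> t -> thing option
end = struct
  type t = thing StringMap.t
  let empty = StringMap.empty
  let add = StringMap.add
  let find_opt = StringMap.find_opt
end

let store = ThingStore.(empty |> add "a" ThingA |> add "b" ThingB)
```

With this in place, something like `ThingStore.add "c" 123 store` is rejected by the type checker regardless of what else is in the store.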
If you just want a handy name for the type of maps from strings to things, you can give it a name:
type thingmap = thing ThingMap.t
Update
Here is a previous SO discussion of OCaml (immutable) variable bindings and references:
What is the difference between let-bindings and references in OCaml?
type card = int
type game = { dimension : int; p1 : card list; }
let fn (_dimension : int) (p1 : card list) : bool =
(int)p1 = (int)_dimension * 2
I want to check that p1 is exactly twice the size of dimension.
Your code doesn't look very much like OCaml code, so it's difficult to know how to help :-)
There is no operation in OCaml to change the type of a value. That wouldn't make sense in a strongly typed language. There are some things you can do with types, but "casting" isn't one of them.
So, there is no valid expression that looks like (int) expr. As @glennsl points out, there is a function that returns the length of a list. If that's what you're trying to calculate, you can use List.length p1. The other occurrences of (int) you can just remove. They aren't valid OCaml.
Your definition of fn doesn't use the type game anywhere, so the definition of game is unnecessary. However it makes me worry that you're expecting the definition to have an effect.
If you leave out all the type ascriptions in your definition of fn the compiler will deduce the most general type for your function. This means you can call it with any list to check its length. That would be more idiomatic in OCaml. I.e., you don't need to specify that p1 is a list of cards. The function makes sense for any list.
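Putting these points together, the function might be written as follows. This is a sketch of what the question seems to be after (checking that the list is exactly twice as long as the dimension), with the parameter renamed to `dimension` for readability.

```ocaml
(* Inferred type: int -> 'a list -> bool, so it works for any list. *)
let fn dimension p1 = List.length p1 = dimension * 2

let ok = fn 2 [ 10; 20; 30; 40 ]
let too_short = fn 2 [ "a" ]
```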
I don't know why my code doesn't work.
fun lookup _ [] = 0
| lookup key ((k,v)::entries) =
if k = key
then v
else (lookup key entries)
That's what happened when I tested it in cmd.
val lookup = fn : ''a -> (''a * int) list -> int
- lookup (1,[(1,2),(2,3)]);
val it = fn : ((int * (int * int) list) * int) list -> int
There's nothing wrong with your code, you just didn't call lookup with enough arguments. You made a mistake that is common among beginner SML programmers coming from other languages. I'll try to clarify that.
First, the most important thing to know about functions in Standard ML is this:
All functions in Standard ML take exactly one argument.
You might be confused at this point, because your lookup function looks as if it's taking two arguments. It kind of does, but not really.
There are two main "workarounds" (I'm using quotes because this is actually a great feature of the language) for representing functions that take multiple arguments:
1. Using curried functions
If you need to write a function which, conceptually, needs three arguments, then you'd write this:
fun quux a =
fn b =>
fn c =>
(* do something with a, b and c *)
So, quux is:
a function, which takes an argument a and returns
a function, which takes an argument b and returns
a function, which takes an argument c and returns
the result computed using a, b and c
How would you call quux? Like this, right?
((quux a) b) c
But function application is already left associative, so we can actually write this:
quux a b c
We don't need parentheses to "call" functions! In Standard ML parentheses don't mean "call this function". They're used just for grouping expressions together when you want to change associativity, like in mathematics: (1 + 2) * 3.
Because defining quux as above is really cumbersome, there's a syntactic shortcut in the language. Instead of writing this:
fun quux a =
fn b =>
fn c =>
(* do something with a, b and c *)
We can write just this:
fun quux a b c = (* do something with a, b and c *)
But, they're the same thing. quux is still a function which takes just argument a and returns a new function with argument b, which returns a new function with argument c.
Ok, so that was one way of representing multi-argument functions in Standard ML. It's also the one you used to define lookup.
2. Using tuples
Another common way of representing multi-argument functions is to accept a tuple (which may have from 2 to as many components as you wish). Here's the above example using a tuple now:
fun quux args =
case args of
(a,b,c) => (* do something with a, b and c *)
How could we call quux now? Like this:
quux (a,b,c)
Notice that I put a space between quux and the tuple. It's optional, but I do it all the time to keep reminding myself that function application in Standard ML is not represented by parentheses. quux gets called because it's been put before the tuple (a,b,c). Tuples, however, do require parentheses, which is why you're seeing them above.
Again, as before, it's cumbersome to define quux like this:
fun quux args =
case args of
(a,b,c) => (* do something with a, b and c *)
So we can actually use another great feature of the language, pattern matching in argument position, that lets us write this:
fun quux (a,b,c) = (* do something with a, b and c *)
Ok, now we can really answer your question.
You defined lookup using the curried function syntax:
fun lookup _ [] = 0
But you "called" lookup using the tuple syntax, where 1 is the first element of the tuple and [(1,2),(2,3)] is the second element.
lookup (1, [(1,2),(2,3)])
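Why doesn't the compiler complain, though? In this unfortunate case it doesn't, because the first argument of lookup has the polymorphic type ''a, so a tuple is a perfectly acceptable value for it. So, you've basically called lookup with a single argument (the tuple) and got back a function that is still waiting for the list.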
What you wanted was this:
lookup 1 [(1,2),(2,3)]
Notice that I'm not defining a tuple anymore.
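The same distinction exists in OCaml, which shares this part of its design with Standard ML. The sketch below is an OCaml transliteration of lookup, not the original SML; `lookup_pair` is a hypothetical tuple-taking variant added only for comparison.

```ocaml
(* Curried: takes the key, then the association list. *)
let rec lookup key = function
  | [] -> 0
  | (k, v) :: entries -> if k = key then v else lookup key entries

(* Called with two separate arguments, no tuple involved. *)
let found = lookup 1 [ (1, 2); (2, 3) ]

(* Hypothetical variant that takes one argument, a pair. *)
let lookup_pair (key, entries) = lookup key entries

let found_pair = lookup_pair (2, [ (1, 2); (2, 3) ])
```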
type foo = A of int * int | B of (int * int)
What is the difference between int * int and (int * int) there? The only difference I see is in pattern matching:
let test_foo = function
| A (f, s) -> (f, s)
| B b -> b
Is it just a syntactic sugar? How do you select which one to use? Is there any performance difference between these two forms?
Yes, there is a performance difference:
In memory A (23, 42) will contain a tag identifying it as an A and the two integers 23 and 42. B (23, 42) will contain a tag identifying it as a B and a pointer to a tuple containing the integers 23 and 42. So there will be one additional memory allocation when creating a B and one additional level of indirection when accessing the individual values inside a B. So in cases where you don't actually use the constructor arguments as a tuple, using A will involve less overhead than using B.
On the other hand your test_foo function will create a new tuple every time it is called with an A value, but when it is called with a B value it will simply return the tuple that already exists in memory. So test_foo is a cheaper operation for B than it is for A. So if you'll be using the constructor's arguments as a tuple and you will do so multiple times for the same value, using B will be cheaper.
So if you're going to be using the constructor arguments as a tuple, it makes sense to use a constructor taking a tuple because you can get at the tuple using pattern matching with less code and because it will avoid having to create tuples from the same value multiple times. In all other cases not using a tuple is preferable because it involves less memory allocation and less indirection.
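The difference in what test_foo returns can be checked with physical equality (==), which compares addresses rather than contents. This is a sketch using the definitions from the question; the bindings `pair`, `same_block` and `same_value` are illustrative.

```ocaml
type foo = A of int * int | B of (int * int)

let test_foo = function
  | A (f, s) -> (f, s)   (* builds a tuple from the two fields *)
  | B b -> b             (* returns the tuple already stored in B *)

let pair = (23, 42)

(* For B, the result is the very same block that was passed in. *)
let same_block = test_foo (B pair) == pair

(* For A, the result is structurally equal to the pair. *)
let same_value = test_foo (A (23, 42)) = pair
```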
As already said, the constructor of A takes two int, whereas the constructor of B takes an ordered pair.
so you can write
let bar = A (1, 2)
or
let bar = B (1, 2)
or
let bar = (1, 2)
let baz = B bar
but you cannot write
let bar = (1, 2)
let baz = A bar
Moreover, in your pattern matching, you can still match the content of B as two ints, but you cannot match the content of A as a single value bound to an ordered pair
let test_foo = function
| A a -> a (* wrong *)
| B (f, s) -> (f, s) (* ok *)
They are two different types. The interpretation of this syntax is ambiguous at the * operator. It may be reduced into the form:
type x = Y * Z in which the '*' is associated with the type keyword in OCaml
or
int * int in which the * is used in the capacity of an operator that constructs a tuple
The default precedence takes it to the former. By putting a parenthesis around the (int * int) you override the default precedence and force the latter interpretation.
This is one of the tricky things in OCaml syntax -- even though it looks like you are declaring a constructor with a tuple data type (A of int * int), and even though when you use the constructor, it looks like you are giving a tuple to it (A (2,3)), that is not actually what is happening.
If you actually construct a tuple value and try to pass it to the constructor, it will not compile -- let x = (2,3) in A x. Rather, the * in the constructor definition and the (,) in the constructor use expression are simply the syntax for a constructor of multiple arguments. The syntax imitates that of a constructor with a tuple argument, but is actually separate. The extra parentheses are necessary if you want to actually make a constructor with a single tuple argument.
Basically, I want to have a function to return a polymorphic function, something like this:
fun foo () = fn x => x
So the foo function takes in a value of type unit and returns a polymorphic identity function
and the compiler is happy with that, it gives me:
val foo = fn : unit -> 'a -> 'a
but once I actually call the foo function, the return value is not what I expected
val it = fn : ?.X1 -> ?.X2
Can't generalize because of value restriction it says, any help? thanks in advance
For technical reasons, you are not allowed to generalize (i.e., make polymorphic) the results of a function call. The result of a call must have a monomorphic type. If this weren't the case, you could subvert the type system by the following dirty trick:
Call ref [] and get back a list of type forall 'a . 'a list ref
Insert a string.
Remove a function
and there you are: you are now executing the contents of an arbitrary string as code. Not Good.
By insisting that the value returned by ref [] be monomorphic, you ensure that it can be used as a list of strings or a list of functions but not both. So this is part of the price we pay for type safety.
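The question is about Standard ML, but OCaml applies a similar (slightly relaxed) restriction, so the usual workaround can be sketched there. Assuming the same foo, eta-expanding the call turns the binding into a syntactic function, which may be generalized; the names `id` and `cell` are illustrative.

```ocaml
let foo () = fun x -> x

(* Writing the argument explicitly makes id a function definition
   rather than the result of a call, so it stays polymorphic. *)
let id x = foo () x

(* By contrast, ref [] is the result of a call: its element type is
   left undetermined until the first use fixes it. *)
let cell = ref []
let () = cell := [ "hello" ]
```

After the assignment, cell can only ever hold string lists; trying to store a list of functions in it is a type error.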