Why does OCaml use subtyping for polymorphic variants?

I have just read about row polymorphism and how it can be used for extensible records and polymorphic variants.
However, OCaml uses subtyping for polymorphic variants. Why? Is it more powerful than row polymorphism?

OCaml uses both row polymorphism and subtyping for polymorphic variants (and for objects, for that matter). Row polymorphism is involved in "open" object types < m1 : t1; m2 : t2; .. > (the .. being literally part of the type) and in "open" variant types [> `K1 of t1 | `K2 of t2 ]. Subtyping is used to cast between closed, non-polymorphic types: <m1:t1; m2:t2> :> <m1:t1> or [ `K1 of t1 ] :> [ `K1 of t1 | `K2 of t2 ].
Row polymorphism removes the need for bounded quantification to express types such as "take an object that has at least the method m, and return an object of the same type". Subtyping can therefore stay rather simple and explicit, and it cannot be abstracted over. Row polymorphism, by contrast, is easier to infer and plays better with the rest of the type system. Closed types and explicit subtyping should rarely be necessary, but they are occasionally convenient; in particular, keeping types closed can produce error messages that are easier to understand.
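To make the two mechanisms concrete, here is a minimal sketch. All names (get_m, touch, p, q, describe, narrow, wide) are made up for illustration; only the type syntax comes from the answer above.

```ocaml
(* Row polymorphism: the ".." row variable makes the object type open. *)
let get_m (o : < m : int; .. >) = o#m

(* "Take an object with at least the method m, return an object of the
   same type": the alias 'a stands for the whole open type, row included. *)
let touch (o : < m : int; .. > as 'a) : 'a =
  ignore (o#m);
  o

let p = object
  method m = 1
  method n = "extra"
end

(* Subtyping: an explicit coercion from one closed object type to another. *)
let q = (p :> < m : int >)

(* The same two mechanisms for polymorphic variants. The wildcard case
   keeps the inferred argument type open: [> `K1 of int | `K2 of string ]. *)
let describe = function
  | `K1 n -> string_of_int n
  | `K2 s -> s
  | _ -> "other"

let narrow : [ `K1 of int ] = `K1 3

(* Explicit coercion from a closed variant type to a larger closed one. *)
let wide = (narrow :> [ `K1 of int | `K2 of string ])
```

Note that touch p still has the method n afterwards, whereas q has lost it: the coercion forgets the extra method, the row variable does not.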

To complement Gabriel's answer, one way to think about this is that subtyping provides a weak form of both universal and existential polymorphism. When both directions of parametric polymorphism are available, then the expressiveness of subtyping is mostly subsumed (especially when there is no depth subtyping). But that's not the case in OCaml.
OCaml replaces the universal aspect by actual universal polymorphism, but keeps subtyping to give you a form of existential quantification that it otherwise doesn't have. That is needed to form e.g. heterogeneous collections, such as a <a: int> list in which you want to be able to store arbitrary objects that at least have an a method of the right type.
I would go even further and say that, while this is usually explained as subtyping in the OCaml world, you could actually interpret closed rows as existentially quantified over an (unknown) tail. Coercion via :> would then be existential introduction, thereby staying more faithful to the world of parametric polymorphism that rows are built upon. (Of course, under this interpretation, # would do implicit existential elimination.) If I were to design an OCaml-like system from scratch, I'd probably try to model it that way.
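The heterogeneous collection mentioned above could look roughly like this. The objects o1, o2, o3 and the names objs and total are hypothetical; this is only a sketch of the idea.

```ocaml
let o1 = object method a = 1 end
let o2 = object method a = 2 method b = "two" end
let o3 = object method a = 3 method c = 3.0 end

(* Each element is coerced to the closed type < a : int >; the extra
   methods b and c are hidden (the "unknown tail" of the row). *)
let objs : < a : int > list =
  [ o1; (o2 :> < a : int >); (o3 :> < a : int >) ]

(* Every element can be used through the common method a. *)
let total = List.fold_left (fun acc o -> acc + o#a) 0 objs
```

Without the coercions the list would not type-check, because o2 and o3 have different closed types.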


OCaml - Why is Either not a Monad

I'm new to OCaml, but have worked with Rust, Haskell, etc, and was very surprised when I was trying to implement bind on Either, and it doesn't appear that any of the general implementations have bind implemented.
JaneStreet's Base is missing it
What I assume is the standard library is missing it
bind was the first function I reached for... even before match, and the implementation seems quite easy:
let bind_either (m: ('e, 'a) Either.t) (f: 'a -> ('e, 'b) Either.t): ('e, 'b) Either.t =
  match m with
  | Right r -> f r
  | Left l -> Left l
Am I missing something?
It is because we prefer the more specific Result.t, which has clear names for the ok state and for the exceptional state. In general, Either.t is not very popular among OCaml programmers, since a more specialized type can usually be used, with variant names that better communicate the domain-specific purpose of each branch. It is also worth mentioning that Either was added to the OCaml standard library only recently, in 4.12, so it might become more popular.
As mentioned by @ivg, Either is relatively new to the standard library, and generally one would prefer to use types that make more sense for the task. For example, Result for error handling.
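For comparison, the standard library does provide bind on Result.t. A small sketch, where parse_int, safe_div and compute are hypothetical helpers written for this example:

```ocaml
(* Hypothetical helpers; only Result.bind and int_of_string_opt are stdlib. *)
let parse_int s =
  match int_of_string_opt s with
  | Some n -> Ok n
  | None -> Error ("not an integer: " ^ s)

let safe_div x y =
  if y = 0 then Error "division by zero" else Ok (x / y)

(* Result.bind short-circuits on the first Error. *)
let compute a b =
  Result.bind (parse_int a) (fun x ->
    Result.bind (parse_int b) (fun y ->
      safe_div x y))
```

Here the constructor names Ok and Error already say which branch is the failure, which is the readability argument made above.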
There is also another point of view, which also applies to Result. Monads act on types parameterised by one type.
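In Haskell this is much less obvious, because type constructors can be partially applied. The Monad instance is therefore defined for the partially applied constructor Either e, and bind has type Either e a -> (a -> Either e b) -> Either e b: the left parameter e stays fixed and only the right one changes.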
In trying to generalise the behaviour of a monad via parameterised modules (functors in the ML sense of the term), one would have to "trick" oneself into standardising, for example, the treatment of option (a type of arity 1) and either (or result) which are of arity 2.
There are several approaches. One is to provide multiple interfaces to describe a monad: for example, describing Monad2 and then describing Monad in terms of Monad2, as is done in the Base library (https://ocaml.janestreet.com/ocaml-core/latest/doc/base/Base/Monad/index.html)
In Preface we used a rather different (and perhaps less generic) approach. We leave it to the user to set the left parameter of Either (via a functor) (and the right parameter for Result): https://github.com/xvw/preface/blob/master/lib/preface_stdlib/either.mli
However, we do not lose the ability to change the left-hand type of the calculation because Either also has a Bifunctor module that allows us to change the type of both parameters. The conversation is broadly described in this thread: https://discuss.ocaml.org/t/instance-modules-for-more-parametrized-types/5356/2
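The idea of fixing the left parameter through a functor can be sketched as follows. This is not Preface's or Base's actual API: the MONAD signature, the Either_monad functor and the half function are hypothetical names for illustration, and the Either module requires OCaml 4.12 or later.

```ocaml
(* A monad interface over a type constructor of arity 1. *)
module type MONAD = sig
  type 'a t
  val return : 'a -> 'a t
  val bind : 'a t -> ('a -> 'b t) -> 'b t
end

(* Hypothetical functor: fixing the left type L.t turns the
   two-parameter Either.t into a one-parameter type. *)
module Either_monad (L : sig type t end) :
  MONAD with type 'a t = (L.t, 'a) Either.t = struct
  type 'a t = (L.t, 'a) Either.t

  let return x = Either.Right x

  let bind m f =
    match m with
    | Either.Left l -> Either.Left l
    | Either.Right r -> f r
end

(* The user chooses the left type, here string. *)
module E = Either_monad (struct type t = string end)

let half n =
  if n mod 2 = 0 then Either.Right (n / 2) else Either.Left "odd"

let r1 = E.bind (E.return 8) half   (* 8 is even, so this is Right 4 *)
let r2 = E.bind (E.return 3) half   (* 3 is odd, so this is Left "odd" *)
```

The price of this design is that the left type is chosen once per module instance, which is why a separate bifunctor-style interface is needed to change it.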

Understanding Monad

Question
Please help confirm or correct the understandings of what Monad is and its traits.
As Data Type
In my understanding, a Monad is:
a container which can accommodate any type T and
provides a bind interface that allows its client to apply a flat-map function and
projects its content into another Monad of any type T'.
There need to be a return or unit interface to create a Monad of type T.
unit:= T -> M[T]
In Scala, List() or Set() are the examples of return interface, and any Scala sequence types (Array, List, Map, String) are Monad which provide flatMap interface which is bind.
Are these correct?
As Design Pattern
Software engineering provides ways to manage complexity or to structure software, such as Structured Programming without goto, UNIX pipe to pipeline transformation, Object Oriented to encapsulate data & control access, etc.
Is Monad a design pattern providing a way to structure a computation as a chain?
In other systems
UNIX commands
I suppose UNIX commands such as cat and grep are functions that can be chained, but that does not make them Monads: they have no return/unit and they are not data types. Or are they still regarded as e.g. an IO Monad, as in Monadic i/o and UNIX shell programming?
Python
I believe there is no bind or Scala flatMap equivalent in Python out of the box. Can I say Python does not have Monad feature out of the box?
References
Demystifying the Monad in Scala
Functors, Applicatives, And Monads In Pictures
Functional Programming and Category Theory [Part 1] - Categories and Functors
Monad (functional programming)
Monad in plain English? (For the OOP programmer with no FP background)
Yes, you're right about the interface. It is worth noting, however, that in the abstract a monad has two companion operations that compose to chain computations: flatMap is simply the composition of map and flat. Given a function A -> M[B], map turns an M[A] into an M[M[B]], and flat then collapses M[M[B]] into M[B].
Yes, in Scala they're a means to chain computations.
The shell script commands may fulfil the purpose of the monads (in the considered analogies) but still can't be regarded as monads (by me at least) as they don't necessarily comply with point 1.
Yes, monads are not supported out of the box in Python. One has to rely on nested loops instead.
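The decomposition of flatMap into map followed by flat, from the first point above, can be sketched for lists. OCaml is used here to stay consistent with the other examples on this page; return and flat_map are names chosen for this sketch, while List.map, List.flatten and List.concat_map are standard library functions (List.concat_map needs OCaml 4.10 or later).

```ocaml
(* unit/return for lists: wrap a value in a singleton list. *)
let return x = [ x ]

(* flatMap = flat after map: map gives a list of lists, flatten collapses it. *)
let flat_map f xs = List.flatten (List.map f xs)

(* Chaining: every number is paired with every letter. *)
let pairs =
  flat_map (fun x -> List.map (fun y -> (x, y)) [ "a"; "b" ]) [ 1; 2 ]
```

The hand-written flat_map should agree with List.concat_map on any input, since both apply the function to each element and concatenate the results.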

Is it possible to support higher-kinded types in Standard ML?

I have read in this post that ML dialects do not allow type variables of non-ground kind. E.g. the last statement is not representable:
-- Haskell code
type Ground = Int
type FirstOrder a = Maybe a
type SecondOrder c = c Int -- ML do not allow :c
OCaml supports higher-kinded types only at the level of modules. There are some explanations (here, and the author's comment here) of which OCaml features clash with higher-kinded types.
If I understood it correctly, the main problem is in the following facts:
OCaml does not follow a "freshness" restriction for type definitions: the type construct can define both an alias (and the type will remain the same) and a new fresh type
type alias definition can be hidden
AFAIK, Standard ML has different constructs for type definition and aliases: type for aliases and datatype for new fresh types introduction.
Unfortunately, I do not know SML well enough. Is it possible to export a type alias with its definition hidden? And can someone please show me whether there are any other SML features that still do not go well with higher-kinded types?
Probably there will be some problems with functors. Could someone be so kind as to show a code example of that? I've heard about such cases several times but still have not found a complete example.
Yes, SML can express the equivalent of higher-kinded types through functors, and can also make them abstract. Useless example:
functor F (type 'a t) :> sig type 'a u end =
struct
  type 'a u = ('a t) t
end
However, unlike OCaml, SML does not (officially) have higher-order functors, so per the standard, you can only express second-order type constructors this way.
FWIW, OCaml may use the same keyword for type aliases and generative types (where SML has type vs datatype), but the two are still distinguished syntactically by their right-hand side, so that's no real difference from SML. In both languages, an abstract type occurring in a signature can be implemented as either a type alias or a generative type. So the problem for type inference that Leo is alluding to exists equally in both. Haskell can get away without that problem because it does not have the same expressiveness regarding type abstraction (i.e., no "sealing" operator for modules).
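For readers more familiar with OCaml, here is a rough counterpart of the functor above with operations added so that it can actually be used. The FUNCTOR signature and the modules Twice, List_f, Option_f, LL and OO are hypothetical names for this sketch; the functor parameter X plays the role of a type variable of kind * -> *.

```ocaml
(* A signature describing a type constructor of arity 1 with a map. *)
module type FUNCTOR = sig
  type 'a t
  val map : ('a -> 'b) -> 'a t -> 'b t
end

(* Abstracting over the type constructor X.t is done with an ML functor. *)
module Twice (X : FUNCTOR) : FUNCTOR with type 'a t = 'a X.t X.t = struct
  type 'a t = 'a X.t X.t
  let map f xs = X.map (X.map f) xs
end

module List_f = struct
  type 'a t = 'a list
  let map = List.map
end

module Option_f = struct
  type 'a t = 'a option
  let map = Option.map
end

module LL = Twice (List_f)
module OO = Twice (Option_f)

let a = LL.map succ [ [ 1; 2 ]; [ 3 ] ]
let b = OO.map String.length (Some (Some "abc"))
```

Dropping the "with type" constraint would make 'a t abstract, which is the sealing discussed above.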

OCaml polymorphism example other than template function?

I am trying to understand which forms of polymorphism the OCaml language has.
I was given an example
let id x = x
Isn't this example equivalent to the C++ template function
template<class A> A id(A x) { return x; }
If so then my question is: are there any other forms of polymorphism in OCaml? This notion is called "generic algorithm" in the world of imperative languages, not "polymorphism".
There are basically three language features that are sometimes called polymorphism:
Parametric polymorphism (i.e. "generics")
Subtype polymorphism: the ability of a subtype of a type to offer a more specific version of an operation than the supertype, i.e. the ability to override methods (and the ability of the runtime system to call the correct implementation of a method based on the runtime type of an object). In OO languages this is often simply referred to as "polymorphism".
So-called ad-hoc polymorphism, i.e. the ability to overload functions/methods.
As you already discovered, OCaml has parametric polymorphism. It also has subtype polymorphism. It does not have ad-hoc polymorphism.
Since in your title you've asked for examples, here's an example of subtype polymorphism in OCaml:
class c = object
  method m x = x+1
end
class d = object
  inherit c
  method m x = x+2
end
let main =
  let o:c = new d in
  print_int (o#m 2)
This will print 4.
This kind of polymorphism is called generic programming but the theoretical concept behind it is called parametric polymorphism.
The two examples you provided do show parametric polymorphism. The real difference is that OCaml relies on a type checker with strong inference, whereas C++ templates are a more pragmatic solution with more caveats: in C++ the code is duplicated for every type it is used with, while in OCaml the type checker verifies, through unification, that a substitution for the implicit type variables exists.
Everything can be polymorphic in OCaml simply because nothing is usually annotated with types, so in practice, if something can be used as an argument to a function, then it is implicitly allowed.
You can, for example, use type variables to define polymorphic functions:
let swap ((x : 'a), (y : 'b)) : 'b * 'a = (y, x)
so that this will work whatever the types 'a and 'b are.
Another powerful polymorphic feature of OCaml is functors (not to be confused with the common C++ functors), which are modules parametrized by other modules. The concept sounds scarier than it is, but they do represent a higher order of polymorphic behavior for OCaml code.
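A minimal sketch of a functor, to show that it is less scary than it sounds. The signature ORDERED and the modules Max_of, Int_max and String_max are hypothetical names; Int and String are the standard library modules, each of which provides a type t and a compare function.

```ocaml
(* What the functor needs from its argument: a type and a comparison. *)
module type ORDERED = sig
  type t
  val compare : t -> t -> int
end

(* A module parametrized by another module. *)
module Max_of (O : ORDERED) = struct
  (* Largest element according to O.compare, or None for an empty list. *)
  let max_of_list = function
    | [] -> None
    | x :: rest ->
      Some
        (List.fold_left
           (fun best y -> if O.compare y best > 0 then y else best)
           x rest)
end

(* The same code is instantiated for two different element types. *)
module Int_max = Max_of (Int)
module String_max = Max_of (String)
```

For instance, Int_max.max_of_list works on int lists and String_max.max_of_list on string lists, each using its own comparison.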

What does `B mean?

In the toplevel, I get the following output:
# `B;;
- : [> `B ] = `B
So what does `B mean? Why do we need it?
Sincerely!
An identifier prefixed with a backquote like `B is a constructor of a polymorphic variant type. It's similar to the constructor of an algebraic type:
type abc = A | B | C
However, you can use polymorphic variant values without declaring them, and in general they're much more flexible than the usual algebraic types. The tradeoff is that they're also quite a bit trickier to use.
One thing people use them for is as simple named values, like enum values in C. Or, more precisely, like atoms in Lisp. You can use ordinary algebraic types for this, but you need to carefully maintain your definitions of them and guard against duplication. With polymorphic variants, you don't need to do either of these. You can use them without declaring them, and the constructors aren't required to be unique (two different types can have the same constructor).
Polymorphic variant constructors can also take parameters, as algebraic constructors can. So you can also write (`B 77), a constructor with a single int parameter.
This is a pretty big topic--see the above linked section of the OCaml manual for more details.
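A small sketch of what "without declaring them" means in practice. The names to_string, s and v are made up for this example, and no type definition appears anywhere.

```ocaml
(* No type declaration is needed: the tags are used directly and the
   argument type is inferred as [< `A | `B of int ]. *)
let to_string = function
  | `A -> "A"
  | `B n -> "B " ^ string_of_int n

(* A constructor applied to a single int parameter. *)
let s = to_string (`B 77)

(* On its own, this value gets the open type [> `B of int ]. *)
let v = `B 77
```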
It's a polymorphic variant. From the documentation:
Variants as presented in section 1.4 are a powerful tool to build data structures and algorithms. However they sometimes lack flexibility when used in modular programming. This is due to the fact every constructor reserves a name to be used with a unique type. One cannot use the same name in another type, or consider a value of some type to belong to some other type with more constructors.
With polymorphic variants, this original assumption is removed. That is, a variant tag does not belong to any type in particular, the type system will just check that it is an admissible value according to its use. You need not define a type before using a variant tag. A variant type will be inferred independently for each of its uses.
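The two points in the quoted passage, reusing a tag in another type and viewing a value as belonging to a type with more constructors, can be sketched like this. The types color and traffic_light and the functions around them are hypothetical.

```ocaml
(* The tags `Red and `Green appear in two different types. *)
type color = [ `Red | `Green ]
type traffic_light = [ `Red | `Amber | `Green ]

let color_name : color -> string = function
  | `Red -> "red"
  | `Green -> "green"

let light_action : traffic_light -> string = function
  | `Red -> "stop"
  | `Amber -> "wait"
  | `Green -> "go"

let c : color = `Red

(* A color can be considered a traffic_light, which has more constructors. *)
let as_light = (c :> traffic_light)
```

With ordinary variants, declaring Red in both types would make the second declaration shadow the first, and no such coercion would be available.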