Haskell - types and if statements - if-statement

Is there a good way to use type information to choose to do different things?
For example, this isn't valid Haskell, but I don't see why it couldn't be:
tostring :: (Show b) => b -> String
tostring x = f x where f = if b == String then tail . init . show else show
The important part is not getting the correct string out, but using the type of b as a way to switch between functionality/functions.

#chi's answer already demonstrates how to use Typeable to do run-time type checking, but I'd like to point out that to me, this looks like exactly the thing typeclasses are meant for. For your example, the only problem is that you don't like the Show implementation for String: In that case, just create your own typeclass!
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE UndecidableInstances #-}
-- The class
class MyShow a where
myShow :: a -> String
-- The instance for String
-- (The `OVERLAPPING` pragma is required, because
-- otherwise GHC won't know which instance to choose for String)
instance {-# OVERLAPPING #-} MyShow [Char] where
myShow = tail . init . show
-- For everything that is not a String, just copy the Show instance
instance Show a => MyShow a where
myShow = show
EDIT: As pointed out by leftaroundabout, overlapping instances are complicated and can lead to some unexpected behavior. Look at the example at the bottom of the documentation.

I will answer the question as it is. Haskell erases all type information during compile time, mostly for efficiency reasons. By default, when a polymorphic function is called, e.g. f :: a->a, no type information is available, and f has no way to know what a actually is -- in this case, f can only be the identity function, fail to terminate, or raise an error.
For the rare cases where type information is needed, there is Typeable. A polymorphic function having type f :: Typeable a => ... is passed a run-time description of the type a, allowing it to test it. Essentially, the Typeable a constraint forces Haskell to keep the runtime information until run time. Note that such type information must be known at the call site -- either because f is called with a completely known type, or because f is called with a partially known type (say f x with x :: Maybe b) but there are suitable Typeable constraints in scope (Typeable b, in the previous example).
Anyway, here's an example:
{-# LANGUAGE TypeApplications, ScopedTypeVariables, GADTs #-}
import Data.Typeable
tostring :: forall b. (Show b, Typeable b) => b -> String
tostring x = case eqT #b #String of -- if b==String
Just Refl -> x -- then
Nothing -> show x -- else
Note how we were able to return x in the "then" branch, since there it is known to be a String.

Related

Is there a function that can make a string representation of any type?

I was desperately looking for the last hour for a method in the OCaml Library which converts an 'a to a string:
'a -> string
Is there something in the library which I just haven't found? Or do I have to do it different (writing everything by my own)?
It is not possible to write a printing function show of type 'a -> string in OCaml.
Indeed, types are erased after compilation in OCaml. (They are in fact erased after the typechecking which is one of the early phase of the compilation pipeline).
Consequently, a function of type 'a -> _ can either:
ignore its argument:
let f _ = "<something>"
peek at the memory representation of a value
let f x = if Obj.is_block x then "<block>" else "<immediate>"
Even peeking at the memory representation of a value has limited utility since many different types will share the same memory representation.
If you want to print a type, you need to create a printer for this type. You can either do this by hand using the Fmt library (or the Format module in the standard library)
type tree = Leaf of int | Node of { left:tree; right: tree }
let pp ppf tree = match tree with
| Leaf d -> Fmt.fp ppf "Leaf %d" d
| Node n -> Fmt.fp ppf "Node { left:%a; right:%a}" pp n.left pp n.right
or by using a ppx (a small preprocessing extension for OCaml) like https://github.com/ocaml-ppx/ppx_deriving.
type tree = Leaf of int | Node of { left:tree; right: tree } [##deriving show]
If you just want a quick hacky solution, you can use dump from theBatteries library. It doesn't work for all cases, but it does work for primitives, lists, etc. It accesses the underlying raw memory representation, hence is able to overcome (to some extent) the difficulties mentioned in the other answers.
You can use it like this (after installing it via opam install batteries):
# #require "batteries";;
# Batteries.dump 1;;
- : string = "1"
# Batteries.dump 1.2;;
- : string = "1.2"
# Batteries.dump [1;2;3];;
- : string = "[1; 2; 3]"
If you want a more "proper" solution, use ppx_deriving as recommended by #octachron. It is much more reliable/maintainable/customizable.
What you are looking for is a meaningful function of type 'a. 'a -> string, with parametric polymorphism (i.e. a single function that can operate the same for all possible types 'a, even those that didn’t exist when the function was created). This is not possible in OCaml. Here are explications depending on your programming background.
Coming from Haskell
If you were expecting such a function because you are familiar with the Haskell function show, then notice that its type is actually show :: Show a => a -> String. It uses an instance of the typeclass Show a, which is implicitly inserted by the compiler at call sites. This is not parametric polymorphism, this is ad-hoc polymorphism (show is overloaded, if you want). There is no such feature in OCaml (yet? there are projects for the future of the language, look for “modular implicits” or “modular explicits”).
Coming from OOP
If you were expecting such a function because you are familiar with OO languages in which every value is an object with a method toString, then this is not the case of OCaml. OCaml does not use the object model pervasively, and run-time representation of OCaml values retains no (or very few) notion of type. I refer you to #octachron’s answer.
Again, toString in OOP is not parametric polymorphism but overloading: there is not a single method toString which is defined for all possible types. Instead there are multiple — possibly very different — implementations of a method of the same name. In some OO languages, programmers try to follow the discipline of implementing a method by that name for every class they define, but it is only a coding practice. One could very well create objects that do not have such a method.
[ Actually, the notions involved in both worlds are pretty similar: Haskell requires an instance of a typeclass Show a providing a function show; OOP requires an object of a class Stringifiable (for instance) providing a method toString. Or, of course, an instance/object of a descendent typeclass/class. ]
Another possibility is to use https://github.com/ocaml-ppx/ppx_deriving with will create the function of Path.To.My.Super.Type.t -> string you can then use with your value. However you still need to track the path of the type by hand but it is better than nothing.
Another project provide feature similar to Batterie https://github.com/reasonml/reason-native/blob/master/src/console/README.md (I haven't tested Batterie so can't give opinion) They have the same limitation: they introspect the runtime encoding so can't get something really useable. I think it was done with windows/browser in mind so if cross plat is required I will test this one before (unless batterie is already pulled). and even if the code source is in reason you can use with same API in OCaml.

When does the relaxed value restriction kick in in OCaml?

Can someone give a concise description of when the relaxed value restriction kicks in? I've had trouble finding a concise and clear description of the rules. There's Garrigue's paper:
http://caml.inria.fr/pub/papers/garrigue-value_restriction-fiwflp04.pdf
but it's a little dense. Anyone know of a pithier source?
An Addendum
Some good explanations were added below, but I was unable to find an explanation there for the following behavior:
# let _x = 3 in (fun () -> ref None);;
- : unit -> 'a option ref = <fun>
# let _x = ref 3 in (fun () -> ref None);;
- : unit -> '_a option ref = <fun>
Can anyone clarify the above? Why does the stray definition of a ref within the RHS of the enclosing let affect the heuristic.
I am not a type theorist, but here is my interpretation of Garrigue's explanation. You have a value V. Start with the type that would be assigned to V (in OCaml) under the usual value restriction. There will be some number (maybe 0) monomorphic type variables in the type. For each such variable that appears only in covariant position in the type (on the right sides of function arrows), you can replace it with a fully polymorphic type variable.
The argument goes as follows. Since your monomorphic variable is a variable, you can imagine replacing it with any single type. So you choose an uninhabited type U. Now since it is in covariant position only, U can in turn be replaced by any supertype. But every type is a supertype of an uninhabited type, hence it's safe to replace with a fully polymorphic variable.
So, the relaxed value restriction kicks in when you have (what would be) monomorphic variables that appear only in covariant positions.
(I hope I have this right. Certainly #gasche would do better, as octref suggests.)
Jeffrey provided the intuitive explanation of why the relaxation is correct. As to when it is useful, I think we can first reproduce the answer octref helpfully linked to:
You may safely ignore those subtleties until, someday, you hit a problem with an abstract type of yours that is not as polymorphic as you would like, and then you should remember than a covariance annotation in the signature may help.
We discussed this on reddit/ocaml a few months ago:
Consider the following code example:
module type S = sig
type 'a collection
val empty : unit -> 'a collection
end
module C : S = struct
type 'a collection =
| Nil
| Cons of 'a * 'a collection
let empty () = Nil
end
let test = C.empty ()
The type you get for test is '_a C.collection, instead of the 'a C.collection that you would expect. It is not a polymorphic type ('_a is a monomorphic inference variable that is not yet fully determined), and you won't be happy with it in most cases.
This is because C.empty () is not a value, so its type is not generalized (~ made polymorphic). To benefit from the relaxed value restriction, you have to mark the abstract type 'a collection covariant:
module type S = sig
type +'a collection
val empty : unit -> 'a collection
end
Of course this only happens because the module C is sealed with the signature S : module C : S = .... If the module C was not given an explicit signature, the type-system would infer the most general variance (here covariance) and one wouldn't notice that.
Programming against an abstract interface is often useful (when defining a functor, or enforcing a phantom type discipline, or writing modular programs) so this sort of situation definitely happens and it is then useful to know about the relaxed value restriction.
That's an example of when you need to be aware of it to get more polymorphism, because you set up an abstraction boundary (a module signature with an abstract type) and it doesn't work automatically, you have explicitly to say that the abstract type is covariant.
In most cases it happens without your notice, when you manipulate polymorphic data structures. [] # [] only has the polymorphic type 'a list thanks to the relaxation.
A concrete but more advanced example is Oleg's Ber-MetaOCaml, which uses a type ('cl, 'ty) code to represent quoted expressions which are built piecewise. 'ty represents the type of the result of the quoted code, and 'cl is a kind of phantom region variable that guarantees that, when it remains polymorphic, the scoping of variable in quoted code is correct. As this relies on polymorphism in situations where quoted expressions are built by composing other quoted expressions (so are generally not values), it basically would not work at all without the relaxed value restriction (it's a side remark in his excellent yet technical document on type inference).
The question why the two examples given in the addendum are typed differently has puzzled me for a couple of days. Here is what I found by digging into the OCaml compiler's code (disclaimer: I'm neither an expert on OCaml nor on the ML type system).
Recap
# let _x = 3 in (fun () -> ref None);; (* (1) *)
- : unit -> 'a option ref = <fun>
is given a polymorphic type (think ∀ α. unit → α option ref) while
# let _x = ref 3 in (fun () -> ref None);; (* (2) *)
- : unit -> '_a option ref = <fun>
is given a monomorphic type (think unit → α option ref, that is, the type variable α is not universally quantified).
Intuition
For the purposes of type checking, the OCaml compiler sees no difference between example (2) and
# let r = ref None in (fun () -> r);; (* (3) *)
- : unit -> '_a option ref = <fun>
since it doesn't look into the body of the let to see if the bound variable is actually used (as one might expect). But (3) clearly must be given a monomorphic type, otherwise a polymorphically typed reference cell could escape, potentially leading to unsound behaviour like memory corruption.
Expansiveness
To understand why (1) and (2) are typed the way they are, let's have a look at how the OCaml compiler actually checks whether a let expression is a value (i.e. "nonexpansive") or not (see is_nonexpansive):
let rec is_nonexpansive exp =
match exp.exp_desc with
(* ... *)
| Texp_let(rec_flag, pat_exp_list, body) ->
List.for_all (fun vb -> is_nonexpansive vb.vb_expr) pat_exp_list &&
is_nonexpansive body
| (* ... *)
So a let-expression is a value if both its body and all the bound variables are values.
In both examples given in the addendum, the body is fun () -> ref None, which is a function and hence a value. The difference between the two pieces of code is that 3 is a value while ref 3 is not. Therefore OCaml considers the first let a value but not the second.
Typing
Again looking at the code of the OCaml compiler, we can see that whether an expression is considered expansive determines how the type of the let-expressions is generalised (see type_expression):
(* Typing of toplevel expressions *)
let type_expression env sexp =
(* ... *)
let exp = type_exp env sexp in
(* ... *)
if is_nonexpansive exp then generalize exp.exp_type
else generalize_expansive env exp.exp_type;
(* ... *)
Since let _x = 3 in (fun () -> ref None) is nonexpansive, it is typed using generalize which gives it a polymorphic type. let _x = ref 3 in (fun () -> ref None), on the other hand, is typed via generalize_expansive, giving it a monomorphic type.
That's as far as I got. If you want to dig even deeper, reading Oleg Kiselyov's Efficient and Insightful Generalization alongside generalize and generalize_expansive may be a good start.
Many thanks to Leo White from OCaml Labs Cambridge for encouraging me to start digging!
Although I'm not very familiar with this theory, I have asked a question about it.
gasche provided me with a concise explanation. The example is just a part of OCaml's map module. Check it out!
Maybe he will be able to provide you with a better answer. #gasche

Casting to and from a type parameter in F#

I'm just starting out in F#, and some of the issues around casting are confusing me mightily. Unfortunately, my background reading to try to figure out why is confusing me even more, so I'm looking for some specific answers I can fit into the general explanations...
I've got a ReadOnlyCollection<'T> of enums, produced by this function:
let GetValues<'T when 'T :> Enum> () =
(new ReadOnlyCollection<'T>(Enum.GetValues (typeof<'T>) :?> 'T[])) :> IList<'T>
What I want to do with it is find all the bits of the enum that are used by its values (i.e., bitwise-or all the values in the list together), and return that as the generic enum type, 'T. The obvious way to do that seemed to me to be this:
let UsedBits<'T when 'T :> Enum> () =
GetValues<'T>()
|> Seq.fold (fun acc a -> acc ||| a) 0
...except that that fails to compile, with the error "The declared type parameter 'T' cannot be used here since the type parameter cannot be resolved at compile time."
I can get the actual job done by converting to Int32 first (which I don't really want to do, because I want this function to work on all enums regardless of underlying type), viz.:
let UsedBits<'T when 'T :> Enum> () =
GetValues<'T>()
|> Seq.map (fun a -> Convert.ToInt32(a))
|> Seq.fold (fun acc a -> acc ||| a) 0
...but then the result is produced as Int32. If I try to cast it back to 'T, I again get compilation errors.
I don't want to get too specific in my question because I'm not sure which specifics I should be asking about, so -- where's the flaw(s) in this approach? How should I be going about it?
(Edited to add:, post #Daniel's answer
Alas, this appears to be one of those situations where I don't understand the context well enough to understand the answer, so...
I think I understand what inline and the different constraint are doing in your answer, but being an F# newbie, would you mind awfully expanding on those things a little so I can check that my understanding isn't way off base? Thanks.
)
You could do this:
let GetValues<'T, 'U when 'T : enum<'U>>() =
Enum.GetValues(typeof<'T>) :?> 'T[]
let inline GetUsedBits() =
GetValues() |> Seq.reduce (|||)
inline allows a more flexible constraint, namely 'T (requires member ( ||| )). Without it, the compiler must choose a constraint that can be expressed in IL, or, if unable to do so, choose a concrete type. In this case it chooses int since it supports (|||).
Here's a simpler repro:
let Or a b = a ||| b //add 'inline' to compare
See Statically Resolved Type Parameters on MSDN for more info.

Calling an IO Monad inside an Arrow

Perhaps I'm going about this the wrong way, but I'm using HXT to read in some vertex data that I'd like to use in an array in HOpenGL. Vertex arrays need to be a Ptr which is created by calling newArray. Unfortunately newArray returns an IO Ptr, so I'm not sure how to go about using it inside an Arrow. I think I need something with a type declaration similar to IO a -> Arrow a?
The type IO a -> Arrow a doesn't make sense; Arrow is a type class, not a specific type, much like Monad or Num. Specifically, an instance of Arrow is a type constructor taking two parameters that describes things that can be composed like functions, matching types end-to-end. So, converting IO a to an arrow could perhaps be called a conceptual type error.
I'm not sure exactly what you're trying to do, but if you really want to be using IO operations as part of an Arrow, you need your Arrow instance to include that. The simplest form of that is to observe that functions with types like a -> m b for any Monad instance can be composed in the obvious way. The hxt package seems to provide a more complicated type:
newtype IOSLA s a b = IOSLA { runIOSLA :: s -> a -> IO (s, [b]) }
This is some mixture of the IO, State, and [] monads, attached to a function as above such that you can compose them going through all three Monads at each step. I haven't really used hxt much, but if these are the Arrows you're working with, it's pretty simple to lift an arbitrary IO function to serve as one--just pass the state value s through unchanged, and turn the output of the function into a singleton list. There may already be a function to do this for you, but I didn't see one at a brief glance.
Basically, you'd want something like this:
liftArrIO :: (a -> IO b) -> IOSLA s a b
liftArrIO f = IOSLA $ \s x -> fmap (\y -> (s, [y])) (f x)

Keeping type generic without η-expansion

What I'm doing: I'm writing a small interpreter system that can parse a file, turn it into a sequence of operations, and then feed thousands of data sets into that sequence to extract some final value from each. A compiled interpreter consists of a list of pure functions that take two arguments: a data set, and an execution context. Each function returns the modified execution context:
type ('data, 'context) interpreter = ('data -> 'context -> 'context) list
The compiler is essentially a tokenizer with a final token-to-instruction mapping step that uses a map description defined as follows:
type ('data, 'context) map = (string * ('data -> 'context -> 'context)) list
Typical interpreter usage looks like this:
let pocket_calc =
let map = [ "add", (fun d c -> c # add d) ;
"sub", (fun d c -> c # sub d) ;
"mul", (fun d c -> c # mul d) ]
in
Interpreter.parse map "path/to/file.txt"
let new_context = Interpreter.run pocket_calc data old_context
The problem: I'd like my pocket_calc interpreter to work with any class that supports add, sub and mul methods, and the corresponding data type (could be integers for one context class and floating-point numbers for another).
However, pocket_calc is defined as a value and not a function, so the type system does not make its type generic: the first time it's used, the 'data and 'context types are bound to the types of whatever data and context I first provide, and the interpreter becomes forever incompatible with any other data and context types.
A viable solution is to eta-expand the definition of the interpreter to allow its type parameters to be generic:
let pocket_calc data context =
let map = [ "add", (fun d c -> c # add d) ;
"sub", (fun d c -> c # sub d) ;
"mul", (fun d c -> c # mul d) ]
in
let interpreter = Interpreter.parse map "path/to/file.txt" in
Interpreter.run interpreter data context
However, this solution is unacceptable for several reasons:
It re-compiles the interpreter every time it's called, which significantly degrades performance. Even the mapping step (turning a token list into a interpreter using the map list) causes a noticeable slowdown.
My design relies on all interpreters being loaded at initialization time, because the compiler issues warnings whenever a token in the loaded file does not match a line in the map list, and I want to see all those warnings when the software launches (not when individual interpreters are eventually run).
I sometimes want to reuse a given map list in several interpreters, whether on its own or by prepending additional instructions (for instance, "div").
The questions: is there any way to make the type parametric other than eta-expansion? Maybe some clever trick involving module signatures or inheritance? If that's impossible, is there any way to alleviate the three issues I have mentioned above in order to make eta-expansion an acceptable solution? Thank you!
A viable solution is to eta-expand the
definition of the interpreter to allow
its type parameters to be generic:
let pocket_calc data context =
let map = [ "add", (fun d c -> c # add d) ;
"sub", (fun d c -> c # sub d) ;
"mul", (fun d c -> c # mul d) ]
in
let interpreter = Interpreter.parse map "path/to/file.txt" in
Interpreter.run interpreter data context
However, this solution is unacceptable
for several reasons:
It re-compiles the interpreter every time it's called, which
significantly degrades performance.
Even the mapping step (turning a token
list into a interpreter using the map
list) causes a noticeable slowdown.
It recompiles the interpreter every time because you are doing it wrong. The proper form is more something like this (and technically, if the partial interpretation of Interpreter.run to interpreter can do some computations, you should move it out of the fun too).
let pocket_calc =
let map = [ "add", (fun d c -> c # add d) ;
"sub", (fun d c -> c # sub d) ;
"mul", (fun d c -> c # mul d) ]
in
let interpreter = Interpreter.parse map "path/to/file.txt" in
fun data context -> Interpreter.run interpreter data context
I think your problem lies in a lack of polymorphism in your operations, which you would like to have a closed parametric type (works for all data supporting the following arithmetic primitives) instead of having a type parameter representing a fixed data type.
However, it's a bit difficult to ensure it's exactly this, because your code is not self-contained enough to test it.
Assuming the given type for primitives :
type 'a primitives = <
add : 'a -> 'a;
mul : 'a -> 'a;
sub : 'a -> 'a;
>
You can use the first-order polymorphism provided by structures and objects :
type op = { op : 'a . 'a -> 'a primitives -> 'a }
let map = [ "add", { op = fun d c -> c # add d } ;
"sub", { op = fun d c -> c # sub d } ;
"mul", { op = fun d c -> c # mul d } ];;
You get back the following data-agnostic type :
val map : (string * op) list
Edit: regarding your comment about different operation types, I'm not sure which level of flexibility you want. I don't think you could mix operations over different primitives in the same list, and still benefit from the specifities of each : at best, you could only transform an "operation over add/sub/mul" into an "operation over add/sub/mul/div" (as we're contravariant in the primitives type), but certainly not much.
On a more pragmatic level, it's true that, with that design, you need a different "operation" type for each primitives type. You could easily, however, build a functor parametrized by the primitives type and returning the operation type.
I don't know how one would expose a direct subtyping relation between different primitive types. The problem is that this would need a subtyping relation at the functor level, which I don't think we have in Caml. You could, however, using a simpler form of explicit subtyping (instead of casting a :> b, use a function a -> b), build second functor, contravariant, that, given a map from a primitive type to the other, would build a map from one operation type to the other.
It's entirely possible that, with a different and clever representation of the type evolved, a much simpler solution is possible. First-class modules of 3.12 might also come in play, but they tend to be helpful for first-class existential types, whereas here we rhater use universal types.
Interpretive overhead and operation reifications
Besides your local typing problem, I'm not sure you're heading the right way. You're trying to eliminate interpretive overhead by building, "ahead of time" (before using the operations), a closure corresponding to a in-language representation of your operation.
In my experience, this approach doesn't generally get rid of interpretive overhead, it rather moves it to another layer. If you create your closures naïvely, you will have the parsing flow of control reproduced at the closure layer : the closure will call other closures, etc., as your parsing code "interpreted" the input when creating the closure. You eliminated the cost of parsing, but the possibly suboptimal flow of control is still the same. Additionnaly, closures tend to be a pain to manipulate directly : you have to be very careful about comparison operations for example, serialization, etc.
I think you may be interested in the long term in an intermediate "reified" language representing your operations : a simple algebraic data type for arithmetic operations, that you would build from your textual representation. You can still try to build closures "ahead of time" from it, though I'm not sure the performances are much better than directly interpreting it, if the in-memory representation is decent. Moreover, it will be much easier to plug in intermediary analyzers/transformers to optimize your operations, for example going from an "associative binary operations" model to a "n-ary operations" model, which could be more efficiently evaluated.