How to use a pp function (formatter -> 'a -> unit) as an arg to printf? - ocaml

I am using a 3rd party module which exposes a function:
val pp : Format.formatter -> 'a -> unit
Unfortunately it doesn't expose a to_string (or show) function.
I want to find a way to use the result of pp in a format string, something like:
let output = Format.sprintf "Result: %s" (SomeModule.pp fmt myval)
But pp writes to fmt and returns unit so of course this is not valid.
I can tell I need to somehow make a formatter to pass to pp that writes to a string buffer, whose contents I can then retrieve as a string and pass as an argument to sprintf.
The use of pp like functions for making types printable seems pretty ubiquitous in OCaml (e.g. ppx_deriving show generates them) so I feel like there should be a simple way to achieve this, but I'm currently missing it.

By using asprintf instead, it's possible to use the %a format specifier to pass two arguments, a printer function and the value to be printed, which will then be formatted accordingly and inserted in its place:
let output = Format.asprintf "Result: %a" SomeModule.pp myval
The reason asprintf has to be used instead of sprintf is that the latter specifies an "input source" (the second argument of the format type) of type unit, while the former uses a formatter. This is what's going to be passed to the printer.
It still eludes me why there's a need for sprintf though, rather than just having asprintf. Perhaps there's some performance-related reason for it, but my guess is that it's just an artifact of history.
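To make this concrete, here is a minimal sketch of both routes: asprintf with %a, and the manual buffer-backed formatter the question describes. The SomeModule below is a hypothetical stand-in for the third-party module (its type and output format are made up for illustration); only the Format and Buffer calls come from the standard library.

```ocaml
(* Hypothetical stand-in for the third-party module: it only exposes pp. *)
module SomeModule = struct
  type t = { x : int; y : int }
  let pp ppf { x; y } = Format.fprintf ppf "(%d, %d)" x y
end

(* Route 1: let asprintf create the formatter and collect the output. *)
let to_string_via_asprintf v = Format.asprintf "%a" SomeModule.pp v

(* Route 2: build a formatter over a Buffer by hand, then read it back.
   The flush is needed before reading the buffer contents. *)
let to_string_via_buffer v =
  let buf = Buffer.create 16 in
  let ppf = Format.formatter_of_buffer buf in
  SomeModule.pp ppf v;
  Format.pp_print_flush ppf ();
  Buffer.contents buf

let myval = { SomeModule.x = 1; y = 2 }
let output = Format.asprintf "Result: %a" SomeModule.pp myval
let () = print_endline output  (* prints "Result: (1, 2)" *)
```

Route 1 is usually all you need; route 2 is mostly useful when several printers should write into the same buffer.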

Related

Is there a function that can make a string representation of any type?

I was desperately looking for the last hour for a method in the OCaml Library which converts an 'a to a string:
'a -> string
Is there something in the library which I just haven't found? Or do I have to do it different (writing everything by my own)?
It is not possible to write a printing function show of type 'a -> string in OCaml.
Indeed, types are erased after compilation in OCaml. (They are in fact erased after typechecking, which is one of the early phases of the compilation pipeline.)
Consequently, a function of type 'a -> _ can either:
ignore its argument:
let f _ = "<something>"
peek at the memory representation of a value
let f x = if Obj.is_block (Obj.repr x) then "<block>" else "<immediate>"
Even peeking at the memory representation of a value has limited utility since many different types will share the same memory representation.
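As a rough illustration of that last point, the sketch below uses Obj.repr to compare the runtime representation of values of different types. The helper names describe and same_representation are made up for this example, and Obj is an unsafe module that should not be relied on in real code.

```ocaml
(* Classify a value by its runtime shape only; the static type is gone. *)
let describe (x : 'a) : string =
  if Obj.is_block (Obj.repr x) then "<block>" else "<immediate>"

(* Compare the raw representations of two values of unrelated types. *)
let same_representation a b = Obj.repr a = Obj.repr b

let () =
  print_endline (describe 42);                         (* <immediate> *)
  print_endline (describe "hello");                    (* <block> *)
  Printf.printf "%b\n" (same_representation 0 false);  (* true *)
  Printf.printf "%b\n" (same_representation [] ())     (* true *)
```

Since 0, false, [] and () all look identical at run time, no function working from the representation alone can print them differently.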
If you want to print a type, you need to create a printer for this type. You can either do this by hand using the Fmt library (or the Format module in the standard library)
type tree = Leaf of int | Node of { left:tree; right: tree }
let rec pp ppf tree = match tree with
| Leaf d -> Fmt.pf ppf "Leaf %d" d
| Node n -> Fmt.pf ppf "Node { left:%a; right:%a}" pp n.left pp n.right
or by using a ppx (a small preprocessing extension for OCaml) like https://github.com/ocaml-ppx/ppx_deriving.
type tree = Leaf of int | Node of { left:tree; right: tree } [@@deriving show]
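For reference, the hand-written variant can also be done with only the standard library's Format module, without the Fmt dependency. This is a sketch; the show helper is a name chosen here for illustration, mirroring what a derived show would give you.

```ocaml
type tree = Leaf of int | Node of { left : tree; right : tree }

(* The printer is recursive, so it needs "let rec". *)
let rec pp ppf = function
  | Leaf d -> Format.fprintf ppf "Leaf %d" d
  | Node n ->
    Format.fprintf ppf "Node { left:%a; right:%a}" pp n.left pp n.right

(* Turn the printer into a tree -> string function. *)
let show t = Format.asprintf "%a" pp t

let () =
  print_endline (show (Node { left = Leaf 1; right = Leaf 2 }))
  (* prints "Node { left:Leaf 1; right:Leaf 2}" *)
```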
If you just want a quick hacky solution, you can use dump from the Batteries library. It doesn't work for all cases, but it does work for primitives, lists, etc. It accesses the underlying raw memory representation, and is therefore able to overcome (to some extent) the difficulties mentioned in the other answers.
You can use it like this (after installing it via opam install batteries):
# #require "batteries";;
# Batteries.dump 1;;
- : string = "1"
# Batteries.dump 1.2;;
- : string = "1.2"
# Batteries.dump [1;2;3];;
- : string = "[1; 2; 3]"
If you want a more "proper" solution, use ppx_deriving as recommended by @octachron. It is much more reliable/maintainable/customizable.
What you are looking for is a meaningful function of type 'a. 'a -> string, with parametric polymorphism (i.e. a single function that can operate the same for all possible types 'a, even those that didn’t exist when the function was created). This is not possible in OCaml. Here are explanations depending on your programming background.
Coming from Haskell
If you were expecting such a function because you are familiar with the Haskell function show, then notice that its type is actually show :: Show a => a -> String. It uses an instance of the typeclass Show a, which is implicitly inserted by the compiler at call sites. This is not parametric polymorphism, this is ad-hoc polymorphism (show is overloaded, if you want). There is no such feature in OCaml (yet? there are projects for the future of the language, look for “modular implicits” or “modular explicits”).
Coming from OOP
If you were expecting such a function because you are familiar with OO languages in which every value is an object with a method toString, then this is not the case in OCaml. OCaml does not use the object model pervasively, and the run-time representation of OCaml values retains no (or very little) notion of type. I refer you to @octachron’s answer.
Again, toString in OOP is not parametric polymorphism but overloading: there is not a single method toString which is defined for all possible types. Instead there are multiple — possibly very different — implementations of a method of the same name. In some OO languages, programmers try to follow the discipline of implementing a method by that name for every class they define, but it is only a coding practice. One could very well create objects that do not have such a method.
[ Actually, the notions involved in both worlds are pretty similar: Haskell requires an instance of a typeclass Show a providing a function show; OOP requires an object of a class Stringifiable (for instance) providing a method toString. Or, of course, an instance/object of a descendant typeclass/class. ]
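In today's OCaml, the closest you can get is to pass the "instance" explicitly. The sketch below uses first-class modules for that; the names SHOW, Show_int and Show_list are invented for this example and are not part of any library.

```ocaml
(* An explicit "Show" dictionary, written as a module type. *)
module type SHOW = sig
  type t
  val show : t -> string
end

(* show takes the instance as an ordinary argument instead of having
   the compiler insert it, as Haskell does. *)
let show (type a) (module S : SHOW with type t = a) (x : a) : string =
  S.show x

module Show_int = struct
  type t = int
  let show = string_of_int
end

(* An instance for lists, built from an instance for the elements. *)
module Show_list (S : SHOW) = struct
  type t = S.t list
  let show l = "[" ^ String.concat "; " (List.map S.show l) ^ "]"
end

module Show_int_list = Show_list (Show_int)

let () =
  print_endline (show (module Show_int) 42);               (* 42 *)
  print_endline (show (module Show_int_list) [1; 2; 3])    (* [1; 2; 3] *)
```

The caller has to name the instance at every call site, which is exactly the bookkeeping that typeclasses (or the proposed modular implicits) would take care of.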
Another possibility is to use https://github.com/ocaml-ppx/ppx_deriving, which will create a function of type Path.To.My.Super.Type.t -> string that you can then use with your value. You still need to track the path of the type by hand, but it is better than nothing.
Another project provides a feature similar to Batteries: https://github.com/reasonml/reason-native/blob/master/src/console/README.md (I haven't tested Batteries, so I can't give an opinion on it). Both have the same limitation: they introspect the runtime encoding, so the result isn't really usable. I think it was written with Windows and the browser in mind, so if cross-platform support is required I would try this one first (unless Batteries is already pulled in). Even though the source code is in Reason, you can use it with the same API from OCaml.

how to print in ML

I've searched and found several people asking this question, but I can't find an explicit answer.
How can I print a non-string in sml?
For example if I have an instance of an ADT, i.e., of a type declared by datatype, and I would like to print the value for debugging. Am I responsible for writing a function which converts such an object to a string, and then print the string? Or is there some printer library I should use? Or is there some sort of printObject or toString function?
Also how can I print other non-string objects such as true and false?
It would appear that sml knows how to print such objects, because when I compile a file using C-l in emacs, I see output such as the following, showing that sml does know how to print the values.
[opening /Users/jimka/Repos/mciml/ex1.1.sml]
type key = string
datatype tree = LEAF | TREE of tree * string * tree
val empty = LEAF : tree
val insert = fn : key * tree -> tree
val member = fn : key * tree -> bool
val t1 = TREE (LEAF,"a",LEAF) : tree
val t2 = TREE (LEAF,"a",TREE (LEAF,"c",LEAF)) : tree
val t3 = TREE (LEAF,"a",TREE (TREE (LEAF,"b",LEAF),"c",LEAF)) : tree
val it = true : bool
val it = () : unit
How can I print a non-string in sml?
As I understand it, this is not possible (in a portable way). Depending on the implementation you're using it may expose a function that does this.
Also how can I print other non-string objects such as true and false?
Many types with corresponding basis library structures (e.g., int and Int) have a toString function, so you could print a bool b via print (Bool.toString b), and similarly with Int.toString for an int.
Some implementation specific thoughts:
For PolyML, you can use the function PolyML.print to print values of arbitrary types (though you may need to explicitly type annotate; the type of the argument should not have any type variables).
For SML/NJ, you might try taking a look at the approach discussed here https://sourceforge.net/p/smlnj/mailman/message/21897190/, though this seems like more trouble than it's worth.
For MLton, I'm not aware of anything like a polymorphic function, but they have a couple guides on implementing printf or similar.
It looks like Moscow ML supports a function Meta.printVal, but only in an interactive session. I'm not sure what support SML# has for this sort of thing.
Am I responsible for writing a function which converts such an object to a string, and then print the string?
Generally speaking, yes.
It would appear that sml knows how to print such objects
Depending on your SML implementation this is enabled because the REPL has access to more information than a program normally might. For instance, SML/NJ is able to do this because the REPL has access to type information not available elsewhere (for a source, see John Reppy's statements in the linked mailman thread).
You might also find MLton's TypeIndexedValues example page helpful for this sort of thing, though I haven't closely examined it for quality myself.

Use a pretty printer to write a to_string function

I have defined a big pretty printer pp: out_channel -> t -> unit over a big type t. Therefore, I can use it like Printf.fprintf stdout "%a" pp x where x: t, or chain printing like Printf.fprintf chan "%a" pp x where chan: out_channel.
Now I need to convert what is printed to a string or a text. Does anyone know if there is a way to leverage/use the function pp rather than writing a function to_string: t -> string from scratch?
Format.asprintf should suit your needs, if your pp is implemented for the Format.formatter type instead of out_channel. Format.formatter is a more general type and should be preferred to the concrete out_channel. In fact, Format.formatter -> 'a -> unit is more or less the standard type for a pretty printer; at least, it is the type required by the #install_printer directive in the OCaml toplevel, the debugger and other facilities. Functions of the same type are used in the Core library to implement the Pretty_printer interface.
So, if you reimplement your pp function to work with the Format module (usually it is enough to just open the Format module), then you can reuse it. Functions that print to an out_channel can't be retargeted to print into a string, so it is better not to write them.
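A minimal sketch of that advice, assuming a small made-up record type t in place of the big type from the question: once pp is written against Format.formatter, the same printer serves strings and channels alike.

```ocaml
(* Hypothetical small type standing in for the "big type t". *)
type t = { name : string; size : int }

(* One printer, written against Format.formatter. *)
let pp ppf v = Format.fprintf ppf "%s:%d" v.name v.size

(* Reuse 1: render to a string. *)
let to_string v = Format.asprintf "%a" pp v

(* Reuse 2: still print to any out_channel by wrapping it in a formatter. *)
let print_to_channel oc v =
  let ppf = Format.formatter_of_out_channel oc in
  pp ppf v;
  Format.pp_print_flush ppf ()

let () =
  print_endline (to_string { name = "a"; size = 3 });  (* prints "a:3" *)
  print_to_channel stdout { name = "b"; size = 4 };    (* prints "b:4" *)
  print_newline ()
```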
To make this work you need something that looks to OCaml like an out channel, but keeps the data in a string (or buffer) instead. There's nothing like this in OCaml, unfortunately.

OCaml equivalent to f# "%A"

In F#, the following is a no-brainer:
let l = [1;2;3;4]
let s = sprintf "%A" l
where "%A" prints a formatted version of virtually any common, even recursive data structure.
Is there something similarly easy in OCaml?
There is something close: the %a specifier accepts two arguments, the first is a pretty printer for type 'a, and the second is a value of type 'a. The type of the printer depends on which printf function is used. For example,
open Core_kernel.Std
open Format
printf "%a" Int63.pp Int63.one
Of course, this depends heavily on a good support from a library. If there is no pp function, provided for the type, then it is pretty useless.
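For the list from the question, the standard library alone is enough. This sketch builds a printer for int lists out of Format.pp_print_list and Format.pp_print_int; the name pp_int_list is chosen here for illustration.

```ocaml
(* A printer for int lists, in OCaml's usual "[a; b; c]" notation. *)
let pp_int_list ppf l =
  Format.fprintf ppf "[%a]"
    (Format.pp_print_list
       ~pp_sep:(fun ppf () -> Format.fprintf ppf "; ")
       Format.pp_print_int)
    l

let l = [1; 2; 3; 4]
let s = Format.asprintf "%a" pp_int_list l

let () = print_endline s  (* prints "[1; 2; 3; 4]" *)
```

Unlike F#'s "%A", the printer has to be named explicitly, but printers compose: pp_print_list takes the element printer as an argument, so nested structures are handled by nesting printers.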
Also there is a custom_printf syntax extension, available in both pp and ppx versions. In this extension you put a module name in place of the specifier. The module must have a to_string function. The ppx version requires an exclamation mark before the format string:
printf !"%{Int63}" Int63.one
There is also a dump function, available over the Internet. In particular you can find it in the Batteries library. It recurses over the data representation and prints it in a more or less human-readable form. But this is not related to formatted output.

user defined type for strings which starts with Letter

I want to have a user-defined type in OCaml which represents strings which start with an English letter and afterwards can have letters or digits. Is it possible to define such a custom type?
Jeffrey Scofield is right: there is no way in OCaml to define a type that would be the subset of strings verifying a given condition. You might however simulate that to some extent with a module and abstract or private data type, as in:
module Ident : sig
type t = private string
val create: string -> t
val (^): t -> t -> t
(* declare, and define below other functions as needed *)
end = struct
type t = string
let create s = (* do some check *) s
let (^) s1 s2 = create (s1 ^ s2)
end;;
Of course, the create function should check that the first char of s is a letter and the other ones letters or digits, and raise an exception if this is not the case, but this is left as an exercise. This way, you know that any s of type Ident.t respects the conditions checked in create: by making the type synonym private in the signature, you ensure that you must go through one of the functions of Ident to create such a value. Conversely, (s:>string) is recognized as a string, hence you can still use all built-in functions over it (but you'll get back a string, not an Ident.t).
Note however that there is a particular issue with strings: they are mutable (although that is bound to change in the upcoming 4.02 version), so that you can alter an element of Ident.t afterwards:
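One possible way to fill in the check, as a sketch: the helpers is_letter and is_digit and the choice of Invalid_argument as the exception are assumptions of this example, not something prescribed by the answer above.

```ocaml
module Ident : sig
  type t = private string
  val create : string -> t
  val ( ^ ) : t -> t -> t
end = struct
  type t = string

  let is_letter c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
  let is_digit c = c >= '0' && c <= '9'

  (* Accept only a letter followed by letters or digits. *)
  let create s =
    let n = String.length s in
    let rec rest_ok i =
      i >= n || ((is_letter s.[i] || is_digit s.[i]) && rest_ok (i + 1))
    in
    if n > 0 && is_letter s.[0] && rest_ok 1 then s
    else invalid_arg "Ident.create"

  (* The ^ on the right-hand side is still the built-in string one. *)
  let ( ^ ) s1 s2 = create (s1 ^ s2)
end

let a15 = Ident.create "a15"
let () = print_endline (a15 :> string)  (* prints "a15" *)
```

Calling Ident.create "5a" raises Invalid_argument, and outside the module there is no other way to obtain an Ident.t.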
let foo = "x0";;
let bar = Ident.create foo;;
foo.[0] <- '5';;
bar;;
will produce
- : Ident.t = "50"
If you restrict yourself to never modifying a string in place (again, this will be the default in the next version of OCaml), this cannot happen.
It's a little hard to answer, but I think the most straightforward answer is no. You want the type to be constrained by values, and this isn't something that's possible in OCaml. You need a language with dependent types for that.
You can define an OCaml type that represents such strings, but its values wouldn't also be strings. You couldn't use strings like "a15" as values of the type, or use the built-in ^ operator on them, etc. A value might look like S(Aa, [B1; B5]) (say). This is far too cumbersome to be useful.