I have been using Int.of_string, but I'm wondering what the lightest-dependency way is to parse an int, float, or other primitive type from a string so that it returns a Result or similar type, rather than raising an exception, when the string is not a valid value.
My first guess is a parser combinator library. Is there anything within a standard library like Base or Core that provides very simple parser combinator functionality? If not, would you go with angstrom?
Are you looking for something like:
# Scanf.sscanf "123abc" "%d" (fun x -> x);;
- : int = 123
Scanf simply stops when the input stops making sense for the format you request. But it still raises an exception if the input is totally incomprehensible, e.g.:
# Scanf.sscanf "abc" "%d" (fun x -> x);;
Exception:
Scanf.Scan_failure
"scanf: bad input at char number 0: character 'a' is not a decimal digit".
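For the original question (a Result instead of an exception, with no extra dependencies), here is a minimal sketch using only the standard library. It assumes OCaml 4.05 or later, where int_of_string_opt and float_of_string_opt are available. The names parse_int, parse_float and scan_int are hypothetical helpers, not part of any library.

```ocaml
(* Strict parsing: the whole string must be a valid literal. *)
let parse_int (s : string) : (int, string) result =
  match int_of_string_opt s with
  | Some n -> Ok n
  | None -> Error (Printf.sprintf "not an int: %S" s)

let parse_float (s : string) : (float, string) result =
  match float_of_string_opt s with
  | Some f -> Ok f
  | None -> Error (Printf.sprintf "not a float: %S" s)

(* Lenient parsing: wrap Scanf and turn its exceptions into Error.
   Like the sscanf call above, this accepts a valid prefix such as "123abc". *)
let scan_int (s : string) : (int, string) result =
  try Ok (Scanf.sscanf s "%d" (fun x -> x)) with
  | Scanf.Scan_failure msg | Failure msg -> Error msg
  | End_of_file -> Error "unexpected end of input"
```

Here parse_int "123abc" is an Error while scan_int "123abc" is Ok 123, so the choice depends on whether trailing input should be rejected.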
I am using a 3rd party module which exposes a function:
val pp : Format.formatter -> 'a -> unit
Unfortunately it doesn't expose a to_string (or show) function.
I want to find a way to use the result of pp in a format string, something like:
let output = Format.sprintf "Result: %s" (SomeModule.pp fmt myval)
But pp writes to fmt and returns unit so of course this is not valid.
I can tell I need to somehow make a formatter to pass to pp that writes to a string buffer, whose contents I can then get as a string and pass as an argument to sprintf.
The use of pp like functions for making types printable seems pretty ubiquitous in OCaml (e.g. ppx_deriving show generates them) so I feel like there should be a simple way to achieve this, but I'm currently missing it.
By using asprintf instead, it's possible to use the %a format specifier to pass two arguments, a printer function and the value to be printed, which will then be formatted accordingly and inserted in its place:
let output = Format.asprintf "Result: %a" SomeModule.pp myval
The reason asprintf has to be used instead of sprintf is that the latter specifies an "input source" (the second argument of the format type) of type unit, while the former uses a formatter. This is what's going to be passed to the printer.
It still eludes me why there's a need for sprintf though, rather than just having asprintf. Perhaps there's some performance-related reason for it, but my guess is that it's just an artifact of history.
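To make the asprintf approach concrete, here is a small self-contained sketch. The point type and pp_point printer are hypothetical stand-ins for SomeModule.pp and its value. The helper names to_string_of_pp and to_string_via_buffer are also made up for illustration. The second helper spells out the buffer-backed formatter the question describes.

```ocaml
type point = { x : int; y : int }

(* A printer with the usual shape: Format.formatter -> 'a -> unit *)
let pp_point fmt p = Format.fprintf fmt "(%d, %d)" p.x p.y

(* %a takes the printer and then the value *)
let output = Format.asprintf "Result: %a" pp_point { x = 1; y = 2 }

(* Generic way to derive a to_string from any such printer *)
let to_string_of_pp pp v = Format.asprintf "%a" pp v

(* The same thing done by hand with a formatter that writes to a Buffer *)
let to_string_via_buffer pp v =
  let buf = Buffer.create 64 in
  let fmt = Format.formatter_of_buffer buf in
  pp fmt v;
  Format.pp_print_flush fmt ();  (* flush pending output into the buffer *)
  Buffer.contents buf
```

With these definitions, output is the string "Result: (1, 2)".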
I use VS Code with the "OCaml and Reason IDE" extension.
Here is my result in utop:
utop # 1. = 1. ;;
Line 1, characters 0-2:
Error: This expression has type float but an expression was expected of type
int
And also for String:
utop # "Me" = "Me";;
Line 1, characters 0-4:
Error: This expression has type string but an expression was expected of type
int
The same happens for anything but int:
utop # 2 = 2 ;;
- : bool = true
The > and < operators show the same symptom. I don't know what is actually happening. Can anyone help me out? Thanks a lot!
You are probably using Jane Street's Base library. Maybe you imported it like this:
open Base;;
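Base tries to limit exceptions to functions that have an explicit _exn suffix, so it shadows the built-in polymorphic equality (=), which can raise an exception on some inputs (for example, if you compare structures containing functions). After open Base, the comparison operators (=, <, > and so on) are specialized to int, which is why the error message says an int was expected.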
You can get polymorphic equality back as follows:
let (=) = Poly.(=);;
Or you can use it with a local open: Poly.(x = y).
There are pros and cons to polymorphic comparison.
The consensus seems to be that using monomorphic comparison (for example, String.equal, etc) is a more robust choice, even though it is less convenient.
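The trade-off can be illustrated with the standard library alone. This sketch deliberately does not use Base, so that it runs without extra dependencies; Base's own per-type modules offer analogous functions. The handler record type is hypothetical and only serves to show polymorphic equality raising on a structure that contains functions.

```ocaml
(* Monomorphic comparisons: one function per type, no surprises *)
let float_eq = Float.equal 1. 1.
let string_eq = String.equal "Me" "Me"
let int_lt = Int.compare 1 2 < 0

(* Polymorphic equality raises when it reaches a functional value *)
type handler = { name : string; run : int -> int }

let poly_equal_raises =
  let a = { name = "a"; run = (fun x -> x + 1) } in
  let b = { name = "a"; run = (fun x -> x + 2) } in
  (* Stdlib's (=) compares the name fields, then hits the closures *)
  try ignore (a = b); false with Invalid_argument _ -> true
```

All four bindings evaluate to true with the Stdlib operators in scope.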
I have defined a big pretty printer pp: out_channel -> t -> unit over a big type t. Therefore, I can use it like pp stdout x where x: t, or chain printing like Printf.fprintf chan "%a" pp x where chan: out_channel.
Now I need to convert what is printed to a string or a text. Does anyone know if there is a way to reuse the function pp rather than writing a function to_string: t -> string from scratch?
Format.asprintf should suit your needs, if your pp is implemented for the Format.formatter type instead of out_channel. Format.formatter is a more general type and should be preferred to the concrete out_channel. In fact, Format.formatter -> 'a -> unit is something of a standard type for pretty printers; at least, it is the type required by the #install_printer directive in the OCaml toplevel, the debugger, and other facilities. Functions of the same type are used in the Core library to implement the Pretty_printer interface.
So, if you reimplement your pp function to work with the Format module (usually it is enough just to open the Format module), then you can reuse it. Functions that print to an out_channel can't be retargeted to print into a string, so it is better not to write them.
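As a rough sketch of that suggestion, here is a printer written once against Format.formatter and then reused for both a string and an out_channel. The tree type t and the helper names to_string and print_to_channel are hypothetical; the real type in the question is much bigger.

```ocaml
type t = Leaf of int | Node of t * t

(* Written against Format.formatter, so the destination is not fixed *)
let rec pp fmt = function
  | Leaf n -> Format.fprintf fmt "%d" n
  | Node (l, r) -> Format.fprintf fmt "(%a %a)" pp l pp r

(* Retarget to a string *)
let to_string v = Format.asprintf "%a" pp v

(* Retarget to any out_channel; "@." prints a newline and flushes *)
let print_to_channel chan v =
  let fmt = Format.formatter_of_out_channel chan in
  Format.fprintf fmt "%a@." pp v
```

For example, to_string (Node (Leaf 1, Node (Leaf 2, Leaf 3))) gives "(1 (2 3))", and print_to_channel stdout prints the same text followed by a newline.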
To make this work you need something that looks to OCaml like an out channel, but keeps the data in a string (or buffer) instead. There's nothing like this in OCaml, unfortunately.
In F#, the following is a no-brainer:
let l = [1;2;3;4]
let s = sprintf "%A" l
where "%A" prints a formatted version of virtually any common, even recursive data structure.
Is there something similarly easy in OCaml?
There is something close: the %a specifier accepts two arguments, the first is a pretty printer for type 'a, and the second is a value of type 'a. The type of the printer depends on which printf function is used. For example,
open Core_kernel.Std
open Format
printf "%a" Int63.pp Int63.one
Of course, this depends heavily on good support from the library. If no pp function is provided for the type, then it is pretty useless.
Also there is a custom_printf syntax extension available for both pp and ppx. In this extension you put a module name in place of the specifier. The module must have a to_string function. The ppx version requires an exclamation mark before the format string:
printf !"%{Int63}" Int63.one
There is also a dump function, available over the Internet. In particular you can find it in the Batteries library. It recurses over the data representation and prints it in a more or less human-readable form. But this is not related to formatted output.
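For the list from the F# example, the %a approach can be sketched with the standard library's Format.pp_print_list (available since OCaml 4.02). The name pp_int_list is hypothetical. Unlike F#'s %A, the printer has to be assembled by hand for each type.

```ocaml
(* Compose a list printer from an element printer and a separator *)
let pp_int_list fmt l =
  Format.fprintf fmt "[%a]"
    (Format.pp_print_list
       ~pp_sep:(fun fmt () -> Format.fprintf fmt "; ")
       Format.pp_print_int)
    l

let l = [1; 2; 3; 4]
let s = Format.asprintf "%a" pp_int_list l
```

Here s is "[1; 2; 3; 4]".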
I want to have a user-defined type in OCaml which represents strings which start with an English letter and afterwards can have letters or digits. Is it possible to define such a custom type?
Jeffrey Scofield is right: there is no way in OCaml to define a type that would be the subset of strings verifying a given condition. You might however simulate that to some extent with a module and abstract or private data type, as in:
module Ident : sig
  type t = private string
  val create: string -> t
  val (^): t -> t -> t
  (* declare, and define below other functions as needed *)
end = struct
  type t = string
  let create s = (* do some check *) s
  let (^) s1 s2 = create (s1 ^ s2)
end;;
Of course, the create function should check that the first char of s is a letter and the other ones letters or digits, and raise an exception if this is not the case, but this is left as an exercise. This way, you know that any s of type Ident.t respects the conditions checked in create: by making the type synonym private in the signature, you ensure that you must go through one of the functions of Ident to create such a value. Conversely, (s :> string) is recognized as a string, hence you can still use all built-in functions over it (but you'll get back a string, not an Ident.t).
Note however that there is a particular issue with strings: they are mutable (although that is bound to change in the upcoming 4.02 version), so that you can alter an element of Ident.t afterwards:
let foo = "x0";;
let bar = Ident.create foo;;
foo.[0] <- '5';;
bar;;
will produce
- : Ident.t = "50"
If you restrict yourself to never modifying a string in place (again, this will be the default in the next version of OCaml), this cannot happen.
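One possible way to fill in the check that was left as an exercise is sketched below. The helpers is_letter, is_digit and valid, and the extra create_opt function, are assumptions added for illustration and are not part of the answer above. Only ASCII letters are treated as English letters.

```ocaml
module Ident : sig
  type t = private string
  val create : string -> t            (* raises Invalid_argument *)
  val create_opt : string -> t option (* exception-free variant *)
  val ( ^ ) : t -> t -> t
end = struct
  type t = string

  let is_letter = function 'a' .. 'z' | 'A' .. 'Z' -> true | _ -> false
  let is_digit = function '0' .. '9' -> true | _ -> false

  (* Non-empty, first char a letter, the rest letters or digits *)
  let valid s =
    let n = String.length s in
    let rec rest i =
      i >= n || ((is_letter s.[i] || is_digit s.[i]) && rest (i + 1))
    in
    n > 0 && is_letter s.[0] && rest 1

  let create s = if valid s then s else invalid_arg ("Ident.create: " ^ s)
  let create_opt s = if valid s then Some s else None

  (* Defined last, so the ^ on the right is still the built-in one *)
  let ( ^ ) s1 s2 = create (s1 ^ s2)
end
```

With this, (Ident.create "a15" :> string) is "a15", while Ident.create_opt "1a" is None.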
It's a little hard to answer, but I think the most straightforward answer is no. You want the type to be constrained by values, and this isn't something that's possible in OCaml. You need a language with dependent types for that.
You can define an OCaml type that represents such strings, but its values wouldn't also be strings. You couldn't use strings like "a15" as values of the type, or use the built-in ^ operator on them, etc. A value might look like S(Aa, [B1; B5]) (say). This is far too cumbersome to be useful.