Simple timing profiler for functional languages - profiling

I needed a simple timing profiler to estimate the runtime of some parts of my program (written in OCaml, but I believe this could apply to other functional languages), and I couldn't find a very simple solution, similar to what one would write in an imperative language with functions such as timer.start/timer.stop. So I wrote one using lazy evaluation, and it works quite well for what I need. However, I didn't find any references to this method, so I wonder if the approach is flawed or if there is a simpler solution.
So, the question is: do you know of similar implementations for functional languages (especially OCaml)? If so, please point me to them; I'd like to borrow some of their ideas to improve my "poorer man's profiler" (I've seen this question but it didn't help me much). From what I've seen, GHC already has a way to collect timing information, so it's probably not an issue for Haskell.
By the way, I tried doing timing profiling as indicated in the OCaml manual (17.4), but it was too "low-level" for what I needed: it gives lots of information at the C function level, which makes it harder to evaluate precisely which part of the OCaml code is the culprit.
Below is my implementation in OCaml (note that I need to add the "lazy" expression every time I want to measure the time, but at the same time I can finely control how much information I need).
open Unix (* for the timers *)

(** [timers] associates keys (strings) to time counters,
    to allow for multiple simultaneous measurements. *)
let timers : (string, (float * float)) Hashtbl.t = Hashtbl.create 1

(** Starts the timer associated with key [name]. *)
let timer_start (name : string) : unit =
  let now = Unix.times () in
  Hashtbl.replace timers name (now.tms_utime, now.tms_stime)

(** Returns the time elapsed between the corresponding call to
    [timer_start] and this call. *)
let timer_stop (name : string) : float =
  try
    let now = Unix.times () in
    let t = Hashtbl.find timers name in
    (now.tms_utime -. fst t) +. (now.tms_stime -. snd t)
  with Not_found -> 0.0

(** Wrapper for the timer functions using lazy evaluation. *)
let time (s : string) (e : 'a Lazy.t) : 'a =
  timer_start s;
  let a = Lazy.force e in
  let t2 = timer_stop s in
  (* outputs timing information *)
  Printf.printf "TIMER,%s,%f\n" s t2;
  a

(** Example *)
let rec fibo n =
  match n with
  | 0 -> 1
  | 1 -> 1
  | _ -> fibo (n - 1) + fibo (n - 2)

let main =
  let f = time "fibo" (lazy (fibo 42)) in
  Printf.printf "f = %d\n" f

Unix.times measures CPU time, not wall-clock time, so this is suitable only for computational code that spends all of its time on the CPU. And by the way, the hash table is not needed, even for multiple simultaneous measurements: just return the start time from timer_start and subtract it in timer_stop (as sketched below).
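To make that concrete, here is a minimal sketch of the hash-table-free variant (my own rendering of the suggestion, not code from the comment):
open Unix (* for Unix.times *)

(* timer_start returns the current CPU time; the caller keeps it
   and passes it back to timer_stop. *)
let timer_start () : float =
  let t = Unix.times () in
  t.tms_utime +. t.tms_stime

let timer_stop (start : float) : float =
  let t = Unix.times () in
  (t.tms_utime +. t.tms_stime) -. start

(* Usage: *)
let () =
  let t0 = timer_start () in
  (* ... the code to measure ... *)
  Printf.printf "elapsed: %f\n" (timer_stop t0)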

Merging the ideas from #Jeffrey_Scofield and #ygrek, the "poorest man's timing profiler" is indeed so simple that it barely requires mention at all, which would explain why I hadn't found it. Combining their answers, I produced a much simpler version:
open Unix (* for the timers *)

(* Wrapper for the timer function using a "unit -> 'a" thunk *)
let time (s : string) (e : unit -> 'a) : 'a =
  let tstart = Unix.times () in
  let a = e () in
  let tend = Unix.times () in
  let delta =
    (tend.tms_utime -. tstart.tms_utime) +.
    (tend.tms_stime -. tstart.tms_stime)
  in
  (* outputs timing information *)
  Printf.printf "TIMER,%s,%f\n" s delta;
  a

(* Example *)
let rec fibo n =
  match n with
  | 0 -> 1
  | 1 -> 1
  | _ -> fibo (n - 1) + fibo (n - 2)

let main =
  let f = time "fibo" (fun () -> fibo 42) in
  Printf.printf "f = %d\n" f

Related

How much does Hashtbl.find affect performance?

When I measure the execution time with Hashtbl.find the program is 16x slower than without it. Why is that?
Note that the equivalent code in Node does not show as much of a difference with or without the lookup table (Map or Object): it is only 3x slower.
OCaml code:
let fib =
  let table = Hashtbl.create 1000 in
  let rec f n =
    try Hashtbl.find table n
    with Not_found -> (
      match n with
      | 0 -> 0
      | 1 -> 1
      | n ->
        let r = f (n - 1) + f (n - 2) in
        (* Hashtbl.add table n r ; *)
        r)
  in
  f
The Hashtbl.add is commented out on purpose; I'm just interested in the performance cost of the Hashtbl.find.
The Hashtbl.find function is not free even when applied to an empty hash table, because it computes the hash of the provided key. Since you're using the polymorphic hash table implementation, a generic hash function (implemented in C) is used. All of this incurs some overhead with respect to the default payload of a Fibonacci function, which is only three arithmetic operations (i.e., an overhead of 20x3=60 arithmetic operations).
If we use the functorial interface to provide a more efficient hash function, we reduce the overhead to something close to 3x:
module Table = Hashtbl.Make(struct
  type t = int
  let equal : int -> int -> bool = fun x y -> x = y [@@inline]
  let hash x = x [@@inline]
end)

let table = Table.create 127

let fib1 x =
  let rec f n = match n with
    | 0 -> 0
    | 1 -> 1
    | n -> match Table.find_opt table n with
      | Some x -> x
      | None ->
        let r = f (n - 1) + f (n - 2) in
        (* Hashtbl.add table n r ; *)
        r in
  f x
Note that I also switched from using exceptions to the option type. Setting up an exception handler inside a recursive function adds extra overhead on each recursive call; basically, the try statement has a runtime cost.
If we compare the running time of the implementation with the hash table (fib1) and without it (fib2), we get the following numbers (in ms, on my 2 GHz machine, for n=32):
fib1: 53.3791
fib2: 18.1501
This gives us an overhead of about 3x (6 arithmetic operations on top of the Fibonacci kernel itself), which more or less corresponds to the overhead of the modulo operation (two arithmetic operations) plus three extra calls (the find itself, our hash function, and the Array.length function).
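For reference, the fib2 used as the baseline is not shown in the answer; presumably it is just the table-free doubly recursive version, something along the lines of this sketch (the exact code is my assumption):
(* Plain doubly recursive Fibonacci, used as the baseline (fib2). *)
let fib2 x =
  let rec f n = match n with
    | 0 -> 0
    | 1 -> 1
    | n -> f (n - 1) + f (n - 2) in
  f x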
You can also try the hash table implementation provided by the Jane Street Core library, which is usually more efficient.

How to create a big number of threads in OCaml?

I found a topic in the Racket group about the performance of channel creation.
I want to write an OCaml version to test it.
let post (c, x) = Event.sync (Event.send c x);;
let accept c = Event.sync (Event.receive c);;
let get_chan c = let n = accept c in print_int n; print_newline ();;

let chan_trans (old_chan, new_chan) =
  let s = accept old_chan in
  post (new_chan, (s + 1));;

let rec whisper count init_val =
  let rec aux n chan =
    if n >= count then chan
    else
      let new_chan = Event.new_channel () in
      Thread.create chan_trans (chan, new_chan);
      aux (n + 1) new_chan
  in
  let leftest_chan = Event.new_channel () in
  let t0 = Thread.create post (leftest_chan, init_val) in
  let rightest_chan = aux 0 leftest_chan in
  get_chan rightest_chan;;

whisper 10000 1;;
The question is: when I tested whisper 1000 1, it produced 1001 as expected. However, when I tried whisper 10000 1, I got this error:
Fatal error: exception Sys_error("Thread.create: Resource temporarily unavailable")
I used this command to compile and run:
ocamlc -I +threads -thread unix.cma threads.cma -o prog whisper.ml && ./prog
OCaml's Thread module uses real system (kernel) threads. The total number of threads is bounded by the kernel:
cat /proc/sys/kernel/threads-max
251422
You can increase this of course,
echo 100000 > /proc/sys/kernel/threads-max
but a better approach would be to treat threads as a resource and manage them accordingly.
let rec whisper count init_val =
  let rec aux n t chan =
    if n >= count then chan
    else
      let new_chan = Event.new_channel () in
      let t' = Thread.create chan_trans (chan, new_chan) in
      Thread.join t;
      aux (n + 1) t' new_chan in
  let leftest_chan = Event.new_channel () in
  let t = Thread.create post (leftest_chan, init_val) in
  let rightest_chan = aux 0 t leftest_chan in
  get_chan rightest_chan
In that case it will run with any size of the pipeline. For example:
$ ocamlbuild -use-ocamlfind -tag thread -pkg threads ev.native
$ time ./ev.native
100001
real 0m1.581s
But this implementation of Chinese Whispers is very crude and inefficient. You shouldn't use heavyweight native threads for this (and Go doesn't use them either). Instead, you should use cooperative lightweight threads from the Lwt or Async libraries. This would be much more efficient and nicer.
Implementation with Lwt
This implementation closely follows the Go implementation from the blog post, but I think we can do this more efficiently and concisely in OCaml without using mailboxes (though I'm not sure whether that would conform to the rules of the benchmark).
open Lwt.Infix

let whispers n =
  let rec whisper i p =
    if i < n then
      Lwt_mvar.take p >>= fun x ->
      whisper (i + 1) (Lwt_mvar.create (x + 1))
    else Lwt_mvar.take p in
  whisper 0 (Lwt_mvar.create 1)

let () = print_int @@ Lwt_main.run (whispers 100000)
The results are:
$ ocamlbuild -use-ocamlfind -tag thread -pkg lwt.unix lev.native --
$ time ./lev.native
100001
real 0m0.007s
To compare, the Go implementation on my machine:
$ go build whispers.go
$ time ./whispers
100001
real 0m0.952s
"Slow" implementation
The code above is a completely honest reimplementation of the original Go version. But one of the reasons it is so fast is that OCaml and Lwt are very clever: although it creates 100_000 threads and 100_001 channels, no thread ever gets suspended, since every time whisper is called the channel already contains data, so the thread is in a ready state. As a result, this is just an efficient loop that creates threads and channels. It can create a million threads in 50 ms.
So this is an idiomatic and correct way of doing things. But let's, for the sake of a fair comparison, mimic Go's behavior. The following implementation first eagerly creates 100_001 channels and 100_000 threads in the heap, each waiting to transfer data from its left channel to its right channel, and only afterwards puts a value into the leftmost channel to provoke the chain reaction. This basically mimics what is happening in Go under the hood.
let whispers n =
  let rec loop i p =
    if i < n then
      let p' = Lwt_mvar.create_empty () in
      let _t =
        Lwt_mvar.take p >>= fun x ->
        Lwt_mvar.put p' (x + 1) in
      loop (i + 1) p'
    else Lwt_mvar.take p in
  let p0 = Lwt_mvar.create_empty () in
  let t = loop 1 p0 in
  Lwt_mvar.put p0 1 >>= fun () -> t
$ time ./lev.native
100001
real 0m0.111s
So it is slightly slower here; in fact it is 20 times slower than the previous implementation (I used 1 million threads to compare them), but it is still 10 times faster than Go.
Reading the linked post, it seems you might want to use Lwt, which is a "cooperative threads library for OCaml". The result would look something like this:
let whisper left right =
  let%lwt n = Lwt_mvar.take right in
  Lwt_mvar.put left (n + 1)

let main () =
  let n = 100_000 in
  let%lwt () = Lwt_io.printf "With %d mvars!\n" n in
  let leftmost = Lwt_mvar.create_empty () in
  let rec setup_whispers left i =
    if i >= n
    then left
    else let right = Lwt_mvar.create_empty () in
         let () = Lwt.async (fun () -> whisper left right) in
         setup_whispers right (i + 1) in
  let rightmost = setup_whispers leftmost 0 in
  let%lwt () = Lwt_mvar.put rightmost 1 in
  let%lwt res = Lwt_mvar.take leftmost in
  Lwt_io.printf "%d\n" res

let () = Lwt_main.run (main ())
And then compiling and running it
$ ocamlbuild -use-ocamlfind -pkg lwt,lwt.ppx,lwt.unix whisper.native
$ time ./whisper.native
With 100000 mvars!
100001
real 0m0.169s
user 0m0.156s
sys 0m0.008s

How to take sublist without first and last item with F#?

I have a sorted list of integer values:
let ls = [1..4]
How can I get a sublist without the first and the last element? (In the most optimal way.)
The expected result is [2; 3].
This is what I have so far, and yeah, it's working, but in my opinion it's just not the best approach.
[1..4] |> List.tail |> List.rev |> List.tail |> List.sort
A somewhat long answer incoming in response to your innocently worded qualifier: "In the most optimal way"
Optimal in terms of what?
Performance? (Most likely)
Performance but also include GC performance?
Memory usage?
x86?
x64?
And so on...
So I decided to measure some aspects of the problem.
I measured the different answers in this thread (and added a non-idiomatic version as well) in various different contexts.
Without further ado here is the program I used to measure
open System
open System.Diagnostics
open System.IO

module so29100251 =

    // Daystate solution (OP)
    module Daystate =
        // Applied minor fixes to it
        let trim = function
            | [] | [_] | [_;_] -> []
            | ls -> ls |> List.tail |> List.rev |> List.tail |> List.rev

    // kaefer solution
    module kaefer =
        type 'a State = Zero | One | Other of 'a

        let skipFirstAndLast xss =
            let rec aux acc = function
                | _, [] -> List.rev acc
                | Zero, x::xs -> aux acc (One, xs)
                | One, x::xs -> aux acc (Other x, xs)
                | (Other prev), x::xs -> aux (prev :: acc) (Other x, xs)
            aux [] (Zero, xss)

    // Petr solution
    module Petr =
        let rec trimImpl ls acc =
            match ls, acc with
            | [], _ -> acc
            | h::[], acc -> List.rev acc
            | h::n::t, [] -> trimImpl t [n]
            | h::t, acc -> trimImpl t (h::acc)

        let trim ls = trimImpl ls []

    // NonIdiomatic solution
    module NonIdiomatic =
        let trim (hint : int) (ls : 'T list) =
            // trims last of rest
            // Can't ask for ls.Length as that is O(n)
            let ra = ResizeArray<_> (hint)
            // Can't use for x in list do as it relies on .GetEnumerator ()
            let mutable c = ls
            while not c.IsEmpty do
                ra.Add c.Head
                c <- c.Tail

            let count = ra.Count
            let mutable result = []
            for i in (count - 2)..(-1)..1 do
                result <- ra.[i]::result
            result
open so29100251

type Time = MilliSeconds of int64

type TestKind<'T> =
    | Functional of 'T
    | MeasurePerformance of int*int

[<EntryPoint>]
let main argv =
    let factor = 10000000
    // let maxHint = Int32.MaxValue
    let maxHint = 100

    let time (action : unit -> 'T) : 'T*Time =
        let sw = Stopwatch ()
        sw.Start ()
        let r = action ()
        sw.Stop ()
        r, MilliSeconds sw.ElapsedMilliseconds

    let adapt fn hint ls = fn ls

    let trimmers =
        [|
            "Daystate"      , adapt Daystate.trim
            "kaefer"        , adapt kaefer.skipFirstAndLast
            "Petr"          , adapt Petr.trim
            "NonIdiomatic"  , NonIdiomatic.trim
        |]

#if DEBUG
    let functionalTestCases =
        [|
            Functional []       , "empty"      , []
            Functional []       , "singleton"  , [1]
            Functional []       , "duoton"     , [1;2]
            Functional [2]      , "triplet"    , [1;2;3]
            Functional [2;3]    , "quartet"    , [1;2;3;4]
        |]

    let performanceMeasurements = [||]
#else
    let functionalTestCases = [||]

    let performanceMeasurements =
        [|
            "small"     , 10
            "big"       , 1000
            "bigger"    , 100000
            // "huge"   , 10000000
        |] |> Array.map (fun (name, size) -> MeasurePerformance (size, (factor / size)) , name , [for x in 1..size -> x])
#endif

    let testCases =
        [|
            functionalTestCases
            performanceMeasurements
        |] |> Array.concat

    use tsv = File.CreateText ("result.tsv")
    tsv.WriteLine (sprintf "TRIMMER\tTESTCASE\tSIZE\tHINT\tRUNS\tMEMORY_BEFORE\tMEMORY_AFTER\tGC_TIME\tRUN_TIME")

    for trimName, trim in trimmers do
        for testKind, testCaseName, testCase in testCases do
            match testKind with
            | Functional expected ->
                let actual = trim 0 testCase
                if actual = expected then
                    printfn "SUCCESS: Functional test of %s trim on testcase %s successful" trimName testCaseName
                else
                    printfn "FAILURE: Functional test of %s trim on testcase %s failed" trimName testCaseName
            | MeasurePerformance (size,testRuns) ->
                let hint = min size maxHint
                let before = GC.GetTotalMemory(true)

                printfn "MEASURE: Running performance measurement on %s trim using testcase %s..." trimName testCaseName

                let timeMe () =
                    for x in 1..testRuns do
                        ignore <| trim hint testCase
                let _, MilliSeconds ms = time timeMe

                let after = GC.GetTotalMemory(false)
                let timeGC () =
                    ignore <| GC.GetTotalMemory(true)
                let _, MilliSeconds msGC = time timeGC

                printfn "...%d ms (%d runs), %d (before) %d (after) %d ms (GC)" ms testRuns before after msGC
                tsv.WriteLine (sprintf "%s\t%s\t%d\t%d\t%d\t%d\t%d\t%d\t%d" trimName testCaseName size hint testRuns before after msGC ms)

    0
I then measured the execution time and GC time in four configurations (the size hint is only used by the non-idiomatic version): x64 with the maximum size hint allowed, x86 with the maximum size hint allowed, x64 with the hint capped at 100, and x86 with the hint capped at 100. (The original answer presents these results as charts, not reproduced here.)
Looking at the performance charts we can note several somewhat surprising things:
All variants iterate 10000000 times. One would expect the execution time not to differ between the variants, but it does.
The crusty old x86 scores consistently better overall. I won't speculate why.
The OP's initial version, while seemingly wasteful, scores pretty well. It's probably helped by the fact that List.rev is highly optimized (IIRC it does some safe cheating available only to the F# core library devs).
The kaefer version, while on paper a better solution, seems to score the worst. I think it's because it allocates extra State objects, which are heap-based. (This should obviously not be interpreted as a criticism of kaefer's skills.)
The non-idiomatic solution scores well with good size hints, but not as well as I expected. It might be that building the final list is what costs the most cycles. It might also be that tail-recursive functions over lists are more efficient than while loops, as IIRC pattern matching is more effective than calling List.Tail/List.Head/List.IsEmpty.
GC time is almost as large as the execution time.
I expected the GC time of the non-idiomatic solution to be significantly lower than the rest. However, while the ResizeArray<_> is probably quick to collect, the list objects aren't.
On x86 the performance difference between Petr's solution and the non-idiomatic one might not warrant the extra complexity.
Some final thoughts:
The OP's original solution did pretty well.
Garbage collection takes time.
Always measure...
Hopefully this was somewhat interesting.
Edit:
The GC performance measurement numbers should not be over-interpreted into anything more than: "GC can be expensive".
I later changed the while loop to tail recursion over a list, which improved the performance somewhat, but not enough to warrant an update of the charts.
This is one of the ways:
let rec trim ls acc =
    match ls, acc with
    | [], _ -> acc
    | h::[], acc -> List.rev acc
    | h::n::t, [] -> trim t [n]
    | h::t, acc -> trim t (h::acc)

let result = trim ls []
You didn't require standard library functions to achieve this; you're just asking for an efficient way. Defining a recursive function with an accumulator which holds the intermediate results would then appear a viable solution, even when the list has to be reversed at its termination.
I'm providing a custom discriminated union to keep track of the state; it is modelled along the lines of the Option type with an extra case.
type 'a State = Zero | One | Other of 'a

let skipFirstAndLast xss =
    let rec aux acc = function
        | _, [] -> List.rev acc
        | Zero, x::xs -> aux acc (One, xs)
        | One, x::xs -> aux acc (Other x, xs)
        | (Other prev), x::xs -> aux (prev :: acc) (Other x, xs)
    aux [] (Zero, xss)
[1..4] |> skipFirstAndLast // val it : int list = [2; 3]

Efficient input in OCaml

Suppose I am writing an OCaml program and my input will be a large stream of integers separated by spaces i.e.
let string = input_line stdin;;
will return a string which looks like e.g. "2 4 34 765 5 ..." Now, the program itself will take a further two values i and j which specify a small subsequence of this input on which the main procedure will take place (let's say that the main procedure is to find the maximum of this sublist). In other words, the whole stream will be fed into the program, but the program will only end up acting on a small subset of the input.
My question is: what is the best way to translate the relevant part of the input stream into something usable i.e. a string of ints? One option would be to convert the whole input string into a list of ints using
let list = List.map int_of_string (Str.split (Str.regexp_string " ") string);;
and then once the bounds i and j have been entered one easily locates the relevant sublist and its maximum. The problem is that the initial pre-processing of the large stream is immensely time-consuming.
Is there an efficient way of locating the small sublist directly from the large stream i.e. processing the input along with the main procedure?
OCaml's standard library is rather small. It provides a necessary and sufficient set of orthogonal features, as any good standard library should. But usually this is not enough for a casual user. That's why libraries exist that cover the common cases.
I would like to mention the two most prominent libraries: Jane Street's Core library and Batteries Included (aka Core and Batteries).
Both libraries provide a bunch of high-level I/O functions, but there is a little problem. It is not possible, or even reasonable, to try to address every use case in a library; otherwise the library's interface won't be terse and comprehensible. And your case is non-standard. There is a convention, a tacit agreement between data engineers, to represent a set of things with a set of lines in a file, and to represent one "thing" (or a feature) with a line. So, if you have a dataset where each element is a scalar, you should represent it as a sequence of scalars separated by newlines. Several elements on a single line are only for multidimensional features.
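For illustration only (my own sketch, not part of the answer; it assumes OCaml >= 4.04 for String.split_on_char): re-encoding a space-separated dump like the one in the question into the one-scalar-per-line convention takes just a few lines:
(* Read one big line like "2 4 34 765 5 ..." and print one integer per line. *)
let () =
  input_line stdin
  |> String.split_on_char ' '
  |> List.filter (fun s -> s <> "")
  |> List.iter print_endline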
So, with a proper representation, your problem can be solved as simply as (with Core):
open Core.Std

let () =
  let filename = "data" in
  let max_number =
    let open In_channel in
    with_file filename
      ~f:(fold_lines ~init:0
            ~f:(fun m s -> Int.(max m @@ of_string s))) in
  printf "Max number in %s is %d\n" filename max_number
You can compile and run this program with corebuild test.byte --, assuming that the code is in a file named test.ml and the Core library is installed (with opam install core if you're using opam).
Also, there exists an excellent library, Lwt, that provides a monadic high-level interface to I/O. With this library, you can parse a set of scalars in the following way:
open Lwt

let program =
  let filename = "data" in
  let lines = Lwt_io.lines_of_file filename in
  Lwt_stream.fold (fun s m -> max m @@ int_of_string s) lines 0 >>=
  Lwt_io.printf "Max number in %s is %d\n" filename

let () = Lwt_main.run program
This program can be compiled and run with ocamlbuild -package lwt.unix test.byte --, if the lwt library is installed on your system (opam install lwt).
So, this is not to say that your problem cannot be solved (or is hard to solve) in OCaml; it is just to point out that you should start with a proper representation. But suppose you do not own the representation and cannot change it. Let's look at how this can be solved efficiently in OCaml. As the previous examples show, in general your problem can be described as folding over a channel, i.e. applying a function f successively to each value in a file. So we can define a function fold_channel that reads an integer value from a channel and applies a function to it and the previously accumulated value. Of course, this function could be further abstracted by lifting out the format argument, but for demonstration purposes, I suppose, this will be enough.
let rec fold_channel f init ic =
  try Scanf.fscanf ic "%u " (fun s -> fold_channel f (f s init) ic)
  with End_of_file -> init

let () =
  let max_value = open_in "atad" |> fold_channel max 0 in
  Printf.printf "max value is %u\n" max_value
Although, I should note that this implementation is not meant for heavy-duty work. It is not even tail-recursive. If you need a really efficient lexer, you can use OCaml's lexer generator, for example.
Update 1
Since there is the word "efficient" in the title, and everybody likes benchmarks, I've decided to compare these three implementations. Of course, since the pure OCaml implementation is not tail-recursive, it is not directly comparable to the others. You may wonder why it is not tail-recursive, since all calls to fold_channel are in tail position. The problem is the exception handler: on each call to fold_channel, we need to remember the init value, since we're going to return it. This is a common issue with recursion and exceptions; you may google it for more examples and explanations.
So first we need to fix the third implementation. We will use a common trick with an option value.
let id x = x

let read_int ic =
  try Some (Scanf.fscanf ic "%u " id) with End_of_file -> None

let rec fold_channel f init ic =
  match read_int ic with
  | Some s -> fold_channel f (f s init) ic
  | None -> init

let () =
  let max_value = open_in "atad" |> fold_channel max 0 in
  Printf.printf "max value is %u\n" max_value
So, with a new tail-recursive implementation, let's try them all on big data. 100_000_000 numbers is big data for my 7-year-old laptop. I've also added a C implementation as a baseline, and an OCaml clone of the C implementation:
let () =
  let m = ref 0 in
  let ic = open_in "atad" in
  try
    while true do
      let n = Scanf.fscanf ic "%d " (fun x -> x) in
      m := max n !m
    done
  with End_of_file ->
    Printf.printf "max value is %u\n" !m;
    close_in ic
Update 2
Yet another implementation, this one using ocamllex. It consists of two files: a lexer specification, lex_int.mll,
{}
let digit = ['0'-'9']
let space = [' ' '\t' '\n']*
rule next = parse
| eof {None}
| space {next lexbuf}
| digit+ as n {Some (int_of_string n)}
{}
And the implementation:
let rec fold_channel f init buf =
  match Lex_int.next buf with
  | Some s -> fold_channel f (f s init) buf
  | None -> init

let () =
  let max_value = open_in "atad" |>
                  Lexing.from_channel |>
                  fold_channel max 0 in
  Printf.printf "max value is %u\n" max_value
And here are the results:
implementation    time     ratio   rate (MB/s)
plain C            22 s    1.0     12.5
ocamllex           33 s    1.5     8.4
Core               62 s    2.8     4.5
C-like OCaml       83 s    3.7     3.3
fold_channel       84 s    3.8     3.3
Lwt               143 s    6.5     1.9
P.S. You can see that in this particular case Lwt is an outlier. This doesn't mean that Lwt is slow; it's just not its granularity. And I would like to assure you that, in my experience, Lwt is a well-suited tool for HPC. For example, in one of my programs it processes a 30 MB/s network stream in real time.
Update 3
By the way, I've tried to address the problem in an abstract way, so I didn't provide a solution for your particular example (with i and j). Since folding is a generalization of iteration, this can easily be handled by extending the state (the init parameter) to hold a counter and checking whether it lies within the range specified by the user. But this leads to an interesting question: what do you do once you have outrun the range? Of course, you can continue to the end, just ignoring the output. Or you can exit non-locally from the function with an exception, something like raise (Done m). The Core library provides such a facility with the with_return function, which allows you to break out of your computation at any point.
open Core.Std

let () =
  let filename = "data" in
  let b1, b2 = Int.(of_string Sys.argv.(1), of_string Sys.argv.(2)) in
  let range = Interval.Int.create b1 b2 in
  let _, max_number =
    let open In_channel in
    with_return begin fun call ->
      with_file filename
        ~f:(fold_lines ~init:(0,0)
              ~f:(fun (i,m) s ->
                  match Interval.Int.compare_value range i with
                  | `Below -> i+1, m
                  | `Within -> i+1, Int.(max m @@ of_string s)
                  | `Above -> call.return (i,m)
                  | `Interval_is_empty -> failwith "empty interval"))
    end in
  printf "Max number in %s is %d\n" filename max_number
You may use the Scanf module family of functions. For instance, Scanf.fscanf lets you read tokens from a channel according to a string format (which is a special type in OCaml).
Your program can be decomposed into two functions:
one which skips a number i of tokens from the input channel,
one which extracts the maximum of the next j integers from a channel.
Let's write these:
let rec skip_tokens c i =
  match i with
  | i when i > 0 -> Scanf.fscanf c "%s " (fun _ -> skip_tokens c @@ pred i)
  | _ -> ()

let rec get_max c j m =
  match j with
  | j when j > 0 -> Scanf.fscanf c "%d " (fun x -> max m x |> get_max c (pred j))
  | _ -> m
Note the space after the token format specifier in the string, which tells the scanner to also swallow all the spaces and carriage returns between tokens.
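A tiny illustration of that behaviour (my own example, not from the answer):
let () =
  (* The space after the first %d makes the scanner skip the newline and
     the extra blanks before the 2, so this prints "1 2". *)
  Scanf.sscanf "1 \n  2" "%d %d" (fun a b -> Printf.printf "%d %d\n" a b)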
All you need to do now is to combine them. Here's a small program you can run from the CLI which takes the i and j parameters, expects a stream of tokens, and prints out the maximum value, as wanted:
let _ =
  let i = int_of_string Sys.argv.(1)
  and j = int_of_string Sys.argv.(2) in
  skip_tokens stdin (pred i);
  get_max stdin j min_int |> print_int;
  print_newline ()
You could probably write more flexible combinators by extracting the recursive part out. I'll leave this as an exercise for the reader.
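One possible shape for such a combinator, as a sketch (the names fold_n, read_token and read_int are mine, not from the answer): factor the recursion into a generic fold over the next n tokens, and recover skip_tokens and get_max as instances of it.
let rec fold_n read f acc c n =
  if n <= 0 then acc
  else
    let x = read c in
    fold_n read f (f acc x) c (n - 1)

let read_token c = Scanf.fscanf c "%s " (fun s -> s)
let read_int c = Scanf.fscanf c "%d " (fun x -> x)

(* The two functions above become one-liners. *)
let skip_tokens c i = fold_n read_token (fun () _ -> ()) () c i
let get_max c j m = fold_n read_int max m c j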

Linear types in OCaml

Rust has a linear type system. Is there any (good) way to simulate this in OCaml? E.g., when using ocaml-lua, I want to make sure some functions are called only when Lua is in a specific state (table on top of stack, etc).
Edit: Here's a recent paper about resource polymorphism relevant to the question: https://arxiv.org/abs/1803.02796
Edit 2: There are also a number of articles about session types in OCaml available, including syntax extensions to provide some syntactic sugar.
As suggested by John Rivers, you can use a monadic style to represent "effectful" computation in a way that hides the linear constraint in the effect API. Below is one example where a type ('a, 'st) t is used to represent a computation that uses a file handle (whose identity is implicit/unspoken to guarantee that it cannot be duplicated), produces a result of type 'a, and leaves the file handle in the state 'st (a phantom type that is either "open" or "close"). You have to use the run function of the monad¹ to actually do anything, and its type ensures that file handles are correctly closed after use.
module File : sig
  type ('a, 'st) t
  type open_st = Open
  type close_st = Close

  val bind : ('a, 's1) t -> ('a -> ('b, 's2) t) -> ('b, 's2) t

  val open_ : string -> (unit, open_st) t
  val read : (string, open_st) t
  val close : (unit, close_st) t

  val run : ('a, close_st) t -> 'a
end = struct
  type ('a, 'st) t = unit -> 'a
  type open_st = Open
  type close_st = Close

  let run m = m ()

  let bind m f = fun () ->
    let x = run m in
    run (f x)

  let close = fun () ->
    print_endline "[lib] close"

  let read = fun () ->
    let result = "toto" in
    print_endline ("[lib] read " ^ result);
    result

  let open_ path = fun () ->
    print_endline ("[lib] open " ^ path)
end
let test =
  let open File in
  let (>>=) = bind in
  run begin
    open_ "/tmp/foo" >>= fun () ->
    read >>= fun content ->
    print_endline ("[user] read " ^ content);
    close
  end

(* starting with OCaml 4.08, you can use binding operators:
   ( let* ) instead of ( >>= ) *)
let test =
  let open File in
  let ( let* ) = bind in
  run begin
    let* () = open_ "/tmp/foo" in
    let* content = read in
    print_endline ("[user] read " ^ content);
    close
  end
Of course, this is only meant to give you a taste of the style of API. For more serious uses, see Oleg's monadic regions examples.
You may also be interested in the research programming language Mezzo, which aims to be a variant of ML with finer-grained control of state (and related effectful patterns) through a linear typing discipline with separated resources. Note that it is only a research experiment for now, not actually aimed at users. ATS is also relevant, though ultimately less ML-like. Rust may actually be a reasonable "practical" counterpart to these experiments.
¹: it is actually not a monad because it has no return/unit combinator, but the point is to force type-controlled sequencing as the monadic bind operator does. It could have a map, though.
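For what it is worth, the map mentioned in the footnote is easy to add (a sketch of mine, to be placed inside the File struct, with val map : ('a -> 'b) -> ('a, 'st) t -> ('b, 'st) t added to the signature):
(* Transform the result of a computation without changing the handle state. *)
let map f m = fun () -> f (run m)

(* For example, map String.length read has type (int, open_st) t. *)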