I'd like to interrupt a call if it takes too long to compute, like this
try
do_something ()
with Too_long -> something_else ()
Is it possible to do something like that in OCaml? The function do_something may not be modified.
In general, the only way to interrupt a function is to use a signal, as Basile suggested. Unfortunately, control is then transferred to a signal handler, so you cannot simply return a value of your choosing. For finer-grained control, you can run do_something in a separate thread. A first approximation would be the following function:
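exception Timeout

let with_timeout timeout f =
  let result = ref None in
  let expired = ref false in
  let finished = Condition.create () in
  let guard = Mutex.create () in
  (* Update the shared state and wake the waiter, all under the mutex. *)
  let notify update =
    Mutex.lock guard;
    update ();
    Condition.signal finished;
    Mutex.unlock guard in
  let work () =
    let x = f () in
    notify (fun () -> result := Some x) in
  let delay () =
    Thread.delay timeout;
    notify (fun () -> expired := true) in
  let _task = Thread.create work () in
  let _timer = Thread.create delay () in
  Mutex.lock guard;
  (* Loop to guard against spurious wakeups. *)
  while Option.is_none !result && not !expired do
    Condition.wait finished guard
  done;
  let outcome = !result in
  Mutex.unlock guard;
  (* Thread.kill is documented as not implemented for system threads,
     so a timed-out worker is abandoned here rather than killed. *)
  match outcome with
  | None -> raise Timeout
  | Some x -> x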
Both the thread-based and the signal-based solutions have drawbacks. For example, OCaml switches threads only at specific interruption points, which in general are allocations. So if your code performs no allocations or external calls, it may never yield to another thread and will run forever. A good example of such a function is let rec f () = f (). If this is your case, you should run your function in another process instead of a thread. There are many libraries for multiprocessing in OCaml, to name a few:
parmap
forkwork
async-parallel
lwt-parallel
There is no built-in facility to perform this precise operation in the standard library, but it is rather straightforward to implement. Using the Thread module, run one thread to perform your main program and a monitoring thread that will kill the program if it lasts too long. Here is a starting implementation:
type 'a state =
| Running
| Finished of 'a
| Failed of exn
| Cancelled of 'a
let bounded_run d f g x =
let state = ref Running in
let p = ref None in
let m = ref None in
let cancel t' = match !t' with
| Some(t) -> Thread.kill t
| None -> ()
in
let program () =
(try state := Finished(f x)
with exn -> state := Failed (exn));
cancel m;
in
let monitor () =
Thread.delay d;
match !state with
| Running -> cancel p; state := Cancelled(g x)
| _ -> ()
in
p := Some(Thread.create program ());
m := Some(Thread.create monitor ());
(match !m with
| None -> ()
| Some(t) -> Thread.join t);
!state
The call bounded_run d f g x runs f x for at most d seconds and returns Finished(f x) if the computation completes within the given time. It returns Failed(exn) if the computation raises an exception. When the computation lasts too long, the returned value is Cancelled(g x).
This implementation has many shortcomings. For instance, the state and the returned value should have different types (the value Running should not be possible in the returned type), and it does not use mutexes to prevent concurrent accesses to the p and m variables holding references to the threads we use. It also relies on Thread.kill, which is documented as not implemented for system threads (it raises Invalid_argument) and is absent from recent OCaml releases. While it is rough around the edges, this should get you started; for more advanced usage, you should also learn Event or third-party libraries such as Lwt or Async, both of which will require you to change your function.
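One way to address the first shortcoming is to split the type in two, so that Running cannot appear in a returned value. The following is only a sketch: the names outcome, Done and outcome_of_state are hypothetical and do not come from the code above.

```ocaml
(* What the caller sees: Running is not representable here. *)
type 'a outcome =
  | Finished of 'a
  | Failed of exn
  | Cancelled of 'a

(* What the worker and the monitor share while the computation is in flight. *)
type 'a state =
  | Running
  | Done of 'a outcome

(* Hypothetical helper: turn the final shared state into an outcome,
   using [fallback] if the worker never reported anything. *)
let outcome_of_state ~fallback = function
  | Running -> Cancelled fallback
  | Done outcome -> outcome
```

With this split, bounded_run could keep a 'a state ref internally and return a 'a outcome, so callers never have to handle an impossible Running case.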
(I guess that you are on Linux)
Read more about signal(7). You could use OCaml's Sys.signal with Sys.sigalrm, together with the Unix module (notably Unix.setitimer).
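A minimal sketch of that suggestion, assuming nothing else in the program uses SIGALRM and that the unix library is linked. The name with_alarm and the choice to return an option are assumptions made for illustration. Note the small race left open: if the timer fires after f returns but before the timer is disarmed, Timeout escapes.

```ocaml
exception Timeout

(* Run [f ()]; return [Some result], or [None] if it takes longer than
   [seconds]. Assumes SIGALRM is not used elsewhere in the program. *)
let with_alarm seconds f =
  let handler = Sys.Signal_handle (fun _ -> raise Timeout) in
  let previous = Sys.signal Sys.sigalrm handler in
  let set_timer value =
    ignore (Unix.setitimer Unix.ITIMER_REAL
              { Unix.it_interval = 0.; it_value = value }) in
  let restore () =
    set_timer 0.;                        (* a zero value disarms the timer *)
    Sys.set_signal Sys.sigalrm previous in
  set_timer seconds;
  match f () with
  | x -> restore (); Some x
  | exception Timeout -> restore (); None
  | exception e -> restore (); raise e
```

The handler only runs when the OCaml runtime polls for signals, so a loop that never allocates may not be interrupted, depending on the compiler version and backend.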
Related
I'm writing code to monitor the content of a file. When the program reaches the end of the file I want it to terminate cleanly.
let log () : input_channel Lwt.t =
openfile "log" [O_RDONLY] 0 >>= fun fd ->
Lwt.return (of_fd input fd);;
let rec loop (ic: input_channel) = Lwt_io.read_line ic >>= fun text ->
Lwt_io.printl text >>= fun _ -> loop ic;;
let monitor () : unit Lwt.t = log () >>= loop;;
let handler : exn -> unit Lwt.t = fun e -> match e with
| End_of_file -> let (p: unit Lwt.t), r = Lwt.wait() in p
| x -> Lwt.fail x;;
let main () : unit Lwt.t = Lwt.catch monitor handler;;
let _ = Lwt_main.run (main ());;
However, when the program reaches the end of the file it does not terminate; it just hangs and I have to escape with Ctrl+C. I am not sure what is going on under the hood with bind, but I figured that, whatever it is doing, Lwt_io.read_line ic should eventually hit the end of the file and fail with End_of_file, which presumably would get passed to the handler, etc.
If I had to guess at a resolution, I would think maybe I should include some if check in the last bind. But I would then be checking, I think, whether Lwt_io.read_line returned End_of_file, which I thought should be handled by the handler.
The Lwt.wait function creates a promise that can only be resolved through the second element of the returned pair (the resolver). If the resolver is discarded, the promise stays pending forever, so the promise returned by this function never resolves:
let never_ready () =
  let (p,_) = Lwt.wait () in
  p
and this is exactly what you've written.
Concerning graceful termination: ideally, you should handle the end of the file in the loop function, so that you can close the channel and avoid leaking resources, e.g.,
let rec loop (ic: input_channel) =
  (* read_line_opt returns None at end of file instead of failing *)
  Lwt_io.read_line_opt ic >>= function
  | None ->
    Lwt_io.close ic
  | Some text ->
    Lwt_io.printl text >>= fun () ->
    loop ic
The minimum change to your code would be, however, to use Lwt.return () instead of Lwt.wait in the body of your handler.
I want to write a function that pauses for less than one second.
So I wrote this function:
let pause(n:float)=
Unix.select [] [] [] n
;;
And I use it like this :
ignore(pause(0.1));
And the toplevel gives me this error:
Exception: Unix.Unix_error (Unix.EINTR, "select", "").
What should I do ?
P.S. I have also tried with Thread.delay but I have same error.
This means a signal was sent to your program during the call to select.
Assuming you are using OCaml 4.03 or higher, you can use the Unix.sleepf function which does exactly what you expect:
# Unix.sleepf;;
- : float -> unit = <fun>
On older versions of OCaml, you can use the setitimer function, which will require some signal handling:
exception Alarm
let () = Sys.set_signal Sys.sigalrm ( Sys.Signal_handle (fun _ -> raise Alarm) )
let pause f =
let _ = Unix.setitimer Unix.ITIMER_REAL
{ Unix.it_interval = 0.; Unix.it_value = f; } in
try Unix.sleep (int_of_float (ceil f)) with
| Alarm -> ()
Note that this is not really thread safe and would conflict with other uses of SIGALRM. I strongly suggest you switch to the most recent version of OCaml.
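If neither option is available, another possibility is to keep Unix.select and simply resume the wait when it is interrupted. This is a sketch that assumes the unix library is linked; it replaces the one-line pause from the question and does not touch SIGALRM.

```ocaml
(* Sleep for [seconds], resuming the wait if a signal interrupts select. *)
let pause seconds =
  let deadline = Unix.gettimeofday () +. seconds in
  let rec wait () =
    let remaining = deadline -. Unix.gettimeofday () in
    if remaining > 0. then begin
      (* EINTR only means the call was cut short; recompute and retry. *)
      (try ignore (Unix.select [] [] [] remaining)
       with Unix.Unix_error (Unix.EINTR, _, _) -> ());
      wait ()
    end
  in
  wait ()
```

Tracking an absolute deadline, rather than restarting with the full duration, keeps the total pause close to the requested time even when several signals arrive.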
I found a topic in the Racket group about the performance of channel creation.
I want to write an OCaml version to test it.
let post (c,x) = Event.sync (Event.send c x);;
let accept c = Event.sync (Event.receive c);;
let get_chan c = let n = accept c in print_int n;print_newline ();;
let chan_trans (old_chan, new_chan) =
let s = accept old_chan in
post (new_chan,(s+1));;
let rec whisper count init_val =
let rec aux n chan =
if n >= count then chan
else
let new_chan = Event.new_channel ()
in Thread.create chan_trans (chan, new_chan);
aux (n+1) new_chan
in let leftest_chan = Event.new_channel ()
in let t0 = Thread.create post (leftest_chan, init_val)
in let rightest_chan = aux 0 leftest_chan
in get_chan rightest_chan;;
whisper 10000 1;;
The question is: when I tested whisper 1000 1, it produced 1001 as expected. However, when I tried whisper 10000 1, it failed with this error:
Fatal error: exception Sys_error("Thread.create: Resource temporarily unavailable")
I used this command to compile and run
ocamlc -thread unix.cma threads.cma -o prog whisper.ml&&./prog -I +threads
OCaml Thread module uses the real system (kernel) threads. The total number of threads is bounded by the kernel:
cat /proc/sys/kernel/threads-max
251422
You can increase this of course,
echo 100000 > /proc/sys/kernel/threads-max
but a better approach would be to treat threads as a resource and manage them correspondingly.
let rec whisper count init_val =
let rec aux n t chan =
if n >= count then chan
else
let new_chan = Event.new_channel () in
let t' = Thread.create chan_trans (chan, new_chan) in
Thread.join t;
aux (n+1) t' new_chan in
let leftest_chan = Event.new_channel () in
let t = Thread.create post (leftest_chan, init_val) in
let rightest_chan = aux 0 t leftest_chan in
get_chan rightest_chan
In that case it will run with any size of the pipeline. For example:
$ ocamlbuild -use-ocamlfind -tag thread -pkg threads ev.native
$ time ./ev.native
100001
real 0m1.581s
But this implementation of Chinese Whispers is very crude and inefficient. You shouldn't use heavyweight native threads for this (and Go does not use them either). Instead, you should use cooperative lightweight threads from the Lwt or Async libraries. This would be much more efficient and much nicer.
Implementation with Lwt
This implementation closely follows the Go implementation from the blog post, but I think that we can do this more efficiently and concisely in OCaml without using mailboxes (though I'm not sure whether it would conform to the rules of the benchmark).
open Lwt.Infix
let whispers n =
let rec whisper i p =
if i < n then
Lwt_mvar.take p >>= fun x ->
whisper (i+1) (Lwt_mvar.create (x+1))
else Lwt_mvar.take p in
whisper 0 (Lwt_mvar.create 1)
let () = print_int @@ Lwt_main.run (whispers 100000)
The results are:
$ ocamlbuild -use-ocamlfind -tag thread -pkg lwt.unix lev.native --
$ time ./lev.native
100001
real 0m0.007s
To compare with the Go implementation on my machine:
$ go build whispers.go
$ time ./whispers
100001
real 0m0.952s
"Slow" implementation
The code above is a completely honest reimplementation of the original Go version. But one of the reasons why it is so fast is that OCaml and Lwt are very clever: although the code creates 100_000 threads and 100_001 channels, no thread is ever suspended, since every time whisper is called the channel already contains data, so the thread is in a ready state. As a result, this is just an efficient loop that creates threads and channels. It can create a million threads in 50 ms.
So this is an idiomatic and correct way of doing things. But let's, for the sake of a true comparison, mimic Go's behavior. The following implementation first eagerly creates 100_001 channels on the heap, and 100_000 threads, each waiting to transfer data from its left channel to its right channel. Only afterwards does it put a value into the leftmost channel to provoke a chain reaction. This basically mimics what happens in Go under the hood.
let whispers n =
let rec loop i p =
if i < n then
let p' = Lwt_mvar.create_empty () in
let _t =
Lwt_mvar.take p >>= fun x ->
Lwt_mvar.put p' (x+1) in
loop (i+1) p'
else Lwt_mvar.take p in
let p0 = Lwt_mvar.create_empty () in
let t = loop 0 p0 in
Lwt_mvar.put p0 1 >>= fun () -> t
$ time ./lev.native
100001
real 0m0.111s
So it is slower: in fact it is 20 times slower than the previous implementation (I used 1 million threads to compare them), but it is still 10 times faster than the Go version.
Reading the linked post it seems you might want to use lwt which is a "cooperative threads library for OCaml". The result would look something like this:
let whisper left right =
let%lwt n = Lwt_mvar.take right in
Lwt_mvar.put left (n+1)
let main () =
let n = 100_000 in
let%lwt () = Lwt_io.printf "With %d mvars!\n" n in
let leftmost = Lwt_mvar.create_empty () in
let rec setup_whispers left i =
if i >= n
then left
else let right = Lwt_mvar.create_empty () in
let () = Lwt.async (fun () -> whisper left right) in
setup_whispers right (i+1) in
let rightmost = setup_whispers leftmost 0 in
let%lwt () = Lwt_mvar.put rightmost 1 in
let%lwt res = Lwt_mvar.take leftmost in
Lwt_io.printf "%d\n" res
let () = Lwt_main.run (main ())
And then compiling and running it
$ ocamlbuild -use-ocamlfind -pkg lwt,lwt.ppx,lwt.unix whisper.native
$ time ./whisper.native
With 100000 mvars!
100001
real 0m0.169s
user 0m0.156s
sys 0m0.008s
I'm trying to write a function that tries to evaluate a function, but stops after a specific timeout.
I tried to use Deferred.any, which returns a deferred that is fulfilled when one of the underlying deferreds is fulfilled.
type 'a output = OK of 'a | Exn of exn
let fun_test msg f eq (inp,ans) =
let outp = wait_for (Deferred.any
[ return (try OK (f inp) with e -> Exn e)
; (after (Core.Std.sec 0.0) >>| (fun () -> Exn TIMEOUT))])
in {msg = msg;inp = inp;outp = outp;ans = ans;pass = eq outp ans}
I was not sure how to extract a value from the deferred monad, so I wrote a function 'wait_for' which just spins until the underlying value is determined.
let rec wait_for x =
match Deferred.peek x with
| None -> wait_for x
| Some v -> v;;
This did not work. After reading through the Async chapter of Real World OCaml, I realized I needed to start the scheduler. However, I'm not sure where I would call Scheduler.go in my code. I do not see where the type go : ?raise_unhandled_exn:bool -> unit -> Core.Std.never_returns would fit into code where you actually want your asynchronous code to return. The documentation for go says "Async programs do not exit until shutdown is called."
I was beginning to suspect I had taken entirely the wrong approach to the problem until I found a very similar solution to the same problem on this Cornell website:
let timeout (thunk:unit -> 'a Deferred.t) (n:float) : ('a option) Deferred.t
= Deferred.any
[ after (sec n) >>| (fun () -> None) ;
thunk () >>| (fun x -> Some x) ]
Anyway, I'm not quite sure my use of wait_for is correct. Is there a canonical way to extract a value from the deferred monad? Also how do I start the scheduler?
Update:
I tried writing a timeout function using only Core.Std.Thread and Core.Std.Mutex.
let rec wait_for lck ptr =
Core.Std.Thread.delay 0.25;
Core.Std.Mutex.lock lck;
(match !ptr with
| None -> Core.Std.Mutex.unlock lck; wait_for lck ptr
| Some x -> Core.Std.Mutex.unlock lck; x);;
let timeout t f =
let lck = Core.Std.Mutex.create () in
let ptr = ref None in
let _ = Core.Std.Thread.create
(fun () -> Core.Std.Thread.delay t;
Core.Std.Mutex.lock lck;
(match !ptr with
| None -> ptr := Some (Exn TIMEOUT)
| Some _ -> ());
Core.Std.Mutex.unlock lck;) () in
let _ = Core.Std.Thread.create
(fun () -> let x = f () in
Core.Std.Mutex.lock lck;
(match !ptr with
| None -> ptr := Some x
| Some _ -> ());
Core.Std.Mutex.unlock lck;) () in
wait_for lck ptr
I think this is pretty close to working. It works on computations like let rec loop x = print_string ".\n"; loop x, but it does not work on computations like let rec loop x = loop x. I believe the problem right now is that if the computation f () loops infinitely, then its thread is never preempted, so none of the other threads can notice that the timeout has expired. If the thread does IO, like printing a string, then the thread does get preempted. Also, I don't know how to kill a thread; I couldn't find such a function in the documentation for Core.Std.Thread.
The solution I came up with is
let kill pid sign =
try Unix.kill pid sign with
| Unix.Unix_error (e,f,p) -> debug_print ((Unix.error_message e)^"|"^f^"|"^p)
| e -> raise e;;
let timeout f arg time default =
let pipe_r,pipe_w = Unix.pipe () in
(match Unix.fork () with
| 0 -> let x = Some (f arg) in
let oc = Unix.out_channel_of_descr pipe_w in
Marshal.to_channel oc x [];
close_out oc;
exit 0
| pid0 ->
(match Unix.fork () with
| 0 -> Unix.sleep time;
kill pid0 Sys.sigkill;
let oc = Unix.out_channel_of_descr pipe_w in
Marshal.to_channel oc default [];
close_out oc;
exit 0
| pid1 -> let ic = Unix.in_channel_of_descr pipe_r in
let result = (Marshal.from_channel ic : 'b option) in
result ));;
I think I might be creating two zombie processes with this though. But it is the only solution that works on let rec loop x = loop x when compiled using ocamlopt (The solution using Unix.alarm given here works when compiled with ocamlc but not when compiled with ocamlopt).
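The zombie concern can be avoided with a single child process: the parent waits on the pipe with Unix.select, kills the child on timeout, and always reaps it with Unix.waitpid. The following is a sketch under stated assumptions: timeout_fork is a hypothetical name, the unix library is linked, the result can be marshalled (no closures), and a child that fails is reported as None just like a timeout.

```ocaml
(* Run [f arg] in a child process for at most [seconds].
   Returns [None] on timeout or if the child fails to produce a value. *)
let timeout_fork (type b) seconds (f : _ -> b) arg : b option =
  let r, w = Unix.pipe () in
  match Unix.fork () with
  | 0 ->
      (* Child: compute, marshal the result into the pipe, then exit. *)
      Unix.close r;
      (try
         let oc = Unix.out_channel_of_descr w in
         Marshal.to_channel oc (f arg) [];
         close_out oc
       with _ -> ());
      exit 0
  | pid ->
      (* Parent: wait until the pipe is readable or the timeout expires. *)
      Unix.close w;
      let result =
        match Unix.select [r] [] [] seconds with
        | [], _, _ ->
            Unix.kill pid Sys.sigkill;
            None
        | _ ->
            let ic = Unix.in_channel_of_descr r in
            (try Some (Marshal.from_channel ic : b)
             with End_of_file -> None)
      in
      Unix.close r;
      ignore (Unix.waitpid [] pid);      (* reap the child: no zombie left *)
      result
```

Because the child is a separate process, SIGKILL stops it even when it runs a loop that never allocates, which is the case the thread-based and signal-based versions cannot handle.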
I was trying to find an example about how to use TryScan, but haven't found any, could you help me?
What I would like to do (quite simplified example): I have a MailboxProcessor that accepts two types of messages.
First one GetState returns current state.
GetState messages are sent quite frequently
The other UpdateState is very expensive (time consuming) - e.g. downloading something from internet and then updates the state accordingly.
UpdateState is called only rarely.
My problem is - messages GetState are blocked and wait until preceding UpdateState are served. That's why I tried to use TryScan to process all GetState messages, but with no luck.
My example code:
type Msg = GetState of AsyncReplyChannel<int> | UpdateState
let mbox = MailboxProcessor.Start(fun mbox ->
    let rec loop state = async {
        // this TryScan doesn't work as expected
        // it should process GetState messages and then continue
        mbox.TryScan(fun m ->
            match m with
            | GetState(chnl) ->
                printfn "G processing TryScan"
                chnl.Reply(state)
                Some(async { return! loop state})
            | _ -> None
        ) |> ignore
        let! msg = mbox.Receive()
        match msg with
        | UpdateState ->
            printfn "U processing"
            // something very time consuming here...
            async { do! Async.Sleep(1000) } |> Async.RunSynchronously
            return! loop (state+1)
        | GetState(chnl) ->
            printfn "G processing"
            chnl.Reply(state)
            return! loop state
    }
    loop 0
)
[async { for i in 1..10 do
            printfn " U"
            mbox.Post(UpdateState)
            async { do! Async.Sleep(200) } |> Async.RunSynchronously
 };
 async { // wait some time so that several `UpdateState` messages are fired
         async { do! Async.Sleep(500) } |> Async.RunSynchronously
         for i in 1..20 do
            printfn "G"
            printfn "%d" (mbox.PostAndReply(GetState))
 }] |> Async.Parallel |> Async.RunSynchronously
If you try to run the code, you will see that a GetState message is almost never processed promptly, because its sender waits for the result. UpdateState, on the other hand, is fire-and-forget, so queued updates effectively block getting the state.
Edit
Current solution that works for me is this one:
type Msg = GetState of AsyncReplyChannel<int> | UpdateState
let mbox = MailboxProcessor.Start(fun mbox ->
    let rec loop state = async {
        // this TryScan doesn't work as expected
        // it should process GetState messages and then continue
        let! res = mbox.TryScan((function
            | GetState(chnl) -> Some(async {
                chnl.Reply(state)
                return state
              })
            | _ -> None
        ), 5)
        match res with
        | None ->
            let! msg = mbox.Receive()
            match msg with
            | UpdateState ->
                async { do! Async.Sleep(1000) } |> Async.RunSynchronously
                return! loop (state+1)
            | _ -> return! loop state
        | Some n -> return! loop n
    }
    loop 0
)
Reactions to comments: the idea with other MailboxProcessor or ThreadPool that executes UpdateState in parallel is great, but I don't need it currently.
All I wanted to do is to process all GetState messages and after that the others. I don't care that during processing UpdateState the agent is blocked.
I'll show you what was the problem on the output:
// GetState messages are delayed 500 ms - see do! Async.Sleep(500)
// each UpdateState is sent after 200ms
// each GetState is sent immediately! (not real example, but illustrates the problem)
U 200ms <-- issue UpdateState
U processing <-- process UpdateState, it takes 1sec, so other
U 200ms 5 requests are sent; sent means, that it is
U 200ms fire-and-forget message - it doesn't wait for any result
and therefore it can send every 200ms one UpdateState message
G <-- first GetState sent, but waiting for reply - so all
previous UpdateState messages have to be processed! = 3 seconds
and AFTER all the UpdateState messages are processed, result
is returned and new GetState can be sent.
U 200ms
U 200ms because each UpdateState takes 1 second
U 200ms
U processing
U
U
U
U
U processing
G processing <-- now first GetState is processed! so late? uh..
U processing <-- takes 1sec
3
G
U processing <-- takes 1sec
U processing <-- takes 1sec
U processing <-- takes 1sec
U processing <-- takes 1sec
U processing <-- takes 1sec
U processing <-- takes 1sec
G processing <-- after MANY seconds, second GetState is processed!
10
G
G processing
// from this line, only GetState are issued and processed, because
// there is no UpdateState message in the queue, neither it is sent
I don't think that the TryScan method will help you in this scenario. It allows you to specify timeout to be used while waiting for messages. Once some message is received, it will start processing the message (ignoring the timeout).
For example, if you wanted to wait for some specific message, but perform some other checking every second (while waiting) you could write:
let rec loop () = async {
    // wait up to one second (1000 ms) for an ImportantMessage
    let! res = mbox.TryScan((function
        | ImportantMessage -> Some(async {
            // process message
            return 0
          })
        | _ -> None), 1000)
    match res with
    | None ->
        // perform some check & continue waiting
        return! loop ()
    | Some n ->
        // ImportantMessage was received and processed
        return ()
}
What can you do to avoid blocking the mailbox processor when processing the UpdateState message? The mailbox processor is (logically) single-threaded - you probably don't want to cancel the processing of UpdateState message, so the best option is to start processing it in background and wait until the processing completes. The code that processes UpdateState can then send some message back to the mailbox (e.g. UpdateStateCompleted).
Here is a sketch how this might look:
let rec loop (state) = async {
    let! msg = mbox.Receive()
    match msg with
    | GetState(repl) ->
        repl.Reply(state)
        return! loop state
    | UpdateState ->
        async {
            // complex calculation (runs in parallel)
            let newState = state + 1 // placeholder for the real computation
            mbox.Post(UpdateStateCompleted newState) }
        |> Async.Start
        // keep serving messages while the background work runs
        return! loop state
    | UpdateStateCompleted newState ->
        // Received new state from background workflow
        return! loop newState }
Now that the background task is running in parallel, you need to be careful about mutable state. Also, if you send UpdateState messages faster than you can process them, you'll be in trouble. This can be fixed, for example, by ignoring or queueing requests when you're already processing previous one.
DON'T USE TRYSCAN!!!
Unfortunately, the TryScan function in the current version of F# is broken in two ways. Firstly, the whole point is to specify a timeout but the implementation does not actually honor it. Specifically, irrelevant messages reset the timer. Secondly, as with the other Scan function, the message queue is examined under a lock that prevents any other threads from posting for the duration of the scan, which can be an arbitrarily long time. Consequently, the TryScan function itself tends to lock-up concurrent systems and can even introduce deadlocks because the caller's code is evaluated inside the lock (e.g. posting from the function argument to Scan or TryScan can deadlock the agent when the code under the lock blocks waiting to acquire the lock it is already under).
I used TryScan in an early prototype of my production code and it caused no end of problems. However, I managed to architect around it and the resulting architecture was actually better. In essence, I eagerly Receive all messages and filter using my own local queue.
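The local-queue idea can be sketched as follows. To stay in the language used by the rest of this page, the sketch is in OCaml rather than F#: the mailbox is modelled as a plain Queue.t, the reply channel as a callback, and the update as state + 1. All names here (msg, drain, step) are hypothetical.

```ocaml
type msg =
  | Get_state of (int -> unit)  (* reply callback, standing in for a reply channel *)
  | Update_state

(* Empty [mailbox]: answer every Get_state at once and park every
   Update_state in [pending], preserving arrival order. *)
let drain mailbox pending state =
  while not (Queue.is_empty mailbox) do
    match Queue.pop mailbox with
    | Get_state reply -> reply state
    | Update_state -> Queue.push Update_state pending
  done

(* One scheduling step: serve reads first, then run at most one update.
   The update is modelled as [state + 1], as in the question. *)
let step mailbox pending state =
  drain mailbox pending state;
  match Queue.take_opt pending with
  | Some Update_state -> state + 1
  | Some (Get_state _) | None -> state
```

Calling step in a loop gives reads priority over updates without scanning the shared mailbox: every message is received eagerly, and the filtering happens on a queue that only the agent itself touches.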
As Tomas mentioned MailboxProcessor is single threaded. You will need another MailboxProcessor to run the updates on a separate thread from the state getter.
#nowarn "40"
type Msg =
    | GetState of AsyncReplyChannel<int>
    | UpdateState
let runner_UpdateState = MailboxProcessor.Start(fun mbox ->
    let rec loop = async {
        let! state = mbox.Receive()
        printfn "U start processing %d" !state
        // something very time consuming here...
        do! Async.Sleep 100
        printfn "U done processing %d" !state
        state := !state + 1
        do! loop
    }
    loop
)
let mbox = MailboxProcessor.Start(fun mbox ->
    // we need a mutable state if another thread can change it at any time
    let state = ref 0
    let rec loop = async {
        let! msg = mbox.Receive()
        match msg with
        | UpdateState -> runner_UpdateState.Post state
        | GetState chnl -> chnl.Reply !state
        return! loop
    }
    loop)
[
    async {
        for i in 1..10 do
            mbox.Post UpdateState
            do! Async.Sleep 200
    };
    async {
        // wait some time so that several `UpdateState` messages are fired
        do! Async.Sleep 1000
        for i in 1..20 do
            printfn "G %d" (mbox.PostAndReply GetState)
            do! Async.Sleep 50
    }
]
|> Async.Parallel
|> Async.RunSynchronously
|> ignore
System.Console.ReadLine() |> ignore
output:
U start processing 0
U done processing 0
U start processing 1
U done processing 1
U start processing 2
U done processing 2
U start processing 3
U done processing 3
U start processing 4
U done processing 4
G 5
U start processing 5
G 5
U done processing 5
G 5
G 6
U start processing 6
G 6
G 6
U done processing 6
G 7
U start processing 7
G 7
G 7
U done processing 7
G 8
G U start processing 8
8
G 8
U done processing 8
G 9
G 9
U start processing 9
G 9
U done processing 9
G 9
G 10
G 10
G 10
G 10
You could also use ThreadPool.
open System.Threading
type Msg =
    | GetState of AsyncReplyChannel<int>
    | SetState of int
    | UpdateState
let mbox = MailboxProcessor.Start(fun mbox ->
    let rec loop state = async {
        let! msg = mbox.Receive()
        match msg with
        | UpdateState ->
            ThreadPool.QueueUserWorkItem((fun obj ->
                let state = obj :?> int
                printfn "U start processing %d" state
                Async.Sleep 100 |> Async.RunSynchronously
                printfn "U done processing %d" state
                mbox.Post(SetState(state + 1))
                ), state)
            |> ignore
        | GetState chnl ->
            chnl.Reply state
        | SetState newState ->
            return! loop newState
        return! loop state
    }
    loop 0)
[
    async {
        for i in 1..10 do
            mbox.Post UpdateState
            do! Async.Sleep 200
    };
    async {
        // wait some time so that several `UpdateState` messages are fired
        do! Async.Sleep 1000
        for i in 1..20 do
            printfn "G %d" (mbox.PostAndReply GetState)
            do! Async.Sleep 50
    }
]
|> Async.Parallel
|> Async.RunSynchronously
|> ignore
System.Console.ReadLine() |> ignore