Ocaml lwt read stdout from other process - ocaml

I'm trying to build a new frontend in OCaml for a terminal-based application. The main idea is to spawn a new process with Lwt:
let cmd = shell "./otherterminalapp" in
let p = open_process_full cmd in
And then later write stuff to the process' stdin, to execute commands in the external app.
Lwt_io.write_line p#stdin "some command" >>= (fun _ -> Lwt_io.flush p#stdin)
I then read the result of the command back in with Lwt_io.read_line_opt. How do I read until there aren't any lines left?
The problem I'm encountering is that my program just hangs at a certain point. When I read with read_line_opt and reach the end of the output, it seems to just wait for the process to produce new output.
How can I approach this?
A concrete example of what I'm trying to do:
(The terminal based application is ocamldebug)
Program source code:
open Lwt
open Lwt_unix
open Lwt_process
let () =
  let run () =
    let cmd = shell "ocamldebug test.d.byte" in
    let dbgr = open_process_full cmd in
    (((((((Lwt_io.write_line dbgr#stdin "info modules") >>=
            (fun _ -> Lwt_io.flush dbgr#stdin))
           >>= (fun _ -> Lwt_io.read_line_opt dbgr#stdout))
          >>=
          (fun s ->
             (match s with
              | Some l -> print_endline l
              | None -> print_endline "nothing here! ");
             Lwt_io.read_line_opt dbgr#stdout))
         >>=
         (fun s ->
            (match s with
             | Some l -> print_endline l
             | None -> print_endline "nothing here! ");
            Lwt_io.read_line_opt dbgr#stdout))
        >>=
        (fun s ->
           (match s with
            | Some l -> print_endline l
            | None -> print_endline "nothing here! ");
           Lwt_io.read_line_opt dbgr#stdout))
       >>=
       (fun s ->
          (match s with
           | Some l -> print_endline l
           | None -> print_endline "nothing here! ");
          Lwt_io.read_line_opt dbgr#stdout))
      >>=
      (fun s ->
         (match s with
          | Some l -> print_endline l
          | None -> print_endline "nothing here! ");
         Lwt.return ()) in
  Lwt_main.run (run ())
If you would normally run ocamldebug with test.d.byte, you get the
following in your terminal:
OCaml Debugger version 4.03.0
(ocd) info modules
Loading program... done.
Used modules:
Std_exit Test Pervasives CamlinternalFormatBasics
(ocd)
When I execute the above program, I get the following printed:
OCaml Debugger version 4.03.0
(ocd) Loading program... Used modules:
Std_exit Test Pervasives CamlinternalFormatBasics
And here it just hangs: my program doesn't exit. Even after I press
Ctrl-C twice in my terminal, there's still an active ocamlrun process, although the terminal becomes responsive again.
Am I missing something obvious here?

A call to Lwt_io.read_line_opt returns a deferred value that will be determined in the future as Some data once the channel reads a newline-terminated string, or as None if the channel was closed. The channel will be closed if there was an end-of-file condition. For regular files, the end-of-file condition occurs when the file pointer reaches the end of the file. For pipes, which are used to communicate with the subprocess, the end-of-file condition occurs when the opposite side closes the file descriptor associated with the pipe.
The ocamldebug program doesn't close its inputs or outputs. It is an interactive program that is ready to interact with a user indefinitely, or until the user closes the program, either by hitting Ctrl-D or by using the quit command.
In your scenario, you wrote the info modules command into the process's input. The process responded with three lines (where each line is a piece of data terminated by a newline). Then the subprocess started to wait for the next input. You're not seeing the final (ocd) prompt, because it is not terminated by a newline character. The program didn't hang. It is still waiting for output from the subprocess, and the subprocess is waiting for input from you (a deadlock).
If you really need to distinguish outputs from different commands, then you need to track the prompt in the subprocess output. Since the prompt is not terminated by the newline, you can't rely on the read_line* family of functions, since they are line buffered. You need to read all available characters and find the prompt in them manually.
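As a rough illustration of finding the prompt manually, here is a minimal sketch that uses only the standard library. It assumes the prompt is exactly "(ocd) " (taken from the transcript in the question) and that chunks of raw output arrive from something like Lwt_io.read ~count:n dbgr#stdout. The names prompt, ends_with and feed are hypothetical helpers, not part of Lwt.

```ocaml
(* Assumption: the debugger prompt, as seen in the transcript above. *)
let prompt = "(ocd) "

(* [ends_with ~suffix s] is true when [s] ends with [suffix]. *)
let ends_with ~suffix s =
  let ls = String.length s and lx = String.length suffix in
  ls >= lx && String.sub s (ls - lx) lx = suffix

(* Append a chunk of raw subprocess output to [buf]. Returns [Some reply]
   (everything received before the prompt) once the accumulated output
   ends with the prompt, and [None] while more output is still expected. *)
let feed buf chunk =
  Buffer.add_string buf chunk;
  let s = Buffer.contents buf in
  if ends_with ~suffix:prompt s then begin
    Buffer.clear buf;
    Some (String.sub s 0 (String.length s - String.length prompt))
  end else
    None
```

The caller would keep reading chunks and calling feed until it returns Some reply, and only then send the next command. Note that this treats any output ending in the prompt text as the end of a reply, which could misfire if a command happens to print "(ocd) " itself.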
On the other hand, if you do not really need to distinguish between the outputs of different commands, then you can ignore the prompt (you may even filter it out for nicer output). In that case, you will have two concurrent subroutines: one is responsible for feeding input, and the other reads all the output and dumps it, without caring about the contents of the data.

Related

Ocaml: Exception: End_of_file

I want to read a number, but when I try to compile it, I get Exception: End_of_file on the line with read_int ().
What am I doing wrong?
let k = read_int() ;;
let exercicio k=
Printf.printf "%d\n" k;
;;
If you want an online IDE which accepts input, check https://betterocaml.ml.
It is the best I have been able to find on the internet so far. If you want a local install on Windows, check also https://github.com/gmattis/SimpleOCaml/.
The only issue with BetterOCaml is that it doesn't handle infinite while loops, so be sure to save your file before running any program with a while loop inside.
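For context, read_int raises End_of_file when standard input has no data left, which is what happens in an environment that does not provide any input. A minimal sketch of a reader that returns an option instead of raising (the name read_int_from is hypothetical, and the channel is a parameter so it can be used with stdin or a file):

```ocaml
(* Read one line from [ic] and parse it as an integer.
   Returns [None] at end of input or when the line is not a number,
   instead of raising End_of_file or Failure. *)
let read_int_from ic =
  match input_line ic with
  | line -> int_of_string_opt (String.trim line)
  | exception End_of_file -> None
```

It could be used as match read_int_from stdin with Some k -> Printf.printf "%d\n" k | None -> prerr_endline "no input available".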

Why does a pipe to another process need to be closed plus set_close_on_exec to really close?

So, I was trying to use OCaml to communicate with a Python process. I wanted to pipe the Python program to the Python interpreter's stdin, and then read the Python program's output back in the OCaml process.
I was able to solve it like this:
let py_program = {|
import time
while True:
    print('hi from Python', flush=True)
    time.sleep(0.25)
|}
let exec_py_program () =
  let cmd = "", [|"python3"; "-"|] in
  let pipe_out_fd, pipe_out_fd_unix = Lwt_unix.pipe_out () in
  (* Close the 1st time *)
  let () = Lwt_unix.set_close_on_exec pipe_out_fd_unix in
  let redir = `FD_move pipe_out_fd in
  let py_stream = Lwt_process.pread_lines ~stdin:redir cmd in
  let%lwt n = Lwt_unix.write_string pipe_out_fd_unix py_program 0 (String.length py_program) in
  if n < String.length py_program then failwith "Failed to write python to pipe" else
  let rec read_back () =
    match%lwt Lwt_stream.get py_stream with
    | Some str ->
      let%lwt () = Lwt_io.printl @@ "Got: " ^ str in
      read_back ()
    | None -> Lwt.return ()
  in
  (* Close the 2nd time *)
  let%lwt () = Lwt_unix.close pipe_out_fd_unix in
  read_back ()
I use "set_close_on_exec" to close the file descriptor corresponding to the pipe mapped to the Python process's stdin near the comment "Close the 1st time", and I close the pipe again after sending over the Python program ("Close the 2nd time"). "set_close_on_exec" supposedly closes the file descriptor "when the process calls exec on another process".
If I leave either of these lines out, the Python process indefinitely keeps reading from its stdin and never begins executing, so "hi from Python" is never received. So my question is, why are these both necessary? It was mostly a guess on my part.
Starting a program on a POSIX operating system (like Linux) is done in two steps. First, the process launching the program is forked, which creates a copy of the running process. Then, the new process is replaced by the new program using a call to exec. When the process is forked both resulting processes inherit all open file descriptors. Hence, to actually close a file descriptor it must be closed in both processes.
Setting the close-on-exec flag causes the process to close the corresponding file descriptor as soon as exec is called. Hence, when you set this flag, only the old process has the open file descriptor after the program has started.
See also this question.

OCaml string length limitation when reading from stdin\file

As part of a Compiler Principles course I'm taking in my university, we're writing a compiler that's implemented in OCaml, which compiles Scheme code into CISC-like assembly (which is just C macros).
The basic operation of the compiler is as follows:
Read a *.scm file and convert it to an OCaml string.
Parse the string and perform various analyses.
Run a code generator on the AST output from the semantic analyzer, that outputs text into a *.c file.
Compile that file with GCC and run it in the terminal.
All is well, except for this: I'm trying to read an input file that's around 4000 lines long and is basically one huge expression that's a mix of Scheme if and and.
I'm executing the compiler via utop. When I try to read the input file, I immediately get a stack overflow error message. My initial guess is that the file is just too large for OCaml to handle, but I wasn't able to find any documentation that would support this theory.
Any suggestions?
The maximum string length is given by Sys.max_string_length. For a 32-bit system, it's quite short: 16777211. For a 64-bit system, it's 144115188075855863.
Unless you're using a 32-bit system, and your 4000-line file is over 16MB, I don't think you're hitting the string length limit.
A stack overflow is not what you'd expect to see when a string is too long.
It's more likely that you have infinite recursion, or possibly just a very deeply nested computation.
Well, it turns out that the limitation was the maximum stack size the OCaml runtime is configured to use (the l parameter of OCAMLRUNPARAM sets the stack limit, in words).
I ran the following command in the terminal in order to increase the quota:
export OCAMLRUNPARAM="l=5555555555"
This worked like a charm - I managed to read and compile the input file almost instantaneously.
For reference purposes, this is the code that reads the file:
let file_to_string input_file =
  let in_channel = open_in input_file in
  let rec run () =
    try
      let ch = input_char in_channel in ch :: (run ())
    with End_of_file ->
      ( close_in in_channel;
        [] )
  in list_to_string (run ());;
where list_to_string is:
let list_to_string s =
  let rec loop s n =
    match s with
    | [] -> String.make n '?'
    | car :: cdr ->
      let result = loop cdr (n + 1) in
      String.set result n car;
      result
  in
  loop s 0;;
Funny thing is, I also wrote file_to_string in tail-recursive form. This prevented the stack overflow, but for some reason it went into an infinite loop. Oh, well...
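For comparison, a minimal sketch of reading a whole file into a string without any recursion, so the stack depth does not depend on the file size. It assumes OCaml 4.08 or later (for Fun.protect) and a regular file whose length is known up front; the name file_to_string simply mirrors the function above and is not the original implementation.

```ocaml
(* Read the entire contents of [path] into a string in one call.
   No per-character recursion, so the stack usage is constant. *)
let file_to_string path =
  let ic = open_in_bin path in
  Fun.protect
    ~finally:(fun () -> close_in_noerr ic)
    (fun () -> really_input_string ic (in_channel_length ic))
```

Since both list_to_string and the original run recurse once per character, a file with several thousand lines nests that many stack frames per byte read, which is consistent with the stack overflow described above.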

How to trigger and handle a click event?

In the following code, I try to handle a click event on a checkbox. I expect to see the word "hello" printed in the javascript console, but instead I see nothing. How can I modify the code to get the print statement to execute?
let checkGroupByRounds = Dom_html.createInput ~_type:(Js.string "checkbox") doc in
Lwt_js_events.clicks checkGroupByRounds (fun event event_loop ->
  Lwt.return (Printf.printf "hello"));
Dom.appendChild container checkGroupByRounds;
You need to flush the standard output, either by ending the output with a newline (Printf.printf "hello\n") or with an explicit flush (flush stdout).

OCaml - Fatal error: exception Sys_error("Broken pipe") when using `| head` on output containing many lines

I have a text file with many lines. I want to write a simple OCaml program that will process this file line by line and maybe print each line.
For writing this program, I first created a smaller file, with fewer lines - so that program will finish executing faster.
$ wc -l input/master
214745 input/master
$ head -50 input/master > input/small-master
Here is the simple boilerplate filter.ml program I wrote:
open Core.Std;;
open Printf;;
open Core.In_channel;;
if Array.length Sys.argv >= 2 then begin
  let rec process_lines ?ix master_file =
    let ix = match ix with
      | None -> 0
      | Some x -> x
    in
    match input_line master_file with
    | Some line -> (
        if ix > 9 then printf "%d == %s\n" ix line;
        process_lines ~ix:(ix+1) master_file
      )
    | None -> close master_file
  in
  let master_file = create Sys.argv.(1) in
  process_lines master_file
end
It takes the input file's location as a command line argument, creates a file-handle for reading this file and calls the recursive function process_lines with this file-handle as an argument.
process_lines uses the optional argument ix to count the line numbers as it reads from the file-handle line by line. process_lines simply prints the line that was read from the file_handle to the standard output.
When I execute the program on the smaller input file and pipe the output to the Linux head command, everything works fine:
$ ./filter.native input/small-master |head -2
10 == 1000032|BINCH JAMES G|4|2012-11-13|edgar/data/1000032/0001181431-12-058269.txt
11 == 1000032|BINCH JAMES G|4|2012-12-03|edgar/data/1000032/0001181431-12-061825.txt
But when I execute the program on the larger file, I see a broken-pipe error:
$ ./filter.native input/master |head -2
10 == 1000032|BINCH JAMES G|4|2012-11-13|edgar/data/1000032/0001181431-12-058269.txt
11 == 1000032|BINCH JAMES G|4|2012-12-03|edgar/data/1000032/0001181431-12-061825.txt
Fatal error: exception Sys_error("Broken pipe")
Raised by primitive operation at file "pervasives.ml", line 264, characters 2-40
Called from file "printf.ml", line 615, characters 15-25
Called from file "find.ml", line 13, characters 21-48
Called from file "find.ml", line 19, characters 2-27
I learnt that such broken pipe errors occur when the reader of a pipe (the head command in this case) exits before the writer of the pipe (my OCaml program in this case) has finished writing. That is why I would never get such an error if I used the tail command as the reader.
However, why didn't the broken-pipe error occur when the file had fewer lines?
The broken pipe signal is a basic part of the Unix design. When you have a pipeline a | b where b reads only a small amount of data, you don't want a to waste its time writing after b has read all it needs. To make this happen, Unix sends the broken pipe signal to a process that writes to a pipe that nobody is reading. In the usual case, this causes the program to exit silently (i.e., it kills the program), which is just what you want.
In this hypothetical example, b exits after reading a few lines, which means nobody is reading the pipe. The next time a tries to write more output, it gets sent the broken pipe signal and exits.
In your case a is your program and b is head.
It appears that the OCaml runtime is noticing the signal and is not exiting silently. You could consider this a flaw, or maybe it's good to know whenever a signal has terminated your program. The best way to fix it would be to catch the signal yourself and exit silently.
The reason it doesn't happen for the small file is that the whole output fits into the pipe. (A pipe represents a buffer of 64K bytes or so.) Your program just writes its data and exits; there's not enough time for your program to try to write to a pipe with no reader.
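Since the failure in the question surfaces as a Sys_error exception rather than as a fatal signal, one variation on the fix above is to catch that exception around the write and stop quietly. This is a minimal sketch under that assumption, not the signal-handling approach itself; try_print_line and print_all are hypothetical names.

```ocaml
(* Write one line to [oc] and flush it. Returns [false] instead of
   raising when the channel can no longer be written to, for example
   on Sys_error "Broken pipe" after the reader has exited. *)
let try_print_line oc line =
  try
    output_string oc line;
    output_char oc '\n';
    flush oc;
    true
  with Sys_error _ -> false

(* Print lines in order and stop silently at the first failed write. *)
let rec print_all oc = function
  | [] -> ()
  | line :: rest -> if try_print_line oc line then print_all oc rest
```

Flushing after every line makes the failure show up at the write that caused it, at the cost of more system calls; a program producing a lot of output might prefer to flush less often and catch the exception at a coarser level.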