How to easily read lines from stdin? - ocaml

Some time ago, I decided to solve a simple task on HackerRank but using OCaml and Core, in order to learn them. In one of the tasks, I'm supposed to read data from standard input:
The first line contains an integer, denoting the number of entries
in the phone book. Each of the subsequent lines describes an entry in
the form of space-separated values on a single line. The first value
is a friend's name, and the second value is an -digit phone number.
After the lines of phone book entries, there are an unknown number of
lines of queries. Each line (query) contains a name to look up, and you
must continue reading lines until there is no more input.
The main issues:
I don't know how many lines there will be
The last line doesn't end with a newline, so I can't just read with scanf "%s\n" until End_of_file
And my code became messy:
open Core.Std
open Printf
open Scanf
let read_numbers n =
  let phone_book = String.Table.create () ~size:n in
  for i = 0 to (n - 1) do
    match In_channel.input_line stdin with
    | Some line -> (
        match (String.split line ~on:' ') with
        | key :: data :: _ -> Hashtbl.set phone_book ~key ~data
        | _ -> failwith "This shouldn't happen"
      )
    | None -> failwith "This shouldn't happen"
  done;
  phone_book
let () =
  let rec loop phone_book =
    match In_channel.input_line stdin with
    | Some line -> (
        let s = match Hashtbl.find phone_book line with
          | Some number -> sprintf "%s=%s" line number
          | None -> "Not found"
        in
        printf "%s\n%!" s;
        loop phone_book
      )
    | None -> ()
  in
  match In_channel.input_line stdin with
  | Some n -> (
      let phone_book = read_numbers (int_of_string n) in
      loop phone_book
    )
  | None -> failwith "This shouldn't happen"
If I solve this task in Python, then code looks like this:
n = int(input())
book = dict([tuple(input().split(' ')) for _ in range(n)])
while True:
    try:
        name = input()
    except EOFError:
        break
    else:
        if name in book:
            print('{}={}'.format(name, book[name]))
        else:
            print('Not found')
This is shorter and clearer than the OCaml code. Any advice on how to improve my OCaml code? Two important things: first, I don't want to abandon OCaml, I just want to learn it; second, I want to use Core for the same reason.

The direct implementation of the Python code in OCaml would look like this:
let exec name =
  In_channel.(with_file name ~f:input_lines) |> function
  | [] -> invalid_arg "Got empty file"
  | x :: xs ->
    let es,qs = List.split_n xs (Int.of_string x) in
    let es = List.map es ~f:(fun entry -> match String.split ~on:' ' entry with
        | [name; phone] -> name,phone
        | _ -> invalid_arg "bad entry format") in
    List.iter qs ~f:(fun name ->
        match List.Assoc.find es name with
        | None -> printf "Not found\n"
        | Some phone -> printf "%s=%s\n" name phone)
However, OCaml is not a scripting language for writing small scripts and one-shot prototypes. It is a language for writing real software that must be readable, supportable, testable, and maintainable. That's why we have types, modules, and all the rest. So if I were writing a production-quality program responsible for handling such input, it would look very different.
The general style that I personally employ when writing a program in a functional language is to follow these two simple rules:
When in doubt use more types.
Have fun (lots of fun).
I.e., allocate a type for each concept in the program domain, and use lots of small functions.
The following code is twice as big, but is more readable, maintainable, and robust.
So, first of all, let's type: the entry is simply a record. I used a string type to represent a phone for simplicity.
type entry = {
name : string;
phone : string;
}
The query is not specified in the task, so let's just stub it with a string:
type query = Q of string
Now our parser state. We have three possible states: the Start state, a state Entry n, where we're parsing entries with n entries left so far, and Query state, when we're parsing queries.
type state =
| Start
| Entry of int
| Query
Now we need to write a function for each state, but first of all, let's define an error handling policy. For a simple program, I would suggest just to fail on a parser error. We will call a function named expect when our expectations fail:
let expect what got =
  failwithf "Parser error: expected %s got %s\n" what got ()
Now the three parsing functions:
let parse_query s = Q s
(* parse_entry takes only the line: the parser below calls it with one argument *)
let parse_entry line = match String.split ~on:' ' line with
  | [name;phone] -> {name;phone}
  | _ -> expect "<name> <phone>" line
let parse_expected s =
  try int_of_string s with _ ->
    expect "<number-of-entries>" s
Now let's write the parser:
let parse (es,qs,state) input = match state with
  | Start -> es,qs,Entry (parse_expected input)
  (* once no entries are left, the current line is already the first query, so it must not be dropped *)
  | Entry 0 | Query -> es, parse_query input :: qs,Query
  | Entry n -> parse_entry input :: es,qs,Entry (n-1)
And finally, let's read data from file:
let of_file name =
  let es,qs,state =
    In_channel.with_file name ~f:(fun ch ->
        In_channel.fold_lines ch ~init:([],[],Start) ~f:parse) in
  match state with
  (* entries and queries were accumulated in reverse, so restore input order *)
  | Entry 0 | Query -> List.rev es, List.rev qs
  | Start -> expect "<number-of-entries><br>..." "<empty>"
  | Entry n -> expect (sprintf "%d entries" n) "fewer"
We also check that our state machine reached a proper finish state, that is it is either in Query or Entry 0 state.
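For readers who want to try the state machine without Core, here is a minimal sketch using only the standard library. It folds over a list of lines rather than a file so that it is easy to test; the names step, parse_count and of_lines are my own, not part of the answer above, and the error messages are only illustrative.

```ocaml
type entry = { name : string; phone : string }
type query = Q of string
type state = Start | Entry of int | Query

(* Fail with a parser error; polymorphic result so it fits any branch. *)
let expect what got =
  failwith (Printf.sprintf "Parser error: expected %s got %s" what got)

let parse_entry line =
  match String.split_on_char ' ' line with
  | [ name; phone ] -> { name; phone }
  | _ -> expect "<name> <phone>" line

let parse_count s =
  match int_of_string_opt (String.trim s) with
  | Some n -> n
  | None -> expect "<number-of-entries>" s

(* One transition of the parser: consume a single input line. *)
let step (es, qs, state) input =
  match state with
  | Start -> (es, qs, Entry (parse_count input))
  | Entry n when n > 0 -> (parse_entry input :: es, qs, Entry (n - 1))
  | Entry _ | Query -> (es, Q input :: qs, Query)

(* Run the machine over all lines and check the final state. *)
let of_lines lines =
  match List.fold_left step ([], [], Start) lines with
  | es, qs, (Entry 0 | Query) -> (List.rev es, List.rev qs)
  | _, _, Start -> expect "<number-of-entries>" "<empty>"
  | _, _, Entry n -> expect (Printf.sprintf "%d more entries" n) "end of input"
```

Both lists are accumulated in reverse and flipped once at the end, which keeps each step constant-time.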

As in Python, the key to a concise implementation is to let the standard library do most of the work; the following code uses Sequence.fold in lieu of Python's list comprehension. Also, using Pervasives.input_line rather than In_channel.input_line allows you to cut down on extraneous pattern matching (it will report an end of file condition as an exception rather than a None result).
open Core.Std
module Dict = Map.Make(String)
let n = int_of_string (input_line stdin)
let d = Sequence.fold
    (Sequence.range 0 n)
    ~init:Dict.empty
    ~f:(fun d _ -> let line = input_line stdin in
         Scanf.sscanf line "%s %s" (fun k v -> Dict.add d ~key:k ~data:v))
let () =
  try while true do
      let name = input_line stdin in
      match Dict.find d name with
      | Some number -> Printf.printf "%s=%s\n" name number
      | None -> Printf.printf "Not found.\n"
    done with End_of_file -> ()
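On the two issues raised in the question (unknown number of lines, last line without a newline): input_line returns a final unterminated line before it raises End_of_file, so reading to the end is enough. Below is a small standard-library sketch, not Core, that separates reading from solving so the logic can be tested on plain lists; read_all_lines, split_n and solve are names I made up for illustration.

```ocaml
(* Read every remaining line of a channel, in order. A last line that
   lacks a trailing newline is still returned by input_line. *)
let read_all_lines ic =
  let rec loop acc =
    match input_line ic with
    | line -> loop (line :: acc)
    | exception End_of_file -> List.rev acc
  in
  loop []

(* Split off the first n elements of a list. *)
let rec split_n n = function
  | x :: rest when n > 0 ->
    let taken, left = split_n (n - 1) rest in
    (x :: taken, left)
  | rest -> ([], rest)

(* Turn the whole input (as lines) into the lines to print. *)
let solve lines =
  match lines with
  | [] -> invalid_arg "solve: empty input"
  | count :: rest ->
    let entries, queries = split_n (int_of_string (String.trim count)) rest in
    let book = Hashtbl.create 16 in
    List.iter
      (fun entry ->
         match String.split_on_char ' ' entry with
         | [ name; phone ] -> Hashtbl.replace book name phone
         | _ -> invalid_arg ("solve: bad entry: " ^ entry))
      entries;
    List.map
      (fun name ->
         match Hashtbl.find_opt book name with
         | Some phone -> name ^ "=" ^ phone
         | None -> "Not found")
      queries

(* A possible entry point:
   let () = List.iter print_endline (solve (read_all_lines stdin)) *)
```

Reading everything first costs memory proportional to the input, which is usually acceptable for this kind of task; the streaming versions above avoid that.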

Related

Ocaml cast string to list of tuples

I have the file "example.dat" containing the text "[(1,2); (3,4); (5,6)]". I need to get a list of tuples from it. I know how to do this for a list of ints.
# let f line = List.map int_of_string line;;
# open Printf
let file = "example.dat"
let () =
let ic = open_in file in
try
let line = input_line ic in
f line;
flush stdout;
close_in ic
with e ->
close_in_noerr ic;
raise e;;
How must I change my functions?
Given a list of strings that represent ints, your function f returns a list of ints. It doesn't return a list of tuples.
You don't say whether you want to verify that the input has some kind of proper form. If you want to verify that it has the form of (say) a list of type (int * int) list in OCaml, this is a parsing problem that would take some work.
If you just want to extract the parts of the input line that look like ints, you can use regular expression processing from the Str module:
# let re = Str.regexp "[^0-9]+" in
Str.split re "[(1,2); (37,4); (5,6)]";;
- : string list = ["1"; "2"; "37"; "4"; "5"; "6"]
Then you can rewrite your function f to collect up each pair of ints into a tuple. I don't see a good way to use List.map for this. You might have to write your own recursive function or use List.fold_left.
Update
I will write you a function that changes a list of values into a list of pairs. I hope this isn't for a school assignment, in which case you should be figuring this out for yourself.
let rec mkpairs l =
  match l with
  | [] | [_] -> []
  | a :: b :: rest -> (a, b) :: mkpairs rest
As you can see, this function silently discards the last element of the list if the list has an odd number of elements.
This function is not tail recursive. So that's something you could think about improving.
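As one possible way to put the two steps together without the Str library, here is a sketch that extracts digit runs by hand (mirroring the "[^0-9]+" split above) and pairs them with an accumulator so the pairing is tail recursive. The name ints_of_string is my own; note that, like the regular expression, it ignores minus signs, so negative numbers are not handled.

```ocaml
(* Collect the maximal runs of decimal digits in s as ints, left to right. *)
let ints_of_string s =
  let buf = Buffer.create 8 in
  let acc = ref [] in
  let flush () =
    if Buffer.length buf > 0 then begin
      acc := int_of_string (Buffer.contents buf) :: !acc;
      Buffer.clear buf
    end
  in
  String.iter
    (fun c ->
       match c with
       | '0' .. '9' -> Buffer.add_char buf c
       | _ -> flush ())
    s;
  flush ();
  List.rev !acc

(* Tail-recursive pairing; a trailing unpaired element is dropped. *)
let mkpairs l =
  let rec go acc = function
    | a :: b :: rest -> go ((a, b) :: acc) rest
    | [] | [ _ ] -> List.rev acc
  in
  go [] l
```

For the example line, mkpairs (ints_of_string "[(1,2); (37,4); (5,6)]") gives [(1, 2); (37, 4); (5, 6)]. This does no validation of the brackets and separators.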
let open Genlex in
let open Stream in
let lexer = make_lexer ["["; "("; ","; ")"; ";"; "]";] in
let stream = lexer (of_string array_string) in
let fail () = failwith "Malformed string" in
let parse_tuple acc = match next stream with
  | Int first -> ( match next stream with
      | Kwd "," -> ( match next stream with
          | Int second -> ( match next stream with
              | Kwd ")" -> (first, second) :: acc
              | _ -> fail () )
          | _ -> fail () )
      | _ -> fail () )
  | _ -> fail ()
in
let rec parse_array acc =
  match next stream with
  | Kwd "(" -> parse_array (parse_tuple acc)
  | Kwd ";" -> parse_array acc
  | Kwd "]" -> acc
  | _ -> fail ()
in
try
  match next stream with
  | Kwd "[" -> List.rev (parse_array [])
  | _ -> fail ()
with Stream.Failure -> fail ();;

Simple parsing of strings in Ocaml

I'm not sure about the best way to approach this, so I figured I'd ask. I have a line like this :
NAME="/dev/sda" TYPE="disk" MODEL="KINGSTON SV300S3"
(gotten from lsblk with a few options) and I'd like to extract each field as simply as possible. Yes, I know lsblk has a very nice --json, but that's unfortunately a recent addition I can't use, we have some pretty old servers still in production.
Maybe using Str with some regex ? Google seems to say menhir a lot, I've never used it, but I'm afraid that might be a bit heavy just for a few variables like that ?
I've tried using String.split_on_char and String.slice, but it starts getting complicated when model contains spaces, String.split_on_char doesn't ignore spaces between double quotes of course.
For simple format like this, the Scanf module might be a viable alternative:
let extract s = Scanf.sscanf s "NAME=%S TYPE=%S MODEL=%S" (fun x y z -> x, y, z);;
extract {|NAME="/dev/sda" TYPE="disk" MODEL="KINGSTON SV300S3"|};;
yields
("/dev/sda", "disk", "KINGSTON SV300S3")
as expected.
While Str could probably do the trick, the lesser-known Genlex module from the standard library can come in quite handy for not-too-heavy string manipulation, at least for formats that more or less obey OCaml's lexical conventions. Basically, it will transform your char stream into a stream of tokens that you can parse much more easily. I imagine that the full output format of lsblk might require some refinements, but for your example, the following is sufficient:
let lexer = Genlex.make_lexer [ "=" ]
let test = "NAME=\"/dev/sda\" TYPE=\"disk\" MODEL=\"KINGSTON SV300S3\""
let test_stream = Stream.of_string test
let test_stream_token = lexer test_stream
let info =
  let l = ref [] in
  try
    while true do
      (* each item is three tokens: an identifier, the "=" keyword, a string *)
      let kw = Stream.next test_stream_token in
      let eq = Stream.next test_stream_token in
      let v = Stream.next test_stream_token in
      let kw =
        match kw with Genlex.Ident s -> s | _ -> failwith "Unrecognized pattern"
      in
      let () = match eq with Genlex.Kwd "=" -> () | _ -> failwith "Expected '='" in
      let v = match v with Genlex.String s -> s | _ -> failwith "Expected string" in
      l:=(kw,v)::!l
    done;
    assert false
  with Stream.Failure -> List.rev !l
Basically, the main loop considers that the information contained in the input is a sequence of items of the form <key>="<value>", decomposed in three tokens by the Genlex-generated lexer.
It results in: [("NAME", "/dev/sda"); ("TYPE", "disk"); ("MODEL", "KINGSTON SV300S3")]
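Since Stream and Genlex were removed from the standard library in OCaml 5.0 (they remain available through the separate camlp-streams package), a Scanf-based loop may be a handier way to get the same key/value list on recent compilers. The sketch below is my own variation, not from either answer; it assumes keys are uppercase ASCII letters and values are double-quoted strings, and parse_fields is a made-up name.

```ocaml
(* Parse a line of KEY="value" items into an association list.
   Assumes uppercase keys and OCaml-style quoted values; raises
   Scanf.Scan_failure on anything else. *)
let parse_fields line =
  let ib = Scanf.Scanning.from_string (String.trim line) in
  let rec loop acc =
    if Scanf.Scanning.end_of_input ib then List.rev acc
    else
      (* the leading space in the format skips whitespace between items *)
      let field = Scanf.bscanf ib " %[A-Z]=%S" (fun k v -> (k, v)) in
      loop (field :: acc)
  in
  loop []
```

Unlike the fixed three-field format string, this accepts any number of fields in any order, so it should keep working if more lsblk columns are requested.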
Got it :
let re = Str.regexp "NAME=\"\\(.*\\)\" TYPE=\"\\(.*\\)\" MODEL=\"\\(.*\\)\"" in
match Str.string_match re line 0 with
| false -> [`Null]
| true ->
let name = Str.matched_group 1 line in
let typ = Str.matched_group 2 line in
let model = Str.matched_group 3 line in
Printf.printf "%s, %s, %s\n" name typ model

Recursive function and exception handling

I am building a floating-point calculator and I'm stuck. The calculator works as a prompt, and my problem is that wherever I handle an exception I leave the recursive function that keeps the prompt running, so execution ends:
let initialDictionary = ref EmptyDictionary;;
let launcher () =
  print_string ("Welcome");
  let rec aux dic =
    try
      print_string ("->");
      aux ( execute dic (s token (Lexing.from_string (read_line () ))));
    with
    | End_of_exec -> print_endline ("closing")
    | Var_not_assigned s -> printf "var %s not assigned" s
  in aux !initialDictionary;;
Where:
val exec : dictionary -> Instruction -> dictionary;
type dictionary = (string,float)list
The point here is that, since lists are immutable in OCaml, the only way I have to keep my dictionary accumulating variables and their values is recursion, passing along the dictionary returned by exec.
So, any ideas on how to stay in the prompt loop while still reporting exceptions?
A read eval print loop interpreter has all these different phases which must be handled separately, and I think your implementation tried to do too much in a single step. Just breaking down the second part of your function will help untangle the different error handlers:
let launcher () =
  let rec prompt dic =
    print_string "->";
    parse dic
  and parse dic =
    let res =
      try
        s token (Lexing.from_string @@ read_line ())
      with End_of_exec -> (print_endline "closing"; raise End_of_exec)
    in evalloop dic res
  and evalloop dic rep =
    let dic =
      try execute dic rep
      with Var_not_assigned s -> (printf "var %s not assigned\n" s; dic)
    in prompt dic
  in prompt !initialDictionary;;
Since each sub function will either tail call the next one or fail with an exception, the overall code should be tail recursive and optimised to a loop.
The good thing about this is that now you get a better picture of what is going on, and it makes it easier to change an individual step of the loop without touching the rest.
By the way, recall that read_line may raise an End_of_file exception as well, but it should be trivial to handle it now (left as an exercise).
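To make the idea concrete, here is a self-contained sketch of such a loop that threads the dictionary through the recursion, recovers from Var_not_assigned, and stops on End_of_exec or End_of_file. The execute function here is a toy stand-in with a syntax I invented ("quit", "x = 1.5", or a bare name); the real lexer and parser from the question are not shown, and repl and reader_of_list are hypothetical names.

```ocaml
exception End_of_exec
exception Var_not_assigned of string

type dictionary = (string * float) list

(* Toy stand-in for the real parser and evaluator (hypothetical syntax). *)
let execute (dic : dictionary) line : dictionary =
  match String.split_on_char ' ' (String.trim line) with
  | [ "quit" ] -> raise End_of_exec
  | [ name; "="; value ] ->
    (match float_of_string_opt value with
     | Some v -> (name, v) :: dic
     | None -> Printf.printf "not a number: %s\n" value; dic)
  | [ name ] ->
    (match List.assoc_opt name dic with
     | Some v -> Printf.printf "%s = %g\n" name v; dic
     | None -> raise (Var_not_assigned name))
  | _ -> Printf.printf "cannot parse: %s\n" line; dic

(* The loop: read is any unit -> string function that raises End_of_file
   when input runs out, such as read_line. The exception branches of
   match keep the recursive calls in tail position. *)
let rec repl read dic =
  match read () with
  | exception End_of_file -> dic
  | line ->
    (match execute dic line with
     | exception End_of_exec -> dic
     | exception Var_not_assigned v ->
       Printf.printf "var %s not assigned\n" v;
       repl read dic
     | dic' -> repl read dic')

(* Helper for trying the loop without a terminal: serves lines from a
   list, then raises End_of_file as read_line would. *)
let reader_of_list lines =
  let rest = ref lines in
  fun () ->
    match !rest with
    | [] -> raise End_of_file
    | x :: tl -> rest := tl; x
```

Interactively this would be started as repl read_line [] (printing a prompt before each read is left out for brevity). No mutable reference to the dictionary is needed, since each iteration receives the previous result.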
OK, a friend of mine gave me a solution, and thinking about it I can't figure out a more elegant way.
let exec () =
  let _ = print_string "->" in
  ( try
      let inst = s token (Lexing.from_string (read_line ())) in
      try
        let r = exec_instruction !initialDictionary inst in
        initialDictionary := r
      with
      | Var_not_assigned s -> printf "var %s not assigned. \n" s
      | Com_not_implemented s -> printf " command %s not implemented. \n" s
      | Function_not_implemented s -> printf " function %s not implemented. \n" s
    with
    | lexic_error -> print_endline "lexic error"
    | Parsing.Parse_error -> print_endline "sintax error"
  );;
let rec program cont =
  if cont then
    try
      exec (); program true
    with
    | End_of_exec -> print_endline "closing"; program false;;
program true
The only thing that bothers me is the initialDictionary := r assignment: exec_instruction returns a dictionary, so this could be done recursively. But it works anyway; I'll figure something out.
Thank you for the help. If someone sees a better solution, let me know.

prompt user to build a string list

I would like to build a string list by prompting the user for input. My end goal is to be able to parse a string list against a simple hash table using a simple routine.
`let list_find tbl ls =
List.iter (fun x ->
let mbr = if Hashtbl.mem tbl x then "aok" else "not found"
in
Printf.printf "%s %s\n" x mbr) ls ;;`
Building a string list is accomplished with the cons operator ::, but somehow I am not able to get the prompt to generate a string list. A simple list function returns anything that is put into it as a list:
`let build_strlist x =
let rec aux x = match x with
| [] -> []
| hd :: tl -> hd :: aux tl
in
aux x ;;`
Thus far, I have been able to set the prompt, but building the string list did not go so well. I am inclined to think I should be using Buffer or Scanning.in_channel. This is what I have thus far:
`#load "unix.cma" ;;
let prompt () = Unix.isatty Unix.stdin && Unix.isatty Unix.stdout ;;
let build_strlist () =
let rec loop () =
let eof = ref false in
try
while not !eof do
if prompt () then print_endline "enter input ";
let line = read_line () in
if line = "-1" then eof := true
else
let rec build x = match x with
| [] -> []
| hd :: tl -> hd :: build tl
in
Printf.printf "you've entered %s\n" (List.iter (build line));
done
with End_of_file -> ()
in
loop () ;;`
I am getting an error saying that line has type string but an expression was expected of type 'a list. Should I be building the string list using Buffer.create buf and then Buffer.add_string buf, prepending [ followed by quotes " another " and a semicolon? This seems to be overkill. Maybe I should just return a string list and ignore any attempts to "peek at what we have"? Printing will be done after checking the hash table.
I would like to have a prompt routine so that I can use ocaml for scripting and user interaction. I found some ideas on-line which allowed me to write the skeleton above.
I would probably break down the problem in several steps:
get the list of strings
process it (in your example, simply print it back)
The first step can be achieved with a recursive function as follows:
let build_strlist' () =
  let rec loop l =
    if prompt () then (
      print_string "enter input: ";
      match read_line () with
        "-1" -> l
      | s -> loop (s::l)
    ) else l
  in loop [];;
See how that function loops on itself and builds up the list l as it goes. As you mentioned in your comment, I dropped the imperative part of your code to keep the functional recursion only. You could have achieved the same by keeping the imperative part and leaving out the recursion instead, but recursion feels more natural to me and, if written correctly, leads to mostly the same machine code.
Once you have the list, simply apply a List.iter to it with the ad hoc printing function as you did in your original function.
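A slightly more defensive variant of both steps might look like the sketch below. It differs from the code above in a few ways that are my own choices: the reader is passed in as a function (so read_line can be swapped for a test double), End_of_file ends the list instead of escaping, the result is returned in input order, and the second step returns pairs rather than printing. The name classify is made up for illustration.

```ocaml
(* Read lines with read until the sentinel or end of input;
   the result is in input order. *)
let build_strlist ?(stop = "-1") read =
  let rec loop acc =
    match read () with
    | exception End_of_file -> List.rev acc
    | line when line = stop -> List.rev acc
    | line -> loop (line :: acc)
  in
  loop []

(* Pair each string with "aok" or "not found" depending on whether
   it is a key of the table. *)
let classify tbl ls =
  List.map (fun x -> (x, if Hashtbl.mem tbl x then "aok" else "not found")) ls
```

In a script this would be called as build_strlist read_line, and the pairs can then be printed with List.iter and Printf.printf "%s %s\n".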

OCaml error: wrong type of expression in constructor

I have a function save that takes standard input, which is used on its own like this:
./try < input.txt (* save function is in try file *)
input.txt
2
3
10 29 23
22 14 9
Now I have put the function into another file called path.ml, which is part of my interpreter. I have a problem defining the type of the Save constructor, because the save function reads from an in_channel, but when I write
type term = Save of in_channel
ocamlc complains about the parameter in the command function.
How can I fix this error? This is why, in my last question posted on Stack Overflow, I asked for a way to express a variable that accepts any type. I understand the answers, but they don't help much in making the code run.
This is my code:
(* Data types *)
open Printf
type term = Print_line_in_file of int*string
| Print of string
| Save of in_channel (* error here *)
;;
let input_line_opt ic =
try Some (input_line ic)
with End_of_file -> None
let nth_line n filename =
let ic = open_in filename in
let rec aux i =
match input_line_opt ic with
| Some line ->
if i = n then begin
close_in ic;
(line)
end else aux (succ i)
| None ->
close_in ic;
failwith "end of file reached"
in
aux 1
(* get all lines *)
let k = ref 1
let first = ref ""
let second = ref ""
let sequence = ref []
let append_item lst a = lst @ [a]
let save () =
try
while true do
let line = input_line stdin in
if k = ref 1
then
begin
first := line;
incr k;
end else
if k = ref 2
then
begin
second := line;
incr k;
end else
begin
sequence := append_item !sequence line;
incr k;
end
done;
None
with
End_of_file -> None;;
let rec command term = match term with
| Print (n) -> print_endline n
| Print_line_in_file (n, f) -> print_endline (nth_line n f)
| Save () -> save ()
;;
EDIT
Error in code:
Save of in_channel:
Error: This pattern matches values of type unit
but a pattern was expected which matches values of type in_channel
Save of unit:
Error: This expression has type 'a option
but an expression was expected of type unit
There are many errors in this code, so it's hard to know where to start.
One problem is this: your save function has type unit -> 'a option. So it's not the same type as the other branches of your final match. The fix is straightforward: save should return (), not None. In OCaml these are completely different things.
The immediate problem seems to be that you have Save () in your match, but have declared Save as taking an input channel. Your current code doesn't have any way to pass the input channel to the save function, but if it did, you would want something more like this in your match:
| Save ch -> save ch
Errors like this suggest (to me) that you're not so familiar with OCaml's type system. It would probably save you a lot of trouble if you went through a tutorial of some kind before writing much more code. You can find tutorials at http://ocaml.org.
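As a rough illustration of the two fixes described above, here is a trimmed-down sketch in which Save carries the channel and every branch of command returns unit. It is not the asker's full program: Print_line_in_file is left out, and storing the lines in a single saved reference (instead of first, second and sequence) is my simplification.

```ocaml
type term =
  | Print of string
  | Save of in_channel

(* Lines read by the most recent Save, in input order. *)
let saved : string list ref = ref []

(* Read all remaining lines from the channel; returns unit, not None. *)
let save ic =
  let rec loop acc =
    match input_line ic with
    | line -> loop (line :: acc)
    | exception End_of_file -> List.rev acc
  in
  saved := loop []

(* Every branch has type unit, so the match type-checks. *)
let command term =
  match term with
  | Print s -> print_endline s
  | Save ic -> save ic
```

To read from standard input as in ./try < input.txt, the call would be command (Save stdin).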