Simple parsing of strings in OCaml

I'm not sure about the best way to approach this, so I figured I'd ask. I have a line like this :
NAME="/dev/sda" TYPE="disk" MODEL="KINGSTON SV300S3"
(gotten from lsblk with a few options) and I'd like to extract each field as simply as possible. Yes, I know lsblk has a very nice --json, but that's unfortunately a recent addition I can't use, we have some pretty old servers still in production.
Maybe using Str with some regex ? Google seems to say menhir a lot, I've never used it, but I'm afraid that might be a bit heavy just for a few variables like that ?
I've tried using String.split_on_char and String.slice, but it gets complicated when the model contains spaces: String.split_on_char doesn't ignore spaces between double quotes, of course.

For simple format like this, the Scanf module might be a viable alternative:
let extract s = Scanf.sscanf s "NAME=%S TYPE=%S MODEL=%S" (fun x y z -> x, y, z);;
extract {|NAME="/dev/sda" TYPE="disk" MODEL="KINGSTON SV300S3"|};;
yields
("/dev/sda", "disk", "KINGSTON SV300S3")
as expected.
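If a line might not match the format, sscanf raises an exception instead of returning a value. A minimal sketch, assuming you would rather get an option back (extract_opt is a name made up for illustration, not part of the answer above):

```ocaml
(* Hypothetical wrapper: returns None instead of raising when the line
   does not have the NAME/TYPE/MODEL shape. *)
let extract_opt s =
  try
    Some (Scanf.sscanf s "NAME=%S TYPE=%S MODEL=%S" (fun n t m -> (n, t, m)))
  with Scanf.Scan_failure _ | Failure _ | End_of_file -> None
```

The three exceptions caught are the ones the Scanf documentation lists for a failed scan; anything else is left to propagate.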

While Str could probably do the trick, the lesser-known Genlex module from the standard library can come quite handy for not-too-heavy string manipulation, at least for formats that more or less obey OCaml's lexical convention. Basically, it will transform your char stream into a stream of tokens that you can parse much more easily. I imagine that the full output format of lsblk might require some refinements, but for your example, the following is sufficient:
let lexer = Genlex.make_lexer [ "=" ]
let test = "NAME=\"/dev/sda\" TYPE=\"disk\" MODEL=\"KINGSTON SV300S3\""
let test_stream = Stream.of_string test
let test_stream_token = lexer test_stream
let info =
  let l = ref [] in
  try
    while true do
      let kw = Stream.next test_stream_token in
      let eq = Stream.next test_stream_token in
      let v = Stream.next test_stream_token in
      let kw =
        match kw with Genlex.Ident s -> s | _ -> failwith "Unrecognized pattern"
      in
      let () = match eq with Genlex.Kwd "=" -> () | _ -> failwith "Expected '='" in
      let v = match v with Genlex.String s -> s | _ -> failwith "Expected string" in
      l := (kw, v) :: !l
    done;
    assert false
  with Stream.Failure -> List.rev !l
Basically, the main loop treats the input as a sequence of items of the form <key>="<value>", each split into three tokens by the Genlex-generated lexer.
It results in: [("NAME", "/dev/sda"); ("TYPE", "disk"); ("MODEL", "KINGSTON SV300S3")]
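The same key/value loop can also be sketched with Scanf alone, which may be handy on OCaml 5, where Genlex and Stream are no longer in the standard library. This assumes every value is a double-quoted string and that keys never contain '='; parse_pairs is a hypothetical name:

```ocaml
(* Reads <key>="<value>" items until the input is exhausted.
   %[^=] reads the key up to '=', %S reads an OCaml-style quoted string,
   and the spaces in the format skip any whitespace between items. *)
let parse_pairs line =
  let ib = Scanf.Scanning.from_string line in
  let rec loop acc =
    if Scanf.Scanning.end_of_input ib then List.rev acc
    else
      let pair = Scanf.bscanf ib " %[^=]=%S " (fun k v -> (k, v)) in
      loop (pair :: acc)
  in
  loop []
```

Unlike the fixed three-field format, this does not care how many fields lsblk prints or in which order.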

Got it:
let re = Str.regexp "NAME=\"\\(.*\\)\" TYPE=\"\\(.*\\)\" MODEL=\"\\(.*\\)\"" in
match Str.string_match re line 0 with
| false -> prerr_endline "line does not match the expected format"
| true ->
  let name = Str.matched_group 1 line in
  let typ = Str.matched_group 2 line in
  let model = Str.matched_group 3 line in
  Printf.printf "%s, %s, %s\n" name typ model

Related

Use meta-programming in F* for a syntactic check on a function argument

I would like to write a function that enforces that its argument is, syntactically, a constant string. Here's what I tried:
module Test
module R = FStar.Reflection
let is_literal (t: R.term) =
  match R.inspect_ln t with
  | R.Tv_Const (R.C_String _) -> true
  | _ -> false
let check_literal (s: string { normalize (is_literal (`s)) }) =
  ()
let test () =
  check_literal ""; // should work
  let s = "" in
  check_literal s // should not work
However, I'm pretty sure static quotations (with `) are not what I want, but instead dynamic quotations with quote. But this would put my precondition into the Tac effect. Is there any way to do what I want in the current state of things?
I don't know if you finally found a solution, but what about implicit meta arguments?
They somehow allow running Tac code at function invocation time, making quote usable.
Changing your code a bit doing so seems to work:
module Is_lit
open FStar.Tactics
let is_literal (t: term) =
  match inspect_ln t with
  | Tv_Const (C_String _) -> true
  | _ -> false
let check_literal (s: string)
  (#[(if (normalize_term (is_literal (quote s)))
      then exact (`())
      else fail "not a literal")
    ] witness: unit)
  : unit =
  ()
// success
let _ = check_literal "hey"
// failure
[@expect_failure]
let _ = let s = "hey" in check_literal s

How to easily read lines from stdin?

Some time ago, I decided to solve a simple task on HackerRank but using OCaml and Core, in order to learn them. In one of the tasks, I'm supposed to read data from standard input:
The first line contains an integer, denoting the number of entries
in the phone book. Each of the subsequent lines describes an entry in
the form of space-separated values on a single line. The first value
is a friend's name, and the second value is an -digit phone number.
After the lines of phone book entries, there are an unknown number of
lines of queries. Each line (query) contains a to look up, and you
must continue reading lines until there is no more input.
The main issues:
I don't know how many lines there will be
The last line doesn't end with a newline, so I can't just read with scanf "%s\n" until End_of_file
And my code became messy:
open Core.Std
open Printf
open Scanf
let read_numbers n =
  let phone_book = String.Table.create () ~size:n in
  for i = 0 to (n - 1) do
    match In_channel.input_line stdin with
    | Some line -> (
        match (String.split line ~on:' ') with
        | key :: data :: _ -> Hashtbl.set phone_book ~key ~data
        | _ -> failwith "This shouldn't happen"
      )
    | None -> failwith "This shouldn't happen"
  done;
  phone_book
let () =
  let rec loop phone_book =
    match In_channel.input_line stdin with
    | Some line -> (
        let s = match Hashtbl.find phone_book line with
          | Some number -> sprintf "%s=%s" line number
          | None -> "Not found"
        in
        printf "%s\n%!" s;
        loop phone_book
      )
    | None -> ()
  in
  match In_channel.input_line stdin with
  | Some n -> (
      let phone_book = read_numbers (int_of_string n) in
      loop phone_book
    )
  | None -> failwith "This shouldn't happen"
If I solve this task in Python, then code looks like this:
n = int(input())
book = dict([tuple(input().split(' ')) for _ in range(n)])
while True:
    try:
        name = input()
    except EOFError:
        break
    else:
        if name in book:
            print('{}={}'.format(name, book[name]))
        else:
            print('Not found')
This is shorter and clearer than the OCaml code. Any advice on how to improve my OCaml code? Two important things: first, I don't want to abandon OCaml, I just want to learn it; second, I want to use Core for the same reason.
The direct implementation of the Python code in OCaml would look like this:
let exec name =
  In_channel.(with_file name ~f:input_lines) |> function
  | [] -> invalid_arg "Got empty file"
  | x :: xs ->
    let es,qs = List.split_n xs (Int.of_string x) in
    let es = List.map es ~f:(fun entry -> match String.split ~on:' ' entry with
        | [name; phone] -> name,phone
        | _ -> invalid_arg "bad entry format") in
    List.iter qs ~f:(fun name ->
        match List.Assoc.find es name with
        | None -> printf "Not found\n"
        | Some phone -> printf "%s=%s\n" name phone)
However, OCaml is not a scripting language for small scripts and one-shot prototypes. It is a language for writing real software that must be readable, supportable, testable, and maintainable. That's why we have types, modules, and all the rest. So if I were writing a production-quality program responsible for handling such input, it would look very different.
The general style that I personally employ, when I'm writing a program in a functional language is to follow these two simple rules:
When in doubt use more types.
Have fun (lots of fun).
I.e., allocate a type for each concept in the program domain, and use lots of small functions.
The following code is twice as big, but is more readable, maintainable, and robust.
So, first of all, let's type: the entry is simply a record. I used a string type to represent a phone for simplicity.
type entry = {
  name : string;
  phone : string;
}
The query is not specified in the task, so let's just stub it with a string:
type query = Q of string
Now our parser state. We have three possible states: the Start state, a state Entry n, where we're parsing entries with n entries left so far, and Query state, when we're parsing queries.
type state =
| Start
| Entry of int
| Query
Now we need to write a function for each state, but first of all, let's define an error handling policy. For a simple program, I would suggest just to fail on a parser error. We will call a function named expect when our expectations fail:
let expect what got =
  failwithf "Parser error: expected %s got %s\n" what got ()
Now the three parsing functions:
let parse_query s = Q s
let parse_entry line = match String.split ~on:' ' line with
  | [name;phone] -> {name;phone}
  | _ -> expect "<name> <phone>" line
let parse_expected s =
  try int_of_string s with _ ->
    expect "<number-of-entries>" s
Now let's write the parser:
let parse (es,qs,state) input = match state with
  | Start -> es,qs,Entry (parse_expected input)
  | Entry 0 -> es, parse_query input :: qs,Query (* no entries left: this line is already a query *)
  | Entry n -> parse_entry input :: es,qs,Entry (n-1)
  | Query -> es, parse_query input :: qs,Query
And finally, let's read data from file:
let of_file name =
  let es,qs,state =
    In_channel.with_file name ~f:(fun ch ->
        In_channel.fold_lines ch ~init:([],[],Start) ~f:parse) in
  match state with
  | Entry 0 | Query -> ()
  | Start -> expect "<number-of-entries><br>..." "<empty>"
  | Entry n -> expect (sprintf "%d entries" n) "fewer"
We also check that our state machine reached a proper finish state, that is it is either in Query or Entry 0 state.
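To see the state machine run end to end without Core or a file, here is a self-contained sketch over a plain list of lines. It uses only the standard library (String.split_on_char and List.fold_left stand in for the Core calls), and the run function is an assumption added for illustration:

```ocaml
type entry = { name : string; phone : string }
type query = Q of string
type state = Start | Entry of int | Query

(* One step of the state machine: consume a line, return the new state. *)
let parse (es, qs, state) line =
  match state with
  | Start -> (es, qs, Entry (int_of_string line))
  | Entry n when n > 0 ->
    (match String.split_on_char ' ' line with
     | [name; phone] -> ({ name; phone } :: es, qs, Entry (n - 1))
     | _ -> failwith ("Parser error: expected <name> <phone> got " ^ line))
  (* Once no entries remain, every further line is a query. *)
  | Entry _ | Query -> (es, Q line :: qs, Query)

(* Hypothetical driver: folds the parser over the lines and returns
   entries and queries in input order. *)
let run lines =
  let es, qs, _ = List.fold_left parse ([], [], Start) lines in
  (List.rev es, List.rev qs)
```

For example, run ["1"; "sam 99912222"; "sam"] yields one entry and one query.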
As in Python, the key to a concise implementation is to let the standard library do most of the work; the following code uses Sequence.fold in lieu of Python's list comprehension. Also, using Pervasives.input_line rather than In_channel.input_line allows you to cut down on extraneous pattern matching (it will report an end of file condition as an exception rather than a None result).
open Core.Std
module Dict = Map.Make(String)
let n = int_of_string (input_line stdin)
let d = Sequence.fold
    (Sequence.range 0 n)
    ~init:Dict.empty
    ~f:(fun d _ -> let line = input_line stdin in
         Scanf.sscanf line "%s %s" (fun k v -> Dict.add d ~key:k ~data:v))
let () =
  try while true do
      let name = input_line stdin in
      match Dict.find d name with
      | Some number -> Printf.printf "%s=%s\n" name number
      | None -> Printf.printf "Not found.\n"
    done with End_of_file -> ()
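The End_of_file-driven reading used above can be packaged as a small helper. This is a sketch using only the standard library, not Core; read_lines is a made-up name, and it relies on the `match ... with exception` form available since OCaml 4.02:

```ocaml
(* Reads every remaining line from a channel, in order.
   input_line also returns a final line that lacks a trailing newline,
   and only raises End_of_file once nothing is left. *)
let read_lines ic =
  let rec loop acc =
    match input_line ic with
    | line -> loop (line :: acc)
    | exception End_of_file -> List.rev acc
  in
  loop []
```

Calling read_lines stdin after the entries have been consumed would give the list of queries, however many there are.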

Extracting data from a tuple in OCaml

I'm trying to use the CIL library to parse C source code. I'm searching for a particular function using its name.
let cil_func = Caml.List.find (fun g ->
    match g with
    | GFun(f,_) when (equal f.svar.vname func) -> true
    | _ -> false
  ) cil_file.globals in
let body g = match g with GFun(f,_) -> f.sbody in
dumpBlock defaultCilPrinter stdout 1 (body cil_func)
So I have a type GFun of fundec * location, and I'm trying to get the sbody attribute of fundec.
It seems redundant to do a second pattern match, not to mention, the compiler complains that it's not exhaustive. Is there a better way of doing this?
You can define your own function that returns just the fundec:
let rec find_fundec fname = function
  | [] -> raise Not_found
  | GFun (f, _) :: _ when equal (f.svar.vname fname) -> f (* ? *)
  | _ :: t -> find_fundec fname t
Then your code looks more like this:
let cil_fundec = find_fundec func cil_file.globals in
dumpBlock defaultCilPrinter stdout 1 cil_fundec.sbody
For what it's worth, the line marked (* ? *) looks wrong to me. I don't see why f.svar.vname would be a function. I'm just copying your code there.
Update
Fixed an error (one I often make), sorry.
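On OCaml 4.10 or later, the same search-and-extract can be sketched with List.find_map, which returns the payload directly and so needs no second pattern match. The types below are simplified stand-ins invented for illustration, not CIL's real definitions:

```ocaml
(* Toy stand-ins for CIL's fundec and global types (assumptions). *)
type fundec = { vname : string; sbody : string }
type global = GFun of fundec * int | GVar of string

(* Returns Some fundec for the first function with the given name,
   or None when no such function exists. *)
let find_fundec fname globals =
  List.find_map
    (function GFun (f, _) when f.vname = fname -> Some f | _ -> None)
    globals
```

Returning an option rather than raising Not_found also makes the "no such function" case visible in the type.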

OCaml error: wrong type of expression in constructor

I have a function save that reads from standard input, and which is used on its own like this:
./try < input.txt (* save function is in try file *)
input.txt
2
3
10 29 23
22 14 9
Now I have put the function into another file called path.ml, which is part of my interpreter. I have a problem defining the type of the Save constructor, because the save function has type in_channel, but when I write
type term = Save of in_channel
ocamlc complains about the parameter in the command function.
How can I fix this error? This is why, in my last question posted on Stack Overflow, I asked for a way to express a variable that accepts any type. I understand the answers, but they don't actually help much in getting the code to run.
This is my code:
(* Data types *)
open Printf
type term = Print_line_in_file of int*string
          | Print of string
          | Save of in_channel (* error here *)
;;
let input_line_opt ic =
  try Some (input_line ic)
  with End_of_file -> None
let nth_line n filename =
  let ic = open_in filename in
  let rec aux i =
    match input_line_opt ic with
    | Some line ->
      if i = n then begin
        close_in ic;
        (line)
      end else aux (succ i)
    | None ->
      close_in ic;
      failwith "end of file reached"
  in
  aux 1
(* get all lines *)
let k = ref 1
let first = ref ""
let second = ref ""
let sequence = ref []
let append_item lst a = lst @ [a]
let save () =
  try
    while true do
      let line = input_line stdin in
      if k = ref 1
      then
        begin
          first := line;
          incr k;
        end else
      if k = ref 2
      then
        begin
          second := line;
          incr k;
        end else
        begin
          sequence := append_item !sequence line;
          incr k;
        end
    done;
    None
  with
    End_of_file -> None;;
let rec command term = match term with
  | Print (n) -> print_endline n
  | Print_line_in_file (n, f) -> print_endline (nth_line n f)
  | Save () -> save ()
;;
EDIT
Error in code:
Save of in_channel:
Error: This pattern matches values of type unit
but a pattern was expected which matches values of type in_channel
Save of unit:
Error: This expression has type 'a option
but an expression was expected of type unit
There are many errors in this code, so it's hard to know where to start.
One problem is this: your save function has type unit -> 'a option. So it's not the same type as the other branches of your final match. The fix is straightforward: save should return (), not None. In OCaml these are completely different things.
The immediate problem seems to be that you have Save () in your match, but have declared Save as taking an input channel. Your current code doesn't have any way to pass the input channel to the save function, but if it did, you would want something more like this in your match:
| Save ch -> save ch
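Putting both fixes together, a reduced sketch might look like the following. It assumes save should take the channel as an argument and simply collect the lines; the saved reference and the trimmed-down term type are inventions for this example, not the asker's full program:

```ocaml
type term =
  | Print of string
  | Save of in_channel

(* Lines collected so far, most recent first (hypothetical storage). *)
let saved : string list ref = ref []

(* Reads the channel to the end. Both the loop and the handler have
   type unit, so save has type in_channel -> unit. *)
let save ic =
  try
    while true do
      saved := input_line ic :: !saved
    done
  with End_of_file -> ()

(* Every branch now returns unit, so the match type-checks. *)
let command = function
  | Print s -> print_endline s
  | Save ic -> save ic
```

With this shape, reading standard input is just command (Save stdin).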
Errors like this suggest (to me) that you're not so familiar with OCaml's type system. It would probably save you a lot of trouble if you went through a tutorial of some kind before writing much more code. You can find tutorials at http://ocaml.org.

F# Active Pattern List.filter or equivalent

I have a records of types
type tradeLeg = {
    id : int ;
    tradeId : int ;
    legActivity : LegActivityType ;
    actedOn : DateTime ;
    estimates : legComponents ;
    entryType : ShareOrDollarBased ;
    confirmedPrice: DollarsPerShare option;
    actuals : legComponents option ;
}
type trade = {
    id : int ;
    securityId : int ;
    ricCode : string ;
    tradeActivity : TradeType ;
    enteredOn : DateTime ;
    closedOn : DateTime ;
    tradeLegs : tradeLeg list ;
}
Obviously the tradeLegs belong to a trade. A leg may be settled or unsettled (or unsettled but price confirmed), so I have defined this active pattern:
let (|LegIsSettled|LegIsConfirmed|LegIsUnsettled|) (l: tradeLeg) =
    if Helper.exists l.actuals then LegIsSettled
    elif Helper.exists l.confirmedPrice then LegIsConfirmed
    else LegIsUnsettled
and then, to determine whether a trade is settled (based on all legs matching the LegIsSettled pattern):
let (|TradeIsSettled|TradeIsUnsettled|) (t: trade) =
    if List.exists (
        fun l ->
            match l with
            | LegIsSettled -> false
            | _ -> true) t.tradeLegs then TradeIsSettled
    else TradeIsUnsettled
I can see some advantages to this use of active patterns; however, I would think there is a more efficient way to check whether any item of a list matches (or doesn't match) an active pattern, without having to write a lambda expression specifically for it and pass it to List.exists.
The question is twofold:
is there a more concise way to express this?
is there a way to abstract the functionality / expression
(fun l ->
    match l with
    | LegIsSettled -> false
    | _ -> true)
such that
let itemMatchesPattern pattern item =
    match item with
    | pattern -> true
    | _ -> false
so that I could write (as I am reusing this design pattern):
let curriedItemMatchesPattern = itemMatchesPattern LegIsSettled
if List.exists curriedItemMatchesPattern t.tradeLegs then TradeIsSettled
else TradeIsUnsettled
Thoughts?
To answer your question about active patterns, let me use a simpler example:
let (|Odd|Even|) n =
if n % 2 = 0 then Even else Odd
When you declare a pattern that has multiple options using (|Odd|Even|), then the compiler understands it as a function that returns a value of type Choice<unit, unit>. So, the active pattern that you can work with is the whole combination |Odd|Even| and not just two constructs that you could use independently (such as |Odd| and |Even|).
It is possible to treat active patterns as first class functions, but if you're using patterns with multiple options, you cannot do much with it:
let pattern = (|Odd|Even|);;
val pattern : int -> Choice<unit,unit>
You can write function that tests whether a value matches a specified pattern, but you'd need a lot of functions (because there are many Choice types overloaded by the number of type parameters):
let is1Of2 pattern item =
    match pattern item with
    | Choice1Of2 _ -> true
    | _ -> false
> is1Of2 (|Odd|Even|) 1;;
val it : bool = true
Something like this would work in your case, but it is far from being perfect.
You can do a little better if you declare multiple partial active patterns (but then, of course, you lose some nice aspects of full active patterns, such as completeness checking):
let (|Odd|_|) n =
    if n % 2 = 0 then None else Some()
let (|Even|_|) n =
    if n % 2 = 0 then Some() else None
Now you can write a function that checks whether a value matches pattern:
let matches pattern value =
    match pattern value with
    | Some _ -> true
    | None -> false
> matches (|Odd|_|) 1;;
val it : bool = true
> matches (|Even|_|) 2;;
val it : bool = true
Summary: While there may be some more or less elegant way to achieve what you need, I'd consider whether active patterns give you any big advantage over standard functions. It may be a better idea to implement the code using functions first, then decide which constructs would be useful as active patterns and add them later. In this case, the usual code wouldn't look much worse:
type LegResult = LegIsSettled | LegIsConfirmed | LegIsUnsettled
let getLegStatus (l: tradeLeg) =
    if Helper.exists l.actuals then LegIsSettled
    elif Helper.exists l.confirmedPrice then LegIsConfirmed
    else LegIsUnsettled
// Later in the code you would use pattern matching
match getLegStatus trade with
| LegIsSettled -> // ...
| LegIsUnsettled -> // ...
// But you can still use higher-order functions too
trades |> List.exists (fun t -> getLegStatus t = LegIsSettled)
// Which can be rewritten (if you like point-free style):
trades |> List.exists (getLegStatus >> ((=) LegIsSettled))
// Or you can write helper function (which is more readable):
let legStatusIs check trade = getLegStatus trade = check
trades |> List.exists (legStatusIs LegIsSettled)
In addition to Tomas's points on the actual details of active patterns, note that you can always shorten fun x -> match x with |... to function | ..., which will save a few keystrokes as well as the need to make up a potentially meaningless identifier.