How to parse a string into a code structure with TemplateHaskell?

Right now, I have the following piece of code in my project:
embedNarration :: String -> Q Exp
embedNarration file =
  let text = unsafePerformIO $ readFile file
      parsedMaybe = parseNarration text
      succ x = case x of
        Left x -> throw $ ErrorCall x
        Right x' -> x'
      parsed = succ parsedMaybe
      res = do parser <- [|((fromRight undefined) . parseNarration)|]
               d <- return $ seq parsed text
               return $ AppE parser $! LitE $! StringL d
  in res
This is jury-rigged from the source code of the Data.FileEmbed module. The intention is to generate a Narration (a data structure defined in my code) from a resource file.
Right now this rather ugly piece of code, which I don't fully understand, tries to parse the resource file at compile time. If the parse fails, it throws a compile-time error; if the parse succeeds, it embeds the following expression into the source code:
((fromRight undefined) . parseNarration $ "THE ENTIRE RESOURCE FILE")
where parseNarration is a function of type String -> Either String Narration.
The problem here is the double parsing: the resource file is parsed once at compile time to ensure it is valid, and then a second time at runtime from a string literal. Ideally, instead of embedding a string literal and a call to the parser, I want TemplateHaskell to substitute a Narration value directly, so that the parser only runs at compile time. But I have no idea how to do this. Surface-level guides to TemplateHaskell and attempts to jury-rig the code further haven't been successful. Is this possible? If yes, how?
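One possible direction, offered as a sketch rather than a definitive answer: give Narration a Lift instance (here derived with the DeriveLift extension) and have the splice return lift of the parsed value, so the generated code is the constructor application itself and not a call to the parser. The Narration type and the parseNarration body below are hypothetical stand-ins, since the real definitions are not shown in the question; only the shape of embedNarration is the point.

```haskell
{-# LANGUAGE DeriveLift #-}
-- Put this in its own module (for example Narration.hs with a module header):
-- a top-level splice cannot call a function defined in the same module.

import Language.Haskell.TH (Exp, Q, runIO)
import Language.Haskell.TH.Syntax (Lift (..))

-- Hypothetical stand-in for the real Narration type. Every type it
-- contains must itself have a Lift instance for the deriving to work.
data Narration = Narration [String]
  deriving (Show, Lift)

-- Hypothetical stand-in parser: one entry per line, empty input is an error.
parseNarration :: String -> Either String Narration
parseNarration "" = Left "empty narration"
parseNarration s  = Right (Narration (lines s))

-- Read and parse the file at compile time, then embed the parsed value.
embedNarration :: FilePath -> Q Exp
embedNarration file = do
  text <- runIO (readFile file)            -- IO inside Q, no unsafePerformIO
  case parseNarration text of
    Left err  -> fail ("cannot parse " ++ file ++ ": " ++ err)  -- compile-time error
    Right nar -> lift nar                  -- expression that rebuilds the value
```

In a module that imports the one above and enables TemplateHaskell, the use site would look like narration = $(embedNarration "res/intro.txt"), where the path is only an example. Calling addDependentFile from Language.Haskell.TH.Syntax inside embedNarration is a common addition so that GHC recompiles when the resource file changes.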

Related

OCAML Taking multiple arguments from stdin and operating on them one by one

I have written an interpreter using ocamllex and ocamlyacc. The lexer and the parser work correctly, but currently they only parse the last .txt argument they receive, as opposed to all of them in turn. For example, ./interpret one.txt two.txt three.txt only parses three.txt, as opposed to parsing one.txt, then two.txt, then three.txt, which is what I want. So for example the parse results are as follows:
one.txt -> "1"
two.txt -> "2"
three.txt -> "3"
On calling ./interpret one.txt two.txt three.txt the current output is: 3 but I want it to be 123
Here is my main class which deals with the stdin and stdout
open Lexer
open Parser
open Arg
open Printf
let toParse c =
  try let lexbuf = Lexing.from_channel c in
    parser_main lexer_main lexbuf
  with Parsing.Parse_error -> failwith "Parse failure!" ;;
let argument = ref stdin in
let prog p = argument := open_in p in
let usage = "./interpreter FILE" in
parse [] prog usage ;
let parsed = toParse !argument in
let result = eval parsed in
let _ = parsed in
flush stdout;
Thanks for your time
There's not really enough code here to be able to help.
If I assume that the output is written by eval, then I see only one call to eval. But there's nothing here that deals with filenames from the command line, so it's hard to say more.
If you are planning to read input from files, then there's no reason to be using stdin for anything as far as I can tell.
(I know this is a very minor point, but this code doesn't constitute a class. Other languages use classes for everything, but this is a module.)
Update
Here's a module that works something like the Unix cat command; it writes out the contents of all the files from the command line one after the next.
let cat () =
  for i = 1 to Array.length Sys.argv - 1 do
    let ic = open_in Sys.argv.(i) in
    let rec loop () =
      match input_line ic with
      | line -> output_string stdout (line ^ "\n"); loop ()
      | exception End_of_file -> ()
    in
    loop ();
    close_in ic
  done
let () = cat ()
Here's how it looks when you compile and run it.
$ ocamlc -o mycat mycat.ml
$ echo test line 1 > file1
$ echo test line 2 > file2
$ ./mycat file1 file2
test line 1
test line 2

Get the input string that raises parsing error inside the parser

I have a frontend written in menhir which parses an expression from a string into an expression AST. The entry point of the frontend, Parser_e.main, is called in several different places in my OCaml code, so I would like to catch possible errors inside the frontend rather than outside. When catching an error, one particularly important piece of information I want to show is the entire input string that the frontend cannot parse. (Errors from the lexer are very rare, because the frontend can read almost everything.)
So I tried to follow this thread, and to print more information when there is an error. In parser_e.mly, I have added
exception LexErr of string
exception ParseErr of string
let error msg start finish =
  Printf.sprintf "(line %d: char %d..%d): %s" start.pos_lnum
    (start.pos_cnum - start.pos_bol) (finish.pos_cnum - finish.pos_bol) msg
let parse_error msg nterm =
  raise (ParseErr (error msg (rhs_start_pos nterm) (rhs_end_pos nterm)))
e_expression:
  /* empty */ { EE_empty }
  | INTEGER { EE_integer $1 }
  | DOUBLE { EE_double $1 }
  ...
  | error { parse_error "e_expression" 1; ERR "" }
But the error still does not include the input string. Does anyone know if there is a function I am missing to get that?
In the context of an error you can extract the location of the failed lexeme as a pair of positions, using the Parsing.symbol_start_pos and Parsing.symbol_end_pos functions. Unfortunately, the Parsing module doesn't provide access to the lexeme as a string, but if the input was stored in a file you can extract it manually, or print the error in compiler style so that a decent IDE will understand it and highlight the location for you.
The module Parser_error is below. It defines a function Parser_error.throw that raises a Parser_error.T exception. The exception carries a diagnostic message and the position of the failed lexeme. Several handy functions are provided to extract this lexeme from a file, or to generate a file-position message. If your input is not stored in a file, you can use the string_of_exn function, which accepts the input as a string together with the payload of the Parser_error.T exception, and extracts the offending substring from it. This is an example of a parser that uses this exception for error reporting.
open Lexing
(** T(message,start,finish) parser failed with a [message] on an
    input specified by [start] and [finish] position. *)
exception T of (string * position * position)
(** [throw msg] raises a [Parser_error.T] exception with the corresponding
    message. Must be called in a semantic action of a production rule. *)
let throw my_unique_msg =
  let check_pos f = try f () with _ -> dummy_pos in
  Printexc.(print_raw_backtrace stderr (get_raw_backtrace ()));
  let sp = check_pos Parsing.symbol_start_pos in
  let ep = check_pos Parsing.symbol_end_pos in
  raise (T (my_unique_msg,sp,ep))
(** [fileposition start finish] creates a string describing the position
    of a lexeme specified by [start] and [finish] file positions. The
    message has the same format as OCaml and GNU compilers, so it is
    recognized by most IDEs, e.g., Emacs. *)
let fileposition err_s err_e =
  Printf.sprintf
    "\nFile \"%s\", line %d, at character %d-%d\n"
    err_s.pos_fname err_s.pos_lnum err_s.pos_cnum err_e.pos_cnum
(** [string_of_exn line exn] given a [line] in a file, extracts the failed
    lexeme from the exception [exn] and creates a string denoting the
    parsing error in a format similar to the one used by the OCaml
    compiler, i.e., with fancy underlining. *)
let string_of_exn line (msg,err_s,err_e) =
  let b = Buffer.create 42 in
  if err_s.pos_fname <> "" then
    Buffer.add_string b (fileposition err_s err_e);
  Buffer.add_string b
    (Printf.sprintf "Parse error: %s\n%s\n" msg line);
  let start = max 0 (err_s.pos_cnum - err_s.pos_bol) in
  for i=1 to start do
    Buffer.add_char b ' '
  done;
  let diff = max 1 (err_e.pos_cnum - err_s.pos_cnum) in
  for i=1 to diff do
    Buffer.add_char b '^'
  done;
  Buffer.contents b
(** [extract_line err] a helper function that extracts a line from
    the file designated by the parsing error exception *)
let extract_line err =
  let line = ref "" in
  try
    let ic = open_in err.pos_fname in
    for i=0 to max 0 (err.pos_lnum - 1) do
      line := input_line ic
    done;
    close_in ic;
    !line
  with exn -> !line
(** [to_string exn] converts an exception to a string *)
let to_string ((msg,err,_) as exn) =
  let line = extract_line err in
  string_of_exn line exn
Here is an example that shows how to use it when there is no file and the input comes from a stream or an interactive (shell-like) source:
let parse_command line =
  try
    let lbuf = Lexing.from_string line in
    `Ok (Parser.statement Lexer.tokens lbuf)
  with
  | Parsing.Parse_error -> `Fail "Parse error"
  | Parser_error.T exn -> `Fail (Parser_error.string_of_exn line exn)

How to run program in OCaml toplevel with input from file?

I know that in order to load a program in OCaml one has to type #use "source_code_file.ml" in the toplevel, where source_code_file.ml is the file we want to use.
My program reads input from stdin. On the command line I have a txt file that is redirected to act as stdin. Can I do this in the toplevel? I would like to do this because in the toplevel I can easily see what types variables have and whether things are initialized with the correct values.
If you're on a Unix-like system you can use Unix.dup2 to do almost any kind of input redirection. Here is a function with_stdin that takes an input file name, a function, and a value. It calls the function with standard input redirected from the named file.
let with_stdin fname f x =
  let oldstdin = Unix.dup Unix.stdin in
  let newstdin = Unix.openfile fname [Unix.O_RDONLY] 0 in
  Unix.dup2 newstdin Unix.stdin;
  Unix.close newstdin;
  let res = f x in
  Unix.dup2 oldstdin Unix.stdin;
  Unix.close oldstdin;
  res
If your function doesn't consume the entire input the leftover input will confuse the toplevel. Here's an example that does consume its entire input:
# let rec linecount c =
    try ignore (read_line ()); linecount (c + 1)
    with End_of_file -> c;;
val linecount : int -> int = <fun>
# with_stdin "/etc/passwd" linecount 0;;
- : int = 86
#
This technique is too simple if you wanted to interleave interactions with the toplevel with calls to your function to consume just part of its input. I suspect that would make things too complicated to be worth the effort. It would be much easier (and perhaps better overall) to rewrite your code to work with an explicitly specified input channel.

building a lexical analyser using ml-lex

I need to create a new instance of a lexer tied to the standard input stream.
However, when I type in
val lexer = makeLexer( fn n => inputLine( stdIn ) );
I get an error that I don't understand:
stdIn:1.5-11.13 Error: operator and operand don't agree [tycon mismatch]
operator domain: int -> string
operand: int -> string option
in expression:
(makeLexer is a function name present in my source code)
inputLine returns a string option, and my guess is a string is expected.
What you want to do is either have makeLexer take a string option, like so:
fun makeLexer NONE = <whatever you want to do when stream is empty>
| makeLexer (SOME s) = <the normal body makeLexer, working on the string s>
or change your line to:
val lexer = makeLexer( fn n => valOf ( inputLine( stdIn ) ) );
valOf takes an option type and unpacks it.
Note that, since inputLine returns NONE when the stream is empty, it's probably a better idea to use the first approach, rather than the second.
An example of how to make an interactive stream is given on page 38 (or 32 in the paper) of the User's Guide to ML-Lex and ML-Yacc.
The example code could be simpler by using inputLine.
So I would use the example given by Sebastian, keeping in mind that inputLine on stdIn might return NONE, at least if the user presses CTRL-D.
val lexer =
  let
    fun input f =
      case TextIO.inputLine f of
        SOME s => s
      | NONE => raise Fail "Implement proper error handling."
  in
    Mlex.makeLexer (fn (n:int) => input TextIO.stdIn)
  end
Also, the calculator example on page 40 (34 in the paper) shows how to use this in a whole program.
In general the user guide contains some nice examples and explanations.

Haskell - Concat a list of strings

I'm trying to create a list of strings using some recursion.
Basically I want to take the part of a string up to a certain point, create a list from that, and then process the rest of the string through recursion.
type DocName = FilePath
type Line = (Int,String)
type Document = [Line]
splitLines :: String -> Document
splitLines [] = []
splitLines str | length str == 0 = []
               | otherwise = zip [0..(length listStr)] listStr
  where
    listStr = [getLine] ++ splitLines getRest
    getLine = (takeWhile (/='\n') str)
    getRest = (dropWhile (=='\n') (dropWhile (/='\n') str))
That's what I've got. But it just concats the strings back together since they are lists of characters themselves. But I want to create a list of strings.
["test","123"] if the input was "test\n123\n"
Thanks
If you try to compile your code, you'll get an error message telling you that in the line
listStr = [getLine] ++ splitLines getRest
splitLines getRest has type Document, but it should have type [String]. This is easy enough to understand: [getLine] is a list of strings (well, a list of one string), so it can only be concatenated with another list of strings, not with a list of int-string tuples.
So to fix this we can use map to replace each int-string-tuple in the Document with only the string to get a list of strings, i.e.:
listStr = [getLine] ++ map snd (splitLines getRest)
After changing the line to the above your code will compile and run just fine.
"But it just concats the strings back together since they are list of characters themselves."
I'm not sure why you think that.
The reason your code did not compile was because of the type of splitLines as I explained above. Once you fix that error, the code behaves exactly as you want it to, returning a list of integer-string-tuples. At no point are strings concatenated.
Well, if you wrote this just to practice recursion, then it is fine once you fix the error mentioned by sepp2k. But in real code, I would prefer:
splitLines str = zip [0..] (lines str)
Or even
splitLines = zip [0..] . lines
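To see both suggestions side by side, here is a small sketch that puts the repaired recursive version next to the lines-based one. The name splitLinesRec and the main function are introduced here only for illustration. One difference worth knowing: the recursive version skips runs of consecutive newlines, while lines keeps the empty lines in between.

```haskell
type Line = (Int, String)
type Document = [Line]

-- Recursive version from the question, with the `map snd` fix applied
-- so that the recursive result is a list of strings again.
splitLinesRec :: String -> Document
splitLinesRec [] = []
splitLinesRec str = zip [0 ..] listStr
  where
    listStr   = firstLine : map snd (splitLinesRec rest)
    firstLine = takeWhile (/= '\n') str
    -- Drop the rest of the current line, then every newline that follows.
    rest      = dropWhile (== '\n') (dropWhile (/= '\n') str)

-- Library-based version: number the lines produced by `lines`.
splitLines :: String -> Document
splitLines = zip [0 ..] . lines

main :: IO ()
main = do
  print (splitLinesRec "test\n123\n")  -- [(0,"test"),(1,"123")]
  print (splitLines "test\n123\n")     -- [(0,"test"),(1,"123")]
  print (splitLinesRec "a\n\nb")       -- [(0,"a"),(1,"b")]
  print (splitLines "a\n\nb")          -- [(0,"a"),(1,""),(2,"b")]
```

If blank lines should be dropped in the lines-based version as well, filtering with filter (not . null) before zipping gives the same result as the recursive version on the last input.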