Copying files in SML

I am trying to learn input/output in SML. In an effort to copy the strings of ls that are the same as s1 into the file l2, I did the following. I am getting some errors I cannot really understand. Can someone help me out?
fun test(l2:string,ls:string list,s1:string) = if (String.isSubstring(s1 hd(ls))) then
(TextIO.openOut l2; TextIO.inputLine hd(ls))::test(l2,tl(ls),s1) else
test(l2,tl(ls),s1);

Here are some general hints:
Name your variables something meaningful, like filename, lines and line.
The function TextIO.inputLine takes as argument a value of type instream.
When you write TextIO.inputLine hd(ls), what this is actually interpreted as is
(TextIO.inputLine hd) ls, which means "treat hd as if it were an instream and
try and read a line from it, take that line and treat it as if it were a function,
and apply it on ls", which is of course complete nonsense.
The proper parenthesising in this case would be TextIO.inputLine (hd ls), which
still does not make sense, since we decided that ls is a string list, and so hd ls
will be a string and not an instream.
Here is something that resembles what you want to do, but opposite:
(* Open a file, read each line from file and return those that contain mySubstr *)
fun test (filename, mySubstr) =
    let val instr = TextIO.openIn filename
        fun loop () = case TextIO.inputLine instr of
                          SOME line => if String.isSubstring mySubstr line
                                       then line :: loop () else loop ()
                        | NONE => []
        val lines = loop ()
        val _ = TextIO.closeIn instr
    in lines end
You need to use TextIO.openOut and TextIO.output instead; TextIO.inputLine is the function that reads from files.

Related

Scanning a file as a string

I have built a function which takes a string as input and outputs a string.
Let's call it f.
I would like to read the contents of a file input.txt as a string, apply my function to this string and write the result to another file output.txt.
Another question: if the file is too big, reading it all at once may be impossible. So I also have a function f_line, and I would like to read input.txt one line at a time, apply this function to each line, and write each result to the file output.txt.
How can I do that?
You basically want to map a file with your function to another file, much like you map lists, e.g.,
# List.map String.uppercase_ascii ["hello"; "world"];;
- : string list = ["HELLO"; "WORLD"]
In OCaml, files are read and written via an abstraction called a channel. Channels have directions, i.e., input channels are distinguished from output channels. To open an input channel use the open_in function; to close it, use close_in. The corresponding functions for output channels have the _out suffix (open_out, close_out).
To map two channels line by line, we need to read a line from one channel, apply our transformation f to each line and write to the output channel, until the first channel raises the End_of_file exception that indicates that there is no more data, e.g.,
let rec map_channels input output f =
  match f (input_line input) with
  | exception End_of_file -> flush output
  | r ->
    output_string output r;
    output_char output '\n';
    map_channels input output f
Now we can use this function to write a function that takes filenames, instead of channels, e.g.,
let map_files input output f =
  if input = output
  then invalid_arg "the input and output files must differ";
  let input = open_in input in
  let output = open_out output in
  map_channels input output f;
  close_in input;
  close_out output
Notice that we check that the input and output files are different, to prevent mapping a file onto itself, which might end up in an infinite loop and may corrupt the file.
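One thing the version above leaves open is what happens when f raises: both channels stay open. The following is only a sketch of one way to handle that, assuming OCaml 4.08 or later for Fun.protect. The name map_files_safe is made up for this example, and map_channels is restated here with f applied inside the branch, so that only the End_of_file raised by input_line ends the loop.

```ocaml
(* Copy [ic] to [oc] line by line, applying [f] to each line.
   Only the End_of_file raised by input_line stops the loop. *)
let rec map_channels ic oc f =
  match input_line ic with
  | exception End_of_file -> flush oc
  | line ->
    output_string oc (f line);
    output_char oc '\n';
    map_channels ic oc f

(* Hypothetical variant of map_files: the channels are closed
   even when [f] raises an exception. *)
let map_files_safe in_name out_name f =
  if in_name = out_name then
    invalid_arg "the input and output files must differ";
  let ic = open_in in_name in
  Fun.protect ~finally:(fun () -> close_in_noerr ic) (fun () ->
    let oc = open_out out_name in
    Fun.protect ~finally:(fun () -> close_out_noerr oc) (fun () ->
      map_channels ic oc f))
```

For example, map_files_safe "input.txt" "output.txt" String.uppercase_ascii would write an upper-cased copy of input.txt.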
I've finally found an easy solution with the following code:
let transform_files_by_line
    (f_line : string -> string) (in_filename : string)
    (out_filename : string) =
  let input_chan = open_in in_filename
  and output_chan = open_out out_filename in
  let rec transform_rec () =
    (* input_line drops the trailing newline, so f_line has to
       add one back if the output should keep its line breaks *)
    let str = input_line input_chan in
    output_string output_chan (f_line str);
    transform_rec ()
  in
  try transform_rec () with
  | End_of_file ->
    close_in input_chan;
    close_out output_chan

Read a file line per line and store every line read in a single list

I'm a student and I've been given an exercise I've been struggling with for about a month or so.
I'm trying to write a function in OCaml. This function must read a text file which has one word per line, and it must store all the words in a list.
But the problem is that this program must be a recursive one (which means no loops, no "while").
All I've been able to do so far is to create a function which reads the text file and prints it (pretty much like the Bash command "cat"):
let dico filename =
  let f = open_in filename in
  let rec dico_rec () =
    try
      print_string (input_line f);
      print_newline ();
      dico_rec ();
    with End_of_file -> close_in f
  in dico_rec () ;;
I just don't know how to do it. OCaml is hardly my favourite language.
Here's an alternate definition of build_list that is tail recursive. You can use it instead of @MitchellGouzenko's definition if your inputs can have many lines.
(* ic is the already opened input channel; call as build_list [] *)
let rec build_list l =
  match input_line ic with
  | line -> build_list (line :: l)
  | exception End_of_file -> close_in ic; List.rev l
open Printf
let file = "example.dat"
let () =
  let ic = open_in file in
  let rec build_list infile =
    try
      let line = input_line infile in
      line :: build_list infile
    with End_of_file ->
      close_in infile;
      []
  in
  let rec print_list = function
    | [] -> ()
    | e :: l -> print_string e; print_string " "; print_list l
  in
  print_list (build_list ic)
Edit: The algorithm I previously proposed was unnecessarily complicated. Try to understand this one instead.
To understand this recursion, we assume that build_list works correctly. That is, assume build_list correctly takes an open file as an argument and returns a list of lines in the file.
Now, let's look at the function's body. It reads a line from the file and calls build_list again. If there are N lines in the file, calling build_list again should return a list of the remaining N-1 lines in the file (since we just read the first line ourselves). We prepend the line we just read to the list returned from build_list, and return the resulting list, which has all N lines.
The recursion continues until it hits the base case. The base case is when there's an End_of_file. In this case we return an empty list.

Read file line by line and store in a list

I have an input.txt file with a few lines of text. I am trying to store those lines in a list l. I think I am doing it correctly, but the list l is not getting updated. Please help.
let l = []
let () =
  let ic = open_in "input.txt" in
  try
    while true do
      let line = input_line ic in
      let rec append (a, b) = match a with
        | [] -> [b]
        | c :: cs -> c :: append (cs, b)
      in
      append (l, line)
      (* print_endline line *)
    done
  with End_of_file ->
    close_in ic;;
Apart from Warning 10, I am not getting any error.
let l = []
Variables in OCaml are immutable, so no matter what code you write after this line, l will always be equal to [].
It looks like you are stuck in an imperative way of thinking, which makes OCaml a good thing to start with!
Typical functional, recursive code would read a file like this:
Read a line, then put it in front of the result of "read the remaining lines". At End_of_file you finish the list with [].
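That recipe can be written down almost word for word. This is a minimal sketch rather than the answerer's own code; the names read_all and lines_of_file are made up for the example, and the match ... with exception form assumes OCaml 4.02 or later.

```ocaml
(* Read a line, then cons it onto the result of reading the rest.
   At End_of_file the list is finished with []. *)
let rec read_all ic =
  match input_line ic with
  | line -> line :: read_all ic
  | exception End_of_file -> []

(* Open the file, collect its lines, and close the channel. *)
let lines_of_file filename =
  let ic = open_in filename in
  let lines = read_all ic in
  close_in ic;
  lines
```

Nothing is mutated here: each call returns a new list, and a caller binds the result with something like let l = lines_of_file "input.txt". Since read_all is not tail recursive, very long files may call for an accumulator instead.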

Read a large file into string lines OCaml

I am basically trying to read a large file (around 10G) into a list of lines. The file contains a sequence of integers, something like this:
0x123456
0x123123
0x123123
.....
I use the method below to read files by default in my codebase, but it turns out to be quite slow (~12 minutes) in this scenario:
let lines_from_file (filename : string) : string list =
  let lines = ref [] in
  let chan = open_in filename in
  try
    while true do
      lines := input_line chan :: !lines
    done; []
  with End_of_file ->
    close_in chan;
    List.rev !lines;;
I guess I need to read the file into memory, and then split them into lines (I am using a 128G server, so it should be fine for the memory space). But I still didn't understand whether OCaml provides such facility after searching the documents here.
So here is my question:
Given my situation, how to read files into string list in a fast way?
How about using a stream? I would need to adjust the related application code, though, and that could take some time.
First of all you should consider whether you really need to have all the information at once in your memory. Maybe it is better to process file line-by-line?
If you really want to have it all at once in memory, then you can use Bigarray's map_file function to map a file as an array of characters. And then do something with it.
Also, as I see, this file contains numbers. Maybe it is better to allocate an array (or, even better, a bigarray), then process each line in order and store the integers in the (big)array.
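As a rough sketch of that last suggestion, the file can be folded over line by line and the integers stored directly, without ever holding a list of strings. The names fold_lines and ints_of_file are invented for this example. It assumes one integer per line in a form int_of_string accepts (which includes the 0x prefix), makes two passes over the file (one to count the lines, one to fill the array), uses a plain array instead of a bigarray to stay within the standard library, and needs OCaml 4.08 or later for Fun.protect.

```ocaml
(* Fold [f] over the lines of [filename] without keeping them. *)
let fold_lines f init filename =
  let ic = open_in filename in
  let rec loop acc =
    match input_line ic with
    | line -> loop (f acc line)
    | exception End_of_file -> acc
  in
  Fun.protect ~finally:(fun () -> close_in ic) (fun () -> loop init)

(* First pass counts the lines, second pass parses them into an array. *)
let ints_of_file filename =
  let n = fold_lines (fun count _ -> count + 1) 0 filename in
  let a = Array.make n 0 in
  let fill i line =
    a.(i) <- int_of_string (String.trim line);
    i + 1
  in
  let _ = fold_lines fill 0 filename in
  a
```

Whether two sequential passes beat one pass into a growing buffer depends on the disk and the file, so this is worth measuring rather than assuming.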
I often use the following two functions to read the lines of a file. Note that the function lines_from_files is tail-recursive.
let read_line i = try Some (input_line i) with End_of_file -> None
let lines_from_files filename =
  let ic = open_in filename in
  let rec lines_from_files_aux acc = match read_line ic with
    | None -> close_in ic; List.rev acc  (* close the channel at end of file *)
    | Some s -> lines_from_files_aux (s :: acc)
  in
  lines_from_files_aux []
let () =
  lines_from_files "foo"
  |> List.iter (Printf.printf "lines = %s\n")
This should work:
let rec ints_from_file fdesc =
  try
    let l = input_line fdesc in
    let l' = int_of_string l in
    l' :: ints_from_file fdesc
  with | _ -> []
This solution converts the strings to integers as they're read in (which should be a bit more memory efficient, and I assume this was going to be done to them eventually).
Also, because it is recursive, the file must be opened outside of the function call.

OCaml error: wrong type of expression in constructor

I have a function save that reads from standard input, and which is used on its own like this:
./try < input.txt (* save function is in try file *)
input.txt
2
3
10 29 23
22 14 9
Now I have put the function into another file called path.ml, which is part of my interpreter. I have a problem defining the type of the Save constructor, because the save function works on an in_channel, but when I write
type term = Save of in_channel
ocamlc complains about the parameter in the command function.
How can I fix this error? This is the reason why, in my last question posted on Stack Overflow, I asked for a way to express a variable that accepts any type. I understand the answers, but they don't actually help much in getting the code to run.
This is my code:
(* Data types *)
open Printf
type term =
  | Print_line_in_file of int * string
  | Print of string
  | Save of in_channel (* error here *)
;;
let input_line_opt ic =
  try Some (input_line ic)
  with End_of_file -> None
let nth_line n filename =
  let ic = open_in filename in
  let rec aux i =
    match input_line_opt ic with
    | Some line ->
      if i = n then begin
        close_in ic;
        (line)
      end else aux (succ i)
    | None ->
      close_in ic;
      failwith "end of file reached"
  in
  aux 1
(* get all lines *)
let k = ref 1
let first = ref ""
let second = ref ""
let sequence = ref []
let append_item lst a = lst @ [a]
let save () =
  try
    while true do
      let line = input_line stdin in
      if k = ref 1
      then
        begin
          first := line;
          incr k;
        end else
      if k = ref 2
      then
        begin
          second := line;
          incr k;
        end else
        begin
          sequence := append_item !sequence line;
          incr k;
        end
    done;
    None
  with
    End_of_file -> None;;
let rec command term = match term with
  | Print (n) -> print_endline n
  | Print_line_in_file (n, f) -> print_endline (nth_line n f)
  | Save () -> save ()
;;
EDIT
Error in code:
Save of in_channel:
Error: This pattern matches values of type unit
but a pattern was expected which matches values of type in_channel
Save of unit:
Error: This expression has type 'a option
but an expression was expected of type unit
There are many errors in this code, so it's hard to know where to start.
One problem is this: your save function has type unit -> 'a option. So it's not the same type as the other branches of your final match. The fix is straightforward: save should return (), not None. In OCaml these are completely different things.
The immediate problem seems to be that you have Save () in your match, but have declared Save as taking an input channel. Your current code doesn't have any way to pass the input channel to the save function, but if it did, you would want something more like this in your match:
| Save ch -> save ch
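Put together, the two fixes might look like the cut-down sketch below. It is not the asker's full program: Print_line_in_file is left out for brevity, the saved reference is a hypothetical stand-in for the first, second and sequence references, and save is rewritten to take the channel as an argument and return unit so that every branch of the match has the same type.

```ocaml
type term =
  | Print of string
  | Save of in_channel

(* Hypothetical store for the lines read by [save]. *)
let saved : string list ref = ref []

(* Read every remaining line of [ch] into [saved].
   The result is unit, like print_endline in the other branch. *)
let save ch =
  let rec loop acc =
    match input_line ch with
    | line -> loop (line :: acc)
    | exception End_of_file -> List.rev acc
  in
  saved := loop []

let command term =
  match term with
  | Print s -> print_endline s
  | Save ch -> save ch
```

With this shape, command (Save stdin) reads standard input, and the same constructor works for a channel obtained from open_in.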
Errors like this suggest (to me) that you're not so familiar with OCaml's type system. It would probably save you a lot of trouble if you went through a tutorial of some kind before writing much more code. You can find tutorials at http://ocaml.org.