SML - Iterate through String

I'm trying to find out whether sentences read from a file match some pattern.
So far, I've written code that reads all the sentences from the file line by line and puts them into a list.
val infile = "c:/input.txt" ;
fun readlist (infile : string) =
let val ins = TextIO.openIn infile
fun loop ins = case TextIO.inputLine ins of
SOME line => line :: loop ins
| NONE => []
in loop ins before TextIO.closeIn ins
end;
val pureGraph = readlist(infile);

Try to write a function that evaluates to true if the letter a is in a string. Use explode to get a list of characters, then recurse or fold over that list until you find a or reach the end. Once you have that function, generalize it to any character. Checking every required character this way will probably lead you to O(n^2) runtime complexity.
Another approach is to sort the character list, remove duplicates, zip it with the list of expected characters, and compare each tuple using recursion or a fold. This should run in O(n log n) time because of the sort.
A third approach is to fold over the character list with an array or a hash map, recording each character as you encounter it. At the end, you check whether all characters were found. This approach should run in O(n) time if your hash map has constant-time operations.
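The third approach could be sketched roughly as follows. This is only an illustration, assuming the pattern of interest is "contains every letter a to z"; the name hasAllLetters is hypothetical and does not come from the question.

```sml
(* Sketch only: hasAllLetters is a hypothetical name, not from the question. *)
fun hasAllLetters (s : string) : bool =
  let
    (* One flag per letter a..z, all initially unseen. *)
    val seen = Array.array (26, false)
    (* Record a character if it is an ASCII letter; ignore anything else. *)
    fun mark c =
      let val lc = Char.toLower c
      in
        if #"a" <= lc andalso lc <= #"z"
        then Array.update (seen, ord lc - ord #"a", true)
        else ()
      end
  in
    List.app mark (explode s);
    (* True only if every flag was set. *)
    Array.foldl (fn (found, acc) => found andalso acc) true seen
  end
```

Each character is visited once and each array update is constant-time, so the work grows linearly with the length of the string.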

Divide and conquer your problem:
Write a function isPanagram : string -> bool that determines this for a single line.
One strategy could be: start with the set of all letters. Loop through the string and, for each character, remove it from the set, until you reach the end of the string or the set is empty. If the set is empty, the string is a pangram. This requires that you represent sets in some way, e.g. with a list or a binary search tree.
Consider looping through the string by index, rather than exploding it:
val allLetters = ...
fun remove x ... = ...
fun isEmpty ... = ...
fun isPanagram s =
let val len = size s
fun loop i missingLetters =
isEmpty missingLetters orelse
i < len andalso loop (i+1) (remove (String.sub (s, i)) missingLetters)
in loop 0 allLetters end
Write a function readLines : string -> string list that reads the content of a file and separates the lines into elements of a list:
fun isLinebreak c = c = #"\r" orelse c = #"\n"
fun readLines filename =
let val ins = TextIO.openIn filename
val data = TextIO.inputAll ins
val _ = TextIO.closeIn ins
in String.tokens isLinebreak data end
(Yes, reading the file one line at a time will save memory.)
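One possible way to fill in the elided parts of the isPanagram skeleton is to represent the set of missing letters as a plain list. This is a sketch under that assumption; the definitions of allLetters, remove and isEmpty below are illustrative choices, not the answer's own, and lowercasing each character is an added assumption so that capital letters count too.

```sml
(* Assumed representation: the set of still-missing letters is a char list. *)
val allLetters = explode "abcdefghijklmnopqrstuvwxyz"

(* Drop every occurrence of x from the list. *)
fun remove x xs = List.filter (fn y => y <> x) xs

fun isEmpty xs = null xs

fun isPanagram s =
  let val len = size s
      (* Walk the string by index, stopping early once nothing is missing. *)
      fun loop i missingLetters =
        isEmpty missingLetters orelse
        i < len andalso
        loop (i + 1) (remove (Char.toLower (String.sub (s, i))) missingLetters)
  in loop 0 allLetters end
```

With readLines from above, something like List.filter isPanagram (readLines filename) would then select the matching lines.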

The SML/NJ library has a number of data structures which can be used for things like sets and hash tables. They are not exactly well-documented, but this explains a bit how to use them. Using their set library, you can write something like this:
structure CharSet = RedBlackSetFn(struct
type ord_key = char
val compare = Char.compare
end)
val alphabet = CharSet.fromList (explode "ABCDEFGHIJKLMNOPQRSTUVWXYZ");
fun isPanagram s =
let val chars = CharSet.fromList (map Char.toUpper (explode s))
val letters = CharSet.intersection (chars,alphabet)
in CharSet.numItems letters = 26
end;
used like this:
- isPanagram "We promptly judged antique ivory buckles for the next prize.";
val it = true : bool
- isPanagram "We promptly judged antique plastic buckles for the next prize.";
val it = false : bool

Related

SMLNJ function that returns as a pair a string at the beginning of a list

I'm really confused, as I am new to SML and I am having trouble with the syntax for the function I want to create.
The instructions are as follows:
numberPrefix: char list → string * char list
Write a function named numberPrefix that returns (as a pair) a string representing the digit characters at the
beginning of the input list and the remaining characters after this prefix. You may use the Char.isDigit and
String.implode functions in your implementation.
For example,
numberPrefix [#"a", #"2", #"c", #" ", #"a"];
val it = ("", [#"a", #"2", #"c", #" ", #"a"]) : string * char list
numberPrefix [#"2", #"3", #" ", #"a"];
val it = ("23", [#" ", #"a"]) : string * char list
Here is my code so far...
fun numberPrefix(c:char list):string*char list =
case c of
[] => []
|(first::rest) => if isDigit first
then first::numberPrefix(rest)
else
;
I guess what I am trying to do is append first to a separate list if it is indeed a digit. Once I reach a non-digit member of the char list, I would like to return that list using String.implode, but I am banging my head on the idea of passing in a helper function or even just using a "let" expression. How can I essentially create a separate list while also keeping track of where I am in the original list, so that I can return the result in the proper format?
First of all, the function should produce a pair, not a list.
The base case should be ("", []), not [], and you can't pass the recursive result around "untouched".
(You can pretty much tell this from the types alone. Pay attention to types; they want to help you.)
If you bind the result of recursing in a let, you can access its parts separately and rearrange them.
A directly recursive take might look like this:
fun numberPrefix [] = ("", [])
| numberPrefix (cs as (x::xs)) =
if Char.isDigit x
then let val (number, rest) = numberPrefix xs
in
((str x) ^ number, rest)
end
else ("", cs);
However, splitting a list in two based on a predicate – let's call it "splitOn", with the type ('a -> bool) -> 'a list -> 'a list * 'a list – is a reasonably useful operation, and if you had that function you would only need something like this:
fun numberPrefix xs = let val (nums, notnums) = splitOn Char.isDigit xs
in
(String.implode nums, notnums)
end;
(Splitting left as an exercise. I suspect that you have already implemented this splitting function, or its close relatives "takeWhile" and "dropWhile".)
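For reference, one way the splitting function might look is sketched below. The name splitOn and its type come from the answer; the body is an assumption, written so that it stops at the first element that fails the predicate.

```sml
(* Sketch: split a list into its longest prefix satisfying p and the rest. *)
fun splitOn p [] = ([], [])
  | splitOn p (cs as (x :: xs)) =
      if p x
      then let val (yes, rest) = splitOn p xs
           in (x :: yes, rest) end
      else ([], cs)

(* numberPrefix as given in the answer, now with splitOn defined. *)
fun numberPrefix xs =
  let val (nums, notnums) = splitOn Char.isDigit xs
  in (String.implode nums, notnums) end
```

The shape mirrors the directly recursive numberPrefix: only the string concatenation has been replaced by consing onto a list.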

Printing randomly from a list OCAML

How do I print each line in a text file only once but in a random order?
I have a text file that contains six individual lines, and I am trying to print them to the screen in random order.
Here is the code I have so far:
open Scanf
open Printf
let id x = x
let const x = fun _ -> x
let read_line file = fscanf file "%s@\n" id
let is_eof file = try fscanf file "%0c" (const false) with End_of_file -> true
let _ =
let file = open_in "text.txt" in
while not (is_eof file) do
let s = read_line file in
printf "%s\n" s
done;
close_in file
I could append each element "s" to a list. Printing the elements of a list can be as simple as the following; however, I am not sure how to print the elements of the list in random order.
let rec print_list = function
[] -> ()
| e::l -> print_int e ; print_string " " ; print_list l
Sort your list with a random comparator, for example using the following function. (Note that a random comparator does not guarantee a uniformly distributed shuffle.)
let mix =
let random _ _ =
if Random.bool() then 1 else -1 in
List.sort random
Edit 1 (15.11.20)
List.sort implements merge sort, which runs in O(n log n) time. The number of steps the algorithm takes also does not depend on the results of the comparisons, so our nondeterministic random function doesn't affect the running time of List.sort.
If the input is a list and we can't use mutable data structures, I think it is impossible to implement a solution with a better bound than O(n log n), because the list is immutable and we need random access to its items.
Let's define a function that retrieves one element, identified by its position in a list, and returns a tuple (this_element, the_list_without_this_element).
Example: pick [0;2;4;6;8] 3
returns (6, [0;2;4;8]).
Then, iterating on the resulting list (the right-hand side of the tuple above), you pick a random element from that list until the list is empty.

SML program to delete char from string

I am a newbie to SML, trying to write a recursive program to delete chars from a string:
remCharR: char * string -> string
So far I have written this non-recursive program. I need help writing a recursive one.
fun stripchars (string, chars) =
  let fun aux c =
        if String.isSubstring (str c) chars
        then ""
        else str c
  in
    String.translate aux string
  end;
You have already found a very idiomatic way to do this. Explicit recursion is not a goal in itself, except perhaps in a learning environment. That is, explicit recursion is, compared to your current solution, encumbered with a description of the mechanics of how you achieve the result, but not what the result is.
Here is one way you can use explicit recursion by converting to a list:
fun remCharR (c, s) =
let fun rem [] = []
| rem (c'::cs) =
if c = c'
then rem cs
else c'::rem cs
in implode (rem (explode s)) end
The conversion to list (using explode) is inefficient, since you can iterate the elements of a string without creating a list of the same elements. Generating a list of non-removed chars is not necessarily a bad choice, though, since with immutable strings, you don't know exactly how long your end-result is going to be without first having traversed the string. The String.translate function produces a list of strings which it then concatenates. You could do something similar.
So if you replace the initial conversion to list with a string traversal (fold),
fun fold_string f e0 s =
let val max = String.size s
fun aux i e =
if i < max
then let val c = String.sub (s, i)
in aux (i+1) (f (c, e))
end
else e
in aux 0 e0 end
you could then create a string-based filter function (much like the String.translate function you already found, but less general):
fun string_filter p s =
implode (fold_string (fn (c, res) => if p c then c::res else res) [] s)
fun remCharR (c, s) =
string_filter (fn c' => c <> c') s
Except, you'll notice, it accidentally reverses the string because it folds from the left; you can fold from the right (efficient, but different semantics) or reverse the list (inefficient). I'll leave that as an exercise for you to choose between and improve.
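As one possible resolution, here is a sketch that folds from the right, so the accumulated list comes out in the original order without a separate reversal. The helper name fold_string_right is hypothetical.

```sml
(* Sketch: fold over the characters of s from the last index down to 0. *)
fun fold_string_right f e0 s =
  let fun aux i e =
        if i < 0
        then e
        else aux (i - 1) (f (String.sub (s, i), e))
  in aux (String.size s - 1) e0 end

(* Consing while walking right-to-left preserves the original order. *)
fun string_filter p s =
  implode (fold_string_right (fn (c, res) => if p c then c :: res else res) [] s)

fun remCharR (c, s) =
  string_filter (fn c' => c <> c') s
```

For a pure predicate like this one, the direction of traversal does not change the result, only the order in which characters are examined.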
As you can see, in avoiding String.translate I've built other generic helper functions so that the remCharR function does not contain explicit recursion, but rather depends on more readable high-level functions.
Update: String.translate actually does some pretty smart things wrt. memory use.
Here is Moscow ML's version of String.translate:
fun translate f s =
Strbase.translate f (s, 0, size s);
with Strbase.translate looking like:
fun translate f (s,i,n) =
let val stop = i+n
fun h j res = if j>=stop then res
else h (j+1) (f(sub_ s j) :: res)
in revconcat(h i []) end;
and with the helper function revconcat:
fun revconcat strs =
let fun acc [] len = len
| acc (v1::vr) len = acc vr (size v1 + len)
val len = acc strs 0
val newstr = if len > maxlen then raise Size else mkstring_ len
fun copyall to [] = () (* Now: to = 0. *)
| copyall to (v1::vr) =
let val len1 = size v1
val to = to - len1
in blit_ v1 0 newstr to len1; copyall to vr end
in copyall len strs; newstr end;
So it first calculates the total length of the final string by summing the length of each sub-string generated by String.translate, and then it uses compiler-internal, mutable functions (mkstring_, blit_) to copy the translated strings into the final result string.
You can achieve a similar optimization when you know that each character in the input string will result in 0 or 1 characters in the output string. The String.translate function can't, since the result of a translate can be multiple characters. So an alternative implementation uses CharArray. For example:
Find the number of elements in the new string,
fun countP p s =
fold_string (fn (c, total) => if p c
then total + 1
else total) 0 s
Construct a temporary, mutable CharArray, update it and convert it to string:
fun string_filter p s =
let val newSize = countP p s
val charArr = CharArray.array (newSize, #"x")
fun update (c, (newPos, oldPos)) =
if p c
then ( CharArray.update (charArr, newPos, c) ; (newPos+1, oldPos+1) )
else (newPos, oldPos+1)
in fold_string update (0,0) s
; CharArray.vector charArr
end
fun remCharR (c, s) =
string_filter (fn c' => c <> c') s
You'll notice that remCharR is the same, only the implementation of string_filter varied, thanks to some degree of abstraction. This implementation uses recursion via fold_string, but is otherwise comparable to a for loop that updates the index of an array. So while it is recursive, it's also not very abstract.
Considering that you get optimizations comparable to these using String.translate without the low-level complexity of mutable arrays, I don't think this is worthwhile unless you start to experience performance problems.

Adding items to list SML

I'm very new to SML and I'm trying to add some items to a list
fun foo(inFile : string, outFile : string) = let
val file = TextIO.openIn inFile
val outStream = TextIO.openOut outFile
val contents = TextIO.inputAll file
val lines = String.tokens (fn c => c = #"\n") contents
val lines' = List.map splitFirstSpace lines
fun helper1(lis : string list) =
case lis of
[] => ( TextIO.closeIn file; TextIO.closeOut outStream)
| c::lis => ( TextIO.output(outStream, c);
helper1(lis))
fun helper(lis : (string * string) list, stack : string list) =
case lis of
[] => stack
| c::lis => ( act(#1 c, #2 c)::stack;
helper(lis, stack))
val x = helper(lines', [])
in
helper1(x)
end;
I'm getting a blank output file whenever I run the code, and I'm having trouble figuring out why. I do know that the helper function is getting the proper values from the "act" function, because I tested it by using print(action(...)).
Thanks
The problem is with this part:
( act(#1 c, #2 c)::stack; helper(lis, stack) )
This is creating a new list and then immediately throwing it away before performing the recursive call. What you want to do instead is
helper(lis, act(#1 c, #2 c)::stack)
Additional hint: both your helper functions can be replaced by simple uses of List.app and List.foldl.
Edit: Further hint: In fact, you can write that as just
helper(lis, act(c)::stack)
because a function with "two arguments" is simply a function taking a pair.
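The hint about List.foldl and List.app might be spelled out along these lines. This is a sketch: the act function below is a hypothetical stand-in (the question does not show the real one), and writeAll is an assumed name for the replacement of helper1, with opening and closing of the streams left to the caller.

```sml
(* Hypothetical stand-in for the question's act function. *)
fun act (cmd : string, arg : string) : string = cmd ^ ":" ^ arg ^ "\n"

(* Replacement for helper: push act's result for each pair onto the stack. *)
fun helper (lis : (string * string) list, stack : string list) =
  List.foldl (fn (c, acc) => act c :: acc) stack lis

(* Replacement for helper1: write each string to the given output stream. *)
fun writeAll (outStream : TextIO.outstream, lis : string list) =
  List.app (fn s => TextIO.output (outStream, s)) lis
```

As with the corrected recursive helper, the results end up on the stack in reverse order of the input lines.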

Read a large file into string lines OCaml

I am basically trying to read a large file (around 10G) into a list of lines. The file contains a sequence of integers, something like this:
0x123456
0x123123
0x123123
.....
I have been using the method below to read files throughout my codebase, but it turns out to be quite slow (~12 minutes) in this scenario:
let lines_from_file (filename : string) : string list =
let lines = ref [] in
let chan = open_in filename in
try
while true; do
lines := input_line chan :: !lines
done; []
with End_of_file ->
close_in chan;
List.rev !lines;;
I guess I need to read the file into memory and then split it into lines (I am using a 128G server, so memory space should be fine). But after searching the documentation, I still don't understand whether OCaml provides such a facility.
So here are my questions:
Given my situation, how can I read a file into a string list quickly?
How about using a stream? I would need to adjust the related application code, though, which would take some time.
First of all, you should consider whether you really need to have all the information in memory at once. Maybe it is better to process the file line by line?
If you really want to have it all in memory at once, then you can use Bigarray's map_file function (Unix.map_file in recent OCaml versions) to map the file as an array of characters, and then process it from there.
Also, as I see it, this file contains numbers. Maybe it is better to allocate an array (or, even better, a bigarray), then process each line in order and store the integers in the (big)array.
I often use the following two functions to read the lines of a file. Note that the function lines_from_files is tail-recursive.
let read_line i = try Some (input_line i) with End_of_file -> None
let lines_from_files filename =
let rec lines_from_files_aux i acc = match (read_line i) with
| None -> List.rev acc
| Some s -> lines_from_files_aux i (s :: acc) in
lines_from_files_aux (open_in filename) []
let () =
lines_from_files "foo"
|> List.iter (Printf.printf "lines = %s\n")
This should work:
let rec ints_from_file fdesc =
try
let l = input_line fdesc in
let l' = int_of_string l in
l' :: ints_from_file fdesc
with | _ -> []
This solution converts the strings to integers as they're read in, which should be a bit more memory efficient (and I assume this was going to be done to them eventually).
Also, because the function is recursive, the file must be opened outside of the function call.