Suppose I have a very big list in SML. The REPL shows only a few of the entries and then starts showing a # character.
Could someone tell me how I can view the whole list?
Assuming this is SML/NJ you could use printLength, printDepth and friends from the Control.Print structure.
The following is a snippet from the documentation of the Control.Print structure:
printDepth
The depth of nesting of recursive data structure at which ellipsis begins.
printLength
The length of lists at which ellipsis begins.
stringDepth
The length of strings at which ellipsis begins.
Thus, for example, we can change how many elements of a list we want to be shown in the REPL by changing the printLength reference:
- Control.Print.printLength;
val it = ref 12 : int ref
- [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19];
val it = [1,2,3,4,5,6,7,8,9,10,11,12,...] : int list
- Control.Print.printLength := 18;
val it = () : unit
- [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19];
val it = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,...] : int list
- Control.Print.printLength := 100;
val it = () : unit
- [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19];
val it = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19] : int list
Note that for strings and data structures, the ellipsis is written as a hash '#' instead. This is seen, for example, with the string below. Note the '#' at the end of the val it = ... line, which appears because the default string depth is 70 characters:
- "Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.";
val it = "Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusm#" : string
- Control.Print.stringDepth;
val it = ref 70 : int ref
And lastly, an example of how this is seen in nested data structures:
- Control.Print.printDepth;
val it = ref 5 : int ref
- SOME (SOME (SOME (SOME (SOME (SOME (SOME 42))))));
val it = SOME (SOME (SOME (SOME (SOME #))))
: int option option option option option option option
- Control.Print.printDepth := 10;
val it = () : unit
- SOME (SOME (SOME (SOME (SOME (SOME (SOME 42))))));
val it = SOME (SOME (SOME (SOME (SOME (SOME (SOME 42))))))
: int option option option option option option option
The two suggested solutions will of course print the entire list no matter how long it is.
You could do something like this:
(* Prints a list in its entirety.
* ls is a list of type 'a list
* f is a function that converts an 'a to string *)
fun printList f ls =
let
(* Prints the contents of the list neatly using f *)
fun printContents [] = ()
| printContents [x] = print (f x)
| printContents (x::xs) = (print (f x ^ ", "); printContents xs)
val _ = print "[";
val _ = printContents ls;
val _ = print "]\n"
in
()
end;
An example of its use:
val ls = List.tabulate (1000, fn n => n);
printList Int.toString ls;
If you want to do it automatically, I doubt you can. If I recall correctly, the pretty-printers are implementation specific, and most likely do not allow pretty-printers to be installed for polymorphic types.
Shorter version of Sabastian P.'s code:
fun printList f ls =
print ("[" ^ String.concatWith ", " (map f ls) ^ "]\n");
I am currently trying to do a spellcheck, but am having some trouble dealing with certain cases.
For example, given the string: { else"--but, }, my spellcheck automatically reads this as an invalid word. However, since else and but are both correctly spelled, I don't want to mark this as incorrect.
Is there any way I can do this with regular expressions?
A more common case I am having trouble with is things like "waistcoat-pocket".
Rather than a regular expression, you should use Unicode word segmentation. With the uuseg and uucp libraries, you can extract words and filter out word boundaries with
let is_alphaword =
let alphachar = function
| `Malformed _ -> false
| `Uchar x ->
match Uucp.Break.word x with
| `LE | `Extend -> true
| _ -> false
in
Uutf.String.fold_utf_8 (fun acc _ x -> acc && alphachar x) true
(* Note that we are supposing strings to be utf-8 encoded *)
let words s =
let cons l x = if is_alphaword x then x :: l else l in
List.rev (Uuseg_string.fold_utf_8 `Word cons [] s)
This function splits the string word by word:
words "else\"--but";;
- : string list = ["else"; "but"]
words "waistcoat-pocket";;
- : string list = ["waistcoat"; "pocket"]
and works correctly in more general contexts
words "आ तवेता नि षीदतेन्द्रमभि पर गायत";;
- : string list =
["आ"; "तवेता"; "नि"; "षीदतेन्द्रमभि";
"पर"; "गायत"]
or
words "Étoile(de Barnard)";;
- : string list = ["Étoile"; "de"; "Barnard"]
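If adding the uuseg and uucp dependencies is not an option and the input is known to be plain ASCII, a much rougher approximation can be sketched with the standard library alone. This is not Unicode word segmentation: it simply treats every maximal run of ASCII letters as a word, so it would mangle inputs such as "Étoile". The names is_ascii_letter and ascii_words are made up for this illustration.

```ocaml
(* Assumption: only the ASCII letters a-z and A-Z count as word characters. *)
let is_ascii_letter c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')

(* Hypothetical helper: returns the maximal runs of ASCII letters in s,
   in the order in which they appear. *)
let ascii_words (s : string) : string list =
  let buf = Buffer.create 16 in
  let acc = ref [] in
  (* Push the word collected so far, if any, onto the accumulator. *)
  let flush () =
    if Buffer.length buf > 0 then begin
      acc := Buffer.contents buf :: !acc;
      Buffer.clear buf
    end
  in
  String.iter
    (fun c -> if is_ascii_letter c then Buffer.add_char buf c else flush ())
    s;
  flush ();
  List.rev !acc
```

With this sketch, ascii_words "else\"--but" gives ["else"; "but"] and ascii_words "waistcoat-pocket" gives ["waistcoat"; "pocket"], matching the two cases from the question.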
Say I have lists like [1;2;3;4;5;6;9] and [1;2;3;9] and I want to write a pattern which captures lists which begin with 1 and end with 9, and also capture the values of the middle of the list. Is this possible to do with OCaml's pattern matching?
I've tried to write something like
match l with
| 1::middle::9
or
match l with
| 1::middle::9::[]
but I'm not sure that these are doing what I want, and are probably instead only matching 3 element lists. Is there an approach I can take to match things like this? Should I be using nested pattern matches?
There's no pattern that matches the end of a list, so there's no pattern like what you want. You can do two matches:
match l with
| 1 :: _ -> (
match List.rev l with
| 9 :: _ -> true
| _ -> false
)
| _ -> false
Finding the end of a list is a linear time operation. If your lists can be long, you might want to use a different data structure.
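As one possible reading of "a different data structure", here is a minimal sketch that uses an array, where both the first and the last element can be read in constant time. The function name middle_of_array is hypothetical, and the sketch assumes the data can be kept in an int array in the first place.

```ocaml
(* Hypothetical helper: if the array starts with 1 and ends with 9,
   return the elements strictly between them; otherwise return None.
   Requiring at least two elements means a lone [|1|] or [|9|] is rejected. *)
let middle_of_array (a : int array) : int array option =
  let n = Array.length a in
  if n >= 2 && a.(0) = 1 && a.(n - 1) = 9
  then Some (Array.sub a 1 (n - 2))
  else None
```

The check itself is constant time; only copying out the middle with Array.sub is linear in the size of the result.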
If you're just making checks on the first and last elements of a list, you may want to use conditional statements instead of pattern matching:
let is_valid l =
  let open List in
  if l = [] then false (* hd would raise Failure on an empty list *)
  else
    let first = hd l in (* Get the first element of the list *)
    let last = rev l |> hd in (* Get the last element of the list *)
    first = 1 && last = 9
is_valid [1;2;3;4;5;6;9] (* bool = true *)
However, if you are trying to extract the middle of the list, it may be worthwhile to use pattern matching. Because pattern matching can't match the end of a list, as Jeffery pointed out, we can do something similar to what he suggested and match against the reversed list:
let is_valid l =
let open List in
match l with
| 1 :: mid -> (* `mid` holds list without the `1` *)
(match rev mid with (* `rev_mid` holds list without the 9 but reversed *)
| 9 :: rev_mid -> Some (rev rev_mid) (* reverse to get correct order *)
| _ -> None)
| _ -> None
is_valid [1;2;3;4;5;6;9] (* int list option = Some [2; 3; 4; 5; 6] *)
Then with this function, you can use it with simple pattern matching to look for the middle of valid lists:
match is_valid l with
| Some middle -> middle (* the middle of the list *)
| None -> [] (* nothing — list was invalid *)
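The same result can also be obtained without reversing the list, by walking it once and rebuilding the middle on the way back. The following is only a sketch of that alternative; the names drop_trailing_nine and middle are hypothetical, and the recursion is not tail-recursive, so very long lists could exhaust the stack.

```ocaml
(* Hypothetical helper: if the list ends with 9, return it without that
   final 9; otherwise return None. *)
let rec drop_trailing_nine = function
  | [] -> None
  | [9] -> Some []
  | [_] -> None
  | x :: rest -> Option.map (fun m -> x :: m) (drop_trailing_nine rest)

(* Returns the elements between a leading 1 and a trailing 9, if both exist. *)
let middle = function
  | 1 :: rest -> drop_trailing_nine rest
  | _ -> None
```

Here middle [1;2;3;4;5;6;9] evaluates to Some [2; 3; 4; 5; 6], and middle [1;2;3] evaluates to None. Option.map requires OCaml 4.08 or later.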
I have seen some similar questions, but nothing that really helped me. Basically the title says it all. Using SML I want to take a string that I have, and make a list containing each letter found in the string. Any help would be greatly appreciated.
One possibility is to use the basic logic of quicksort to sort the letters while removing duplicates at the same time. Something like:
fun distinctChars []:char list = []
| distinctChars (c::cs) =
let val smaller = List.filter (fn x => x < c) cs
val bigger = List.filter (fn x => x > c) cs
in distinctChars smaller @ [c] @ distinctChars bigger
end
If the < and > in the definitions of smaller and bigger were to be replaced by <= and >= then it would simply be an implementation of quicksort (although not the most efficient one since it makes two passes over cs when a suitably defined auxiliary function could split into smaller and bigger in just one pass). The strict inequalities have the effect of throwing away duplicates.
To get what you want from here, do something like explode the string into a list of chars, remove non-alphabetical characters from the resulting list, while simultaneously converting to lower case, then invoke the above function -- ideally first refined so that it uses a custom split function rather than List.filter twice.
On Edit: @ is an expensive operator and probably results in the naïve SML quicksort not being all that quick. You can use the above idea of a modified sort, but one that modifies mergesort instead of quicksort:
fun split ls =
let fun split' [] (xs,ys) = (xs,ys)
| split' (a::[]) (xs, ys) = (a::xs,ys)
| split' (a::b::cs) (xs, ys) = split' cs (a::xs, b::ys)
in split' ls ([],[])
end
fun mergeDistinct ([], ys) = ys:char list
| mergeDistinct (xs, []) = xs
| mergeDistinct (x::xs, y::ys) =
if x < y then x::mergeDistinct(xs,y::ys)
else if x > y then y::mergeDistinct(x::xs,ys)
else mergeDistinct(x::xs, ys)
fun distinctChars [] = []
| distinctChars [c] = [c]
| distinctChars chars =
let val (xs,ys) = split chars
in mergeDistinct (distinctChars xs, distinctChars ys)
end
You can get a list of all the letters in a few different ways:
val letters = [#"a",#"b",#"c",#"d",#"e",#"f",#"g",#"h",#"i",#"j",#"k",#"l",#"m",#"n",#"o",#"p",#"q",#"r",#"s",#"t",#"u",#"v",#"w",#"x",#"y",#"z"]
val letters = explode "abcdefghijklmnopqrstuvwxyz"
val letters = List.tabulate (26, fn i => chr (i + ord #"a"))
Update: Looking at your question and John's answer, I might have misunderstood your intention. An efficient way to iterate over a string and gather some result (e.g. a set of characters) could be to write a fold for strings. Note that, despite its name, string_foldr below walks the string from left to right, as a foldl would:
fun string_foldr f acc0 s =
let val len = size s
fun loop i acc = if i < len then loop (i+1) (f (String.sub (s, i), acc)) else acc
in loop 0 acc0 end
Given an implementation of sets with at least setEmpty and setInsert, one could then write:
val setLetters = string_foldr (fn (c, ls) => setInsert ls c) setEmpty "some sentence"
The simplest solution I can think of:
To get the distinct elements of a list:
1. Take the head.
2. Remove that value from the tail and get the distinct elements of the result.
3. Put 1 and 2 together.
In code:
(* Return the distinct elements of a list *)
fun distinct [] = []
| distinct (x::xs) = x :: distinct (List.filter (fn c => x <> c) xs);
(* All the distinct letters, in lower case. *)
fun letters s = distinct (List.map Char.toLower (List.filter Char.isAlpha (explode s)));
(* Variation: "point-free" style *)
val letters' = distinct o (List.map Char.toLower) o (List.filter Char.isAlpha) o explode;
This is probably not the most efficient solution, but it's uncomplicated.
I have a character list [#"h", #"i", #" ", #"h", #"i"] from which I want to get the first word (the first character sequence before a space).
I've written a function which gives me this warning:
stdIn:13.1-13.42 Warning: type vars not generalized because of value
restriction are instantiated to dummy types (X1,X2,...)
Here is my code:
fun next [] = ([], [])
| next (hd::tl) = if(not(ord(hd) >= 97 andalso ord(hd) <= 122)) then ([], (hd::tl))
else
let
fun getword [] = [] | getword (hd::tl) = if(ord(hd) >= 97 andalso ord(hd) <= 122) then [hd]@getword tl else [];
in
next (getword (hd::tl))
end;
EDIT:
Expected input and output
next [#"h", #"i", #" ", #"h", #"i"] => ([#"h", #"i"], [#" ", #"h", #"i"])
Can anybody help me with this solution? Thanks!
This functionality already exists within the standard library:
val nexts = String.tokens Char.isSpace
val nexts_test = nexts "hi hi hi" = ["hi", "hi", "hi"]
But if you were to build such a function anyway, it seems that you return ([], []) sometimes and a single list at other times. Normally in a recursive function, you can build the result by doing e.g. c :: recursive_f cs, but this is assuming your function returns a single list. If, instead, it returns a tuple, you suddenly have to unpack this tuple using e.g. pattern matching in a let-expression:
let val (x, y) = recursive_f cs
in (c :: x, y + ...) end
Or you could use an extra argument inside a helper function (since the extra argument would change the type of the function) to store the word you're extracting, instead. A consequence of doing that is that you end up with the word in reverse and have to reverse it back when you're done recursing.
fun isLegal c = ord c >= 97 andalso ord c <= 122 (* Only lowercase ASCII letters *)
(* But why not use one of the following:
fun isLegal c = Char.isAlpha c
fun isLegal c = not (Char.isSpace c) *)
fun next input =
let fun extract (c::cs) word =
if isLegal c
then extract cs (c::word)
else (rev word, c::cs)
| extract [] word = (rev word, [])
in extract input [] end
val next_test_1 =
let val (w, r) = next (explode "hello world")
in (implode w, implode r) = ("hello", " world")
end
val next_test_2 = next [] = ([], [])
I'm new at OCaml (and still a novice in learning programming in general) and I have a quick question about checking what kind of string the next element in the string list is.
I want it to put a separator between each element of the list (except after the last one), but I can't figure out how to make the program 'know' that the last element is the last element.
Here is my code as it is now:
let rec join (separator: string) (l : string list) : string =
begin match l with
| []->""
| head::head2::list-> if head2=[] then head^(join separator list) else head^separator^(join separator list)
end
let test () : bool =
(join "," ["a";"b";"c"]) = "a,b,c"
;; run_test "test_join1" test
Thanks in advance!
You're almost there. The idea is breaking down the list in three cases where it has 0, 1 or at least 2 elements. When the list has more than one element, you're safe to insert separator into the output string:
let rec join (separator: string) (l : string list) : string =
begin match l with
| [] -> ""
| head::[] -> head
| head::list-> head^separator^(join separator list)
end
I have several comments about your function:
Type annotation is redundant. Because (^) is string concatenation operator, the type checker can infer types of separator, l and the output of the function easily.
No need to use a begin/end pair. Since you have only one level of pattern matching, there is no confusion for the compiler.
You could use function to eliminate match l with part.
Therefore, your code could be shortened as:
let rec join sep l =
match l with
| [] -> ""
| x::[] -> x
| x::xs -> x ^ sep ^ join sep xs
or even more concise:
let rec join sep = function
| [] -> ""
| x::[] -> x
| x::xs -> x ^ sep ^ join sep xs
The empty list is [], the list with one element is [h] and the list with at least one element is h::t. So your function can be written as:
let rec join separator = function
| [] -> ""
| [h] -> h
| h::t -> h ^ separator ^ join separator t
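For comparison, OCaml's standard library already provides this operation as String.concat, which takes the separator first and the list second. A minimal sketch that keeps the join name from the question:

```ocaml
(* join is the name from the question; String.concat is the standard
   library function that inserts sep between the elements of l. *)
let join sep l = String.concat sep l

let () = assert (join "," ["a"; "b"; "c"] = "a,b,c")
```

Writing the recursion by hand is still a useful exercise, but in real code String.concat avoids the repeated intermediate strings that chaining ^ creates.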