OCaml function parameter pattern matching for strings - ocaml

I tried to pass a string in to get a reversed string. Why can't I do this:
let rec reverse x =
match x with
| "" -> ""
| e ^ s -> (reverse s) ^ e;;
The compiler says it's a syntax error. Can't I use ^ to destructure parameters?

The reason is that strings are not represented as a datatype in the same way lists are. While cons (::) is a constructor, ^ is not: it is an ordinary function. Strings are a lower-level type without a recursive definition, unlike lists. You can still match on a string as a list of characters by using two functions from SML (which you can write in OCaml), explode and implode, which respectively convert a string to a char list and a char list back to a string. Here's an example implementation of them.

As Kristopher Micinski explained, you can't decompose strings using pattern matching as you do with lists.
But you can convert them to lists, using explode. Here's your reverse function with pattern matching using explode and its counterpart implode:
let rec reverse str =
  match explode str with
  | [] -> ""
  | h::t -> reverse (implode t) ^ string_of_char h
Use it like this:
let () =
let text = "Stack Overflow ♥ OCaml" in
Printf.printf "Regular: %s\n" text;
Printf.printf "Reversed: %s\n" (reverse text)
The output shows that it works for single-byte characters but not for multi-byte ones such as ♥, because the string is taken apart byte by byte.
And here are explode and implode along with a helper function:
let string_of_char c = String.make 1 c
(* Converts a string to a list of chars *)
let explode str =
  let rec explode_inner cur_index chars =
    if cur_index < String.length str then
      let new_char = str.[cur_index] in
      explode_inner (cur_index + 1) (chars @ [new_char])
    else chars in
  explode_inner 0 []
(* Converts a list of chars to a string *)
let rec implode chars =
  match chars with
  | [] -> ""
  | h::t -> string_of_char h ^ (implode t)
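On recent OCaml versions, explode and implode do not have to be written by hand. The following is a minimal sketch, assuming OCaml 4.07 or later, where the standard library provides String.to_seq, String.of_seq, List.of_seq and List.to_seq. The names explode, implode and reverse are kept from the answer above. Like that version, it works byte by byte, so multi-byte UTF-8 characters are still scrambled.

```ocaml
(* Convert a string to a list of chars via the standard Seq API. *)
let explode (s : string) : char list = List.of_seq (String.to_seq s)

(* Convert a list of chars back to a string. *)
let implode (cs : char list) : string = String.of_seq (List.to_seq cs)

(* Pattern matching works here because the scrutinee is a list.
   The accumulator avoids repeated string concatenation. *)
let reverse (s : string) : string =
  let rec go acc = function
    | [] -> implode acc
    | c :: rest -> go (c :: acc) rest
  in
  go [] (explode s)

let () = print_endline (reverse "OCaml")  (* prints "lmaCO" *)
```

Accumulating into a list and converting once at the end keeps the work linear in the length of the string.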

When you write a pattern matching expression, you cannot use arbitrary functions in your patterns. You can only use constructors, which look like unevaluated functions. For example, the function "+" is defined on integers. So the expression 1+2 is evaluated and gives 3; the function "+" is evaluated, so you cannot match on x+y. Here is an attempt to define a function on natural numbers that checks whether the number is nonzero:
let f x = match x with
| 0 -> false
| a+1 -> true
;;
This cannot work! For the same reason, your example with strings cannot work. The function "^" is evaluated on strings, it is not a constructor.
The matching on x+1 would work only if numbers were unevaluated symbolic expressions made out of the unevaluated operator + and a symbolic constant 1. This is not the case in OCaml. Integers are implemented directly through machine numbers.
When you match a variant type, you match on constructors, which are unevaluated expressions. For example:
# let f x = match x with
| Some x -> x+1
| None -> 0
;;
val f : int option -> int = <fun>
This works because the 'a option type is made out of a symbolic expression, such as Some x. Here, Some is not a function that is evaluated and gives some other value, but rather a "constructor", which you can think of as a function that is never evaluated. The expression Some 3 is not evaluated any further; it remains as it is. It is only on such functions that you can pattern-match.
Lists are also symbolic, unevaluated expressions built out of constructors; the constructor is ::. The result of x :: y :: [] is an unevaluated expression, which is represented by the list [x;y] only for cosmetic convenience. For this reason, you can pattern-match on lists.
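To make the contrast concrete, here is a small sketch of what "symbolic" numbers would look like. The type nat and the functions is_positive and to_int are made up for this illustration; they are not part of the standard library.

```ocaml
(* Peano-style naturals: Succ plays the role of "+1", but as a constructor. *)
type nat = Zero | Succ of nat

(* Matching on Succ is allowed because Succ is a constructor, not a function. *)
let is_positive (n : nat) : bool =
  match n with
  | Zero -> false
  | Succ _ -> true

(* Convert to a machine integer by counting the Succ constructors. *)
let rec to_int (n : nat) : int =
  match n with
  | Zero -> 0
  | Succ m -> 1 + to_int m

let () =
  let two = Succ (Succ Zero) in
  Printf.printf "%b %d\n" (is_positive two) (to_int two)  (* prints "true 2" *)
```

With the built-in int type, the same check is written with a constant pattern and a wildcard (| 0 -> false | _ -> true), since integer literals are allowed in patterns while arithmetic expressions are not.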

Related

Turning list of integers into string in OCaml

How could I turn a list of integers, such as [1;2;3], into a single string "123" using fold?
Right now, I think I'm doing:
let int_list_to_string (s : int list) : string =
fold (fun s combine -> combine + .... ) ""
or something along these lines, where .... could be something similar to String.length (which I used in a different fold problem to count characters in a string) but I don't know if this is even remotely correct.
Thank you!
Your basic layout looks right to me. Many things need to be fixed up. Here are a few:
You have to pick a specific fold function to use, List.fold_left or List.fold_right.
The function to be folded takes two parameters. One is the accumulated result and the other is the next input from the list. The order depends on whether you use fold_left or fold_right. Your code sketch has two parameters but one of them is suspiciously named s. This will not be the same s as the input list. The names after fun are new parameter variables introduced at that point.
The OCaml operator for concatenating strings is ^, which is what you should use where you have + (possibly just a placeholder in your code).
You need to convert each int to a string before concatenating. There is a function named string_of_int that does this.
You have to apply the fold to a list. I.e., fold takes 3 arguments but you are supplying only 2 arguments in your code sketch.
Note that the function passed to the fold needs to concatenate the accumulator acc and the next list element with the ^ operator. The accumulator of List.fold_left has the same type as the result, so its initial value has to be the empty string: "".
let int_list_to_string lst = List.fold_left (fun acc x -> acc ^ string_of_int x) "" lst
val int_list_to_string : int list -> string = <fun>
# int_list_to_string [1;2;3];;
- : string = "123"
One could also create more advanced strings, e.g. with the list-syntax:
let int_list_to_string_fancy lst =
"[" ^ ( List.fold_left( fun acc x -> acc ^ string_of_int x ^ ";" ) "" lst) ^ "]"
val int_list_to_string_fancy : int list -> string = <fun>
# int_list_to_string_fancy [1;2;3];;
- : string = "[1;2;3;]"
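If the trailing separator in "[1;2;3;]" is unwanted, one option is to convert each element to a string first and let String.concat place the separators. This is only a sketch; the function name int_list_to_string_sep is made up here.

```ocaml
(* String.concat inserts the separator only between elements,
   so there is no trailing ";" to strip afterwards. *)
let int_list_to_string_sep (lst : int list) : string =
  "[" ^ String.concat ";" (List.map string_of_int lst) ^ "]"

let () = print_endline (int_list_to_string_sep [1; 2; 3])  (* prints "[1;2;3]" *)
```

This no longer uses a fold, so it does not answer the exercise as stated, but it avoids the repeated concatenation onto a growing accumulator.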

how to extract words from a string with special characters

I am currently trying to do a spellcheck, but am having some trouble dealing with certain cases.
For example, given the string: { else"--but, }, my spellcheck automatically reads this as an invalid word. However, since else and but are both correctly spelled, I don't want to mark this as incorrect.
Is there any way I can do this with regular expressions?
A more common case I am having trouble with is things like "waistcoat-pocket".
Rather than a regular expression, you should use Unicode word segmentation. With the uuseg and uucp libraries (the code below also uses uutf for UTF-8 decoding), you can extract words and filter out the word boundaries with
let is_alphaword =
let alphachar = function
| `Malformed _ -> false
| `Uchar x ->
match Uucp.Break.word x with
| `LE | `Extend -> true
| _ -> false
in
Uutf.String.fold_utf_8 (fun acc _ x -> acc && alphachar x) true
(* Note that we are supposing strings to be utf-8 encoded *)
let words s =
let cons l x = if is_alphaword x then x :: l else l in
List.rev (Uuseg_string.fold_utf_8 `Word cons [] s)
This function splits the string word by word:
words "else\"--but";;
- : string list = ["else"; "but"]
words "waistcoat-pocket";;
- : string list = ["waistcoat"; "pocket"]
and works correctly in more general contexts:
words "आ तवेता नि षीदतेन्द्रमभि पर गायत";;
- : string list =
["आ"; "तवेता"; "नि"; "षीदतेन्द्रमभि";
"पर"; "गायत"]
or
words "Étoile(de Barnard)";;
- : string list = ["Étoile"; "de"; "Barnard"]
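For comparison, here is a rough standard-library-only sketch that handles ASCII letters only. The names is_ascii_letter and ascii_words are made up for this illustration. It reproduces the first two results above but, unlike the uuseg version, it mangles words containing non-ASCII letters such as "Étoile", because it looks at individual bytes.

```ocaml
(* True for the 52 unaccented ASCII letters only. *)
let is_ascii_letter (c : char) : bool =
  (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')

(* Collect maximal runs of ASCII letters; every other byte is a separator. *)
let ascii_words (s : string) : string list =
  let buf = Buffer.create 16 in
  let words = ref [] in
  let flush () =
    if Buffer.length buf > 0 then begin
      words := Buffer.contents buf :: !words;
      Buffer.clear buf
    end
  in
  String.iter
    (fun c -> if is_ascii_letter c then Buffer.add_char buf c else flush ())
    s;
  flush ();
  List.rev !words

let () =
  List.iter print_endline (ascii_words "else\"--but,")  (* prints "else" then "but" *)
```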

How to get the value of a type

I am trying to get the value of a type in my code. There is an x which is a stmt, and its value is Dim ("x", 1).
I want to get that "x" and use it as a key to find a value in a hashtable.
What I am asking is how to extract the "x".
type variable = string
type expr = int
type arrayref = variable * expr
type stmt = Dim of arrayref
let x = Dim("x",1);;
let aa (sbc:stmt) = match sbc with
|Dim a -> None;;
I should replace None with some code, but I have no idea how to do that.
I'm not completely sure, but I think you're asking how to access a component of a compound value. For tuples and variants, the way to do this is with pattern matching. So you have that right. You just need to make your pattern a little deeper. To get the "x" from your value x you would do something like this:
let extracted_value =
match x with
| Dim (k, _) -> k
in
. . .
Since there is only one constructor in your stmt type (at least right now), you can do this without a match as follows:
let Dim (extracted_value, _) = x in
. . .
This works because there is a single pattern that is exhaustive. For types with more constructors you need to use match to handle all the possibilities.
If this isn't what you're asking, maybe try asking again in a different way.
Update
To print the string you could write something like this:
let Dim (k, _) = x in print_string k
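Since the question mentions using the extracted name as a hashtable key, here is a minimal sketch of that step. The table, its int values and the function lookup are assumptions made for the illustration; the question does not say what the hashtable stores. Hashtbl.find_opt is available from OCaml 4.05.

```ocaml
type variable = string
type expr = int
type arrayref = variable * expr
type stmt = Dim of arrayref

(* Hypothetical table mapping variable names to some stored value. *)
let table : (variable, int) Hashtbl.t = Hashtbl.create 16

(* Extract the name from the statement and look it up.
   Hashtbl.find_opt returns None when the key is absent. *)
let lookup (s : stmt) : int option =
  match s with
  | Dim (name, _) -> Hashtbl.find_opt table name

let () =
  Hashtbl.replace table "x" 42;
  match lookup (Dim ("x", 1)) with
  | Some v -> Printf.printf "found %d\n" v  (* prints "found 42" *)
  | None -> print_endline "not found"
```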

match case unused in OCaml

I want to build a list of type (char * 'a list) list where each char is an upper case letter of the alphabet. I'm getting the warning "Warning 11: this match case is unused." for the second match case in get_list. I did some prints in the first case and found out that len gets there with value 0, so it never uses the second case. What's happening?
let rec get_list abc i len =
match i with
| len -> []
| _ -> ((String.get abc i), [])::get_list abc (i + 1) len
in
let rec print_list l =
match l with
| [] -> ()
| h::t -> print_char(fst h);print_list t
in
let abc = "ABCDEFGHIJKLMNOPQRSTUVWXYZ" in
let abc_l = get_list abc 0 (String.length abc) in
print_list abc_l;;
The reason it doesn't work
When you write
match i with
| len -> []
| _ -> ["..."]
len is a generic pattern that has nothing to do with the len defined above. In a pattern match you only describe the general "structure" the value should have; the variable names in a pattern name the different parts of the matched value and are new variables. For example, with lists you can do:
match my_list with
| [x; y; z] -> x+y+z
| x :: r -> x + (List.length r)
| anything_else -> List.length anything_else
When you put '_' it's only a convention to say "I don't mind which value it is, I don't need it". Here is another example with tuples:
match my_tuple with
| (a,b) -> a+b
A solution: conditional pattern matching
If you want to put a condition in a pattern match, you can use the when keyword:
match i with
| n when n = len -> []
| _ -> ["..."]
Another example, which "sorts" a pair so that the larger component comes first:
match my_tuple with
| (a,b) when a>b -> (a,b)
| (a,b) -> (b,a)
Or just use a plain conditional on the integers:
if i = len then []
else ["..."]
Note that you can also pattern-match directly in function parameters:
let f (a,b) = a+b
The len in your pattern is a new variable introduced by the pattern. As a pattern, its meaning is that it will match anything at all. Thus, the next pattern _ will never match.
As @AlexanderRevyakin says, this new variable len is hiding the parameter that's also named len.
It is not the case that the len in your pattern represents the value of the parameter len. OCaml patterns contain only new variables (to which pieces of the matched value are bound) and constants. They don't contain expressions that are evaluated at run time. For that you want to use if/then/else (as @AntonTrunov points out).
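Putting the advice together, a corrected get_list might look like the sketch below. It keeps the names from the question. The guard uses >= rather than =, a small deviation so that the recursion also stops if i starts past len.

```ocaml
(* Build [(c, []); ...] for the characters abc.[i] .. abc.[len - 1]. *)
let rec get_list abc i len =
  match i with
  | n when n >= len -> []  (* the guard compares against the parameter len *)
  | _ -> (String.get abc i, []) :: get_list abc (i + 1) len

let () =
  let abc = "ABCDEFGHIJKLMNOPQRSTUVWXYZ" in
  let abc_l = get_list abc 0 (String.length abc) in
  List.iter (fun (c, _) -> print_char c) abc_l;
  print_newline ()
```

The pattern n is still a fresh variable that matches anything; it is the when guard that performs the comparison with len.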

OCaml: remove the first element of a list

I have a list made up of several pairs of numbers:
[(1,2);(3,4);(5,6);(7,8)]
I want to remove the first element (the head) from the list, so the output should be:
[(3,4);(5,6);(7,8)]
Can someone help me? I was thinking about this function but it doesn't work:
let cut x = function
[] -> []
| (a,b) -> []
| (a,b)::ris -> ris
Just remember, that
let f x y = function -> <code>
is really a shortcut (or a syntactic sugar), for:
let f x y z = match z with -> <code>
So it just leaves out the last argument of the function and automatically matches on it.
Also, when you pattern-match, keep in mind that all patterns on the left-hand side of a match must have the same type. Otherwise the compiler settles on one of them (in practice the first), expects all the others to have that type, and produces a somewhat confusing error message. The same is true for the right-hand sides of the match. So when you see a compiler message saying that something is not what it expected, just check these preconditions:
| [] -> [] (* is a list, by definition *)
| (a,b) -> [] (* is a pair, by definition of a pair *)
| (a,b)::ris -> ris (* is a list, by definition of (::) *)
If the left-hand sides are consistent, look at the right-hand sides.
Also, if you have a variable that you do not need to use, it is better to give it a name starting with an underscore, or just an underscore.
let cut = function
| [] -> []
| _::xs -> xs
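The effect of the stray x in the question shows up in the inferred types. In this sketch, cut_two_args is a made-up name for the shape of the original definition, with the ill-typed middle case dropped so that it compiles and with x renamed to _x to mark it as unused.

```ocaml
(* _x is an unused first argument; function adds a second, anonymous one.
   Inferred type: 'a -> 'b list -> 'b list *)
let cut_two_args _x = function
  | [] -> []
  | _ :: rest -> rest

(* Without the extra parameter the function takes just the list.
   Inferred type: 'a list -> 'a list *)
let cut = function
  | [] -> []
  | _ :: rest -> rest

let () =
  assert (cut [(1, 2); (3, 4)] = [(3, 4)]);
  (* The two-argument version needs a dummy first argument. *)
  assert (cut_two_args () [(1, 2); (3, 4)] = [(3, 4)])
```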
You are almost there:
let tl x = match x with
| [] -> [] (* or failwith "empty" *)
| ab::ris -> ris
A few points:
function introduces another argument. Your function already takes one argument, so use match x with instead.
You only care whether the list is empty or has a "tail", so you do not need to pattern-match its elements as tuples.
This function is called "tail"; in OCaml it is available as List.tl.
You could simply write:
let cut = List.tl
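One difference worth knowing: List.tl raises Failure "tl" on an empty list, whereas the hand-written versions above return []. Here is a small sketch of an alternative that makes the empty case explicit; cut_opt is a made-up name.

```ocaml
(* Total version: returns an option instead of raising or inventing a result. *)
let cut_opt = function
  | [] -> None
  | _ :: rest -> Some rest

let () =
  assert (List.tl [(1, 2); (3, 4); (5, 6); (7, 8)] = [(3, 4); (5, 6); (7, 8)]);
  assert (cut_opt [(1, 2)] = Some []);
  assert (cut_opt ([] : (int * int) list) = None);
  (* List.tl on an empty list raises Failure. *)
  assert (match List.tl ([] : int list) with
          | _ -> false
          | exception Failure _ -> true)
```

Which behaviour is preferable depends on whether an empty input is an error in the calling code.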
You have a small mistake.
The third line should look like
| [(a,b)] -> []
or
| (a,b) :: [] -> []
P.S. By the way, this third line is unnecessary. Just remove it.
Also delete x in the first line:
let cut = function