Explanation of OCaml code: explode a string, split a list - list

I am an absolute OCaml beginner and have an assignment involving some code. I have the following code, but I don't know how it works. If someone can help me out, I would appreciate it.
# let explode str =
    (* defines a function that explodes the argument str, which is of type
       string, into a list of chars *)
    let rec exp = function (* defines recursive function exp *)
      | a, b when a < 0 -> b
        (* this part I don't know. Is this pattern matching? Is it a function
           with arguments a and b that go into the expression? "when" is a
           guard, so: if a is smaller than 0 then b *)
        (* if a is not smaller than 0 then this function? *)
      | a, b -> exp (a-1, str.[a]::b)
        (* this I don't know: a and b are arguments that go into the recursive
           function in such a way that a is decreased by one and b goes into
           string a?? *)
    in
    exp ((String.length str)-1, []);;
    (* applies exp to the string length of str decreased by one (why?).
       Do the [ ] brackets mean or indicate some kind of type? *)
# let split lst ch =
let rec split = function (* defines recursive fun split *)
| [], ch, cacc', aacc' -> cacc'::aacc'(* if empty ...this is about what i got
so far :) *)
| c::lst, ch, cacc', aacc' when c = ch -> split (lst, ch, [], cacc'::aacc')
| c::lst, ch, cacc', aacc' -> split (lst, ch, c::cacc', aacc')
in
split (lst, ch, [], []);;
val split : 'a list -> 'a -> 'a list list = <fun>

This code is ugly. Whoever gave it to you is doing you a disservice. If a student of mine wrote it, I would ask them to rewrite it without when guards, because guards tend to be confusing and encourage pattern-matching-heavy code in places where it is not warranted.
As a rule of thumb, beginners should never use when. A simple if..then..else test is more readable.
Here are equivalent versions of those two functions, rewritten for readability:
let explode str =
  let rec exp a b =
    if a < 0 then b
    else exp (a - 1) (str.[a] :: b)
  in
  exp (String.length str - 1) []

let split input delim_char =
  let rec split input curr_word past_words =
    match input with
    | [] -> curr_word :: past_words
    | c :: rest ->
      if c = delim_char
      then split rest [] (curr_word :: past_words)
      else split rest (c :: curr_word) past_words
  in
  split input [] []
My advice for understanding them is to run them yourself, on a concrete example, on paper. Just write down the function call (e.g. explode "foo" and split ['a';'b';'c';'d'] 'b'), expand the definition, evaluate the code to get another expression, and so on, until you reach the result. Here is an example:
explode "fo"
=>
exp (String.length "fo" - 1) []
=>
exp 1 []
=>
if 1 < 0 then [] else exp 0 ("fo".[1] :: [])
=>
exp 0 ("fo".[1] :: [])
=>
exp 0 ('o' :: [])
=>
exp 0 ['o']
=>
if 0 < 0 then ['o'] else exp (-1) ("fo".[0] :: ['o'])
=>
exp (-1) ("fo".[0] :: ['o'])
=>
exp (-1) ('f' :: ['o'])
=>
exp (-1) ['f'; 'o']
=>
if -1 < 0 then ['f'; 'o'] else exp (-2) ("fo".[-1] :: ['f'; 'o'])
=>
['f'; 'o']
Take care to do that, on a small example, for each function here and for any function you have trouble understanding. That's the best way to get a global view of what's going on.
(Later, when you grow more used to recursion, you'll find that you don't actually need to do that. You can reason inductively about the function: make an assumption about what it does and, assuming that the recursive calls actually do that, check that the whole function indeed does it. In more advanced cases, trying to hold the entire execution in one's head is just too hard, and this induction technique works better, but it is more high-level and requires more practice. Begin by simply running the code.)
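Once you have traced the functions on paper, you can check your result against the toplevel. The sketch below repeats the two readable definitions and runs them on a small input. It also shows something the trace of split reveals: both the words and the characters inside each word come out in reverse order. The helper split_in_order is a hypothetical name, not part of the original code; it is one possible way to restore the order.

```ocaml
let explode str =
  let rec exp a b =
    if a < 0 then b
    else exp (a - 1) (str.[a] :: b)
  in
  exp (String.length str - 1) []

let split input delim_char =
  let rec split input curr_word past_words =
    match input with
    | [] -> curr_word :: past_words
    | c :: rest ->
      if c = delim_char
      then split rest [] (curr_word :: past_words)
      else split rest (c :: curr_word) past_words
  in
  split input [] []

(* Hypothetical helper: reverse each word and the list of words. *)
let split_in_order input delim_char =
  List.rev_map List.rev (split input delim_char)

(* split as given: the accumulators are built back to front *)
let raw = split (explode "abcd") 'b'

(* same input, with the order restored *)
let ordered = split_in_order (explode "abcd") 'b'
```

Here raw is [['d'; 'c']; ['a']] and ordered is [['a']; ['c'; 'd']].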

If you're using the Core library you can just use
String.to_list "BKMGTPEZY"
which returns a list of chars. If you want strings, just map over it:
String.to_list "BKMGTPEZY" |> List.map ~f:Char.to_string
Outputs:
- : bytes list = ["B"; "K"; "M"; "G"; "T"; "P"; "E"; "Z"; "Y"]
As a function
let explode s = String.to_list s |> List.map ~f:Char.to_string
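If you are not using Core, the standard library can do the same thing. The following is a minimal sketch, assuming OCaml 4.07 or later and that Core is not opened (Core's List.init takes a labelled ~f argument instead). The names explode_seq and explode_strings are made up for this example.

```ocaml
(* Build the list by reading each index of the string. *)
let explode (s : string) : char list =
  List.init (String.length s) (String.get s)

(* Same result, going through the Seq interface. *)
let explode_seq (s : string) : char list =
  List.of_seq (String.to_seq s)

(* One-character strings instead of chars. *)
let explode_strings (s : string) : string list =
  List.map (String.make 1) (explode s)
```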

You can also implement it this way:
let rec strexp s =
  let n = String.length s in
  if n = 0 then
    []
  else
    strexp (String.sub s 0 (n - 1)) @ [s.[n - 1]]
;;

Related

How to count the number of recurring character repetitions in a char list?

My goal is to take a char list like:
['a'; 'a'; 'a'; 'a'; 'a'; 'b'; 'b'; 'b'; 'a'; 'd'; 'd'; 'd'; 'd']
Count the number of repeated characters and transform it into a (int * char) list like this:
[(5, 'a'); (3, 'b'); (1, 'a'); (4, 'd')]
I am completely lost and am also very new to OCaml. Here is the code I have right now:
let to_run_length (lst : char list) : (int * char) list =
match lst with
| [] -> []
| h :: t ->
let count = int 0 in
while t <> [] do
if h = t then
count := count + 1;
done;
I am struggling on how to check the list like you would an array in C or Python. I am not allowed to use fold functions or map or anything like that.
Edit: Updated code, yielding an exception on List.nth:
let rec to_run_length (lst : char list) : (int * char) list =
let n = ref 0 in
match lst with
| [] -> []
| h :: t ->
if h = List.nth t 0 then n := !n + 1 ;
(!n, h) :: to_run_length t ;;
Edit: Added nested match resulting in a function that doesn't work... but no errors!
let rec to_run_length (lst : char list) : (int * char) list =
match lst with
| [] -> []
| h :: t ->
match to_run_length t with
| [] -> []
| (n, c) :: tail ->
if h <> c then to_run_length t
else (n + 1, c) :: tail ;;
Final Edit: Finally got the code running perfectly!
let rec to_run_length (lst : char list) : (int * char) list =
match lst with
| [] -> []
| h :: t ->
match to_run_length t with
| (n, c) :: tail when h = c -> (n + 1, h) :: tail
| tail -> (1, h) :: tail ;;
One way to answer your question is to point out that a list in OCaml isn't like an array in C or Python. There is no (constant-time) way to index an OCaml list like you can an array.
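This is also why List.nth t 0 raises an exception in the edited code: when t is empty there is no element 0. Instead of indexing, you would normally look at the front of a list with pattern matching. A small sketch (the function names are invented for illustration):

```ocaml
(* True when the list begins with two equal characters. *)
let starts_with_repeat (lst : char list) : bool =
  match lst with
  | a :: b :: _ -> a = b   (* at least two elements: compare them *)
  | _ -> false             (* empty or single-element list *)

(* If you do want indexing, List.nth_opt returns None instead of raising. *)
let first_of_tail (lst : char list) : char option =
  match lst with
  | [] -> None
  | _ :: t -> List.nth_opt t 0
```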
If you want to code in an imperative style, you can treat an OCaml list like a list in C, i.e., a linked structure that can be traversed in one direction from beginning to end.
To make this work you would indeed have a while statement that continues only as long as the list is non-empty. At each step you examine the head of the list and update your output accordingly. Then replace the list with the tail of the list.
For this you would want to use references for holding the input and output. (As a side comment, where you have int 0 you almost certainly wanted ref 0. I.e., you want to use a reference. There is no predefined OCaml function or operator named int.)
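To make the imperative description concrete, here is one possible sketch. The name to_run_length_imperative and the variables input and output are assumptions for this example. It keeps the groups in reverse order while looping and reverses them once at the end; it uses List.rev but no fold or map.

```ocaml
let to_run_length_imperative (lst : char list) : (int * char) list =
  let input = ref lst in     (* the part of the list still to be read *)
  let output = ref [] in     (* groups found so far, most recent first *)
  while !input <> [] do
    (match !input with
     | [] -> ()              (* unreachable: the loop condition excludes it *)
     | h :: t ->
       (match !output with
        | (n, c) :: rest when c = h ->
          output := (n + 1, c) :: rest   (* extend the current group *)
        | _ ->
          output := (1, h) :: !output);  (* start a new group *)
       input := t)           (* replace the list with its tail *)
  done;
  List.rev !output
```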
However, the usual reason to study OCaml is to learn functional style. In that case you should be thinking of a recursive function that will compute the value you want.
For that you need a base case and a way to reduce a non-base case to a smaller case that can be solved recursively. A pretty good base case is an empty list. The desired output for this input is (presumably) also an empty list.
Now assume (by recursion hypothesis) you have a function that works, and you are given a non-empty list. You can call your function on the tail of the list, and it (by hypothesis) gives you a run-length encoded version of the tail. What do you need to do to this result to add one more character to the front? That's what you would have to figure out.
Update
Your code is getting closer, as you say.
You need to ask yourself how to add a new character to the beginning of the encoded value. In your code you have this, for example:
. . .
match to_run_length t with
| [] -> []
. . .
This says to return an empty encoding if the tail is empty. But that doesn't make sense. You know for a fact that there's a character in the input (namely, h). You should be returning some kind of result that includes h.
In general if the returned list starts with h, you want to add 1 to the count of the first group. Otherwise you want to add a new group to the front of the returned list.

Error: Unbound value take

I'm learning OCaml and I'm trying to write a simple function that returns the first elements of a list, with the count given as an int argument.
What I have written:
let rec take(number, lista)=
let rec take_acc(number, lista, acc)=
match number with
| 0 -> []
| number < 0 -> []
| number > lista.length -> lista
| number < lista.length -> take_acc(number-1, lista.tl, acc@lista.head);;
take_acc(number, lista, [])
let listas = 1 :: 2 :: 3 :: 4 :: 5 :: 6 :: [];;
take(2,listas);;
The problem is that the code above gives me this error:
Error: Unbound value take
What am I doing wrong?
The point is that this code works:
let xxl = 11 :: 33 :: 54 :: 74 :: [];;
let rec take2 (ile,xxx) =
if ile=0 || ile<0 then []
else
if ile>(List.length xxl) then take2(ile-1,xxx)
else
List.hd xxx :: take2(ile-1,List.tl xxx);;
What is the difference between these two programs?
EDIT:
Following Jeffrey Scofield's suggestion, I have written something like this:
let rec take2(ilosc, lista) =
let rec take_acc(ilosc, lista, acc) =
if ilosc = 0 || ilosc < 0 then []
else
if ilosc > List.length lista
then lista
else
take_acc(ilosc-1, lista.tl, acc@lista.hd);;
let listas = 1 :: 2 :: 3 :: 4 :: 5 :: 6 :: [];;
take2(2,listas);;
Still the same.
Your code is not syntactically well formed. So it couldn't ever reach the point of saying that take isn't defined.
The first thing to fix is your use of patterns. The construct number < 0 isn't a pattern, it's a boolean expression. You can have a boolean as part of a pattern using when:
| _ when number < 0
However, this isn't particularly good style for what you want to test. It might be better just to use if for your tests.
The next thing to fix might be your use of lista.length. The way to get the length of a list in OCaml is List.length lista.
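Putting these points together, a working version could look like the sketch below. This is only one way to write it, not necessarily what the assignment expects: it uses if for the numeric test, pattern matching to take the list apart (instead of lista.tl and lista.head, which do not exist), and curried arguments rather than a tuple.

```ocaml
(* Returns the first n elements of lst, or all of lst if it is shorter. *)
let rec take n lst =
  if n <= 0 then []
  else
    match lst with
    | [] -> []
    | x :: rest -> x :: take (n - 1) rest

let listas = [1; 2; 3; 4; 5; 6]
let first_two = take 2 listas
```

Here first_two is [1; 2]. No separate length check is needed, because the [] case already stops the recursion when the list runs out.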

Empty character in OCaml

I am trying to do something fairly simple. I want to take a string such as "1,000" and return the string "1000".
Here was my attempt:
String.map (function x -> if x = ',' then '' else x) "1,000";;
However, I get a compiler error saying there is a syntax error at ''.
Thanks for the insight!
Unfortunately, there's no character like the one you're looking for. There is a string that's 0 characters long (""), but there's no character that's not there at all. All characters (so to speak) are 1 character.
To solve your problem you need a more general operation than String.map. The essence of a map is that its input and output have the same shape but different contents. For strings this means that the input and output are strings of the same length.
Unless you really want to avoid imperative coding (which is actually a great thing to avoid, especially when starting out with OCaml), you would probably do best using String.iter and a buffer (from the Buffer module).
Update
The string_map_partial function given by Andreas Rossberg is pretty nice. Here's another implementation that uses String.iter and a buffer:
let string_map_partial f s =
  let b = Buffer.create (String.length s) in
  let addperhaps c =
    match f c with
    | None -> ()
    | Some c' -> Buffer.add_char b c'
  in
  String.iter addperhaps s;
  Buffer.contents b
Just an alternate implementation with different stylistic tradeoffs. Not faster, probably not slower either. It's still written imperatively (for the same reason).
What you'd need here is a function like the following, which unfortunately is not in the standard library:
(* string_map_partial : (char -> char option) -> string -> string *)
let string_map_partial f s =
  (* strings are immutable in current OCaml, so the scratch buffer is a Bytes.t *)
  let buf = Bytes.create (String.length s) in
  let j = ref 0 in
  for i = 0 to String.length s - 1 do
    match f s.[i] with
    | None -> ()
    | Some c -> Bytes.set buf !j c; incr j
  done;
  Bytes.sub_string buf 0 !j
You can then write:
string_map_partial (fun c -> if c = ',' then None else Some c) "1,000"
(Note: I chose an imperative implementation for string_map_partial, because a purely functional one would require repeated string concatenation, which is fairly expensive in OCaml.)
A purely functional version could be this one (note that here f has type string -> string and returns "" to drop a character):
let string_map_partial f s =
  let n = String.length s in
  let rec map_str i acc =
    if i < n then
      map_str (i + 1) (acc ^ (f (String.make 1 s.[i])))
    else acc
  in map_str 0 ""
This is tail-recursive, but slower than the imperative version.
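For comparison, here is a sketch of a functional version that avoids the repeated concatenation of the recursive version above by going through the Seq module. It assumes OCaml 4.07 or later; the name string_filter_map is made up for this example, and f has the char -> char option type again.

```ocaml
let string_filter_map (f : char -> char option) (s : string) : string =
  String.to_seq s          (* lazy sequence of the characters *)
  |> Seq.filter_map f      (* keep c' where f returns Some c', drop None *)
  |> String.of_seq         (* collect the surviving characters *)

let without_commas =
  string_filter_map (fun c -> if c = ',' then None else Some c) "1,000"
```

Here without_commas is "1000".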

Split a string into list of words' characters in Ocaml

So, I have homework and I'm doing my best to solve it.
We have to translate from English to Morse code.
Every word has to be separated.
Example: if I enter "this is", it should write: ["_";"....";"..";"..."]["..";"...."]
I wrote 2 functions so far (lowercase to uppercase and matching letters and numbers with Morse code) and now I need to write function which converts string to a list of list of characters like this:
stringSAllCaps " ban an a ";;
- : char list list = [['B'; 'A'; 'N']; ['A'; 'N']; ['A']]
stringSAllCaps "banana";;
- : char list list = [['B'; 'A'; 'N'; 'A'; 'N'; 'A']]
I know how to convert a string into a list of characters, but have no idea what to do next. I don't need someone to solve that for me completely, just to guide me in right direction.
This is what I have done:
let explode niz =
let rec exp a b =
if a < 0 then b
else exp (a - 1) (niz.[a] :: b) in
exp (String.length niz - 1) []
;;
edit:
ty for your help :)
I've managed to solve this problem, but not like this. I will post it later.
as I solved it and continued with my homework I realized that I had to use while and pointers and now I'm stuck again (pointers are not my best friends.. ). Any suggestions?
my solution at the moment:
# let explode str =
let rec exp = function
| a, b when a < 0 -> b
| a, b -> exp (a-1, str.[a]::b)
in
exp ((String.length str)-1, []);;
# let split lst ch =
let rec split = function
| [], ch, cacc', aacc' -> cacc'::aacc'
| c::lst, ch, cacc', aacc' when c = ch -> split (lst, ch, [], cacc'::aacc')
| c::lst, ch, cacc', aacc' -> split (lst, ch, c::cacc', aacc')
in
split (lst, ch, [], []);;
I guess you should start by:
Renaming the arguments of your recursive function to have a more explicit meaning (index and current_word, for instance).
Adding a new parameter to your recursive function to store the words already seen (seen_words).
Testing whether niz.[a] is a blank char and doing the right thing in each case, i.e. updating either the current word or the list of words already seen.
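The three steps above can be sketched as follows. This is one possible reading of the suggestion, not the asker's eventual solution; the name words_of_string is invented, and the call to Char.uppercase_ascii is an assumption added so the output matches the capitalised examples in the question.

```ocaml
let words_of_string niz =
  (* index walks from the last character down to 0, so consing onto
     current_word and seen_words leaves both in left-to-right order *)
  let rec go index current_word seen_words =
    if index < 0 then
      (if current_word = [] then seen_words
       else current_word :: seen_words)
    else if niz.[index] = ' ' then
      (* a blank closes the current word, if there is one *)
      (if current_word = [] then go (index - 1) [] seen_words
       else go (index - 1) [] (current_word :: seen_words))
    else
      go (index - 1)
        (Char.uppercase_ascii niz.[index] :: current_word)
        seen_words
  in
  go (String.length niz - 1) [] []
```

With this definition, words_of_string " ban an a " evaluates to [['B'; 'A'; 'N']; ['A'; 'N']; ['A']].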

Ocaml introduction

I'm trying to learn OCaml right now and wanted to start with a little program that generates all bit combinations:
["0","0","0"]
["0","0","1"]
["0","1","0"]
... and so on
My idea is the following code:
let rec bitstr length list =
if length = 0 then
list
else begin
bitstr (length-1)("0"::list);
bitstr (length-1)("1"::list);
end;;
But I get the following warning:
Warning S: this expression should have type unit.
val bitstr : int -> string list -> string list = <fun>
# bitstr 3 [];;
- : string list = ["1"; "1"; "1"]
I did not understand what to change, can you help me?
Best regards
Philipp
begin foo; bar end executes foo and throws the result away, then it executes bar. This only makes sense if foo has side effects and no meaningful return value, so OCaml emits a warning if foo has a return type other than unit. Anything else is likely to be a programmer error (i.e. the programmer does not actually intend for the result to be discarded), as is the case here.
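A small sketch of the two situations where sequencing is fine: the first expression returns unit, or its value is discarded explicitly with ignore. The function names are invented for illustration.

```ocaml
(* The value of 1 + 1 is discarded on purpose, so there is no warning. *)
let discarded () =
  ignore (1 + 1);
  2

(* Buffer.add_string returns unit, so it can be sequenced freely. *)
let side_effect () =
  let log = Buffer.create 16 in
  Buffer.add_string log "first";
  Buffer.add_string log " second";
  Buffer.contents log
```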
In this case it really makes no sense to calculate the list with "0" and then throw it away. Presumably you want to concatenate the two lists instead. You can do this using the @ operator:
let rec bitstr length list =
  if length = 0 then
    [list]
  else
    bitstr (length-1) ("0"::list) @ bitstr (length-1) ("1"::list);;
Note that I also made the length = 0 case return [list] instead of just list so the result is a list of lists instead of a flat list.
Although sepp2k's answer is spot on, I would like to add the following alternative (which doesn't match the signature you proposed, but actually does what you want) :
let rec bitstr = function
  | 0 -> [[]]
  | n -> let f e = List.map (fun x -> e :: x) and l = bitstr (n-1) in
         (f "0" l) @ (f "1" l);;
The first difference is that you do not need to pass an empty list to call the function: bitstr 2 returns [["0"; "0"]; ["0"; "1"]; ["1"; "0"]; ["1"; "1"]]. Second, it returns the binary values in order. But more importantly, in my opinion, it is closer to the spirit of OCaml.
I like to get other ideas!
So here it is...
let rec gen_x acc e1 e2 n = match n with
  | 0 -> acc
  | n ->
    let l = List.map (fun x -> e1 :: x) acc in
    let r = List.map (fun x -> e2 :: x) acc in
    gen_x (l @ r) e1 e2 (n - 1);;
let gen_string = gen_x [[]] "0" "1";;
let gen_int = gen_x [[]] 0 1;;
gen_string 2;;
gen_int 2;;
Result:
[["0"; "0"]; ["0"; "1"]; ["1"; "0"]; ["1"; "1"]]
[[0; 0]; [0; 1]; [1; 0]; [1; 1]]