How to create an OCaml module and use it? - ocaml

I'm new to this language and technology.
I have a simple question, but I cannot find an answer.
I would like to create my own module containing simple OCaml functions and bindings such as the following:
let rec gcd (m, n) = if m = 0 then n
   else gcd (n mod m, n);;
 
let one = 1;;
let two = 2;;
I would then like to use these functions in other OCaml programs.

Every OCaml source file forms a module named the same as the file (with first character upper case). So one way to do what you want is to have a file named (say) numtheory.ml:
$ cat numtheory.ml
let rec gcd (m, n) = if m = 0 then n
else gcd (n mod m, n)
let one = 1
let two = 2
This forms a module named Numtheory. You can compile it and link into projects. Or you can compile it and use it from the OCaml toplevel:
$ ocamlc -c numtheory.ml
$ ocaml
OCaml version 4.01.0
# #load "numtheory.cmo";;
# Numtheory.one;;
- : int = 1
# Numtheory.gcd (4, 8);;
- : int = 8
(For what it's worth, this is not a correct definition of gcd: gcd (4, 8) should be 4, not 8. The recursive call should be gcd (n mod m, m).)
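A corrected version might look like the sketch below. It assumes Euclid's algorithm is what was intended, and it writes the module inline as module Numtheory = struct ... end only so that the example fits in one file; putting the same definitions in numtheory.ml should give an equivalent module.

```ocaml
(* Inline stand-in for the contents of numtheory.ml. *)
module Numtheory = struct
  (* Euclid's algorithm: the second argument of the recursive call is m, not n. *)
  let rec gcd (m, n) = if m = 0 then n else gcd (n mod m, m)
  let one = 1
  let two = 2
end

(* A client refers to the definitions through the module name. *)
let () =
  Printf.printf "gcd (4, 8) = %d\n" (Numtheory.gcd (4, 8));
  Printf.printf "one + two = %d\n" (Numtheory.one + Numtheory.two)
```

Other source files in the same project can use Numtheory.gcd in the same way once numtheory.ml is compiled and linked with them.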

Related

Logistic Regression in OCaml

I was trying to use Logistic regression in OCaml. I need to use it as a blackbox for another problem I'm solving. I found the following site:
http://math.umons.ac.be/anum/en/software/OCaml/Logistic_Regression/
I pasted the following code (with a few modifications - I defined my own iris_features and iris_label) from this site into a file named logistic_regression.ml:
open Scanf
open Format
open Bigarray
open Lacaml.D
let log_reg ?(lambda=0.1) x y =
(* [f_df] returns the value of the function to maximize and store
its gradient in [g]. *)
let f_df w g =
let s = ref 0. in
ignore(copy ~y:g w); (* g ← w *)
scal (-. lambda) g; (* g = -λ w *)
for i = 0 to Array.length x - 1 do
let yi = float y.(i) in
let e = exp(-. yi *. dot w x.(i)) in
s := !s +. log1p e;
axpy g ~alpha:(yi *. e /. (1. +. e)) ~x:x.(i);
done;
-. !s -. 0.5 *. lambda *. dot w w
in
let w = Vec.make0 (Vec.dim x.(0)) in
ignore(Lbfgs.F.max f_df w);
w
let iris_features = [1 ; 2 ; 3] ;;
let iris_labels = 2 ;;
let proba w x y = 1. /. (1. +. exp(-. float y *. dot w x))
let () =
let sol = log_reg iris_features iris_labels in
printf "w = %a\n" Lacaml.Io.pp_fvec sol;
let nwrongs = ref 0 in
for i = 0 to Array.length iris_features - 1 do
let p = proba sol iris_features.(i) iris_labels.(i) in
printf "Label = %i prob = %g => %s\n" iris_labels.(i) p
(if p > 0.5 then "correct" else (incr nwrongs; "wrong"))
done;
printf "Number of wrong labels: %i\n" !nwrongs
I have the following questions:
When I try to compile the code, I get the error message "Error: Unbound module Lacaml". I've installed Lacaml, run opam init several times, and tried passing the flag -package = Lacaml, but I don't know how to solve this.
As you can see, I've defined my own versions of iris_features and iris_labels. Are the types correct, i.e., in the function log_reg, is x of type int list and y of type int?
Both iris_features and iris_labels are arrays, and array literals in OCaml are delimited by [| and |], e.g.,
let iris_features = [|(* I don't know what to put here*)|]
let iris_labels = [|2|]
The iris_features array has type vec array, i.e., an array of vectors, not an array of integers. I didn't dig deep enough to know what values to put there, but the syntax is the following:
let iris_features =[|
Vec.of_list [1.; 2.; 3.;];
Vec.of_list [4.; 5.; 6.;];
|]
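To make the expected shapes concrete without installing Lacaml, here is a minimal sketch that uses plain float arrays as stand-ins for Lacaml vectors. The helper dot, the sample values, and the assumption that labels are +1 or -1 are mine, not taken from the original program.

```ocaml
(* Plain-array stand-in for Lacaml's dot product (hypothetical helper). *)
let dot w x =
  let s = ref 0. in
  Array.iteri (fun i wi -> s := !s +. wi *. x.(i)) w;
  !s

(* Same formula as proba in the question, on plain arrays. *)
let proba w x y = 1. /. (1. +. exp (-. float_of_int y *. dot w x))

(* One feature vector per sample, one integer label per sample
   (labels assumed to be +1 or -1). *)
let features : float array array = [| [| 1.; 2.; 3. |]; [| 4.; 5.; 6. |] |]
let labels : int array = [| 1; -1 |]

let () =
  let w = [| 0.; 0.; 0. |] in
  Array.iteri
    (fun i x -> Printf.printf "sample %d: p = %g\n" i (proba w x labels.(i)))
    features
```

The point is only the typing: x is an array of vectors and y is an array of integers of the same length, not an int list and a single int.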
The Lacaml interface has changed a bit since the code was written, and axpy no longer accepts a labeled ~x argument (both the x and y vectors are positional now), so you need to remove ~x and fix the argument order. I presume that x.(i) is x in the a*x + y expression and g corresponds to y, e.g.,
axpy ~alpha:(yi *. e /. (1. +. e)) x.(i) g;
This code also depends on lbfgs, so you need to install it as well,
opam depext --install lbfgs
I would suggest using dune as your default build system, but for fast prototyping you can use ocamlbuild. Put your code into an empty folder in a file named regress.ml (you can pick another name, just update the build instructions accordingly). You can then build it into a native executable with
ocamlbuild -pkg lacaml -pkg lbfgs regress.native
run it as
./regress.native
If you're playing in the OCaml toplevel (the interactive interpreter), you can load lacaml and lbfgs using the following directives:
#use "topfind";;
#require "lacaml.top";;
#require "lbfgs";;
(The # is not a prompt but a part of the directive syntax, so don't forget to type it as well).
Now you can copy-paste your code into the interpreter and play with it.
Bonus Track - building with dune
Create an empty folder and put regress.ml there.
Remove open Bigarray and open Scanf, as dune is very strict about warnings and turns them into errors (and it will warn about those lines because they are, in fact, unused).
Create the dune project:
dune init exe regress --libs lacaml,lbfgs
Build and run:
dune exec ./regress.exe

Printing ocaml function after compiling

It's my first day with ocaml. Enjoying it so far. I wanted to figure out if there is a way to print the result of a function. Here's an example based on Project Euler #5.
My code is:
let rec gcd a b =
if b==0 then a
else (gcd b (a mod b));;
let rec myans n anssofar=
if n==1 then anssofar
else (myans (n-1) ((anssofar*(n-1))/(gcd anssofar (n-1))));;
Printf.printf "%d\n" (myans 20 20)
This works fine. I then compile it using:
$ ocamlc -o PE0005 PE0005.ml
And then run it using
$ ./PE0005
And it spits out the answer.
Now, suppose I wanted to work out myans 10 10. It seems perverse to do what I have been doing which is to go back, edit the last line to
Printf.printf "%d\n" (myans 10 10)
and then recompile and rerun. The function has already been defined and compiled. Is there some way I can print out the answer without recompiling?
Any hints and tips are welcome.
One possibility is to run your code in the toplevel (the OCaml read/eval/print loop). This lets you experiment more easily.
$ ocaml
# #use "PE0005.ml";;
val gcd : int -> int -> int = <fun>
val myans : int -> int -> int = <fun>
232792560
- : unit = ()
# myans 10 10;;
- : int = 2520
Another possibility is to rewrite your code to get the arguments from the command line. This is what you would do in practice for a compiled command-line program.
let main () =
  if Array.length Sys.argv < 3 then (
    Printf.eprintf "need two integer arguments\n";
    exit 1
  ) else (
    (* pass both command-line arguments to myans *)
    Printf.printf "%d\n"
      (myans (int_of_string Sys.argv.(1)) (int_of_string Sys.argv.(2)))
  )
let () = main ()
This is how it works when you run it:
$ ocamlc -o PE0005 PE0005.ml
$ ./PE0005 20 20
232792560
$ ./PE0005 10 10
2520
You can use Sys.argv to get command-line arguments and pass the values at run time.
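A slightly more defensive variant is sketched below. It assumes OCaml 4.05 or later for int_of_string_opt, and parse_args is a hypothetical helper name; the definitions of gcd and myans are repeated (with = instead of ==) so the example is self-contained.

```ocaml
let rec gcd a b = if b = 0 then a else gcd b (a mod b)

let rec myans n anssofar =
  if n = 1 then anssofar
  else myans (n - 1) (anssofar * (n - 1) / gcd anssofar (n - 1))

(* parse_args (hypothetical helper) returns both integers or an error message. *)
let parse_args argv =
  if Array.length argv < 3 then Error "need two integer arguments"
  else
    match int_of_string_opt argv.(1), int_of_string_opt argv.(2) with
    | Some n, Some acc -> Ok (n, acc)
    | _ -> Error "arguments must be integers"

let () =
  match parse_args Sys.argv with
  | Ok (n, acc) -> Printf.printf "%d\n" (myans n acc)
  | Error msg -> prerr_endline msg
```

Keeping the parsing in its own function makes it easy to try in the toplevel with a hand-written array. A real command-line tool would probably also call exit 1 in the error branch.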

Generating list of integers in OCaml without recursion

How can I use one of the fold functions to generate a list of integers from 0 to a value n-1? I'm confused about how to get fold_right to return a list rather than returning just an accumulated value.
This is for a helper function that I'm trying to define to solve a larger problem. Here is my attempt:
-I know the base case has to be a list containing only zero, because I do not want to add anything less than zero.
-I know that I need to decrement the value n so that I can put numbers from n-1 to 0 in the list.
let buildList n =
let ibuildList elem list =
list@[n-1]
in List.fold_right ibuildList n [0];;
But I get an error underscoring "n" in the last line saying that the expression has type int but an expression was expected of type 'a list. Isn't n an integer that I'm turning into a list via [n-1]? Where did I go wrong?
Very sorry, I missed at least one step of the reasoning.
A fold is for traversing a collection. Since you want to generate a list and you just have n, not a collection, you can't really use fold in any reasonable way. In fact, what you want to do is more like an unfold. I.e., you want to unfold your n into a list.
It's easy to write this function, but not easy to write it using a fold.
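For comparison, here is a small sketch of what a fold does when it returns a list. The accumulator is simply a list, but the fold still needs an existing list to traverse, which is exactly what is missing when all you have is n.

```ocaml
(* fold_right rebuilding a list: the accumulator is itself a list. *)
let doubled = List.fold_right (fun x acc -> (2 * x) :: acc) [0; 1; 2] []

(* fold_left visits elements left to right, so consing onto the
   accumulator reverses the order. *)
let reversed = List.fold_left (fun acc x -> x :: acc) [] [0; 1; 2]

let () =
  List.iter (Printf.printf "%d ") doubled;
  print_newline ();
  List.iter (Printf.printf "%d ") reversed;
  print_newline ()
```

In both calls the second list-shaped argument is the collection being traversed, which is why passing the integer n there produces the type error from the question.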
Here's an implementation of unfold in OCaml:
let rec unfold_right f init =
match f init with
| None -> []
| Some (x, next) -> x :: unfold_right f next
Here's how to use unfold_right to generate a list of ints:
let range n =
let irange x = if x > n then None else Some (x, x + 1) in
unfold_right irange 1
Here's how it looks when you run range:
# range 0;;
- : int list = []
# range 8;;
- : int list = [1; 2; 3; 4; 5; 6; 7; 8]
# range 5;;
- : int list = [1; 2; 3; 4; 5]
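The question asked for the integers from 0 to n-1 rather than 1 to n. A sketch of that variant follows; range0 and range0_init are hypothetical names, and the second definition assumes OCaml 4.06 or later for List.init.

```ocaml
let rec unfold_right f init =
  match f init with
  | None -> []
  | Some (x, next) -> x :: unfold_right f next

(* Start at 0 and stop before n. *)
let range0 n =
  unfold_right (fun x -> if x >= n then None else Some (x, x + 1)) 0

(* Standard-library alternative; List.init raises Invalid_argument
   when n is negative. *)
let range0_init n = List.init n (fun i -> i)
```

Both definitions return the empty list for n = 0; they differ only for negative n, where range0 returns [] and range0_init raises.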
An alternative version, using the standard Stream module:
(* an infinite stream of natural numbers, starting from 0 *)
let nats =
let rec nats_from n = [< 'n; nats_from (n + 1) >] (* extra syntax *)
in nats_from 0
(* the first n natural numbers: 0, 1, ..., n-1 *)
let range n = Stream.npeek n nats
The piece [< 'n; nats_from (n + 1) >] represents a lazy list with n as its head and the next natural numbers as its tail. Stream.npeek n stream consumes the first n elements of stream and returns them as a list.
Tests with utop:
utop # #load "dynlink.cma";; (* you need these to work with *)
utop # #load "camlp4o.cma";; (* the Stream's syntactic extension *)
utop # range 1;;
- : int list = [0]
utop # range 5;;
- : int list = [0; 1; 2; 3; 4]
utop # range 10;;
- : int list = [0; 1; 2; 3; 4; 5; 6; 7; 8; 9]
If you'd like to compile it, use one of the following commands (you need to use the camlp4o preprocessor):
$ ocamlc -pp camlp4o <filename>.ml
or
$ ocamlopt -pp camlp4o <filename>.ml
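On installations where camlp4 or the Stream module is not available (as far as I know, Stream is no longer part of the standard library in OCaml 5), the same lazy-list idea can be sketched with the standard Seq module, which needs no preprocessor. This assumes OCaml 4.07 or later; take is a hypothetical helper written by hand so the example does not depend on newer Seq functions.

```ocaml
(* An infinite lazy sequence of integers starting from n. *)
let rec nats_from n : int Seq.t = fun () -> Seq.Cons (n, nats_from (n + 1))

(* take (hypothetical helper): the first n elements of a sequence, as a list. *)
let rec take n (s : 'a Seq.t) =
  if n <= 0 then []
  else
    match s () with
    | Seq.Nil -> []
    | Seq.Cons (x, rest) -> x :: take (n - 1) rest

(* the first n natural numbers: 0, 1, ..., n-1 *)
let range n = take n (nats_from 0)
```

Nothing is computed until take forces each element, so the infinite sequence is safe to define.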

How to reduce code clutter in this function?

The function tally below is really simple: it takes a string s as argument, splits it on non-alphanumeric characters, and tallies the numbers of the resulting "words", case-insensitively.
open Core.Std
let tally s =
let get m k =
match Map.find m k with
| None -> 0
| Some n -> n
in
let upd m k = Map.add m ~key:k ~data:(1 + get m k) in
let re = Str.regexp "[^a-zA-Z0-9]+" in
let ws = List.map (Str.split re s) ~f:String.lowercase in
List.fold_left ws ~init:String.Map.empty ~f:upd
I think this function is harder to read than it should be due to clutter. I wish I could write something closer to this (where I've indulged in some "fantasy syntax"):
(* NOT VALID SYNTAX -- DO NOT COPY !!! *)
open Core.Std
let tally s =
let get m k =
match find m k with
| None -> 0
| Some n -> n ,
upd m k = add m k (1 + get m k) ,
re = regexp "[^a-zA-Z0-9]+" ,
ws = map (split re s) lowercase
in fold_left ws empty upd
The changes I did above fall primarily into three groups:
get rid of the repeated let ... in's, consolidated all the bindings (into a ,-separated sequence; this, AFAIK, is not valid OCaml);
got rid of the ~foo:-type noise in function calls;
got rid of the prefixes Str., List., etc.
Can I achieve similar effects using valid OCaml syntax?
Readability is difficult to achieve; it depends heavily on the reader's abilities and familiarity with the code. I'll focus simply on the syntax transformations, but you could perhaps refactor the code into a more compact form, if this is what you are really looking for.
To remove the module qualifiers, simply open them beforehand:
open Str
open Map
open List
You must open them in that order to make sure the List values you are using there are still reachable, and not scope-overridden by the Map ones.
For labelled parameters, you may omit the labels if for each function call you provide all the parameters of the function in the function signature order.
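As a small illustration of the label rules, here is a sketch with a made-up function add_entry (it is not part of Core or the standard library). The warning attribute is there because, as far as I know, recent compilers report warning 6 when labels are omitted.

```ocaml
(* Silence the "label omitted in function application" warning for this demo. *)
[@@@ocaml.warning "-6"]

(* add_entry is a hypothetical function with two labelled parameters. *)
let add_entry ~key ~data entries = (key, data) :: entries

(* Fully labelled call. *)
let a = add_entry ~key:"x" ~data:1 []

(* Full application in signature order: the labels can be dropped. *)
let b = add_entry "x" 1 []

(* Label punning: ~key is short for ~key:key. *)
let c =
  let key = "x" and data = 1 in
  add_entry ~key ~data []

let () = assert (a = b && b = c)
```

Dropping labels only works when every parameter is supplied; a partial application still needs them.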
To reduce the number of let...in constructs, you have several options:
Use a set of rec definitions:
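let tally s =
  let rec get m k =
    match find m k with
    | None -> 0
    | Some n -> n
  and upd m k = add m k (1 + get m k) in
  (* ws applies a function to re, which is not allowed on the right-hand
     side of a let rec, so re and ws keep their own let *)
  let re = regexp "[^a-zA-Z0-9]+" in
  let ws = map lowercase (split re s) in
  fold_left ws empty upd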
Make multiple definitions at once:
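let tally s =
  let get, upd, ws =
    let re = regexp "[^a-zA-Z0-9]+" in
    (* each tuple component is parenthesized so that the fun and match
       bodies do not swallow the commas that follow them *)
    (fun m k ->
      match find m k with
      | None -> 0
      | Some n -> n),
    (fun g m k -> add m k (1 + g m k)),
    map lowercase (split re s)
  in fold_left ws empty (upd get)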
Use a module to group your definitions:
let tally s =
let module M = struct
let get m k =
match find m k with
| None -> 0
| Some n -> n
let upd m k = add m k (1 + get m k)
let re = regexp "[^a-zA-Z0-9]+"
let ws = map lowercase (split re s)
end in fold_left ws empty M.upd
The latter is reminiscent of SML syntax, and perhaps better suited to optimization by the compiler, but it only gets rid of the in keywords.
Please note that since I am not familiar with the Core API, I might have written incorrect code.
If you have a sequence of computations on the same value, OCaml has the |> operator, which takes the value on its left and applies the function on its right to it. This can help you get rid of let and in. As for labeled arguments, you can get rid of them by falling back to the vanilla standard library, which makes your code smaller but less readable. There is also a small piece of sugar for labeled arguments: you can always write f ~key ~data instead of f ~key:key ~data:data. Finally, module names can be removed either with the local open syntax (let open List in ...) or by locally aliasing a module to a shorter name (let module L = List in ...).
Anyway, here is some code that, in my opinion, contains less clutter:
open Core.Std
open Re2.Std
open Re2.Infix
module Words = String.Map
let tally s =
Re2.split ~/"\\PL" s |>
List.map ~f:(fun s -> String.uppercase s, ()) |>
Words.of_alist_multi |>
Words.map ~f:List.length
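For readers without Core or Re2 installed, the same pipeline style can be sketched with the standard library only. This is an illustration under my own assumptions rather than a translation of the code above: is_alnum and words are hypothetical helpers that split on runs of non-alphanumeric ASCII characters, and the counts are keyed by the lowercased word.

```ocaml
module Words = Map.Make (String)

(* True for ASCII letters and digits. *)
let is_alnum c =
  (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')

(* Split s into maximal runs of alphanumeric characters. *)
let words s =
  let buf = Buffer.create 16 and acc = ref [] in
  let flush () =
    if Buffer.length buf > 0 then begin
      acc := Buffer.contents buf :: !acc;
      Buffer.clear buf
    end
  in
  String.iter (fun c -> if is_alnum c then Buffer.add_char buf c else flush ()) s;
  flush ();
  List.rev !acc

(* Count each word, case-insensitively. *)
let tally s =
  words s
  |> List.map String.lowercase_ascii
  |> List.fold_left
       (fun m w ->
         let n = match Words.find_opt w m with Some n -> n | None -> 0 in
         Words.add w (n + 1) m)
       Words.empty
```

For example, Words.bindings (tally "a b A, c-b a") evaluates to [("a", 3); ("b", 2); ("c", 1)].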

How to use List.nth inside a function

I am new to OCaml. I am trying to use List.nth just like List.length but it keeps giving me a syntax error or complains about not matching the interface defined in another file. Everything seems to work fine if I comment out using List.nth
Thanks
It's hard to help unless you show the code that's not working. Here is a session that uses List.nth:
$ ocaml
OCaml version 4.00.0
# let x = [3;5;7;9];;
val x : int list = [3; 5; 7; 9]
# List.nth x 2;;
- : int = 7
#
Here's a session that defines a function that uses List.nth. (There's nothing special about this.)
# let name_of_day k =
List.nth ["Mon";"Tue";"Wed";"Thu";"Fri";"Sat";"Sun"] k;;
val name_of_day : int -> string = <fun>
# name_of_day 3;;
- : string = "Thu"
# 
(As a side comment: using List.nth is often inappropriate. It takes time proportional to n to find the nth element of a list. People just starting with OCaml often think of it like accessing an array--i.e., constant time--but it's not.)
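If indexed access is what you really need, an array is usually the better fit. The sketch below contrasts the two; the function names are made up for the example, and List.nth_opt assumes OCaml 4.05 or later.

```ocaml
let days_list = ["Mon"; "Tue"; "Wed"; "Thu"; "Fri"; "Sat"; "Sun"]
let days_array = Array.of_list days_list

(* Walks the list from the head: time proportional to k. *)
let name_of_day_list k = List.nth days_list k

(* Indexes the array directly: constant time. *)
let name_of_day_array k = days_array.(k)

(* List.nth_opt returns None for an index past the end; it still raises
   on a negative index, hence the explicit check. *)
let safe_name_of_day k = if k < 0 then None else List.nth_opt days_list k
```

Both name_of_day_list and name_of_day_array raise an exception on an out-of-range index, while safe_name_of_day returns None.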