Generating C code in OCaml

I'm trying to create a code-generating DSL in OCaml, but I can't find many examples of what the code generation looks like. I would just like to see how to create code values in OCaml.
For example, if I had a type like this:
type equation =
  | Add of int * int
  | Sub of int * int
  | Mul of int * int
  | Div of int * int;;
and I wanted a function like this:
let write_code = function
  | Add (x, y) -> (* INSERT CODE "x + y" here *)
etc., how would this look?
I have looked at this example http://okmij.org/ftp/meta-programming/tutorial/power.ml but the characters .< >. are causing syntax errors when I try to compile.
The code generated will not need to be compiled or executed, but saved to a .c file for later use.
I would just like to see the basic structure for this simple example so I can apply it to a more complicated problem.

You can do it like this:
type equation =
  | Const of int
  | Var of string
  | Add of equation * equation
  | Mul of equation * equation ;;
let rec c_string_of_equation = function
  | Const i -> string_of_int i
  | Var x -> x
  | Add (e1, e2) ->
    "(" ^ c_string_of_equation e1 ^ ") + (" ^ c_string_of_equation e2 ^ ")"
  | Mul (e1, e2) ->
    "(" ^ c_string_of_equation e1 ^ ") * (" ^ c_string_of_equation e2 ^ ")"
;;
This produces a string, which you can then write wherever you want.
I changed your expression type a bit to make it more general.
The resulting string contains more parentheses than necessary, but that does not matter because the generated code is meant for a compiler, not for humans.
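Since the question asks for the text to end up in a .c file, here is a minimal sketch of that last step. The helper names c_function and write_c_file, the generated function name, and its single int parameter x are assumptions made for illustration, not part of the answer above.

```ocaml
type equation =
  | Const of int
  | Var of string
  | Add of equation * equation
  | Mul of equation * equation

let rec c_string_of_equation = function
  | Const i -> string_of_int i
  | Var x -> x
  | Add (e1, e2) ->
    "(" ^ c_string_of_equation e1 ^ ") + (" ^ c_string_of_equation e2 ^ ")"
  | Mul (e1, e2) ->
    "(" ^ c_string_of_equation e1 ^ ") * (" ^ c_string_of_equation e2 ^ ")"

(* Hypothetical helper: wrap the expression in a C function of one int x. *)
let c_function name e =
  Printf.sprintf "int %s(int x) {\n  return %s;\n}\n" name
    (c_string_of_equation e)

(* Hypothetical helper: write the generated text to the file at [path]. *)
let write_c_file path e =
  let oc = open_out path in
  output_string oc (c_function "f" e);
  close_out oc
```

Calling write_c_file "out.c" (Add (Const 1, Mul (Var "x", Const 2))) would then save a small C function whose body returns (1) + ((x) * (2)).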

You could use a Buffer.
As the module's documentation puts it:
This module implements buffers that automatically expand as necessary. It provides accumulative concatenation of strings in quasi-linear time (instead of quadratic time when strings are concatenated pairwise).
For example, you can write:
type equation =
  | Add of int * int
  | Sub of int * int
  | Mul of int * int
  | Div of int * int;;
(* filename is the path of the output file, assumed to be defined elsewhere *)
let co = open_out filename
let buff = Buffer.create 11235
let write_code = function
  | Add (x, y) -> Buffer.add_string buff (Printf.sprintf "%d + %d" x y)
  | ... -> ...
let write c =
  write_code c;
  Buffer.output_buffer co buff
With these signatures:
# Buffer.create;;
- : int -> Buffer.t = <fun>
# Buffer.add_string;;
- : Buffer.t -> string -> unit = <fun>
# Buffer.output_buffer;;
- : out_channel -> Buffer.t -> unit = <fun>
Note that Buffer.add_string appends the string to the end of the buffer ;-)
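A complete variant of the same idea might look like the sketch below. It passes the buffer as an argument instead of using a global, and it uses Printf.bprintf, which formats directly into a Buffer.t. The names generate and save and the one-equation-per-line layout are assumptions for illustration.

```ocaml
type equation =
  | Add of int * int
  | Sub of int * int
  | Mul of int * int
  | Div of int * int

(* Append the C text for one equation, followed by a newline, to [buf]. *)
let write_code buf = function
  | Add (x, y) -> Printf.bprintf buf "%d + %d\n" x y
  | Sub (x, y) -> Printf.bprintf buf "%d - %d\n" x y
  | Mul (x, y) -> Printf.bprintf buf "%d * %d\n" x y
  | Div (x, y) -> Printf.bprintf buf "%d / %d\n" x y

(* Build the whole text in memory. *)
let generate eqs =
  let buf = Buffer.create 256 in
  List.iter (write_code buf) eqs;
  buf

(* Write the text to [path] in one go; [path] is supplied by the caller. *)
let save path eqs =
  let oc = open_out path in
  Buffer.output_buffer oc (generate eqs);
  close_out oc
```

Keeping the buffer local to generate means each call starts from an empty buffer, so repeated calls do not accumulate earlier output.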

Related

Why does OCaml keep misunderstanding the type?

module Value =
struct
type t = Int of int
end
module M = Map.Make(String)
type expr =
| Num of int
| Add of expr * expr
type t = Value.t M.t (* Value.t is Int of int *)
let rec add_map (st: string list) (e: expr list) (s: t): t =
match st with
| [] -> s
| s1::st ->
match e with
| e1::e ->
M.add s1 e1 s;
add_map st e s;;
In the function above, e is a list of the user-defined type expr, and s is a user-defined map "t = Int M.t" which stores ints under string keys. The problem is that when I compile this, the error says that the type of e1 is t = t M.t, and I need expr M.t. Clearly e1 is an element of an expr list, so why does OCaml think it is t? I know M.add needs (M.add string expr map).
You didn't show the exact error message, but there is a problem with your call to M.add: the map s has type Value.t M.t, but you are giving it a value of type expr, not Value.t.
You have a Map type t that maps strings to Value.t values. But in your add_map function, you're adding values of type expr to the map.
You need to map values of type expr to Value.t:
let rec expr_to_value_t = function
  | Num n -> Value.Int n
  | Add (e1, e2) ->
    let Value.Int n1 = expr_to_value_t e1 in
    let Value.Int n2 = expr_to_value_t e2 in
    Value.Int (n1 + n2)
let rec add_map (st: string list) (e: expr list) (s: t): t =
  match st with
  | [] -> s
  | s1::st ->
    match e with
    | e1::e ->
      M.add s1 (expr_to_value_t e1) s;
      add_map st e s
However, while this compiles, it produces warnings about non-exhaustive pattern matching, and worse, M.add s1 (expr_to_value_t e1) s in this context doesn't do anything. Maps in OCaml are functional data structures. You don't mutate them, but rather transform them. M.add doesn't modify s, it just creates a new map with an additional binding.
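A small sketch of that behaviour, using a map from strings to ints (the names original and extended are made up for this illustration):

```ocaml
module M = Map.Make (String)

(* M.add returns a new map; the map it was given is left untouched. *)
let original = M.empty
let extended = M.add "hello" 23 original

let () =
  assert (M.is_empty original);           (* still empty *)
  assert (M.find "hello" extended = 23)   (* binding lives in the new map *)
```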
You can overcome this with relatively few modifications to your function.
let rec add_map (st: string list) (e: expr list) (s: t): t =
  match st with
  | [] -> s
  | s1::st ->
    match e with
    | e1::e ->
      let s = M.add s1 (expr_to_value_t e1) s in
      add_map st e s
Here I've shadowed the original s binding with the new map which is used in the recursive call to add_map. Testing this:
utop # add_map ["hello"; "world"] [Num 23; Num 42] M.empty |> M.bindings;;
- : (string * Value.t) list =
[("hello", Value.Int 23); ("world", Value.Int 42)]
This would be a great place to use List.fold_left2, provided both lists have the same length; otherwise Invalid_argument is raised.
let add_map st e s =
  List.fold_left2 (fun m a b -> M.add a (expr_to_value_t b) m) s st e
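If raising on lists of different lengths is not wanted, one alternative is to match on both lists at once, which also makes the pattern match exhaustive. This is only a sketch: the helper eval and the choice to stop silently when either list runs out are assumptions, not something the question specifies.

```ocaml
module Value = struct
  type t = Int of int
end

module M = Map.Make (String)

type expr =
  | Num of int
  | Add of expr * expr

type t = Value.t M.t

(* Hypothetical helper: evaluate an expression to a plain int. *)
let rec eval = function
  | Num n -> n
  | Add (a, b) -> eval a + eval b

let rec add_map (st : string list) (e : expr list) (s : t) : t =
  match st, e with
  | s1 :: st, e1 :: e -> add_map st e (M.add s1 (Value.Int (eval e1)) s)
  | _, _ -> s (* assumption: stop as soon as either list is exhausted *)
```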

OCaml Function to Perform Differentiation

I'm currently studying the language OCaml, and was solving an exercise problem when I came across a question that I can't seem to wrap my head around. Here's the question:
"Write a function differentiate : expression * string -> expression that receives an algebraic equation and a string as an argument, and returns the differentiated version of the argument equation.
For example, diff (Add [Mult [Int 3 ; Exp("x", 2)] ; Mult [Int 6 ; Variable "x"]], "x") should produce the result:
Add [Mult [Int 6 ; Variable "x"] ; Int 6]"
Here's the code that I wrote:
type expression =
| Int of int
| Variable of string
| Exponent of string * int
| Mult of expression list
| Add of expression list
let rec differentiate : expression * string -> expression
= fun (exp, x) ->
match exp with
| Int a -> Int 0
| Variable a -> if (a = x) then Int 1 else Variable a
| Exponent (a, b) -> if (a = x) then
match b with
| 2 -> Mult [Int 2; Variable a]
| _ -> Mult [Int b; Exponent (a, b - 1)]
else Int 0
| Mult [Int a; Int b] -> Const (a * b)
| Mult (Int a::[Variable b]) -> Mult (Int a::[differentiate (Variable b, x)])
| Mult (Int a::[Exponent (e1, e2)]) -> Mult (Int a::[differentiate (Exponent (e1, e2),
x)])
| Mult (Int a::[Mult (Int b :: l)]) -> Mult (Int (a * b) :: l)
| Add l -> match l with
| [] -> l
| hd::tl -> Add ((differentiate (hd, x)) :: tl)
;;
My algorithm is basically performing rigorous pattern matching. More specifically, for Mult, the first element is always an integer, so I performed pattern matching on the second element. For Add, my plan was to write the function so that it performs the function differentiate on each element. Here are the specific problems I would like to ask about.
1. This code actually gives me an error on the Add l portion of pattern matching. The error message states: Error: This expression has type (expression list) but an expression was expected of type (expression). As far as my understanding reaches, I am certain that Add l is an expression type, not an expression list type. Why is this error message produced?
2. I am not sure how to perform recursion in this specific example. My initial thought is that the function should only execute once each, otherwise the result would consist mainly of Int 0's or Int 1's. Please correct me if I'm wrong.
Any feedback is greatly appreciated. Thank you!
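One possible reading of the plan described above, as a hedged sketch rather than a reference solution: every branch returns an expression (in the code above, the [] -> l branch returns the list l itself, which may be what the type error points at), Add differentiates each element with List.map, and Mult uses the product rule. It assumes variables other than x are constants and makes no attempt to simplify the result, so its output is not in the exercise's expected form.

```ocaml
type expression =
  | Int of int
  | Variable of string
  | Exponent of string * int
  | Mult of expression list
  | Add of expression list

let rec differentiate : expression * string -> expression =
 fun (exp, x) ->
  match exp with
  | Int _ -> Int 0
  | Variable a -> if a = x then Int 1 else Int 0
  | Exponent (a, n) ->
    if a = x then Mult [Int n; Exponent (a, n - 1)] else Int 0
  (* Sum rule: differentiate every term. *)
  | Add l -> Add (List.map (fun e -> differentiate (e, x)) l)
  (* The empty product is 1, whose derivative is 0. *)
  | Mult [] -> Int 0
  | Mult [e] -> differentiate (e, x)
  (* Product rule: (hd * rest)' = hd' * rest + hd * rest'. *)
  | Mult (hd :: tl) ->
    Add [Mult (differentiate (hd, x) :: tl);
         Mult [hd; differentiate (Mult tl, x)]]
```

A separate simplification pass (dropping Int 0 terms, folding constants) would be needed to reach a form like the one in the exercise statement.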

When is OCaml's warning 27 "Innocuous unused variable" useful?

This is the description of warning 27 from the OCaml manual:
27 Innocuous unused variable: unused variable that is not bound with let nor as, and doesn't start with an underscore (_) character.
This warning is turned on by jbuilder --dev, and I'm curious to know in which cases people find it useful. For me, it's an annoyance to get warnings when I write code like this:
$ utop -w +27
utop # fun (x, y) -> x;;
Characters 8-9:
Warning 27: unused variable y.
- : 'a * 'b -> 'a = <fun>
or like that:
utop # let error loc msg = failwith (loc ^ ": " ^ msg);;
val error : string -> string -> 'a = <fun>
utop # let rec eval = function
| `Plus (loc, a, b) -> eval a + eval b
| `Minus (loc, a, b) -> eval a - eval b
| `Star (loc, a, b) -> eval a * eval b
| `Slash (loc, a, b) ->
let denom = eval b in
if denom = 0 then
error loc "division by zero"
else
eval a / denom
| `Int (loc, x) -> x
;;
Characters 33-36:
Warning 27: unused variable loc.
Characters 73-76:
Warning 27: unused variable loc.
Characters 112-115:
Warning 27: unused variable loc.
Characters 287-290:
Warning 27: unused variable loc.
val eval :
([< `Int of 'b * int
| `Minus of 'c * 'a * 'a
| `Plus of 'd * 'a * 'a
| `Slash of 'e * 'a * 'a
| `Star of 'f * 'a * 'a ]
as 'a) ->
int = <fun>
I know that prepending an underscore to the identifiers as in _loc suppresses the warnings, but it's not compatible with my notions that:
variables starting with an underscore are ugly and are meant for use in generated code, hidden from the programmer;
a name given to something should not have to change based on how it's used (including unused).
Using underscores, the code becomes:
(* Here we have _loc or loc depending on whether it's used. *)
let rec eval = function
| `Plus (_loc, a, b) -> eval a + eval b
| `Minus (_loc, a, b) -> eval a - eval b
| `Star (_loc, a, b) -> eval a * eval b
| `Slash (loc, a, b) ->
let denom = eval b in
if denom = 0 then
error loc "division by zero"
else
eval a / denom
| `Int (_loc, x) -> x
or
(* Here it can be hard to know what _ stands for. *)
let rec eval = function
| `Plus (_, a, b) -> eval a + eval b
| `Minus (_, a, b) -> eval a - eval b
| `Star (_, a, b) -> eval a * eval b
| `Slash (loc, a, b) ->
let denom = eval b in
if denom = 0 then
error loc "division by zero"
else
eval a / denom
| `Int (_, x) -> x
It is very useful in monadic code, where instead of ordinary let bindings you're forced to use the monadic bind operator >>=. Basically, where
let x = something in
code
translates to
something >>= fun x ->
code
If x is not used in code, the former produces a warning by default, while the latter is flagged only when warning 27 is enabled. Enabling this warning revealed lots of bugs for us. For example, it showed us that this code is buggy :)
Another source of use cases is higher-order functions, i.e., map, fold, etc. It captures one of the most common bugs:
let bug xss ~init =
  List.fold xss ~init ~f:(fun acc xs ->
    (* bug: the inner fold starts from init, so the outer acc is never used *)
    List.fold xs ~init ~f:(fun acc x -> x :: acc))
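To make the consequence of this kind of bug concrete, here is a sketch using the standard library's List.fold_left rather than the labeled List.fold above. The function names are made up for illustration. Compiling with -w +27 should flag the unused outer acc in collect_buggy.

```ocaml
(* Intended: collect every element of a list of lists, in reverse order. *)
let collect_buggy xss =
  List.fold_left
    (fun acc xs ->
      (* Bug: the inner fold restarts from [] and ignores the outer acc. *)
      List.fold_left (fun acc x -> x :: acc) [] xs)
    [] xss

(* Fixed: the inner fold continues from the outer accumulator. *)
let collect_fixed xss =
  List.fold_left
    (fun acc xs -> List.fold_left (fun acc x -> x :: acc) acc xs)
    [] xss
```

With the input [[1; 2]; [3]], the buggy version keeps only the last inner list, while the fixed one returns all three elements.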
Concerning the ugliness, I totally agree that underscores are ugly, but in most cases that is their main purpose: to highlight suspicious code. As for the example you're showing, in modern OCaml it can easily be addressed with inline records, e.g.,
type exp =
| Plus of {loc : loc; lhs : exp; rhs: exp}
| ...
so that instead of using the underscores, you can just omit the unused field,
let rec eval = function
| Plus {lhs; rhs} -> eval lhs + eval rhs
You can use the same approach without using inline records by sparing some extra space in your program and defining all those records separately. The real-world example.
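A fuller sketch of the inline-record version of the evaluator from the question could look like this. The Int and Slash constructors, the value field, and loc being a string are assumptions for illustration. The trailing "; _" marks the omitted fields explicitly, which keeps the code quiet if warning 9 (missing fields in a record pattern) is also enabled.

```ocaml
type loc = string

type exp =
  | Int of {loc : loc; value : int}
  | Plus of {loc : loc; lhs : exp; rhs : exp}
  | Slash of {loc : loc; lhs : exp; rhs : exp}

let rec eval = function
  | Int {value; _} -> value
  | Plus {lhs; rhs; _} -> eval lhs + eval rhs
  (* Only the branch that reports an error names the loc field. *)
  | Slash {loc; lhs; rhs} ->
    let denom = eval rhs in
    if denom = 0 then failwith (loc ^ ": division by zero")
    else eval lhs / denom
```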
For me, this warning is a useful reminder to make my intention more explicit. If we take your example:
fun (x, y) -> x;;
Your intention is to use only the first element. If we rewrite it this way:
fun (x, _ ) -> x;;
You use pattern matching in the parameter to make your code more concise, but you also state your intention of using only the first element. The added value in this example is small because the implementation is so simple, but in real-life functions this warning promotes a good coding habit.

Changing an expression to string

I need to convert an arithmetic expression that uses this type:
type expr =
VarX
| VarY
| Sine of expr
| Cosine of expr
| Average of expr * expr
| Times of expr * expr
| Thresh of expr * expr * expr * expr
Here are the definitions for everything in it:
e ::= x | y | sin (pi*e) | cos (pi*e) | ((e + e)/2) | e * e | (e<e ? e : e)
I need to convert something like this:
exprToString (Thresh(VarX,VarY,VarX,(Times(Sine(VarX),Cosine(Average(VarX,VarY))))));;
to this:
string = "(x<y?x:sin(pi*x)*cos(pi*((x+y)/2)))"
I know I have to do this recursively by matching each expr with its appropriate string, but I'm not sure where the function begins matching or how to recurse through it. Any help or clues would be appreciated.
Here is a simplified version of what you probably want:
type expr =
| VarX
| Sine of expr
let rec exprToString = function
| VarX -> "x"
| Sine e -> "sin(" ^ exprToString e ^ ")"
let () = print_endline (exprToString (Sine (Sine (Sine (Sine VarX)))))
It recurses over the AST nodes and creates the string representation of the input by concatenating the string representations of the nodes.
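Extending the same recursion to the full type from the question might look like the sketch below. The spacing-free output format is taken from the expected string in the question. Everything else follows the grammar given there.

```ocaml
type expr =
  | VarX
  | VarY
  | Sine of expr
  | Cosine of expr
  | Average of expr * expr
  | Times of expr * expr
  | Thresh of expr * expr * expr * expr

(* One case per constructor; each case recurses into its sub-expressions. *)
let rec exprToString = function
  | VarX -> "x"
  | VarY -> "y"
  | Sine e -> "sin(pi*" ^ exprToString e ^ ")"
  | Cosine e -> "cos(pi*" ^ exprToString e ^ ")"
  | Average (a, b) -> "((" ^ exprToString a ^ "+" ^ exprToString b ^ ")/2)"
  | Times (a, b) -> exprToString a ^ "*" ^ exprToString b
  | Thresh (a, b, c, d) ->
    "(" ^ exprToString a ^ "<" ^ exprToString b ^ "?"
    ^ exprToString c ^ ":" ^ exprToString d ^ ")"
```

The matching starts at the outermost constructor of the value passed in, and each recursive call handles one level further down.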
This approach may not work nicely for bigger real-world examples because:
String concatenation (^) creates a new string from two; this is slower than using a more appropriate data structure such as Buffer.t.
It produces too many parentheses, e.g., (2*(2*(2*2))) rather than 2*2*2*2. If you want to minimize the number of parentheses, your algorithm must be aware of operator precedence and associativity.
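As a rough sketch of precedence-aware printing, consider a small hypothetical arithmetic type (not the expr type from the question) with only + and *. A child is parenthesized only when it binds more loosely than its parent. Because both operators here are treated as associative, associativity needs no special handling. A type with - or / would have to parenthesize right-hand children of equal precedence as well.

```ocaml
(* Hypothetical type for illustration only. *)
type arith =
  | Num of int
  | Add of arith * arith
  | Mul of arith * arith

(* Precedence levels: a higher number binds tighter. *)
let prec = function
  | Num _ -> 3
  | Mul _ -> 2
  | Add _ -> 1

let rec to_string e =
  (* Parenthesize a child only if it binds more loosely than its parent. *)
  let child c =
    let s = to_string c in
    if prec c < prec e then "(" ^ s ^ ")" else s
  in
  match e with
  | Num n -> string_of_int n
  | Add (a, b) -> child a ^ "+" ^ child b
  | Mul (a, b) -> child a ^ "*" ^ child b
```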

Filtering OCaml list to one variant

So I have a list of stmt (algebraic type) that contain a number of VarDecl within the list.
I'd like to reduce the list from stmt list to VarDecl list.
When I use List.filter I can eliminate all other types but I'm still left with a stmt list.
I found that I was able to do the filtering as well as the type change by folding, but I can't figure out how to generalize it (I need this pattern many places in the project).
let decls = List.fold_left
(fun lst st -> match st with
| VarDecl(vd) -> vd :: lst
| _ -> lst
) [] stmts in
Is there a better way to perform a filter and cast to a variant of the list type?
Assuming you have a type like
type stmt = VarDecl of int | Foo of int | Bar | Fie of string
and a stmt list, Batteries lets you do
let vardecl_ints l =
  List.filter_map (function VarDecl i -> Some i | _ -> None) l
let foo_ints l =
List.filter_map (function Foo i -> Some i | _ -> None) l
which I think is about as concise as you're going to get. I don't think you can make general "list-getters" for ADTs, because e.g.
let bars l =
  List.filter_map (function Bar -> Some Bar | _ -> None) l
https://github.com/ocaml-batteries-team/batteries-included/blob/d471e24/src/batList.mlv#L544 has the Batteries implementation of filter_map, if you don't want the dependency. A functional version with [] instead of dst would be quite similar, only doing (x::dst) and a |> List.rev at the end.
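For reference, the standard library has had its own List.filter_map since OCaml 4.08, so the same pattern works without Batteries. The sketch below shows that, plus a hand-written accumulate-then-reverse version along the lines described above. The names var_decl, decls, and the local filter_map are chosen for this illustration.

```ocaml
type stmt = VarDecl of int | Foo of int | Bar | Fie of string

(* One small projection per constructor of interest. *)
let var_decl = function VarDecl i -> Some i | _ -> None

(* Standard-library version (List.filter_map, OCaml 4.08 or later). *)
let decls stmts = List.filter_map var_decl stmts

(* Hand-written equivalent: accumulate matches, then reverse once. *)
let filter_map f l =
  let rec go acc = function
    | [] -> List.rev acc
    | x :: rest ->
      (match f x with
       | Some y -> go (y :: acc) rest
       | None -> go acc rest)
  in
  go [] l
```

Unlike the List.fold_left version in the question, both of these keep the elements in their original order.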
You could use GADTs or polymorphic variants, but both tend to drive up complexity.
Here's a rough sketch of how you might approach this problem with polymorphic variants:
type constant = [ `Int of int | `String of string ]
type var = [ `Var of string ]
type term = [ constant | var | `Add of term * term ]
let rec select_vars (list : term list) : var list =
match list with
| [] -> []
| (#var as v)::list -> v::select_vars list
| _::list -> select_vars list
let rec select_constants (list : term list) : constant list =
match list with
| [] -> []
| (#constant as k)::list -> k::select_constants list
| _::list -> select_constants list
Another possibility is to pull the bits of a var out into an explicit type of which you can have a list:
type var = {
...
}
type term =
| Int of int
| Var of var
This has some overhead over having the bits just be constructor args, and a var is not a term, so you will likely need to do some wrapping and unwrapping.
It's hard to answer without seeing your type definition (or a simplified version of it).
Note, though, that if you have this definition:
type xyz = X | Y | Z
The values X, Y, and Z aren't types. They're values. Possibly Vardecl is a value also. So you can't have a list of that type (in OCaml).
Update
One thing I have done for cases like this is to use the type projected from the one variant you want:
type xyz = X | Y of int * int | Z
let extract_proj v l =
match v with
| X | Z -> l
| Y (a, b) -> (a, b) :: l
let filter_to_y l =
List.fold_right extract_proj l []
Here's a toplevel session:
type xyz = X | Y of int * int | Z
val extract_proj : xyz -> (int * int) list -> (int * int) list = <fun>
val filter_to_y : xyz list -> (int * int) list = <fun>
# filter_to_y [X; Z; Y(3,4); Z; Y(4,5)];;
- : (int * int) list = [(3, 4); (4, 5)]