Trie data structure in OCaml

I am trying to build a trie in OCaml:
type ('a, 'b) trie = Nil | Cons of 'a * 'b option * ('a, 'b) trie list;;
(* find place to insert key in a list of tries *)
let rec findInsert key x =
match x with
[] -> Nil
| x::xs -> let Cons(k, _, _) = x in
if key = k then x else findInsert key xs;;
(* insert pair in a trie *)
let rec insert ( key, value ) trie =
match trie with
Nil -> Cons(key, value, [])
| t -> let Cons(k, v, trieList) = t and
subTree = insert (key, value) (findInsert key trieList) and
newSubTree = subTree::trieList in
Cons(k, v, newSubTree);;
But this gives me the following error:
val findInsert : 'a -> ('a, 'b) trie list -> ('a, 'b) trie = <fun>
File "trie.ml", line 15, characters 54-62:
Error: Unbound value trieList
EDIT:
Thanks to Virgile, I now have a program that compiles:
(* insert pair in a trie *)
let rec insert ( key, value ) trie =
match trie with
Nil -> Cons(key, value, [])
| t ->
let Cons(k, v, trieList) = t in
let subTree = insert (key, value) (findInsert key trieList) in
Cons(k, v, subTree::trieList);;
But when I try to run it I get this:
# let t = Cons(3, Some 4, []);;
val t : (int, int) trie = Cons (3, Some 4, [])
# insert (4, Some 5) t;;
Error: This expression has type (int, int) trie/1017
but an expression was expected of type (int, int) trie/1260
What do those numbers represent?

You shouldn't use let x = ... and y = ... in when y depends on x: all identifiers bound by a single let ... and ... are defined simultaneously, so none of them is in scope in the definitions of the others. Use let x = ... in let y = ... in instead, to ensure that x is in scope when defining y.
In your case, this becomes:
let Cons(k, v, trieList) = t in
let subTree = insert (key, value) (findInsert key trieList) in ...
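The scoping difference can be seen in a small standalone example. This is only an illustrative sketch; the names with_and and with_in are made up for it and do not come from the question.

```ocaml
let x = 10

(* `let ... and ...`: the right-hand sides are evaluated before any of the
   new names are bound, so the `x` in `x + 1` is the OUTER x (10). *)
let with_and =
  let x = 1
  and y = x + 1 in
  (x, y)

(* Nested `let ... in`: the second definition sees the x bound just above. *)
let with_in =
  let x = 1 in
  let y = x + 1 in
  (x, y)
```

If the outer x did not exist, the first form would fail with an "Unbound value" error, which is the situation in the question.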

When using the toplevel, if you define the same type twice, OCaml sees two distinct types, not one. As your two types have the same name trie, they are printed as trie/1017 and trie/1260 to tell them apart. If you re-evaluate the type declaration, you must also re-evaluate every other declaration that relies on this type, so that they use the new type and not the old one.
Other remark: you should never write
match foo with
| a -> let PATTERN = a in
you should use this instead:
match foo with
| PATTERN ->
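Putting both remarks together, the two functions from the question could be written roughly as follows. This is a sketch that keeps the question's own logic (so it is not a complete trie implementation); the snake_case name find_insert and the extra Nil case in the list are choices made here, not part of the original code.

```ocaml
type ('a, 'b) trie = Nil | Cons of 'a * 'b option * ('a, 'b) trie list

(* Find the sub-trie whose key matches, or Nil if there is none. *)
let rec find_insert key = function
  | [] -> Nil
  | (Cons (k, _, _) as t) :: rest ->
    if key = k then t else find_insert key rest
  | Nil :: rest -> find_insert key rest

(* Match on the constructor directly instead of `| t -> let Cons ... = t`. *)
let rec insert (key, value) trie =
  match trie with
  | Nil -> Cons (key, value, [])
  | Cons (k, v, children) ->
    let sub = insert (key, value) (find_insert key children) in
    Cons (k, v, sub :: children)
```

Matching on Cons in the pattern itself also removes the "pattern-matching is not exhaustive" warning that let Cons(k, _, _) = x in produces.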

Related

List.assoc using List.find

I want to implement the List.assoc function using List.find, this is what I have tried:
let rec assoc lista x = match lista with
| [] -> raise Not_found
| (a,b)::l -> try (List.find (fun x -> a = x) lista)
b
with Not_found -> assoc l x;;
but it gives me this error:
This expression has type ('a * 'b) list but an expression was expected of type 'a list
The type variable 'a occurs inside 'a * 'b
I don't know if this is something expected to happen or if I'm doing something wrong. I also tried this as an alternative:
let assoc lista x = match lista with
| [] -> raise Not_found
| (a,b)::l -> match List.split lista with
| (l1,l2) -> let ind = find l1 (List.find (fun s -> compare a x = 0))
in List.nth l2 ind;;
where find is a function that returns the index of the element requested:
let rec find lst x =
match lst with
| [] -> raise Not_found
| h :: t -> if x = h then 0 else 1 + find t x;;
with this code the problem is that the function should have type ('a * 'b) list -> 'a -> 'b, but instead it's (('a list -> 'a) * 'b) list -> ('a list -> 'a) -> 'b, so when I try
assoc [(1,a);(2,b);(3,c)] 2;;
I get:
This expression has type int but an expression was expected of type
'a list -> 'a (referring to the first element of the pair inside the list)
I don't understand why I don't get the expected function type.
First off, a quick suggestion on making your assoc function more idiomatic OCaml: have it take the list as the last argument.
Secondly, why are you attempting to implement this in terms of find? It's much easier without.
let rec assoc x lista =
  match lista with
  | [] -> raise Not_found
  | (a, b) :: xs -> if a = x then b else assoc x xs
Something like this is simpler and substantially more efficient with the way lists work in OCaml.
Having the list as the last argument even means we can write this more tersely.
let rec assoc x =
  function
  | [] -> raise Not_found
  | (a, b) :: xs -> if a = x then b else assoc x xs
As to your question, OCaml infers the types of functions from how they're used.
find l1 (List.find (fun s -> compare a x = 0))
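Here List.find is applied to its predicate only, so the second argument of find is a partially applied function of type 'a list -> 'a, not an element of the list. Because find compares that argument with the elements of l1 (if x = h), the keys in l1 must have that same function type, and compare a x forces x to have it too. Hence the inferred type (('a list -> 'a) * 'b) list -> ('a list -> 'a) -> 'b. It's a mess. Try rethinking your function and you'll end up with something much easier to reason about.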
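If the goal really is to build assoc on top of List.find, one possible sketch is to let the predicate inspect the key of each pair and then keep the second component of the pair that was found. The argument order follows the suggestion above (list last).

```ocaml
(* List.find returns the first (key, value) pair whose key equals x,
   or raises Not_found; we then return only the value. *)
let assoc x lista =
  let (_, b) = List.find (fun (a, _) -> a = x) lista in
  b
```

For example, assoc 2 [(1, "a"); (2, "b"); (3, "c")] evaluates to "b". No explicit recursion or try ... with is needed, because List.find already walks the list and raises Not_found itself.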

Inserting values in a Trie

I found this implementation of a Trie in an SML directory:
signature DICT =
sig
type key = string (* concrete type *)
type 'a entry = key * 'a (* concrete type *)
type 'a dict (* abstract type *)
val empty : 'a dict
val lookup : 'a dict -> key -> 'a option
val insert : 'a dict * 'a entry -> 'a dict
val toString : ('a -> string) -> 'a dict -> string
end; (* signature DICT *)
exception InvariantViolationException
structure Trie :> DICT =
struct
type key = string
type 'a entry = key * 'a
datatype 'a trie =
Root of 'a option * 'a trie list
| Node of 'a option * char * 'a trie list
type 'a dict = 'a trie
val empty = Root(NONE, nil)
(* val lookup: 'a dict -> key -> 'a option *)
fun lookup trie key =
let
(* val lookupList: 'a trie list * char list -> 'a option *)
fun lookupList (nil, _) = NONE
| lookupList (_, nil) = raise InvariantViolationException
| lookupList ((trie as Node(_, letter', _))::lst, key as letter::rest) =
if letter = letter' then lookup' (trie, rest)
else lookupList (lst, key)
| lookupList (_, _) =
raise InvariantViolationException
(*
val lookup': 'a trie * char list -> 'a option
*)
and lookup' (Root(elem, _), nil) = elem
| lookup' (Root(_, lst), key) = lookupList (lst, key)
| lookup' (Node(elem, _, _), nil) = elem
| lookup' (Node(elem, letter, lst), key) = lookupList (lst, key)
in
lookup' (trie, explode key)
end
(*
val insert: 'a dict * 'a entry -> 'a dict
*)
fun insert (trie, (key, value)) =
let
(*
val insertChild: 'a trie list * char list * 'a -> 'a trie list
Searches a list of tries to insert the value. If a matching letter
prefix is found, it peels off a letter from the key and calls insert'.
If no matching letter prefix is found, a new trie is added to the list.
Invariants:
* key is never nil.
* The trie list does not contain a Root.
Effects: none
*)
fun insertChild (nil, letter::nil, value) =
[ Node(SOME(value), letter, nil) ]
| insertChild (nil, letter::rest, value) =
[ Node(NONE, letter, insertChild (nil, rest, value)) ]
| insertChild ((trie as Node(_, letter', _))::lst, key as letter::rest, value) =
if letter = letter' then
insert' (trie, rest, value) :: lst
else
trie :: insertChild (lst, key, value)
| insertChild (Root(_,_)::lst, letter::rest, value) =
raise InvariantViolationException
| insertChild (_, nil, _) = (* invariant: key is never nil *)
raise InvariantViolationException
(*
val insert': 'a trie * char list * 'a -> 'a trie
Invariants:
* The value is on the current branch, including potentially the current node we're on.
* If the key is nil, assumes the current node is the destination.
Effects: none
*)
and insert' (Root(_, lst), nil, value) = Root(SOME(value), lst)
| insert' (Root(elem, lst), key, value) = Root(elem, insertChild (lst, key, value))
| insert' (Node(_, letter, lst), nil, value) = Node(SOME(value), letter, lst)
| insert' (Node(elem, letter, lst), key, value) = Node(elem, letter, insertChild (lst, key, value))
in
insert'(trie, explode key, value)
end
(*
val toString: ('a -> string) -> 'a dict -> string
*)
fun toString f trie =
let
val prefix = "digraph trie {\nnode [shape = circle];\n"
val suffix = "}\n"
(* val childNodeLetters: 'a trie list * char list -> char list *)
fun childNodeLetters (lst, id) =
(foldr
(fn (Node(_, letter, _), acc) => letter::acc
| _ => raise InvariantViolationException) nil lst)
(* val edgeStmt: string * string * char -> string *)
fun edgeStmt (start, dest, lbl) =
start ^ " -> " ^ dest ^ " [ label = " ^ Char.toString(lbl) ^ " ];\n"
(* val allEdgesFrom: char list * char list -> string *)
fun allEdgesFrom (start, lst) =
(foldr
(fn (letter, acc) =>
acc ^ edgeStmt(implode(start), implode(start @ [letter]), letter))
"" lst)
(* val labelNode: string * string -> string *)
fun labelNode (id: string, lbl: string) =
id ^ " [ label = \"" ^ lbl ^ "\" ];\n"
fun toString' (Root(elem, lst), id) =
let
val idStr = implode(id)
val childLetters = childNodeLetters(lst, id)
val childStr = foldr (fn (trie, acc) => acc ^ toString'(trie, id)) "" lst
in
(case elem
of SOME(value) =>
labelNode (idStr, f(value)) ^
allEdgesFrom (id, childLetters)
| NONE =>
labelNode (idStr, "") ^
allEdgesFrom (id, childLetters)) ^ childStr
end
| toString' (Node(elem, letter, lst), id) =
let
val thisId = id @ [letter]
val idStr = implode(thisId)
val childLetters = childNodeLetters(lst, thisId)
val childStr = foldr (fn (trie, acc) => acc ^ toString'(trie, thisId)) "" lst
in
(case elem
of SOME(value) =>
labelNode (idStr, f(value)) ^
allEdgesFrom (thisId, childLetters)
| NONE =>
labelNode (idStr, "") ^
allEdgesFrom (thisId, childLetters)) ^ childStr
end
in
prefix ^ (toString' (trie, [#"_", #"R"])) ^ suffix
end
end
Whenever I try to insert or look up a string with this implementation, using the insert and lookup functions above, I get these errors:
stdIn:1.2-1.8 Error: unbound variable or constructor: lookup
stdIn:1.2-1.8 Error: unbound variable or constructor: insert
I think this is a declaration problem, but I am not sure how to fix it.
Why is this happening, and how can I insert or search properly in a trie data structure?
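First off, if you don't hold the rights to this code, you should link to where you found it rather than repeat it, since you provide no attribution. Secondly, the code seems to work fine. Note that lookup and insert are defined inside the Trie structure, so at the top level they must be referred to as Trie.lookup and Trie.insert (or brought into scope with open Trie); the bare names are unbound, which is exactly what the error says. Here I'm inserting a couple of keys and looking them up: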
$ sml trie.sml
Standard ML of New Jersey v110.79 [built: Tue Aug 8 23:21:20 2017]
[opening trie.sml]
signature DICT =
sig
type key = string
type 'a entry = key * 'a
type 'a dict
val empty : 'a dict
val lookup : 'a dict -> key -> 'a option
val insert : 'a dict * 'a entry -> 'a dict
val toString : ('a -> string) -> 'a dict -> string
end
[autoloading]
[library $SMLNJ-BASIS/basis.cm is stable]
[library $SMLNJ-BASIS/(basis.cm):basis-common.cm is stable]
[autoloading done]
exception InvariantViolationException
structure Trie : DICT
- val foo = Trie.insert (Trie.empty, ("foo", 42));
val foo = - : int Trie.dict
- val bar = Trie.insert (foo, ("fab", 43));
val bar = - : int Trie.dict
- Trie.lookup bar "foo";
val it = SOME 42 : int option
- Trie.lookup bar "fab";
val it = SOME 43 : int option
- Trie.lookup bar "wat";
val it = NONE : int option
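For readers following the OCaml threads on this page, the same idea can be sketched in OCaml. This is not a port of the SML code above: it uses a single Node constructor with an association list of children, and the helper names explode, insert_chars and lookup_chars are assumptions made for this sketch.

```ocaml
(* Each node holds an optional value and its children, labelled by a char. *)
type 'a trie = Node of 'a option * (char * 'a trie) list

let empty = Node (None, [])

(* Turn a string into its list of characters. *)
let explode s = List.init (String.length s) (String.get s)

let rec insert_chars chars value (Node (elem, children)) =
  match chars with
  | [] -> Node (Some value, children)          (* end of key: store value *)
  | c :: rest ->
    (* Reuse the child for c if it exists, otherwise start from empty. *)
    let child = try List.assoc c children with Not_found -> empty in
    let child' = insert_chars rest value child in
    Node (elem, (c, child') :: List.remove_assoc c children)

let insert trie (key, value) = insert_chars (explode key) value trie

let rec lookup_chars chars (Node (elem, children)) =
  match chars with
  | [] -> elem
  | c :: rest ->
    (match List.assoc_opt c children with
     | None -> None
     | Some child -> lookup_chars rest child)

let lookup trie key = lookup_chars (explode key) trie
```

Used the same way as the session above, let bar = insert (insert empty ("foo", 42)) ("fab", 43) gives a trie in which "foo" and "fab" are found and "wat" is not.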

How should I check whether a number is in a list in OCaml?

Why is my code wrong?
# let ls = [1;2];;
val ls : int list = [1; 2]
# let inList a l = List.exists a l;;
val inList : ('a -> bool) -> 'a list -> bool = <fun>
# inList 1 ls;;
Error: This expression has type int but an expression was expected of type
'a -> bool
The first parameter of List.exists is a function that returns true if the element is one you're looking for and false if not. You're supplying the int 1, which isn't a function.
You need a function looking_for like this:
let inList a l =
let looking_for x = ... in
List.exists looking_for l
The function looking_for should return true if x is what you're looking for (i.e., if it's equal to a) and false otherwise.
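One way to fill in that predicate, as a sketch: compare each element with a using structural equality. The name in_list is used here only to keep it apart from the inList in the question.

```ocaml
(* looking_for is the predicate List.exists expects: 'a -> bool. *)
let in_list a l =
  let looking_for x = x = a in
  List.exists looking_for l
```

With this definition in_list 1 [1; 2] is true and in_list 3 [1; 2] is false, and the inferred type is 'a -> 'a list -> bool.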
Well, as you can see :
# let inList a l = List.exists a l;;
val inList : ('a -> bool) -> 'a list -> bool
So a is of type 'a -> bool which means that a is a predicate on each element of the list.
What you wanted to write was
let inList a l = List.mem a l
val inList : 'a -> 'a list -> bool
TL;DR RTFM ;-) http://caml.inria.fr/pub/docs/manual-ocaml/libref/List.html

OCaml: the variant type unit has no constructor ::

I'm trying to implement sets through lists.. This is the code with the implementation (I omitted the interface):
module MySet : Set =
struct
type 'a set = 'a list
let empty : 'a set = []
let add (x: 'a) (s: 'a set) : 'a set =
if not(List.mem x s) then x::s
let remove (x: 'a) (s: 'a set) : 'a set =
let rec foo s res =
match s with
| [] -> List.rev res
| y::ys when y = x -> foo ys res
| y::ys -> foo ys (y::res)
in foo s []
let list_to_set (l: 'a list) : 'a set =
let rec foo l res =
match l with
| [] -> List.rev res
| x::xs when member x xs -> foo xs res
| x::xs -> foo xs (x::res)
in foo l []
let member (x: 'a) (s: 'set) : bool =
List.mem x s
let elements (s: 'a set) : 'a list =
let rec foo s res =
match s with
| [] -> List.rev res
| x::xs -> foo xs (x::res)
in foo s []
end;;
This is the error I get
Characters 162-164:
if not(List.mem x s) then x::s
^^
Error: The variant type unit has no constructor ::
I can't understand the error
It's a very confusing message, present since OCaml 4.01, that stems from the fact that you have no else branch and that () is a valid constructor for unit.
Since there is no else branch, the whole if expression must have type unit, and so must the then branch. The type checker therefore tries to unify the expression in the then branch with type unit, and finds that :: is not a constructor of unit.
What you wanted to write is:
if not (List.mem x s) then x :: s else s
Without an else branch, your add function would need to have type 'a -> 'a set -> unit.
The strange error message is tracked in OCaml's issue tracker; see PR 6173.
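As a minimal sketch of the fix in context, here is a cut-down module with only empty, member and add. The Set signature from the question is omitted because it was not shown, and member is defined before add so that add can use it.

```ocaml
module MySet = struct
  type 'a set = 'a list

  let empty : 'a set = []

  let member (x : 'a) (s : 'a set) : bool = List.mem x s

  (* Both branches now have type 'a set, so the if type-checks. *)
  let add (x : 'a) (s : 'a set) : 'a set =
    if member x s then s else x :: s
end
```

Adding the same element twice, as in MySet.add 1 (MySet.add 1 MySet.empty), leaves a single 1 in the set.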

Ocaml - parameter type when checking for duplicates in a list

I've got a basic function which checks a list for duplicates and returns true if they are found, false otherwise.
# let rec check_dup l = match l with
[] -> false
| (h::t) ->
let x = (List.filter h t) in
if (x == []) then
check_dup t
else
true
;;
Yet when I try to use this code I get the error
Characters 92-93:
let x = (List.filter h t) in
^
Error: This expression has type ('a -> bool) list
but an expression was expected of type 'a list
I don't really understand why this is happening, where is the a->bool list type coming from?
The type ('a -> bool) list is coming from the type of filter and from the pattern match h::t in combination. You're asking to use a single element of the list, h, as a predicate to be applied to every element of the list t. The ML type system cannot express this situation. filter expects two arguments, one of some type 'a -> bool where 'a is unknown, and a second argument of type 'a list, where 'a is the same unknown type as in the first argument. So h must have type 'a -> bool and t must have type 'a list.
But you have also written h::t, which means that there is another unknown type 'b such that h has type 'b and t has type 'b list. Put this together and you get this set of equations:
'a -> bool == 'b
'a list == 'b list
The type checker looks at this and decides maybe 'a == 'b, yielding the simpler problem
'a -> bool == 'a
and it can't find any solution, so it bleats.
Neither the simpler form nor the original equation has a solution.
You are probably looking for List.filter (fun x -> x = h) t, and you would probably be even better off using List.exists.
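Since only the existence of an equal element matters, the filtered list never has to be built. A possible sketch using List.mem (which is List.exists specialised to equality):

```ocaml
(* A list has a duplicate if its head occurs in its tail,
   or if the tail itself contains a duplicate. *)
let rec check_dup = function
  | [] -> false
  | h :: t -> List.mem h t || check_dup t
```

This is still quadratic in the length of the list, but it stops at the first duplicate it finds.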
To complete this answer, here is the final function for finding duplicate values in a list:
let lstOne = [1;5;4;3;10;9;5;5;4];;
let lstTwo = [1;5;4;3;10];;
let rec check_dup l = match l with
  | [] -> false
  | h :: t ->
    let x = List.filter (fun x -> x = h) t in
    (* use structural equality (=); == compares physical identity *)
    if x = [] then check_dup t else true;;
And when the function runs:
# check_dup lstOne;;
- : bool = true
# check_dup lstTwo;;
- : bool = false