Finding a sub-trie by edge in OCaml - T9 predictive text implementation


I’m very new to OCaml and am having a difficult time implementing a series of functions to build a T9 predictive text program. For example, if my word is “Dog”, its representation as an integer list would be [3;6;4]. I already have a pattern matching function to relate words to int lists. I’m using the data type trie to map numbers to the possible words:
type ('a, 'b) trie = Node of 'b list * ('a * ('a, 'b) trie) list
A trie with edges labelled with keys of type 'a and nodes labeled with lists of words of type 'b
I need to write a function that takes a trie and an edge label and returns the trie at the end of that edge.
val trie_of_key : ('a, 'b) trie -> 'a -> ('a, 'b) trie = <fun>
How do I traverse the edges to arrive at a given node? Functional programming is still disorienting to me, so I am unsure of the recursive steps needed to arrive at the expected sub-trie.

It seems to me that if you don't want to modify the trie, functional programming is the same as regular old imperative programming. A lookup function that doesn't do any restructuring along the way should be pretty straightforward. Maybe you're just overthinking the problem?
It's hard to say more without seeing an example of what you've tried.
Update
Here's a lookup function I just wrote for a B-Tree structure. There are some similarities to the problem you're trying to solve, so maybe it will give you some ideas.
type ('a, 'b) btree = Node of ('a * 'b) list * ('a, 'b) btree list
let rec lookup bt k =
  match bt with
  | Node ([], _) -> raise Not_found
  | Node (keyvals, subtrees) ->
    let rec look kvs sts =
      match kvs with
      | [] ->
        lookup (List.hd sts) k (* Rightmost subtree *)
      | (hdk, hdv) :: tlkv ->
        if hdk = k then hdv
        else if hdk < k then look tlkv (List.tl sts)
        else lookup (List.hd sts) k
    in
    look keyvals subtrees
I don't think the details are important, but if you're trying to understand the code carefully, it's based on the invariant that a node with no key/value pairs is a leaf node with no subtrees. Otherwise if there are n key/value pairs in the node, there are exactly n + 1 subtrees. (These subtrees can be empty.)
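Applying the same idea to the trie type from the question, a minimal sketch might look like the following. It assumes the edges are stored as an association list, so List.assoc can find the child for a label. The helpers trie_of_keys and words are hypothetical names added for illustration; they are not part of the original question.

```ocaml
type ('a, 'b) trie = Node of 'b list * ('a * ('a, 'b) trie) list

(* Follow a single edge. Raises Not_found when no edge carries the label. *)
let trie_of_key (Node (_, edges)) key = List.assoc key edges

(* Hypothetical helper: follow a whole key sequence, one edge at a time. *)
let trie_of_keys trie keys = List.fold_left trie_of_key trie keys

(* Hypothetical helper: the words stored at a node. *)
let words (Node (ws, _)) = ws

(* A tiny trie in which the key sequence 3, 6, 4 leads to "dog" and "fog". *)
let example =
  Node ([], [ (3, Node ([], [ (6, Node ([], [ (4, Node ([ "dog"; "fog" ], [])) ])) ])) ])
```

No explicit recursion is needed for a single edge, because the node already holds its children in a list. If a missing edge should not raise, List.assoc_opt returns an option instead.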

Related

Filtering integers from list of list in OCaml

I am trying to write a function that filters the positive integers out of a list of lists of integers, returning lists that contain only the negative integers.
For example, if I have a list of lists such as [[-1; 1]; [1]; [-1;-1]] it would return [[-1]; []; [-1;-1]].
I tried to use the filter and transform functions, which were in my textbook.
let rec transform (f:'a -> 'b) (l:'a list) : 'b list =
  begin match l with
  | [] -> []
  | x::tl -> (f x)::(transform f tl)
  end
and for filter, I had previously written:
let rec filter (pred: 'a -> bool) (l: 'a list) : 'a list =
  begin match l with
  | [] -> []
  | x :: tl -> if pred x then x :: (filter pred tl) else filter pred tl
  end
So, using these, I wrote
let filter_negatives (l: int list list) : int list list =
transform (fun l -> (filter(fun i -> i<0)) + l) [] l
but I'm still having trouble fully understanding anonymous functions, and I'm getting error messages which I don't know what to make of.
This function has type ('a -> 'b) -> 'a list -> 'b list
It is applied to too many arguments; maybe you forgot a `;'.

(For what it's worth this transform function is more commonly called map.)
The error message is telling you a simple, true fact. The transform function takes two arguments: a function and a list. You're giving it 3 arguments. So something must be wrong.
The transformation you want to happen to each element of the list is a filtering. So, if you remove the + (which really doesn't make any sense) from your transforming function you have something very close to what you want.
Possibly you just need to remove the [] from the arguments of transform. It's not clear (to me) why it's there.
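Putting those two suggestions together, one possible corrected version is sketched below. It reuses the transform and filter definitions from the question; the parameter name inner is only chosen here to avoid shadowing the outer l.

```ocaml
let rec transform (f : 'a -> 'b) (l : 'a list) : 'b list =
  match l with
  | [] -> []
  | x :: tl -> f x :: transform f tl

let rec filter (pred : 'a -> bool) (l : 'a list) : 'a list =
  match l with
  | [] -> []
  | x :: tl -> if pred x then x :: filter pred tl else filter pred tl

(* transform now gets exactly two arguments: the function and the list. *)
let filter_negatives (l : int list list) : int list list =
  transform (fun inner -> filter (fun i -> i < 0) inner) l
```

The anonymous function is applied to each inner list, and its body filters that inner list down to its negative elements.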

How to find the min and max values of a tree in SML

I have the two datatypes:
datatype 'a Tree = LEAF of 'a | NODE of ('a Tree) * ('a Tree)
and
datatype 'a myTree = myLEAF of 'a | myNODE of 'a * 'a * 'a myTree * 'a myTree
With these two, I need to be able to find the min and max values of a tree.
For example:
findMin (NODE(NODE(LEAF(5),NODE(LEAF(6),LEAF(8))),LEAF(4)))
will produce 4.
I have been working on this for some time now, and I'm quite confused.
Any guidance would be helpful. Thank you.

You know that there is at least one element in every 'a Tree, so there is always a min/max.
Use pattern matching on each of the two constructors LEAF and NODE, and use recursion in the NODE case, since the two branches might have different min/max values and the min/max for the node is determined by whatever is min/max for its branches. And use the built-in helper functions Int.min and Int.max, if you're finding the min/max integers of a tree. (Your example suggests that this is the case.)
fun findMin (LEAF x) = (* ... *)
  | findMin (NODE (leftTree, rightTree)) =
    let (* ... use findMin recursively on each branch ... *)
    in (* ... find the minimal value of the two branches ... *)
    end
I'm not sure what the 'a myTree type is good for: It is a binary tree in that it has two 'a myTree branches per node, but it also has two 'a elements per node? Should you be interested in finding the min/max of either of those values? Or is one a key and another a value in some tree-based dictionary structure? If so, then why is it 'a -> 'a and not 'a -> 'b? It is hard to solve a problem when you don't understand the problem statement, and the datatype is a large portion of that.
Edit: Since you've provided a solution yourself, let me give some feedback on it:
fun findMin (LEAF(v)) = v
  | findMin (NODE(left, right)) =
    if findMin(left) < findMin(right)
    then findMin(left)
    else findMin(right)
This solution is very inefficient, since it calls itself three times at each node: once on each subtree for the comparison, and once more on whichever subtree won. That means the number of function calls roughly follows the recurrence relation f(0) = 1 and f(n) = 3 ⋅ f(n-1), where n is the height of the tree. This works out to 3^n, or exponentially many calls in the height, just to find the minimal element.
Here is a way that takes linear time by temporarily storing the results you use twice:
fun findMin (LEAF v) = v
  | findMin (NODE (left, right)) =
    let val minLeft = findMin left
        val minRight = findMin right
    in if minLeft < minRight then minLeft else minRight
    end
There is no reason to perform the Herculean task of calculating findMin left and findMin right more than once in every node of the tree. Since we refer to them multiple times, a let-in-end is an easy way to bind the results to lexically scoped names, minLeft and minRight.
The expression if minLeft < minRight then minLeft else minRight actually has a name in the standard library: Int.min. So we could do:
fun findMin (LEAF v) = v
  | findMin (NODE (left, right)) =
    let val minLeft = findMin left
        val minRight = findMin right
    in Int.min (minLeft, minRight)
    end
But the reason for using a let-in-end has actually evaporated, since, with the help of a library function, we're now referring to findMin left (aka minLeft) and findMin right (aka minRight) only once. (Actually, we are referring to them more than once, but that is inside Int.min, in which the result has also been bound to a temporary, lexically scoped name.)
So we ditch the let-in-end for a much shorter:
fun findMin (LEAF v) = v
  | findMin (NODE (left, right)) = Int.min (findMin left, findMin right)
In any case, these are all equally optimal: They make exactly one function call per node of the tree (2n - 1 calls for a tree with n elements in its leaves), which is linear and the least you can do when the elements aren't sorted. Now, if you knew the smaller elements were always to the left, you'd have a binary search tree and you could find the min/max much faster. :-)
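To make the difference in call counts observable, here is a small sketch written in OCaml rather than SML. The counter calls and the helper count_calls are hypothetical additions used only for measurement; find_min_naive mirrors the three-call version and find_min mirrors the one-call-per-node version.

```ocaml
type 'a tree = Leaf of 'a | Node of 'a tree * 'a tree

(* Hypothetical counter, only here to observe how many calls are made. *)
let calls = ref 0

(* Mirrors the version that recomputes the winning subtree. *)
let rec find_min_naive t =
  incr calls;
  match t with
  | Leaf v -> v
  | Node (l, r) ->
    if find_min_naive l < find_min_naive r then find_min_naive l
    else find_min_naive r

(* Mirrors the version that visits each node exactly once. *)
let rec find_min t =
  incr calls;
  match t with
  | Leaf v -> v
  | Node (l, r) -> min (find_min l) (find_min r)

(* Run f on t and return the result together with the number of calls. *)
let count_calls f t =
  calls := 0;
  let result = f t in
  (result, !calls)

(* The example tree from the question. *)
let example = Node (Node (Leaf 5, Node (Leaf 6, Leaf 8)), Leaf 4)
```

Calling count_calls find_min_naive example and count_calls find_min example returns the same minimum with different call counts. The gap widens as the tree gets deeper, because the naive version walks one subtree a second time at every level.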
Edit (again): Just for fun, you could find the min/max simultaneously:
fun findMinMax (LEAF v) = (v, v)
  | findMinMax (NODE (left, right)) =
    let val (minLeft, maxLeft) = findMinMax left
        val (minRight, maxRight) = findMinMax right
    in (Int.min (minLeft, minRight), Int.max (maxLeft, maxRight))
    end

I think one problem is that you're thinking hard about how to traverse the tree and check all the nodes in some order and keep track of things, but recursion will handle that for you.
Your tree has two cases; it is either a leaf, or a node with two subtrees.
This suggests that the solution will also have two cases: one for leaves and one for internal nodes.
Write down (in your own words, not code) how you would find the minimum in
  - a leaf; and
  - an internal node if you already knew the respective minimums of its subtrees -- don't worry about how to find them yet, but pretend that you know what they are.
Then write down how you find the minimums of the subtrees of an internal node.
(This is a recursion, so you've already solved this problem, before you started thinking about it.)
Then you translate it into ML.

I was 100% just overthinking the problem too much. Thank you both for your help! I got my answer.
fun findMin (LEAF(v)) = v
  | findMin (NODE(left, right)) =
    if findMin(left) < findMin(right)
    then findMin(left)
    else findMin(right)
fun findMax (LEAF(v)) = v
  | findMax (NODE(left, right)) =
    if findMax(left) > findMax(right)
    then findMax(left)
    else findMax(right)

OCaml option type in binary tree

I have a few problems creating a tree size function with type 'a option tree -> int
type 'a tree = Leaf of 'a
             | Fork of 'a * 'a tree * 'a tree
How would I create a t_opt_size function with type 'a option tree -> int?
I know I would have to use the Some and None constructors.
I have this so far, but it's complicated to match with the option type.
let rec t_size (tr: 'a tree): int =
  match tr with
  | Leaf _ -> 1
  | Fork (_, t1, t2) -> t_size t1 + t_size t2 + 1

I assume from your comments that you want a leaf that looks like (Leaf None) not to be counted in your tree size calculation.
Seems like the key is to split this:
| Leaf _ -> 1
Into two cases:
| Leaf None -> (* Left as exercise *)
| Leaf (Some _) -> (* Left as exercise *)
Since OCaml will take the first match, you can abbreviate this as follows if you like:
| Leaf None -> (* Left as exercise *)
| Leaf _ -> (* Left as exercise *)
You should make a similar change to the Fork case, though I have to say that Fork (None, l, r) doesn't really work for constructing a search tree.
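For reference, one way the split cases could be filled in is sketched below. It assumes that a None value, whether in a Leaf or a Fork, contributes 0 to the size and a Some value contributes 1; that reading is an assumption about what the question wants.

```ocaml
type 'a tree = Leaf of 'a | Fork of 'a * 'a tree * 'a tree

(* Count only the positions that actually hold a value. *)
let rec t_opt_size (tr : 'a option tree) : int =
  match tr with
  | Leaf None -> 0
  | Leaf (Some _) -> 1
  | Fork (None, t1, t2) -> t_opt_size t1 + t_opt_size t2
  | Fork (Some _, t1, t2) -> t_opt_size t1 + t_opt_size t2 + 1
```

Spelling out all four cases lets the compiler's exhaustiveness check confirm that nothing was missed.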

If you want to generalize, you might need to write a generic tree walker which accepts a visitor function. I recommend you try to implement fold_tree, which accepts: (1) a fold function, taking some value, a tree and producing a new result ('a -> 'b t -> 'c), (2) an initial element of type 'a as well as (3) a tree. Then, fold_tree returns a value of type 'c.
Then, you should be able to call fold_tree with a function that skips over None leaves but otherwise increments the count like you did.
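A minimal sketch of such a fold_tree follows. It assumes the fold function receives the accumulator and the current node and returns the new accumulator, so the result type is the same as the type of the initial value; this is one interpretation of the signature described above, not the only possible one.

```ocaml
type 'a tree = Leaf of 'a | Fork of 'a * 'a tree * 'a tree

(* Visit every node in pre-order, threading an accumulator through f. *)
let rec fold_tree f acc t =
  let acc = f acc t in
  match t with
  | Leaf _ -> acc
  | Fork (_, l, r) -> fold_tree f (fold_tree f acc l) r

(* Count the nodes whose value is Some _, skipping the None ones. *)
let t_opt_size t =
  fold_tree
    (fun n node ->
      match node with
      | Leaf None | Fork (None, _, _) -> n
      | Leaf (Some _) | Fork (Some _, _, _) -> n + 1)
    0 t
```

The same fold_tree can express the plain node count as fold_tree (fun n _ -> n + 1) 0 t, which is the point of separating the traversal from what is done at each node.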

If you don't want to count all values in the tree as 1, but each depending on its contents, write a function that determines the count per value and use that:
let weight = function
  | _ -> 1 (* or anything else *)
let rec t_opt_size (tr: 'a tree): int = match tr with
  | Leaf v -> weight v
  | Fork (v, t1, t2) -> t_opt_size t1 + t_opt_size t2 + weight v
You even might want to generalise and pass the weight function as a parameter to t_size instead of writing different size functions that all use their own weighting.
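As a sketch of that generalisation, the weight function can become the first parameter. The name weighted_size is hypothetical, and the two specialisations at the end are only examples of how it might be used.

```ocaml
type 'a tree = Leaf of 'a | Fork of 'a * 'a tree * 'a tree

(* Sum weight v over every value v stored in the tree. *)
let rec weighted_size weight = function
  | Leaf v -> weight v
  | Fork (v, t1, t2) ->
    weighted_size weight t1 + weighted_size weight t2 + weight v

(* Every value counts as 1: the ordinary size. *)
let t_size t = weighted_size (fun _ -> 1) t

(* None counts as 0 and Some _ counts as 1. *)
let t_opt_size t = weighted_size (function None -> 0 | Some _ -> 1) t
```

With this shape, a new notion of size only needs a new weight function rather than a new traversal.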

Convert integer list into tree in F#

I'm new to F# and would like to know how to convert a simple integer list into a tree.
let lst = [1;2;3;4]
type Tree =
  | Leaf of int
  | Node of Tree * Tree
list should convert to tree like this ---> Leaf 1,Node(Leaf 2),Node(Node(Leaf 3,Leaf 4))

The output that you want to get in your question is a bit poorly formatted, but my interpretation is that you are trying to build a balanced binary tree. To do this recursively, you need to split the input list into two halves and then recursively build trees from the left and the right halves.
This is a bit tricky, because splitting a functional list in halves is not that simple. In practice, you could probably turn your data into an array and use that, but if you want a functional solution you can use:
type Tree = Leaf of int | Node of Tree * Tree
let rec half marker acc xs =
  match xs, marker with
  | x::xs, _::_::marker -> half marker (x::acc) xs
  | x::xs, _::[] -> List.rev (x::acc), xs
  | xs, _ -> List.rev acc, xs
The trick in the half function is that it iterates over the list while keeping two copies of it. From one copy (called marker), it drops two elements at each step; from the other (xs), it moves just one element at a time into acc. By the time marker is empty, xs has therefore reached the middle of the original list, and acc holds the first half in reverse order.
Now you can write a simple recursive function to build a tree
let rec makeTree = function
  | [] -> failwith "Does not work on empty lists"
  | [x] -> Leaf x
  | xs -> let l, r = half xs [] xs
          Node(makeTree l, makeTree r)
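Since most of this page is about OCaml, here is the same splitting trick translated from F# as a sketch. The names make_tree, rest and marker' are choices made for this translation, and invalid_arg replaces failwith for the empty-list case.

```ocaml
type tree = Leaf of int | Node of tree * tree

(* Walk xs one step at a time while marker advances two steps at a time. *)
let rec half marker acc xs =
  match xs, marker with
  | x :: rest, _ :: _ :: marker' -> half marker' (x :: acc) rest
  | x :: rest, [ _ ] -> (List.rev (x :: acc), rest)
  | rest, _ -> (List.rev acc, rest)

(* Split the list in the middle and build both halves recursively. *)
let rec make_tree = function
  | [] -> invalid_arg "make_tree: empty list"
  | [ x ] -> Leaf x
  | xs ->
    let l, r = half xs [] xs in
    Node (make_tree l, make_tree r)
```

For a list of odd length, the extra element ends up in the left half, so the left subtree is never smaller than the right one.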

Creating a doubly linked list from a list in OCaml

I am often told that, using the Lazy module in OCaml, one can do everything that is possible in a lazy language such as Haskell. To test this claim, I'm trying to write a function that converts a regular list into a static doubly linked list in OCaml.
type 'a dlist = Dnil | Dnode of 'a dlist * 'a * 'a dlist
Given this type I can create several static doubly linked lists by hand:
let rec l1 = Dnode (Dnil,1,l2)
and l2 = Dnode (l1,2,l3)
and l3 = Dnode (l2,3,Dnil)
but I'd like to write a function of type 'a list -> 'a dlist that given any list builds a static doubly linked list in OCaml. For example given [1;2;3] it should output something equivalent to l1 above.
The algorithm is pretty straightforward to write in Haskell:
data DList a = Dnil | Dnode (DList a) a (DList a)
toDList :: [a] -> DList a
toDList l = go Dnil l
  where
    go _ [] = Dnil
    go h (x:xs) = let r = Dnode h x (go r xs) in r
but I haven't been able to figure out where to place calls to lazy to get this to compile in OCaml.

If you build your linked list in right-to-left order (as for normal lists), then the left element of every node will only be built after that node itself is built. You need to represent this by making the left element lazy, which means "this value will be constructed later":
type 'a dlist =
  | Dnil
  | Dnode of 'a dlist Lazy.t * 'a * 'a dlist
Once you have this, construct every node as a lazy value using a recursive definition which passes the lazy (still unconstructed) node to the function call that builds the next node (so that it has access to the previous node). It's actually simpler than it looks:
let dlist_of_list list =
  let rec aux prev = function
    | [] -> Dnil
    | h :: t -> let rec node = lazy (Dnode (prev, h, aux node t)) in
                Lazy.force node
  in
  aux (Lazy.from_val Dnil) list
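To check that the back links really point at the previous node, the construction can be exercised with a few accessors. In this sketch, value, next and prev are hypothetical helper names, and Lazy.from_val is used to wrap the already-known Dnil.

```ocaml
type 'a dlist =
  | Dnil
  | Dnode of 'a dlist Lazy.t * 'a * 'a dlist

let dlist_of_list list =
  let rec aux prev = function
    | [] -> Dnil
    | h :: t ->
      (* node refers to itself lazily so the next node can point back to it. *)
      let rec node = lazy (Dnode (prev, h, aux node t)) in
      Lazy.force node
  in
  aux (Lazy.from_val Dnil) list

(* Hypothetical accessors for walking the list in both directions. *)
let value = function Dnil -> None | Dnode (_, x, _) -> Some x
let next = function Dnil -> Dnil | Dnode (_, _, r) -> r
let prev = function Dnil -> Dnil | Dnode (l, _, _) -> Lazy.force l
```

Stepping forward with next and then back with prev lands on a node holding the original value, which is the behaviour expected of a doubly linked list.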

You can only build a cyclic immutable strict data structure of a shape that's determined at compile time. I'm not going to define or prove this formally, but intuitively speaking, once the data structure is created, its shape isn't going to change (because it's immutable). So you can't add to a cycle. And if you create any element of the cycle, you need to create all the other elements of the cycle at the same time, because you can't have any dangling pointer.
OCaml can do what Haskell can do, but you do have to get the Lazy module involved! Unlike Haskell's, ML's data structures are strict unless otherwise specified. A lazy data structure has pieces of type 'a Lazy.t. (ML's typing is more precise than Haskell's on that particular issue.) Lazy data structures allow cycles to be built by having provisionally-dangling pointers (whose linked values are automatically created when the pointer is first dereferenced).
type 'a lazy_dlist_value =
  | Dnil
  | Dnode of 'a lazy_dlist * 'a * 'a lazy_dlist
and 'a lazy_dlist = 'a lazy_dlist_value Lazy.t
Another common way to have cyclic data structures is to use mutable nodes. (In fact, die-hard proponents of strict programming might see lazy data structures as a special case of mutable data structures that doesn't break referential transparency too much.)
type 'a mutable_dlist_value =
  | Dnil
  | Dnode of 'a mutable_dlist * 'a * 'a mutable_dlist
and 'a mutable_dlist = 'a mutable_dlist_value ref
Cyclic data structures are mostly useful when they involve at least one mutable component, one function (closure), or sometimes modules. But there'd be no reason for the compiler to enforce that — cyclic strict immutable first-order data structures are just a special case which can occasionally be useful.
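For comparison, the mutable-node approach mentioned above can be sketched as follows. This uses a record with mutable option fields rather than the ref-based type shown earlier; the record type node and its field names are assumptions made for this illustration.

```ocaml
type 'a node = {
  value : 'a;
  mutable prev : 'a node option;
  mutable next : 'a node option;
}

(* Build the nodes left to right, patching each next link after the
   rest of the list has been built. Returns the first node, if any. *)
let dlist_of_list xs =
  let rec build prev = function
    | [] -> None
    | x :: rest ->
      let n = { value = x; prev; next = None } in
      n.next <- build (Some n) rest;
      Some n
  in
  build None xs
```

Because the resulting nodes are cyclic, they should be compared with physical equality (==); structural equality (=) on cyclic values may not terminate.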

type 'a dlist = Dnil | Dnode of 'a dlist Lazy.t * 'a * 'a dlist Lazy.t
let rec of_list list = match list with
    [] -> Dnil
  | x :: [] ->
      let rec single () = Dnode (lazy (single ()), x, lazy (single ()))
      in single ()
  | x :: y -> Dnode (
      lazy (
        of_list (match List.rev list with
            [] | _ :: [] -> assert false
          | x :: y -> x :: List.rev y
        )
      ),
      x,
      lazy (
        of_list (match list with
            [] | _ :: [] -> assert false
          | x :: y -> y @ x :: []
        )
      )
    )
let middle dlist = match dlist with
    Dnil -> raise (Failure "middle")
  | Dnode (_, x, _) -> x
let left dlist = match dlist with
    Dnil -> raise (Failure "left")
  | Dnode (x, _, _) -> Lazy.force x
let right dlist = match dlist with
    Dnil -> raise (Failure "right")
  | Dnode (_, _, x) -> Lazy.force x
