I'm working on a bloom filter in OCaml and I'm really stumped.
First, I define a signature to interact with a bloom filter, so that the bloom filter can be implemented in multiple different ways:
module type memset = sig
type elt (* type of values stored in the set *)
type t (* abstract type used to represent a set *)
val mem : elt -> t -> bool
val empty : t
val is_empty : t -> bool
val add : elt -> t -> t
val from_list : elt list -> t
val union : t -> t -> t
val inter : t -> t -> t
end
The bloom filter currently has two implementations:
SparseSet
SparseSet is implemented by storing all integers using a list.
module SparseSet : (memset with type elt = int) = struct
include Set.Make(struct
let compare = Stdlib.compare
type t = int
end)
let from_list l = List.fold_left (fun acc x -> add x acc) empty l
end
BoolSet
Implements the bloom filter by storing an array of booleans, where an integer is a member of the set if the array entry at its index is true.
module BoolSet : (memset with type elt = int) = struct
type elt = int
type t = bool array
(* implementation details hidden for clarity's sake *)
end
In order to store whether an item exists in a set or not that isn't an integer, I define a hasher signature:
module type hasher = sig
type t (* the type of elements that are being hashed *)
val hashes : t -> int list
end
Finally, I define a Filter functor that accepts a bloom filter implementation and a hasher. To add an item, an item is hashed using three different methods to produce 3 integers. The three integers are stored in the underlying memset module passed to the Filter functor. To check if an item exists in the set, its 3 hashes are obtained, and checked. If all three hash integers exist in the set, the item is contained in the set. The filter functor allows the implementation of the bloom set, and the hash method to be swapped out:
module Filter (S : memset) (H : hasher)
: memset
with type elt = H.t
with type t = S.t = struct
type elt = H.t
type t = S.t
let mem x arr = [] = List.filter (fun y -> not (S.mem y arr)) (H.hashes x)
let empty = S.empty
let is_empty = S.is_empty
let add x arr = List.fold_left (fun acc h -> S.add h acc) arr (H.hashes x)
let from_list l = S.from_list l
let union l1 l2 = S.union l1 l2
let inter l1 l2 = S.inter l1 l2
end
When I try to compile this program I get the following compile-time error that occurs at the mem, add, and from_list functions in the Filter functor:
File "bloom.ml", line 75, characters 66-78:
Error: This expression has type int list but an expression was expected of type
S.elt list
Type int is not compatible with type S.elt
For some reason the type isn't getting passed through correctly in the Filter module. Anyone have any suggestions on how to fix this? I've been tearing my hair out trying to figure it out.
The line
module Filter (S : memset) (H : hasher) = ...
means that the functor should work for any memset, independently of the type of elements S.elt. However, the functor body assumes that S.elt is int, leading to the type error:
Type int is not compatible with type S.elt
You can fix this issue by specifying the type of S.elt in the argument signature:
module Filter (S : memset with type elt = int) (H : hasher) = ...
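For reference, here is a minimal sketch of the whole thing with that constraint applied. StringHasher and its use of Hashtbl.seeded_hash with seeds 1, 2 and 3 are assumptions made up for the example, not part of the question. In this sketch mem is written with List.for_all, add folds into the existing set, and from_list is rebuilt on top of add, since S.from_list expects integers rather than H.t values.

```ocaml
module type memset = sig
  type elt
  type t
  val mem : elt -> t -> bool
  val empty : t
  val is_empty : t -> bool
  val add : elt -> t -> t
  val from_list : elt list -> t
  val union : t -> t -> t
  val inter : t -> t -> t
end

module type hasher = sig
  type t
  val hashes : t -> int list
end

(* Integer set backed by the standard library's balanced tree set. *)
module SparseSet : memset with type elt = int = struct
  include Set.Make (struct
    type t = int
    let compare : int -> int -> int = Stdlib.compare
  end)
  let from_list l = List.fold_left (fun acc x -> add x acc) empty l
end

(* The constraint on S.elt is what lets the body pass ints to S.mem and S.add. *)
module Filter (S : memset with type elt = int) (H : hasher)
  : memset with type elt = H.t and type t = S.t = struct
  type elt = H.t
  type t = S.t
  (* x may be in the set only if every one of its hashes is present. *)
  let mem x s = List.for_all (fun h -> S.mem h s) (H.hashes x)
  let empty = S.empty
  let is_empty = S.is_empty
  (* Add every hash of x to the existing set s. *)
  let add x s = List.fold_left (fun acc h -> S.add h acc) s (H.hashes x)
  let from_list l = List.fold_left (fun acc x -> add x acc) empty l
  let union = S.union
  let inter = S.inter
end

(* Hypothetical hasher: three seeded hashes of a string. *)
module StringHasher : hasher with type t = string = struct
  type t = string
  let hashes s = List.map (fun seed -> Hashtbl.seeded_hash seed s) [1; 2; 3]
end

module StringFilter = Filter (SparseSet) (StringHasher)

let fruit = StringFilter.from_list ["apple"; "pear"]
```

StringFilter.mem "apple" fruit is true here. For a string that was never added the answer is usually false, but a bloom filter can report false positives, so that case is not something to rely on.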
I want to write a comparable set as below.
signature COMPARABLE_SET=
sig
type 'a set
val empty: 'a set
val insert: 'a * 'a set -> 'a set
val member: 'a * 'a set -> bool
end
I need to restrict the elements of the 'a set type to be comparable (i.e., there is a function of type 'a * 'a -> order).
How to achieve it?
If you want to do it in OCaml, this is simply a case for a functor:
First, you need to define the type of your elements:
module type OrderedType = sig
type t
val compare : t -> t -> int
end
And then you'll define a functor on this type:
module MakeComparableSet (Ord : OrderedType) :
sig
type elt = Ord.t
type t
val empty : t
val insert : elt -> t -> t
val member : elt -> t -> bool
end = struct
type elt = Ord.t
type t
let empty = failwith "TODO"
let insert = failwith "TODO"
let member = failwith "TODO"
end
Which is exactly what is made here.
You can see a functor as a function on modules that creates new modules. Here, the functor MakeComparableSet takes a module of signature OrderedType and returns a module that is a set.
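To make the TODO placeholders concrete, here is one possible way to fill them in. The sorted, duplicate-free list representation and the IntSet instance are assumptions chosen to keep the sketch short; a balanced tree would be the usual choice for real use.

```ocaml
module type OrderedType = sig
  type t
  val compare : t -> t -> int
end

module MakeComparableSet (Ord : OrderedType) : sig
  type elt = Ord.t
  type t
  val empty : t
  val insert : elt -> t -> t
  val member : elt -> t -> bool
end = struct
  type elt = Ord.t
  (* Invariant (assumed for this sketch): sorted ascending, no duplicates. *)
  type t = elt list

  let empty = []

  (* Insert x at its sorted position; leave the set unchanged if present. *)
  let rec insert x = function
    | [] -> [x]
    | y :: ys as l ->
      let c = Ord.compare x y in
      if c = 0 then l
      else if c < 0 then x :: l
      else y :: insert x ys

  (* Stop early once the elements become larger than x. *)
  let rec member x = function
    | [] -> false
    | y :: ys ->
      let c = Ord.compare x y in
      c = 0 || (c > 0 && member x ys)
end

(* Applying the functor to an ordering on int yields a set of ints. *)
module IntSet = MakeComparableSet (struct
  type t = int
  let compare : int -> int -> int = Stdlib.compare
end)

let s = IntSet.(insert 3 (insert 1 (insert 3 empty)))
```

Because t is abstract in the result signature, callers can only build sets through empty and insert, so the sorted-list invariant cannot be broken from outside.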
Function specification:
Write a function any_zeroes : int list -> bool that returns true if and only if the input list contains at least one 0
Code:
let any_zeroes l: int list =
List.exists 0 l
Error:
This expression has type int but an expression was expected of type
'a -> bool
I don't know why OCaml is having an issue with the 0 when I marked l to be an int list. If anyone could help me fix the issue this would be greatly appreciated!
Thanks!
So, first of all, you didn't mark l as int list. The syntax:
let any_zeroes l: int list
means that any_zeroes is a function that returns an int list. A correct way to annotate it is the following:
let any_zeroes (l : int list) : bool
Second, annotating something doesn't change the semantics of a program. An annotation is a type constraint that tells the type inference system that you want this type to be unified with whatever you specified. If the type checker can't do this, it will bail out with an error. The type checker doesn't need your constraints; they are mostly added for readability. (I think they are also required by the course that you're taking.)
Finally, the error points you not to the l (which, as you thought, was annotated), but to the 0. The message tells you that List.exists accepts a function of type 'a -> bool as its first argument, but you're trying to feed it 0, which has type int. So the type system tries to unify int with 'a -> bool, and there is no 'a such that int = 'a -> bool, so it doesn't type check. You either need to pass a function, or use List.mem as was suggested by Anton.
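Both fixes mentioned above can be sketched as follows. The name any_zeroes' for the second variant is only there so the two definitions can sit side by side.

```ocaml
(* Variant 1: List.mem checks directly whether a value occurs in the list. *)
let any_zeroes (l : int list) : bool = List.mem 0 l

(* Variant 2: List.exists takes a predicate, so 0 is wrapped in a function. *)
let any_zeroes' (l : int list) : bool = List.exists (fun x -> x = 0) l
```

Both return true for [1; 0; 2] and false for [] or [1; 2].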
The type annotation let any_zeroes l: int list = ... means the type of any_zeroes l is int list; this is not what you mean here.
The correct type annotation related to your specification is:
let any_zeroes
: int list -> bool
= fun l -> List.exists 0 l
The toplevel reports:
= fun l -> List.exists 0 l;;
^
This expression has type int but an expression was expected of type
'a -> bool
Indeed, this expression fails to typecheck because of the type of List.exists:
# List.exists;;
- : ('a -> bool) -> 'a list -> bool = <fun>
The first argument is a predicate, which 0 is not.
A correct implementation is:
let any_zeroes
: int list -> bool
= let is_zero x = x = 0 in
fun l -> List.exists is_zero l
I would like to iterate over all combinations of elements from a list of lists which have the same length but not necessarily the same type. This is like the cartesian product of two lists (which is easy to do in OCaml), but for an arbitrary number of lists.
First I tried to write a general cartesian (outer) product function which takes a list of lists and returns a list of tuples, but that can't work because the input list of lists would not have elements of the same type.
Now I'm down to a function of the type
'a list * 'b list * 'c list -> ('a * 'b * 'c) list
which unfortunately fixes the number of inputs to three (for example). It's
let outer3 (l1, l2, l3) =
let open List in
l1 |> map (fun e1 ->
l2 |> map (fun e2 ->
l3 |> map (fun e3 ->
(e1,e2,e3))))
|> concat |> concat
This works but it's cumbersome since it has to be redone for each number of inputs. Is there a better way to do this?
Background: I want to feed the resulting flat list to Parmap.pariter.
To solve your task for an arbitrary n-tuple we need to use existential types. We could use GADTs, but they are closed by default. Of course we could use open variants, but I prefer a slightly more syntactically heavy but more portable solution with first-class modules (it works because GADTs can be expressed via first-class modules). But enough theory: first of all we need a function that will produce the n_cartesian_product for us, with type 'a list list -> 'a list list
let rec n_cartesian_product = function
| [] -> [[]]
| x :: xs ->
let rest = n_cartesian_product xs in
List.concat (List.map (fun i -> List.map (fun rs -> i :: rs) rest) x)
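As a quick sanity check, here is the same function applied to a homogeneous input. The name pairs is just for the example.

```ocaml
let rec n_cartesian_product = function
  | [] -> [[]]
  | x :: xs ->
    (* All combinations of the remaining lists ... *)
    let rest = n_cartesian_product xs in
    (* ... each prefixed with every element of the first list. *)
    List.concat (List.map (fun i -> List.map (fun rs -> i :: rs) rest) x)

let pairs = n_cartesian_product [[1; 2]; [3; 4]]
(* pairs = [[1; 3]; [1; 4]; [2; 3]; [2; 4]] *)
```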
Now we need to fit different types into one type 'a, and this is where existential types come in. Let's define a signature:
module type T = sig
type t
val x : t
end
Now let's try to write a lifter to this existential:
let int x = (module struct type t = int let x = x end : T)
it has type:
int -> (module T)
Let's extend the example with a few more cases:
let string x = (module struct type t = string let x = x end : T)
let char x = (module struct type t = char let x = x end : T)
let xxs = [
List.map int [1;2;3;4];
List.map string ["1"; "2"; "3"; "4"];
List.map char ['1'; '2'; '3'; '4']
]
# n_cartesian_product xxs;;
- : (module T) list list =
[[<module>; <module>; <module>]; [<module>; <module>; <module>];
[<module>; <module>; <module>]; [<module>; <module>; <module>];
...
Instead of first class modules you can use other abstractions, like objects or functions, if your type requirements allow this (e.g., if you do not need to expose the type t). Of course, our existential is very terse, and maybe you will need to extend the signature.
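As an illustration of extending the signature, here is a sketch in which each packed module also carries a printer, so the combinations can be consumed without exposing t. The signature Showable and the function show are hypothetical names introduced for this example.

```ocaml
(* Hypothetical extension of T: the value travels with its own printer. *)
module type Showable = sig
  type t
  val x : t
  val show : t -> string
end

let int x =
  (module struct type t = int let x = x let show = string_of_int end : Showable)

let string x =
  (module struct type t = string let x = x let show s = s end : Showable)

(* The result type is string, so the hidden type t does not escape. *)
let show (module M : Showable) = M.show M.x

let rec n_cartesian_product = function
  | [] -> [[]]
  | x :: xs ->
    let rest = n_cartesian_product xs in
    List.concat (List.map (fun i -> List.map (fun rs -> i :: rs) rest) x)

let shown =
  n_cartesian_product [List.map int [1; 2]; List.map string ["a"]]
  |> List.map (List.map show)
(* shown = [["1"; "a"]; ["2"; "a"]] *)
```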
I used @ivg's answer but in a version with a GADT. I reproduce it here for reference. In a simple case where only the types float and int can appear in the input lists, first set
type wrapped = Int : int -> wrapped | Float : float -> wrapped
This is a GADT without a type parameter. Then
let wrap_f f = Float f
let wrap_i i = Int i
wrap values into the sum type. On lists of wrapped values we can call n_cartesian_product from @ivg's answer. The result is a list combinations : wrapped list list, which is flat (for the present purposes).
Now to use Parmap, I have e.g. a worker function work : float * int * float * float -> float. To get the arguments out of the wrappers, I pattern match:
combinations |> List.map (function
| [Float f1; Int i; Float f2; Float f3] -> (f1, i, f2, f3)
| _ -> raise (Invalid_argument "wrong parameter number or types"))
to construct the flat list of tuples. This can be finally fed to Parmap.pariter with the worker function work.
This setup is essentially the same as using a regular sum type type wrapsum = F of float | I of int instead of wrapped: since wrapped has no type parameter, the two declarations are equivalent and the pattern matching is the same. In both cases an ill-typed constructor application such as F 1 is rejected at compile time, while a list with the wrong number or order of wrappers, e.g. [F 1.0; I 1; F 2.0], only reaches the catch-all case at runtime.
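The wrap, combine and unwrap steps can be sketched end to end as follows. To keep it self-contained, the sketch uses three inputs instead of four and stops at the flat list of tuples; the input values and the name unwrap are made up for the example, and the final hand-off to Parmap is left out.

```ocaml
type wrapped = Int : int -> wrapped | Float : float -> wrapped

let wrap_f f = Float f
let wrap_i i = Int i

let rec n_cartesian_product = function
  | [] -> [[]]
  | x :: xs ->
    let rest = n_cartesian_product xs in
    List.concat (List.map (fun i -> List.map (fun rs -> i :: rs) rest) x)

(* Turn one combination back into a typed tuple; anything with the wrong
   length or constructor order is only caught here, at runtime. *)
let unwrap = function
  | [Float f1; Int i; Float f2] -> (f1, i, f2)
  | _ -> raise (Invalid_argument "wrong parameter number or types")

let combinations =
  n_cartesian_product
    [ List.map wrap_f [1.0; 2.0];
      List.map wrap_i [10];
      List.map wrap_f [0.5] ]
  |> List.map unwrap
(* combinations = [(1.0, 10, 0.5); (2.0, 10, 0.5)] *)
```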
I have a question about the way SML of New Jersey interprets lists:
Suppose I have a function f(x : 'a, n : int) : 'a list such that f returns a list of n copies of x, e.g. f(2,5) = [2,2,2,2,2], f(9,0) = [].
So then I go into the REPL, and I check f(9,0) = nil, and it returns true. From this, I assumed that you could use list = nil to check whether a list is the empty list. I used this in a function, and it wouldn't run. I ended up learning that the type definitions are different:
sml:121.2-123.10 Error: operator and operand don't agree [equality type required]
operator domain: ''Z * ''Z
operand: 'a list * 'Y list
in expression:
xs = nil
(Where xs was my list). I then learned that the way to check if a list is the empty list is with null list. Why is this so? What's going on with nil? Can someone explain this behavior to me?
I also note that apparently case xs of nil is the same as checking null xs. Does this mean nil is a type?
This is an error related to polymorphism. By default, when an empty list is evaluated, it has the type 'a list. This means that the list can contain elements of any type. If you try to evaluate 1::[], you won't get a type error because of that. This is called polymorphism, it is a feature that allows your functions to take arguments of any type. This can be useful in functions like null, because you don't care about the contents of the list in that case, you only care about its length (in fact, you only care if it's empty or not).
However, you can also have empty lists with different types. You can make your function return an empty int list. In fact, you are doing so in your function.
This is the result in a trivial implementation of your function:
- fun f(x : 'a, n : int) : 'a list =
case n of
0 => []
| _ => x::f(x, n-1);
val f = fn : 'a * int -> 'a list
- f(4,5);
val it = [4,4,4,4,4] : int list
- f(4,0);
val it = [] : int list
As you can see, even if the second argument is 0, your function returns an int list. You should be able to compare it directly with a list of type 'a list.
- it = [];
val it = true : bool
However, if you try to compare two empty lists that have different concrete types (neither of them being the polymorphic 'a list), you should get an error. You can see an example of it below:
- [];
val it = [] : 'a list
- val list1 : int list = [];
val list1 = [] : int list
- val list2 : char list = [];
val list2 = [] : char list
- list1 = [];
val it = true : bool
- list2 = [];
val it = true : bool
- list1 = list2;
stdIn:6.1-6.14 Error: operator and operand don't agree [tycon mismatch]
operator domain: int list * int list
operand: int list * char list
in expression:
list1 = list2
Also, case xs of nil => ... is a way of checking if a list is empty, because nil (which is just another way to write []) is a constructor of type 'a list, and matching on it works for any element type. (Note that case expressions don't directly return a boolean value.) Therefore, nil is not a type; it is a value of the polymorphic type 'a list. Comparing with = is different: = only works on equality types (written ''a), so inside a function whose argument has type 'a list for an arbitrary 'a, xs = nil is rejected with "equality type required", which I think is what is happening in your case. null xs uses pattern matching rather than =, so it has no such restriction.
Let's say I have a list of options:
let opts = [Some 1; None; Some 4]
I'd like to convert these into an option of list, such that:
If the list contains None, the result is None
Otherwise, the various ints are collected.
It's relatively straightforward to write this for this specific case (using Core and the Monad module):
let sequence foo =
let open Option in
let open Monad_infix in
List.fold ~init:(return []) ~f:(fun acc x ->
acc >>= fun acc' ->
x >>= fun x' ->
return (x' :: acc')
) foo;;
However, as the question title suggests, I'd really like to abstract over the type constructor rather than specialising to Option. Core seems to use a functor to give the effect of a higher kinded type, but I'm not clear how I can write the function to be abstracted over the module. In Scala, I'd use an implicit context bound to require the availability of some Monad[M[_]]. I'm expecting that there's no way of implicitly passing in the module, but how would I do it explicitly? In other words, can I write something approximating this:
let sequence (module M : Monad.S) foo =
let open M in
let open M.Monad_infix in
List.fold ~init:(return []) ~f:(fun acc x ->
acc >>= fun acc' ->
x >>= fun x' ->
return (x' :: acc')
) foo;;
Is this something that can be done with first class modules?
Edit: Okay, so it didn't actually occur to me to try using that specific code, and it appears it's closer to working than I'd anticipated! Seems the syntax is in fact valid, but I get this result:
Error: This expression has type 'a M.t but an expression was expected of type 'a M.t
The type constructor M.t would escape its scope
The first part of the error seems confusing, since they match, so I'm guessing the problem is with the second - Is the problem here that the return type doesn't seem to be determined? I suppose it's dependent on the module which is passed in - is this a problem? Is there a way to fix this implementation?
First, here is a self-contained version of your code (using the legacy
List.fold_left of the standard library) for people who don't have
Core at hand and still want to try to compile your example.
module type MonadSig = sig
type 'a t
val bind : 'a t -> ('a -> 'b t) -> 'b t
val return : 'a -> 'a t
end
let sequence (module M : MonadSig) foo =
let open M in
let (>>=) = bind in
List.fold_left (fun acc x ->
acc >>= fun acc' ->
x >>= fun x' ->
return (x' :: acc')
) (return []) foo;;
The error message that you get means (the confusing first line can
be ignored) that the M.t definition is local to the M module, and
must not escape its scope, which it would do with what you're trying
to write.
This is because you are using first-class modules, which let you
abstract over modules but do not allow dependent-looking types, where
the return type depends on the argument's module value, or at least
its path (here M).
Consider this example:
module type Type = sig
type t
end
let identity (module T : Type) (x : T.t) = x
This is wrong. The error message points at (x : T.t) and says:
Error: This pattern matches values of type T.t
but a pattern was expected which matches values of type T.t
The type constructor T.t would escape its scope
What you can do is abstract on the desired type before you abstract on the first-class module T, so that there is no escape anymore.
let identity (type a) (module T : Type with type t = a) (x : a) = x
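Here is a small sketch of how such a function is called. IntType, the Show signature, show_list and ShowInt are hypothetical names added for illustration; note that the packed module is annotated with its full type at the call site.

```ocaml
module type Type = sig
  type t
end

(* The type a is bound before the module, so T.t = a does not escape. *)
let identity (type a) (module T : Type with type t = a) (x : a) = x

module IntType = struct
  type t = int
end

let forty_two = identity (module IntType : Type with type t = int) 42

(* Same pattern with a signature that actually carries an operation. *)
module type Show = sig
  type t
  val show : t -> string
end

let show_list (type a) (module S : Show with type t = a) (xs : a list) =
  String.concat ", " (List.map S.show xs)

module ShowInt = struct
  type t = int
  let show = string_of_int
end

let shown = show_list (module ShowInt : Show with type t = int) [1; 2; 3]
(* shown = "1, 2, 3" *)
```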
This relies on the ability to explicitly abstract over the type variable a. Unfortunately, this feature has not been extended to abstraction over higher-kinded variables. You currently cannot write:
let sequence (type 'a m) (module M : MonadSig with type 'a t = 'a m) (foo : 'a m list) =
...
The solution is to use a functor: instead of working at value level, you work at the module level, which has a richer kind language.
module MonadOps (M : MonadSig) = struct
open M
let (>>=) = bind
let sequence foo =
List.fold_left (fun acc x ->
acc >>= fun acc' ->
x >>= fun x' ->
return (x' :: acc')
) (return []) foo;;
end
Instead of having each monadic operation (sequence, map, etc.) abstract over the monad, you do a module-wide abstraction.
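To show the functor in use, here is a sketch that instantiates it with a hand-written option monad. OptionMonad and OptionOps are names invented for the example. One deliberate deviation: this sequence uses List.fold_right so the collected elements keep their original order, whereas the fold_left version above returns them reversed.

```ocaml
module type MonadSig = sig
  type 'a t
  val bind : 'a t -> ('a -> 'b t) -> 'b t
  val return : 'a -> 'a t
end

module MonadOps (M : MonadSig) = struct
  open M
  let ( >>= ) = bind

  (* Collect the results left to right; order is preserved by folding
     from the right and consing each element onto the accumulated tail. *)
  let sequence foo =
    List.fold_right (fun x acc ->
      x >>= fun x' ->
      acc >>= fun acc' ->
      return (x' :: acc')
    ) foo (return [])
end

(* Hypothetical instance: the option monad written by hand. *)
module OptionMonad = struct
  type 'a t = 'a option
  let bind m f = match m with None -> None | Some x -> f x
  let return x = Some x
end

module OptionOps = MonadOps (OptionMonad)

let all_present = OptionOps.sequence [Some 1; Some 4]
(* all_present = Some [1; 4] *)

let one_missing = OptionOps.sequence [Some 1; None; Some 4]
(* one_missing = None *)
```

Any other module matching MonadSig can be passed to MonadOps in the same way, which is the explicit counterpart of the implicit Monad[M[_]] context bound mentioned in the question.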