Cartesian (outer) product of lists in OCaml

I would like to iterate over all combinations of elements from a list of lists which have the same length but not necessarily the same type. This is like the cartesian product of two lists (which is easy to do in OCaml), but for an arbitrary number of lists.
First I tried to write a general cartesian (outer) product function which takes a list of lists and returns a list of tuples, but that can't work because the input list of lists would not have elements of the same type.
Now I'm down to a function of the type
'a list * 'b list * 'c list -> ('a * 'b * 'c) list
which unfortunately fixes the number of inputs to three (for example). It's
let outer3 (l1, l2, l3) =
let open List in
l1 |> map (fun e1 ->
l2 |> map (fun e2 ->
l3 |> map (fun e3 ->
(e1,e2,e3))))
|> concat |> concat
This works but it's cumbersome since it has to be redone for each number of inputs. Is there a better way to do this?
Background: I want to feed the resulting flat list to Parmap.pariter.

To solve your task for an arbitrary n-tuple we need to use existential types. We could use GADTs, but they are closed by default. Of course we could use open variants, but I prefer a slightly more syntactically heavy but more portable solution with first-class modules (this works because GADTs can be expressed via first-class modules). But enough theory. First of all we need a function that will produce the n-ary cartesian product for us, with type 'a list list -> 'a list list:
let rec n_cartesian_product = function
| [] -> [[]]
| x :: xs ->
let rest = n_cartesian_product xs in
List.concat (List.map (fun i -> List.map (fun rs -> i :: rs) rest) x)
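As a quick check of what this function produces, here is a small sketch that applies it to lists of a single type (the input values are made up for illustration):

```ocaml
(* Same definition as above, repeated so the snippet is self-contained. *)
let rec n_cartesian_product = function
  | [] -> [[]]
  | x :: xs ->
    let rest = n_cartesian_product xs in
    List.concat (List.map (fun i -> List.map (fun rs -> i :: rs) rest) x)

(* Every combination takes one element from each input list, in order. *)
let combos = n_cartesian_product [[1; 2]; [10; 20]; [100]]
(* combos = [[1; 10; 100]; [1; 20; 100]; [2; 10; 100]; [2; 20; 100]] *)
```

Note that the product of zero lists is [[]], a single empty combination, which is what makes the recursion work.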
Now we need to fit different types into one type 'a, and this is where existential types come in. Let's define a signature:
module type T = sig
type t
val x : t
end
Now let's try to write a lifter to this existential:
let int x = (module struct type t = int let x = x end : T)
it has type:
int -> (module T)
Let's extend the example with a few more cases:
let string x = (module struct type t = string let x = x end : T)
let char x = (module struct type t = char let x = x end : T)
let xxs = [
List.map int [1;2;3;4];
List.map string ["1"; "2"; "3"; "4"];
List.map char ['1'; '2'; '3'; '4']
]
# n_cartesian_product xxs;;
- : (module T) list list =
[[<module>; <module>; <module>]; [<module>; <module>; <module>];
[<module>; <module>; <module>]; [<module>; <module>; <module>];
...
Instead of first class modules you can use other abstractions, like objects or functions, if your type requirements allow this (e.g., if you do not need to expose the type t). Of course, our existential is very terse, and maybe you will need to extend the signature.
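To illustrate what extending the signature might look like, here is a minimal sketch. The signature name Showable, the to_string member and the show helper are assumptions made for this example; they are not part of the answer above.

```ocaml
(* Hypothetical extended signature: besides the value, each packed
   module also carries a way to print it. *)
module type Showable = sig
  type t
  val x : t
  val to_string : t -> string
end

(* Lifters, analogous to int and string above. *)
let int x = (module struct
  type t = int
  let x = x
  let to_string = string_of_int
end : Showable)

let string x = (module struct
  type t = string
  let x = x
  let to_string s = s
end : Showable)

(* Unpack the module and use only operations whose result type
   does not mention the hidden type t. *)
let show (m : (module Showable)) =
  let module M = (val m) in
  M.to_string M.x

let shown = List.map show [int 1; string "two"]
(* shown = ["1"; "two"] *)
```

The hidden type t never escapes: show returns a plain string, so values of different underlying types can live in the same list.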

I used @ivg's answer but in a version with a GADT. I reproduce it here for reference. In a simple case where only the types float and int can appear in the input lists, first set
type wrapped = Int : int -> wrapped | Float : float -> wrapped
This is a GADT without a type parameter. Then
let wrap_f f = Float f
let wrap_i i = Int i
wrap values into the sum type. On lists of wrapped values we can call n_cartesian_product from @ivg's answer. The result is a list combinations : wrapped list list, which is flat (for the present purposes).
Now to use Parmap, I have e.g. a worker function work : float * int * float * float -> float. To get the arguments out of the wrappers, I pattern match:
combinations |> List.map (function
| [Float f1; Int i; Float f2; Float f3] -> (f1, i, f2, f3)
| _ -> raise (Invalid_argument "wrong parameter number or types"))
to construct the flat list of tuples. This can be finally fed to Parmap.pariter with the worker function work.
This setup is almost the same as using a regular sum type, type wrapsum = F of float | I of int, instead of wrapped. The pattern matching would be the same; the only difference seems to be that a wrong input, e.g. (F 1, I 1, F 2.0, F 3.0), would be detected only at runtime, not at compile time as here.
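Put together, the wrapping approach can be sketched end to end as follows. This is a reduced example with three inputs instead of four and without Parmap; the unwrap helper name and the sample values are assumptions for illustration.

```ocaml
type wrapped = Int : int -> wrapped | Float : float -> wrapped

(* n_cartesian_product as in the earlier answer. *)
let rec n_cartesian_product = function
  | [] -> [[]]
  | x :: xs ->
    let rest = n_cartesian_product xs in
    List.concat (List.map (fun i -> List.map (fun rs -> i :: rs) rest) x)

(* Turn one combination back into a typed tuple; anything with the
   wrong length or the wrong constructors is rejected at runtime. *)
let unwrap = function
  | [Float f1; Int i; Float f2] -> (f1, i, f2)
  | _ -> invalid_arg "wrong parameter number or types"

let inputs = [
  List.map (fun f -> Float f) [0.5; 1.5];
  List.map (fun i -> Int i) [1; 2];
  List.map (fun f -> Float f) [10.0];
]

let tuples = List.map unwrap (n_cartesian_product inputs)
(* tuples = [(0.5, 1, 10.0); (0.5, 2, 10.0); (1.5, 1, 10.0); (1.5, 2, 10.0)] *)
```

The resulting tuples list has type (float * int * float) list, so it can be handed to any function expecting a flat list of argument tuples.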

Related

Subtyping for Yojson element in a yojson list

I get an error about subtyping.
For this code, List.map (fun ((`String goal_feat):> Basic.t) -> goal_feat) (goal_feats_json:> Basic.t list),
I get the following error in VS Code:
This expression cannot be coerced to type
  Yojson.Basic.t =
    [ `Assoc of (string * Yojson.Basic.t) list
    | `Bool of bool
    | `Float of float
    | `Int of int
    | `List of Yojson.Basic.t list
    | `Null
    | `String of string ];
it has type [< `String of 'a ] -> 'b but is here used with type
  [< Yojson.Basic.t ].
While compiling, I get the following error:
Error: Syntax error: ')' expected.
If I change the code to List.map (fun ((`String goal_feat): Basic.t) -> goal_feat) (goal_feats_json:> Basic.t list), which uses an explicit type annotation instead of a coercion, the error disappears. I cannot understand what the problem is with my code when I use subtyping. I would much appreciate any help.
First of all, most likely the answer that you're looking for is
let to_strings xs =
List.map (function `String x -> x | _ -> assert false) (xs :> t list)
The compiler is telling you that your function handles only one case while you're passing it a list that may contain many other things, so there is a possibility of a runtime error. So it is better to indicate to the compiler that you know that only the variants tagged with `String are expected. This is what we did in the example above. Now our function has type [> Yojson.Basic.t].
Now back to your direct question. The syntax for coercion is (expr :> typeexpr); however, in the fun ((`String goal_feat):> Basic.t) -> goal_feat snippet, `String goal_feat is a pattern, and you cannot coerce a pattern, so we shall use a parenthesized pattern with a type constraint here to give it the right, more general, type (see note 1), e.g.,
let exp xs =
List.map (fun (`String x : t) -> x ) (xs :> t list)
This will tell the compiler that the parameter of your function shall belong to a wider type and immediately turn the error into warning 8,
Warning 8: this pattern-matching is not exhaustive.
Here is an example of a case that is not matched:
(`Bool _|`Null|`Assoc _|`List _|`Float _|`Int _)
which says what I was saying in the first part of the post. It is usually a bad idea to leave warning 8 unattended, so I would suggest you to use the first solution, or, otherwise, find a way to prove to the compiler that your list doesn't have any other variants, e.g., you can use List.filter_map for that:
let collect_strings : t list -> [`String of string] list = fun xs ->
List.filter_map (function
| `String s -> Some (`String s)
| _ -> None) xs
A more natural solution would be to return untagged strings, unless you really need them to be tagged, e.g., when you need to pass this list to a function that is polymorphic over [> t]. (Besides, I am using t for Yojson.Basic.t to make the post shorter, but you should use the right name in your code.) So here is the solution that will extract strings and make everyone happy (it will throw away values with other tags):
let collect_strings : t list -> string list = fun xs ->
List.filter_map (function
| `String s -> Some s
| _ -> None) xs
Note that there is no need for type annotations here, and we can easily remove them to get the most general polymorphic type:
let collect_strings xs =
List.filter_map (function
| `String s -> Some s
| _ -> None) xs
It will get the type
[> `String of 'a ] list -> 'a list
which means it takes a list of polymorphic variants with any tags and returns a list of the values that were tagged with the `String tag.
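Because the unannotated version is polymorphic over the tags, it can be tried without Yojson at all. The following sketch uses a hand-written list of polymorphic variants; the sample values are made up for illustration.

```ocaml
let collect_strings xs =
  List.filter_map (function
      | `String s -> Some s
      | _ -> None) xs

(* A list mixing several tags, similar in shape to JSON values. *)
let mixed = [`String "a"; `Int 1; `Null; `String "b"]

(* Only the payloads of the `String tags survive. *)
let strings = collect_strings mixed
(* strings = ["a"; "b"] *)
```

The same function should accept a Yojson.Basic.t list unchanged, since that type is an instance of the open variant type in its signature.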
1) It is not a limitation that coercion doesn't work on patterns; moreover, it wouldn't make any sense to coerce a pattern. The coercion takes an expression with an existing type and upcasts (weakens) it to a supertype. A function parameter is not an expression, so there is nothing here to coerce. You can just annotate it with the type, e.g., fun (x : #t) -> x will say that our function expects values of type [< t], which is less general than the unannotated type 'a. To summarize, coercion is needed when you have a function that accepts a value that has an object or polymorphic variant type, and at some call sites you would like to pass it an expression whose type must first be weakened (upcast), for example
type a = [`A]
type b = [`B]
type t = [a | b]
let f : t -> unit = fun _ -> ()
let example : a -> unit = fun x -> f (x :> t)
Here we have type t with two subtypes a and b. Our function f accepts the base type t, but example is specific to a. In order to be able to use f on a value of type a we need an explicit type coercion to weaken its type to t (we lose type information here). Notice that we do not change the type of x per se, so the following example still type checks:
let rec example : a -> unit = fun x -> f (x :> t); example x
I.e., we weakened the type of the argument to f, but the variable x still has the stronger type a, so we can still use it as a value of type a.

Functor does not properly inherit signature

I'm working on a bloom filter in OCaml and I'm really stumped.
First, I define a signature to interact with a bloom filter, so that the bloom filter can be implemented in multiple different ways:
module type memset = sig
type elt (* type of values stored in the set *)
type t (* abstract type used to represent a set *)
val mem : elt -> t -> bool
val empty : t
val is_empty : t -> bool
val add : elt -> t -> t
val from_list : elt list -> t
val union : t -> t -> t
val inter : t -> t -> t
end
The bloom filter currently has two implementations:
SparseSet
SparseSet is implemented by storing all integers using a list.
module SparseSet : (memset with type elt = int) = struct
include Set.Make(struct
let compare = Pervasives.compare
type t = int
end)
let from_list l = List.fold_left (fun acc x -> add x acc) empty l
end
BoolSet
Implements the bloom filter by storing an array of booleans, where an integer is a member of the set if its corresponding index = true.
module BoolSet : (memset with type elt = int) = struct
type elt = int
type t = bool array
(* implementation details hidden for clarity's sake *)
end
In order to store whether an item exists in a set or not that isn't an integer, I define a hasher signature:
module type hasher = sig
type t (* the type of elements that are being hashed *)
val hashes : t -> int list
end
Finally, I define a Filter functor that accepts a bloom filter implementation and a hasher. To add an item, an item is hashed using three different methods to produce 3 integers. The three integers are stored in the underlying memset module passed to the Filter functor. To check if an item exists in the set, its 3 hashes are obtained, and checked. If all three hash integers exist in the set, the item is contained in the set. The filter functor allows the implementation of the bloom set, and the hash method to be swapped out:
module Filter (S : memset) (H : hasher)
: memset
with type elt = H.t
with type t = S.t = struct
type elt = H.t
type t = S.t
let mem x arr = [] = List.filter (fun y -> not (S.mem y arr)) (H.hashes x)
let empty = S.empty
let is_empty = S.is_empty
let add x arr = List.fold_left (fun acc x -> S.add x acc) empty (H.hashes x)
let add x arr = empty
let from_list l = S.from_list l
let union l1 l2 = S.union l1 l2
let inter l1 l2 = S.inter l1 l2
end
When I try to compile this program I get the following compile-time error that occurs at the mem, add, and from_list functions in the Filter functor:
File "bloom.ml", line 75, characters 66-78:
Error: This expression has type int list but an expression was expected of type
S.elt list
Type int is not compatible with type S.elt
For some reason the type isn't getting passed through correctly in the Filter module. Anyone have any suggestions on how to fix this? I've been tearing my hair out trying to figure it out.
The line
module Filter (S : memset) (H : hasher) = ...
means that the functor should work for any memset, independently of the type of elements S.elt. However, the functor body assumes that S.elt is int, leading to the type error:
Type int is not compatible with type S.elt
You can fix this issue by specifying the type of S.elt in the argument signature:
module Filter (S : memset with type elt = int) (H : hasher) = ...
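For reference, here is a self-contained sketch of the functor with the constrained argument. It uses a cut-down memset signature, and the IntSet and StringHasher modules are hypothetical stand-ins for the question's implementations. The add shown here folds the hashes into the existing set, which is an assumption about the intended behaviour rather than the question's code.

```ocaml
(* Reduced version of the question's signature. *)
module type memset = sig
  type elt
  type t
  val empty : t
  val add : elt -> t -> t
  val mem : elt -> t -> bool
end

module type hasher = sig
  type t
  val hashes : t -> int list
end

(* Hypothetical integer set backing the filter. *)
module IntSet : memset with type elt = int = Set.Make (Int)

(* S.elt is now known to be int, so the hashes can be stored in S. *)
module Filter (S : memset with type elt = int) (H : hasher)
  : memset with type elt = H.t and type t = S.t = struct
  type elt = H.t
  type t = S.t
  let empty = S.empty
  let add x set = List.fold_left (fun acc h -> S.add h acc) set (H.hashes x)
  let mem x set = List.for_all (fun h -> S.mem h set) (H.hashes x)
end

(* Hypothetical hasher producing three integers per string. *)
module StringHasher = struct
  type t = string
  let hashes s = [Hashtbl.hash s; Hashtbl.hash (s ^ "#1"); String.length s]
end

module F = Filter (IntSet) (StringHasher)

let filter = F.add "hello" F.empty
let found = F.mem "hello" filter (* true: all three hashes were stored *)
```

With the constraint in place, H.hashes x : int list matches S.elt list, which is exactly what the original error message complained about.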

OCaml - Expression was expected of type 'b list

I'm trying to write a function that checks whether a set (denoted by a list) is a subset of another.
I already wrote a helper function that gives me the intersection:
let rec intersect_helper a b =
match a, b with
| [], _ -> []
| _, [] -> []
| ah :: at, bh :: bt ->
if ah > bh then
intersect_helper a bt
else if ah < bh then
intersect_helper at b
else
ah :: intersect_helper at bt
I'm trying to use this inside of the subset function (if A is a subset of B, then A = A intersect B):
let subset a_ b_ =
let a = List.sort_uniq a_
and b = List.sort_uniq b_
in intersect_helper a b;;
Error: This expression has type 'a list -> 'a list but an expression was expected of type 'b list
What exactly is wrong here? I can use intersect_helper perfectly fine by itself, but calling it with lists here does not work. From what I know about 'a, it's just a placeholder for the first argument type. Shouldn't the lists also be of type 'a list?
I'm glad you could solve your own problem, but your code seems exceedingly intricate to me.
If I understood correctly, you want a function that tells whether a list is a subset of another list. Put another way, you want to know whether all elements of list a are present in list b.
Thus, the signature of your function should be
val subset : 'a list -> 'a list -> bool
The standard library comes with a variety of functions to manipulate lists.
let subset l1 l2 =
List.for_all (fun x -> List.mem x l2) l1
List.for_all checks that all elements in a list satisfy a given condition. List.mem checks whether a value is present in a list.
And there you have it. Let's check the results:
# subset [1;2;3] [4;2;3;5;1];;
- : bool = true
# subset [1;2;6] [4;2;3;5;1];;
- : bool = false
# subset [1;1;1] [1;1];; (* Doesn't work with duplicates, though. *)
- : bool = true
Remark: A tiny perk of using List.for_all is that it is a short-circuit operator. That means that it will stop whenever an item doesn't match, which results in better performance overall.
Also, since you specifically asked about sets, the standard library has a module for them. However, sets are a bit more complicated to use because they need you to create new modules using a functor.
module Int = struct
type t = int
let compare = Stdlib.compare (* Pervasives.compare on older compilers *)
end
module IntSet = Set.Make(Int)
The extra overhead is worth it though, because now IntSet can use the whole Set interface, which includes the IntSet.subset function.
# IntSet.subset (IntSet.of_list [1;2;3]) (IntSet.of_list [4;2;3;5;1]);;
- : bool = true
List.sort_uniq takes a comparison function as its first argument. Instead of:
let a = List.sort_uniq a_
you should call:
let a = List.sort_uniq compare a_
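With that fix applied, the question's approach (A is a subset of B when A equals A intersect B) can be sketched as below. Returning a bool by comparing the intersection with a is an assumption about what the asker intended, since the original subset returned the intersection itself.

```ocaml
(* Intersection of two sorted, duplicate-free lists, as in the question. *)
let rec intersect_helper a b =
  match a, b with
  | [], _ -> []
  | _, [] -> []
  | ah :: at, bh :: bt ->
    if ah > bh then
      intersect_helper a bt
    else if ah < bh then
      intersect_helper at b
    else
      ah :: intersect_helper at bt

(* sort_uniq needs a comparison function; compare is the polymorphic one. *)
let subset a_ b_ =
  let a = List.sort_uniq compare a_
  and b = List.sort_uniq compare b_ in
  intersect_helper a b = a
```

For example, subset [1;2;3] [4;2;3;5;1] evaluates to true and subset [1;2;6] [4;2;3;5;1] to false.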

ocaml function takes function as parameter and output a function

I need to find a way to combine two functions and output them as one.
I have the following code, which takes in a list of functions, ('a -> 'a) list, and should output a single function 'a -> 'a using List.fold_left.
I figured out the base case, but I have tried a lot of ways to combine two functions without success. The result should have the type ('a -> 'a) list -> ('a -> 'a).
example output:
# pipe [] 3;;
- : int = 3
# pipe [(fun x-> 2*x);(fun x -> x + 3)] 3 ;;
- : int = 9
# pipe [(fun x -> x + 3);(fun x-> 2*x)] 3;;
- : int = 12
function:
let p l =
let f acc x = fun y-> fun x->acc in (* acc & x are functions 'a->'a *)
let base = fun x->x in
List.fold_left f base l
Since you know that you have to use a left fold, you now have to solve a fairly constrained problem: given two functions of type 'a -> 'a, how do you combine them into a single function of the same type?
In practice, there is one general way of combining functions: composition. In math, this is usually written as f ∘ g where f and g are the functions. This operation produces a new function which corresponds to taking an argument, applying g to it and then applying f to the result. So if h = f ∘ g, then we can also write this as h(x) = f(g(x)).
So your function f is actually function composition. (You should really give it a better name than f.) It has to take in two functions of type 'a -> 'a and produce another function of the same type. This means it produces a function of one argument where you produce a function taking two arguments.
So you need to write a function compose (a more readable name than f) of type ('a -> 'a) -> ('a -> 'a) -> ('a -> 'a). It has to take two arguments f and g and produce a function that applies both of them to its argument.
I hope this clarifies what you need to do. Figuring out exactly how to do it in OCaml is a healthy exercise.
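For readers who want to compare against one possible solution afterwards, here is a sketch. The argument order of compose is an assumption chosen so that the functions in the list are applied left to right, matching the example output in the question (so compose f g here is g ∘ f in the mathematical notation above).

```ocaml
(* Combine two functions: apply f first, then g. *)
let compose f g = fun x -> g (f x)

(* Fold the list into one function, starting from the identity. *)
let pipe fs = List.fold_left compose (fun x -> x) fs

let r0 = pipe [] 3                                 (* 3 *)
let r1 = pipe [(fun x -> 2 * x); (fun x -> x + 3)] 3  (* 9 *)
let r2 = pipe [(fun x -> x + 3); (fun x -> 2 * x)] 3  (* 12 *)
```

The identity function is the natural base case because composing it with any function leaves that function unchanged.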

Create a tuple by removing the first element from another tuple in ML

In ML we have lists and tuples. In the case of lists, we can form a new list from another list by removing or appending elements. For example, if we have:
val x = [7,8,9] : int list
in the REPL we can do operations like the following:
- hd x;
val it = 7 : int
- tl x;
val it = [8,9] : int list
Now suppose we have a tuple, say:
val y = (7,8,9) : int * int * int
The question is: can we get a smaller tuple by removing the first element from the original tuple? In other words, how can we remove (#1 y) and get a new tuple (8,9), in a similar way to what we do with lists?
Thanks.
Tuples are very different from lists. With lists, size need not be known at compile time, but with tuples, not only should the number of elements be known at compile time, the type of each element is independent of the others.
Take the type signature of tl:
- tl;
val it = fn : 'a list -> 'a list
It is 'a list -> 'a list - in other words tl takes a list of 'a and returns another one. Why don't we have one for tuples as well? Assume we wanted something like
val y = (1,2,3);
tail y; (* returns (2,3) *)
Why does this not make sense? Think of the type signature of tail. What would it be?
In this case, it would clearly be
'a * 'b * 'c -> 'b * 'c
Takes product of an 'a, a 'b and a 'c and returns a product of
a 'b and a 'c. In ML, all functions defined must have a statically determined
type signature. It would be impossible to have a tail function for tuples that
handles all possible tuple sizes, because each tuple size is essentially a different type.
'a list
Can be the type of many kinds of lists: [1,2,3,4], or ["A", "short", "sentence"], or
[true, false, false, true, false]. In all these cases, the value of the type
variable 'a is bound to a different type. (int, string, and bool). And 'a list can be a list of any size.
But take tuples:
(1, true, "yes"); (* int * bool * string *)
("two", 2); (* string * int *)
("ok", "two", 2); (* string * string * int *)
Unlike list, these are all of different types. So while the type signature of all lists is simple ('a list), there is no 'common type' for all tuples - a 2-tuple has a different type from a 3-tuple.
So you'll have to do this instead:
val y = (7, 8, 9);
val (a, b, c) = y;
and a is your head and you can re-create the tail with (b,c).
Or create your own tail:
fun tail (a,b,c) = (b, c)
This also gives us an intuitive understanding of why such a function would not make sense: it is impossible to define a single tail for use across all tuple types:
fun tail (a,b) = (b)
| tail (a,b,c) = (b, c) (* won't compile *)
You can also use the # shorthand to get at certain elements of the tuple:
#1 y; (* returns 7 *)
But note that #1 is not a function but a compile time shorthand.
Lists and tuples are immutable, so there is no such thing as removing elements from them.
You can construct a new tuple by decomposing the original tuple. In SML, the preferred way is to use pattern matching:
fun getLastTwo (x, y, z) = (y, z)
If you like #n functions, you can use them as well:
val xyz = (7, 8, 9)
val yz = (#2 xyz, #3 xyz) (* (8, 9) *)