Convert 'a list to a Set? - ocaml

Is it really true that OCaml doesn't have a function which converts from a list to a set?
If that is the case, is it possible to make a generic function list_to_set? I've tried to make a polymorphic set without luck.

Fundamental problem: Lists can contain elements of any types. Sets (assuming you mean the Set module of the standard library), in contrary, rely on a element comparison operation to remain balanced trees. You cannot hope to convert a t list to a set if you don't have a comparison operation on t.
Practical problem: the Set module of the standard library is functorized: it takes as input a module representing your element type and its comparison operation, and produces as output a module representing the set. Making this work with the simple parametric polymoprhism of lists is a bit sport.
To do this, the easiest way is to wrap your set_of_list function in a functor, so that it is itself parametrized by a comparison function.
module SetOfList (E : Set.OrderedType) = struct
module S = Set.Make(E)
let set_of_list li =
List.fold_left (fun set elem -> S.add elem set) S.empty li
end
You can then use for example with the String module, which provides a suitable compare function.
module SoL = SetOfList(String);;
SoL.S.cardinal (SoL.set_of_list ["foo"; "bar"; "baz"]);; (* returns 3 *)
It is also possible to use different implementation of sets which are non-functorized, such as Batteries and Extlib 'PSet' implementation (documentation). The functorized design is advised because it has better typing guarantees -- you can't mix sets of the same element type using different comparison operations.
NB: of course, if you already have a given set module, instantiated form the Set.Make functor, you don't need all this; but you conversion function won't be polymorphic. For example assume I have the StringSet module defined in my code:
module StringSet = Set.Make(String)
Then I can write stringset_of_list easily, using StringSet.add and StringSet.empty:
let stringset_of_list li =
List.fold_left (fun set elem -> StringSet.add elem set) StringSet.empty li
In case you're not familiar with folds, here is a direct, non tail-recursive recursive version:
let rec stringset_of_list = function
| [] -> StringSet.empty
| hd::tl -> StringSet.add hd (stringset_of_list tl)

Ocaml 3.12 has extensions (7,13 Explicit naming of type variables and 7,14 First-class modules) that make it possible to instantiate and pass around modules for polymorphic values.
In this example, the make_set function returns a Set module for a given comparison function and the build_demo function constructs a set given a module and a list of values:
let make_set (type a) compare =
let module Ord = struct
type t = a
let compare = compare
end
in (module Set.Make (Ord) : Set.S with type elt = a)
let build_demo (type a) set_module xs =
let module S = (val set_module : Set.S with type elt = a) in
let set = List.fold_right S.add xs S.empty in
Printf.printf "%b\n" (S.cardinal set = List.length xs)
let demo (type a) xs = build_demo (make_set compare) xs
let _ = begin demo ['a', 'b', 'c']; demo [1, 2, 3]; end
This doesn't fully solve the problem, though, because the compiler doesn't allow the return value to have a type that depends on the module argument:
let list_to_set (type a) set_module xs =
let module S = (val set_module : Set.S with type elt = a) in
List.fold_right S.add xs S.empty
Error: This `let module' expression has type S.t
In this type, the locally bound module name S escapes its scope
A possible work-around is to return a collection of functions that operate on the hidden set value:
let list_to_add_mem_set (type a) set_module xs =
let module S = (val set_module : Set.S with type elt = a) in
let set = ref (List.fold_right S.add xs S.empty) in
let add x = set := S.add x !set in
let mem x = S.mem x !set in
(add, mem)

If you don't mind a very crude approach, you can use the polymorphic hash table interface. A hash table with an element type of unit is just a set.
# let set_of_list l =
let res = Hashtbl.create (List.length l)
in let () = List.iter (fun x -> Hashtbl.add res x ()) l
in res;;
val set_of_list : 'a list -> ('a, unit) Hashtbl.t = <fun>
# let a = set_of_list [3;5;7];;
val a : (int, unit) Hashtbl.t = <abstr>
# let b = set_of_list ["yes";"no"];;
val b : (string, unit) Hashtbl.t = <abstr>
# Hashtbl.mem a 5;;
- : bool = true
# Hashtbl.mem a 6;;
- : bool = false
# Hashtbl.mem b "no";;
- : bool = true
If you just need to test membership, this might be good enough. If you wanted other set operations (like union and intersection) this isn't a very nice solution. And it's definitely not very elegant from a typing standpoint.

Just extend the original type, as shown in
http://www.ffconsultancy.com/ocaml/benefits/modules.html
for the List module:
module StringSet = Set.Make (* define basic type *)
(struct
type t = string
let compare = Pervasives.compare
end)
module StringSet = struct (* extend type with more operations *)
include StringSet
let of_list l =
List.fold_left
(fun s e -> StringSet.add e s)
StringSet.empty l
end;;

Using the core library you could do something like:
let list_to_set l =
List.fold l ~init:(Set.empty ~comparator:Comparator.Poly.comparator)
~f:Set.add |> Set.to_list
So for example:
list_to_set [4;6;3;6;3;4;3;8;2]
-> [2; 3; 4; 6; 8]
Or:
list_to_set ["d";"g";"d";"a"]
-> ["a"; "d"; "g"]

Related

Pretty-print a Hashtbl in OCaml to work with ppx-deriving

I'm trying to pretty print a custom record type that contains a Hashtable (using the Base standard library) in OCaml with ppx-deriving, but I need to implement Hashtbl.pp for it to work.
I've tried looking at examples online and the best one that I found is https://github.com/ocaml-ppx/ppx_deriving#testing-plugins, but I'm still getting strange type errors like "This function has type Formatter.t -> (string, Value.t) Base.Hashtbl.t -> unit. It is applied to too many arguments; maybe you forgot a `;'"
How do you extend the Hashtbl module with a pp function?
Here is my code so far (Value.t is a custom type which I successfully annotated with [##deriving show]:
open Base
(* Extend Hashtbl with a custom pretty-printer *)
module Hashtbl = struct
include Hashtbl
let rec (pp : Formatter.t -> (string, Value.t) Hashtbl.t -> Ppx_deriving_runtime.unit) =
fun fmt -> function
| ht ->
List.iter
~f:(fun (str, value) ->
Caml.Format.fprintf fmt "#[<1>%s: %s#]#." str (Value.string_of value))
(Hashtbl.to_alist ht)
and show : (string, Value.t) Hashtbl.t -> Ppx_deriving_runtime.string =
fun s -> Caml.Format.asprintf "%a" pp s
;;
end
type t =
{ values : (string, Value.t) Hashtbl.t
; enclosing : t option
}
[##deriving show]
Solution 1
The type of the values field of your record is a parametrized with two type variables, therefore the deriver is trying to use a general pp function that is parametrized by the key and data pretty-printers, e.g., the following will enable show for any hashtable (with any key and any value, as long as keys and values are showable,
module Hashtbl = struct
include Base.Hashtbl
let pp pp_key pp_value ppf values =
Hashtbl.iteri values ~f:(fun ~key ~data ->
Format.fprintf ppf "#[<1>%a: %a#]#." pp_key key pp_value data)
end
so you can finally define your type
type t = {
values : (string,Value.t) Hashtbl.t;
enclosing : t option;
} [##deriving show]
Solution 2 (recommended)
However, I would suggest another approach that instead of creating a general Hashtable module, creates a specialized Values module, e.g.,
module Values = struct
type t = (string, Value.t) Hashtbl.t
let pp ppf values =
Hashtbl.iteri values ~f:(fun ~key ~data ->
Format.fprintf ppf "#[<1>%s: %s#]#." key (Value.to_string data))
end
Now you can use it as,
type t = {
values : Values.t;
enclosing : t option;
} [##deriving show]
Solution 3
If you still want a generic printable hash table, then I would advise against using the include statement, but, instead, implement just the required printable interface for the ('k,'s) Hashtbl.t type, e.g.,
module Hashtbl_printable = struct
type ('k,'s) t = ('k, 's) Hashtbl.t
let pp pp_key pp_value ppf values =
Hashtbl.iteri values ~f:(fun ~key ~data ->
Format.fprintf ppf "#[<1>%a: %a#]#." pp_key key pp_value data)
end
type t = {
values : (string, Value.t) Hashtbl_printable.t;
enclosing : t option;
} [##deriving show]

Java GuardTypes analogy for OCaml

How do you do, Stackoverflow!
In Java practice there are some issues concerning partially defined functions. Sometimes it's convinient to separate an error handling from the calculation itself. We may utilize an approach called "Guard types" or "Guard decorators".
Consider the simple synthetic example: to guard the null reference. This can be done with the aid of the next class
public class NonNull<T> {
public take() {
return null != this.ref ? this.ref : throw new ExcptionOfMine("message");
}
public NotNull(T ref_) {
this.ref = ref_;
}
private T ref;
}
The question is:
Is there a way to implement the same "Guard type" in OCaml without touching its object model? I believe for the OCaml as the functional programming language to possess enough abstraction methods without objec-oriented technics.
You can use an abstract type to get the same effect. OCaml has no problem with null pointers. So say instead you want to represent a nonempty list in the same way as above. I.e., you want to be able to create values that are empty, but only complain when the person tries to access the value.
module G :
sig type 'a t
val make : 'a list -> 'a t
val take : 'a t -> 'a list
end =
struct
type 'a t = 'a list
let make x = x
let take x = if x = [] then raise (Invalid_argument "take") else x
end
Here's how it looks when you use the module:
$ ocaml
OCaml version 4.02.1
# #use "m.ml";;
module G :
sig type 'a t val make : 'a list -> 'a t val take : 'a t -> 'a list end
# let x = G.make [4];;
val x : int G.t = <abstr>
# G.take x;;
- : int list = [4]
# let y = G.make [];;
val y : '_a G.t = <abstr>
# G.take y;;
Exception: Invalid_argument "take".
There's a concept of Optional types, on which you can effectively pattern match. Example:
let optional = Some 20
let value =
match optional with
| Some v -> v
| None -> 0
You can use simple closures
let guard_list v =
fun () ->
if v = [] then failwith "Empty list"
else v
let () =
let a = guard_list [1;2;3] in
let b = guard_list [] in
print_int (List.length (a ())); (* prints 3 *)
print_int (List.length (b ())) (* throws Failure "Empty list" *)
or lazy values
let guard_string v = lazy begin
if v = "" then failwith "Empty string"
else v
end
let () =
let a = guard_string "Foo" in
let b = guard_string "" in
print_endline (Lazy.force a); (* prints "Foo" *)
print_endline (Lazy.force b) (* throws Failure "Empty string" *)

heterogeneous lists through flexible types

I am trying to stick heterogeneous types in a list making use of flexible types
type IFilter<'a> =
abstract member Filter: 'a -> 'a
type Cap<'a when 'a: comparison> (cap) =
interface IFilter<'a> with
member this.Filter x =
if x < cap
then x
else cap
type Floor<'a when 'a: comparison> (floor) =
interface IFilter<'a> with
member this.Filter x =
if x > floor
then x
else floor
type Calculator<'a, 'b when 'b:> IFilter<'a>> (aFilter: 'b, operation: 'a -> 'a) =
member this.Calculate x =
let y = x |> operation
aFilter.Filter y
type TowerControl<'a> () =
let mutable calculationStack = List.empty
member this.addCalculation (x: Calculator<'a, #IFilter<'a>> ) =
let newList = x::calculationStack
calculationStack <- newList
let floor10 = Floor<int> 10
let calc1 = Calculator<int, Floor<int>> (floor10, ((+) 10))
let cap10 = Cap 10
let calc2 = Calculator (cap10, ((-) 5))
let tower = TowerControl<int> ()
tower.addCalculation calc1
tower.addCalculation calc2
In the example above
member this.addCalculation (x: Calculator<'a, #IFiler<'a>> ) =
produces the error
error FS0670: This code is not sufficiently generic. The type variable 'a could not be generalized because it would escape its scope.
Apologies if a similar question has already been posted.
Thank you.
There's no easy way to do this. It looks like you really want calculationStack to have type:
(∃('t:>IFilter<'a>).Calculator<'a, 't>) list
but F# doesn't provide existential types. You can use the "double-negation encoding" ∃'t.f<'t> = ∀'x.(∀'t.f<'t>->'x)->'x to come up with the following workaround:
// helper type representing ∀'t.Calculator<'t>->'x
type AnyCalc<'x,'a> = abstract Apply<'t when 't :> IFilter<'a>> : Calculator<'a,'t> -> 'x
// type representing ∃('t:>IFilter<'a>).Calculator<'a, 't>
type ExCalc<'a> = abstract Apply : AnyCalc<'x,'a> -> 'x
// packs a particular Calculator<'a,'t> into an ExCalc<'a>
let pack f = { new ExCalc<'a> with member this.Apply(i) = i.Apply f }
// all packing and unpacking hidden here
type TowerControl<'a> () =
let mutable calculationStack = List.empty
// note: type inferred correctly!
member this.addCalculation x =
let newList = (pack x)::calculationStack
calculationStack <- newList
// added this to show how to unpack the calculations for application
member this.SequenceCalculations (v:'a) =
calculationStack |> List.fold (fun v i -> i.Apply { new AnyCalc<_,_> with member this.Apply c = c.Calculate v }) v
// the remaining code is untouched
let floor10 = Floor<int> 10
let calc1 = Calculator<int, Floor<int>> (floor10, ((+) 10))
let cap10 = Cap 10
let calc2 = Calculator (cap10, ((-) 5))
let tower = TowerControl<int> ()
tower.addCalculation calc1
tower.addCalculation calc2
This has the big advantage that it works without modifying the Calculator<_,_> type, and that the semantics are exactly what you want, but the following disadvantages:
It's hard to follow if you're unfamiliar with this way of encoding existentials.
Even if you are familiar, there's a lot of ugly boilerplate (the two helper types) since F# doesn't allow anonymous universal qualification either. That is, even given that F# doesn't directly support existential types, it would be much easier to read if you could write something like:
type ExCalc<'a> = ∀'x.(∀('t:>IFilter<'a>).Calculator<'a,'t>->'x)->'x
let pack (c:Calculator<'a,'t>) : ExCalc<'a> = fun f -> f c
type TowerControl<'a>() =
...
member this.SequenceCalcualtions (v:'a) =
calculationStack |> List.fold (fun v i -> i (fun c -> c.Calculate v)) v
But instead we've got to come up with names for both helper types and their single methods. This ends up making the code hard to follow, even for someone already familiar with the general technique.
On the off chance that you own the Calculator<_,_> class, there's a much simpler solution that might work (it may also depend on the signatures of the methods of the real Calcuator<,> class, if it's more complex than what you've presented here): introduce an ICalculator<'a> interface, have Calculator<_,_> implement that, and make calculationStack a list of values of that interface type. This will be much more straightforward and easier for people to understand, but is only possible if you own Calculator<_,_> (or if there's already an existing interface you can piggy back on). You can even make the interface private, so that only your code is aware of its existence. Here's how that would look:
type private ICalculator<'a> = abstract Calculate : 'a -> 'a
type Calculator<'a, 'b when 'b:> IFilter<'a>> (aFilter: 'b, operation: 'a -> 'a) =
member this.Calculate x =
let y = x |> operation
aFilter.Filter y
interface ICalculator<'a> with
member this.Calculate x = this.Calculate x
type TowerControl<'a> () =
let mutable calculationStack = List.empty
member this.addCalculation (x: Calculator<'a, #IFilter<'a>> ) =
let newList = (x :> ICalculator<'a>)::calculationStack
calculationStack <- newList
member this.SequenceCalculations (v:'a) =
calculationStack |> List.fold (fun v c -> c.Calculate v) v

Returning first value of list of tuples

I'm learning to deal with Lists and Tuples in F# and a problem came up. I have two lists: one of names and one with names,ages.
let namesToFind = [ "john", "andrea" ]
let namesAndAges = [ ("john", 10); ("andrea", 15) ]
I'm trying to create a function that will return the first age found in namesAndAges given namesToFind. Just the first.
So far I have the following code which returns the entire tuple ("john", 10).
let findInList source target =
let itemFound = seq { for n in source do
yield target |> List.filter (fun (x,y) -> x = n) }
|> Seq.head
itemFound
I tried using fst() in the returning statement but it does not compile and gives me "This expression was expected to have type 'a * 'b but here has type ('c * 'd) list"
Thanks for any help!
There are lots of functions in the Collections.List module that can be used. Since there are no break or a real return statement in F#, it is often better to use some search function, or write a recursive loop-function. Here is an example:
let namesToFind = [ "john"; "andrea" ]
let namesAndAges = [ "john", 10; "andrea", 15 ]
let findInList source target =
List.pick (fun x -> List.tryFind (fun (y,_) -> x = y) target) source
findInList namesToFind namesAndAges
The findInList function is composed of two functions from the Collections.List module.
First we have the List.tryFind predicate list function, which returns the first item for which the given predicate function returns true.
The result is in the form of an option type, which can take two values: None and Some(x). It is used for functions that sometimes give no useful result.
The signature is: tryFind : ('T -> bool) -> 'T list -> 'T option, where 'T is the item type, and ('T -> bool) is the predicate function type.
In this case it will search trough the target list, looking for tuples where the first element (y) equals the variable x from the outer function.
Then we have the List.pick mapper list function, which applies the mapper-function to each one, until the first result that is not None, which is returned.
This function will not return an option value, but will instead throw an exception if no item is found. There is also an option-variant of this function named List.tryPick.
The signature is: pick : ('T -> 'U option) -> 'T list -> 'U, where 'T is the item type, 'U is the result type, and ('T -> 'U option) is the mapping function type.
In this case it will go through the source-list, looking for matches in the target array (via List.tryFind) for each one, and will stop at the first match.
If you want to write the loops explicitly, here is how it could look:
let findInList source target =
let rec loop names =
match names with
| (name1::xs) -> // Look at the current item in the
// source list, and see if there are
// any matches in the target list.
let rec loop2 tuples =
match tuples with
| ((name2,age)::ys) -> // Look at the current tuple in
// the target list, and see if
// it matches the current item.
if name1 = name2 then
Some (name2, age) // Found a match!
else
loop2 ys // Nothing yet; Continue looking.
| [] -> None // No more items, return "nothing"
match loop2 target with // Start the loop
| Some (name, age) -> (name, age) // Found a match!
| None -> loop rest // Nothing yet; Continue looking.
| [] -> failwith "No name found" // No more items.
// Start the loop
loop source
(xs and ys are common ways of writing lists or sequences of items)
First let's look at your code and annotate all the types:
let findInList source target =
let itemFound =
seq {
for n in source do
yield target |> List.filter (fun (x,y) -> x = n) }
|> Seq.head
itemFound
The statement yield List.Filter ... means you're creating a sequence of lists: seq<list<'a * 'b>>.
The statement Seq.head takes the first element from your sequence of lists: list<'a * 'b>.
So the whole function returns a list<'a * 'b>, which is obviously not the right type for your function. I think you intended to write something like this:
let findInList source target =
let itemFound =
target // list<'a * 'b>
|> List.filter (fun (x,y) -> x = n) // list<'a * 'b>
|> Seq.head // 'a * 'b
itemFound // function returns 'a * 'b
There are lots of ways you can get the results you want. Your code is already half way there. In place of filtering by hand, I recommend using the built in val Seq.find : (a' -> bool) -> seq<'a> -> 'a method:
let findAge l name = l |> Seq.find (fun (a, b) -> a = name) |> snd
Or you can try using a different data structure like a Map<'key, 'value>:
> let namesAndAges = [ ("john", 10); ("andrea", 15) ] |> Map.ofList;;
val namesAndAges : Map<string,int> = map [("andrea", 15); ("john", 10)]
> namesAndAges.["john"];;
val it : int = 10
If you want to write it by hand, then try this with your seq expression:
let findInList source target =
seq {
for (x, y) in source do
if x = target then
yield y}
|> Seq.head
Like fst use this(below) . This way you can access all the values.
This is from F# interactive
let a = ((1,2), (3,4));
let b = snd (fst a);;
//interactive output below.
val a : (int * int) * (int * int) = ((1, 2), (3, 4))
val b : int = 2

Functor implementation issue

The aim of the module described below is to implement a module which once initiated by an integer n does all the operations based on the value of n.
module ReturnSetZero =
functor ( Elt : int ) ->
struct
let rec sublist b e l =
match l with
[] -> failwith "sublist"
| h :: t ->
let tail = if e = 0 then [] else sublist (b - 1) (e - 1) t in
if b > 0 then tail else h :: tail
let rec zerol = 0:: zerol
let zeron = sublist 0 n zerol
(*other operations based on n which is selected once when the module is initialized*)
end;;
Error: Unbound module type int
What is the issue here? Is there an alternate implementation which is more effective/intuitive?
A functor maps modules to modules. An integer is not a module, so you cannot use it to as a functor parameter.
You need to define a module type:
module type WITH_INTEGER = sig
val integer : int
end
module PrintInteger =
functor (Int:WITH_INTEGER) -> struct
let print_my_integer () = print_int Int.integer
end
Of course, unless your module needs to expose types that are dependent on the value of the integer (or you have to expose a lot of values dependent on that integer), you're probably better off with a plain function that takes that integer as an argument:
let my_function integer =
let data = complex_precomputations integer in
function arg -> do_something_with arg data
This lets you run the complex pre-computations only once on the integer (when you pass it to the function).