How to run OCaml simulations

If I want to run a simulation of the abstract machine on the code below, how do I know what would be in the workspace, stack, and heap?
let rec map (f: 'a -> 'b) (y: 'a list): 'b list =
  begin match y with
  | [] -> []
  | h :: t -> (f h) :: (map f t)
  end in
let x = map (fun t -> t + 1) [0; 1; 2] in
0 :: x

You're using terminology that we (people reading StackOverflow) don't share. In particular, we don't know what abstract machine you're talking about.
Speaking very generally about computer systems, the workspace usually contains the current definitions. At the beginning it might contain predefined functions and so on. Your code uses some predefined type names (like 'a list) and constructors (like ::). I don't know if they need to be in your workspace specifically.
Very possibly your definitions of map and x would need to be loaded into the workspace as the first step.
The stack contains a record of the functions that have been called but haven't returned. In general it will be empty when you start an evaluation.
The heap is a general term for the set of values that exist at the moment but might disappear later (when no longer needed). Unless you count your values named map and x, the heap could also be empty at the beginning.
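To make the stack description a bit more concrete, here is a minimal sketch that instruments your map with a counter of calls that have started but not yet returned. This is not your course's abstract machine, just plain OCaml; the names depth, max_depth and final are hypothetical additions that are not in your code.

```ocaml
(* Hypothetical instrumentation: depth and max_depth are not part of the
   original program. They count calls to map that have not returned yet. *)
let depth = ref 0
let max_depth = ref 0

let rec map (f : 'a -> 'b) (y : 'a list) : 'b list =
  incr depth;
  if !depth > !max_depth then max_depth := !depth;
  let result =
    match y with
    | [] -> []
    | h :: t ->
      let h' = f h in
      (* The recursive call runs while this call is still pending. *)
      h' :: map f t
  in
  decr depth;
  result

let x = map (fun t -> t + 1) [0; 1; 2]

(* The final expression of the original program. *)
let final = 0 :: x
```

For the three-element list, map is entered four times before anything returns (once per element plus once for the empty list), so the pending-call count peaks at 4 and is back to 0 when evaluation finishes.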
Sorry I can't be more specific. It sounds like you're taking a class, and you might want to consult some of the class resources (including the professor or TA :-)

Related

How does this nested fold_left work? and what is ~f: and ~init:?

I have this code snippet in OCaml which is taken from here. I know it fills a data structure for a demand (traffic matrix) with the specified value, and when the two hosts are the same it just fills the value with 0. In Python or in any imperative language, we would use two for loops, one inside the other, to do the task. I assume this is the reason we have two (fold_left) in this code, each one equivalent to one for loop (I might be mistaken!). My question is: how does this code work? And what are ~f: and ~init:? Are these labels? If they are labels, why does the compiler complain when I remove them or when I change them, even when I put these arguments in the right order?
I have finished one book and have watched a lot of YouTube videos but still find it difficult to understand most OCaml code.
let create_3cycle_input () =
  let topo = Net.Parse.from_dotfile "./data/topologies/3cycle.dot" in
  let hosts = get_hosts topo in
  let demands =
    List.fold_left
      hosts
      ~init:SrcDstMap.empty
      ~f:(fun acc u ->
        List.fold_left
          hosts
          ~init:acc
          ~f:(fun acc v ->
            let r = if u = v then 0.0 else 53. in
            SrcDstMap.set acc ~key:(u,v) ~data:r)) in
  (hosts,topo,demands);;
Please read my other SO answer that explains how fold_left works. Once you understand how a single fold works, we can move forward to the nested case (as well as to the labels).
When you have a collection of collections, i.e., when each element of a collection is itself a collection, and you want to iterate over each element of those inner collections, then you need to nest your folds. A good example is a matrix, which can be seen as a collection of vectors, where the vectors are themselves collections.
The iteration algorithm is simple,
state := init
for each inner-collection in outer-collection do
for each element in inner-collection do
state := user-function(state, element)
done
done
Or, the same in OCaml (using the Core version of the fold)
let fold_list_of_lists outer ~init ~f =
  List.fold outer ~init ~f:(fun state inner ->
    List.fold inner ~init:state ~f:(fun state elt ->
      f state elt))
This function will have type 'a list list -> init:'b -> f:('b -> 'a -> 'b) -> 'b
and will be applicable to any list of lists.
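If you don't have Core at hand, the same nested fold can be sketched with the standard library's List.fold_left. This is only an illustration of the idea, not the code from the project you referenced; total and flat_rev are hypothetical example values.

```ocaml
(* Nested fold over a list of lists, using only the standard library.
   The labelled interface mirrors the Core-style version above. *)
let fold_list_of_lists outer ~init ~f =
  List.fold_left
    (fun state inner ->
       (* The state produced by one inner list seeds the next one. *)
       List.fold_left f state inner)
    init
    outer

(* Sum every element of every inner list. *)
let total = fold_list_of_lists [[1; 2]; [3]; []; [4; 5]] ~init:0 ~f:( + )

(* Collect all elements; consing onto the state reverses their order. *)
let flat_rev =
  fold_list_of_lists [[1; 2]; [3]] ~init:[] ~f:(fun acc x -> x :: acc)
```

The important detail is that the inner fold starts from the outer fold's current state rather than from init, which is exactly what ~init:acc does in your snippet.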
Concerning the labels and their removal: labels are keyword arguments that let you pass arguments to a function in an arbitrary order, which is very useful when a function takes many arguments. Omitting labels is sometimes possible, but this can be disabled with a compiler option. The Core library (which is used by the project that you have referenced) disables omitting labels, probably for good reason.
In general, labels could be omitted if the application is total, i.e., when the returned value is not a function by itself. Since fold_left returns a type variable, it could always be a function, therefore we always need to use labels with the Core's List.fold (and List.fold_left) function.
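A small sketch of the "arbitrary order" point, using the standard library's ListLabels module instead of Core so that it runs without extra dependencies. The function area and the value names are hypothetical, made up for the example.

```ocaml
(* ListLabels.fold_left takes ~f and ~init as labelled arguments, so they
   can be supplied in either order. *)
let sum1 = ListLabels.fold_left ~f:( + ) ~init:0 [1; 2; 3]
let sum2 = ListLabels.fold_left ~init:0 ~f:( + ) [1; 2; 3]

(* A user-defined function with labelled arguments and a concrete
   (non-polymorphic) result type. *)
let area ~width ~height : int = width * height

(* The labels, not the positions, decide which argument is which. *)
let a1 = area ~width:2 ~height:3
let a2 = area ~height:3 ~width:2
```

Renaming a label at the call site (say ~g: instead of ~f:) fails because the label is part of the function's type, which is why changing them makes the compiler complain.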

Pack consecutive duplicates of list elements into sublists in Ocaml

I found this problem in the website 99 problems in ocaml. After some thinking I solved it by breaking the problem into a few smaller subproblems. Here is my code:
let rec frequency x l=
  match l with
  |[]-> 0
  |h::t-> if x=[h] then 1+(frequency x t)
          else frequency x t
;;
let rec expand x n=
  match n with
  |0->[]
  |1-> x
  |_-> (expand x (n-1)) @ x
;;
let rec deduct a b=
  match b with
  |[]-> []
  |h::t -> if a=[h] then (deduct a t)
           else [h]@ (deduct a t)
;;
let rec pack l=
  match l with
  |[]-> []
  |h::t -> [(expand [h] (frequency [h] l))]@ (pack (deduct [h] t))
;;
It is rather clear that this implementation is overkill, as I have to count the frequency of every element in the list, expand this and remove the identical elements from the list, then repeat the procedure. The algorithm complexity is about O(N*(N+N+N))=O(N^2) and would not work with large lists, even though it achieved the required purpose. I tried to read the official solution on the website, which says:
# let pack list =
    let rec aux current acc = function
      | [] -> [] (* Can only be reached if original list is empty *)
      | [x] -> (x :: current) :: acc
      | a :: (b :: _ as t) ->
         if a = b then aux (a :: current) acc t
         else aux [] ((a :: current) :: acc) t in
    List.rev (aux [] [] list);;
val pack : 'a list -> 'a list list = <fun>
The code should be better as it is more concise and does the same thing. But I am confused by the use of "aux current acc" inside it. It seems to me that the author has created a new function inside the "pack" function and, after some elaborate procedure, was able to get the desired result using List.rev, which reverses the list. What I do not understand is:
1) What is the point of using this, which makes the code very hard to read at first sight?
2) What is the benefit of using an accumulator and an auxiliary function inside another function which takes 3 inputs? Did the author implicitly use tail recursion or something?
3) Is there any way to modify the program so that it can pack all duplicates like my program?
These are questions mostly of opinion rather than fact.
1) Your code is far harder to understand, in my opinion.
2a) It's very common to use auxiliary functions in OCaml and other functional languages. You should think of it more like nested curly braces in a C-like language rather than as something strange.
2b) Yes, the code is using tail recursion, which yours doesn't. You might try giving your code a list of (say) 200,000 distinct elements. Then try the same with the official solution. You might try determining the longest list of distinct values your code can handle, then try timing the two different implementations for that length.
2c) In order to write a tail-recursive function, it's sometimes necessary to reverse the result at the end. This just adds a linear cost, which is often not enough to notice.
3) I suspect your code doesn't solve the problem as given. If you're only supposed to compress adjacent elements, your code doesn't do this. If you wanted to do what your code does with the official solution you could sort the list beforehand. Or you could use a map or hashtable to keep counts.
Generally speaking, the official solution is far better than yours in many ways. Again, you're asking for an opinion and this is mine.
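To illustrate point 3, here is a hedged sketch of the "sort the list beforehand" idea. pack is the official solution copied from your question; pack_all is a hypothetical name for the wrapper, and it assumes the polymorphic compare is an acceptable ordering for your elements.

```ocaml
(* The official solution: packs only consecutive duplicates. *)
let pack list =
  let rec aux current acc = function
    | [] -> []
    | [x] -> (x :: current) :: acc
    | a :: (b :: _ as t) ->
      if a = b then aux (a :: current) acc t
      else aux [] ((a :: current) :: acc) t
  in
  List.rev (aux [] [] list)

(* Hypothetical wrapper: sorting first makes equal elements adjacent,
   so every duplicate ends up in the same sublist. *)
let pack_all list = pack (List.sort compare list)

let consecutive_only = pack [3; 1; 3; 2; 1; 3]
let all_duplicates = pack_all [3; 1; 3; 2; 1; 3]
```

Note that the groups come out in sorted order, not in order of first appearance as in your version. If that order matters, counting with a map or hashtable is the other route.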
Update
The official solution uses an auxiliary function named aux that takes three parameters: the currently accumulated sublist (some number of repetitions of the same value), the currently accumulated result (in reverse order), and the remaining input to be processed.
The invariant is that all the values in the first parameter (named current) are the same as the head value of the unprocessed list. Initially this is true because current is empty.
The function looks at the first two elements of the unprocessed list. If they're the same, it adds the first of them to the beginning of current and continues with the tail of the list (all but the first). If they're different, it wants to start accumulating a different value in current. It does this by adding current (with the one extra value added to the front) to the accumulated result, then continuing to process the tail with an empty value for current. Note that both of these maintain the invariant.
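If it helps to see the invariant in action, here is a sketch of the same function that also records current and the unprocessed list at every call. pack_traced and steps are hypothetical names; the logic of aux is otherwise unchanged from the official solution.

```ocaml
(* Same algorithm as the official solution, but each call to aux logs
   the pair (current, remaining input) before doing its work. *)
let pack_traced list =
  let steps = ref [] in
  let rec aux current acc rest =
    steps := (current, rest) :: !steps;
    match rest with
    | [] -> []
    | [x] -> (x :: current) :: acc
    | a :: (b :: _ as t) ->
      if a = b then aux (a :: current) acc t
      else aux [] ((a :: current) :: acc) t
  in
  let result = List.rev (aux [] [] list) in
  (* The log was built newest-first, so reverse it into call order. *)
  (result, List.rev !steps)

let result, steps = pack_traced ["a"; "a"; "b"; "c"; "c"]

(* The invariant: every value in current equals the head of the
   unprocessed list (an empty current satisfies it trivially). *)
let invariant_holds =
  List.for_all
    (fun (current, rest) ->
       match rest with
       | [] -> current = []
       | h :: _ -> List.for_all (fun v -> v = h) current)
    steps
```

For this input aux is called five times, and current is [], ["a"], [], [], ["c"] at those calls.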

Function- and Type substitutions or Views in Coq

I proved some theorems about lists, and extracted algorithms from them. Now I want to use heaps instead, because lookup and concatenation are faster. What I currently do to achieve this is to just use custom definitions for the extracted list type. I would like to do this in a more formal way, but ideally without having to redo all of my proofs. Let's say I have a type
Heap : Set -> Set
and an isomorphism
f : forall A, Heap A -> List A.
Furthermore, I have functions H_app and H_nth, such that
H_app (f a) (f b) = f (a ++ b)
and
H_nth (f a) = nth a
On the one hand, I would have to replace every list-recursion by a specialized function that mimics list recursion. On the other hand, beforehand I would want to replace ++ and nth by H_app and H_nth, so the extracted algorithms would be faster. The problem is that I use tactics like simpl and compute in some places, which will probably fail if I just replace everything in the proof code. It would be good to have a possibility to "overload" the functions afterwards.
Is something like this possible?
Edit: To clarify, a similar problem arises with numbers: I have some old proofs that use nat, but the numbers are getting too large. Using BinNat would be better, but is it possible to use BinNat instead of nat also in the old proofs without too much modification? (And especially, replace inefficient usages of + by the more efficient definition for BinNat?)
Just for the sake of clarity, I take it that Heap must look like
this:
Inductive Heap A : Type :=
  | Node : Heap A -> A -> Heap A -> Heap A
  | Leaf : Heap A.
with f being defined as
Fixpoint f A (h : Heap A) : list A :=
  match h with
  | Node h1 a h2 => f h1 ++ a :: f h2
  | Leaf => []
  end.
If this is the case, then f does not define an isomorphism between
Heap A and list A for all A. Instead, we can find a function
g : forall A, list A -> Heap A such that
forall A (l : list A), f (g l) = l
Nevertheless, we would like to say that both Heap and list are
equivalent in some sense when they are used to implement the same
abstraction, namely sets of elements of some type.
There is a precise and formal way in which we can validate this idea
in languages that have parametric polymorphism, such as Coq. This
principle, known as parametricity, roughly says that
parametrically polymorphic functions respect relations that we impose
on types we instantiate them with.
This is a little bit abstract, so let's try to make it more
concrete. Suppose that you have a function over lists (say, foo)
that uses only ++ and nth. To be able to replace foo by an
equivalent version on Heap using parametricity, we need to make
foo's definition polymorphic, abstracting over the functions over
lists:
Definition foo (T : Set -> Set)
               (app : forall A, T A -> T A -> T A)
               (nth : forall A, T A -> nat -> option A)
               A (l : T A) : T A :=
  (* ... *)
You would first prove properties of foo by instantiating it over
lists:
Definition list_foo := foo list @app @nth.
Lemma list_foo_lemma : (* Some statement *).
Now, because we know that H_app and H_nth are compatible with their
list counterparts, and because foo is polymorphic, the theory of
parametricity says that we can prove
Definition H_foo := foo Heap @H_app @H_nth.
Lemma foo_param : forall A (h : Heap A),
  f (H_foo h) = list_foo (f h).
With this lemma in hand, it should be possible to transport properties
of list_foo to similar properties of H_foo. For instance, as a
trivial example, we can show that H_app is associative, up to
conversion to a list:
forall A (h1 h2 h3 : Heap A),
list_foo (H_app h1 (H_app h2 h3)) =
list_foo (H_app (H_app h1 h2) h3).
What's nice about parametricity is that it applies to any
parametrically polymorphic function: as long as appropriate
compatibility conditions hold of your types, it should be possible to
relate two instantiations of a given function in a similar fashion to
foo_param.
There are two problems, however. The first one is having to change
your base definitions to polymorphic ones, which is probably not so
bad. What's worse, though, is that even though parametricity ensures
that it is always possible to prove lemmas such as foo_param under
certain conditions, Coq does not give you that for free, and you still
need to show these lemmas by hand. There are two things that could help
alleviate your pain:
There's a parametricity plugin for Coq (CoqParam) which should
help deriving the boilerplate proofs for you automatically. I have
never used it, though, so I can't really say how easy it is to use.
The Coq Effective Algebra Library (or CoqEAL, for short) uses
parametricity to prove things about efficient algorithms while
reasoning over more convenient ones. In particular, they define
refinements that allow one to switch between nat and BinNat, as
you suggested. Internally, they use an infrastructure based on
type-class inference, which you could adapt to your original
example, but I heard that they are currently migrating their
implementation to use CoqParam instead.

TAKE function in SML

In the TAKE function which is given by
fun TAKE (xs,0) = []
  | TAKE (NIL, n) = raise Subscript
  | TAKE (CONS (x,xf),n) = x :: TAKE(xf(), n-1);
What are xs, x, and xf?
And can you also please tell me how the take function works?
Your take function seems to operate over a data structure of some type like
datatype 'a stream = NIL | CONS of 'a * (unit -> 'a stream)
Your take function iterates over the stream data structure and takes n elements out of it, and returns a list containing those elements.
The identifier xs is the function parameter that holds the stream data structure, and the identifier n is the function parameter holding the number of elements you want to retrieve (i.e. take). The identifiers x and xf are patterns: they are bound to the values of the CONS cell, so x is the head (i.e. 'a) and xf is the tail (i.e. unit -> 'a stream).
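For comparison, here is a rough OCaml rendering of the same stream type and take function, as a sketch only. The constructor names Nil and Cons, the helper from, and the use of Invalid_argument are my own choices for the example (the SML version raises Subscript instead).

```ocaml
(* A stream is either empty or a head plus a suspended tail. *)
type 'a stream = Nil | Cons of 'a * (unit -> 'a stream)

(* Take the first n elements of a stream and return them as a list. *)
let rec take (xs, n) =
  match xs, n with
  | _, 0 -> []                                  (* nothing left to take *)
  | Nil, _ -> raise (Invalid_argument "take")   (* stream ran out early *)
  | Cons (x, xf), n ->
    (* x is the head; xf () forces the tail only when it is needed. *)
    x :: take (xf (), n - 1)

(* Hypothetical helper: the infinite stream k, k+1, k+2, ... *)
let rec from k = Cons (k, fun () -> from (k + 1))

let first_three = take (from 1, 3)
```

Because the tail is a function, from 1 never builds the whole infinite sequence; take only forces as many cells as it is asked for.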
It is my impression (based on your question) that you need to gain a deeper understanding of SML and functional programming in general to make sense of this answer, though. Most likely you won't achieve that asking questions here. I recommend you to get a good reference book, like the ones suggested in the information section of the SML tag here in SO.
You may also want to read the section 3.5 Streams from the great book Structure and Interpretation of Computer Programs. The code in the book is in Scheme. It might take a while to get it all (if you are unfamiliar with any lisp-related language), but it is worth the effort.

`Ord a =>` or `Num a =>`

I have the following functions:
which (x:xs) = worker x xs
worker x [] = x
worker x (y:ys)
  | x > y = worker y ys
  | otherwise = worker x ys
and am wondering how I should define the type signatures of the functions which and worker above.
For example, which of the following would be best as a type signature for worker?
worker :: Num a => a -> [a] -> a,
or
worker :: Ord a => a -> [a] -> a?
I'm just really confused and don't get which of these I should choose. I'd appreciate your thoughts. Thanks.
If you define the function without an explicit type signature, Haskell will infer the most general one. If you’re unsure, this is the easiest way to figure out how your definition will be read; you can then copy it into your source code. A common mistake is incorrectly typing a function and then getting a confusing type error somewhere else.
Anyway, you can get info on the Num class by typing :i Num into ghci, or by reading the documentation. The Num class gives you +, *, -, negate, abs, signum, and fromInteger; in versions of base before 4.5 (GHC 7.4) it also brought in every function of Eq and Show through its superclasses. Notice that < and > aren't there! Requiring values of Num and attempting to compare them will in fact produce a type error, since not every kind of number can be compared.
So it should be Ord a => ..., as Num a => ... would produce a type error if you tried it.
If you think about what your functions do, you'll see that which xs returns the minimum value in xs. What can have a minimum value? A list of something Orderable!
Ask ghci and see what it says. I just copy-pasted your code as is into a file and loaded it into ghci. Then I used :t, which is a special ghci command to determine the type of something.
ghci> :t which
which :: (Ord t) => [t] -> t
ghci> :t worker
worker :: (Ord a) => a -> [a] -> a
Haskell's type inference is pretty smart in most cases; learn to trust it. Other answers sufficiently cover why Ord should be used in this case; I just wanted to make sure ghci was clearly mentioned as a technique for determining the type of something.
I would always go with the Ord type constraint. It is the most general, so it can be reused more often.
There is no advantage to using Num over Ord.
Int may have a small advantage as it is not polymorphic and would not require a dictionary lookup. I would still use Ord and use the SPECIALIZE pragma if I needed to for performance.
Edit: Altered my answer after comments.
It depends on what you want to be able to compare. If you want to be able to compare Double, Float, Int, Integer and Char then use Ord. If you only want to be able to compare Int then just use Int.
If you have another problem like this, just look at the instances of the type class to tell which types you want to be able to use in the function.
Ord documentation