I'm curious about memory management in OCaml, specifically about when lists are shared during a program's execution. For example:
let rec insertAux v acc l =
  match l with
  | [] -> acc
  | h::t -> insertAux v ((v::h) :: acc) t;;

let insert v l = insertAux v l l;;

let rec sublist l =
  match l with
  | [] -> [[]]
  | head::tail -> insert head (sublist tail);;
What elements/lists, in insertAux, are copied or shared?
Before I can tell what will be shared and where, I need to ensure that we have a common definition of the word "share". I would propose the following definition: "A value is shared between two data structures iff they both contain a pointer to that value".
Let's first look at the insertAux function, which takes three values and produces a result. So let's infer the sharing relation between the input values and the result. If l is empty, then there is no sharing between v and the result, and no sharing between l and the result. Finally, the acc value is 100% shared with the result; these two values are the same.
This was the simple base case. Now let's look at the inductive step:
| h::t -> insertAux v ((v::h) :: acc) t
Let's bind the intermediate values to names, so that we can easily refer to them in the text:
| h::t ->
  let vh = v :: h in
  let vhacc = vh :: acc in
  let result = insertAux v vhacc t in
  result
The vh value will share values with both v and h. To create vh, OCaml will allocate a new linked-list node, which is a pair of pointers: one pointer will reference v, and the other will reference h. The value vhacc will share values with vh and acc. Since the sharing relation is transitive, this means that it will share values with v, h, and acc. Internally, OCaml will create another pair of pointers, with one pointing to vh and the other to acc. By induction, the result will share v, h, and t.
To summarize, insertAux will build a new value that shares all components of the input values. It will allocate 2*N nodes to connect the shared values in a new way, where N is the length of the list l.
The function let insert v l = insertAux v l l will produce a value that shares both input values. The result is a list consisting of N new cells at the front, one per element of l, followed by the original list l itself (shared) as the tail; each new element has a pointer to v as its head and a (shared) element of l as its tail.
Finally, the sublist function will produce a value that shares its input. It will create a list containing 2^N elements, where each element is a sublist of the original list, freshly built from the (shared) components of the input list.
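To make the sharing observable, here is a small check of my own (not part of the original question), using the physical-equality operator (==), which is true only when its two arguments are the very same heap object; insertAux and insert are assumed to be in scope as defined above:
let () =
  let l = [[2]; []] in
  let r = insert 1 l in
  (* r is [[1]; [1; 2]; [2]; []]; per element of l only two new cons cells
     were allocated, everything else is shared with the inputs *)
  assert (List.tl (List.tl r) == l);   (* the original acc is the tail of r *)
  assert (List.nth r 2 == List.hd l);  (* the element [2] itself is shared  *)
  print_endline "sharing observed"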
To summarize, OCaml will share all values. If the values have mutable fields, this sharing may cause problems. If they are immutable, then it is absolutely transparent (i.e., invisible, it does not affect the semantics) to the programmer, and one can reason about them as if they were always copied, with every constructor creating a totally new value without any sharing, if that makes things easier. In fact, sharing is only meaningful for mutable data structures. Moreover, further compiler optimizations, like Common Subexpression Elimination (CSE), may find even more opportunities for sharing. There are other optimization techniques that may reuse existing values and mutate them in place, if it is possible to prove that they are unused in other parts of the program (although, to my knowledge, OCaml currently does not perform this optimization).
One more thing to know: OCaml represents values uniformly, either as a word if a value can fit into a word, or as a pointer to a heap-allocated block if it can't. Basically, this means that all values that fit into an OCaml word are unboxed (e.g., ints, nullary constructors, chars, etc.).
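As a quick, unofficial illustration of this (my addition, using the Obj module purely for inspection): an int is stored as an immediate word, while a cons cell is a pointer to a heap block.
let () =
  assert (Obj.is_int (Obj.repr 42));         (* immediate, unboxed *)
  assert (not (Obj.is_int (Obj.repr [1])))   (* boxed: pointer to a heap block *)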
Implement a function ticktack that has 2 arguments. The first argument is a tuple of natural numbers and defines the number of rows and columns of the playing field. The second argument is a list containing the record of a tic-tac-toe match, given as the coordinates on which player 'x' and player 'o' played in turns. Print the actual state of the game in such a way that the playing field is bordered by the characters '-' and '|', empty squares are ' ', and the characters 'x' and 'o' appear on the squares where the players have played.
ticktack::(Int,Int) -> [(Int,Int)] -> Result
I already tried something like this:
type Result = [String]
pp :: Result -> IO ()
pp x = putStr (concat (map (++"\n") x))
ticktack::(Int,Int) -> [(Int,Int)] -> Result
ticktack (0,0) (x:xs) = []
ticktack (a,b) [] = []
ticktack (a,b) (x:xs) =[ [if a == fst x && b == snd x then 'x' else ' ' | b <- [1..b]]| a <- [1..a]] ++ ticktack (a,b) xs
But it returns N [String]s, each with only one move filled in, so I need to merge these results into one [String].
As @luqui says in a comment to the question:
You could either merge the outputs ... or you could search the history once for each space in
the board. ...
The former solution is described in a nearby question. The "chess" problem solved there is only superficially distinct from your "noughts & crosses" problem, so it should not be too hard to adapt the solution. However:
In that case, the board size is fixed and small, so we were not worried about the inefficiency
of merging the boards pairwise.
In this case, the board size is variable, so a solution by the latter method may be worth a try.
To make the algorithm even more efficient, instead of scrolling across the board and searching for
matching moves at every cell, we will scroll across the moves and assign values to a board
represented as a mutable array. Mutable arrays may be considered an "advanced technique" in
functional programming, so it could also be a good exercise for an intermediate Haskeller. I only
used them once or twice before, so let us see if I can figure this out!
How is this going to work?
At the heart of the program will be a rectangular array of bytes. Arrays come in two flavours: mutable and "frozen". A frozen array cannot be changed, while a mutable array may, as a rule, only exist in a monadic context, so we can only freely pass an array around when it is frozen.
If this seems to be overcomplicated, I can only ask the reader to believe that the additional
safety guarantees are worth this complication.
Anyway, here are the types:
type Position = (Int, Int)
type Field s = STUArray s Position Char
type FrozenField = UArray Position Char
We will create a function that "applies" a list of moves to an array, thawing and freezing it as
needed.
type Move = (Char, Position)
applyMoves :: FrozenField -> [Move] -> FrozenField
(The idea of Move is that it is sufficient to put a mark on the board, without needing to know
whose turn it is.)
Applied to an empty field of the appropriate size, this function will solve our problem — we shall
only need to adjust the format of the input and the output.
empty :: Position -> FrozenField
positionsToMoves :: [Position] -> [Move]
arrayToLists :: FrozenField -> [[Char]]
Our final program will then look like this:
tictac :: Position -> [Position] -> IO ()
tictac corner = pp . arrayToLists . applyMoves (empty corner) . positionsToMoves
I hope it looks sensible, even though we have not yet written any tangible code.
Can we write the code?
Yes.
First, we will need some imports. No one likes imports, but, for some reason, it is not yet
automated. So, here:
import Data.Foldable (traverse_)
import Data.Array.Unboxed
import Data.Array.ST
import GHC.ST (ST)
The simplest thing one can do with arrays is to create an empty one. Let us give it a try:
empty :: Position -> FrozenField
empty corner = runSTUArray (newArray ((1, 1), corner) ' ')
The idea is that newArray claims a region in memory and fills it with spaces, and runSTUArray
freezes it so that it can be safely transported to another part of a program. We could instead
"inline" the creation of the array and win some speed, but we only need to do it once, and I
wanted to keep it composable — I think the program will be simpler this way.
Another easy thing to do is to write the "glue" code that adjusts the input and output format:
positionsToMoves :: [Position] -> [Move]
positionsToMoves = zip (cycle ['x', 'o'])
arrayToLists :: FrozenField -> [[Char]]
arrayToLists u =
  let ((minHeight, minWidth), (maxHeight, maxWidth)) = bounds u
  in  [ [ u ! (row, column) | column <- [minWidth .. maxWidth] ] | row <- [minHeight .. maxHeight] ]
Nothing unusual here, run-of-the-mill list processing.
Finally, the hard part — the code that applies any number of moves to a given frozen array:
applyMoves :: FrozenField -> [Move] -> FrozenField
applyMoves start xs = runSTUArray (foldST applyMove (thaw start) xs)
  where
    foldST :: (a -> b -> ST s ()) -> ST s a -> [b] -> ST s a
    foldST f start' moves = do
      u <- start'
      traverse_ (f u) moves
      return u

    applyMove :: Field s -> Move -> ST s ()
    applyMove u (x, i) = writeArray u i x
The pattern is the same as in the function empty: modify an array, then freeze it — and all the
modifications have to happen in an ST monad, for safety. foldST contains all the
"imperative" "inner loop" of our program.
(P.S.) How does this actually work?
Let us unwrap the UArray and STUArray types first. What are they and what is the difference?
UArray means "unboxed array", which is to say an array of values, as opposed to an array of
pointers. The value in our case is actually a Unicode character, not a C "byte" char, so it is not a byte, but a variable
size entity. When it is stored in unboxed form, it is converted to an Int32 and back invisibly
to us. An Int32 is of course way too much for our humble purpose of storing 3 different values,
so there is space for improvement here. To find out more about unboxed values, I invite you to
check the article that introduced them back in 1991, "Unboxed Values as First Class Citizens in
a Non-Strict Functional Language".
That the values are unboxed does not mean that you can change them though. A pure value in Haskell
is always immutable. So, were you to change a single value in an array, the whole array would be
copied — expensive! This is where STUArray comes in. ST stands for State Thread, and what
STUArray is is an "unfrozen" array, where you can overwrite individual values without copying
the whole thing. To ensure safety, it can only live in a monad, in this case the ST monad.
(Notice how an STUArray value never appears outside of an ST s wrap.) You can imagine an
ST computation as a small imperative process with its own memory, separate from the outside
world. The story goes that they invented ST first, and then figured out they can get IO from
it, so IO is actually ST in disguise. For more details on how ST works, check out the
original article from 1994: "Lazy Functional State Threads".
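If you have never seen ST before, here is a tiny, self-contained sketch (my addition, not from the original answer) of that "small imperative process with its own memory": we sum a list by mutating a local STRef, and the mutation never leaks outside runST.
import Control.Monad.ST (runST)
import Data.STRef (newSTRef, modifySTRef', readSTRef)

sumST :: [Int] -> Int
sumST xs = runST $ do
  acc <- newSTRef 0                        -- allocate a private mutable cell
  mapM_ (\x -> modifySTRef' acc (+ x)) xs  -- the "inner loop", running in ST
  readSTRef acc                            -- read it out; runST returns a pure Int
-- sumST [1, 2, 3] == 6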
Let us now take a more careful look at foldST. What we see is that functionally, it does not
make sense. First we bind the value of start' to u, and then we return u — the same
variable. From the functional point of view, this is the same as writing:
u <- start'
return u
— which would be equivalent to just start' by the monad laws. The trick is in what happens in between:
traverse_ (f u) moves
Let us check the type.
λ :type traverse_
traverse_ :: (Foldable t, Applicative f) => (a -> f b) -> t a -> f ()
So, some function is being called, with u as argument, but the result is the useless () type.
In a functional setting, this line would mean nothing. But in a monad, bound values may appear
to change. f is a function that can change the state of a monad, and so can change the value of
the bound names when they are returned. The analogous code in C would go somewhat like this:
char* foldST(void f(char*, Move), int n_start, char start[], int n_moves, Move moves[])
{
    // u <- start
    char* u = malloc(sizeof(char) * n_start);
    memcpy(u, start, sizeof(char) * n_start);
    // traverse_ (f u) moves
    for (int i = 0; i < n_moves; i++)
    {
        f(u, moves[i]);
    }
    // return u
    return u;
}
In Haskell, the pointer arithmetic is abstracted away, but essentially traverse_ in ST works
like this. I am not really familiar with C nor with the inner workings of the ST abstraction, so
this is merely an analogy, not an attempt at a precise rendition. Nevertheless I hope it helps the reader to observe the similarity between ST and ordinary imperative C code.
Mission accomplished!
It runs reasonably fast. Takes only a moment to draw a million-step match on a million-sized
board. I hope it is also explained clearly enough. Do not hesitate to comment if something is amiss or unclear.
I want to convert a sequence to a list using List.init, retrieving at each step the i-th value of s.
let to_list s =
  let n = length s in
  List.init n
    (fun _i ->
      match s () with
      | Nil -> assert false
      | Cons (a, sr) -> a)
This is giving me a list initialized with the first element of s only. Is it possible in OCaml to initialize the list with all the values of s?
It may help to study the definition of List.init.
There are two variations depending on the size of the list: a tail recursive one, init_tailrec_aux, whose result is in reverse order, and a basic one, init_aux. They have identical results, so we need only look at init_aux:
let rec init_aux i n f =
  if i >= n then []
  else
    let r = f i in
    r :: init_aux (i+1) n f
This function recursively increments a counter i until it reaches a limit n. For each value of the counter that is strictly less than the limit, it adds the value given by f i to the head of the list being produced.
The question now is, what does your anonymous function do when called with different values of i?:
let f_anon =
(fun _i -> match s () with
|Nil -> assert false
|Cons(a, sr) -> a)
Regardless of _i, it always gives the head of the list produced by s (), and if s () always returns the same list, then f_anon 0 = f_anon 1 = f_anon 2 = f_anon 3 = hd (s ()).
Jeffrey Scofield's answer describes a technique for giving a different value at each _i, and I agree with his suggestion that List.init is not the best solution for this problem.
The essence of the problem is that you're not saving sr, which would let you retrieve the next element of the sequence.
However, the slightly larger problem is that List.init passes only an int as an argument to the initialization function. So even if you did keep track of sr, there's no way it can be passed to your initialization function.
You can do what you want using the impure parts of OCaml. E.g., you could save sr in a global reference variable at each step and retrieve it in the next call to the initialization function. However, this is really quite a cumbersome way to produce your list.
I would suggest not using List.init. You can write a straightforward recursive function to do what you want. (If you care about tail recursion, you can write a slightly less straightforward function.)
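For completeness, here is a minimal sketch (my addition, not part of the original answer) of the straightforward recursive function suggested above, assuming s is an ordinary Seq.t with the constructors written out in full:
let rec to_list (s : 'a Seq.t) : 'a list =
  match s () with
  | Seq.Nil -> []
  | Seq.Cons (a, sr) -> a :: to_list sr   (* keep sr and recurse on it *)

(* a tail-recursive variant, for very long sequences *)
let to_list_tr s =
  let rec go acc s =
    match s () with
    | Seq.Nil -> List.rev acc
    | Seq.Cons (a, sr) -> go (a :: acc) sr
  in
  go [] s

(* usage: to_list (List.to_seq [1; 2; 3]) = [1; 2; 3] *)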
Using a recursive function will increase the complexity, so I think that directly initializing the list (or array) at the corresponding length would be better, but I don't really know how to get a different value at each _i like Jeffrey Scofield said. I am not really familiar with OCaml, especially sequences, so I have some difficulties doing that :(
So I'm trying to write some minimal code to put two lists of strings together, and to do this I thought it was best to use the Haskell map function.
Essentially I want to be able to do adders ["1","2"] ["3","4"] = ["1","2","3","4"]
So I have a function called adder, which takes a list, then adds a string to that list and returns the new list. Then I have a function called adders which replicates the adder function, but adds a list of strings instead of just one string, however at the moment it produces multiple lists instead of one list.
I thought
adder :: [String] -> String -> [String]
adder y x = y ++ [x]
adders y x = map (adder y) x
would work, but this just gives a list of two lists:
[["1","2","3"],["1","2","4"]]
What is the best way to go about this?
I thought it was best to use the haskell map function
No. map f applies f to every element of your list. But you don't want to change the elements at all, you want to change the list itself. That, however, is out of scope of the things that are possible with map. map cannot add more elements, neither can it remove some.
If you want to concatenate two lists, simply use ++:
adders :: [a] -> [a] -> [a]
adders x y = x ++ y
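A quick check in GHCi (my addition), confirming it gives the output asked for in the question:
λ adders ["1","2"] ["3","4"]
["1","2","3","4"]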
Is there a more efficient way to update an element in a list in Elm than mapping over every element?
{ model | items = List.indexedMap (\i x -> if i == 2 then "z" else x) model.items }
Maybe Elm's compiler is sophisticated enough to optimize this so that map or indexedMap isn't unnecessarily copying every element other than the one being changed. What about nested lists?
Clojure has assoc-in to update an element inside a nested list or record (can be combined too). Does Elm have an equivalent?
More efficient in terms of amount of code would be the following (it is similar to @MichaelKohl's answer):
List.take n list ++ newN :: List.drop (n+1) list
PS: if n is < 0 or n > (length of list - 1), then the new item will be added at the front or at the end of the list, respectively.
PPS: I seem to recall that a :: alist is slightly better performing than [a] ++ alist.
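For reuse, that one-liner can be wrapped in a small named helper; this is a sketch of my own, with a usage comment:
setAt : Int -> a -> List a -> List a
setAt n newN list =
    List.take n list ++ newN :: List.drop (n + 1) list

-- setAt 2 "z" [ "a", "b", "c", "d" ] == [ "a", "b", "z", "d" ]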
If you mean efficient in terms of performance/ number of operations:
As soon as your lists get large, it is more efficient to use an Array (or a Dict) instead of a List as your type.
But there is a trade-off:
Array and Dict are very efficient/ performant when you frequently retrieve/ update/ add items.
List is very performant when you do frequent sorting and filtering and other operations where you actually need to map over the entire set.
That is why in my code, List is what I use a lot in view code. On the data side (in my update functions) I use Dict and Array more.
Basically, an Elm List is not meant for such a use case. Instead, consider using an Array. Array provides a set function you can use for what is conceptually an in-place update. Here's an example:
import Html exposing (text)
import Array
type alias Model = { items : Array.Array String }
model =
    { items = Array.fromList ["a", "b", "c"]
    }

main =
    let
        m = { model | items = Array.set 2 "z" model.items }
        z = Array.get 2 m.items
        output = case z of
            Just n -> n
            Nothing -> "Nothing"
    in
    text output -- The output will be "z"
If for some reason you need model.items to be a List, note that you can convert back and forth between Array and List.
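For instance, a quick round-trip sketch of my own, using the Array module already imported above:
backToList : List String
backToList =
    Array.toList (Array.fromList [ "a", "b", "c" ])
-- backToList == [ "a", "b", "c" ]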
I'm not overly familiar with Elm, but given that it's immutable by default, I'd assume it uses structural sharing for its underlying data structures, so your concern re memory may be unfounded.
Personally I think there's nothing wrong with your approach posted above, but if you don't like it, you can try something like this (or List.concat):
List.take n list ++ newN :: List.drop (n+1) list
I'm definitely not an Elm expert, but a look at Elm's List documentation did not reveal any function to update the element at a given index in a list.
I like Michael's answer. It's quite elegant. If you prefer a less elegant, recursive approach, you can do something like the following. (Like I said, I'm not an Elm expert, but hopefully the intention of the code is clear if it's not quite right. Also, I don't do any error handling.)
updateListAt : List a -> Int -> a -> List a
updateListAt list i x =
    case list of
        [] -> []
        head :: tail ->
            if i == 0 then x :: tail else head :: updateListAt tail (i - 1) x
However, both the runtime and space complexity will be O(n) in both the average and worst cases, regardless of the method used. This is a consequence of Elm's List being a single-linked list.
Regarding assoc-in, if you look at the Clojure source, you'll see that assoc-in is just recursively defined in terms of assoc. However, I think you'd have trouble typing it for arbitrary, dynamic depth in Elm.
Here's what I've got so far...
fun positive l1 = positive(l1,[],[])
  | positive (l1, p, n) =
    if hd(l1) < 0
    then positive(tl(l1), p, n @ [hd(l1)])
    else if hd(l1) >= 0
    then positive(tl(l1), p @ [hd(l1)], n)
    else if null (h1(l1))
    then p
Yes, this is for my educational purposes. I'm taking an ML class in college and we had to write a program that would return the biggest integer in a list and I want to go above and beyond that to see if I can remove the positives from it as well.
Also, if possible, can anyone point me to a decent ML book or primer? Our class text doesn't explain things well at all.
You fail to mention that your code doesn't typecheck.
Your first function clause just has the variable l1, which is then used in the recursive call. However, there it is used as the first element of the triple that is given as the argument. This doesn't really go hand in hand with the Hindley–Milner type system that SML uses. This is perhaps better seen by the following informal thoughts:
Let's start by assuming that l1 has the type 'a, and thus the function must take an argument of that type and return something unknown: 'a -> .... However, on the right-hand side you create an argument (l1, [], []), which must have the type 'a * 'b list * 'c list. But since it is passed as an argument to the function, that must also mean that 'a is equal to 'a * 'b list * 'c list, which clearly is not the case.
Clearly this was not your original intent. It seems that your intent was to have a function that takes a list as its argument, and at the same time a recursive helper function that takes two extra accumulator arguments, namely the lists of positive and negative numbers seen so far in the original list.
To do this, you at least need to give your helper function another name, such that its definition won't rebind the definition of the original function.
Then you have some options as to which scope this helper function should be in. In general, if it doesn't make any sense to call this helper function other than from the "main" function, then it should not be placed in a scope outside the "main" function. This can be done using a let binding like this:
fun positive xs =
  let
    fun positive' ys p n = ...
  in
    positive' xs [] []
  end
This way the helper function positive' can't be called outside of the positive function.
With this taken care of, there are some more issues with your original code.
Since you are only returning the list of positive integers, there is no need to keep track of the
negative ones.
You should be using pattern matching to decompose the list elements. This way you eliminate the
use of taking the head and tail of the list, and also the need to verify whether there actually is
a head and tail in the list.
fun foo [] = ... (* input list is empty *)
| foo (x::xs) = ... (* x is now the head, and xs is the tail *)
You should not use the append operator (@) whenever you can avoid it (which you always can). The problem is that it has a terrible running time when you have a huge list on the left-hand side and a small list on the right-hand side (which is often the case for the right-hand side, as it is mostly used to append a single element). Thus it should in general be considered bad practice to use it.
However, there exists a very simple solution to this, which is to always cons the element onto the front of the list (constructing the list in reverse order), and then just reverse the list as the last thing before returning it (putting it back in the expected order):
fun foo [] acc = rev acc
| foo (x::xs) acc = foo xs (x::acc)
Given these small notes, we end up with a function that looks something like this
fun positive xs =
  let
    fun positive' [] p = rev p
      | positive' (y::ys) p =
        if y < 0 then
          positive' ys p
        else
          positive' ys (y :: p)
  in
    positive' xs []
  end
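For instance, a quick check of my own (not part of the original answer), in the REPL:
positive [1, ~2, 3, 0, ~5];
(* val it = [1, 3, 0] : int list *)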
Have you learned about List.filter? It might be appropriate here - it takes a function (which is a predicate) of type 'a -> bool and a list of type 'a list, and returns a list consisting of only the elements for which the predicate evaluates to true. For example:
List.filter (fn x => Real.>= (x, 0.0)) [1.0, 4.5, ~3.4, 42.0, ~9.0]
Your existing code won't work because you're comparing to integers using the int version of <. The code hd(l1) < 0 will work over a list of int, not a list of real. Numeric literals are not automatically coerced by Standard ML. One must explicitly write 0.0, and use Real.< (hd(l1), 0.0) for your test.
If you don't want to use filter from the standard library, you could consider how one might implement filter yourself. Here's one way:
fun filter f [] = []
| filter f (h::t) =
if f h
then h :: filter f t
else filter f t
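With a filter like this (or the Basis Library's List.filter) in hand, the original task becomes a one-liner; a sketch of my own, staying with ints as in the question:
fun positive xs = List.filter (fn x => x >= 0) xs
(* positive [1, ~2, 3, 0, ~5] = [1, 3, 0] *)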