How to traverse a list without losing elements - ocaml

so i have a function:
let rec add_rules start rules prod =
  match rules with
  | [] -> prod
  | nonterm :: lst ->
    if fst nonterm = start then
      add_rules start (List.remove_assoc (fst nonterm) rules) (nonterm :: prod)
    else if check_sym (fst nonterm) prod then
      add_rules start (List.remove_assoc (fst nonterm) rules) (nonterm :: prod)
    else
      add_rules start lst prod
and it takes an element called start, a list of pairs called rules, where each pair has the form (x, ys) with x an element and ys a list, and an initially empty list prod.
without getting too much into detail about the specifics of this function, i basically want it to traverse the list of pairs (rules) and add certain elements to the empty list prod.
problem: in the first two if/else if statements, my code successfully removes the pair corresponding to (fst nonterm) and passes in the entire original list (minus (fst nonterm)) back to the recursive function. however, in the last else statement, i recursively traverse through the list by calling the tail of the original rules list, passing that in to the function, and therefore ending up unable to ever access those heads ever again.
is there any way i can avoid this? or is there any way i can traverse the list without getting rid of the head each time?
thank you in advance!!!

Yes, you need to introduce an extra parameter for that, as @coredump has suggested.
So you will end up with three parameters: an accumulator prod that holds the result you're building; a queue of rules still to be processed, which shrinks with each step of the recursion (this is what your version currently calls rules); and rules itself, which holds all the rules (modulo those that you explicitly delete).
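A minimal sketch of that shape. The question does not show check_sym, so the definition below is only a guess at its meaning, and loop, queue and all_rules are hypothetical names for the helper and the two rule lists:

```ocaml
(* Hypothetical stand-in for the asker's check_sym, which the question does
   not show: true if sym occurs on the right-hand side of a collected rule. *)
let check_sym sym prod =
  List.exists (fun (_, rhs) -> List.mem sym rhs) prod

let add_rules start rules =
  (* queue:     rules still to be examined in the current pass
     all_rules: every rule not yet moved into prod
     prod:      accumulator for the result *)
  let rec loop queue all_rules prod =
    match queue with
    | [] -> prod
    | ((lhs, _) as rule) :: rest ->
      if lhs = start || check_sym lhs prod then
        (* Take the rule, then restart the scan over everything left. *)
        let remaining = List.remove_assoc lhs all_rules in
        loop remaining remaining (rule :: prod)
      else
        (* Skip it for now; it is still in all_rules for the next pass. *)
        loop rest all_rules prod
  in
  loop rules rules []
```

Each rule that is taken shrinks all_rules and each rule that is skipped shrinks queue, so the recursion terminates. Unlike the original, this version creates the empty prod itself instead of taking it as an argument.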

Related

Implementing own Union function without going through lists twice

I have to write a Union function using recursion
The output has to be the union (no duplicates) of two lists. My teacher said the implementation has to be recursive and we cannot go through the lists twice, but I don't think I can come up with a way of solving the problem without going through the lists twice.
My ideas which would solve the problem (but involve going through lists twice):
- Merge then remove duplicates
- Sorting the lists, then merge
Any hints or help would be appreciated
Edit: Well so I got to combine both lists by doing this:
union1 :: (Eq a) => [a] -> [a] -> [a]
union1 xs [] = xs
union1 [] ys = ys
union1 (x:xs)(y:ys) = x:y:union1(xs)(ys)
Then I thought I could use nub or a similar function to remove the duplicates but I got stuck thinking because then I would be going through the lists twice, right?
What is list union?
I would like to first point out that the requirements your teacher gave you are a bit vague. Moreover, union on multisets (i.e. sets that can have duplicates, like lists) has two different definitions in mathematics. I am no mathematician, but here is what I was able to glean from various internets. Here is one definition:
λ> [1,2,2,3,3,3] `unionA` [1,2,2,2,3] --also called multiset sum
[1,1,2,2,2,2,2,3,3,3,3]
This is simply (++), if you're not worried about ordering. And here is the other:
λ> [1,2,2,3,3,3] `unionB` [1,2,2,2,3]
[1,2,2,2,3,3,3] --picks out the max number of occurrences from each list
Adding to this confusion, Data.List implements a somewhat quirky third type of union that treats its left input differently from its right input. Here is approximately the documentation found in the comments of the source code of union from Data.List:
The union function returns the list union of the two lists. Duplicates, and elements of the first list, are removed from the second list, but if the first list contains duplicates, so will the result. For example,
λ> "dog" `union` "cow"
"dogcw"
Here, you have 3 possible meanings of "union of lists" to choose from. So unless you had example input and output, I don't know which one your teacher wants, but since the goal is probably for you to learn about recursion, read on...
Removing duplicates
Removing duplicates from unordered lists in Haskell can be done in linear time, but solutions involve either random-access data structures, like Arrays, or something called "productive stable unordered discriminators", like in the discrimination package. I don't think that's what your teacher is looking for, since the first is also common in imperative programming, and the second is too complex for a beginner Haskell course. What your teacher probably meant is that you should traverse each list explicitly only once.
So now for the actual code. Here I will show you how to implement union #3, starting from what you wrote, and still traverse the lists (explicitly) only once. You already have the basic recursion scheme, but you don't need to recurse on both lists, since we opted for option #3, and therefore return the first list unchanged.
Actual code
You'll see in the code below that the first list is used as an "accumulator": while recursing on the second list, we check each element for a duplicate in the first list, and if there isn't a duplicate, we add it to the front of the first list.
unionR [] r = r
unionR l [] = l
unionR ls (r:rs)
  | r `elem` ls = unionR ls rs      --discard head of second list if already seen
                                    --`elem` traverses its second argument,
                                    --but see discussion above.
  | otherwise   = unionR (r:ls) rs  --add head of second list to the front
As a side note, you can make this a bit more readable by using a fold:
union xs = foldl step xs where  --xs is our first list; no recursion on it,
                                --we use it right away as the accumulator.
  step els x
    | x `elem` els = els
    | otherwise = x : els
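For readers coming from the OCaml questions on this page, roughly the same fold can be written in OCaml. This is only a sketch of the idea above, not a port of Data.List.union, and the name union is chosen here for illustration:

```ocaml
(* The first list is the initial accumulator; each element of the second
   list is added to the front unless it is already present. List.mem still
   scans the accumulator, as `elem` does in the Haskell version. *)
let union xs ys =
  List.fold_left (fun acc y -> if List.mem y acc then acc else y :: acc) xs ys
```

Because new elements are consed onto the front, they end up in reverse order of appearance, ahead of the elements of the first list.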

custom unfold returning the accumulator

I'm trying to create a custom unfold function that returns its last accumulator value like:
val unfold' : generator:('State -> ('T * 'State) option) -> state:'State -> 'T list * 'State
I managed to make the following:
let unfold' generator state =
    let rec loop resultList state =
        match generator state with
        | Some (value, state) -> loop (value :: resultList) state
        | None -> (List.rev resultList), state
    loop [] state
But I wanted to avoid calling List.rev on the resulting list and instead generate it in the correct order from the start. I imagine it would be necessary to use continuations to build the list, but I'm quite new to functional programming and have not yet managed to wrap my mind around continuations; all the alternatives I can imagine would either put the accumulator inside the resulting list or not allow it to be returned by the function.
Is there some way to do this?
As this is a personal learning exercise I would prefer an answer explaining how to do it instead of simply giving the completed code.
The way to do without List.rev is to pass a function instead of the resultList parameter. Let's call that function buildResultList. At each step, this function would take the already-built tail of the list, prepend the current item, and then pass the result to the function from the previous step, which would prepend the previous item, pass it to the function from the previous-previous step, and so on. The very last function in this chain will prepend the very first item to the list. The result of the whole recursive loop would be the last function of the chain (it calls all the previous ones), which you would then call with the empty list as its argument. I'm afraid this is as clear as I can go without just writing the code.
However, the thing is, this wouldn't be any better, for any definition of "better". Since the computation is progressing "forward", and resulting list is built "backward" (head :: tail, Lisp-style), you have to accumulate the result somewhere. In your code, you're accumulating it in a temporary list, but if you modify it to use continuations, you'll be accumulating it on the heap as a series of closures that reference each other in a chain. One could argue that it would be, in essence, the same list, only obfuscated.
Another approach you could try is to use a lazy sequence instead: build a recursive seq computation, which will yield the current item and then yield! itself. You can then enumerate this sequence, and it won't require a "temporary" storage. However, if you still want to get a list at the end, you'll have to convert the sequence to a list via List.ofSeq, and guess how that's going to be implemented? Theoretically, from purely mathematical standpoint, List.ofSeq would be implemented in exactly the same way: by building a temp list first and then reversing it. But the F# library cheats here: it builds the list in a mutable way, so it doesn't have to reverse.
And finally, since this is a learning exercise, you could also implement the equivalent of a lazy sequence yourself. Now, the standard .NET sequences (aka IEnumerable<_>, which is what Seq<_> is an alias for) are inherently mutable: you're changing the internal state of the iterator every time you move to the next item. You can do that, or, in the spirit of learning, you can do an immutable equivalent. That would be almost like a list (i.e. head::tail), except that, since it's lazy, the "tail" has to be a promise rather than the actual sequence, so:
type LazySeq<'t> = LazySeq of (unit -> LazySeqStep<'t>)
and LazySeqStep<'t> = Empty | Cons of head: 't * tail: LazySeq<'t>
The way to enumerate is to invoke the function, and it will return you the next item plus the tail of the sequence. Then you can write your unfold as a function that returns current item as head and then just returns itself wrapped in a closure as tail. Turns out pretty simple, actually.
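A rough sketch of that idea, written in OCaml (the language used elsewhere on this page) rather than F#, so the syntax differs slightly from the types above. The names lazy_seq, step, unfold and take are illustrative assumptions, and the sketch does not try to return the final state:

```ocaml
(* A sequence is a suspended step; forcing it yields either Empty or the
   head together with the (still suspended) tail. *)
type 'a lazy_seq = LazySeq of (unit -> 'a step)
and 'a step = Empty | Cons of 'a * 'a lazy_seq

(* Nothing is computed until the closure is invoked. *)
let rec unfold generator state =
  LazySeq (fun () ->
    match generator state with
    | None -> Empty
    | Some (value, next) -> Cons (value, unfold generator next))

(* Force at most n elements into an ordinary list. *)
let rec take n (LazySeq force) =
  if n <= 0 then []
  else
    match force () with
    | Empty -> []
    | Cons (x, rest) -> x :: take (n - 1) rest
```

For example, take 2 (unfold (fun n -> if n = 0 then None else Some (n, n - 1)) 3) forces only the first two steps of a countdown from 3.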
Thanks to Fyodor Soikin's answer, here is the resulting function:
let unfold' generator state =
    let rec loop build state =
        match generator state with
        | Some (value, state) -> loop (fun newValue -> build (value :: newValue)) state
        | None -> (build []), state
    loop id state

How can you access the last element in a list in ocaml

I know that when using OCaml pattern matching it is possible to use h::t.
With this, h will refer to the first element of a list and t will refer to the rest of the list. Is it possible to use the same kind of matching to get the last element of a list, so that t refers to the last element and h refers to the rest of the list?
An example of code that this would be useful for is
let rec remove x y = match y with
[] -> x
| h::t -> remove (remove_element x (get_last y)) h
;;
If you want to get the last element then you can traverse the list recursively until you encounter this case: | [x] -> x
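A minimal sketch of that traversal. The name last and the choice to return an option (so that the empty list has a defined result) are assumptions, not part of the question:

```ocaml
(* Walk down the list until only one element is left. *)
let rec last = function
  | [] -> None
  | [x] -> Some x
  | _ :: t -> last t
```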
No, there's no pattern that matches against the end of a list. It's not an attractive structure in OCaml because it takes linear time to find the end of a list. OCaml pattern matching is supposed to be fast.
You can reverse your list and match the beginning of the reversed list. It's only a constant factor slower than finding the end of a list.
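As a sketch of the reversal approach, the hypothetical helper split_last below returns both the front part of the list and its last element, which is the closest equivalent to matching h::t from the other end:

```ocaml
(* Reverse once to reach the last element, then reverse the remainder
   back into its original order. *)
let split_last l =
  match List.rev l with
  | [] -> None
  | x :: rev_init -> Some (List.rev rev_init, x)
```

Each call costs two reversals, so it stays linear in the length of the list.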
As the other answers say, you have to traverse or reverse the list to access it.
Depending on the specific problem, you could consider using another data structure.
OCaml's standard library provides Queue, which could be of interest to you:
http://caml.inria.fr/pub/docs/manual-ocaml/libref/Queue.html

Writing a list append function in OCaml

I've defined a custom list type as part of a homework exercise.
type 'a myType =
  | Item of ('a * 'a myType)
  | Empty;;
I've already done 'length' and now I need an 'append' function.
My length function:
let length l =
  let rec _length n = function
    | Empty -> n
    | Item(_, next) -> _length (n + 1) next
  in _length 0 l;;
But I really don't know how to make the append function.
let append list1 list2 = (* TODO *)
I can't use the list module so I can't use either :: or @.
I guess my comments are getting too lengthy to count as mere comments. I don't really want to answer, I just want to give hints. Otherwise it defeats the purpose.
To repeat my hints:
a. The second parameter will appear unchanged in your result, so you can just
spend your time worrying about the first parameter.
b. You first need to know how to append something to an empty list. I.e., you need
to know what to do when the first parameter is Empty.
c. You next need to know how to break down the non-empty case into a smaller append
problem.
If you don't know how to create an item, then you might start by writing a function that takes (say) an integer and a list of integers and returns a new list with the integer at the front. Here is a function that takes an integer and returns a list containing just that one integer:
let list1 k =
  Item (k, Empty)
One way to think of this is that every time Item appears in your code, you're creating a new item. Item is called a constructor because it constructs an item.
I hope this helps.
Your structure is a list, so you should start by defining a value nil that is the empty list, and a function cons head tail that puts the head element in front of the list tail.
Another piece of advice: sometimes it helps a lot to start with a simple example and work through it manually, i.e. decompose what you want to do into simple operations that you carry out yourself. Then you can generalize and write the code...
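In keeping with the hints-only spirit of these answers, here is a sketch of just the two building blocks named above, nil and cons, for the type from the question. It deliberately stops short of append itself:

```ocaml
type 'a myType =
  | Item of ('a * 'a myType)
  | Empty

(* The empty list. *)
let nil = Empty

(* Build a new list with head in front of tail. *)
let cons head tail = Item (head, tail)
```

With these, the list 1, 2 can be written as cons 1 (cons 2 nil), which is a handy way to build small examples and work through them by hand.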

What is the easiest way to add an element to the end of the list?

As (::) : 'a -> 'a list -> 'a list is used to add an element to the beginning of a list, could anyone tell me if there is a function to add an element to the end of a list? If not, I guess List.rev (element :: (List.rev list)) is the most straightforward way to do it?
Thank you!
The reason there's not a standard function to do this is that appending at the end of a list is an anti-pattern (aka a "snoc list" or a Schlemiel the Painter algorithm). Adding an element at the end of a list requires a full copy of the list. Adding an element at the front of the list requires allocating a single cell—the tail of the new list can just point to the old list.
That said, the most straightforward way to do it is
let append_item lst a = lst @ [a]
list @ [element] should work. @ joins lists.
Given that this operation is linear, you should not use it in the "hot" part of your code, where performance matters. In a cold part, use list @ [element] as suggested by Adi. In a hot part, rewrite your algorithm so that you don't need to do that.
The typical way to do it is to accumulate results in the reverse order during processing, and then reverse the whole accumulated list before returning the result. If you have N processing steps (each adding an element to the list), you therefore amortize the linear cost of reverse over N elements, so you keep a linear algorithm instead of a quadratic one.
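A small illustration of that pattern, using a made-up task (collecting the squares of 1 to n) and the hypothetical name squares:

```ocaml
(* Cons each result onto the accumulator, which builds the list backwards,
   then reverse once at the end. Total cost stays linear in n. *)
let squares n =
  let rec loop i acc =
    if i > n then List.rev acc
    else loop (i + 1) ((i * i) :: acc)
  in
  loop 1 []
```

Writing acc @ [i * i] at each step instead would copy the accumulator every time and make the whole loop quadratic.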
In some cases, another technique that works is to process your elements in reverse order, so that the accumulated results turn out to be in the right order without an explicit reversal step.
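A sketch of this second technique on a made-up task (the squares of 1 to n again, under the hypothetical name squares_backward). It only applies when the inputs can be visited from last to first, as a counter can here:

```ocaml
(* Count down from n to 1; consing each square onto the front leaves the
   accumulator in ascending order, so no List.rev is needed. *)
let squares_backward n =
  let rec loop i acc =
    if i < 1 then acc
    else loop (i - 1) ((i * i) :: acc)
  in
  loop n []
```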