OCaml Style for a function that merges two sorted lists into one sorted list - ocaml

I am new to OCaml and I am auditing a class. I have a homework prompt that reads:
"merge xs ys takes two integer lists, each sorted in increasing order,
and returns a single merged list in sorted order."
I have successfully written a function that works:
let rec merge xs ys = match xs with
  | [] -> ys
  | hxs::txs -> if hxs <= (match ys with
                           | [] -> hxs
                           | hys::tys -> hys)
                then hxs :: merge txs ys
                else match ys with
                  | [] -> xs
                  | hys::tys -> hys :: merge xs tys in
merge [-1;2;3;100] [-1;5;1001]
;;
I would like to know whether my code is considered acceptable OCaml style. I want to avoid forming any bad habits. It feels compositionally dense, but maybe that's because I'm still not used to OCaml.
Thanks.

I personally find it hard to follow if hxs <= (match ...), and it's difficult to format it nicely. So I would probably write
...
    let hys =
      match ys with
      | [] -> hxs
      | hys :: _ -> hys
    in
    if hxs < hys then
      hxs :: merge txs ys
...
However, I think it might be even better to match both xs and ys at the same time:
let rec merge xs ys =
  match xs, ys with
  | [], _ -> ys
  | _, [] -> xs
  | hx :: txs, hy :: tys ->
    if hx < hy then hx :: merge txs ys else hy :: merge xs tys
I think this captures the symmetry of the problem better.
I think it's good when the length of the code matches well with the simplicity of the problem it solves. Merging is simple to state, and so the code shouldn't need to be long (it seems to me).

Related

How to get the Index of an element in a list, by not using "list comprehensions"?

I'm new to Haskell programming and I'm trying to solve a problem both with and without list comprehensions.
The problem is to find the index of an element in a list and return a list of the indexes where the element was found.
I already solved the problem using a list comprehension, but now I'm having trouble solving it without one.
My recursive approach:
I tried to zip the list [0..(length list)] with the list itself.
Then, if the element a equals an element in the list, I build a new list from the first element of the tuple from the zipped list (my index), and after that I call the function recursively until the list is [].
That's my list comprehension (works):
positions :: Eq a => a -> [a] -> [Int]
positions a list = [x | (x,y) <- zip [0..(length list)] list, a == y]
That's my recursive way (not working):
positions' :: Eq a => a -> [a] -> [Int]
positions' _ [] = []
positions' a (x:xs) =
    let ((n,m):ns) = zip [0..(length (x:xs))] (x:xs)
    in if (a == m) then n:(positions' a xs)
       else (positions' a xs)
*sorry I don't know how to highlight words
but ghci says:
*Main> positions' 2 [1,2,3,4,5,6,7,8,8,9,2]
[0,0]
and it should be like that (my list comprehension):
*Main> positions 2 [1,2,3,4,5,6,7,8,8,9,2]
[1,10]
Where is my mistake ?
The problem with your attempt is simply that when you say:
let ((n,m):ns) = zip [0..(length (x:xs))] (x:xs)
then n will always be 0. That's because you are matching (n,m) against the first element of zip [0..(length (x:xs))] (x:xs), which will necessarily always be (0,x).
That's not a problem in itself - but it does mean you have to handle the recursive step properly. The way you have it now, positions _ _, if non-empty, will always have 0 as its first element, because the only way you allow it to find a match is if it's at the head of the list, resulting in an index of 0. That means that your result will always be a list of the correct length, but with all elements 0 - as you're seeing.
The problem isn't with your recursion scheme though, it's to do with the fact that you're not modifying the result to account for the fact that you don't always want 0 added to the front of the result list. Since each recursive call just adds 1 to the index you want to find, all you need to do is map the increment function (+1) over the recursive result:
positions' :: Eq a => a -> [a] -> [Int]
positions' _ [] = []
positions' a (x:xs) =
    let ((0,m):ns) = zip [0..(length (x:xs))] (x:xs)
    in if (a == m) then 0:(map (+1) (positions' a xs))
       else (map (+1) (positions' a xs))
(Note that I've changed your let to be explicit that n will always be 0 - I prefer to be explicit this way but this in itself doesn't change the output.) Since m is always bound to x and ns isn't used at all, we can elide the let, inlining the definition of m:
positions' :: Eq a => a -> [a] -> [Int]
positions' _ [] = []
positions' a (x:xs) =
    if a == x
        then 0 : map (+1) (positions' a xs)
        else map (+1) (positions' a xs)
You could go on to factor out the repeated map (+1) (positions' a xs) if you wanted to.
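One way that factoring might look is sketched below. The name positionsShifted is made up for this illustration so it does not clash with the definitions above; the shared recursive result is bound once in a where clause.

```haskell
-- Sketch only: positionsShifted is a hypothetical name, not from the question.
positionsShifted :: Eq a => a -> [a] -> [Int]
positionsShifted _ [] = []
positionsShifted a (x:xs)
  | a == x    = 0 : rest   -- match at the head: index 0, then the shifted tail
  | otherwise = rest       -- no match: just the shifted tail
  where
    -- indexes found in the tail, each shifted by one position
    rest = map (+1) (positionsShifted a xs)
```

Because of laziness, rest is only computed once per call even though it appears in both guards.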
Incidentally, you didn't need explicit recursion to avoid a list comprehension here. For one, list comprehensions are basically a replacement for uses of map and filter. I was going to write this out explicitly, but I see @WillemVanOnsem has given this as an answer so I will simply refer you to his answer.
Another way, although perhaps not acceptable if you were asked to implement this yourself, would be to just use the built-in elemIndices function, which does exactly what you are trying to implement here.
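For reference, a minimal sketch of that built-in route. elemIndices lives in Data.List; the wrapper name positionsBuiltin is only an illustrative choice.

```haskell
import Data.List (elemIndices)

-- Hypothetical wrapper name; elemIndices already has exactly this type.
positionsBuiltin :: Eq a => a -> [a] -> [Int]
positionsBuiltin = elemIndices
```

With the list from the question, positionsBuiltin 2 [1,2,3,4,5,6,7,8,8,9,2] gives [1,10].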
We can make use of a filter :: (a -> Bool) -> [a] -> [a] and map :: (a -> b) -> [a] -> [b] approach, like:
positions :: Eq a => a -> [a] -> [Int]
positions x = map fst . filter ((x ==) . snd) . zip [0..]
We thus first construct tuples of the form (i, yi), next we filter such that we only retain these tuples for which x == yi, and finally we fetch the first item of these tuples.
For example:
Prelude> positions 'o' "foobaraboof"
[1,2,8,9]
Your
let ((n,m):ns) = zip [0..(length (x:xs))] (x:xs)
is equivalent to
== {- by laziness -}
let ((n,m):ns) = zip [0..] (x:xs)
== {- by definition of zip -}
let ((n,m):ns) = (0,x) : zip [1..] xs
== {- by pattern matching -}
let {(n,m) = (0,x)
    ; ns = zip [1..] xs }
== {- by pattern matching -}
let { n = 0
    ; m = x
    ; ns = zip [1..] xs }
but you never reference ns! So we don't need its binding at all:
positions' a (x:xs) =
    let { n = 0 ; m = x } in
    if (a == m) then n : (positions' a xs)
    else (positions' a xs)
and so, by substitution, you actually have
positions' :: Eq a => a -> [a] -> [Int]
positions' _ [] = []
positions' a (x:xs) =
    if (a == x) then 0 : (positions' a xs)   -- NB: 0
    else (positions' a xs)
And this is why all you ever produce are 0s. But you want to produce the correct index: 0, 1, 2, 3, ....
First, let's tweak your code a little bit further into
positions' :: Eq a => a -> [a] -> [Int]
positions' a xs = go xs
  where
    go [] = []
    go (x:xs) | a == x    = 0 : go xs   -- NB: 0
              | otherwise = go xs
This is known as a worker/wrapper transform. go is a worker, positions' is a wrapper. There's no need to pass a around from call to call, it doesn't change, and we have access to it anyway. It is in the enclosing scope with respect to the inner function, go. We've also used guards instead of the more verbose and less visually apparent if ... then ... else.
Now we just need to use something -- the correct index value -- instead of 0.
To use it, we must have it first. What is it? It starts as 0, then it is incremented on each step along the input list.
When do we make a step along the input list? At the recursive call:
positions' :: Eq a => a -> [a] -> [Int]
positions' a xs = go xs 0
  where
    go [] _ = []
    go (x:xs) i | a == x    = 0 : go xs (i+1)   -- NB: 0
                | otherwise = go xs (i+1)
_ as a pattern means we don't care about the argument's value -- it's there but we're not going to use it.
Now all that's left for us to do is to use that i in place of that 0.
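For completeness, here is a sketch of what that last substitution could look like. The name positionsAcc and the argument name xs0 are hypothetical, picked only to keep this separate from the versions above.

```haskell
-- Sketch only: the worker carries the current index i alongside the list.
positionsAcc :: Eq a => a -> [a] -> [Int]
positionsAcc a xs0 = go xs0 0
  where
    go [] _ = []
    go (x:xs) i
      | a == x    = i : go xs (i + 1)   -- emit the real index instead of 0
      | otherwise = go xs (i + 1)
```

Each result index is produced as soon as its element is reached, so this also works lazily on long lists.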

Beginner Haskell | Group List of Ints

I'm trying to make a function that makes a list of list of ints, can you lend a hand?
groupUp :: [Int] -> [[Int]]
example:
groupUp [1,2,2,3,3,3] == [[1],[2,2],[3,3,3]]
The closest I could come was:
groupUp [] = [[]]
groupUp (x:[]) = []
groupUp (x:y:xs)
    | x==y = [x,y] : groupUp (xs)
    | otherwise = [x] : groupUp (y:xs)
But this limits the list to a group of maximum 2 (pairs) and not more. What should I change?
Edit: this one works, thx for the help!
groupUp xs = helper 0 xs
  where helper _ [] = []
        helper i xs = takeWhile (==(xs!!i)) xs : helper (i) (dropWhile (==(xs!!i)) xs)
Instead of laboriously comparing single elements, use a function that collects elements for as long as some condition holds.
Prelude> span (==2) [2,2,3,3,3,4,4,4,4]
([2,2],[3,3,3,4,4,4,4])
Then, recurse, using the remainder of that:
groupUp [] = []   -- not [[]] as in the question, which would add a spurious empty group at the end
groupUp (x:xs) = case span (==x) xs of
    (thisGroup, others) -> (x:thisGroup) : groupUp others
Of course you can also define a version of span yourself if you prefer.
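As a rough sketch of what such a hand-written span might look like (spanWhile and groupUpWith are hypothetical names used only here):

```haskell
-- Sketch only: a hand-rolled equivalent of the Prelude's span.
spanWhile :: (a -> Bool) -> [a] -> ([a], [a])
spanWhile _ [] = ([], [])
spanWhile p xs@(x:rest)
  | p x       = let (ys, zs) = spanWhile p rest in (x : ys, zs)
  | otherwise = ([], xs)   -- first failing element: stop and return the remainder

-- The grouping function from above, built on the hand-rolled helper.
groupUpWith :: [Int] -> [[Int]]
groupUpWith [] = []
groupUpWith (x:xs) =
  let (same, others) = spanWhile (== x) xs
  in (x : same) : groupUpWith others
```

If using a library function is allowed, Data.List.group already performs this grouping of adjacent equal elements.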

Intersection of infinite lists

I know from computability theory that it is possible to take the intersection of two infinite lists, but I can't find a way to express it in Haskell.
The traditional method fails as soon as the second list is infinite, because you spend all your time checking it for a non-matching element in the first list.
Example:
let ones = 1 : ones -- an unending list of 1s
intersect [0,1] ones
This never yields 1, as it never stops checking ones for the element 0.
A successful method needs to ensure that each element of each list will be visited in finite time.
Probably, this will be by iterating through both lists, and spending approximately equal time checking all previously-visited elements in each list against each other.
If possible, I'd like to also have a way to ignore duplicates in the lists, as it is occasionally necessary, but this is not a requirement.
Using the universe package's Cartesian product operator we can write this one-liner:
import Data.Universe.Helpers
isect :: Eq a => [a] -> [a] -> [a]
xs `isect` ys = [x | (x, y) <- xs +*+ ys, x == y]
-- or this, which may do marginally less allocation
xs `isect` ys = foldr ($) [] $ cartesianProduct
    (\x y -> if x == y then (x:) else id)
    xs ys
Try it in ghci:
> take 10 $ [0,2..] `isect` [0,3..]
[0,6,12,18,24,30,36,42,48,54]
This implementation will not produce any duplicates if the input lists don't have any; but if they do, you can tack on your favorite dup-remover either before or after calling isect. For example, with nub, you might write
> nub ([0,1] `isect` repeat 1)
[1
and then heat up your computer pretty good, since it can never be sure there might not be a 0 in that second list somewhere if it looks deep enough.
This approach is significantly faster than David Fletcher's, produces many fewer duplicates and produces new values much more quickly than Willem Van Onsem's, and doesn't assume the lists are sorted like freestyle's (but is consequently much slower on such lists than freestyle's).
An idea might be to use incrementing bounds. Let us first relax the problem a bit: yielding duplicated values is allowed. In that case you could use:
import Data.List (intersect)
intersectInfinite :: Eq a => [a] -> [a] -> [a]
intersectInfinite xs ys = intersectInfinite' 1
    where intersectInfinite' n = intersect (take n xs) (take n ys) ++ intersectInfinite' (n+1)
In other words we claim that:
A∩B = A1∩B1 ∪ A2∩B2 ∪ ... ∪ ...
where Ai is the set containing the first i elements of A (a set has no order, but let's use the order of the list here). If A contains fewer than i elements, then Ai is the full set.
If c is in A (at index i) and in B (at index j), c will be emitted in segment (not index) max(i,j).
This will thus always generate an infinite list (with an infinite number of duplicates) regardless of whether the given lists are finite or not. The only exception is when you give it an empty list, in which case it runs forever without emitting anything. Nevertheless, we have ensured that every element in the intersection will be emitted at least once.
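To make the duplicate behaviour concrete, here is a small self-contained sketch of the segment idea. The name intersectSegments is hypothetical; the logic is just the growing-prefix intersection described above.

```haskell
import Data.List (intersect)

-- Sketch only: intersect ever longer prefixes and concatenate the results.
intersectSegments :: Eq a => [a] -> [a] -> [a]
intersectSegments xs ys = go (1 :: Int)
  where
    -- segment n compares the first n elements of each list
    go n = intersect (take n xs) (take n ys) ++ go (n + 1)
```

For example, take 5 (intersectSegments [0,2..] [0,3..]) yields [0,0,0,0,6]: the shared 0 is re-emitted by every segment, and 6 only appears once both prefixes are four elements long.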
Making the result finite (if the given lists are finite)
Now we can improve our definition. First we make a more advanced version of take, takeFinite (let's first give a straightforward, but not very efficient, definition):
takeFinite :: Int -> [a] -> (Bool,[a])
takeFinite _ [] = (True,[])
takeFinite 0 _ = (False,[])
takeFinite n (x:xs) = let (b,t) = takeFinite (n-1) xs in (b,x:t)
Now we can iteratively deepen until both lists have reached the end:
intersectInfinite :: Eq a => [a] -> [a] -> [a]
intersectInfinite = intersectInfinite' 1
intersectInfinite' :: Eq a => Int -> [a] -> [a] -> [a]
intersectInfinite' n xs ys | fa && fb = intersect xs ys
                           | fa = intersect ys xs
                           | fb = intersect xs ys
                           | otherwise = intersect xfa xfb ++ intersectInfinite' (n+1) xs ys
    where (fa,xfa) = takeFinite n xs
          (fb,xfb) = takeFinite n ys
This will now terminate if both lists are finite, but it still produces a lot of duplicates. There are definitely ways to reduce them further.
Here's one way. For each x we make a list of maybes which has
Just x only where x appeared in ys. Then we interleave all
these lists.
import Data.Maybe (catMaybes)
isect :: Eq a => [a] -> [a] -> [a]
isect xs ys = (catMaybes . foldr interleave [] . map matches) xs
  where
    matches x = [if x == y then Just x else Nothing | y <- ys]
interleave :: [a] -> [a] -> [a]
interleave [] ys = ys
interleave (x:xs) ys = x : interleave ys xs
Maybe it can be improved using some sort of fairer interleaving -
it's already pretty slow on the example below because (I think)
it's doing an exponential amount of work.
> take 10 (isect [0..] [0,2..])
[0,2,4,6,8,10,12,14,16,18]
If the elements in the lists are ordered, then this is easy to do.
intersectOrd :: Ord a => [a] -> [a] -> [a]
intersectOrd [] _ = []
intersectOrd _ [] = []
intersectOrd (x:xs) (y:ys) = case x `compare` y of
    EQ -> x : intersectOrd xs ys
    LT -> intersectOrd xs (y:ys)
    GT -> intersectOrd (x:xs) ys
Here's yet another alternative, leveraging Control.Monad.WeightedSearch
import Control.Monad (guard)
import Control.Applicative
import qualified Control.Monad.WeightedSearch as W
We first define a cost for digging inside the list. Accessing the tail costs 1 unit more. This will ensure a fair scheduling among the two infinite lists.
eachW :: [a] -> W.T Int a
eachW = foldr (\x w -> pure x <|> W.weight 1 w) empty
Then we can simply stop worrying about the lists being infinite.
intersection :: [Int] -> [Int] -> [Int]
intersection xs ys = W.toList $ do
    x <- eachW xs
    y <- eachW ys
    guard (x==y)
    return y
Even better with MonadComprehensions on:
intersection2 :: [Int] -> [Int] -> [Int]
intersection2 xs ys = W.toList [ y | x <- eachW xs, y <- eachW ys, x==y ]
Solution
I ended up using the following implementation; a slight modification of the answer by David Fletcher:
isect :: Eq a => [a] -> [a] -> [a]
isect [] = const [] -- don't bother testing against an empty list
isect xs = catMaybes . diagonal . map matches
    where matches y = [if x == y then Just x else Nothing | x <- xs]
This can be augmented with nub to filter out duplicates:
isectUniq :: Eq a => [a] -> [a] -> [a]
isectUniq xs = nub . isect xs
Explanation
Of the line isect xs = catMaybes . diagonal . map matches
(map matches) ys computes a list of lists of comparisons between elements of xs and ys, where the list indices specify the indices in ys and xs respectively: i.e (map matches) ys !! 3 !! 0 would represent the comparison of ys !! 3 with xs !! 0, which would be Nothing if those values differ. If those values are the same, it would be Just that value.
diagonals takes a list of lists and returns a list of lists where the nth output list contains an element each from the first n lists. Another way to conceptualise it is that (diagonals . map matches) ys !! n contains comparisons between elements whose indices in xs and ys sum to n.
diagonal is simply a flat version of diagonals (diagonal = concat . diagonals)
Therefore (diagonal . map matches) ys is a list of comparisons between elements of xs and ys, where the elements are approximately sorted by the sum of the indices of the elements of ys and xs being compared; this means that early elements are compared to later elements with the same priority as middle elements being compared to each other.
(catMaybes . diagonal . map matches) ys is a list of only the elements which are in both lists, where the elements are approximately sorted by the sum of the indices of the two elements being compared.
Note
(diagonal . map (catMaybes . matches)) ys does not work: catMaybes . matches only yields when it finds a match, instead of also yielding Nothing on no match, so the interleaving does nothing to distribute the work.
To contrast, in the chosen solution, the interleaving of Nothing and Just values by diagonal means that the program divides its attention between 'searching' for multiple different elements, not waiting for one to succeed; whereas if the Nothing values are removed before interleaving, the program may spend too much time waiting for a fruitless 'search' for a given element to succeed.
Therefore, we would encounter the same problem as in the original question: while one element does not match any elements in the other list, the program will hang; whereas the chosen solution will only hang while no matches are found for any elements in either list.

Replace an element in a list only once - Haskell

I want to replace an element in a list with a new value, but only at its first occurrence.
I wrote the code below, but with it all the matching elements are changed.
replaceX :: [Int] -> Int -> Int -> [Int]
replaceX items old new = map check items where
    check item | item == old = new
               | otherwise = item
How can I modify the code so that only the first matching item is changed?
Thanks for helping!
The point is that map and f (check in your example) only communicate regarding how to transform individual elements. They don't communicate about how far down the list to transform elements: map always carries on all the way to the end.
map :: (a -> b) -> [a] -> [b]
map _ [] = []
map f (x:xs) = f x : map f xs
Let's write a new version of map --- I'll call it mapOnce because I can't think of a better name.
mapOnce :: (a -> Maybe a) -> [a] -> [a]
There are two things to note about this type signature:
Because we may stop applying f part-way down the list, the input list and the output list must have the same type. (With map, because the entire list will always be mapped, the type can change.)
The type of f hasn't changed to a -> a, but to a -> Maybe a.
Nothing will mean "leave this element unchanged, continue down the list"
Just y will mean "change this element, and leave the remaining elements unaltered"
So:
mapOnce _ [] = []
mapOnce f (x:xs) = case f x of
    Nothing -> x : mapOnce f xs
    Just y -> y : xs
Your example is now:
replaceX :: [Int] -> Int -> Int -> [Int]
replaceX items old new = mapOnce check items where
    check item | item == old = Just new
               | otherwise = Nothing
You can easily write this as a recursive iteration like so:
rep :: Eq a => [a] -> a -> a -> [a]
rep items old new = rep' items
    where rep' (x:xs) | x == old = new : xs
                      | otherwise = x : rep' xs
          rep' [] = []
A direct implementation would be
rep :: Eq a => a -> a -> [a] -> [a]
rep _ _ [] = []
rep a b (x:xs) = if x == a then b:xs else x:rep a b xs
I like list as last argument to do something like
myRep = rep 3 5 . rep 7 8 . rep 9 1
An alternative using the Lens library.
>import Control.Lens
>import Control.Applicative
>_find :: (a -> Bool) -> Simple Traversal [a] a
>_find _ _ [] = pure []
>_find pred f (a:as) = if pred a
> then (: as) <$> f a
> else (a:) <$> (_find pred f as)
This function takes an (a -> Bool) predicate, which should return True for the element of type 'a' that you want to modify.
If the first number greater than 5 needs to be doubled, then we could write:
>over (_find (>5)) (*2) [4, 5, 3, 2, 20, 0, 8]
[4,5,3,2,40,0,8]
The great thing about lenses is that you can combine them by composing them with (.). So if we want to zero the first number <100 in the second sublist we could write:
>over ((element 1).(_find (<100))) (const 0) [[1,2,99],[101,456,50,80,4],[1,2,3,4]]
[[1,2,99],[101,456,0,80,4],[1,2,3,4]]
To be blunt, I don't like most of the answers so far. dave4420 presents some nice insights on map that I second, but I also don't like his solution.
Why don't I like those answers? Because you should be learning to solve problems like these by breaking them down into smaller problems that can be solved by simpler functions, preferably library functions. In this case, the library is Data.List, and the function is break:
break, applied to a predicate p and a list xs, returns a tuple where first element is longest prefix (possibly empty) of xs of elements that do not satisfy p and second element is the remainder of the list.
Armed with that, we can attack the problem like this:
Split the list into two pieces: all the elements before the first occurrence of old, and the rest.
The "rest" list will either be empty, or its first element will be the first occurrence of old. Both of these cases are easy to handle.
So we have this solution:
import Data.List (break)
replaceX :: Eq a => a -> a -> [a] -> [a]
replaceX old new xs = beforeOld ++ replaceFirst oldAndRest
    where (beforeOld, oldAndRest) = break (==old) xs
          replaceFirst [] = []
          replaceFirst (_:rest) = new:rest
Example:
*Main> replaceX 5 7 ([1..7] ++ [1..7])
[1,2,3,4,7,6,7,1,2,3,4,5,6,7]
So my advice to you:
Learn how to import libraries.
Study library documentation and learn standard functions. Data.List is a great place to start.
Try to use those library functions as much as you can.
As a self study exercise, you can pick some of the standard functions from Data.List and write your own versions of them.
When you run into a problem that can't be solved with a combination of library functions, try to invent your own generic function that would be useful.
EDIT: I just realized that break is actually a Prelude function, and doesn't need to be imported. Still, Data.List is one of the best libraries to study.
Maybe not the fastest solution, but easy to understand:
rep xs x y =
    let (left, (_ : right)) = break (== x) xs
    in left ++ [y] ++ right
[Edit]
As Dave commented, this will fail if x is not in the list. A safe version would be:
rep xs x y =
    let (left, right) = break (== x) xs
    in left ++ [y] ++ drop 1 right
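[Edit]
Arrgh!!! That version appends y to the end when x is not in the list. This one leaves the list unchanged in that case:
rep xs x y = left ++ r right where
    (left, right) = break (== x) xs
    r (_:rs) = y:rs
    r [] = []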
replaceValue :: Int -> Int -> [Int] -> [Int]
replaceValue _ _ [] = []   -- nothing to replace in an empty list
replaceValue a b (x:xs)
    | (a == x) = [b] ++ xs
    | otherwise = [x] ++ replaceValue a b xs
Here's an imperative way to do it, using the State monad:
import Control.Monad (forM)
import Control.Monad.State
replaceOnce :: Eq a => a -> a -> [a] -> [a]
replaceOnce old new items = flip evalState False $ do
    forM items $ \item -> do
        replacedBefore <- get
        if item == old && not replacedBefore
            then do
                put True
                return new
            else
                return item   -- keep every other item unchanged
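The same idea of threading a "replaced already?" flag through the list can also be sketched without a monad, using mapAccumL from Data.List. This is a swapped-in technique rather than the State version above, and the name replaceOnceAcc is hypothetical.

```haskell
import Data.List (mapAccumL)

-- Sketch only: the accumulator is the flag "a replacement has already happened".
replaceOnceAcc :: Eq a => a -> a -> [a] -> [a]
replaceOnceAcc old new = snd . mapAccumL step False
  where
    step done item
      | not done && item == old = (True, new)    -- first match: replace and set the flag
      | otherwise               = (done, item)   -- everything else passes through
```

Like the forM version, this visits every element even after the replacement has happened, so the recursive solutions above are the better fit when stopping early matters.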

Why is there no List.skip and List.take?

Why is there no List.skip and List.take? There are of course Seq.take and Seq.skip, but they do not produce lists as a result.
One possible solution is: mylist |> Seq.skip N |> Seq.toList
But this first creates an enumerator and then a new list from that enumerator. I think there could be a more direct way to create an immutable list from an immutable list. Since no elements need to be copied internally, the new list can just reference the original one.
Other possible solution (without throwing exceptions) is:
let rec listSkip n xs =
    match (n, xs) with
    | 0, _ -> xs
    | _, [] -> []
    | n, _::xs -> listSkip (n-1) xs
But this still does not answer the question...
BTW, you can add your functions to List module:
module List =
    let rec skip n xs =
        match (n, xs) with
        | 0, _ -> xs
        | _, [] -> []
        | n, _::xs -> skip (n-1) xs
The would-be List.skip 1 is called List.tail, you can just tail into the list n times.
List.take would have to create a new list anyway, since only common suffixes of an immutable list can be shared.