filtering values into two lists - sml

So I'm new to SML and am trying to understand the ins and outs of it. Recently I tried creating a filter which takes two parameters: a function (that returns a boolean) and a list of values to run against the function. The filter returns the list of values for which the function returns true.
fun filter f [] = [] |
filter f (x::xs) =
if (f x)
then x::(filter f xs)
else (filter f xs);
So that works. But what I'm trying to do now is return a tuple that contains the list of true values and the list of false values. I'm stuck on my conditional and I can't really see another way. Any thoughts on how to solve this?
fun filter2 f [] = ([],[]) |
filter2 f (x::xs) =
if (f x)
then (x::(filter2 f xs), []) (* error *)
else ([], x::(filter2 f xs)); (* error *)

I think there are several ways to do this.
Reusing Filter
For instance, we could use an inductive approach based on the fact that your tuple is formed by two elements: the first is the list of elements that satisfy the predicate and the second is the list of elements that don't. So, you could reuse your filter function as:
fun partition f xs = (filter f xs, filter (not o f) xs)
This is not the best approach, though, because it traverses the list twice; but if the lists are small, it is simple and very readable.
Another way to think about this is in terms of fold. You are reducing your list to a tuple of lists, and as you go, you split your items depending on a predicate. Somewhat like this:
fun partition f xs =
  let
    fun split x (xs,ys) =
      if f x
      then (x::xs,ys)
      else (xs, x::ys)
    val (trueList, falseList) = List.foldl (fn (x,y) => split x y)
                                           ([],[]) xs
  in
    (List.rev trueList, List.rev falseList)
  end
You could also implement your own folding algorithm in the same way as the List.partition function of SML does:
fun partition f xs =
  let
    fun iter(xs, (trueList,falseList)) =
      case xs of
        [] => (List.rev trueList, List.rev falseList)
      | (x::xs') => if f x
                    then iter(xs', (x::trueList,falseList))
                    else iter(xs', (trueList,x::falseList))
  in
    iter(xs, ([],[]))
  end
Use SML Basis Method
And ultimately, you can avoid all this and use SML method List.partition whose documentation says:
partition f l
applies f to each element x of l, from left to right, and returns a
pair (pos, neg) where pos is the list of those x for which f x
evaluated to true, and neg is the list of those for which f x
evaluated to false. The elements of pos and neg retain the same
relative order they possessed in l.
This method is implemented as the previous example.

So I will show a good way to do it, and a better way to do it (IMO). But the 'better way' is just for future reference when you learn:
fun filter2 f [] = ([], [])
  | filter2 f (x::xs) = let
      fun ftuple f (x::xs) trueList falseList =
            if (f x)
            then ftuple f xs (x::trueList) falseList
            else ftuple f xs trueList (x::falseList)
          (* the accumulators are built back to front, so reverse them *)
        | ftuple _ [] trueList falseList = (List.rev trueList, List.rev falseList)
    in
      ftuple f (x::xs) [] []
    end
The reason yours does not work is that filter2 f xs returns a tuple of two lists, so x::(filter2 f xs) tries to cons x onto a tuple rather than onto a list, and the type checker rejects it. You are thinking "the result type is a tuple of lists", but the :: operator needs a list on its right-hand side. Here is the better version in my opinion. Look up the function foldr if you are curious; this technique is more readable, less verbose, and, more importantly, more predictable and robust:
fun filter2 f l = foldr (fn(x,xs) => if (f x) then (x::(#1(xs)), #2(xs)) else (#1(xs), x::(#2(xs)))) ([],[]) l;
The reason why the first example works is because you are storing default empty lists that accumulate copies of the variables that either fit the condition, or do not fit the condition. However, you have to explicitly tell SML compiler to make sure that the type rules agree. You have to make absolutely sure that SML knows that your return type is a tuple of lists. Any mistake in this chain of command, and this will result in failure to execute. Hence, when working with SML, always study your type inferences. As for the second one, you can see that it is a one-liner, but I will leave you to research that one on your own, just google foldr and foldl.


Insert function using foldl/foldr

I have been working on a separate function that returns a list that inserts element x after each k elements of list l (counting from the end of the list). For example, separate (1, 0, [1,2,3,4]) should return [1,0,2,0,3,0,4]. I finished the function and have it working as follows:
fun separate (k: int, x: 'a, l: 'a list) : 'a list =
  let
    fun kinsert [] _ = []
      | kinsert ls 0 = x::(kinsert ls k)
      | kinsert (l::ls) i = l::(kinsert ls (i-1))
  in
    List.rev (kinsert (List.rev l) k)
  end
I'm now trying to simplify the function using foldl/foldr without any explicit recursion, but I can't seem to get it working right. Any tips or suggestions on how to approach this? Thank you!
These are more or less the thoughts I had when trying to write the function using foldl/foldr:
foldl/foldr abstracts away the list recursion from the logic that composes the end result.
Start by sketching out a function that has a very similar structure to your original program, but where foldr is used and kinsert, instead of being a recursive function, is the function given to foldr:
fun separate (k, x, L) =
  let fun kinsert (y, ys) = ...
  in foldr kinsert [] L
  end
This isn't strictly necessary; kinsert might as well be anonymous.
You're using an inner helper function kinsert because you need a copy of k (i) that you gradually decrement and reset to k every time it reaches 0. So while the list that kinsert spits out is equivalent to the fold's accumulated variable, i is temporarily accumulated (and occasionally reset) in much the same way.
Change kinsert's accumulating variable to make room for i:
fun separate (k, x, L) =
  let fun kinsert (y, (i, xs)) = ...
  in foldr kinsert (?, []) L
  end
Now the result of the fold becomes int * 'a list, which causes two problems: 1) We only really wanted to accumulate i temporarily, but it's part of the final result. This can be circumvented by discarding it using #2 (foldr ...). 2) If the result is now a tuple, I'm not sure what to put as the first i in place of ?.
Since kinsert is a separate function declaration, you can use pattern matching and multiple function bodies:
fun separate (k, x, L) =
  let fun kinsert (y, (0, ys)) = ...
        | kinsert (y, (i, ys)) = ...
  in ... foldr kinsert ... L
  end
Your original kinsert deviates from the recursion pattern that a fold performs in one way: In the middle pattern, when i matches 0, you're not chopping an element off ls, which a fold would otherwise force you to. So your 0 case will look slightly different from the original; you'll probably run into an off-by-one error.
Remember that foldr actually visits the last element in the list first, at which point i will have its initial value, where with the original kinsert, the initial value for i will be when you're at the first element.
Depending on whether you use foldl or foldr you'll run into different problems: foldl will reverse your list, but address items in the right order. foldr will keep the list order correct, but create a different result when k does not divide the length of L...
At this point, consider using foldl and reverse the list instead:
fun separate (k, x, L) =
  let fun kinsert (y, (?, ys)) = ...
        | kinsert (y, (i, ys)) = ...
  in rev (... foldl kinsert ... L)
  end
Otherwise you'll start to notice that separate (2, 0, [1,2,3,4,5]) should probably give [1,2,0,3,4,0,5] and not [1,0,2,3,0,4,5].
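For comparison, here is a minimal sketch of the foldr variant with the counter carried in the accumulator. It is written in Haskell rather than SML, the name and argument order are only illustrative, and it assumes k >= 1 and that the intended behaviour is that of the question's recursive version, which counts from the end of the list. Because foldr visits the last element first, the counter starts at k on the right-hand side.

```haskell
-- Hypothetical Haskell rendering of `separate`; assumes k >= 1.
separate :: Int -> a -> [a] -> [a]
separate k x = snd . foldr step (k, [])
  where
    -- The counter says how many more elements may be emitted
    -- before a separator is due. foldr hands us elements from the
    -- right, so counting starts at the end of the list.
    step y (0, ys) = (k - 1, y : x : ys)
    step y (i, ys) = (i - 1, y : ys)
```

With this definition, separate 1 0 [1,2,3,4] gives [1,0,2,0,3,0,4] and separate 2 0 [1,2,3,4,5] gives [1,0,2,3,0,4,5]. The counter is discarded with snd, which plays the role of #2 in the SML sketches.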

Lists defined as Maybe in Haskell? Why not?

You don't often see Maybe List, except for error handling for example, because lists are a bit Maybe themselves: they have their own "Nothing": [] and their own "Just": (:).
I wrote a list type using Maybe, and functions to convert between standard and "experimental" lists. toStd . toExp == id.
data List a = List a (Maybe (List a))
deriving (Eq, Show, Read)
toExp [] = Nothing
toExp (x:xs) = Just (List x (toExp xs))
toStd Nothing = []
toStd (Just (List x xs)) = x : (toStd xs)
What do you think about it, as an attempt to reduce repetition, to generalize?
Trees too could be defined using these lists:
type Tree a = List (Tree a, Tree a)
I haven't tested this last piece of code, though.
All ADTs are isomorphic (almost--see end) to some combination of (,),Either,(),(->),Void and Mu where
data Void --using empty data decls or
newtype Void = Void Void
and Mu computes the fixpoint of a functor
newtype Mu f = Mu (f (Mu f))
so for example
data [a] = [] | (a:[a])
is the same as
type [a] = Mu (ListF a)
data ListF a f = End | Pair a f
which itself is isomorphic to
newtype ListF a f = ListF (Either () (a,f))
and since
data Maybe a = Nothing | Just a
is isomorphic to
newtype Maybe a = Maybe (Either () a)
you have
newtype ListF a f = ListF (Maybe (a,f))
which can be inlined in the mu to
data List a = List (Maybe (a,List a))
and your definition
data List a = List a (Maybe (List a))
is just the unfolding of the Mu and elimination of the outer Maybe (corresponding to non-empty lists)
and you are done...
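The fixpoint construction above can be made concrete with a small sketch. The names List', toMu and fromMu are made up for illustration and do not come from any library; the point is only that converting to the Mu-based list and back is the identity on finite lists.

```haskell
-- Fixpoint of a functor, as defined above.
newtype Mu f = Mu (f (Mu f))

-- One layer of a list: either the end, or an element plus "the rest".
data ListF a r = End | Pair a r

-- Hypothetical name for the list type obtained by tying the knot.
type List' a = Mu (ListF a)

-- Convert a native list into the fixpoint representation.
toMu :: [a] -> List' a
toMu = foldr (\x r -> Mu (Pair x r)) (Mu End)

-- Convert back by peeling off one layer at a time.
fromMu :: List' a -> [a]
fromMu (Mu End)        = []
fromMu (Mu (Pair x r)) = x : fromMu r
```

For example, fromMu (toMu [1,2,3]) evaluates to [1,2,3].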
a couple of things
Using custom ADTs increases clarity and type safety
This universality is useful: see GHC.Generics
Okay, I said almost isomorphic. It is not exactly, namely
hmm = List (Just undefined)
has no equivalent value in the [a] = [] | (a:[a]) definition of lists. This is because Haskell data types are coinductive, and has been a point of criticism of the lazy evaluation model. You can get around these problems by only using strict sums and products (and call by value functions), and adding a special "Lazy" data constructor
data SPair a b = SPair !a !b
data SEither a b = SLeft !a | SRight !b
data Lazy a = Lazy a --Note, this has no obvious encoding in Pure CBV languages,
--although Lazy a = (() -> a) is semantically correct,
--it is strictly less efficient than Haskell's CB-Need
and then all the isomorphisms can be faithfully encoded.
You can define lists in a bunch of ways in Haskell. For example, as functions:
{-# LANGUAGE RankNTypes #-}
newtype List a = List { runList :: forall b. (a -> b -> b) -> b -> b }
nil :: List a
nil = List (\_ z -> z )
cons :: a -> List a -> List a
cons x xs = List (\f z -> f x (runList xs f z))
isNil :: List a -> Bool
isNil xs = runList xs (\x xs -> False) True
head :: List a -> a
head xs = runList xs (\x xs -> x) (error "empty list")
tail :: List a -> List a
tail xs | isNil xs = error "empty list"
tail xs = fst (runList xs go (nil, nil))
where go x (_, xs') = (xs', cons x xs') -- pair is (tail so far, whole list so far)
foldr :: (a -> b -> b) -> b -> List a -> b
foldr f z xs = runList xs f z
The trick to this implementation is that lists are being represented as functions that execute a fold over the elements of the list:
fromNative :: [a] -> List a
fromNative xs = List (\f z -> foldr f z xs)
toNative :: List a -> [a]
toNative xs = runList xs (:) []
In any case, what really matters is the contract (or laws) that the type and its operations follow, and the performance of implementation. Basically, any implementation that fulfills the contract will give you correct programs, and faster implementations will give you faster programs.
What is the contract of lists? Well, I'm not going to express it in complete detail, but lists obey statements like these:
head (x:xs) == x
tail (x:xs) == xs
[] == []
[] /= x:xs
If xs == ys and x == y, then x:xs == y:ys
foldr f z [] == z
foldr f z (x:xs) == f x (foldr f z xs)
EDIT: And to tie this to augustss' answer:
newtype ExpList a = ExpList (Maybe (a, ExpList a))
toExpList :: List a -> ExpList a
toExpList xs = runList xs (\x xs -> ExpList (Just (x, xs))) (ExpList Nothing)
foldExpList f z (ExpList Nothing) = z
foldExpList f z (ExpList (Just (x, xs))) = f x (foldExpList f z xs)
fromExpList :: ExpList a -> List a
fromExpList xs = List (\f z -> foldExpList f z xs)
You could define lists in terms of Maybe, but not the way you do. Your List type cannot be empty. Or did you intend Maybe (List a) to be the replacement of [a]? This seems bad since it doesn't distinguish the list and maybe types.
This would work
newtype List a = List (Maybe (a, List a))
This has some problems. First using this would be more verbose than usual lists, and second, the domain is not isomorphic to lists since we got a pair in there (which can be undefined; adding an extra level in the domain).
If it's a list, it should be an instance of Functor, right?
instance Functor List
where fmap f (List a as) = List (f a) (mapMaybeList f as)
mapMaybeList :: (a -> b) -> Maybe (List a) -> Maybe (List b)
mapMaybeList f as = fmap (fmap f) as
Here's a problem: you can make List an instance of Functor, but your Maybe List is not: even if Maybe was not already an instance of Functor in its own right, you can't directly make a construction like Maybe . List into an instance of anything (you'd need a wrapper type).
Similarly for other typeclasses.
Having said that, with your formulation you can do this, which you can't do with standard Haskell lists:
instance Comonad List
where extract (List a _) = a
duplicate x@(List _ y) = List x (fmap duplicate y)
A Maybe List still wouldn't be comonadic though.
When I first started using Haskell, I too tried to represent things in existing types as much as I could on the grounds that it's good to avoid redundancy. My current understanding (moving target!) tends to involve more the idea of a multidimensional web of trade-offs. I won't be giving any “answer” here so much as pasting examples and asking “do you see what I mean?” I hope it helps anyway.
Let's have a look at a bit of Darcs code:
data UseCache = YesUseCache | NoUseCache
deriving ( Eq )
data DryRun = YesDryRun | NoDryRun
deriving ( Eq )
data Compression = NoCompression
| GzipCompression
deriving ( Eq )
Did you notice that these three types could all have been Bool's? Why do you think the Darcs hackers decided that they should introduce this sort of redundancy in their code? As another example, here is a piece of code we changed a few years back:
type Slot = Maybe Bool -- OLD code
data Slot = InFirst | InMiddle | InLast -- newer code
Why do you think we decided that the second code was an improvement over the first?
Finally, here is a bit of code from some of my day job stuff. It uses the newtype syntax that augustss mentioned:
newtype Role = Role { fromRole :: Text }
deriving (Eq, Ord)
newtype KmClass = KmClass { fromKmClass :: Text }
deriving (Eq, Ord)
newtype Lemma = Lemma { fromLemma :: Text }
deriving (Eq, Ord)
Here you'll notice that I've done the curious thing of taking a perfectly good Text type and then wrapping it up into three different things. The three things don't have any new features compared to plain old Text. They're just there to be different. To be honest, I'm not entirely sure if it was a good idea for me to do this. I provisionally think it was because I manipulate lots of different bits and pieces of text for lots of reasons, but time will tell.
Can you see what I'm trying to get at?

Using Haskell's map function to calculate the sum of a list

addm (x:xs) = sum(x:xs)
I was able to get the sum of a list using the sum function, but is it possible to get the sum of a list using the map function? Also, what is the use of the map function?
You can't really use map to sum up a list, because map treats each list element independently from the others. You can use map for example to increment each value in a list like in
map (+1) [1,2,3,4] -- gives [2,3,4,5]
Another way to implement your addm would be to use foldl:
addm' = foldl (+) 0
Here it is, the supposedly impossible definition of sum in terms of map:
sum' xs = let { ys = 0 : map (\(a,b) -> a + b) (zip xs ys) } in last ys
this actually shows how scanl can be implemented in terms of map (and zip and last), the above being equivalent to foldl (+) 0 xs === last $ scanl (+) 0 xs:
scanl' f z xs = let { ys = z : map (uncurry f) (zip ys xs) } in ys
I expect one can calculate many things with map, arranging for all kinds of information flow through zip.
edit: the above is just a zipWith in disguise of course (and zipWith is kind of a map2):
sum' xs = let { ys = 0 : zipWith (+) ys xs } in last ys
This seems to suggest that scanl is more versatile than foldl.
It is not possible to use map to reduce a list to its sum. That recursive pattern is a fold.
sum :: [Int] -> Int
sum = foldr (+) 0
As an aside, note that you can define map as a fold as well:
map :: (a -> b) -> ([a] -> [b])
map f = foldr (\x xs -> f x : xs) []
This is because foldr is the canonical recursive function on lists.
References: A tutorial on the universality and expressiveness of fold, Graham Hutton, J. Functional Programming 9 (4): 355–372, July 1999.
After some insights I have to add another answer: You can't get the sum of a list with map, but you can get the sum with its monadic version mapM. All you need to do is to use a Writer monad (see LYAHFGG) over the Sum monoid (see LYAHFGG).
I wrote a specialized version, which is probably easier to understand:
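data Adder a = Adder a Int
-- Functor and Applicative instances are required by current GHC
instance Functor Adder where
  fmap f (Adder x s) = Adder (f x) s
instance Applicative Adder where
  pure x = Adder x 0
  Adder f s <*> Adder x s' = Adder (f x) (s + s')
instance Monad Adder where
  (Adder x s) >>= f = let Adder x' s' = f x
                      in Adder x' (s + s')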
toAdder x = Adder x x
sum' xs = let Adder _ s = mapM toAdder xs in s
main = print $ sum' [1..100]
Adder is just a wrapper around some type which also keeps a "running sum." We can make Adder a monad, and here it does some work: When the operation >>= (a.k.a. "bind") is executed, it returns the new result and the value of the running sum of that result plus the original running sum. The toAdder function takes an Int and creates an Adder that holds that argument both as wrapped value and as running sum (actually we're not interested in the value, but only in the sum part). Then in sum' mapM can do its magic: While it works similar to map for the values embedded in the monad, it executes "monadic" functions like toAdder, and chains these calls (it uses sequence to do this). At this point, we get through the "backdoor" of our monad the interaction between list elements that the standard map is missing.
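The same trick can be sketched without a hand-written monad, on the assumption that a reasonably recent base library is in use: a pair whose first component is a monoid already behaves like a small Writer, so mapM over pairs with Sum accumulates the total on the side. The name sumViaMapM is made up for this example.

```haskell
import Data.Monoid (Sum (..))

-- Each element is mapped to a pair: the Sum is the "log" that gets
-- combined across elements, the unit is the (ignored) result value.
sumViaMapM :: [Int] -> Int
sumViaMapM xs = getSum (fst (mapM (\x -> (Sum x, ())) xs))
```

As with Adder, the interaction between elements happens in the bind of the monad, not in the function passed to mapM.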
Map "maps" each element of your list to an element in your output:
let f(x) = x*x
map f [1,2,3]
This will return a list of the squares.
To sum all elements in a list, use fold:
foldl (+) 0 [1,2,3]
+ is the function you want to apply, and 0 is the initial value (0 for sum, 1 for product etc)
As the other answers point out, the "normal" way is to use one of the fold functions. However it is possible to write something pretty similar to a while loop in imperative languages:
sum' [] = 0
sum' xs = head $ until single loop xs where
single [_] = True
single _ = False
loop (x1 : x2 : xs) = (x1 + x2) : xs
It adds the first two elements of the list together until it ends up with a one-element list, and returns that value (using head).
I realize this question has been answered, but I wanted to add this thought...
listLen2 :: [a] -> Int
listLen2 = sum . map (const 1)
I believe it returns the constant 1 for each item in the list, and returns the sum!
Might not be the best coding practice, but it was an example my professor gave to us students that seems to relate to this question well.
map can never be the primary tool for summing the elements of a container, in much the same way that a screwdriver can never be the primary tool for watching a movie. But you can use a screwdriver to fix a movie projector. If you really want, you can write
import Data.Monoid
import Data.Foldable
mySum :: (Foldable f, Functor f, Num a)
=> f a -> a
mySum = getSum . fold . fmap Sum
Of course, this is silly. You can get a more general, and possibly more efficient, version:
mySum' :: (Foldable f, Num a) => f a -> a
mySum' = getSum . foldMap Sum
Or better, just use sum, because it's actually made for the job.

How to takeWhile elements in a list wrapped in a monad

Got a little puzzle I was wondering if you could help me clarify.
Let's define a function that returns a list:
let f = replicate 3
What we want to do is map this function to an infinite list, concatenate the results, and then take only things that match a predicate.
takeWhile (< 3) $ concatMap f [1..]
Great! That returns [1,1,1,2,2,2], which is what I want.
Now, I want to do something similar, but the function f now wraps its results in a Monad. In my use case, this is the IO monad, but this works for discussing my problem:
let f' x = Just $ replicate 3 x
To map and concat, I can use:
fmap concat $ mapM f' [1..5]
That returns: Just [1,1,1,2,2,2,3,3,3,4,4,4,5,5,5]
If I want to use takeWhile, this still works:
fmap (takeWhile (< 3) . concat) $ mapM f' [1..5]
Which returns: Just [1,1,1,2,2,2]. Great!
But, if I make the list over which I map an infinite list this does not do what I expected:
fmap (takeWhile (< 3) . concat) $ mapM f' [1..]
Seems like the takeWhile is never happening. Somehow, I'm not getting the lazy computation I was expecting. I’m a bit lost.
The problem isn't that fmap + takeWhile doesn't work with infinite lists wrapped in a monad. The problem is that mapM can't produce an infinite list (at least not in the Maybe monad).
Think about it: If f' returns Nothing for any item in the list, mapM has to return Nothing. However mapM can't know whether that will happen until it has called f' on all items in the list. So it needs to iterate through the whole list before it knows whether the result is Nothing or Just. Obviously that's a problem with infinite lists.
This should do the trick:
takeWhileM :: (Monad m) => (a -> Bool) -> [m a] -> m [a]
takeWhileM p [] = return []
takeWhileM p (m:ms) = do
x <- m
if p x
then liftM (x:) (takeWhileM p ms)
else return []
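A sketch of how this could be applied to the question's f'. One caveat: since f' produces a whole list per step, the predicate here has to test a whole block, so all (< 3) is used in place of (< 3). The name firstBlocks is hypothetical.

```haskell
import Control.Monad (liftM)

-- Same definition as above, repeated so the example is self-contained.
takeWhileM :: Monad m => (a -> Bool) -> [m a] -> m [a]
takeWhileM _ [] = return []
takeWhileM p (m:ms) = do
  x <- m
  if p x
    then liftM (x:) (takeWhileM p ms)
    else return []

-- The monadic function from the question.
f' :: Int -> Maybe [Int]
f' x = Just (replicate 3 x)

-- Runs the actions one at a time and stops at the first block that
-- fails the predicate, so the infinite input is not a problem.
firstBlocks :: Maybe [Int]
firstBlocks = fmap concat (takeWhileM (all (< 3)) (map f' [1 ..]))
```

Here firstBlocks evaluates to Just [1,1,1,2,2,2]. The list is passed as map f' [1..], a lazy list of actions, instead of going through mapM, which is what lets the traversal stop early.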
See sepp2k's answer for an explanation of why you are losing laziness. The Identity monad or the nonempty list monad, for example, wouldn't have this problem.
You can't mapM an infinite list of Maybes. mapM is map followed by sequence. Here is the definition of sequence:
sequence ms = foldr k (return []) ms
  where k m m' = do { x <- m; xs <- m'; return (x:xs) }
From this we see that sequence evaluates every monadic value in the list. Since it's an infinite list, this operation will not terminate.
luqui and Carl make a good point that this doesn't generalize to any monad. To see why it doesn't work for Maybe we need to look at the implementation of (>>=):
(>>=) m k = case m of
Just x -> k x
Nothing -> Nothing
The important point here is that we do a case on m. This makes the m strict because we have to evaluate it to figure out how to continue execution. Note that we're not casing on x here, so it remains lazy.

Apply "permutations" of a function over a list

Creating the permutations of a list or set is simple enough. I need to apply a function to each element of all subsets of all elements in a list, in the order in which they occur. For instance:
apply f [x,y] = { [x,y], [f x, y], [x, f y], [f x, f y] }
The code I have is a monstrous pipeline of expensive computations, and I'm not sure how to proceed, or if it's correct. I'm sure there must be a better way to accomplish this task - perhaps in the list monad - but I'm not sure. This is my code:
apply :: Ord a => (a -> Maybe a) -> [a] -> Set [a]
apply p xs = let box = take (length xs + 1) . map (take $ length xs) in
(Set.fromList . map (catMaybes . zipWith (flip ($)) xs) . concatMap permutations
. box . map (flip (++) (repeat Just)) . flip iterate []) ((:) p)
The general idea was:
(1) make the list
[[], [f], [f,f], [f,f,f], ... ]
(2) map (++ repeat Just) over the list to obtain
[[Just, Just, Just, Just, ... ],
[f , Just, Just, Just, ... ],
[f , f , Just, Just, ... ],
... ]
(3) find all permutations of each list in (2) shaved to the length of the input list
(4) apply the permuted lists to the original list, garnering all possible applications
of the function f to each (possibly empty) subset of the original list, preserving
the original order.
I'm sure there's a better way to do it, though. I just don't know it. This way is expensive, messy, and rather prone to error. The Justs are there because of the intended application.
To do this, you can leverage the fact that lists represent non-deterministic values when using applicatives and monads. It then becomes as simple as:
apply f = mapM (\x -> [x, f x])
It basically reads as follows: "Map each item in a list to itself and the result of applying f to it. Finally, return a list of all the possible combinations of these two values across the whole list."
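A small worked example, with a type signature added and a hypothetical name example for the result:

```haskell
-- In the list monad, mapM picks one alternative per element and
-- collects every combination of choices.
apply :: (a -> a) -> [a] -> [[a]]
apply f = mapM (\x -> [x, f x])

example :: [[Int]]
example = apply (+ 10) [1, 2]
-- [[1,2],[1,12],[11,2],[11,12]]
```

The choices for the first element vary slowest, so the unmodified list comes first and the fully modified list comes last.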
If I understand your problem correctly, it's best not to describe it in terms of permutations. Rather, it's closer to generating powersets.
powerset (x:xs) = let pxs = powerset xs in pxs ++ map (x :) pxs
powerset [] = [[]]
Each time you add another member to the head of the list, the powerset doubles in size. The second half of the powerset is exactly like the first, but with x included.
For your problem, the choice is not whether to include or exclude x, but whether to apply or not apply f.
powersetapp f (x:xs) = let pxs = powersetapp f xs in map (x:) pxs ++ map (f x:) pxs
powersetapp f [] = [[]]
This does what your "apply" function does, modulo making a Set out of the result.
Paul's and Heatsink's answers are good, but error out when you try to run them on infinite lists.
Here's a different method that works on both infinite and finite lists:
apply _ [] = [ [] ]
apply f (x:xs) = (x:ys):(x':ys):(double yss)
where x' = f x
(ys:yss) = apply f xs
double [] = []
double (ys:yss) = (x:ys):(x':ys):(double yss)
This works as expected - though you'll note it produces a different order to the permutations than Paul's and Heatsink's
ghci> -- on an infinite list
ghci> map (take 4) $ take 16 $ apply (+1) [0,0..]
ghci> -- on a finite list
ghci> apply (+1) [0,0,0,0]
Here is an alternative phrasing of rampion's infinite-input-handling solution:
-- sequence a list of nonempty lists
sequenceList :: [[a]] -> [[a]]
sequenceList [] = [[]]
sequenceList (m:ms) = do
xs <- nonempty (sequenceList ms)
x <- nonempty m
return (x:xs)
nonempty ~(x:xs) = x:xs
Then we can define apply in Paul's idiomatic style:
apply f = sequenceList . map (\x -> [x, f x])
Contrast sequenceList with the usual definition of sequence:
sequence :: (Monad m) => [m a] -> m [a]
sequence [] = return []
sequence (m:ms) = do
x <- m
xs <- sequence ms
return (x:xs)
The order of binding is reversed in sequenceList so that the variations of the first element are the "inner loop", i.e. we vary the head faster than the tail. Varying the end of an infinite list is a waste of time.
The other key change is nonempty, the promise that we won't bind an empty list. If any of the inputs were empty, or if the result of the recursive call to sequenceList were ever empty, then we would be forced to return an empty list. We can't tell in advance whether any of inputs is empty (because there are infinitely many of them to check), so the only way for this function to output anything at all is to promise that they won't be.
Anyway, this is fun subtle stuff. Don't stress about it on your first day :-)