Various instances of monads model different type of effects: for example, Maybe models partiality, List non-determinism, Reader read-only state. I would like to know if there is such an intuitive explanation for the monad instance of the stream data type (or infinite list or co-list), data Stream a = Cons a (Stream a) (see below its monad instance definition). I've stumbled upon the stream monad on a few different occasions and I would like to understand better its uses.
data Stream a = Cons a (Stream a)
instance Functor Stream where
fmap f (Cons x xs) = Cons (f x) (fmap f xs)
instance Applicative Stream where
pure a = Cons a (pure a)
(Cons f fs) <*> (Cons a as) = Cons (f a) (fs <*> as)
instance Monad Stream where
xs >>= f = diag (fmap f xs)
diag :: Stream (Stream a) -> Stream a
diag (Cons xs xss) = Cons (hd xs) (diag (fmap tl xss))
where
hd (Cons a _ ) = a
tl (Cons _ as) = as
P.S.: I'm not sure if I'm very precise in my language (especially when using the word "effect"), so feel free to correct me.
The Stream monad is isomorphic to Reader Natural (Natural: natural numbers), meaning that there is a bijection between Stream and Reader Natural which preserves their monadic structure.
Intuitively
Both Stream a and Reader Natural a (Natural -> a) can be seen as representing infinite collections of a indexed by integers.
fStream = Cons a0 (Cons a1 (Cons a2 ...))
fReader = \i -> case i of
0 -> a0
1 -> a1
2 -> a2
...
Their Applicative and Monad instances both compose elements index-wise. It's easier to show the intuition for Applicative. Below, we show a stream A of a0, a1, ..., and B of b0, b1, ..., and their composition AB = liftA2 (+) A B, and an equivalent presentation of functions.
fStreamA = Cons a0 (Cons a1 ...)
fStreamB = Cons b0 (Cons b1 ...)
fStreamAB = Cons (a0+b0) (Cons (a1+b1) ...)
fStreamAB = liftA2 (+) fStreamA fStreamB
-- lambda case "\case" is sugar for "\x -> case x of"
fReaderA = \case 0 -> a0 ; 1 -> a1 ; ...
fReaderB = \case 0 -> b0 ; 1 -> b1 ; ...
fReaderC = \case 0 -> a0+b0 ; 1 -> a1+b1 ; ...
fReaderC = liftA2 (+) fReaderA fReaderB = \i -> fReaderA i + fReaderB i
Formally
The bijection:
import Numeric.Natural -- in the base library
-- It could also be Integer, there is a bijection Integer <-> Natural
type ReaderN a = Natural -> a
tailReader :: ReaderN a -> ReaderN a
tailReader r = \i -> r (i+1)
toStream :: ReaderN a -> Stream a
toStream r = Cons (r 0) (toStream (tailReader r))
fromStream :: Stream a -> ReaderN a
fromStream (Cons a s) = \i -> case i of
0 -> a
i -> fromStream s (i-1)
toStream and fromStream are bijections, meaning that they satisfy these equations:
toStream (fromStream s) = s :: Stream a
fromStream (toStream r) = r :: ReaderN a
"Isomorphism" is a general notion; two things being isomorphic usually means that there is a bijection satisfying certain equations, which depend on the structure or interface being considered. In this case, we are talking about the structure of monads, and we say that two monads are isomorphic if there is a bijection which satisfies these equations:
toStream (return a) = return a
toStream (u >>= k) = toStream u >>= (toStream . k)
The idea is that we get the same result whether we apply the functions return and (>>=) "before or after" the bijection. (The similar equations using fromStream can then be derived from these two equations and the other two above).
#Li-yao Xia's answer pretty much covers it, but if it helps your intuition, think of the Stream monad as modeling an infinite sequence of parallel computations. A Stream value itself is an (infinite) sequence of values, and I can use the Functor instance to apply the same function in parallel to all values in the sequence; the Applicative instance to apply a sequence of given functions to a sequence of values, pointwise with each function applied to the corresponding value; and the Monad instance to apply a computation to each value in the sequence with a result that can depend on both the value and its position within the sequence.
As an example of some typical operations, here are some sample sequences plus a Show-instance
instance (Show a) => Show (Stream a) where
show = show . take 10 . toList
nat = go 1 where go x = Cons x (go (x+1))
odds = go 1 where go x = Cons x (go (x+2))
giving:
> odds
[1,3,5,7,9,11,13,15,17,19]
> -- apply same function to all values
> let evens = fmap (1+) odds
> evens
[2,4,6,8,10,12,14,16,18,20]
> -- pointwise application of functions to values
> (+) <$> odds <*> evens
[3,7,11,15,19,23,27,31,35,39]
> -- computation that depends on value and structure (position)
> odds >>= \val -> fmap (\pos -> (pos,val)) nat
[(1,1),(2,3),(3,5),(4,7),(5,9),(6,11),(7,13),(8,15),(9,17),(10,19)]
>
The difference between the Applicative and Monadic computations here is similar to other monads: the applicative operations have a static structure, in the sense that each result in a <*> b depends only on the values of the corresponding elements in a and b independent of how they fit in to the larger structure (i.e., their positions in the sequence); in contrast, the monadic operations can have a structure that depends on the underlying values, so that in the expression as >>= f, for a given value a in as, the corresponding result can depend both on the specific value a and structurally on its position within the sequence (since this will determine which element of the sequence f a will provide the result).
It turns out that in this case the apparent additional generality of monadic computations doesn't translate into any actual additional generality, as you can see by the fact that the last example above is equivalent to the purely applicative operation:
(,) <$> nat <*> odds
More generally, given a monadic action f :: a -> Stream b, it will always be possible to write it as:
f a = Cons (f1 a) (Cons (f2 a) ...))
for appropriately defined f1 :: a -> b, f2 :: a -> b, etc., after which we'll be able to express the monadic action as an application action:
as >>= f = (Cons f1 (Cons f2 ...)) <*> as
Contrast this with what happens in the List monad: Given f :: a -> List b, if we could write:
f a = [f1 a, f2 a, ..., fn a]
(meaning in particular that the number of elements in the result would be determined by f alone, regardless of the value of a), then we'd have the same situation:
as >>= f = as <**> [f1,...,fn]
and every monadic list operation would be a fundamentally applicative operation.
So, the fact that not all finite lists are the same length makes the List monad more powerful than its applicative, but because all (infinite) sequences are the same length, the Stream monad adds nothing over the applicative instance.
Related
I have two functions which only do something if C is a specific pattern.
Each function outputs a list of C.
My goal is, given [C], I want to get all possibilities of calling f1 and f2 on the list while leaving the rest unchanged. For example:
suppose the list of C is:
c1
c2 --matches the pattern
c3
then I want a list of two lists
[[c1] ++ (f1 c2) ++ [c3],[c1] ++ (f2 c2) ++ [c3]]
However, if I have
c1
c2 --matches the pattern
c3 --matches the pattern
Then we should have 4 lists because we want all combinations of calling f1 and f2.
So it would be:
[(f1 c1) ++ (f1 c2) ++ [c3], (f2 c1) ++ (f2 c2) ++ [c3],
(f1 c1) ++ (f2 c2) ++ [c3], (f2 c1) ++ (f1 c2) ++ [c3]]
currently, my code is structured roughly in the following way:
f1 :: C -> [C]
f2 :: C -> [C]
combine :: [C] -> [[C]]
combine my_pattern:xs = ?
combine (x:xs) = ?
combine [] = []
where first_set = (f1 my_pattern)
second_set = (f2 my_pattern)
Could someone give intuition on how I could fill the remaining part? Is there any functions from Data.List that can be useful? I looked at the documentation, but wasn't able to immediately notice which one could be helpful.
The other answers seem very complicated to me. In this answer I will expand on my comment: this is just a foldMap combining the nondeterminism monad (lists!) with the sequence monoid (lists!).
First write a thing that works on a single element of the list:
singleElement x
| matchesThePattern x = [f1 x, f2 x]
| otherwise = [[x]]
Then apply it to each element:
import Data.Monoid
combine = foldMap (Ap . singleElement)
That's it. That's the whole code.
For example, suppose we want to repeat each letter either 2 or 3 times, i.e. x -> xx or xxx, and all other characters to stay the same.
singleElement x
| 'a' <= x && x <= 'z' = [[x, x], [x, x, x]]
| otherwise = [[x]]
Then we can try it in ghci:
> combine "123def"
Ap {getAp = ["123ddeeff","123ddeefff","123ddeeeff","123ddeeefff","123dddeeff","123dddeefff","123dddeeeff","123dddeeefff"]}
Pick a better name than singleElement in your own code, of course.
You must have
applicable_f1 :: C -> Bool
applicable_f2 :: C -> Bool
defined somehow. Then,
combinations :: [C] -> [[C]]
combinations cs = map concat . sequence $
[ concat $ [ [ [c] | not (applicable_f1 c || applicable_f2 c)]
, [ f1 c | applicable_f1 c]
, [ f2 c | applicable_f2 c] ]
| c <- cs]
My approach would be to
Solve the problem for the element in the list you're currently looking at (x or my_pattern). This means generating one or more new lists.
Solve the problem for the rest of the list (xs). This will give you back a list of lists ([[C]]).
Combine the two solutions. If you have multiple lists generated from step 1, each of these lists ([C]) will combine with each list (also [C]) in the list of lists ([[C]]) from step 2.
I have two possible approaches.
It isn't clear to me how much help you are looking for, so I've left my answers somewhat "spoiler free." Ask for clarification or more details if you need it.
List comprehension
Without delving into the weeds of the Applicative or Traversable typeclasses, you can accomplish what you want with a list comprehension.
Let's consider the case where your pattern is matched. I would write a list comprehension as follows:
[ x ++ y | x <- _, y <- _] :: [[C]]
-- this means
-- x :: [C]
-- y :: [C]
-- _ :: [[C]]
This list comprehension creates a list of lists. x is what is being prepended, so it would make sense for it to be coming from the application of the functions f1 and f2. y is the tail end of each resulting list. I'll leave you to figure out what it might be.
The non matching case is simpler than this, and can be written like
[ x : y | y <- _] :: [[C]]
-- note that x is not local to the list comprehension
-- y :: [C]
-- _ :: [[C]]
although this really is just a special case of the above list comprehension.
Applicative
Another way of approaching this problem would be by using the Applicative instance of [a].
Let's examine the function (<*>) under the list Applicative instance.
-- this is the type when specialized to lists
(<*>) :: [a -> b] -> [a] -> [b]
This function has a kind of strange type signature. It takes a list of functions, and a list, then returns you another list. It has the effect of applying each function a -> b to each element of [a] in order.
>>> [(+1), (+2)] <*> [1,2,3]
-- [2,3,4] comes from (+1)
-- [3,4,5] comes from (+2)
[2,3,4,3,4,5]
We want to get out [[C]], not [C], so if we want to use (<*>) we can specialize its type more to
(<*>) :: [a -> [C]] -> [a] -> [[C]]
To avoid confusion, I recommend picking a = [C], which gives
(<*>) :: [[C] -> [C]] -> [[C]] -> [[C]]
Your list of functions should be prepending the right elements onto the lists you're generating. The second argument should be the lists returned by a recursive call.
For example, I am writing some function for lists and I want to use length function
foo :: [a] -> Bool
foo xs = length xs == 100
How can someone understand could this function be used with infinite lists or not?
Or should I always think about infinite lists and use something like this
foo :: [a] -> Bool
foo xs = length (take 101 xs) == 100
instead of using length directly?
What if haskell would have FiniteList type, so length and foo would be
length :: FiniteList a -> Int
foo :: FiniteList a -> Bool
length traverses the entire list, but to determine if a list has a particular length n you only need to look at the first n elements.
Your idea of using take will work. Alternatively
you can write a lengthIs function like this:
-- assume n >= 0
lengthIs 0 [] = True
lengthIs 0 _ = False
lengthIs n [] = False
lengthIs n (x:xs) = lengthIs (n-1) xs
You can use the same idea to write the lengthIsAtLeast and lengthIsAtMost variants.
On edit: I am primaily responding to the question in your title rather than the specifics of your particular example, (for which ErikR's answer is excellent).
A great many functions (such as length itself) on lists only make sense for finite lists. If the function that you are writing only makes sense for finite lists, make that clear in the documentation (if it isn't obvious). There isn't any way to enforce the restriction since the Halting problem is unsolvable. There simply is no algorithm to determine ahead of time whether or not the comprehension
takeWhile f [1..]
(where f is a predicate on integers) produces a finite or an infinite list.
Nats and laziness strike again:
import Data.List
data Nat = S Nat | Z deriving (Eq)
instance Num Nat where
fromInteger 0 = Z
fromInteger n = S (fromInteger (n - 1))
Z + m = m
S n + m = S (n + m)
lazyLength :: [a] -> Nat
lazyLength = genericLength
main = do
print $ lazyLength [1..] == 100 -- False
print $ lazyLength [1..100] == 100 -- True
ErikR and John Coleman have already answered the main parts of your question, however I'd like to point out something in addition:
It's best to write your functions in a way that they simply don't depend on the finiteness or infinity of their inputs — sometimes it's impossible but a lot of the time it's just a matter of redesign. For example instead of computing the average of the entire list, you can compute a running average, which is itself a list; and this list will itself be infinite if the input list is infinite, and finite otherwise.
avg :: [Double] -> [Double]
avg = drop 1 . scanl f 0.0 . zip [0..]
where f avg (n, i) = avg * (dbl n / dbl n') +
i / dbl n' where n' = n+1
dbl = fromInteger
in which case you could average an infinite list, not having to take its length:
*Main> take 10 $ avg [1..]
[1.0,1.5,2.0,2.5,3.0,3.5,4.0,4.5,5.0]
In other words, one option is to design as much of your functions to simply not care about the infinity aspect, and delay the (full) evaluation of lists, and other (potentially infinite) data structures, to as late a phase in your program as possible.
This way they will also be more reusable and composable — anything with fewer or more general assumptions about its inputs tends to be more composable; conversely, anything with more or more specific assumptions tends to be less composable and therefore less reusable.
There are a couple different ways to make a finite list type. The first is simply to make lists strict in their spines:
data FList a = Nil | Cons a !(FList a)
Unfortunately, this throws away all efficiency benefits of laziness. Some of these can be recovered by using length-indexed lists instead:
{-# LANGUAGE GADTs #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# OPTIONS_GHC -fwarn-incomplete-patterns #-}
data Nat = Z | S Nat deriving (Show, Read, Eq, Ord)
data Vec :: Nat -> * -> * where
Nil :: Vec 'Z a
Cons :: a -> Vec n a -> Vec ('S n) a
instance Functor (Vec n) where
fmap _f Nil = Nil
fmap f (Cons x xs) = Cons (f x) (fmap f xs)
data FList :: * -> * where
FList :: Vec n a -> FList a
instance Functor FList where
fmap f (FList xs) = FList (fmap f xs)
fcons :: a -> FList a -> FList a
fcons x (FList xs) = FList (Cons x xs)
funcons :: FList a -> Maybe (a, FList a)
funcons (FList Nil) = Nothing
funcons (FList (Cons x xs)) = Just (x, FList xs)
-- Foldable and Traversable instances are straightforward
-- as well, and in recent GHC versions, Foldable brings
-- along a definition of length.
GHC does not allow infinite types, so there's no way to build an infinite Vec and thus no way to build an infinite FList (1). However, an FList can be transformed and consumed somewhat lazily, with the cache and garbage collection benefits that entails.
(1) Note that the type system forces fcons to be strict in its FList argument, so any attempt to tie a knot with FList will bottom out.
The standard libraries include a function
unzip :: [(a, b)] -> ([a], [b])
The obvious way to define this is
unzip xs = (map fst xs, map snd xs)
However, this means traversing the list twice to construct the result. What I'm wondering is, is there some way to do this with only one traversal?
Appending to a list is expensive - O(n) in fact. But, as any newbie knows, we can make clever use of laziness and recursion to "append" to a list with a recursive call. Thus, zip may easily be implemented as
zip :: [a] -> [b] -> [(a, b)]
zip (a:as) (b:bs) = (a,b) : zip as bs
This trick appear to only work if you're returning one list, however. I can't see how to extend this to allow constructing the tails of multiple lists simultaneously without ending up duplicating the source traversal.
I always presumed that the unzip from the standard library manages to do this in a single traversal (that's kind of the whole point of implementing this otherwise trivial function in a library), but I don't actually know how it works.
Yes, it is possible:
unzip = foldr (\(a,b) ~(as,bs) -> (a:as,b:bs)) ([],[])
With explicit recursion, this would look thus:
unzip [] = ([], [])
unzip ((a,b):xs) = (a:as, b:bs)
where ( as, bs) = unzip xs
The reason that the standard library has the irrefutable pattern match ~(as, bs) is to allow it to work actually lazily:
Prelude> let unzip' = foldr (\(a,b) ~(as,bs) -> (a:as,b:bs)) ([],[])
Prelude> let unzip'' = foldr (\(a,b) (as,bs) -> (a:as,b:bs)) ([],[])
Prelude> head . fst $ unzip' [(n,n) | n<-[1..]]
1
Prelude> head . fst $ unzip'' [(n,n) | n<-[1..]]
*** Exception: stack overflow
The following ideas stem from The Beautiful Folding.
When you have two folding operations over a list, you can always perform them at once by folding with keeping both their states. Let's express this in Haskell. First we need to capture what is a folding operation:
{-# LANGUAGE ExistentialQuantification #-}
import Control.Applicative
data Foldr a b = forall r . Foldr (a -> r -> r) r (r -> b)
A folding operation has a folding function, a start value, and a function that produces a result from a final state. By using existential quantification we can hide the type of the state, which is necessary to combine folds with different states.
Applying a Foldr to a list is just the matter of calling foldr with the appropriate arguments:
fold :: Foldr a b -> [a] -> b
fold (Foldr f s g) = g . foldr f s
Naturally, Foldr is a functor, we can always append a function to the finalizing one:
instance Functor (Foldr a) where
fmap f (Foldr k s r) = Foldr k s (f . r)
More interestingly, it's also an Applicative functor. Implementing pure is easy, we just return a given value and don't fold anything. The most interesting part is <*>. It creates a new fold that keeps the states of both give folds and at the end, combines the results.
instance Applicative (Foldr a) where
pure x = Foldr (\_ _ -> ()) () (\_ -> x)
(Foldr f1 s1 r1) <*> (Foldr f2 s2 r2)
= Foldr foldPair (s1, s2) finishPair
where
foldPair a ~(x1, x2) = (f1 a x1, f2 a x2)
finishPair ~(x1, x2) = r1 x1 (r2 x2)
f *> g = g
f <* g = f
Notice (as in leftaroundabout's answer) that we have lazy pattern matches ~ on tuples. This ensures that <*> is sufficiently lazy.
Now we can express map as a Foldr:
fromMap :: (a -> b) -> Foldr a [b]
fromMap f = Foldr (\x xs -> f x : xs) [] id
With that, defining unzip becomes easy. We just combine two maps, one using fst and another using snd:
unzip' :: Foldr (a, b) ([a], [b])
unzip' = (,) <$> fromMap fst <*> fromMap snd
unzip :: [(a, b)] -> ([a], [b])
unzip = fold unzip'
We can verify that it processes an input only once (and lazily): Both
head . snd $ unzip (repeat (3,'a'))
head . fst $ unzip (repeat (3,'a'))
yield the correct result.
You don't offen see Maybe List except for error-handling for example, because lists are a bit Maybe themselves: they have their own "Nothing": [] and their own "Just": (:).
I wrote a list type using Maybe and functions to convert standard and to "experimental" lists. toStd . toExp == id.
data List a = List a (Maybe (List a))
deriving (Eq, Show, Read)
toExp [] = Nothing
toExp (x:xs) = Just (List x (toExp xs))
toStd Nothing = []
toStd (Just (List x xs)) = x : (toStd xs)
What do you think about it, as an attempt to reduce repetition, to generalize?
Trees too could be defined using these lists:
type Tree a = List (Tree a, Tree a)
I haven't tested this last piece of code, though.
All ADTs are isomorphic (almost--see end) to some combination of (,),Either,(),(->),Void and Mu where
data Void --using empty data decls or
newtype Void = Void Void
and Mu computes the fixpoint of a functor
newtype Mu f = Mu (f (Mu f))
so for example
data [a] = [] | (a:[a])
is the same as
data [a] = Mu (ListF a)
data ListF a f = End | Pair a f
which itself is isomorphic to
newtype ListF a f = ListF (Either () (a,f))
since
data Maybe a = Nothing | Just a
is isomorphic to
newtype Maybe a = Maybe (Either () a)
you have
newtype ListF a f = ListF (Maybe (a,f))
which can be inlined in the mu to
data List a = List (Maybe (a,List a))
and your definition
data List a = List a (Maybe (List a))
is just the unfolding of the Mu and elimination of the outer Maybe (corresponding to non-empty lists)
and you are done...
a couple of things
Using custom ADTs increases clarity and type safety
This universality is useful: see GHC.Generic
Okay, I said almost isomorphic. It is not exactly, namely
hmm = List (Just undefined)
has no equivalent value in the [a] = [] | (a:[a]) definition of lists. This is because Haskell data types are coinductive, and has been a point of criticism of the lazy evaluation model. You can get around these problems by only using strict sums and products (and call by value functions), and adding a special "Lazy" data constructor
data SPair a b = SPair !a !b
data SEither a b = SLeft !a | SRight !b
data Lazy a = Lazy a --Note, this has no obvious encoding in Pure CBV languages,
--although Laza a = (() -> a) is semantically correct,
--it is strictly less efficient than Haskell's CB-Need
and then all the isomorphisms can be faithfully encoded.
You can define lists in a bunch of ways in Haskell. For example, as functions:
{-# LANGUAGE RankNTypes #-}
newtype List a = List { runList :: forall b. (a -> b -> b) -> b -> b }
nil :: List a
nil = List (\_ z -> z )
cons :: a -> List a -> List a
cons x xs = List (\f z -> f x (runList xs f z))
isNil :: List a -> Bool
isNil xs = runList xs (\x xs -> False) True
head :: List a -> a
head xs = runList xs (\x xs -> x) (error "empty list")
tail :: List a -> List a
tail xs | isNil xs = error "empty list"
tail xs = fst (runList xs go (nil, nil))
where go x (xs, xs') = (xs', cons x xs)
foldr :: (a -> b -> b) -> b -> List a -> b
foldr f z xs = runList xs f z
The trick to this implementation is that lists are being represented as functions that execute a fold over the elements of the list:
fromNative :: [a] -> List a
fromNative xs = List (\f z -> foldr f z xs)
toNative :: List a -> [a]
toNative xs = runList xs (:) []
In any case, what really matters is the contract (or laws) that the type and its operations follow, and the performance of implementation. Basically, any implementation that fulfills the contract will give you correct programs, and faster implementations will give you faster programs.
What is the contract of lists? Well, I'm not going to express it in complete detail, but lists obey statements like these:
head (x:xs) == x
tail (x:xs) == xs
[] == []
[] /= x:xs
If xs == ys and x == y, then x:xs == y:ys
foldr f z [] == z
foldr f z (x:xs) == f x (foldr f z xs)
EDIT: And to tie this to augustss' answer:
newtype ExpList a = ExpList (Maybe (a, ExpList a))
toExpList :: List a -> ExpList a
toExpList xs = runList xs (\x xs -> ExpList (Just (x, xs))) (ExpList Nothing)
foldExpList f z (ExpList Nothing) = z
foldExpList f z (ExpList (Just (head, taill))) = f head (foldExpList f z tail)
fromExpList :: ExpList a -> List a
fromExpList xs = List (\f z -> foldExpList f z xs)
You could define lists in terms of Maybe, but not that way do. Your List type cannot be empty. Or did you intend Maybe (List a) to be the replacement of [a]. This seems bad since it doesn't distinguish the list and maybe types.
This would work
newtype List a = List (Maybe (a, List a))
This has some problems. First using this would be more verbose than usual lists, and second, the domain is not isomorphic to lists since we got a pair in there (which can be undefined; adding an extra level in the domain).
If it's a list, it should be an instance of Functor, right?
instance Functor List
where fmap f (List a as) = List (f a) (mapMaybeList f as)
mapMaybeList :: (a -> b) -> Maybe (List a) -> Maybe (List b)
mapMaybeList f as = fmap (fmap f) as
Here's a problem: you can make List an instance of Functor, but your Maybe List is not: even if Maybe was not already an instance of Functor in its own right, you can't directly make a construction like Maybe . List into an instance of anything (you'd need a wrapper type).
Similarly for other typeclasses.
Having said that, with your formulation you can do this, which you can't do with standard Haskell lists:
instance Comonad List
where extract (List a _) = a
duplicate x # (List _ y) = List x (duplicate y)
A Maybe List still wouldn't be comonadic though.
When I first started using Haskell, I too tried to represent things in existing types as much as I could on the grounds that it's good to avoid redundancy. My current understanding (moving target!) tends to involve more the idea of a multidimensional web of trade-offs. I won't be giving any “answer” here so much as pasting examples and asking “do you see what I mean?” I hope it helps anyway.
Let's have a look at a bit of Darcs code:
data UseCache = YesUseCache | NoUseCache
deriving ( Eq )
data DryRun = YesDryRun | NoDryRun
deriving ( Eq )
data Compression = NoCompression
| GzipCompression
deriving ( Eq )
Did you notice that these three types could all have been Bool's? Why do you think the Darcs hackers decided that they should introduce this sort of redundancy in their code? As another example, here is a piece of code we changed a few years back:
type Slot = Maybe Bool -- OLD code
data Slot = InFirst | InMiddle | InLast -- newer code
Why do you think we decided that the second code was an improvement over the first?
Finally, here is a bit of code from some of my day job stuff. It uses the newtype syntax that augustss mentioned,
newtype Role = Role { fromRole :: Text }
deriving (Eq, Ord)
newtype KmClass = KmClass { fromKmClass :: Text }
deriving (Eq, Ord)
newtype Lemma = Lemma { fromLemma :: Text }
deriving (Eq, Ord)
Here you'll notice that I've done the curious thing of taking a perfectly good Text type and then wrapping it up into three different things. The three things don't have any new features compared to plain old Text. They're just there to be different. To be honest, I'm not entirely sure if it was a good idea for me to do this. I provisionally think it was because I manipulate lots of different bits and pieces of text for lots of reasons, but time will tell.
Can you see what I'm trying to get at?
I'm trying to write a function that takes a predicate f and a list and returns a list consisting of all items that satisfy f with preserved order. The trick is to do this using only higher order functions (HoF), no recursion, no comprehensions, and of course no filter.
You can express filter in terms of foldr:
filter p = foldr (\x xs-> if p x then x:xs else xs) []
I think you can use map this way:
filter' :: (a -> Bool) -> [a] -> [a]
filter' p xs = concat (map (\x -> if (p x) then [x] else []) xs)
You see? Convert the list in a list of lists, where if the element you want doesn't pass p, it turns to an empty list
filter' (> 1) [1 , 2, 3 ] would be: concat [ [], [2], [3]] = [2,3]
In prelude there is concatMap that makes the code simplier :P
the code should look like:
filter' :: (a -> Bool) -> [a] -> [a]
filter' p xs = concatMap (\x -> if (p x) then [x] else []) xs
using foldr, as suggested by sclv, can be done with something like this:
filter'' :: (a -> Bool) -> [a] -> [a]
filter'' p xs = foldr (\x y -> if p x then (x:y) else y) [] xs
You're obviously doing this to learn, so let me show you something cool. First up, to refresh our minds, the type of filter is:
filter :: (a -> Bool) -> [a] -> [a]
The interesting part of this is the last bit [a] -> [a]. It breaks down one list and it builds up a new list.
Recursive patterns are so common in Haskell (and other functional languages) that people have come up with names for some of these patterns. The simplest are the catamorphism and it's dual the anamorphism. I'll show you how this relates to your immediate problem at the end.
Fixed points
Prerequisite knowledge FTW!
What is the type of Nothing? Firing up GHCI, it says Nothing :: Maybe a and I wouldn't disagree. What about Just Nothing? Using GHCI again, it says Just Nothing :: Maybe (Maybe a) which is also perfectly valid, but what about the value that this a Nothing embedded within an arbitrary number, or even an infinite number, of Justs. ie, what is the type of this value:
foo = Just foo
Haskell doesn't actually allow such a definition, but with a slight tweak we can make such a type:
data Fix a = In { out :: a (Fix a) }
just :: Fix Maybe -> Fix Maybe
just = In . Just
nothing :: Fix Maybe
nothing = In Nothing
foo :: Fix Maybe
foo = just foo
Wooh, close enough! Using the same type, we can create arbitrarily nested nothings:
bar :: Fix Maybe
bar = just (just (just (just nothing)))
Aside: Peano arithmetic anyone?
fromInt :: Int -> Fix Maybe
fromInt 0 = nothing
fromInt n = just $ fromInt (n - 1)
toInt :: Fix Maybe -> Int
toInt (In Nothing) = 0
toInt (In (Just x)) = 1 + toInt x
This Fix Maybe type is a bit boring. Here's a type whose fixed-point is a list:
data L a r = Nil | Cons a r
type List a = Fix (L a)
This data type is going to be instrumental in demonstrating some recursion patterns.
Useful Fact: The r in Cons a r is called a recursion site
Catamorphism
A catamorphism is an operation that breaks a structure down. The catamorphism for lists is better known as a fold. Now the type of a catamorphism can be expressed like so:
cata :: (T a -> a) -> Fix T -> a
Which can be written equivalently as:
cata :: (T a -> a) -> (Fix T -> a)
Or in English as:
You give me a function that reduces a data type to a value and I'll give you a function that reduces it's fixed point to a value.
Actually, I lied, the type is really:
cata :: Functor T => (T a -> a) -> Fix T -> a
But the principle is the same. Notice, T is only parameterized over the type of the recursion sites, so the Functor part is really saying "Give me a way of manipulating all the recursion sites".
Then cata can be defined as:
cata f = f . fmap (cata f) . out
This is quite dense, let me elaborate. It's a three step process:
First, We're given a Fix t, which is a difficult type to play with, we can make it easier by applying out (from the definition of Fix) giving us a t (Fix t).
Next we want to convert the t (Fix t) into a t a, which we can do, via wishful thinking, using fmap (cata f); we're assuming we'll be able to construct cata.
Lastly, we have a t a and we want an a, so we just use f.
Earlier I said that the catamorphism for a list is called fold, but cata doesn't look much like a fold at the moment. Let's define a fold function in terms of cata.
Recapping, the list type is:
data L a r = Nil | Cons a r
type List a = Fix (L a)
This needs to be a functor to be useful, which is straight forward:
instance Functor (L a) where
fmap _ Nil = Nil
fmap f (Cons a r) = Cons a (f r)
So specializing cata we get:
cata :: (L x a -> a) -> List x -> a
We're practically there:
construct :: (a -> b -> b) -> b -> L a b -> b
construct _ x (In Nil) = x
construct f _ (In (Cons e n)) = f e n
fold :: (a -> b -> b) -> b -> List a -> b
fold f m = cata (construct f m)
OK, catamorphisms break data structures down one layer at a time.
Anamorphisms
Anamorphisms over lists are unfolds. Unfolds are less commonly known than there fold duals, they have a type like:
unfoldr :: (b -> Maybe (a, b)) -> b -> [a]
As you can see anamorphisms build up data structures. Here's the more general type:
ana :: Functor a => (a -> t a) -> a -> Fix t
This should immediately look quite familiar. The definition is also reminiscent of the catamorphism.
ana f = In . fmap (ana f) . f
It's just the same thing reversed. Constructing unfold from ana is even simpler than constructing fold from cata. Notice the structural similarity between Maybe (a, b) and L a b.
convert :: Maybe (a, b) -> L a b
convert Nothing = Nil
convert (Just (a, b)) = Cons a b
unfold :: (b -> Maybe (a, b)) -> b -> List a
unfold f = ana (convert . f)
Putting theory into practice
filter is an interesting function in that it can be constructed from a catamorphism or from an anamorphism. The other answers to this question (to date) have also used catamorphisms, but I'll define it both ways:
filter p = foldr (\x xs -> if p x then x:xs else xs) []
filter p =
unfoldr (f p)
where
f _ [] =
Nothing
f p (x:xs) =
if p x then
Just (x, xs)
else
f p xs
Yes, yes, I know I used a recursive definition in the unfold version, but forgive me, I taught you lots of theory and anyway filter isn't recursive.
I'd suggest you look at foldr.
Well, are ifs and empty list allowed?
filter = (\f -> (>>= (\x -> if (f x) then return x else [])))
For a list of Integers
filter2::(Int->Bool)->[Int]->[Int]
filter2 f []=[]
filter2 f (hd:tl) = if f hd then hd:filter2 f tl
else filter2 f tl
I couldn't resist answering this question in another way, this time with no recursion at all.
-- This is a type hack to allow the y combinator to be represented
newtype Mu a = Roll { unroll :: Mu a -> a }
-- This is the y combinator
fix f = (\x -> f ((unroll x) x))(Roll (\x -> f ((unroll x) x)))
filter :: (a -> Bool) -> [a] -> [a]
filter =
fix filter'
where
-- This is essentially a recursive definition of filter
-- except instead of calling itself, it calls f, a function that's passed in
filter' _ _ [] = []
filter' f p (x:xs) =
if p x then
(x:f p xs)
else
f p xs