Why can't ++ be used in pattern matching? - list

From LearnYouAHaskell:
One more thing — you can't use ++ in pattern matches. If you tried to pattern match against (xs ++ ys), what would be in the first and what would be in the second list? It doesn't make much sense. It would make sense to match stuff against (xs ++ [x,y,z]) or just (xs ++ [x]), but because of the nature of lists, you can't do that.
I'm struggling to understand what he means by "the nature of lists" and why it rules this out.

You pattern match against constructors, and for lists there are two constructors, the empty list [] and 'cons' :, where cons has two arguments, the head of the list, and the tail. ++ is a function to append two lists, so you can't match against it.
If you could match against it, there would be multiple possible matches for xs and ys in the pattern xs ++ ys. For example, even for a small list like [1] there are two possibilities:
xs == [] and ys == [1]
xs == [1] and ys == []
so the match is ambiguous which is what the quote is about.
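To make the ambiguity concrete, here is a minimal sketch that lists every way a list could be written as xs ++ ys. The name splits is made up for this illustration; inits and tails come from Data.List.

```haskell
import Data.List (inits, tails)

-- Every way to write a list as xs ++ ys, i.e. every candidate match
-- for the hypothetical pattern (xs ++ ys).
splits :: [a] -> [([a], [a])]
splits zs = zip (inits zs) (tails zs)
```

In ghci, splits [1] gives [([],[1]),([1],[])], the two possibilities listed above, and a list of length n has n + 1 such splits.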

You can only pattern match on data constructors. This is made confusing because the List type in Haskell relies on some syntactic sugar. So let's remove that and define our own List type so it's easier to see what's going on.
data List a = Nil | Cons a (List a)
Now when you want to declare a function that uses this List type, you can pattern match on the constructors.
myHead :: List a -> Maybe a
myHead Nil = Nothing
myHead (Cons a as) = Just a
This is the only kind of pattern matching you can do by default. List in Haskell is implemented with some sugar that renames Cons to (:) and Nil to []. It also lets you write your lists in [a, b, c] notation, but fundamentally, these things are the same.
(++) on the other hand is a normal function and not a data constructor.
My guess on what you're trying to do is look at the first few elements in a list and separate them from the rest of the list. You can do this in a pretty straightforward way using normal pattern matching.
getFirstThree :: [a] -> Maybe (a, a, a)
getFirstThree (a1:a2:a3:as) = Just (a1, a2, a3)
getFirstThree _ = Nothing
If you could pattern match on normal functions, this would be the same as writing:
getFirstThree :: [a] -> Maybe (a, a, a)
getFirstThree ([a1, a2, a3] ++ as) = Just (a1, a2, a3)
getFirstThree _ = Nothing
And maybe this second definition is clearer to you. I don't believe that this is the case for lists, but there are other data types where it might be. For this sort of thing, the ViewPatterns and PatternSynonyms extensions exist; they would let you define a matchable pattern that behaves like (++) by telling the compiler how to perform the match. However, as @Lee notes, a bare (a ++ b) pattern is ambiguous because a given list can usually be split into a and b in more than one way.
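As a rough sketch of what those two extensions allow (assuming GHC): the view pattern below fixes the split point at three elements, which removes the ambiguity. The helper unsnocList and the pattern synonym Snoc are hypothetical names invented for this example; Snoc stands in for the (xs ++ [x]) pattern from the quote.

```haskell
{-# LANGUAGE PatternSynonyms #-}
{-# LANGUAGE ViewPatterns #-}

-- View pattern: apply splitAt 3 to the argument, then match on the result.
getFirstThree :: [a] -> Maybe (a, a, a)
getFirstThree (splitAt 3 -> ([a1, a2, a3], _)) = Just (a1, a2, a3)
getFirstThree _ = Nothing

-- Hypothetical helper: split a non-empty list into its init and last element.
unsnocList :: [a] -> Maybe ([a], a)
unsnocList [] = Nothing
unsnocList xs = Just (init xs, last xs)

-- Hypothetical one-way pattern synonym playing the role of (xs ++ [x]).
pattern Snoc :: [a] -> a -> [a]
pattern Snoc xs x <- (unsnocList -> Just (xs, x))

lastElem :: [a] -> Maybe a
lastElem (Snoc _ x) = Just x
lastElem _ = Nothing
```

Both patterns work only because the function in the view decides where the list is cut; the compiler never has to guess.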

Related

Passing empty list in Haskell [duplicate]

Consider the following snippet which defines a function foo which takes in a list and performs some operation on the list (like sorting).
I tried to load the snippet in ghci:
-- a function which consumes lists and produces lists
foo :: Ord a => [a] -> [a]
foo [] = []
foo (x:xs) = xs
test1 = foo [1, 2, 3] == [2, 3]
test2 = null $ foo []
yet the following error occurs:
No instance for (Ord a0) arising from a use of ‘foo’
The type variable ‘a0’ is ambiguous
Note: there are several potential instances:
instance (Ord a, Ord b) => Ord (Either a b)
-- Defined in ‘Data.Either’
instance forall (k :: BOX) (s :: k). Ord (Data.Proxy.Proxy s)
-- Defined in ‘Data.Proxy’
instance (GHC.Arr.Ix i, Ord e) => Ord (GHC.Arr.Array i e)
-- Defined in ‘GHC.Arr’
...plus 26 others
In the second argument of ‘($)’, namely ‘foo []’
In the expression: null $ foo []
In an equation for ‘test2’: test2 = null $ foo []
The problem is in the expression test2 = null $ foo []. Furthermore, removing the Ord a constraint from the type signature of foo solves the problem. Strangely, typing null $ foo [] in interactive mode (after loading the definition of foo) works correctly and produces the expected True.
I need a clear explanation for this behaviour.
I like thinking of typeclasses in "dictionary-passing style". The signature
foo :: Ord a => [a] -> [a]
says that foo takes a dictionary of methods for Ord a, essentially as a parameter, and a list of as, and gives back a list of as. The dictionary has things in it like (<) :: a -> a -> Bool and its cousins. When we call foo, we need to supply such a dictionary. This is done implicitly by the compiler. So
foo [1,2,3]
will use the Ord Integer dictionary, because we know that a is Integer.
However, in foo [], the list could be a list of anything -- there is no information to determine the type. But we still need to find the Ord dictionary to pass to foo (although your foo doesn't use it at all, the signature says that it could, and that's all that matters). That's why there is an ambiguous type error. You can specify the type manually, which will give enough information to fill in the dictionary, like this
null (foo ([] :: [Integer]))
or with the new TypeApplications extension
null (foo @Integer [])
If you remove the Ord constraint, it works, as you have observed, and this is just because we no longer need to supply a dictionary. We don't need to know what specific type a is to call foo anymore (this feels a little magical to me :-).
Note that foo ([] :: Ord a => [a]) does not eliminate the ambiguity, because it is not known which specific Ord dictionary you want to pass; is it Ord Int or Ord (Maybe String), etc.? There is no generic Ord dictionary, so we have to choose, and there is no rule for what type to choose in this case. Whereas when you say (Ord a, Num a) => [a], then defaulting specifies a way to choose, and we pick Integer, since it is a special case of the Num class.
The fact that foo [] works in ghci is due to ghci’s extended defaulting rules. It might be worth reading about type defaulting in general, which is surely not the prettiest part of Haskell, but it is going to come up a lot in the kinds of corner cases you are asking about.
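Put together as a module that should load without the ambiguity error, the two fixes look roughly like this. The name test2' is added here only for the comparison; foo is the function from the question.

```haskell
{-# LANGUAGE TypeApplications #-}

foo :: Ord a => [a] -> [a]
foo [] = []
foo (_ : xs) = xs

-- Annotating the empty list fixes a = Integer, so the Ord dictionary is known.
test2 :: Bool
test2 = null (foo ([] :: [Integer]))

-- The same choice made with a visible type application.
test2' :: Bool
test2' = null (foo @Integer [])
```

Either form tells the compiler which Ord dictionary to pass, so neither relies on ghci's extended defaulting.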

Compare a null list with [(a,b)] Haskell

I am a bit confused on how the lists in Haskell works.
I know that [] is an empty list with type [a]. Is there a way to define an empty list with type [(a,b)]?
For example I know that null [1] == [] will give us True
null [(1,2)] == [] gave me a mismatched type error, or at least that is what I assume it was.
I want to know if it is possible to say something like null [(1,2)] == [(,)] will give us True
I know that [] is an empty list with type [a]
Yes, but it's important to understand what “type [a]” actually means. Really it means type forall a . [a], i.e. this is not a list of elements of some particular type “a” but rather, given any choice of type a, it is a list of that type. In particular, it can also be a list of tuple type. Thus, null works just as fine with lists of tuples as with lists of any other type.
To actually see null in action on such a tuple list, you just need to supply one. For instance, null [(1,2)] uses it for a tuple-list. But in case of the empty list, there's no content of the list with which you would constrain the type. It may either be clear from the context, as in
Prelude> [null l | l <- [ [], [(1,2)], [(1,3)] ]]
[True,False,False]
or you may explicitly specify it with a signature
Prelude> null ([] :: [(String, Double)])
True
Is there a way to define an empty list with type [(a,b)]?
Simply []. Indeed [] is an empty list, but the type of elements is free. It has type [a], but a can be the same as (b, c) so a 2-tuple with two different types.
For example I know that null [1] == [] will give us True
null :: Foldable f => f a -> Bool is a function that takes an f a (in this case an [a]), and returns a Bool. It checks if a list is empty. It does not generate a list.
I would declare my list like this:
let myList = [] :: [(Integer, Integer)]
Then :t myList will yield myList :: [(Integer, Integer)].
Evaluating null myList yields True.
Every type of list is represented as []. If you want to specify the type you can use a signature ([] :: [(a, b)]), but often the compiler can infer the type from context:
λ> [] == [(1, 'x')]
False
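The same points as a small module, in case the ghci one-liners are hard to compare. The names emptyPairs and checks are invented for this sketch.

```haskell
-- An empty list of pairs; the signature fixes the element type.
emptyPairs :: [(Integer, Char)]
emptyPairs = []

-- null returns a Bool, so its result is never compared with [].
checks :: [Bool]
checks =
  [ null emptyPairs              -- the empty list is empty
  , null [(1 :: Integer, 'x')]   -- a one-element list is not
  , emptyPairs == [(1, 'x')]     -- element type inferred from emptyPairs
  ]
```

Evaluating checks gives [True,False,False].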

Why is Haskell [] (list) not a type class?

I am writing a Haskell function which takes a list as input. That is, there's no reason it couldn't be a queue or dequeue, or anything that allows me to access its "head" and its "tail" (and check if it's empty). So the [a] input type seems too specific. But AFAIK there's no standard library typeclass that captures exactly this interface. Sure, I could wrap my function in a Data.Foldable.toList and make it polymorphic wrt Foldable, but that doesn't quite seem right (idiomatic).
Why is there no standard list type class? (And why is the "container" type class hierarchy in Haskell less developed than I think it should be?) Or am I missing something essential?
A given algebraic datatype can be represented as its catamorphism, a transformation known as Church encoding. That means lists are isomorphic to their foldr:
type List a = forall b. (a -> b -> b) -> b -> b
fromList :: [a] -> List a
fromList xs = \f z -> foldr f z xs
toList :: List a -> [a]
toList l = l (:) []
But foldr also characterises Foldable. You can define foldMap in terms of foldr, and vice versa.
foldMap f = foldr (mappend . f) mempty
foldr f z t = appEndo (foldMap (Endo . f) t) z
(It shouldn't be surprising that foldMap :: Monoid m => (a -> m) -> [a] -> m characterises lists, because lists are a free monoid.) In other words, Foldable basically gives you toList as a class. Instances of Foldable have a "path" through them which can be walked to give you a list; Foldable types have at least as much structure as lists.
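For anyone who wants to try the definitions above, here is a self-contained version. It assumes GHC with RankNTypes (needed for the forall in the type synonym); the names foldMapViaFoldr and foldrViaFoldMap are made up here and are specialised to lists to keep the types simple.

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.Monoid (Endo (..))

-- A list represented by its own foldr.
type List a = forall b. (a -> b -> b) -> b -> b

fromList :: [a] -> List a
fromList xs = \f z -> foldr f z xs

toList :: List a -> [a]
toList l = l (:) []

-- foldMap written with foldr ...
foldMapViaFoldr :: Monoid m => (a -> m) -> [a] -> m
foldMapViaFoldr f = foldr (mappend . f) mempty

-- ... and foldr written with foldMap, via the Endo monoid.
foldrViaFoldMap :: (a -> b -> b) -> b -> [a] -> b
foldrViaFoldMap f z t = appEndo (foldMap (Endo . f) t) z
```

toList (fromList xs) gives back xs, which is the round trip the isomorphism claim refers to.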
Regarding your misgivings:
It's not like Foldable has functions head/tail/isEmpty, which is what I would find more intuitive.
null :: Foldable t => t a -> Bool is your isEmpty, and you can define (a safe version of) head straightforwardly with an appropriate choice of Monoid:
head :: Foldable t => t a -> Maybe a
head = getFirst . foldMap (First . Just)
tail is kinda tricky in my opinion. It's not obvious what tail would even mean for an arbitrary type. You can certainly write tail :: Foldable t => t a -> Maybe [a] (by toListing and then unconsing), but I think any type T for which tail :: T a -> Maybe (T a) is defined would necessarily be structurally similar to lists (eg Seq). Besides, in my experience, the vast majority of cases where you'd think you need access to a list's tail turn out to be folds after all.
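A sketch of both functions for an arbitrary Foldable, renamed safeHead and safeTail here so they do not clash with the Prelude. First is the one from Data.Monoid, and uncons is from Data.List.

```haskell
import Data.Foldable (toList)
import Data.List (uncons)
import Data.Monoid (First (..))

-- First keeps the leftmost Just, so folding with it yields the first element.
safeHead :: Foldable t => t a -> Maybe a
safeHead = getFirst . foldMap (First . Just)

-- The tail comes back as a plain list, because an arbitrary Foldable
-- cannot be rebuilt from its elements.
safeTail :: Foldable t => t a -> Maybe [a]
safeTail = fmap snd . uncons . toList
```

For example, safeHead (Just 'x') is Just 'x' and safeTail [1,2,3] is Just [2,3].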
That said, abstracting over unconsable types is occasionally useful. megaparsec, for example, defines a Stream class for (monomorphic) streams of tokens to be used as input for a parser.
The Question
Making your question more concrete, let's ask:
Why isn't the type class
class HasHeadAndTail t where
head :: t a -> Maybe a
tail :: t a -> Maybe (t a)
isEmpty :: t a -> Bool
in the base library?
An Answer
This class is only useful for ordered, linear containers. Map, Set, HashMap, HashTable, and Tree all would not be instances. I'd even argue against making Seq and DList an instance since there are really two possible "heads" of that structure.
Also, what can we say about any type that is an instance of this class? I think the only property is that if isEmpty is False then head and tail should be non-Nothing. As a result, isEmpty shouldn't even be in the class and should instead be a function isEmpty :: HasHeadAndTail t => t a -> Bool ; isEmpty = isNothing . head.
So my answer is:
This class lacks utility in so far as it lacks instances.
This class lacks useful properties and classes that lack properties are frequently discouraged.
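For reference, the slimmed-down class suggested above could look something like this. The list instance is an assumption added for illustration, and the Prelude's head and tail are hidden to avoid a name clash.

```haskell
import Prelude hiding (head, tail)
import Data.Maybe (isNothing)

class HasHeadAndTail t where
  head :: t a -> Maybe a
  tail :: t a -> Maybe (t a)

-- Illustrative instance for the built-in list type.
instance HasHeadAndTail [] where
  head [] = Nothing
  head (x : _) = Just x
  tail [] = Nothing
  tail (_ : xs) = Just xs

-- isEmpty is derived from head rather than being a class method.
isEmpty :: HasHeadAndTail t => t a -> Bool
isEmpty = isNothing . head
```

Beyond lists and a few list-like types, it is hard to name further instances, which is the first objection above.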

<*> for lists implemented as do notation - isn't this "cheating"?

According to 'Learn you a Haskell', the implementation of <*> for lists is:
fs <*> xs = [f x | f <- fs, x <- xs]
Am I mistaken, or is this sugared monadic code based on >>= ?
As far as I understand, it should be possible to implement <*> only using fmap, as it is the case with applicative Maybe.
How could <*> for lists be implemented only using fmap? (and possibly without concat-ing things?)
BTW, a few pages later I see the same issue with regards to the implementation of <*> for applicative IO.
No, this is not sugared monadic code based on >>=. If it were, the definition of >>= in the Monad [] instance would be circular.
instance Monad [] where
{-# INLINE (>>=) #-}
xs >>= f = [y | x <- xs, y <- f x]
...
The list comprehensions are syntactic sugar for let, if, and concatMap. From the Haskell Report:
[ e | b, Q ] = if b then [ e | Q ] else []
[ e | let decls, Q ] = let decls in [ e | Q ]
[ e | p <- l, Q ] = let ok p = [ e | Q ]
ok _ = []
in concatMap ok l
The Monad [] instance is easy to define in terms of concatMap, but concatMap was defined in GHC.List (and is now possibly defined in Data.Foldable). Neither GHC.List nor Data.Foldable is imported into GHC.Base, so defining the Monad instance for lists in GHC.Base in terms of concatMap is impossible:
instance Monad [] where
(>>=) = flip concatMap -- concatMap isn't imported
Defining these instances in terms of list comprehension gets around needing to import the module containing concatMap to reuse it defining >>=.
In GHC there are two implementations of list comprehensions. One rewrites them in terms of the GHC.Base build and foldr similar to the Data.Foldable concatMap. The other implementation generates recursive functions in place of concatMap as described by Wadler.
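Applying the Report's translation by hand to the comprehension from the question gives nested concatMap calls and no >>=. This is a sketch of the Report's scheme, not of what GHC actually generates; the two function names are made up.

```haskell
-- The comprehension from the question.
viaComprehension :: [a -> b] -> [a] -> [b]
viaComprehension fs xs = [f x | f <- fs, x <- xs]

-- The same thing after translating each generator to concatMap.
viaConcatMap :: [a -> b] -> [a] -> [b]
viaConcatMap fs xs = concatMap (\f -> concatMap (\x -> [f x]) xs) fs
```

Both return [11,21,20,40] for the arguments [(+1), (*2)] and [10, 20].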
There are lots of cases where the Applicative instance is satisfied by monadic functions, I've seen
instance Applicative MyMonadThatIsAlreadyDefined where
pure = return
(<*>) = ap
Also, <*> can't be written using only fmap, at least not in general. That's the point of <*>. Try writing <*> in terms of just fmap, I'll be very surprised if you manage it (in such a way that is well behaved and follows the applicative laws). Remember that the chain is
Functor > Applicative > Monad
Where > can be thought of as superset. This is saying that the set of all functors contains the set of all applicatives, which contains the set of all monads. If you have a monad, then you have all the tools needed to use it as an applicative and as a functor. There are types that are functorial but not applicative, and types that are applicative but not monadic. I see no problem in defining the applicative instance in this manner.
Am I mistaken, or is this sugared monadic code based on >>= ?
I don't know if >>= is actually used to de-sugar list comprehensions (but see Cirdec's answer for evidence that it's not), but it's actually completely legal to define <*> in terms of >>=. In mathematical terms, every Monad instance induces a unique corresponding Applicative instance, in the sense that
instance Applicative F where
pure = return
af <*> ax = af >>= \ f -> ax >>= \ x -> return (f x)
is a law-abiding Applicative instance whenever F has a law-abiding Monad instance.
There's an analogy here from mathematics, if you're familiar with it:
Every inner product induces a norm;
Every norm induces a metric;
Every metric induces a topology.
Similarly, for every monad structure there is a compatible applicative structure, and for every applicative structure there is a compatible fmap (fmap f ax = pure f <*> ax), but the reverse implications do not hold.
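Both directions that do hold can be written as ordinary functions and checked against the library instances. The names apViaBind and fmapViaAp are invented for this sketch; the first has the same type as Control.Monad's ap.

```haskell
-- <*> recovered from >>= and return.
apViaBind :: Monad m => m (a -> b) -> m a -> m b
apViaBind af ax = af >>= \f -> ax >>= \x -> return (f x)

-- fmap recovered from pure and <*>.
fmapViaAp :: Applicative f => (a -> b) -> f a -> f b
fmapViaAp f ax = pure f <*> ax
```

For lists and Maybe these agree with the built-in <*> and fmap, for example apViaBind [(+1), (*2)] [10, 20] is [11,21,20,40].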
As far as I understand, it should be possible to implement <*> only using fmap, as it is the case with applicative Maybe.
I don't understand what you mean here. fmap is certainly not enough to define <*>, or every Functor would be an Applicative (well, an Apply).
This answer is complementary to the ones already given, and focusses only on a part of the question:
As far as I understand, it should be possible to implement <*> for lists only using fmap, as it is the case with applicative Maybe. How?
You seem to refer to this implementation:
instance Applicative Maybe where
pure = Just
Nothing <*> _ = Nothing
(Just f) <*> something = fmap f something
Well, yes, we can do that for lists as well - using nothing but fmap, pattern matching and constructors:
instance Applicative [] where
pure x = [x]
[] <*> _ = []
(f:fs) <*> something = fmap f something ++ (fs <*> something)
where
[] ++ ys = ys
(x:xs) ++ ys = x : (xs++ys)
Admittedly, this does require some kind of concatenation, as lists are a more complex type than Maybes. There are other possible applicative instances for lists that would require less code than this "everything-with-everything" behaviour, but those are not compatible with the default monad instance (which is the common expectation).
Of course, monad notation does simplify this dramatically:
instance Monad m => Applicative m where
pure = return
mf <*> something = mf >>= (\f -> fmap f something) -- shorter: (`fmap` something)
…which works for both Maybe and [] as m.
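Since base already provides Applicative [], the fmap-based definition can only be tried out as a standalone function. A sketch, with apList and zipAp as made-up names; ZipList from Control.Applicative is one of the other list applicatives mentioned above.

```haskell
import Control.Applicative (ZipList (..))

-- "Everything with everything", using only fmap, (++) and pattern matching.
apList :: [a -> b] -> [a] -> [b]
apList [] _ = []
apList (f : fs) xs = fmap f xs ++ apList fs xs

-- The zipping applicative pairs functions and arguments position by position.
zipAp :: [a -> b] -> [a] -> [b]
zipAp fs xs = getZipList (ZipList fs <*> ZipList xs)
```

With [(+1), (*2)] and [10, 20], apList gives [11,21,20,40] (the same as the built-in <*>), while zipAp gives [11,40].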

What is [] (list constructor) in Haskell?

I'm having problems understanding functors, specifically what a concrete type is in LYAH. I believe this is because I don't understand what [] really is.
fmap :: (a -> b) -> f a -> f b
Is [], a type-constructor? Or, is it a value constructor?
What does it mean to have the type of: [] :: [a]?
Is it like Maybe type-constructor, or Just value constructor?
If it is like Just then how come Just has a signature like Just :: a -> Maybe a rather than Just :: Maybe a, in other words why isn't [] typed [] :: a -> [a]
LYAH says this as it applies to functors: Notice how we didn't write instance Functor [a] where, because from fmap :: (a -> b) -> f a -> f b, we see that the f has to be a type constructor that takes one type. [a] is already a concrete type (of a list with any type inside it), while [] is a type constructor that takes one type and can produce types such as [Int], [String] or even [[String]]. I'm confused, though: the type of [] implies it is like a literal for [a]. What is LYAH trying to get at?
The type is described (in a GHCI session) as:
$ ghci
Prelude> :info []
data [] a = [] | a : [a] -- Defined
We may also think about it as though it were defined as:
data List a = Nil
| Cons a (List a)
or
data List a = EmptyList
| ListElement a (List a)
Type Constructor
[a] is a polymorphic data type, which may also be written [] a as above. This may be thought about as though it were List a
In this case, [] is a type constructor taking one type argument a and returning the type [] a, which is also permitted to be written as [a].
One may write the type of a function like:
sum :: (Num a) => [a] -> a
Data Constructor
[] is a data constructor which essentially means "empty list." This data constructor takes no value arguments.
There is another data constructor, :, which prepends an element to the front of another list. The signature for this data constructor is a : [a] - it takes an element and another list of elements and returns a resultant list of elements.
The [] notation may also be used as shorthand for constructing a list. Normally we would construct a list as:
myNums = 3 : 2 : 4 : 7 : 12 : 8 : []
which is interpreted as
myNums = 3 : (2 : (4 : (7 : (12 : (8 : [])))))
but Haskell permits us also to use the shorthand
myNums = [ 3, 2, 4, 7, 12, 8 ]
as an equivalent in meaning, but slightly nicer in appearance, notation.
Ambiguous Case
There is an ambiguous case that is commonly seen: [a]. Depending on the context, this notation can mean either "a list of a's" or "a list with exactly one element, namely a." The first meaning is the intended meaning when [a] appears within a type, while the second meaning is the intended meaning when [a] appears within a value.
It's (confusingly, I'll grant you) syntactically overloaded to be both a type constructor and a value constructor.
It means that (the value constructor) [] has the type that, for all types a, it is a list of a (which is written [a]). This is because there is an empty list at every type.
The value constructor [] isn't typed a -> [a] because the empty list has no elements, and therefore it doesn't need an a to make an empty list of a's. Compare to Nothing :: Maybe a instead.
LYAH is talking about the type constructor [] with kind * -> *, as opposed to the value constructor [] with type [a].
it is a type constructor (e.g. [Int] is a type), and a data constructor ([2] is a list structure).
The empty list is a list holding any type
[a] is like Maybe a, [2] is like Just 2.
[] is a zero-ary function (a constant) so it doesn't have function type.
Just to make things more explicit, this data type:
data List a = Cons a (List a)
| Nil
...has the same structure as the built-in list type, but without the (nicer, but potentially confusing) special syntax. Here's what some correspondences look like:
List = [], type constructors with kind * -> *
List a = [a], types with kind *
Nil = [], values with polymorphic types List a and [a] respectively
Cons = :, data constructors with types a -> List a -> List a and a -> [a] -> [a] respectively
Cons 5 Nil = [5] or 5:[], single element lists
f Nil = ... = f [] = ..., pattern matching empty lists
f (Cons x Nil) = ... = f [x] = ..., pattern matching single-element lists
f (Cons x xs) = ... = f (x:xs) = ..., pattern matching non-empty lists
In fact, if you ask ghci about [], it tells you pretty much the same definition:
> :i []
data [] a = [] | a : [a] -- Defined in GHC.Types
But you can't write such a definition yourself because the list syntax and its "outfix" type constructor is a special case, defined in the language spec.
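The hand-written List type is enough to see why LYAH writes instance Functor [] rather than instance Functor [a]. A small sketch; toBuiltin is a made-up helper for converting back to the built-in list.

```haskell
data List a = Nil | Cons a (List a)
  deriving (Eq, Show)

-- The instance head names the type constructor List (kind * -> *),
-- not the concrete type List a, just like "instance Functor []".
instance Functor List where
  fmap _ Nil = Nil
  fmap f (Cons x xs) = Cons (f x) (fmap f xs)

-- Nil and Cons are the data constructors; they appear in patterns and values.
toBuiltin :: List a -> [a]
toBuiltin Nil = []
toBuiltin (Cons x xs) = x : toBuiltin xs
```

Here List on the instance line corresponds to the type constructor [], while Nil in the function bodies corresponds to the value constructor [].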