Parse error in list monad's `do` notation

OK, so here's a weird one.
This works perfectly:
test = do
  x <- [1..5]
  y <- [1..5]
  [x+y, x-y]
But this:
test = do
  x <- [1..5]
  y <- [1..5]
  [
    x+y,
    x-y
  ]
fails miserably. GHC utterly refuses to parse this. No matter how I fidget with it, I can't seem to convince GHC to allow me to spread the list across multiple lines. Which is a problem, because if you replace x+y and x-y with really big expressions, it quickly becomes hard to read...
Does anybody know why this isn't working, and how I can force it to work? (Or at least do something that looks legible?)

After do, each line which starts on the same column as the first word after do starts a new entry. With explicit braces, your code is equivalent to
test = do
  { x <- [1..5]
  ; y <- [1..5]
  ; [
      x+y,
      x-y
  ; ]
  }
This is due to the indentation rules. As we can see, the last semicolon should not be there -- to avoid it, we should indent the last line more.
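To see the rule from the other side, here is a minimal sketch (the name testBraces and the smaller ranges are illustrative, not from the question) that writes the braces and semicolons by hand. With explicit braces the layout rule is switched off inside the block, so the bracket lines may start in any column:

```haskell
-- Explicit braces and semicolons: the layout rule does not apply inside
-- the block, so the list can be spread over several lines at any indentation.
testBraces :: [Int]
testBraces = do
  { x <- [1..2]
  ; y <- [1..2]
  ; [
  x+y,
  x-y
  ]
  }
```

Here testBraces evaluates to [2,0,3,-1,3,1,4,0]. Most code relies on indentation instead, but the explicit form makes it visible where the semicolons go.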

If I parse this, I get the following error:
File.hs:10:3: error:
parse error (possibly incorrect indentation or mismatched brackets)
I think the parser simply sees the closing square bracket ] as a new statement. It then complains that the previous statement has no closing bracket (and that the new one has a closing bracket without an opening bracket). If you push it one column to the right, it parses correctly (at least with GHC-8.0.2):
test = do
  x <- [1..5]
  y <- [1..5]
  [
    x+y,
    x-y
   ] -- one space to the right
As long as you do not go back to the previous indentation level (here one space to the left), it should be fine, since the compiler will then see it as one do-statement.

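Here are a couple of possible ways to write this legally (each one needs import Control.Monad (replicateM)):
-- 1. leading comma, closing bracket indented past the statement column
test = do
  [x,y] <- replicateM 2 [1..5]
  [
    x+y
    ,x-y
   ]
-- 2. opening bracket on the first element's line, leading comma
test = do
  [x,y] <- replicateM 2 [1..5]
  [ x+y
   ,x-y ]
-- 3. same, with a trailing comma
test = do
  [x,y] <- replicateM 2 [1..5]
  [ x+y ,
    x-y ]
-- 4. list passed to id, elements aligned under the bracket
test = do
  [x,y] <- replicateM 2 [1..5]
  id [ x+y
     , x-y ]
-- 5. list passed to id, one element per line
test = do
  [x,y] <- replicateM 2 [1..5]
  id [
    x+y,
    x-y
   ]
-- 6. list wrapped in parentheses
test = do
  [x,y] <- replicateM 2 [1..5]
  ([
    x+y,
    x-y
   ])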
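The variants above replace the question's two generators with a single replicateM 2 [1..5]. As a rough check that this does not change the result, the following sketch (the names twoGenerators and viaReplicateM are illustrative only) defines both forms so they can be compared:

```haskell
import Control.Monad (replicateM)

-- The question's original form, with two separate generators.
twoGenerators :: [Int]
twoGenerators = do
  x <- [1..5]
  y <- [1..5]
  [x+y, x-y]

-- The answer's form: replicateM 2 draws both values in one step.
-- The continuation lines of the list are indented past the statement column.
viaReplicateM :: [Int]
viaReplicateM = do
  [x,y] <- replicateM 2 [1..5]
  [ x+y
    , x-y ]
```

In GHCi, twoGenerators == viaReplicateM should evaluate to True, since replicateM 2 enumerates the pairs in the same order as the nested generators.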

Related

Haskell >> operator with two lists

For a college assignment I am learning Haskell and when reading about the do-notation and sequencing with >>= and >> I came across this behaviour that I did not expect.
[1,2,3] >> [1] -- returns [1,1,1]
Can anyone explain why every element of the first array is replaced by the elements of the second array? It seems like the list is concatenated in some way, while I expected the result of the first expression to be completely ignored, thus I would expect [1] as the result.
Thanks a lot in advance.
The “result” is in this case the values contained in [1,2,3], which are indeed ignored. What >> does not ignore is the context, which for the list monad is the shape (i.e. length) of the list. This can't be ignored, because we must have x >>= pure ≡ x, i.e.
Prelude> [1,2,3] >>= pure
[1,2,3]
Prelude> [1,2,3] >>= \n -> [n]
[1,2,3]
Prelude> [1,2,3] >>= \n -> [1]
[1,1,1]
Prelude> [1,2,3] >>= \_ -> [1]
[1,1,1]
Prelude> [1,2,3] >> [1]
[1,1,1]
An example with length>1 on the RHS:
Prelude> [1,2,3] >>= \n -> [n, n+10]
[1,11,2,12,3,13]
Prelude> [1,2,3] >>= \n -> [100, 10]
[100,10,100,10,100,10]
Prelude> [1,2,3] >> [100, 10]
[100,10,100,10,100,10]
There are several equivalent ways of writing [1,2,3] >> [1]:
do [1,2,3]
   return 1
[ x | _ <- [1,2,3], x <- [1] ]
[ 1 | _ <- [1,2,3] ]
[1,2,3] >>= \_ -> [1]
concatMap (const [1]) [1,2,3]
concat (map (const [1]) [1,2,3])
concat ([1,2,3] $> [1])
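The equivalences listed above can be checked mechanically. This is a small sketch, with xs and equivalents as illustrative names, that collects each spelling in one list so they can be compared; note that $> has to be imported from Data.Functor:

```haskell
import Data.Functor (($>))

xs :: [Int]
xs = [1,2,3]

-- Every entry is a different way of writing xs >> [1].
equivalents :: [[Int]]
equivalents =
  [ xs >> [1]
  , do { _ <- xs; return 1 }
  , [ x | _ <- xs, x <- [1] ]
  , [ 1 | _ <- xs ]
  , xs >>= \_ -> [1]
  , concatMap (const [1]) xs
  , concat (map (const [1]) xs)
  , concat (xs $> [1])
  ]
```

Evaluating all (== [1,1,1]) equivalents in GHCi should give True.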
It replaces every element of [1..3] with [1] and then collapses it:
concatMap (\_ -> [1]) [1,2,3]
= concat (map (\_ -> [1]) [1,2,3])
= concat [[1],[1],[1]]
= [1,1,1]
It completely ignores the elements of [1,2,3], just using the shape (length). Look what happens if we replace them with undefined:
> do [undefined, undefined, undefined]; return 1
[1,1,1]
The results of the first computation are indeed ignored.
Monads can be seen as generalized nested loops. What you have can be written in pseudocode as
for y in [1,2,3]:
    for x in [1]:    -- for x in ((\_ -> [1]) y):
        yield x
Both y and x are in scope at the innermost level. y's value is indeed ignored.
Still, for each y in [1,2,3], each x in [1] is produced, thus defining the overall computation. For lists this means the results produced one by one by the combined computation as a whole are the results produced one by one at the innermost level. Sounds trivial, doesn't it?
How exactly this is implemented is an implementation detail. Seeing the lists as data, this means splicing the results of the inner computations in place, flattening the list of lists into a one-level list, appending the inner lists together. concatMap is widely known as flatMap in other languages:
[  1,    2,    3  ]
  [1]   [1]   [1]
---------------------
[  1,    1,    1  ]
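The point that only the shape of the left-hand list matters can be sketched directly. The helper below, thenByShape, is a hypothetical name and not a library function; it rebuilds the behaviour of >> on finite lists using nothing but the length of its first argument:

```haskell
-- Hypothetical helper: behaves like (>>) on finite lists, but only ever
-- looks at the length of xs, never at its elements.
thenByShape :: [a] -> [b] -> [b]
thenByShape xs ys = concat (replicate (length xs) ys)
```

For example, thenByShape [1,2,3] [100,10] gives [100,10,100,10,100,10], the same as [1,2,3] >> [100,10], and it is just as indifferent to undefined elements on the left.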
Related answers: Why >> duplicates right-hand side operand, Map and flatMap, How does Monad on list work?, Why list monad combines in that order?, mapM with const functions in Haskell, under the hood reason we can use nested loops in list comprehensions, Generalizing prime pairs in SICP.

Why can't GHC reason about some infinite lists?

This recent question got me thinking about Haskell's ability to work with infinite lists. There are plenty of other questions and answers about infinite lists on StackOverflow, and I understand why we can't have a general solution for all infinite lists, but why can't Haskell reason about some infinite lists?
Let's use the example from the first linked question:
list1 = [1..]
list2 = [x | x <- list1, x <= 4]
print list2
$ [1,2,3,4
@user2297560 writes in the comments:
Pretend you're GHCI. Your user gives you an infinite list and asks you to find all the values in that list that are less than or equal to 4. How would you go about doing it? (Keep in mind that you don't know that the list is in order.)
In this case, the user didn't give you an infinite list. GHC generated it! In fact, it generated it following its own rules. The Haskell 2010 Standard states the following:
enumFrom :: a -> [a] -- [n..]
For the types Int and Integer, the enumeration functions have the following meaning:
The sequence enumFrom e1 is the list [e1,e1 + 1,e1 + 2,…].
In his answer to the other question, @chepner writes:
You know that the list is monotonically increasing, but Haskell does not.
The statements these users made don't seem to line up with the standard to me. Haskell created the list in an ordered fashion using a monotonic increase. Haskell should know that the list is both ordered and monotonic. So why can't it reason about this infinite list to turn [x | x <- list1, x <= 4] into takeWhile (<= 4) list1 automatically?
Theoretically, one could imagine a rewrite rule such as
{-# RULES
  "filterEnumFrom" forall (n :: Int) (m :: Int).
    filter (< n) (enumFrom m) = [m..(n-1)]
  #-}
And that would automatically convert expressions such as filter (< 4) (enumFrom 1) to [1..3]. So it is possible. There is a glaring problem though: any variation from this exact syntactic pattern won't work. The result is that you end up defining a bunch of rules and you can no longer be sure whether they are triggering or not. If you can't rely on the rules, you eventually just don't use them. (Also, note I've specialized the rule to Ints - as was briefly posted as a comment, this may break down in subtle ways for other types.)
At the end of the day, to perform more advanced analysis, GHC would have to have some tracking information attached to lists to say how they were generated. That would either make lists less lightweight of an abstraction or mean that GHC would have some special machinery in it just for optimizing lists at compile time. Neither of these options is nice.
That said, you can always add your own tracking information by making a list type on top of lists.
data List a where
  EnumFromTo   :: Enum a => a -> Maybe a -> List a
  Filter       :: (a -> Bool) -> List a -> List a
  Unstructured :: [a] -> List a
This may end up being easier to optimize.
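As a rough illustration of how such a type could be used, here is a sketch that needs the GADTs extension. The functions toList and boundAbove are hypothetical additions, not part of the answer, and boundAbove is deliberately restricted to Int, where enumeration order and <= agree:

```haskell
{-# LANGUAGE GADTs #-}

data List a where
  EnumFromTo   :: Enum a => a -> Maybe a -> List a
  Filter       :: (a -> Bool) -> List a -> List a
  Unstructured :: [a] -> List a

-- Interpret the tracked description as an ordinary list.
toList :: List a -> [a]
toList (EnumFromTo lo Nothing)   = enumFrom lo
toList (EnumFromTo lo (Just hi)) = enumFromTo lo hi
toList (Filter p l)              = filter p (toList l)
toList (Unstructured xs)         = xs

-- Hypothetical optimisation: keep only elements <= n. When the source is a
-- known enumeration, tighten its upper bound instead of filtering forever.
boundAbove :: Int -> List Int -> List Int
boundAbove n (EnumFromTo lo Nothing)   = EnumFromTo lo (Just n)
boundAbove n (EnumFromTo lo (Just hi)) = EnumFromTo lo (Just (min n hi))
boundAbove n (Filter p l)              = Filter p (boundAbove n l)
boundAbove n (Unstructured xs)         = Filter (<= n) (Unstructured xs)
```

With this, toList (boundAbove 4 (EnumFromTo 1 Nothing)) terminates with [1,2,3,4], because the structure records that the list came from an enumeration. An Unstructured list still falls back to a plain filter.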
So why can't it reason about this infinite list to turn [x | x <- list1, x <= 4] into takeWhile (<= 4) list1 automatically?
The answer isn't any more specific than "It doesn't use takeWhile because it doesn't use takeWhile". The spec says:
Translation: List comprehensions satisfy these identities, which may be used as a translation into the kernel:
[ e | True ]         = [ e ]
[ e | q ]            = [ e | q, True ]
[ e | b, Q ]         = if b then [ e | Q ] else []
[ e | p <- l, Q ]    = let ok p = [ e | Q ]
                           ok _ = []
                       in concatMap ok l
[ e | let decls, Q ] = let decls in [ e | Q ]
That is, the meaning of a list comprehension is given by translation into a simpler language with if-expressions, let-bindings, and calls to concatMap. We can figure out the meaning of your example by translating it through the following steps:
[x | x <- [1..], x <= 4]
-- apply rule 4 --
let ok x = [ x | x <= 4 ]
    ok _ = []
in concatMap ok [1..]
-- eliminate unreachable clause in ok --
let ok x = [ x | x <= 4 ]
in concatMap ok [1..]
-- apply rule 2 --
let ok x = [ x | x <= 4, True ]
in concatMap ok [1..]
-- apply rule 3 --
let ok x = if x <= 4 then [ x | True ] else []
in concatMap ok [1..]
-- apply rule 1 --
let ok x = if x <= 4 then [ x ] else []
in concatMap ok [1..]
-- inline ok --
concatMap (\x -> if x <= 4 then [ x ] else []) [1..]
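The end result of the translation can be put next to the original comprehension and an explicit takeWhile. This is only a sketch with illustrative names (desugared, comprehension, explicit); the first two are productive but never finish, so they should only be inspected through something like take:

```haskell
-- The fully translated form from the derivation above.
desugared :: [Integer]
desugared = concatMap (\x -> if x <= 4 then [x] else []) [1..]

-- The original comprehension; it keeps scanning [1..] after the last match.
comprehension :: [Integer]
comprehension = [x | x <- [1..], x <= 4]

-- What the asker wanted: stop at the first element that fails the test.
explicit :: [Integer]
explicit = takeWhile (<= 4) [1..]
```

In GHCi, take 4 desugared and take 4 comprehension both give [1,2,3,4], while asking for a fifth element never returns. explicit is simply [1,2,3,4], but writing it requires the programmer to state that the list is ordered by choosing takeWhile.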

Is there a way to generate a series of list comprehensions programmatically in Haskell?

In my ongoing attempt to get better at Haskell, I'm attempting to solve a problem where I'd like to create a series of list comprehensions of this form:
m2 = [[x1,x2] | x1 <- [2..110], x2 <- [x1..111]]
m3 = [[x1,x2,x3] | x1 <- [2..22], x2 <- [x1..22], x3 <- [x2..24]]
m4 = [[x1,x2,x3,x4] | x1 <- [2..10], x2 <- [x1..10], x3 <- [x2..10], x4 <- [x3..12]]
...
Where x1 <= x2 ... <= xn, the number following m is the length of the sublists, and the first n - 1 terms are bounded by the same upper bound, while the nth term is bounded by some larger number.
I could certainly write all of it out by hand, but that's not particularly good practice. I'm wondering if there's a way to generate these lists up to a particular maximum m value. My immediate thought was Template Haskell, but I don't know enough about it to determine whether it's usable. Is there some other solution that's escaping me?
In pseudo-Haskell, what I'm looking for is some method that does something like:
mOfN n bound term = [ [x1..xn] | x1 <- [2..bound], x2 <- [x1..bound], ..., xn <- [x(n-1)..term] ]
The main issue is that I can't figure out how I would dynamically create x1,x2, etc.
Is this what you are looking for?
import Data.List (tails)
mofn 0 xs = [ [] ]
mofn m xs = [ y:zs | (y:ys) <- tails xs, zs <- mofn (m-1) ys ]
i.e. mofn 3 [1..5] is:
[[1,2,3],[1,2,4],[1,2,5],[1,3,4],[1,3,5],[1,4,5],[2,3,4],[2,3,5],[2,4,5],[3,4,5]]
The key is the tails function which returns successive tails of a list.
Update
Is this what you are looking for?
mofn' 1 lo hi bnd = [ [x] | x <- [lo..bnd] ]
mofn' k lo hi bnd = [ x:ys | x <- [lo..hi], ys <- mofn' (k-1) x hi bnd ]
mofn' 3 1 3 5 is:
[[1,1,1], [1,1,2], [1,1,3], [1,1,4], [1,1,5],
[1,2,2], [1,2,3], [1,2,4], [1,2,5],
[1,3,3], [1,3,4], [1,3,5],
[2,2,2], [2,2,3], [2,2,4], [2,2,5],
[2,3,3], [2,3,4], [2,3,5],
[3,3,3], [3,3,4], [3,3,5]
]
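To connect this back to the lists in the question, here is a sketch that restates mofn' with type signatures and wraps it in the mOfN shape the asker described. The k <= 0 guard and the fixed lower bound of 2 in mOfN are assumptions added for this sketch; m3 and m4 are copied from the question for comparison:

```haskell
-- The answer's mofn', with an extra guard so k <= 0 does not recurse forever.
mofn' :: Int -> Int -> Int -> Int -> [[Int]]
mofn' k lo hi bnd
  | k <= 0    = []
  | k == 1    = [ [x] | x <- [lo..bnd] ]
  | otherwise = [ x:ys | x <- [lo..hi], ys <- mofn' (k-1) x hi bnd ]

-- Wrapper in the shape of the question's pseudo-Haskell: the first n-1
-- terms are bounded by bound, the last by term, and everything starts at 2.
mOfN :: Int -> Int -> Int -> [[Int]]
mOfN n bound term = mofn' n 2 bound term

-- Hand-written comprehensions from the question.
m3, m4 :: [[Int]]
m3 = [[x1,x2,x3] | x1 <- [2..22], x2 <- [x1..22], x3 <- [x2..24]]
m4 = [[x1,x2,x3,x4] | x1 <- [2..10], x2 <- [x1..10], x3 <- [x2..10], x4 <- [x3..12]]
```

Under these assumptions, mOfN 3 22 24 == m3 and mOfN 4 10 12 == m4 should both be True.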

How do I read names in Haskell?

I just encountered this example on learnyouahaskell.com. However, I don't understand it at all.
ghci> let xxs = [[1,3,5,2,3,1,2,4,5],[1,2,3,4,5,6,7,8,9],[1,2,4,2,1,6,3,1,3,2,3,6]]
ghci> [ [ x | x <- xs, even x ] | xs <- xxs]
[[2,2,4],[2,4,6,8],[2,4,2,6,2,6]] -- This is the output
My problem is that while I do understand the idea of list comprehensions, I don't get what the xxs means.
If it was just the name of the list of lists, how can we split up a name and do something like xs <- xxs. To me that doesn't make sense at all.
Can someone help?
xxs is the list of lists bound in the let expression. I think you're being confused by the similarity of xxs and xs; they are just two independent names with no relation. You can replace xs with sublist or any other valid name.
ghci> let xxs = [[1,3,5,2,3,1,2,4,5],[1,2,3,4,5,6,7,8,9],[1,2,4,2,1,6,3,1,3,2,3,6]]
ghci> [ [ x | x <- sublist, even x ] | sublist <- xxs]
So we're not splitting up the name; we're just using the <- generator of the list comprehension to iterate over the elements of a [[a]], and then using another comprehension to iterate over the elements of each [a] sublist.
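To make it concrete that the names carry no special meaning, this sketch renames everything (row, n, evensNested and evensMapped are arbitrary names chosen here) and also writes the same computation with map and filter:

```haskell
xxs :: [[Int]]
xxs = [[1,3,5,2,3,1,2,4,5],[1,2,3,4,5,6,7,8,9],[1,2,4,2,1,6,3,1,3,2,3,6]]

-- Same comprehension as in the question, with different variable names.
evensNested :: [[Int]]
evensNested = [ [ n | n <- row, even n ] | row <- xxs ]

-- The outer comprehension is a map, the inner one is a filter.
evensMapped :: [[Int]]
evensMapped = map (filter even) xxs
```

Both definitions evaluate to [[2,2,4],[2,4,6,8],[2,4,2,6,2,6]], the output shown in the question.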

Creating a new list that adds and sums elements from an old list

I have a list xxs and I need to create a new one that adds and sums elements from the old list.
Let me draw it to demonstrate:
So, I have the list:
xxs = [("a","b", [(1,"a","b"),(2,"a","b")]), ("c","d",[(3,"a","b"),(4,"a","b")])]
My best approach so far is:
infoBasicas = [ (x,y,aux) | (x,y,_) <- xxs]
  where aux = sum [ z | (_,_,ys) <- xxs, (z,_,_) <- ys]
Output:
[("a","b",10),("c","d",10)]
Although I’m not far away… I'm not quite there yet and would really appreciate some suggestions.
The problem with your solution is that aux is the same for each element of xxs. When you write (x,y,_) <- xxs, you are throwing away the list with the numbers you want to sum. Instead, keep that list, working one element at a time, so:
infoBasicas = [(x, y, doSum innerList) | (x, y, innerList) <- xxs]
To find the sum of the innerLists, you only want the numbers, so you can throw the rest of each triple away. After that is done, you are left with a list of numbers, which can just be summed with the standard sum function:
doSum list = sum (fst3 list) -- There is one small error here. Can you see what it is?
fst3 (a, _, _) = a
Note that we are using fst3 here, instead of fst, as these are triples, not pairs.
You were really close!
As gereeter said, your main problem is that you're using the same value of aux for everything. If you change aux into a function taking the list of (Int,String,String) tuples then it should work for you.
infoBasicas = [ (x,y,aux z) | (x,y,z) <- xxs ]
  where aux xs = sum [ z | (z,_,_) <- xs ]
(I haven't really added anything to gereeter's answer except to change the form of the sample code to more closely resemble yours.)
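Putting the pieces together, here is a self-contained sketch of that last version applied to the sample data from the question. The type synonyms Inner and Entry are names introduced here for readability only:

```haskell
type Inner = (Int, String, String)
type Entry = (String, String, [Inner])

xxs :: [Entry]
xxs = [ ("a","b",[(1,"a","b"),(2,"a","b")])
      , ("c","d",[(3,"a","b"),(4,"a","b")]) ]

-- aux is now a function of the inner list, so each entry gets its own sum.
infoBasicas :: [(String, String, Int)]
infoBasicas = [ (x, y, aux zs) | (x, y, zs) <- xxs ]
  where
    aux triples = sum [ n | (n, _, _) <- triples ]
```

With this data, infoBasicas evaluates to [("a","b",3),("c","d",7)] rather than the 10 in both positions that the original attempt produced.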