Why can a right fold process an infinite list in Haskell?

An infinite list has no "last" element, so how can foldr work with one?
I have this code snippet from the Haskell book:
(&&)::Bool->Bool->Bool
True && x = x
False && _ = False
and' :: [Bool]->Bool
and' xs=foldr (Main.&&) True xs
Then I load this .hs file in GHCi and run:
*Main> and' (repeat False)
False
It works as expected, but I don't understand:
1. The definition of (&&) takes a Bool on its left, but in True && x = x the variable x is on the right-hand side. Isn't that strange?
2. Why does foldr stop when (&&) returns False? In my understanding foldr will loop through the list from tail till head. Is there any internal "break" mechanism?
3. foldr starts from the last element of a list, but an infinite list has no end. How does foldr begin to work?

You need to familiarize yourself with the syntax a bit.
In Haskell, a function definition goes like this:
this = that
and, basically, it means: When you see this it means that. In fact
True && x
is the same as x, and
False && x
is False, that is why we can write the definitions as in your example.
For your second question: it is not the case that foldr "stops". It is the && operator that does not evaluate its right operand when the left one is False.
For your 3rd question: no, foldr does not start from the last element of a list. Please have a look at the definition of foldr:
foldr f z [] = z
foldr f z (x:xs) = x `f` foldr f z xs
Now suppose
foldr (&&) True (False:xs)
it evaluates to
False && foldr (&&) True xs
And since
False && _
is False, the result is False.
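To try this out, here is a minimal self-contained sketch that redefines (&&) and foldr locally (hiding the Prelude versions) and folds over an infinite list. The name stopsEarly and the fixity declaration are additions for illustration, not part of the original snippet.

```haskell
import Prelude hiding (foldr, (&&))

-- Same fixity as the Prelude operator (an addition; the snippet above omits it).
infixr 3 &&

(&&) :: Bool -> Bool -> Bool
True  && x = x
False && _ = False        -- the right operand is never inspected

foldr :: (a -> r -> r) -> r -> [a] -> r
foldr _ z []     = z
foldr f z (x:xs) = x `f` foldr f z xs

and' :: [Bool] -> Bool
and' = foldr (&&) True

-- The list is infinite, but the first False makes (&&) ignore the
-- recursive call, so the rest of the list is never visited.
stopsEarly :: Bool
stopsEarly = and' (True : True : False : repeat True)
```

Note that and' (repeat True) would still run forever: (&&) keeps asking for its right operand, and no False ever arrives.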

Let's repeat your definitions and add a couple more:
(&&)::Bool->Bool->Bool
True && x = x
False && _ = False
and :: [Bool]->Bool
and xs = foldr (&&) True xs
repeat :: a -> [a]
repeat a = a : repeat a
foldr :: (a -> r -> r) -> r -> [a] -> r
foldr _ z [] = z
foldr f z (a:as) = f a (foldr f z as)
Now, we can prove this by evaluating it manually, taking care to do it "lazily" (outermost applications first, and evaluating only enough to resolve the outermost data constructors):
and (repeat False)
= foldr (&&) True (repeat False) -- definition of `and`
= foldr (&&) True (False : repeat False) -- definition of `repeat`
= False && foldr (&&) True (repeat False) -- `foldr`, second equation
= False -- `&&`, second equation
The key is that the second equation of the definition of && discards its second argument:
False && _ = False
This means, in runtime terms, that we never force foldr's recursive call at a step where we encounter False.
Another way to look at it is to contemplate foldr's second equation, and what it means when we have lazy evaluation:
foldr f z (a:as) = f a (foldr f z as)
Since the recursive call to foldr happens as an argument to f, this means that, at runtime, the function f decides whether the value of its second argument is necessary or not, and so chooses at each step of the fold whether to recurse down the list or not. And this "decision" process proceeds from left to right.
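As a rough illustration of f "deciding" whether to recurse, here is a sketch of takeWhile written with foldr. The names takeWhileR, step and firstFew are made up for this example.

```haskell
-- takeWhile as a right fold. `rest` stands for the recursive call
-- (foldr step [] as); it is only evaluated if `step` actually uses it.
takeWhileR :: (a -> Bool) -> [a] -> [a]
takeWhileR p = foldr step []
  where
    step a rest
      | p a       = a : rest   -- keeps going: the rest of the fold is needed
      | otherwise = []         -- drops `rest`, so the fold ends here

-- Works on an infinite list because the fold stops at the first
-- element that fails the predicate.
firstFew :: [Integer]
firstFew = takeWhileR (< 5) [1 ..]
```

Evaluating firstFew should give [1,2,3,4], since the step function discards the recursive call as soon as it sees 5.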
In my understanding foldr will loop through the list from tail till head. Is there any internal "break" mechanism?
Strictly speaking, in a pure functional language there is no intrinsic notion of evaluation order. Expressions may be evaluated in any order that's consistent with their data dependencies.
What you've said here is a common misunderstanding that people who've learned foldr from impure, eager languages carry over to Haskell. In an eager language that's a useful rule of thumb, but in Haskell, with purity and lazy evaluation, that rule will only confuse you. Often the opposite rule of thumb is useful when programming Haskell: foldr will visit the list elements from left to right, and at each step its f argument function gets to decide whether the rest of the list is necessary.
The extreme example of this is to implement a function to get the head of a list using foldr:
-- | Return `Just` the first element of the list, or `Nothing` if the
-- list is empty.
safeHead :: [a] -> Maybe a
safeHead = foldr (\a _ -> Just a) Nothing
So for example:
safeHead [1..]
= foldr (\a _ -> Just a) Nothing [1..]
= foldr (\a _ -> Just a) Nothing (1:[2..])
= (\a _ -> Just a) 1 (foldr (\a _ -> Just a) Nothing [2..])
= Just 1

Related

How to make a tail recursive function

I am really confused on how to make a function "Tail recursive".
Here is my function, but I don't know whether it is already tail recursive or not.
I am trying to merge two lists in Haskell.
merge2 :: Ord a =>[a]->[a]->[a]
merge2 xs [] = xs
merge2 [] ys = ys
merge2 (x:xs)(y:ys) = if y < x then y: merge2 (x:xs) ys else x :merge2 xs (y:ys)
Your function isn't tail-recursive; it's guarded recursive. However, guarded recursion is what you should be using in Haskell if you want to be memory efficient.
For a call to be a tail call, its result must be the result of the entire function. This definition applies to both recursive and non-recursive calls.
For example, in the code
f x y z = (x ++ y) ++ z
the call (x ++ y) ++ z is a tail call because its result is the result of the entire function. The call x ++ y is not a tail call.
For an example of tail-recursion, consider foldl:
foldl :: (b -> a -> b) -> b -> [a] -> b
foldl _ acc [] = acc
foldl f acc (x:xs) = foldl f (f acc x) xs
The recursive call foldl f (f acc x) xs is a tail-recursive call because its result is the result of the entire function. Thus it's a tail call, and it is recursive being a call of foldl to itself.
The recursive calls in your code
merge2 (x:xs) (y:ys) = if y < x then y : merge2 (x:xs) ys
else x : merge2 xs (y:ys)
are not tail-recursive because they do not give the result of the entire function. The result of the call to merge2 is used as a part of the whole returned value, a new list. The (:) constructor, not the recursive call, gives the result of the entire function. And in fact, being lazy, (:) _ _ returns right away, and the holes _ are filled only later, if and when needed. That's why guarded recursion is space efficient.
However, tail-recursion doesn't guarantee space efficiency in a lazy language.
With lazy evaluation, Haskell builds up thunks, or structures in memory that represent code that is yet to be evaluated. Consider the evaluation of the following code:
foldl f 0 (1:2:3:[])
=> foldl f (f 0 1) (2:3:[])
=> foldl f (f (f 0 1) 2) (3:[])
=> foldl f (f (f (f 0 1) 2) 3) []
=> f (f (f 0 1) 2) 3
You can think of lazy evaluation as happening "outside-in." When the recursive calls to foldl are evaluated, thunks build up in the accumulator. So tail recursion with accumulators is not space efficient in a lazy language because of the delayed evaluation (unless the accumulator is forced right away, before the next tail-recursive call is made, which prevents the thunk build-up and passes along the already-calculated value instead).
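A small sketch of the "forced right away" variant, using foldl' from Data.List, which evaluates the accumulator to WHNF at every step. The names sumLazy and sumStrict are made up for this example; depending on optimization settings, the compiler may remove the thunks in the lazy version by itself.

```haskell
import Data.List (foldl')

-- Lazy accumulator: conceptually builds (((0 + 1) + 2) + 3) + ... as a
-- chain of thunks and only adds them up when the result is demanded.
sumLazy :: [Integer] -> Integer
sumLazy = foldl (+) 0

-- Strict accumulator: foldl' forces each intermediate sum before the
-- next tail call, so no chain of thunks accumulates.
sumStrict :: [Integer] -> Integer
sumStrict = foldl' (+) 0
```

Both functions return the same value; the difference is only in how much memory the accumulator can hold on to while the list is being consumed.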
Rather than tail recursion, you should try to use guarded recursion, where the recursive call is hidden inside a lazy data constructor. With lazy evaluation, expressions are evaluated until they are in weak head normal form (WHNF). An expression is in WHNF when it is one of the following:
A lazy data constructor applied to arguments (e.g. Just (1 + 1))
A partially applied function (e.g. const 2)
A lambda expression (e.g. \x -> x)
Consider map:
map :: (a -> b) -> [a] -> [b]
map _ [] = []
map f (x:xs) = f x : map f xs
map (+1) (1:2:3:[])
=> (+1) 1 : map (+1) (2:3:[])
The expression (+1) 1 : map (+1) (2:3:[]) is in WHNF because of the (:) data constructor, and therefore evaluation stops at this point. Your merge2 function also uses guarded recursion, so it too is space-efficient in a lazy language.
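One way to see guarded recursion at work is to run the merge on infinite lists. This is only a sketch: merge2 is the asker's function rewritten with guards, and odds, evens and firstMerged are names invented for the example.

```haskell
merge2 :: Ord a => [a] -> [a] -> [a]
merge2 xs [] = xs
merge2 [] ys = ys
merge2 (x:xs) (y:ys)
  | y < x     = y : merge2 (x:xs) ys   -- recursive call sits under (:)
  | otherwise = x : merge2 xs (y:ys)

odds, evens :: [Integer]
odds  = [1, 3 ..]
evens = [2, 4 ..]

-- Each (:) cell is produced on demand, so taking a finite prefix of the
-- merge of two infinite lists terminates.
firstMerged :: [Integer]
firstMerged = take 6 (merge2 odds evens)
```

firstMerged should evaluate to [1,2,3,4,5,6]. A tail-recursive merge with an accumulator could not return anything here, because it would have to reach the end of both inputs first.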
TL;DR: In a lazy language, tail-recursion can still take up memory if it builds up thunks in the accumulator, while guarded recursion does not build up thunks.
Helpful links:
https://wiki.haskell.org/Tail_recursion
https://wiki.haskell.org/Stack_overflow
https://wiki.haskell.org/Thunk
https://wiki.haskell.org/Weak_head_normal_form
Does Haskell have tail-recursive optimization?
What is Weak Head Normal Form?

Member Function in Haskell

I'm working on a small assignment for class and having a lot of trouble with Haskell. I am trying to write a recursive function that checks whether an integer is part of a list. I know the gist, but am unable to get it working with Haskell syntax: check if the current list is empty, and if so return False; then check if the integer is equal to the head of the current list, and if so return True; otherwise call member again with the same value and the tail of the list. What can I do to get this functioning properly?
Currently this is what I have:
member ::Int -> [Int] -> Bool
member x y
if y [] then False
else if x == head y then True
else member x tail y
I have also tried using
member :: (Eq x) => x -> [x] -> Bool
as the beginning line, and also a much simpler:
let member x y = if null y then False
else if x == head y then True
else member x tail y
Any help would be appreciated.
With pattern matching you can write it more clearly:
member :: (Eq a) => a -> [a] -> Bool
member x [] = False
member x (y:ys) | x==y = True
| otherwise = member x ys
element _ [] = False
element e (x:xs) = e == x || e `element` xs
-- OR
element e xs = if xs == [] then False
else if e == head xs then True
else e `element` tail xs
-- OR
element e xs = xs /= [] && (e == head xs || e `element` tail xs)
-- x `op` y = op x y
-- If you're feeling cheeky
element = elem
Your syntax appears very confused, but your logic makes sense, so here's a bucket list of things to remember:
Functions can be defined by multiple equations. Equations are checked top to bottom. That means using =.
Pattern matches are not equality tests. A pattern match breaks a value into its constituents if it matches and fails otherwise. An equality test x == y returns a Bool about the equality of x and y.
Pattern matching is used for flow control via...
a case statement, like
case xs of {
[] -> ...
x:xs' -> ...
}
Multiple equations, like
element _ [] = ...
element e (x:xs) = ...
Note that you can ignore a value in a pattern with _. With multiple equations of a function with multiple arguments, you're really pattern matching on all the arguments at once.
Bools are used for flow control via if _ then _ else _:
if xs == [] then False
else True
which is really just
case xs == [] of {
True -> False
False -> True
}
and Bools can use the ordinary operators (&&) (infixr 3) and (||) (infixr 2)
The difference is especially nefarious on lists. instance Eq a => Eq [a], so in order to use == on lists, you need to know that the elements of the lists can be compared for equality, too. This is true even when you're just checking (== []). On its own, [] == [] is rejected in a compiled module, because the compiler cannot tell what type the elements are (GHCi accepts it only because of its extended defaulting rules). Here it doesn't matter, but if you say, e.g. nonEmpty xs = xs /= [], you'll get nonEmpty :: Eq a => [a] -> Bool instead of nonEmpty :: [a] -> Bool, so nonEmpty [not] gives a type error when it should be True.
Function application has the highest precedence, and is left-associative:
element x xs reads as ((element x) xs)
element x tail xs reads as (((element x) tail) xs), which doesn't make sense here
f $ x = f x, but it's infixr 0, which means it basically reverses the rules and acts like a big set of parentheses around its right argument
element x $ tail xs reads as ((element x) (tail xs)), which works
Infix functions always have lower precedence than prefix application:
x `element` tail xs means ((element x) (tail xs)), too
let decls in expr is an expression. decls is only in scope inside expr, and the entire thing evaluates to whatever expr evaluates to. It makes no sense on the top level.
Haskell uses indentation to structure code, like Python.
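Putting several of these points together, here is a sketch of the asker's if/else version with only the syntax repaired. It assumes the Eq-based signature; nonEmpty is the helper discussed above, written with null so that it needs no Eq constraint.

```haskell
-- The original logic, kept as nested if/else on purpose.
member :: Eq a => a -> [a] -> Bool
member x ys =
  if null ys                      -- `null` works for any element type
    then False
    else if x == head ys
      then True
      else member x (tail ys)     -- parentheses make `tail ys` one argument

-- Emptiness check without an Eq constraint, so it also works on lists
-- of functions such as [not].
nonEmpty :: [a] -> Bool
nonEmpty = not . null
```

The pattern-matching versions above are still the more idiomatic choice, since they avoid the partial functions head and tail.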

Haskell: Is there a left-side identity for the infix (`:`) operator?

In other words, what syntax (if any) could I use in place of XXX in the following implementation of filter:
filter' :: (a -> Bool) -> [a] -> [a]
filter' _ [] = []
filter' f (x:xs) =
let n = if f x then x else XXX
in n:(filter' f xs)
I'm aware of the following alternative implementation (which is recursive and only prepends) but would still be curious if the infix operator has a LHS identity.
filter' :: (a -> Bool) -> [a] -> [a]
filter' _ [] = []
filter' f (x:xs)
| f x = x:(filter' f xs)
| otherwise = filter' f xs
There is none. This can be seen because
ghci> length (undefined : [])
1
so no matter what element you put there, you will always get a length of 1.
How about this phrasing:
filter' f (x:xs) =
let n = if f x then (x:) else id
in n (filter' f xs)
Handwave alert: the following is strictly speaking a lie (because of undefined and such), but it's still a useful idea.
One key property of types defined with Haskell data declarations is that they are free: the set of values for a data type is isomorphic to the set of normal-form terms (fully evaluated expressions) of that type. If two terms of the type are different, then they have different values.
From this it follows that x : xs and xs (in the same scope) must be different lists, simply because they are different terms.
Put a bit differently, the semantics of data types is that if you pattern match on a constructor application you always get back the same constructor and its arguments. For example, these two expressions are guaranteed to be True, no matter what x and xs may be:
case [] of
[] -> True
x:xs -> False
case (x:xs) of
[] -> False
x':xs' -> x == x' && xs == xs'
The left identity value you're looking for would be a counterexample to the second expression here. Ergo, no such value exists.
There is none, but ++ does have an identity, namely []. So this would work as well:
filter' _ [] = []
filter' f (x:xs) =
let n = if f x then [x] else []
in n ++ (filter' f xs)
This is the same as @luqui's answer (as you could prove), only less efficient, but it keeps things a bit lower-order.
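The "(x:) or id" idea also fits a right fold. The following is just a sketch; filterR and firstEvens are names made up for the example.

```haskell
-- For each element, pick a function to apply to the filtered rest of
-- the list: (x :) to keep x, or id to drop it. `id` plays the role the
-- question hoped a left identity of (:) would play.
filterR :: (a -> Bool) -> [a] -> [a]
filterR f = foldr (\x -> if f x then (x :) else id) []

-- Still lazy, so it can be used on an infinite list.
firstEvens :: [Integer]
firstEvens = take 3 (filterR even [1 ..])
```

firstEvens should evaluate to [2,4,6].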

filtering values into two lists

So I'm new to SML and am trying to understand the ins and outs of it. Recently I tried creating a filter which takes two parameters: a function (that returns a boolean), and a list of values to run against the function. The filter returns the list of values for which the function returns true.
Code:
fun filter f [] = [] |
filter f (x::xs) =
if (f x)
then x::(filter f xs)
else (filter f xs);
So that works. But what I'm trying to do now is return a tuple that contains the list of true values and the list of false values. I'm stuck on my conditional and I can't really see another way. Any thoughts on how to solve this?
Code:
fun filter2 f [] = ([],[]) |
filter2 f (x::xs) =
if (f x)
then (x::(filter2 f xs), []) (* error *)
else ([], x::(filter2 f xs)); (* error *)
I think there are several ways to do this.
Reusing Filter
For instance, we could use an inductive approach based on the fact that your tuple would be formed by two elements: the first is the list of elements that satisfy the predicate, and the second is the list of elements that don't. So, you could reuse your filter function as:
fun partition f xs = (filter f xs, filter (not o f) xs)
This is not the best approach, though, because it traverses the list twice, but if the lists are small, it is quite evident and very readable.
Folding
Another way to think about this is in terms of fold. You could think that you are reducing your list to a tuple of lists, and as you go, you split your items depending on a predicate. Somewhat like this:
fun partition f xs =
let
fun split x (xs,ys) =
if f x
then (x::xs,ys)
else (xs, x::ys)
val (trueList, falseList) = List.foldl (fn (x,y) => split x y)
([],[]) xs
in
(List.rev trueList, List.rev falseList)
end
Partition
You could also implement your own folding algorithm in the same way as the List.partition method of SML does:
fun partition f xs =
let
fun iter(xs, (trueList,falseList)) =
case xs of
[] => (List.rev trueList, List.rev falseList)
| (x::xs') => if f x
then iter(xs', (x::trueList,falseList))
else iter(xs', (trueList,x::falseList))
in
iter(xs,([],[]))
end
Use SML Basis Method
And ultimately, you can avoid all this and use SML method List.partition whose documentation says:
partition f l
applies f to each element x of l, from left to right, and returns a
pair (pos, neg) where pos is the list of those x for which f x
evaluated to true, and neg is the list of those for which f x
evaluated to false. The elements of pos and neg retain the same
relative order they possessed in l.
This method is implemented as the previous example.
So I will show a good way to do it, and a better way to do it (IMO). But the 'better way' is just for future reference when you learn:
fun filter2 f [] = ([], [])
| filter2 f (x::xs) = let
fun ftuple f (x::xs) trueList falseList =
if (f x)
then ftuple f xs (x::trueList) falseList
else ftuple f xs trueList (x::falseList)
| ftuple _ [] trueList falseList = (rev trueList, rev falseList) (* accumulators are built back to front, so restore the input order *)
in
ftuple f (x::xs) [] []
end;
The reason yours does not work is that filter2 returns a tuple of lists, but in x::(filter2 f xs) you cons x onto that result as if it were a single list. So while you think to yourself that the result type is a tuple of lists, the compiler sees :: and expects a list there, and the types do not agree. Here is the better version, in my opinion. You should look up the function foldr if you are curious; I prefer this technique since it is more readable, less verbose, and, much more importantly, more predictable and robust:
fun filter2 f l = foldr (fn(x,xs) => if (f x) then (x::(#1(xs)), #2(xs)) else (#1(xs), x::(#2(xs)))) ([],[]) l;
The reason why the first example works is because you are storing default empty lists that accumulate copies of the variables that either fit the condition, or do not fit the condition. However, you have to explicitly tell SML compiler to make sure that the type rules agree. You have to make absolutely sure that SML knows that your return type is a tuple of lists. Any mistake in this chain of command, and this will result in failure to execute. Hence, when working with SML, always study your type inferences. As for the second one, you can see that it is a one-liner, but I will leave you to research that one on your own, just google foldr and foldl.
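For readers coming from the Haskell questions above, the same foldr idea can be sketched in Haskell for comparison. This is not SML code, and partitionR and agreesWithLibrary are names invented for the example; Data.List.partition is the standard library function it is checked against.

```haskell
import Data.List (partition)

-- Split a list into (elements satisfying f, elements failing f) with a
-- single right fold, preserving the original order in both halves.
partitionR :: (a -> Bool) -> [a] -> ([a], [a])
partitionR f = foldr step ([], [])
  where
    -- The lazy pattern (~) avoids forcing the rest of the fold before
    -- the current element has been placed.
    step x ~(yes, no)
      | f x       = (x : yes, no)
      | otherwise = (yes, x : no)

-- Sanity check against the library function on a small input.
agreesWithLibrary :: Bool
agreesWithLibrary = partitionR even [1 .. 10 :: Int] == partition even [1 .. 10]
```

Because a right fold conses each element onto the front of the already-processed tail, no final reversal is needed, unlike the accumulator-based versions.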

Haskell pattern matching conundrum

I was trying to search through a list of pairs that could have the element ("$", Undefined) in it at some arbitrary location. I wanted to ONLY search the part of the list in front of that special element, so I tried something like this (alreadyThere is intended to take the element n and the list xs as arguments):
checkNotSameScope :: Env -> VarName -> Expr -> Expr
checkNotSameScope (xs:("$", Undefined):_) n e = if alreadyThere n xs then BoolLit False
else BoolLit True
But that does not work; the compiler seemed to indicate that (xs: ..) only deals with a SINGLE value prepending my list. I cannot use : to indicate the first chunk of a list; only a single element. Looking back, this makes sense; otherwise, how would the compiler know what to do? Adding an "s" to something like "x" doesn't magically make multiple elements! But how can I work around this?
Unfortunately, even with smart compilers and languages, some programming cannot be avoided...
In your case it seems you want the part of a list up to a specific element. More generally, to find the list up to some condition you can use the standard library takeWhile function. Then you can just run alreadyThere on it:
checkNotSameScope :: Env -> VarName -> Expr -> Expr
checkNotSameScope xs n e = if alreadyThere n (takeWhile (/= ("$", Undefined)) xs)
then BoolLit False
else BoolLit True
It may not do what you want for lists where ("$", Undefined) does not occur, so beware.
Similar to Joachim's answer, you can use break, which will allow you to detect when ("$", Undefined) doesn't occur (if this is necessary). i.e.
checkNotSameScope xs n e = case break (== ("$", Undefined)) xs of
(_, []) -> .. -- ("$", Undefined) didn't occur!
(xs', _) -> BoolLit . not $ alreadyThere n xs'
(NB. you lose some laziness in this solution, since the list has to be traversed until ("$", Undefined), or to the end, to check the first case.)
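Since Env, Expr and alreadyThere are not shown in the question, here is a sketch of the same break technique on a plain list. prefixBefore is a hypothetical helper, not part of the asker's code.

```haskell
-- Return the part of the list in front of the first occurrence of
-- `marker`, or Nothing if the marker never occurs.
prefixBefore :: Eq a => a -> [a] -> Maybe [a]
prefixBefore marker xs =
  case break (== marker) xs of
    (_, [])     -> Nothing        -- ran off the end: no marker found
    (before, _) -> Just before    -- everything before the first marker
```

For example, prefixBefore '$' "ab$cd" should give Just "ab", and prefixBefore '$' "abcd" should give Nothing. The caller can then decide what the "marker missing" case should mean.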
Haskell cannot do this kind of pattern matching out of the box, although there are some languages which can, like CLIPS for example, or F#, by using active patterns.
But we can use Haskell's existing pattern matching capabilities to obtain a similar result. Let us first define a function called deconstruct like this:
deconstruct :: [a] -> [([a], a, [a])]
deconstruct [] = []
deconstruct [x] = [([], x, [])]
deconstruct (x:xs) = ([], x, xs) : [(x:ys1, y, ys2) | (ys1, y, ys2) <- deconstruct xs]
What this function does is to obtain all the decompositions of a list xs into triples of form (ys1, y, ys2) such that ys1 ++ [y] ++ ys2 == xs. So for example:
deconstruct [1..4] => [([],1,[2,3,4]),([1],2,[3,4]),([1,2],3,[4]),([1,2,3],4,[])]
Using this you can define your function as follows:
checkNotSameScope xs n e =
case [ys | (ys, ("$", Undefined), _) <- deconstruct xs] of
[ys] -> BoolLit $ not $ alreadyThere n ys
_ -> -- handle the case when ("$", Undefined) doesn't occur at all or more than once
We can use the do-notation to obtain something even closer to what you are looking for:
checkNotSameScope xs n e = BoolLit $ not $ any (alreadyThere n) prefixes
where
prefixes = do
(xs, ("$", Undefined), _) <- deconstruct xs
return xs
There are several things going on here. First of all the prefixes variable will store all the prefix lists which occur before the ("$", Undefined) pair - or none if the pair is not in the input list xs. Then using the any function we are checking whether alreadyThere n gives us True for any of the prefixes. And the rest is to complete your function's logic.
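To experiment with this without the asker's Env and Expr types, here is a sketch that uses a String and the character '$' as a stand-in for the ("$", Undefined) pair. prefixesBefore is a name made up for the example.

```haskell
-- All ways of splitting a list into (before, element, after).
deconstruct :: [a] -> [([a], a, [a])]
deconstruct []     = []
deconstruct (x:xs) =
  ([], x, xs) : [(x : ys1, y, ys2) | (ys1, y, ys2) <- deconstruct xs]

-- Every prefix that ends right before an occurrence of '$'. Splits
-- whose middle element is not '$' fail the pattern and are skipped.
prefixesBefore :: String -> [String]
prefixesBefore s = [ys | (ys, '$', _) <- deconstruct s]
```

For "ab$c$" this should yield ["ab", "ab$c"], one prefix per occurrence of the marker, and for a string without '$' it yields the empty list.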