Checking whether element is in list or not - list

I have just started writing my first few programs in Haskell and I'm struggling a lot - so please excuse errors galore. I found this practice problem online and it wants me to take in a list, an int, and return a bool. I should check if the int is to be found within the list or not and return true or false, respectively.
This is what I have:
import Data.List
elem':: (Eq a) => a -> [a] -> Bool
elem' x [x]
| x == [x:xs] = True
| length [x] == 1 = False
| otherwise = elem' tail [x]
The compiler returns
Illegal type signature [...] Type signatures are only allowed in patterns with ScopedTypeVariables
which I cannot really make sense of yet. I've found some people saying that this error occurs when using tabs instead of spaces, yet I solely used spaces. I am grateful for any advice!

First you have a simple indentation problem. Your definition is parsed as
elem':: (Eq a) => a -> [a] -> Bool elem' x [x]
| x == [x:xs] = True
| length [x] == 1 = False
| otherwise = elem' tail [x]
I'm a bit surprised that this isn't reported as a parse error, but basically it is one.
What you need to do is, indent all function clauses to the same level as the signature (i.e. in this case, not indented at all).
elem':: (Eq a) => a -> [a] -> Bool
elem' x [x]
| x == [x:xs] = True
| length [x] == 1 = False
| otherwise = elem' tail [x]
This is still very wrong though. First, you're matching twice on x. This is not possible in Haskell. OTOH there's no variable xs in scope. If it were, it would have to have type [a], and then [x:xs] would have type [[a]]: a nested list. But then comparing it to x, which is not even a simple list, doesn't make any sense. length [x] is always 1, regardless of what x is. Similarly, tail [x] is always the empty list (but that's not even what's used here, you've actually wrote that the tail function should be passed as an argument to elem').
Perhaps the confusion comes from the fact that [ ] can mean two different things: one the type level, it means you're talking about lists of elements of the enclosed type. I.e. [a] is the type of lists whose elements have type a.By contrast, on the value level, [x] is specifically the list that contains only a single element, namely x.
You should make sure you understand how pattern matching works. Your clause would have been legal if you had two different variables in it. Then, matching [y] would mean this clause only accepts lists with exactly one element, and y will than be the value of the single element. In this case, the correct implementation would be
elem' x [y] = x==y
(Alternatively, you could have guards here yielding True and False, but that's just redundant work.)
Actually though there's no good reason to have a clause for exactly one element. All you need is a clause for zero elements
elem' x [] = ...
and a clause for at least one element:
elem' x (y:ys)
| ...
| ...

Related

Haskell self-referential List termination

EDIT: see this followup question that simplifies the problem I am trying to identify here, and asks for input on a GHC modification proposal.
So I was trying to write a generic breadth-first search function and came up with the following:
bfs :: (a -> Bool) -> (a -> [a]) -> [a] -> Maybe a
bfs predf expandf xs = find predf bfsList
where bfsList = xs ++ concatMap expandf bfsList
which I thought was pretty elegant, however in the does-not-exist case it blocks forever.
After all the terms have been expanded to [], concatMap will never return another item, so concatMap is blocking waiting for another item from itself? Could Haskell be made smart enough to realize the list generation is blocked reading the self-reference and terminate the list?
The best replacement I've been able to come up with isn't quite as elegant, since I have to handle the termination case myself:
where bfsList = concat.takeWhile (not.null) $ iterate (concatMap expandf) xs
For concrete examples, the first search terminates with success, and the second one blocks:
bfs (==3) (\x -> if x<1 then [] else [x/2, x/5]) [5, 3*2**8]
bfs (==3) (\x -> if x<1 then [] else [x/2, x/5]) [5, 2**8]
Edited to add a note to explain my bfs' solution below.
The way your question is phrased ("could Haskell be made smart enough"), it sounds like you think the correct value for a computation like:
bfs (\x -> False) (\x -> []) []
given your original definition of bfs should be Nothing, and Haskell is just failing to find the correct answer.
However, the correct value for the above computation is bottom. Substituting the definition of bfs (and simplifying the [] ++ expression), the above computation is equal to:
find (\x -> False) bfsList
where bfsList = concatMap (\x -> []) bfsList
Evaluating find requires determining if bfsList is empty or not, so it must be forced to weak head normal form. This forcing requires evaluating the concatMap expression, which also must determine if bfsList is empty or not, forcing it to WHNF. This forcing loop implies bfsList is bottom, and therefore so is find.
Haskell could be smarter in detecting the loop and giving an error, but it would be incorrect to return [].
Ultimately, this is the same thing that happens with:
foo = case foo of [] -> []
which also loops infinitely. Haskell's semantics imply that this case construct must force foo, and forcing foo requires forcing foo, so the result is bottom. It's true that if we considered this definition an equation, then substituting foo = [] would "satisfy" it, but that's not how Haskell semantics work, for the same reason that:
bar = bar
does not have value 1 or "awesome", even though these values satisfy it as an "equation".
So, the answer to your question is, no, this behavior couldn't be changed so as to return an empty list without fundamentally changing Haskell semantics.
Also, as an alternative that looks pretty slick -- even with its explicit termination condition -- maybe consider:
bfs' :: (a -> Bool) -> (a -> [a]) -> [a] -> Maybe a
bfs' predf expandf = look
where look [] = Nothing
look xs = find predf xs <|> look (concatMap expandf xs)
This uses the Alternative instance for Maybe, which is really very straightforward:
Just x <|> ... -- yields `Just x`
Nothing <|> Just y -- yields `Just y`
Nothing <|> Nothing -- yields `Nothing` (doesn't happen above)
so look checks the current set of values xs with find, and if it fails and returns Nothing, it recursively looks in their expansions.
As a silly example that makes the termination condition look less explicit, here's its double-monad (Maybe in implicit Reader) version using listToMaybe as the terminator! (Not recommended in real code.)
bfs'' :: (a -> Bool) -> (a -> [a]) -> [a] -> Maybe a
bfs'' predf expandf = look
where look = listToMaybe *>* find predf *|* (look . concatMap expandf)
(*>*) = liftM2 (>>)
(*|*) = liftM2 (<|>)
infixl 1 *>*
infixl 3 *|*
How does this work? Well, it's a joke. As a hint, the definition of look is the same as:
where look xs = listToMaybe xs >>
(find predf xs <|> look (concatMap expandf xs))
We produce the results list (queue) in steps. On each step we consume what we have produced on the previous step. When the last expansion step added nothing, we stop:
bfs :: (a -> Bool) -> (a -> [a]) -> [a] -> Maybe a
bfs predf expandf xs = find predf queue
where
queue = xs ++ gen (length xs) queue -- start the queue with `xs`, and
gen 0 _ = [] -- when nothing in queue, stop;
gen n q = let next = concatMap expandf (take n q) -- take n elemts from queue,
in next ++ -- process, enqueue the results,
gen (length next) (drop n q) -- advance by `n` and continue
Thus we get
~> bfs (==3) (\x -> if x<1 then [] else [x/2, x/5]) [5, 3*2**8]
Just 3.0
~> bfs (==3) (\x -> if x<1 then [] else [x/2, x/5]) [5, 2**8]
Nothing
One potentially serious flow in this solution is that if any expandf step produces an infinite list of results, it will get stuck calculating its length, totally needlessly so.
In general, just introduce a counter and increment it by the length of solutions produced at each expansion step (length . concatMap expandf or something), decrementing by the amount that was consumed. When it reaches 0, do not attempt to consume anything anymore because there's nothing to consume at that point, and you should instead terminate.
This counter serves in effect as a pointer back into the queue being constructed. A value of n indicates that the place where the next result will be placed is n notches ahead of the place in the list from which the input is taken. 1 thus means that the next result is placed directly after the input value.
The following code can be found in Wikipedia's article about corecursion (search for "corecursive queue"):
data Tree a b = Leaf a | Branch b (Tree a b) (Tree a b)
bftrav :: Tree a b -> [Tree a b]
bftrav tree = queue
where
queue = tree : gen 1 queue -- have one value in queue from the start
gen 0 _ = []
gen len (Leaf _ : s) = gen (len-1) s -- consumed one, produced none
gen len (Branch _ l r : s) = l : r : gen (len+1) s -- consumed one, produced two
This technique is natural in Prolog with top-down list instantiation and logical variables which can be explicitly in a not-yet-set state. See also tailrecursion-modulo-cons.
gen in bfs can be re-written to be more incremental, which is usually a good thing to have:
gen 0 _ = []
gen n (y:ys) = let next = expandf y
in next ++ gen (n - 1 + length next) ys
bfsList is defined recursively, which is not in itself a problem in Haskell. It does, however, produce an infinite list, which, again, isn't in itself a problem, because Haskell is lazily evaluated.
As long as find eventually finds what it's looking for, it's not an issue that there's still an infinity of elements, because at that point evaluation stops (or, rather, moves on to do other things).
AFAICT, the problem in the second case is that the predicate is never matched, so bfsList just keeps producing new elements, and find keeps on looking.
After all the terms have been expanded to [] concatMap will never return another item
Are you sure that's the correct diagnosis? As far as I can tell, with the lambda expressions supplied above, each input element always expand to two new elements - never to []. The list is, however, infinite, so if the predicate goes unmatched, the function will evaluate forever.
Could Haskell be made smart enough to realize the list generation is blocked reading the self-reference and terminate the list?
It'd be nice if there was a general-purpose algorithm to determine whether or not a computation would eventually complete. Alas, as both Turing and Church (independently of each other) proved in 1936, such an algorithm can't exist. This is also known as the Halting problem. I'm not a mathematician, though, so I may be wrong, but I think it applies here as well...
The best replacement I've been able to come up with isn't quite as elegant
Not sure about that one... If I try to use it instead of the other definition of bfsList, it doesn't compile... Still, I don't think the problem is the empty list.

Pattern matching and infinite lists

I'm having trouble understanding this simple snippet of code:
-- This works: foldr go1 [] [1..]
-- This doesn't: foldr go2 [] [1..]
go1 a b = a : b
go2 a [] = a : []
go2 a b = a : b
Folding with go1 immediately starts returning values, but go2 appears to be waiting for the end of the list.
Clearly the pattern matching is causing something to be handled differently. Can someone explain what exactly is going on here?
Unlike go1, go2 checks whether or not its second argument is empty. In order to do that the second argument needs to be evaluated, at least enough to determine whether it is empty or not.
So for your call to foldr this means the following:
Both go1 and go2 are first called with two arguments: 1 and the result of foldr go [] [2 ..]. In the case of go1 the second argument remains untouched, so the result of the foldr is simply 1 :: foldr go [] [2 ..] without evaluating the tail any further until it is accessed.
In the case of go2 however, foldr go [] [2 ..] needs to be evaluated to check whether it is empty. And to do that foldr go [] [3 ..] then needs to be evaluated for the same reason. And so on ad infinitum.
To test, whether an expression satisfies some pattern, you need to evaluate it to weak head normal form at least. So pattern-matching forces evaluation.
One common example is the interleave function, which interleaves two lists. It could be defined like
interleave :: [a] -> [a] -> [a]
interleave xs [] = xs
interleave [] ys = ys
interleave (x:xs) (y:ys) = x : y : interleave xs ys
But this function is strict in the second argument. And more lazy version is
interleave [] ys = ys
interleave (x:xs) ys = x : interleave ys xs
You can read more here: http://en.wikibooks.org/wiki/Haskell/Laziness
It is because of laziness.... Because of the way that go1 and go2 were defined in this example, they will behave exactly the same was for b==[], but the compiler doesn't know this.
For go1, the left-most fold will use tail-recursion to immediately output the value of a, and then compute the value of b.
go1 a b -> create and return the value of a, then calculate b
For go2, the compiler doesn't even know which case to match until the value of b is computed.... which will never happen.
go2 a b -> wait for the value of b, pattern match against it, then output a:b
To see the difference try this in GHCi:
> head (go1 1 (error "urk!"))
1
> head (go2 1 (error "urk!"))
*** Exception: urk!
The issue is that go2 will evaluate its second argument before returning the head of the list. That is, go2 is strict in its second argument, unlike go1 which is lazy.
This matters when you fold over infinite lists:
fold1 go1 [] [1..] =
go1 1 (go1 2 (go1 3 ( ..... =
1 : (go1 2 (go1 3 ( ..... =
1 : 2 : (go1 3 ( ...
works fine, but
fold1 go1 [] [1..] =
go2 1 (go2 2 (go2 3 ( .....
can not be simplified to 1:... since go2 insists in evaluating its second argument, which is another call to go2, which in turn requires its own second argument to be evaluated, which is another ...
Well, you get the point. The second one will not halt.

Haskell - List pattern matching on multiple parameters? (cannot construct the infinite type)

I'm having trouble using list pattern with multiple parameters. For example, trying to define:
somefunction (x:xs) (y:ys) = x:[y]
results in Occurs check: cannot construct the infinite type: t0 = [t0].
Basically, I want to take two lists as parameters to a function and manipulate each of them using the (x:xs) pattern matching approach. Why is this wrong and how can I do it right? Thank you much!
EDIT: Update with more code as suggested was needed in answers.
somefunction a [] = [a]:[]
somefunction [] b = [b]:[]
somefunction (x:xs) (y:ys) = x:[y]
EDIT 2: Missed an important update. The error I'm getting with the above code is Occurs check: cannot construct the infinite type: t0 = [[t0]]. I think I understand the problem now.
Your function snippet is perfectly sound:
(! 514)-> ghci
GHCi, version 7.6.3: http://www.haskell.org/ghc/ :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Prelude> let f (x:xs) (y:ys) = x:[y]
Prelude> :type f
f :: [a] -> [a] -> [a]
But the context contradicts that type, and the type inference give you that error. For instance, I can create a context that will give this error:
Prelude> let g xs ys = xs : ys
Prelude> :type g
g :: a -> [a] -> [a]
And then if I combine f and g like below, then I get your error:
Prelude> let z x y = g x (f x y)
<interactive>:7:20:
Occurs check: cannot construct the infinite type: a0 = [a0]
In the first argument of `f', namely `x'
In the second argument of `g', namely `(f x y)'
In the expression: g x (f x y)
Prelude>
To understand you error properly, you will need to examine (or post) enough context.
The problem is with all 3 lines taken together:
somefunction a [] = [a]:[]
somefunction [] b = [b]:[]
somefunction (x:xs) (y:ys) = x:[y]
None of them are incorrect taken on their own. The problem is that the three equations are inconsistent about the return type of somefunction.
From the last equation, we can see that both arguments are lists (since you pattern match on them using the list constructor :).
From the last equation, we can see that the return type is a list whose elements must be the same type as the elements of the argument lists (which must also both be the same type), since the return value is x:[y] (which is more often written [x, y]; just the list containing only the two elements x and y) and x and y were elements of the argument lists. So if x has type t0, the arguments to somefunction both have type [t0] and the return type is [t0].
Now try to apply those facts to the first equation. a must be a list. So [a] (the list containing exactly one element a) must be a list of lists. And then [a]:[] (the list whose first element is [a] and whose tail is empty - also written [[a]]) must be a list of lists of lists! If the parameter a has type [t0] (to match the type we figured out from looking at the last equation), then [a] has type [[t0]] and [a]:[] (or [[a]]) has type [[[t0]]], which is the return type we get from this equation.
To reconcile what we learned from those two equations we need to find some type expression to use for t0 such that [t0] = [[[t0]]], which also requires that t0 = [[t0]]. This is impossible, which is what the error message Occurs check: cannot construct the infinite type: t0 = [[t0]] was about.
If your intention was to return one of the parameters as-is when the other one is empty, then you need something more like:
somefunction a [] = a
somefunction [] b = b
somefunction (x:xs) (y:ys) = [x, y]
Or it's possible that the first two equations were correct (you intend to return a list of lists of lists?), in which case the last one needs to be modified. Without knowing what you wanted the function to do, I can't say.
May be you want to write:
somefunction xs [] = xs
somefunction [] ys = ys
somefunction (x:xs) (y:ys) = x : y : []
You have extra brackets. And your definition of x : y not contains []. So compiler think, y is already a list

SML: How can I pass a function a list and return the list with all negative reals removed?

Here's what I've got so far...
fun positive l1 = positive(l1,[],[])
| positive (l1, p, n) =
if hd(l1) < 0
then positive(tl(l1), p, n # [hd(l1])
else if hd(l1) >= 0
then positive(tl(l1), p # [hd(l1)], n)
else if null (h1(l1))
then p
Yes, this is for my educational purposes. I'm taking an ML class in college and we had to write a program that would return the biggest integer in a list and I want to go above and beyond that to see if I can remove the positives from it as well.
Also, if possible, can anyone point me to a decent ML book or primer? Our class text doesn't explain things well at all.
You fail to mention that your code doesn't type.
Your first function clause just has the variable l1, which is used in the recursive. However here it is used as the first element of the triple, which is given as the argument. This doesn't really go hand in hand with the Hindley–Milner type system that SML uses. This is perhaps better seen by the following informal thoughts:
Lets start by assuming that l1 has the type 'a, and thus the function must take arguments of that type and return something unknown 'a -> .... However on the right hand side you create an argument (l1, [], []) which must have the type 'a * 'b list * 'c list. But since it is passed as an argument to the function, that must also mean that 'a is equal to 'a * 'b list * 'c list, which clearly is not the case.
Clearly this was not your original intent. It seems that your intent was to have a function that takes an list as argument, and then at the same time have a recursive helper function, which takes two extra accumulation arguments, namely a list of positive and negative numbers in the original list.
To do this, you at least need to give your helper function another name, such that its definition won't rebind the definition of the original function.
Then you have some options, as to which scope this helper function should be in. In general if it doesn't make any sense to be calling this helper function other than from the "main" function, then it should not be places in a scope outside the "main" function. This can be done using a let binding like this:
fun positive xs =
let
fun positive' ys p n = ...
in
positive' xs [] []
end
This way the helper function positives' can't be called outside of the positive function.
With this take care of there are some more issues with your original code.
Since you are only returning the list of positive integers, there is no need to keep track of the
negative ones.
You should be using pattern matching to decompose the list elements. This way you eliminate the
use of taking the head and tail of the list, and also the need to verify whether there actually is
a head and tail in the list.
fun foo [] = ... (* input list is empty *)
| foo (x::xs) = ... (* x is now the head, and xs is the tail *)
You should not use the append operator (#), whenever you can avoid it (which you always can).
The problem is that it has a terrible running time when you have a huge list on the left hand
side and a small list on the right hand side (which is often the case for the right hand side, as
it is mostly used to append a single element). Thus it should in general be considered bad
practice to use it.
However there exists a very simple solution to this, which is to always concatenate the element
in front of the list (constructing the list in reverse order), and then just reversing the list
when returning it as the last thing (making it in expected order):
fun foo [] acc = rev acc
| foo (x::xs) acc = foo xs (x::acc)
Given these small notes, we end up with a function that looks something like this
fun positive xs =
let
fun positive' [] p = rev p
| positive' (y::ys) p =
if y < 0 then
positive' ys p
else
positive' ys (y :: p)
in
positive' xs []
end
Have you learned about List.filter? It might be appropriate here - it takes a function (which is a predicate) of type 'a -> bool and a list of type 'a list, and returns a list consisting of only the elements for which the predicate evaluates to true. For example:
List.filter (fn x => Real.>= (x, 0.0)) [1.0, 4.5, ~3.4, 42.0, ~9.0]
Your existing code won't work because you're comparing to integers using the intversion of <. The code hd(l1) < 0 will work over a list of int, not a list of real. Numeric literals are not automatically coerced by Standard ML. One must explicitly write 0.0, and use Real.< (hd(l1), 0.0) for your test.
If you don't want to use filter from the standard library, you could consider how one might implement filter yourself. Here's one way:
fun filter f [] = []
| filter f (h::t) =
if f h
then h :: filter f t
else filter f t

What does :: and ' mean in oCaml?

What doesx :: xs' mean?
I dont have much functional experience but IIRC in F# 1 :: 2 :: 3 :: [];; creates an array of [1,2,3]
so what does the ' do?
let rec sum xs =
match xs with
| [] -> 0
| x :: xs' -> x + sum xs'
I think sepp2k already answered most of the question, but I'd like to add a couple of points that may clarify how F#/OCaml compiler interprets the code and explain some common uses.
Regarding the ' symbol - this is just a part of a name (a valid identifier starts with a letter and then contains one or more letters, numbers or ' symbols). It is usually used if you have a function or value that is very similar to some other, but is in some way new or modified.
In your example, xs is a list that should be summed and the pattern matching decomposes the list and gives you a new list (without the first element) that you need to sum, so it is called xs'
Another frequent use is when declaring a local utility function that implements the functionality and takes an additional parameter (typically, when writing tail-recursive code):
let sum list =
let rec sum' list res =
match list with
| [] -> res
| x::xs -> sum' xs (res + x)
sum' list 0
However, I think there is usually a better name for the function/value, so I try to avoid using ' when writing code (I think it isn't particularly readable and moreover, it doesn't colorize correctly on StackOverflow!)
Regarding the :: symbol - as already mentioned, it is used to create lists from a single element and a list (1::[2;3] creates a list [1;2;3]). It is however worth noting that the symbol can be used in two different ways and it is also interpreted in two different ways by the compiler.
When creating a list, you use it as an operator that constructs a list (just like when you use + to add two numbers). However, when you use it in the match construct, it is used as a pattern, which is a different syntactic category - the pattern is used to decompose the list into an element and the remainder and it succeeds for any non-empty list:
// operator
let x = 0
let xs = [1;2;3]
let list = x::xs
// pattern
match list with
| y::ys -> // ...
The ' is simply part of the variable name. And yes foo :: bar, where foo is an element of type a and bar is a list of type a, means "the list that has foo as its first element, followed by the elements of bar". So the meaning of the match statement is:
If xs is the empty list, the value is 0. If xs is the list containing the item x followed by the items in xs' the value is x + sum xs'. Since x and xs' are fresh variables, this has the effect that for any non empty list, x will be assigned the value of the first element and xs' will be assigned the list containing all other elements.
Like others have said, the ' is a carryover from mathematics where x' would be said as "x prime"
It's idiomatic in ML-family languages to name a variable foo' to indicate that it's somewhat related to another variable foo, especially in recursions like your code sample. Just like in imperative languages you use i, j for loop indices. This naming convention may be a little surprising since ' is typically an illegal symbol for identifiers in C-like languages.
What does x :: xs' mean?
If you have two variables called x and xs' then x :: xs' creates a new list with x prepended onto the front of xs'.
I dont have much functional experience but IIRC in F# 1 :: 2 :: 3 :: [];; creates an array of [1,2,3]
Not quite. It's a list.
so what does the ' do?
It is treated as an alphabetical character, so the following is equivalent:
let rec sum xs =
match xs with
| [] -> 0
| x :: ys -> x + sum ys
Note that :: is technically a type constructor which is why you can use it in both patterns and expressions.