I'm very confused as to why [a,b,c,d] and [a,b|[c,d]] unify. From my limited understanding, the "|" operator splits the head and tail, producing [a,b] and [[c,d]]. So how can this result, which appears to contain a doubly nested list, unify with the c,d of the first one?
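The lists [a,b,c,d] and [a,b|[c,d]] are the same term. The "|" separates the leading elements from the tail, i.e. the list of everything that follows: [a,b,c,d] is [a|[b,c,d]] with head a and tail [b,c,d], and it is equally [a,b|[c,d]] with a and b in front of the tail [c,d] (see @Will Ness's comment). The examples below should make this clearer: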
[a,b,c] unifies with [Head|Tail] resulting in Head=a and Tail=[b,c]
[a] unifies with [H|T] resulting in H=a and T=[]
[a,b,c] unifies with [a|T] resulting in T=[b,c]
[a,b,c] doesn't unify with [b|T]
[] doesn't unify with [H|T]
[] unifies with []. Two empty lists always match.
In the second example you can see that [a] is the same as [a|[]], and in the third example [a,b,c] unifies with [a|[b,c]], which is analogous to your example.
I think that the span function in Haskell applies a predicate to a list and returns a tuple where the first element is the elements of the list that satisfy that predicate and the second element is the remainder of the list.
And it works well when I put:
span (<3) [1,2,4,5,6]. It just returns in GHCI:
([1,2], [4,5,6]).
However, when I enter span (>3) [1,2,4,5,6], it returns ([],[1,2,4,5,6]). But I thought it should return ([4,5,6],[1,2]). So I was wondering what the reason for this is.
Your understanding of span is not entirely correct. This is what the official docs say:
span, applied to a predicate p and a list xs,
returns a tuple where first element is longest prefix (possibly empty)
of xs of elements that satisfy p and second element
is the remainder of the list
(emphasis mine).
Hence, the predicate is applied to each element of the list starting from the beginning. This means that the predicate in
span (<3) [1,2,4,5,6]
is satisfied for the first two elements, and the result is
([1,2], [4,5,6])
But in this other example
span (>3) [1,2,4,5,6]
the first element of the list already doesn't satisfy the predicate, so the first element of the returned tuple will be an empty list.
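The "longest prefix" behaviour can be sketched with a hand-written version. This is only an illustration, not the library's actual source; the name mySpan is hypothetical.

```haskell
-- Illustrative re-implementation of span; mySpan is a made-up name.
mySpan :: (a -> Bool) -> [a] -> ([a], [a])
mySpan _ [] = ([], [])
mySpan p xs@(x:rest)
  | p x       = let (ys, zs) = mySpan p rest in (x : ys, zs)  -- keep extending the prefix
  | otherwise = ([], xs)  -- stop at the first element that fails the predicate
```

With this definition, mySpan (>3) [1,2,4,5,6] stops immediately at 1, which is why the first component is empty.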
What you describe here is partition :: (a -> Bool) -> [a] -> ([a], [a]). This is a function that for a given predicate will take a list, and make a 2-tuple where the first item is a list with items that satisfy the predicate, and the second item a list of items that do not satisfy the predicate. Indeed:
Prelude Data.List> partition (>3) [1,2,4,5,6]
([4,5,6],[1,2])
span :: (a -> Bool) -> [a] -> ([a], [a]) on the other hand makes a 2-tuple where the first item is the longest prefix of elements in the list that satisfy the predicate, and the second item is the list of remaining elements. Since for span (>3) [1,2,4,5,6] the first item does not satisfy the predicate, the longest prefix is the empty list [], and all elements of the given list appear in the second item.
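A small side-by-side sketch of the two functions on the list from the question. The names sample, byPartition and bySpan are made up for this example.

```haskell
import Data.List (partition)

-- Sample input from the question (the name is hypothetical).
sample :: [Int]
sample = [1, 2, 4, 5, 6]

-- partition inspects every element of the list.
byPartition :: ([Int], [Int])
byPartition = partition (> 3) sample

-- span stops at the first element that fails the predicate.
bySpan :: ([Int], [Int])
bySpan = span (> 3) sample
```

Here byPartition is ([4,5,6],[1,2]) and bySpan is ([],[1,2,4,5,6]).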
Basically, span's complement is break :: (a -> Bool) -> [a] -> ([a], [a]): it splits the list at the first element that satisfies the predicate, so break p behaves like span (not . p). You might need to read that part of the docs twice to see the subtle difference between break and span.
break, applied to a predicate p and a list xs, returns a tuple where
first element is longest prefix (possibly empty) of xs of elements
that do not satisfy p and second element is the remainder of the list:
So coming back to your question
λ> break (>3) [1,2,4,5,6]
([1,2],[4,5,6])
You may of course swap the tuple with swap :: (a, b) -> (b, a) from Data.Tuple if that's essential, e.g. swap (break (>3) [1,2,4,5,6]).
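The relationship between break and span, and the swapped result, can be sketched as follows. The helper names breakViaSpan and swappedBreak are hypothetical.

```haskell
import Data.Tuple (swap)

-- break expressed through span with a negated predicate (hypothetical name).
breakViaSpan :: (a -> Bool) -> [a] -> ([a], [a])
breakViaSpan p = span (not . p)

-- Puts the part starting at the first match first and the skipped prefix second.
swappedBreak :: (a -> Bool) -> [a] -> ([a], [a])
swappedBreak p xs = swap (break p xs)
```

For the list in the question, swappedBreak (>3) [1,2,4,5,6] gives ([4,5,6],[1,2]). This matches partition here only because every element after the first match also satisfies the predicate.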
Another way to view span is as the combination of takeWhile and dropWhile: its result is the pair of what each of them returns for the same predicate and list.
And what are these two functions?
takeWhile according to the documentation
takeWhile, applied to a predicate p and a list xs, returns the longest prefix (possibly empty) of xs of elements that satisfy p:
Basically, it keeps returning elements from the list as long as the predicate is true. This means that if the first element fails the predicate, takeWhile returns [].
dropWhile on the other hand, according to the documentation
dropWhile p xs returns the suffix remaining after takeWhile p xs
Basically, it keeps skipping elements from the list while the predicate returns true. Once the predicate returns false, the remaining elements of the list, starting with the one that failed, are returned.
To check the statement above, that span is the application of both takeWhile and dropWhile, we can apply these functions independently and compare their results.
Using just takeWhile:
*Main Data.List> takeWhile (>3) [1,2,4,5,6]
[]
Using just dropWhile:
*Main Data.List> dropWhile (>3) [1,2,4,5,6]
[1,2,4,5,6]
Now using span
*Main Data.List> span (>3) [1,2,4,5,6]
([],[1,2,4,5,6])
The result confirms span is applying both takeWhile and dropWhile
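The same check can be written down as code. This is a sketch with hypothetical names (spanViaWhile, agrees), not how the library necessarily implements span.

```haskell
-- span rebuilt from takeWhile and dropWhile (hypothetical name).
spanViaWhile :: (a -> Bool) -> [a] -> ([a], [a])
spanViaWhile p xs = (takeWhile p xs, dropWhile p xs)

-- True when the rebuilt version agrees with the Prelude's span on the given input.
agrees :: (Int -> Bool) -> [Int] -> Bool
agrees p xs = spanViaWhile p xs == span p xs
```

In GHCi, agrees (>3) [1,2,4,5,6] and agrees (<3) [1,2,4,5,6] should both evaluate to True.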
Note that what you describe here:
I think that the span function in Haskell applies a predicate to a list and returns a tuple where the first element is the elements of the list that satisfy that predicate and the second element is the remainder of the list
Is the partition function as can be seen from the documentation
Given this definition of a function f:
f :: [Int] -> [Int]
f [] = []
f (x:xs) = x:[]
I would assume that a call such as
f [1]
would not match, because the pattern (x:xs) only matches, if there are more elements xs after the x in the list, which is not the case for the list [1]. Or is it?
If you write [x], a list with one element, this is short for (x : []), or even more verbose (:) x []. So it is a "cons" ((:)) with x as element, and the empty list as tail.
So your function f (x:xs) will indeed match a list with one (or more) elements. For a list with one element, x will be the element, and xs an empty list.
would not match, because the pattern (x:xs) only matches, if there are more elements xs after the x in the list, which is not the case for the list [1].
No, the pattern (x:xs) matches every non-empty list, with x the first element of the list and xs a (possibly empty) list of the remaining elements.
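To see what x and xs are bound to, here is a small sketch; the function describe is made up for illustration.

```haskell
-- Reports which clause matched and what the pattern variables hold.
describe :: [Int] -> String
describe []     = "empty"
describe (x:xs) = "head " ++ show x ++ ", tail " ++ show xs
```

For a one-element list, describe [1] yields "head 1, tail []", so the (x:xs) clause does match, with xs bound to the empty list.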
If you want to match only lists with, for example, two or more elements, you can use:
-- two or more elements
f (x1 : x2 : xs) = …
Here x1 and x2 will match the first and second item of the list respectively, and xs is a list that contains the remaining elements.
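A complete function using that pattern might look like this. It is only a sketch: firstTwoSum and its Maybe result are assumptions made for the example.

```haskell
-- Sums the first two elements when the list has at least two of them.
firstTwoSum :: [Int] -> Maybe Int
firstTwoSum (x1 : x2 : _) = Just (x1 + x2)
firstTwoSum _             = Nothing  -- empty or one-element list
```

Here firstTwoSum [1] falls through to the second clause, while firstTwoSum [1,2] and firstTwoSum [1,2,3] both match the first.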
EDIT: to answer your comments:
I wonder why my function definition does even compile in the first place, because the function type is [Int] -> [Int], so if I give it the empty list, then that is not [Int] as a result, is it?
The empty list [] is one of the data constructors of the [a] type, so that means that [] has type [] :: [a]. It can match the type variable a with Int, and thus [] can have type [] :: [Int].
Second then, how do I match a list with exactly two elements? [a, b]?
You can match such list with:
f (a : b : []) = …
or you can match it with:
f [a, b] = …
The two are equivalent. [a, b] is syntactic sugar: the compiler replaces it with (a : b : []), but for humans it is of course more convenient to work with [a, b].
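Both spellings can be compared directly. The two predicates below are hypothetical names for this sketch.

```haskell
-- Bracket (sugared) pattern: exactly two elements.
exactlyTwoSugar :: [Int] -> Bool
exactlyTwoSugar [_, _] = True
exactlyTwoSugar _      = False

-- The same pattern written with explicit cons cells.
exactlyTwoCons :: [Int] -> Bool
exactlyTwoCons (_ : _ : []) = True
exactlyTwoCons _            = False
```

Both functions return True for [1,2] and False for [], [1] and [1,2,3].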
I am reading Bratko's Prolog: Programming for Artificial Intelligence. The easiest way for me to understand lists is visualising them as binary trees, which goes well. However, I am confused about the empty list []. It seems to me that it has two meanings.
When part of a list or enumeration, it is seen as an actual (empty) list element (because somewhere in the tree it is part of some Head), e.g. [a, []]
When it is the only item inside a Tail, it isn’t an element it literally is nothing, e.g. [a|[]]
My issue is that I do not see the logic behind 2. Why is it required for lists to have this possible ‘nothingness’ as a final tail? Simply because the trees have to be binary? Or is there another reason? (In other words, why is [] counted as an element in 1. but it isn't when it is in a Tail in 2?) Also, are there cases where the final (rightmost, deepest) final node of a tree is not ‘nothing’?
In other words, why is [] counted as an element in 1. but it isn't when it is in a Tail in 2?
Those are two different things. Lists in Prolog are (degenerate) binary trees, but also very much like a singly linked list in a language that has pointers, say C.
In C, you would have a struct with two members: the value, and a pointer to the next list element. Importantly, when the pointer to next points to a sentinel, this is the end of the list.
In Prolog, you have a functor with arity 2: ./2 that holds the value in the first argument, and the rest of the list in the second:
.(a, Rest)
The sentinel for a list in Prolog is the special []. This is not a list, it is the empty list! Traditionally, it is an atom, or a functor with arity 0, if you wish.
In your question:
[a, []] is actually .(a, .([], []))
[a|[]] is actually .(a, [])
which is why:
?- length([a,[]], N).
N = 2.
This is now a list with two elements, the first element is a, the second element is the empty list [].
?- [a|[]] = [a].
true.
This is a list with a single element, a. The [] at the tail just closes the list.
Question: what kind of list is .([], [])?
Also, are there cases where the final (rightmost, deepest) final node of a tree is not ‘nothing’?
Yes, you can leave a free variable there; then, you have a "hole" at the end of the list that you can fill later. Like this:
?- A = [a, a|Tail], % partial list with two 'a's and the Tail
B = [b,b], % proper list
Tail = B. % the tail of A is now B
A = [a, a, b, b], % we appended A and B without traversing A
Tail = B, B = [b, b].
You can also make circular lists, for example, a list with infinitely many x in it would be:
?- Xs = [x|Xs].
Xs = [x|Xs].
Is this useful? I don't know for sure. You could for example get a list that repeats a, b, c with a length of 7 like this:
?- ABCs = [a,b,c|ABCs], % a list that repeats "a, b, c" forever
length(L, 7), % a proper list of length 7
append(L, _, ABCs). % L is the first 7 elements of ABCs
ABCs = [a, b, c|ABCs],
L = [a, b, c, a, b, c, a].
In R at least many functions "recycle" shorter vectors, so this might be a valid use case.
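For comparison only, a lazy language can express the same "repeat and cut to length" idea without unification. The Haskell sketch below is an analogy under that assumption, not a translation of the Prolog query; the name firstSeven is made up.

```haskell
-- cycle builds an endlessly repeating list; take cuts off a finite prefix.
firstSeven :: String
firstSeven = take 7 (cycle "abc")
```

Here firstSeven is "abcabca", the counterpart of L = [a, b, c, a, b, c, a].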
See this answer for a discussion on difference lists, which is what A and Tail from the example with the free variable above are usually called.
See this answer for implementation of a queue using difference lists.
Your confusion comes from the fact that lists are printed (and read) according to a special human-friendly format. Thus:
[a, b, c, d]
... is syntactic sugar for .(a, .(b, .(c, .(d, [])))).
The . functor holds two values: the item stored in the list and the sublist that follows it. When [] appears in the item (first) argument, it is printed as data.
In other words, this:
[[], []]
... is syntactic sugar for .([], .([], [])).
The last [] is not printed because in that context it does not need to be. It only marks the end of the current list. The other [] are lists stored in the main list.
I understand that but I don't quite get why there is such a need for that final empty list.
The final empty list is a convention. It could be written empty or nil (like Lisp), but in Prolog this is denoted by the [] atom.
Note that in Prolog, you can leave the sublist part uninstantiated, as in:
[a | T]
which is the same as:
.(a, T)
Those are known as partial lists; difference lists are built from them.
Your understanding of 1. and 2. is correct -- where by "nothing" you mean, element-wise. Yes, an empty list has nothing (i.e. no elements) inside it.
The logic behind having a special sentinel value SENTINEL = [] to mark the end of a cons-cell chain, as in [1,2,3] = [1,2|[3]] = [1,2,3|SENTINEL] = .(1,.(2,.(3,SENTINEL))), as opposed to some ad-hoc encoding, like .(1,.(2,3)) = [1,2|3], is type consistency. We want the first field of a cons cell (or, in Prolog, the first argument of a . functored term) to always be treated as "a list's element", and the second -- as "a list". That's why [] in [1, []] counts as a list's element (as it appears as a 1st argument of a .-functored compound term), while the [] in [1 | []] does not (as it appears as a 2nd argument of such a term).
Yes, the trees have to be binary -- i.e. the functor . as used to encode lists is binary -- and so what should we put there in the final node's tail field, that would signal to us that it is in fact the final node of the chain? It must be something, consistent and easily testable. And it must also represent the empty list, []. So it's only logical to use the representation of an empty list to represent the empty tail of a list.
And yes, having a non-[] final "tail" is perfectly valid, like in [1,2|3], which is a perfectly valid Prolog term -- it just isn't a representation of a list {1 2 3}, as understood by the rest of Prolog's built-ins.
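The cons-cell-plus-sentinel encoding can also be modelled in a typed language. The Haskell sketch below is an analogy, with hypothetical names (List, Nil, Cons, toList, oneTwoThree): Nil plays the role of [] and Cons the role of the . functor.

```haskell
-- A hand-rolled list type: every chain of Cons cells ends in the Nil sentinel.
data List a = Nil | Cons a (List a)

-- Walk the chain of cells until the sentinel is reached.
toList :: List a -> [a]
toList Nil         = []
toList (Cons x xs) = x : toList xs

-- Counterpart of [1,2,3] = .(1,.(2,.(3,[])))
oneTwoThree :: List Int
oneTwoThree = Cons 1 (Cons 2 (Cons 3 Nil))
```

In this model the second field of Cons must itself be a List, so the counterpart of [1,2|3], Cons 1 (Cons 2 3), is rejected by the type checker, whereas Prolog accepts the term and merely does not treat it as a list.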
I've only been at Haskell for two days now, and was wondering what the difference between the two function definitions below is:
Prelude> let swap (x1:x2:xs) = x2:x1:xs
Prelude> swap [1..5]
[2,1,3,4,5]
Prelude> let swap' (x1:x2:xs) = [x2] ++ [x1] ++ xs
Prelude> swap' [1..5]
[2,1,3,4,5]
That is, what makes x2:x1:xs different from [x2] ++ [x1] ++ xs ?
Please and thanks.
The type signatures are a good place to start:
(:) :: a -> [a] -> [a]
(++) :: [a] -> [a] -> [a]
You can find these out with :type (:) and :type (++) in ghci.
As you can see from the type signatures, both are used to produce lists.
The : operator is used to construct lists (and to take them apart again in pattern matching). To make a list [1,2,3] you just build it up with 1 : 2 : 3 : []. The first argument of : is the item to add to the front of the list, and the second argument is a list (either one built up with : or the empty list, signified by []).
The ++ operator is list concatenation. It takes two lists and appends them together. [1,2,3] ++ [4,5,6] is legal, whereas 1 ++ [1,2,3] is not.
This has nothing to do with syntax. (:) and (++) are just different operators. (:) is a constructor that builds a list from an element and another list. (++) makes a new list that is the concatenation of two lists. Because (++) is not a constructor, you can't use it in patterns.
Now we come to Syntax: the notation
[x2]
that you use is a shorthand for
x2:[]
So what you really have done in the second example is:
(x2:[]) ++ (x1:[]) ++ xs
Therefore, when constructing a list, you can't avoid (:); it is ultimately the only way to do it. Note that you must construct intermediate lists to be able to use (++).
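The two definitions from the question can be put side by side with the sugar removed. This is a sketch: the names swapCons and swapAppend are hypothetical, and the fallback clause for lists shorter than two elements is an assumption added so the functions are total.

```haskell
-- Built with the list constructor only.
swapCons :: [a] -> [a]
swapCons (x1:x2:xs) = x2 : x1 : xs
swapCons xs         = xs  -- assumption: leave shorter lists unchanged

-- Built by concatenating two intermediate one-element lists with the rest.
swapAppend :: [a] -> [a]
swapAppend (x1:x2:xs) = (x2 : []) ++ (x1 : []) ++ xs
swapAppend xs         = xs
```

Both return [2,1,3,4,5] for [1..5]; the difference lies only in the intermediate lists that the (++) version constructs.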
head' :: [a] -> a
head' [] = error "No head for empty lists!"
head' (x:_) = x
head' :: [a] -> a
head' xs = case xs of
  [] -> error "No head for empty lists!"
  (x:_) -> x
I am asking a fairly easy question about something I don't understand.
In the code above, I see that it takes a list for an input.
But on the third line, it says (x:_) which confuses me.
Can anyone explain to me why they wrote (x:_) instead of [x:_]?
And plus, I don't understand what (x:_) means.
Thank you.
: is a constructor for lists, which takes the head of the new list as its left argument and the tail as its right argument. If you use it as a pattern like here that means that the head of the list you match is given to the left pattern and the tail to the right.
So in this case the head of the list is stored in the variable x and the tail is not used (_ means that you don't care about the value).
And yes, you can also use [] to pattern match against lists, but only lists of fixed size. For example the pattern [x] matches a list with exactly one element, which is then stored in the variable x. Likewise [x,y] would match a list with two elements.
Your proposed pattern [x:y] would thus match a list with one element, which matches the pattern x:y. In other words, it would match a list of lists which contains exactly one list.
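The difference between the two patterns shows up in the types. The sketch below uses hypothetical names (innerHead, outerHead) and returns Maybe so that no clause has to call error.

```haskell
-- [x:_] matches a list of lists that holds exactly one non-empty list.
innerHead :: [[a]] -> Maybe a
innerHead [x:_] = Just x
innerHead _     = Nothing

-- (x:_) matches any non-empty list.
outerHead :: [a] -> Maybe a
outerHead (x:_) = Just x
outerHead _     = Nothing
```

So innerHead [[1,2,3]] gives Just 1 while innerHead [[1],[2]] gives Nothing, and outerHead [1,2,3] gives Just 1.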
This is a concept called pattern matching. : is an infix constructor, just like + is an infix function. In Haskell you pattern match with constructors.
(1 : 2 : 3 : [])
is the same as [1, 2, 3]; the square bracket notation is just syntactic sugar for creating lists.
Your pattern (x : _) means that you want to bind the first element of the list to x and that you do not care about the rest of the list _.