What is a "Test succeeded with choicepoint" warning in PL-Unit, and how do I fix it?

I'm writing a Prolog program to check if a variable is an integer.
The way I'm "returning" the result is strange, but I don't think it's important for answering my question.
The Tests
I've written passing unit tests for this behaviour; here they are...
foo_test.pl
:- begin_tests('foo').
:- consult('foo').
test('that_1_is_recognised_as_int') :-
    count_ints(1, 1).
test('that_atom_is_not_recognised_as_int') :-
    count_ints(arbitrary, 0).
:- end_tests('foo').
:- run_tests.
The Code
And here's the code that passes those tests...
foo.pl
count_ints(X, Answer) :-
    integer(X),
    Answer is 1.
count_ints(X, Answer) :-
    \+ integer(X),
    Answer is 0.
The Output
The tests are passing, which is good, but I'm receiving a warning when I run them. Here is the output when running the tests...
?- ['foo_test'].
% foo compiled into plunit_foo 0.00 sec, 3 clauses
% PL-Unit: foo
Warning: /home/brandon/projects/sillybin/prolog/foo_test.pl:11:
/home/brandon/projects/sillybin/prolog/foo_test.pl:4:
PL-Unit: Test that_1_is_recognised_as_int: Test succeeded with choicepoint
. done
% All 2 tests passed
% foo_test compiled 0.03 sec, 1,848 clauses
true.
I'm using SWI-Prolog (Multi-threaded, 64 bits, Version 6.6.6)
I have tried combining the two count_ints predicates into one, using ;, but it still produces the same warning.
I'm on Debian 8 (I doubt it makes a difference).
The Question(s)
What does this warning mean? And...
How do I prevent it?

First, let us forget the whole testing framework and simply consider the query on the toplevel:
?- count_ints(1, 1).
true ;
false.
This interaction tells you that after the first solution, a choice point is left. This means that alternatives are left to be tried, and they are tried on backtracking. In this case, there are no further solutions, but the system was not able to tell this before actually trying them.
Using all/1 option for test cases
There are several ways to fix the warning. A straightforward one is to state the test case like this:
test('that_1_is_recognised_as_int', all(Count = [1])) :-
    count_ints(1, Count).
This implicitly collects all solutions, and then makes a statement about all of them at once.
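If the choice point is intentional and you only want to silence the warning, PL-Unit also accepts a nondet option. What follows is a minimal sketch, assuming the two-clause count_ints/2 from the question sits in the same file; the unit name foo_nondet is made up for illustration.

```prolog
:- use_module(library(plunit)).

% count_ints/2 exactly as in the question: it leaves a choice point.
count_ints(X, Answer) :-
    integer(X),
    Answer is 1.
count_ints(X, Answer) :-
    \+ integer(X),
    Answer is 0.

:- begin_tests(foo_nondet).

% nondet: a leftover choice point is not reported for this test.
test(that_1_is_recognised_as_int, [nondet]) :-
    count_ints(1, 1).

% all/1: collect every answer and compare the whole list.
test(all_answers_for_1, all(Count == [1])) :-
    count_ints(1, Count).

:- end_tests(foo_nondet).
```

Running ?- run_tests(foo_nondet). should then report both tests as passed without the choicepoint warning. Note that nondet only hides the symptom, whereas the options below remove the choice point itself.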
Using if-then-else
A somewhat more intelligent solution is to make count_ints/2 itself deterministic!
One way to do this is using if-then-else, like this:
count_ints(X, Answer) :-
    (   integer(X) -> Answer = 1
    ;   Answer = 0
    ).
We now have:
?- count_ints(1, 1).
true.
i.e., the query now succeeds deterministically.
Pure solution: Clean data structures
However, the most elegant solution is to use a clean representation, so that you and the Prolog engine can distinguish all cases by pattern matching.
For example, we could represent integers as i(N), and everything else as other(T).
In this case, I am using the wrappers i/1 and other/1 to distinguish the cases.
Now we have:
count_ints(i(_), 1).
count_ints(other(_), 0).
And the test cases could look like:
test('that_1_is_recognised_as_int') :-
    count_ints(i(1), 1).
test('that_atom_is_not_recognised_as_int') :-
    count_ints(other(arbitrary), 0).
This also runs without warnings, and has the significant advantage that the code can actually be used for generating answers:
?- count_ints(Term, Count).
Term = i(_1900),
Count = 1 ;
Term = other(_1900),
Count = 0.
In comparison, we have with the other versions:
?- count_ints(Term, Count).
Count = 0.
which, unfortunately, can at best be considered to cover only 50% of the possible cases...
Tighter constraints
As Boris correctly points out in the comments, we can make the code even stricter by constraining the argument of i/1 terms to integers. For example, we can write:
:- use_module(library(clpfd)). % provides (in)/2 and inf..sup
count_ints(i(I), 1) :- I in inf..sup.
count_ints(other(_), 0).
Now, the argument must be an integer, which becomes clear by queries like:
?- count_ints(X, 1).
X = i(_1820),
_1820 in inf..sup.
?- count_ints(i(any), 1).
ERROR: Type error: `integer' expected, found `any' (an atom)
Note that the example Boris mentioned fails also without such stricter constraints:
?- count_ints(X, 1), X = anything.
false.
Still, it is often useful to add further constraints on arguments, and if you need to reason over integers, CLP(FD) constraints are often a good and general solution to explicitly state type constraints that are otherwise only implicit in your program.
Note that integer/1 did not get the memo:
?- X in inf..sup, integer(X).
false.
This shows that, although X is without a shadow of a doubt constrained to integers in this example, integer(X) still does not succeed. Thus, you cannot use predicates like integer/1 etc. as a reliable detector of types. It is much better to rely on pattern matching and using constraints to increase the generality of your program.

First things first: the documentation of the SWI-Prolog Prolog Unit Tests package is quite good. The different modes are explained in Section 2.2. Writing the test body. The relevant sentence in 2.2.1 is:
Deterministic predicates are predicates that must succeed exactly once and, for well behaved predicates, leave no choicepoints. [emphasis mine]
What is a choice point?
In procedural programming, when you call a function, it can return a value, or a set of values; it can modify state (local or global); whatever it does, it will do it exactly once.
In Prolog, when you evaluate a predicate, a proof tree is searched for solutions. It is possible that there is more than one solution! Say you use between/3 like this:
For x = 1, is x in [0, 1, 2]?
?- between(0, 2, 1).
true.
But you can also ask:
Enumerate all x such that x is in [0, 1, 2].
?- between(0, 2, X).
X = 0 ;
X = 1 ;
X = 2.
After you get the first solution, X = 0, Prolog stops and waits; this means:
The query between(0, 2, X) has at least one solution, X = 0. It might have further solutions; press ; and Prolog will search the proof tree for the next solution.
The choice point is the mark that Prolog puts in the search tree after finding a solution. It will resume the search for the next solution from that mark.
The warning "Test succeeded with choicepoint" means:
The solution Prolog found was the solution the test expected; however, it leaves behind a choice point, so it is not "well-behaved".
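One way to observe a leftover choice point from your own code is the call_cleanup/2 idiom: the cleanup goal runs as soon as the goal has no alternatives left. Here is a minimal sketch for SWI-Prolog; goal_determinism/2 is a hypothetical helper name, not part of PL-Unit.

```prolog
% goal_determinism(:Goal, -Det)
% Hypothetical helper: Det is det if the first solution of Goal leaves
% no choice point behind, and nondet otherwise. Fails if Goal fails.
goal_determinism(Goal, Det) :-
    call_cleanup(Goal, Flag = true),
    (   var(Flag)
    ->  Det = nondet      % cleanup has not run yet: a choice point remains
    ;   Det = det         % cleanup already ran: Goal exited deterministically
    ),
    !.                    % discard any remaining alternatives of Goal
```

For example, ?- goal_determinism(between(0, 2, 1), Det). gives Det = det, while ?- goal_determinism(between(0, 2, X), Det). gives X = 0 and Det = nondet, matching the two between/3 queries above.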
Are choice points a problem?
Choice points you didn't put there on purpose could be a problem. Without going into detail, they can prevent certain optimizations and create inefficiencies. That's kind of OK, but sometimes only the first solution is the solution you (the programmer) intended, and a next solution can be misleading or wrong. Or, famously, after giving you one useful answer, Prolog can go into an infinite loop.
Again, this is fine if you know it: you just never ask for more than one solution when you evaluate this predicate. You can wrap it in once/1, like this:
?- once( between(0, 2, X) ).
or
?- once( count_ints(X, Answer) ).
If someone else uses your code, though, all bets are off. Succeeding with a choice point can mean anything from "there are other useful solutions" to "no more solutions, this will now fail" to "other solutions, but not the kind you wanted" to "going into an infinite loop now!"
Getting rid of choice points
To the particular example: You have a built-in, integer/1, which will succeed or fail without leaving choice points. So, these two clauses from your original definition of count_ints/2 are mutually exclusive for any value of X:
count_ints(X, Answer) :-
integer(X), ...
count_ints(X, Answer) :-
\+ integer(X), ...
However, Prolog doesn't know that. It only looks at the clause heads and those two are identical:
count_ints(X, Answer) :- ...
count_ints(X, Answer) :- ...
The two heads are identical, and Prolog doesn't look any further than the clause head to decide whether the other clause is worth trying. So it tries the second clause even if the first argument is indeed an integer (this is the "choice point" in the warning you get), and that attempt invariably fails.
Since you know that the two clauses are mutually exclusive, it is safe to tell Prolog to forget about the other clause. You can use once/1, as shown above. You can also cut the remainder of the proof tree when the first argument is indeed an integer:
count_ints(X, 1) :- integer(X), !.
count_ints(_, 0).
Exactly the same operational semantics, but maybe easier for the Prolog compiler to optimize:
count_ints(X, Answer) :-
    (   integer(X)
    ->  Answer = 1
    ;   Answer = 0
    ).
... as in the answer by mat. As for using pattern matching, it's all good, but if the X comes from somewhere else, and not from the code you have written yourself, you will still have to make this check at some point. You end up with something like:
variable_tagged(X, T) :-
    (   integer(X) -> T = i(X)
    ;   float(X)   -> T = f(X)
    ;   atom(X)    -> T = a(X)
    ;   var(X)     -> T = v(X)
    % and so on
    ;   T = other(X)
    ).
At that point you can write your count_ints/2 as suggested by mat, and Prolog will know by looking at the clause heads that your two clauses are mutually exclusive.
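The two pieces can be combined as in the following sketch. It assumes only two tags are needed (i/1 and other/1), and count_ints_untagged/2 is a hypothetical wrapper name introduced here for illustration.

```prolog
% Tag a term once, at the boundary where outside data enters.
% This sketch keeps only the two tags needed for counting integers.
variable_tagged(X, T) :-
    (   integer(X) -> T = i(X)
    ;   T = other(X)
    ).

% The clause heads now differ in their first argument, so Prolog can
% tell them apart without trying both.
count_ints(i(_), 1).
count_ints(other(_), 0).

% Hypothetical convenience wrapper for untagged input.
count_ints_untagged(X, Count) :-
    variable_tagged(X, T),
    count_ints(T, Count).
```

With this, ?- count_ints_untagged(1, C). and ?- count_ints_untagged(arbitrary, C). each give a single answer, and the impure type check is confined to one place.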
I once asked a question that boils down to the same Prolog behaviour and how to deal with it. The answer by mat recommends the same approach. The comment by mat to my comment below the answer is just as important as the answer itself (if you are writing real programs at least).

Related

Checking if the difference between consecutive elements is the same

I am new to using arithmetic in Prolog.
I’ve done a few small programs, but mostly involving logic. I am trying to implement a function that will return true or false if the difference between every consecutive pair of elements is the same or not.
My input would look like this: sameSeqDiffs([3, 5, 7, 9], 2)
I feel like I need to split the first two elements from the list, find their difference, and add the result to a new list. Once all the elements have been processed, check if the elements of the new list are all the same.
I’ve been taught some Prolog with building relationships and querying those, but this doesn’t seem to fit in with Prolog.
Update1: This is what I've come up with so far. I am brand new to this syntax and am still getting an error on my code, but I hope it conveys the general idea of what I'm trying to do.
diff([X,Y|Rest], Result):-
diff([Y,Z|Rest], Result2):-
Result2 = Result,
Z - Y = Result.
Update2: I know I still have much to do on this code, but here is where I will remain until this weekend, as I have some other stuff to do. I think I understand the logic of it a bit more, and I think I need to figure out how to run the last line of the function only if there are at least two more things in the rest of the list to process.
diff([X,Y|Rest], Result):-
number(Y),
Y-X=Result,
diff([Rest], Result).
Update3: I believe I have the function the way I want it to. The only quirk I noticed is that when I run an input like sameSeqDiffs([3,5,7],2). I get true returned, immediately followed by a false. Is this the correct operation or am I still missing something?
sameSeqDiffs([X,Y], Result):-
A is Y - X,
A = Result.
sameSeqDiffs([X,Y,Z|T], Result):-
sameSeqDiffs([Y,Z|T], Result).
Update 4: I posted a new question about this....here is the link: Output seems to only test the very last in the list for difference function
Prolog's syntax
The syntax is a bit off: normally a clause has a head like foo(X, Y, Z), then an arrow (:-), followed by a body. That body normally does not contain any further arrows (:-). So the second arrow :- does not make much sense.
Predicates and unification
Secondly, Prolog predicates have no input or output: a predicate is true or false (it can also raise an error or get stuck in an infinite loop, but that is typically behavior we want to avoid). It communicates answers by unifying variables. For example, a call sameSeqDiffs([3, 5, 7, 9], X). can succeed by unifying X with 2, and then the predicate, given it is implemented correctly, succeeds with X = 2.
Inductive definitions
In order to design a predicate, one typically first aims to come up with an inductive definition: a definition that consists of one or more base cases, and one or more "recursive" cases (where the predicate is defined in terms of itself).
For example here we can say:
(base case) For a list of exactly two elements [X, Y], the predicate sameSeqDiffs([X, Y], D) holds, given D is the difference between Y and X.
In Prolog this will look like:
sameSeqDiffs([X, Y], D) :-
___.
(with the ___ to be filled in).
Now for the inductive case we can define a sameSeqDiffs/2 in terms of itself, although not with the same parameters of course. In mathematics, one sometimes defines a function f such that for example f(i) = 2×f(i-1); with for example f(0) = 1 as base. We can in a similar way define an inductive case for sameSeqDiffs/2:
(inductive case) For a list of more than two elements, all elements in the list have the same difference, given the first two elements have a difference D, and in the list of elements except the first element, all elements have that difference D as well.
In Prolog this will look like:
sameSeqDiffs([X, Y, Z|T], D) :-
___,
sameSeqDiffs(___, ___).
Arithmetic in Prolog
A common mistake people who start programming in Prolog make is to think that, as is common in many programming languages, Prolog attaches semantics to certain functors.
For example, one might think that A - 1 will decrement A. For Prolog, however, this is just the term -(A, 1); it is not minus or anything else, just a functor. As a result, Prolog will not evaluate such expressions. So if you write X = A - 1, then X is simply bound to the term -(A, 1).
Then how can we perform numerical operations? Prolog systems have a predicate is/2 that evaluates its right-hand side by attaching semantics to it. So is/2 interprets the (+)/2, (-)/2, etc. functors ((+)/2 as plus, (-)/2 as minus, etc.).
So we can evaluate an expression like:
A = 4, is(X, A - 1).
and then X will be bound to 3, not to 4-1. Prolog also allows is to be written infix, like:
A = 4, X is A - 1.
Here you will need this to calculate the difference between two elements.
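Putting the inductive definition and is/2 together, one possible way to fill in the blanks looks like the sketch below. It assumes the list contains numbers and has at least two elements, as in the base case stated above.

```prolog
% Base case: exactly two elements; D is their difference.
sameSeqDiffs([X, Y], D) :-
    D is Y - X.
% Inductive case: the first two elements differ by D, and so does
% every consecutive pair in the list without its first element.
sameSeqDiffs([X, Y, Z|T], D) :-
    D is Y - X,
    sameSeqDiffs([Y, Z|T], D).
```

The query ?- sameSeqDiffs([3, 5, 7, 9], D). then binds D to 2, while ?- sameSeqDiffs([3, 5, 8], 2). fails because the second pair differs by 3.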
You were very close with your second attempt. It should have been
samediffs( [X, Y | Rest], Result):-
    Result is Y - X,
    samediffs( [Y | Rest], Result).
And you don't even need "to split the first two elements from the list". This will take care of itself.
How? Simple: calling samediffs( List, D), on the first entry into the predicate, the not yet instantiated D = Result will be instantiated to the calculated difference between the second and the first element in the list by the call Result is Y - X.
On each subsequent entry into the predicate, which is to say, for each subsequent pair of elements X, Y in the list, the call Result is Y - X will calculate the difference for that pair, and will check the numerical equality for it and Result which at this point holds the previously calculated value.
In case they aren't equal, the predicate will fail.
In case they are, the recursion will continue.
The only thing missing is the base case for this recursion:
samediffs( [_], _Result).
samediffs( [], _Result).
In case it was a singleton (or even empty) list all along, this will leave the differences argument _Result uninstantiated. It can be interpreted as a checking predicate in such a case. There are certainly no unequal differences between elements in a singleton (or, even more so, empty) list.
In general, ......
recursion(A, B):- base_case( A, B).
recursion( Thing, NewThing):-
combined( Thing, Shell, Core),
recursion( Core, NewCore),
combined( NewThing, Shell, NewCore).
...... Recursion!

Prolog union fails

I'm trying to understand the use of union (the built in predicate) in Prolog. In many cases it seems to fail when it should succeed. It seems it has something to do with the order of the elements of the lists. All of the below cases fail (they come back with "false.").
?- union([1,2,3],[],[2,3,1]).
?- union([1,5,3], [1,2], [1,5,3,2]).
?- union([4,6,2,1], [2], [1,2,4,6]).
?- union([1,2], [], [2,1]).
Shouldn't all of these be true? Any explanation as to why these cases keep failing would be very helpful.
Also: Why does the below not succeed and find the correct list for A?
?- union([1,5,3], A, [4,1,5,3,2]). /** comes back with "fail." */
There are a couple of issues here. Declarative and procedural ones. Let's start with the declarative ones, they are really sitting a bit deeper. The procedural aspects can be handled easily with appropriate programming techniques, as in this answer.
When we consider declarative properties of a predicate, we consider its set of solutions. So we pretend that all we care about is what solutions the predicate will describe. We will completely ignore how all of this is implemented. For very simple predicates, that's a simple enumeration of facts - just like a database table. It is all obvious in such situations. It becomes much more unintuitive if the set of solutions is infinite. And this happens so easily. Think of the query
?- length(Xs,1).
This harmless looking query asks for all lists of length one. All of them! Let me count - that's infinitely many!
Before we look at the actual answer Prolog produces, think what you would do in such a situation. How would you answer that query? Some of my feeble attempts
?- length(Xs,1).
Xs = [1]
; Xs = [42]
; Xs = [ben+jerry]
; Xs = [feel([b,u,r,n])]
; Xs = [cromu-lence]
; Xs = [[[]]]
; ... . % I am running out of imagination
Should Prolog produce all those infinitely many values? How much time would this take? How much time do you have to stare at walls of text? Your lifetime is clearly not enough.
Taming the number of solutions, from solutions to answers
There is a way out: The logic variable!
?- length(Xs, 1).
Xs = [_A].
% ^^
This little _A permits us to collapse all strange solutions into a single answer!
So here we really had a lot of luck: we tamed the infinity with this nice variable.
Now back to your relation. There, we want to represent sets as lists. Lists are clearly not sets per se. Consider the list [a,a] and the list [a]. While they are different, they are meant to represent the same set. Think of it: How many alternate representations are there for [a]? Yep, infinitely many. But now, the logic variable cannot help us to represent all of them compactly (see note 1 at the end). Thus we have to enumerate them one-by-one. But if we have to enumerate all those answers, practically all queries will not terminate due to infinitely many solutions to enumerate explicitly. OK, some still will:
?- union([], [], Xs).
Xs = [].
And all ground queries. And all failing queries. But once we have a variable like
?- union([a], [], Xs).
Xs = [a]
; Xs = [a,a]
; Xs = [a,a,a]
; ... .
we already are deep into non-termination.
So given that, we have to make some decisions. We somehow need to tame that infinity. One idea is to consider a subset of the actual relation that leans somehow to a side. If we want to ask questions like union([1,2],[3,4], A3) then it is quite natural to impose a subset where we have this functional dependency
A1, A2 → A3
With this functional dependency we now determine exactly one value for A3 for each pair of A1, A2. Here are some examples:
?- union([1,5,3], [1,2], A3).
A3 = [5,3,1,2].
?- union([1,2,3], [], A3).
A3 = [1,2,3].
Note that Prolog always puts a . at the end. That means Prolog says:
Dixi! I have spoken. There are no more solutions.
(Other Prologs will moan "No" at the end.) As a consequence, the queries (from your comments) now fail:
?- union([1,5,3], [1,2], [1,5,3,2]).
false.
?- union([1,2,3],[],[2,3,1]).
false.
So imposing that functional dependency now restricts the set of solutions drastically. And that restriction was an arbitrary decision of the implementer. It could have been different! Sometimes, duplicates are removed, sometimes not. If A1 and A2 both are duplicate free lists, the result A3 will be duplicate free, too.
After looking into its implementation, the following seems to hold (you do not need to do this, the documentation should be good enough - well it isn't): The elements in the last argument are structured as follows and in that order:
The elements of A1 that do not occur in A2, too. In the relative order of A1.
All elements of A2 in their original order.
So with this functional dependency further properties have been sneaked in. Such as that A2 is always a suffix of A3! Consequently the following cannot be true, because there is no suffix of A3 that would make this query true:
?- union([1,5,3], A2, [4,1,5,3,2]).
false.
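The ordering described above can be reproduced in a few lines. This is only a sketch of the observed behaviour for ground lists, not the source of library(lists); union_sketch/3 is a made-up name.

```prolog
% union_sketch(+A1, +A2, -A3)
% A3 is the elements of A1 that do not occur in A2 (in the relative
% order of A1), followed by all of A2 in its original order.
union_sketch([], Ys, Ys).
union_sketch([X|Xs], Ys, Zs) :-
    (   memberchk(X, Ys)
    ->  union_sketch(Xs, Ys, Zs)        % X is already in A2: skip it
    ;   Zs = [X|Zs1],                   % keep X, then continue
        union_sketch(Xs, Ys, Zs1)
    ).
```

Under this reading, ?- union_sketch([1,5,3], [1,2], A3). gives A3 = [5,3,1,2], and a query such as ?- union_sketch([1,5,3], [1,2], [1,5,3,2]). fails for the same reason as above: the third argument is a different list, even though it represents the same set.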
And there are even more irregularities that can be described on a declarative level. Often, for the sake of efficiency, relations are too general. Like:
?- union([],non_list,non_list).
Such concerns are often swept away by noting that we are only interested in goals with arguments that are either lists (like [a,b]) or partial lists (like [a,b|Xs]).
Anyway. We finally have now described all the declarative properties we expect. Now comes the next part: That relation should be implemented adequately! There again a new bunch of problems awaits us!
With library(lists) of SWI, I get:
?- union([1,2], [X], [1,2,3]).
false.
?- X = 3, union([1,2], [X], [1,2,3]).
X = 3.
Which is really incorrect: This can only be understood procedurally, looking at the actual implementation. This no longer is a clean relation. But this problem can be fixed!
You can avoid the correctness issues altogether by sticking to the pure, monotonic subset of Prolog. See above for more.
1) To tell the truth, it would be possible to represent that infinite set with some form of constraints. But the mere fact that there is not a single library for sets provided by current Prolog systems should make it clear that this is not an obvious choice.

Define a rule to determine if a list contains a given member

I have recently started learning Prolog, and I am facing a problem with this question:
Define a rule to determine if a list contains a given member.
I searched all over Stack Overflow for links to understand this problem better and write solutions for it, but couldn't find anything. Could any of you advise me on how to solve this particular problem?
My Approach:
Iterate over the list and see if your member matches the head:
on(Item,[Item|Rest]). /* is my target item on the list */
on(Item,[DisregardHead|Tail]):-
on(Item,Tail).
Do you think my approach is correct?
What you have is indeed a "correct" implementation. The standard name for a predicate that does that is member/2, and is available (under that name) in any Prolog, and should be quite easy to find once you know its name.
Some things to note however. First, with the classical definition (this is exactly as in "The Art of Prolog" by Sterling and Shapiro, p. 58, and identical to yours):
member_classic(X, [X|Xs]).
member_classic(X, [Y|Ys]) :-
member_classic(X, Ys).
If you try to compile this, you will get singleton warnings. This is because you have named variables that appear only once in their clause: the Xs in the first clause and the Y in the second. This aside, here is what the program does:
?- member_classic(c, [a,b,c,x]).
true ;
false.
?- member_classic(c, [c]).
true ;
false.
?- member_classic(X, [a,b,c]).
X = a ;
X = b ;
X = c ;
false.
In other words, with this definition, Prolog will leave behind a choice point even when it is quite obvious that there could not be further solutions (because it is at the end of the list). One way to avoid this is to use a technique called "lagging", as demonstrated by the SWI-Prolog library implementation of member/2.
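The idea behind lagging is to carry the current element in a separate argument and to recurse on the remaining list, so that clause selection can see whether the rest is empty. The sketch below is modelled on that idea and is not copied from the library; the names member_lagged/2 and member_lagged_/3 are made up.

```prolog
% member_lagged(?X, ?List): X is an element of List.
member_lagged(X, [Y|Ys]) :-
    member_lagged_(Ys, X, Y).

% The remaining list is the first argument, so that first-argument
% indexing can distinguish [] from [_|_].
member_lagged_(_, X, X).
member_lagged_([Y|Ys], X, _) :-
    member_lagged_(Ys, X, Y).
```

On a system with first-argument indexing, such as SWI-Prolog, a query like ?- member_lagged(c, [a,b,c]). should succeed without leaving a choice point, because the second clause of member_lagged_/3 cannot match once the remaining list is [].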
And another thing: with your current problem statement, it might be that this is considered undesirable behaviour:
?- member_classic(a, [a,a,a]).
true ;
true ;
true ;
false.
There is another predicate usually called member_check/2 or memberchk/2 which does exactly what you have written, namely, succeeds or fails exactly once:
?- memberchk(a, [a,a,a]).
true.
?- memberchk(a, [x,y,z]).
false.
It has, however, the following behaviour when the first argument is a variable that might be undesirable:
?- memberchk(X, [a,b,c]).
X = a. % no more solutions!
There are valid uses for both member/2 and memberchk/2 IMHO (but interestingly enough, some people might argue otherwise).
Yes, your solution is correct and works in all directions. Nice!
Notes:
Your solution is in fact more general than what the task asks for. This is a good thing! The task, in my view, is badly worded. First of all, the first clause is not a rule, but a fact. It would have been better to formulate the task like: "Write a Prolog program that is true if a term occurs in a list." This leaves open other use cases that a good solution will also automatically solve, such as generating solutions.
This common predicate is widely known as member/2. Just like your solution, it also works in all directions. Try for example ?- member(E, Ls).
The name for the predicate could be better. A good naming convention for Prolog makes clear what each argument means. Consider for example: element_list/2, and start from there.

Prolog - Check number of occurrences doesn't work as expected

In Prolog:
I have the following function that counts the occurrences of a certain element in a list:
%count(L:list,E:int,N:int) (i,i,o)
count([],_,0).
count([H|T],E,C):-H == E,count(T,E,C1),C is C1+1.
count([_|T],E,C):-count(T,E,C).
I tested it and it works well. But here comes the problem, I have another function that has to check if "1" occurs less than 2 times in a list.
check(L):-count(L,1,C),C<2.
Whenever I try to check the list [1,1,1,1] for example, the result I get is "true", which is wrong, and I have no idea why. I tried to make some changes, but the function just won't work.
Improve your testing habits!
When testing Prolog code don't only look at the first answer to some query and conclude "it works".
Non-determinism is central to Prolog.
Quite often, some code appears to be working correctly at first sight (when looking at the first answer) but exhibits problems (mainly wrong answers and/or non-termination) upon backtracking.
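A quick way to look beyond the first answer is to collect all of them with findall/3. Below is a minimal sketch using the count/3 from the question, renamed to count_orig/3 to avoid clashes; all_counts/3 is a hypothetical helper.

```prolog
% count/3 as posted in the question, only renamed.
count_orig([], _, 0).
count_orig([H|T], E, C) :- H == E, count_orig(T, E, C1), C is C1+1.
count_orig([_|T], E, C) :- count_orig(T, E, C).

% all_counts(+List, +Elem, -Counts): every answer, in the order found.
all_counts(L, E, Cs) :-
    findall(C, count_orig(L, E, C), Cs).
```

The query ?- all_counts([1,1], 1, Cs). yields Cs = [2,1,1,0]: only the first answer is the intended count, and the others are what makes check/1 succeed on backtracking.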
Coming back to your original question... If you want / need to preserve logical-purity, consider using the following minimal variation of the code @Ruben presented in his answer:
count([],_,0).
count([E|T],E,C) :-
count(T,E,C1),
C is C1+1.
count([H|T],E,C) :-
dif(H,E),
count(T,E,C).
dif/2 expresses syntactic term inequality in a logically sound way. For info on it look at prolog-dif!
It happens because count([1,1,1,1],1,1) is also true! Your last count clause can also match when H does equal E. To illustrate this, use ; to make Prolog look for more answers to count([1,1,1,1],1,R). You'll see what happens.
count([],_,0).
count([E|T],E,C):-
count(T,E,C1),
C is C1+1.
count([H|T],E,C):-
H \= E,
count(T,E,C).
check(L) :-
count(L,1,C),
C < 2.
?- check([1,1,1,1,1]).
false
?- check([1]).
true
The heads of the second and third clauses both match the same lists. As a minimal correction, I would commit to the test with a cut:
count([],_,0).
count([H|T],E,C):-H == E,!,count(T,E,C1),C is C1+1.
count([_|T],E,C):-count(T,E,C).

Implementing "last" in Prolog

I am trying to get a feel for Prolog programming by going through Ulle Endriss' lecture notes. When my solution to an exercise does not behave as expected, I find it difficult to give a good explanation. I think this has to do with my shaky understanding of the way Prolog evaluates expressions.
Exercise 2.6 on page 20 calls for a recursive implementation of a predicate last1 which behaves like the built-in predicate last. My attempt is as follows:
last1([_ | Rest], Last) :- last1(Rest, Last).
last1([Last], Last).
It gives the correct answer, but for lists with more than one element, I have to key in the semicolon to terminate the query. This makes last1 different from the built-in last.
?- last1([1], Last).
Last = 1.
?- last1([1, 2], Last).
Last = 2 ;
false.
If I switch the order in which I declared the rule and fact, then I need to key in the semicolon in both cases.
I think I know why Prolog thinks that last1 may have one more solution (thus the semicolon). I imagine it follows the evaluation sequence
last1([1, 2], Last).
==> last1([2], Last).
==> last1([], Last). OR Last = 2.
==> false OR Last = 2.
That seems to suggest that I should look for a way to avoid matching Rest with []. Regardless, I have no explanation why switching the order of declaration ought to have any effect at all.
Question 1: What is the correct explanation for the behavior of last1?
Question 2: How can I implement a predicate last1 which is indistinguishable from the built-in last?
Question 1:
Prolog systems are not always able to decide whether or not a clause will apply prior to executing it. The precise circumstances are implementation dependent. That is, you cannot rely on that decision in general. Systems do improve here from release to release. Consider as the simplest case:
?- X = 1 ; 1 = 2.
X = 1
; false.
A very clever Prolog could detect that 1 = 2 always fails, and thus simply answer X = 1. instead. On the other hand, such "cleverness" is very costly to implement and time is better spent for optimizing more frequent cases.
So why do Prologs show this at all? The primary reason is to avoid asking meekly for another answer, if Prolog already knows that there is no further answer. So prior to this improvement, you were prompted for another answer for all queries containing variables and got the false or "no" on each and every query with exactly one answer. This used to be so cumbersome that many programmers never asked for the next answer and thus were not alerted about unintended answers.
And the secondary reason is to keep you aware of the limitations of the implementation: If Prolog asks for another answer on this general query, this means that it still uses some space which might accumulate and eat up all your computing resources.
In your example with last1/2 you encounter such a case. And you already did something very smart, BTW: You tried to minimize the query to see the first occurrence of the unexpected behavior.
In your example query last1([1,2],X) the Prolog system does not look at the entire list [1,2] but only looks at the principal functor. So for the Prolog system the query looks the same as last1([_|_],X) when it decides which clauses to apply. This goal now fits to both clauses, and this is the reason why Prolog will remember the second clause as an alternative to try out.
But, think of it: This choice is now possible for all elements but the last! Which means that you pay some memory for each element! You can actually observe this by using a very long list. This I get on my tiny 32-bit laptop — you might need to add another zero or two on a larger system:
?- length(L,10000000), last1(L,E).
resource_error(_). % ERROR: Out of local stack
On the other hand, the predefined last/2 works smoothly:
?- length(L,10000000), last(L,E).
L = [_A,_B,_C,_D,_E,_F,_G,_H,_I|...].
In fact, it uses constant space!
There are now two ways out of this:
Try to optimize your definition. Yes, you can do this, but you need to be very smart! The definition by @back_dragon for example is incorrect. It often happens that beginners try to optimize a program when in fact they are destroying its semantics.
Ask yourself if you are actually defining the same predicate as last/2. In fact, you're not.
Question 2:
Consider:
?- last(Xs, X).
Xs = [X]
; Xs = [_A,X]
; Xs = [_A,_B,X]
; Xs = [_A,_B,_C,X]
; Xs = [_A,_B,_C,_D,X]
; ... .
and
?- last1(Xs, X).
loops.
So your definition differs from SWI's definition in this case. Exchange the order of the clauses (the queries below call the result last2).
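The reordered definition is not spelled out in the text; presumably it is the question's last1/2 with the fact placed first, as in this sketch.

```prolog
% last2/2: the clauses of last1/2 from the question in exchanged order.
last2([Last], Last).
last2([_|Rest], Last) :-
    last2(Rest, Last).
```

With the fact first, the most general query ?- last2(Xs, X). starts by answering Xs = [X] instead of looping.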
?- length(L,10000000), last2(L,E).
L = [_A,_B,_C,_D,_E,_F,_G,_H,_I|...]
; false.
Again, this false! But this time, the big list works. And this time, the minimal query is:
?- last2([1],E).
E = 1
; false.
And the situation is quite similar: Again, Prolog will look at the query in the same way as last2([_|_],E) and will conclude that both clauses apply. At least, we now have constant overhead instead of linear overhead.
There are several ways to overcome this overhead in a clean fashion - but they all very much depend on the innards of an implementation.
SWI-Prolog attempts to avoid prompting for more solutions when it can determine that there are none. I think the interpreter checks whether any choice point is left and, if it can't find one, simply reports termination. Otherwise it waits and lets the user choose the next move.
I would attempt to make last1 deterministic in this way:
last1([_,H|Rest], Last) :- !, last1([H|Rest], Last).
last1([Last], Last).
but I don't think it's indistinguishable from last. Looking at the source code of the library (it's as simple as ?- edit(last).)
%% last(?List, ?Last)
%
% Succeeds when Last is the last element of List. This
% predicate is =semidet= if List is a list and =multi= if List is
% a partial list.
%
% @compat There is no de-facto standard for the argument order of
% last/2. Be careful when porting code or use
% append(_, [Last], List) as a portable alternative.
last([X|Xs], Last) :-
last_(Xs, X, Last).
last_([], Last, Last).
last_([X|Xs], _, Last) :-
last_(Xs, X, Last).
we can appreciate a well-thought-out implementation.
This code would work:
last1([Last], Last).
last1([_ | Rest], Last) :- last1(Rest, Last), !.
This is because Prolog thinks there might be more combinations, but with the cut (!), Prolog won't go back after reaching this point.