Problems with deleting punctuation in list of lists

I am trying to write Prolog code that will delete all punctuation (.,!? etc.) from all lists in a list of lists. This is what I have so far:
delete_punctuation(_,[],_).
delete_punctuation(Character,[List|Tail],Resultlist) :-
delete(List,Character,NewList),
delete_punctuation(Character,Tail,[NewList|Resultlist]).
where Character will be 33 for ! or 46 for . and so on, since I will only be using this on lists of character codes. (I know that the predicate would also work for other elements I might want to delete from the lists.)
The results I receive when asking:
delete_punctuation(33,[[45,33,6],[4,55,33]],X).
is just
|: true.
However, I want it to be:
|: X = [[45,6],[4,55]].
What do I need to improve?

For this problem, I'd tackle it by addressing the two sub-problems separately, namely:
Filter/exclude a character code from a single list;
Applying the solution to the above to a list of lists of character codes.
To this end, I'd approach it like this:
exclude2(_, [], []).
exclude2(Code, [Code|Xs], Ys) :-
!, % ignore the next clause if codes match
exclude2(Code, Xs, Ys).
exclude2(Code, [X|Xs], [X|Ys]) :-
% else, Code != X here
exclude2(Code, Xs, Ys).
Note that some implementations like SWI-Prolog provide exclude/3 as a built-in, so you mightn't actually need to define it yourself.
Now, to apply the above predicate to a list of lists:
delete_punctuation(_, [], []).
delete_punctuation(Code, [L|Ls], [NewL|NewLs]) :-
exclude2(Code, L, NewL), % the predicate defined above, not the built-in exclude/3
delete_punctuation(Code, Ls, NewLs).
However, again, depending on the implementation, a built-in like maplist/3 could be used to achieve the same effect without having to define a new predicate:
?- maplist(exclude2(33), [[45,33,6],[4,55,33]], X).
X = [[45, 6], [4, 55]] ;
false.
n.b. if you want to use all SWI built-ins, exclude/3 requires the test to be a goal, like so:
?- maplist(exclude(==(33)), [[45,33,6],[4,55,33]], X).
X = [[45, 6], [4, 55]] ;
false.
For a more general approach, you could even add all the codes you want to exclude (such as any and all punctuation character codes) to a list to use as the filter:
excludeAll(_, [], []).
excludeAll(Codes, [Code|Xs], Ys) :-
member(Code, Codes),
!,
excludeAll(Codes, Xs, Ys).
excludeAll(Codes, [X|Xs], [X|Ys]) :-
excludeAll(Codes, Xs, Ys).
Then you can add a list with all the codes to delete:
?- maplist(excludeAll([33,63]), [[45,33,6],[4,55,33,63]], X).
X = [[45, 6], [4, 55]] ;
false.
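If the goal is to drop every punctuation code rather than a hand-picked list, one option in SWI-Prolog is to let code_type/2 decide what counts as punctuation. The following is only a sketch under that assumption; is_punct/1 and delete_all_punctuation/2 are names made up for this example, and which codes are classified as punct depends on the system's character classification.

```prolog
:- use_module(library(apply)).   % exclude/3 and maplist/3 (SWI-Prolog)

% is_punct(+Code): Code is classified as punctuation by code_type/2.
% Hypothetical helper, not part of the original answer.
is_punct(Code) :-
    code_type(Code, punct).

% delete_all_punctuation(+Lists, -Cleaned): Cleaned is Lists with every
% punctuation code removed from each inner list.
delete_all_punctuation(Lists, Cleaned) :-
    maplist(exclude(is_punct), Lists, Cleaned).
```

For example, ?- delete_all_punctuation([[104,105,33],[111,107,46]], X). drops 33 (!) and 46 (.) and keeps the letter codes.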

Related

I want to implement the predicate noDupl/2 in Prolog & have trouble with singleton variables

My confusion mainly lies around understanding singleton variables.
I want to implement the predicate noDupl/2 in Prolog. This predicate can be used to identify numbers in a list that appear exactly once, i.e., numbers which are no duplicates. The first argument of noDupl is the list to analyze. The second argument is the list of numbers which are no duplicates, as described below.
As an example, for the list [2, 0, 3, 2, 1] the result [0, 3, 1] is computed (because 2 is a duplicate).
In my implementation I used the predefined member predicate and used an auxiliary predicate called helper.
I'll explain my logic in pseudocode, so you can help me spot where I went wrong.
First off, if the first element is not a member of the rest of the list, add the first element to the new result list (as its head).
If the first element is a member of T, call the helper predicate on the rest of the list, the first element H and the new list.
In the helper predicate, if H is found in the tail, return the list without H, i.e., Tail.
noDupl([],[]).
noDupl([H|T],L) :-
\+ member(H,T),
noDupl(T,[H|T]).
noDupl([H|T],L) :-
member(H,T),
helper(T,H,L).
helper([],N,[]).
helper([H|T],H,T). %found place of duplicate & return list without it
helper([H|T],N,L) :-
helper(T,N,[H|T1]).%still couldn't locate the place, so add H to the new List as it's not a duplicate
While I'm writing my code, I'm always having trouble with deciding to choose a new variable or use the one defined in the predicate arguments when it comes to free variables specifically.
Thanks.
Warnings about singleton variables are not the actual problem.
Singleton variables are logical variables that occur once in some Prolog clause (fact or rule). Prolog warns you about these variables if they are named like non-singleton variables, i. e., if their name does not start with a _.
This convention helps avoid typos of the nasty kind—typos which do not cause syntax errors but do change the meaning.
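A small illustration of both points may help. The predicates below (first_element/2 and list_len/2) are invented for this example; the clauses that would trigger the warning are shown only as comments so that the file loads cleanly.

```prolog
% Loading this clause would produce a singleton warning for T,
% because T occurs only once:
%     first_element([H|T], H).
% Writing _ instead says "this argument is deliberately ignored":
first_element([H|_], H).

% The kind of typo the warning catches: Lenght is misspelled once,
% so both Length and Lenght would be reported as singletons:
%     list_len(L, Length) :- length(L, Lenght).
% Corrected version, where Length occurs twice:
list_len(L, Length) :-
    length(L, Length).
```

In the first case the warning is harmless and is silenced by the underscore; in the second it points at a real bug, since the misspelled clause would leave Length unbound.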
Let's build a canonical solution to your problem.
First, forget about CamelCase and pick a proper predicate name that reflects the relational nature of the problem at hand: how about list_uniques/2?
Then, document cases in which you expect the predicate to give one answer, multiple answers or no answer at all. How?
Not as mere text, but as queries.
Start with the most general query:
?- list_uniques(Xs, Ys).
Add some ground queries:
?- list_uniques([], []).
?- list_uniques([1,2,2,1,3,4], [3,4]).
?- list_uniques([a,b,b,a], []).
And add queries containing variables:
?- list_uniques([n,i,x,o,n], Xs).
?- list_uniques([i,s,p,y,i,s,p,y], Xs).
?- list_uniques([A,B], [X,Y]).
?- list_uniques([A,B,C], [D,E]).
?- list_uniques([A,B,C,D], [X]).
Now let's write some code! Based on library(reif) write:
:- use_module(library(reif)).
list_uniques(Xs, Ys) :-
list_past_uniques(Xs, [], Ys).
list_past_uniques([], _, []). % auxiliary predicate
list_past_uniques([X|Xs], Xs0, Ys) :-
if_((memberd_t(X,Xs) ; memberd_t(X,Xs0)),
Ys = Ys0,
Ys = [X|Ys0]),
list_past_uniques(Xs, [X|Xs0], Ys0).
What's going on?
list_uniques/2 is built upon the helper predicate list_past_uniques/3
At any point, list_past_uniques/3 keeps track of:
all items ahead (Xs) and
all items "behind" (Xs0) some item of the original list X.
If X is a member of either list, then Ys skips X—it's not unique!
Otherwise, X is unique and it occurs in Ys (as its list head).
Let's run some of the above queries using SWI-Prolog 8.0.0:
?- list_uniques(Xs, Ys).
Xs = [], Ys = []
; Xs = [_A], Ys = [_A]
; Xs = [_A,_A], Ys = []
; Xs = [_A,_A,_A], Ys = []
...
?- list_uniques([], []).
true.
?- list_uniques([1,2,2,1,3,4], [3,4]).
true.
?- list_uniques([a,b,b,a], []).
true.
?- list_uniques([1,2,2,1,3,4], Xs).
Xs = [3,4].
?- list_uniques([n,i,x,o,n], Xs).
Xs = [i,x,o].
?- list_uniques([i,s,p,y,i,s,p,y], Xs).
Xs = [].
?- list_uniques([A,B], [X,Y]).
A = X, B = Y, dif(Y,X).
?- list_uniques([A,B,C], [D,E]).
false.
?- list_uniques([A,B,C,D], [X]).
A = B, B = C, D = X, dif(X,C)
; A = B, B = D, C = X, dif(X,D)
; A = C, C = D, B = X, dif(D,X)
; A = X, B = C, C = D, dif(D,X)
; false.
Just like my previous answer, the following answer is based on library(reif)—and uses it in a somewhat more idiomatic way.
:- use_module(library(reif)).
list_uniques([], []).
list_uniques([V|Vs], Xs) :-
tpartition(=(V), Vs, Equals, Difs),
if_(Equals = [], Xs = [V|Xs0], Xs = Xs0),
list_uniques(Difs, Xs0).
While this code does not improve upon my previous one regarding efficiency / complexity, it is arguably more readable (fewer arguments in the recursion).
In this solution a slightly modified version of tpartition is used to have more control over what happens when an item passes the condition (or not):
tpartition_p(P_2, OnTrue_5, OnFalse_5, OnEnd_4, InitialTrue, InitialFalse, Xs, RTrue, RFalse) :-
i_tpartition_p(Xs, P_2, OnTrue_5, OnFalse_5, OnEnd_4, InitialTrue, InitialFalse, RTrue, RFalse).
i_tpartition_p([], _P_2, _OnTrue_5, _OnFalse_5, OnEnd_4, CurrentTrue, CurrentFalse, RTrue, RFalse):-
call(OnEnd_4, CurrentTrue, CurrentFalse, RTrue, RFalse).
i_tpartition_p([X|Xs], P_2, OnTrue_5, OnFalse_5, OnEnd_4, CurrentTrue, CurrentFalse, RTrue, RFalse):-
if_( call(P_2, X)
, call(OnTrue_5, X, CurrentTrue, CurrentFalse, NCurrentTrue, NCurrentFalse)
, call(OnFalse_5, X, CurrentTrue, CurrentFalse, NCurrentTrue, NCurrentFalse) ),
i_tpartition_p(Xs, P_2, OnTrue_5, OnFalse_5, OnEnd_4, NCurrentTrue, NCurrentFalse, RTrue, RFalse).
InitialTrue/InitialFalse and RTrue/RFalse contain the desired initial and final state, the procedures OnTrue_5 and OnFalse_5 manage the state transition after testing the condition P_2 on each item, and OnEnd_4 manages the last transition.
With the following code for list_uniques/2:
list_uniques([], []).
list_uniques([V|Vs], Xs) :-
tpartition_p(=(V), on_true, on_false, on_end, false, Difs, Vs, HasDuplicates, []),
if_(=(HasDuplicates), Xs=Xs0, Xs = [V|Xs0]),
list_uniques(Difs, Xs0).
on_true(_, _, Difs, true, Difs).
on_false(X, HasDuplicates, [X|Xs], HasDuplicates, Xs).
on_end(HasDuplicates, Difs, HasDuplicates, Difs).
When the item passes the filter (it's a duplicate) we just mark that the list has duplicates and skip the item; otherwise the item is kept for further processing.
This answer takes a similar approach to the previous answer by @gusbro.
However, it does not propose a somewhat baroque version of tpartition/4, but instead an augmented, but hopefully leaner, version of tfilter/3 called tfilter_t/4 which can be defined like so:
tfilter_t(C_2, Es, Fs, T) :-
i_tfilter_t(Es, C_2, Fs, T).
i_tfilter_t([], _, [], true).
i_tfilter_t([E|Es], C_2, Fs0, T) :-
if_(call(C_2,E),
( Fs0 = [E|Fs], i_tfilter_t(Es,C_2,Fs,T) ),
( Fs0 = Fs, T = false, tfilter(C_2,Es,Fs) )).
Adapting list_uniques/2 is straightforward:
list_uniques([], []).
list_uniques([V|Vs], Xs) :-
if_(tfilter_t(dif(V),Vs,Difs), Xs = [V|Xs0], Xs = Xs0),
list_uniques(Difs, Xs0).
Save scrollbars. Stay lean! Use tfilter_t/4.
You have problems already in the first predicate, noDupl/2.
The first clause, noDupl([], []). looks fine.
The second clause is wrong.
noDupl([H|T],L):-
\+member(H,T),
noDupl(T,[H|T]).
What that really means I leave as an exercise for you. If, however, you want to add H to the result, you would write it like this:
noDupl([H|T], [H|L]) :-
\+ member(H, T),
noDupl(T, L).
Please look carefully at this and try to understand. The H is added to the result by unifying the result (the second argument in the head) to a list with H as the head and the variable L as the tail. The singleton variable L in your definition is a singleton because there is a mistake in your definition, namely, you do nothing at all with it.
The last clause has a different kind of problem. You try to clean the rest of the list from this one element, but you never return to the original task of getting rid of all duplicates. It could be fixed like this:
noDupl([H|T], L) :-
member(H, T),
helper(T, H, T0),
noDupl(T0, L).
Your helper/3 removes the duplicate from the rest of the original list, unifying the result with T0; noDupl/2 then uses this cleaned list to continue removing duplicates.
Now on to your helper. The first clause seems fine but has a singleton variable. This is a valid case where you don't want to do anything with this argument, so you "declare" it unused for example like this:
helper([], _, []).
The second clause is problematic because it removes a single occurrence. What should happen if you call:
?- helper([1,2,3,2], 2, L).
The last clause also has a problem. Just because you use different names for two variables, this doesn't make them different. To fix these two clauses, you can for example do:
helper([H|T], H, L) :-
helper(T, H, L).
helper([H|T], X, [H|L]) :-
dif(H, X),
helper(T, X, L).
These are the minimal corrections that will give you an answer when the first argument of noDupl/2 is ground. You could enforce this check by renaming noDupl/2 to noDupl_ground/2 and defining noDupl/2 as:
noDupl(L, R) :-
must_be(ground, L),
noDupl_ground(L, R).
Try to see what you get for different queries with the current naive implementation and ask if you have further questions. It is still full of problems, but it really depends on how you will use it and what you want out of the answer.
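For reference, the corrected clauses above can be assembled into a single listing. This is only a sketch of the "minimal corrections" version, without the must_be/2 wrapper, and it assumes a system that provides dif/2 (such as SWI-Prolog). It is intended for ground lists and may leave choice points or report a solution more than once.

```prolog
% noDupl(+List, -Uniques): Uniques are the elements occurring exactly once.
noDupl([], []).
noDupl([H|T], [H|L]) :-
    \+ member(H, T),          % H does not occur again: keep it
    noDupl(T, L).
noDupl([H|T], L) :-
    member(H, T),             % H occurs again: drop every occurrence
    helper(T, H, T0),
    noDupl(T0, L).

% helper(+List, +X, -Rest): Rest is List with all occurrences of X removed.
helper([], _, []).
helper([H|T], H, L) :-
    helper(T, H, L).
helper([H|T], X, [H|L]) :-
    dif(H, X),
    helper(T, X, L).
```

With this listing, ?- noDupl([2,0,3,2,1], L). gives L = [0, 3, 1] as its first answer, and ?- helper([1,2,3,2], 2, L). gives L = [1, 3].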

Prolog: Reversed list -- can't find the error

I want to write a reverse/2 function. This is my code and I cannot figure out where the error is.
rev([]).
rev([H|T],X):-rev(T,X),append(T,H,_).
The output:
rev ([1,2,3,4], X).
false.
rev(?List1,?List2) is true when elements of List2 are in reversed order compared to List1
rev(Xs, Ys) :-
rev(Xs, [], Ys, Ys).
rev([], Ys, Ys, []).
rev([X|Xs], Rs, Ys, [_|Bound]) :-
rev(Xs, [X|Rs], Ys, Bound).
Output:
?- rev([1,2,3,4],X).
X = [4, 3, 2, 1].
?- rev([3,4,a,56,b,c],X).
X = [c, b, 56, a, 4, 3].
Explanation of rev/4
On call rev([X|Xs](1), Rs(2), Ys(3), [_|Bound](4))
[X|Xs](1) - List1, the input list in our case (we could also call rev(Z,[3,2,1]).)
Rs(2) - ResultList is a helping list, we start with an empty list and on every recursive call we push (adding as head member) a member from [X|Xs](1).
Ys(3) - List2, the output list (reversed list of List1)
[_|Bound](4) - HelpingList for bounding the length of Ys(3) (for iterating "length of Ys" times).
On every recursion call rev(Xs(5), [X|Rs](6), Ys(3), Bound(7)).,
we push head member X ([X|Xs](1)) to the front of Rs ([X|Rs](6)),
and move on to the next member of Ys (Bound(7), [_|Bound](4)).
The recursion ends when rev([](9), Ys(10), Ys(3), [](12)). is true.
Every [X|Xs](1) (now the list is empty [](9)) member moved in reversed order to Ys(10), we bounded the size of Ys(3) (using [_|Bound](4) and now it's empty [](12)).
Note that append/3, i.e. append(?List1, ?List2, ?List1AndList2), was used incorrectly in your code: in append(T,H,_), H is not a list (it is the head element of the list), so it cannot serve as List2, and the result is thrown away with _.
Example use of append/2 and append/3:
?- append([[1,2],[3]],X). % append/2 - Concatenate a list of lists.
X = [1, 2, 3].
?- append([4],[5],X). % append/3 - X is the concatenation of List1 and List2
X = [4, 5].
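To connect this back to the question: the clause the asker seems to have been aiming for can be written with append/3 as below. This is a sketch of the classic naive reverse, not the rev/4 approach shown above; naive_rev/2 is a name chosen here to avoid clashing with the other definitions.

```prolog
% naive_rev(+List, -Reversed): reverse the tail, then put the head at the end.
naive_rev([], []).
naive_rev([H|T], Reversed) :-
    naive_rev(T, ReversedTail),
    append(ReversedTail, [H], Reversed).   % note [H], a one-element list
```

The important differences from the original attempt are the two-argument base case, the head wrapped in a list, and the third argument of append/3 being used as the result. Because append/3 walks its first argument on every step, this version does quadratic work in the length of the list.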
You should not place a space between the functor name rev and the argument list. Usually this gives a syntax error:
Welcome to SWI-Prolog (threaded, 64 bits, version 7.7.1)
SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software.
?- rev ([1,2,3],X).
ERROR: Syntax error: Operator expected
ERROR: rev
ERROR: ** here **
ERROR: ([1,2,3],X) .
Otherwise I think the rev/4 solution aims at a bidirectional solution. If you don't need this, and want to go for an accumulator solution that doesn't leave a choice point, you can try:
reverse(X, Y) :-
reverse2(X, [], Y).
reverse2([], X, X).
reverse2([X|Y], Z, T) :-
reverse2(Y, [X|Z], T).

Prolog - How to make a list of lists with a certain length from a flat list

For example:
createlistoflists([1,2,3,4,5,6,7,8,9], NewLists)
NewLists = [[1,2,3], [4,5,6], [7,8,9]].
So basically my first argument is a list, my second argument a new list consisting of lists with the proper length (the proper length being 3). My first idea was to use append of some sort. But I have literally no idea how to do this, any thoughts?
thanks in advance
If you don't mind using the nice facilities Prolog provides, there's a simple approach:
list_length(Size, List) :- length(List, Size).
split_list(List, SubSize, SubLists) :-
maplist(list_length(SubSize), SubLists),
append(SubLists, List).
And you can query it as:
?- split_list([1,2,3,4,5,6,7,8,9], 3, L).
L = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
It will fail if List is instantiated in such a way that its length is not a multiple of SubSize.
As pointed out by Will Ness in the comments, the above simple solution has a flaw: the maplist(list_length(SubSize), SubLists) will continue to query and find longer and longer sets of sublists, unconstrained. Thus, on retry, the query above will not terminate.
The temptation would be to use a cut like so:
split_list(List, SubSize, SubLists) :-
maplist(list_length(SubSize), SubLists), !,
append(SubLists, List).
The cut here assumes you just want to get a single answer as if you were writing an imperative function.
A better approach is to try to constrain, in a logical way, the SubLists argument to maplist. A simple approach would be to ensure that the length of SubLists doesn't exceed the length of List since, logically, it should never be greater. Adding in this constraint:
list_length(Size, List) :- length(List, Size).
not_longer_than([], []).
not_longer_than([], [_|_]).
not_longer_than([_|X], [_|Y]) :-
not_longer_than(X, Y).
split_list(List, SubSize, SubLists) :-
not_longer_than(SubLists, List),
maplist(list_length(SubSize), SubLists),
append(SubLists, List).
Then the query terminates without losing generality of the solution:
?- split_list([1,2,3,4,5,6,7,8,9], 3, L).
L = [[1, 2, 3], [4, 5, 6], [7, 8, 9]] ;
false.
?-
One could be more precise in the implementation of not_longer_than/2 and have it use the SubSize as a multiple. That would be more efficient but not required to get termination.
:- use_module(library(clpfd)). % provides #> and #=
not_longer_than_multiple(L1, Mult, L2) :-
not_longer_than_multiple(L1, Mult, Mult, L2).
not_longer_than_multiple([], _, _, []).
not_longer_than_multiple([], _, _, [_|_]).
not_longer_than_multiple([_|Xs], Mult, 1, [_|Ys]) :-
not_longer_than_multiple(Xs, Mult, Mult, Ys).
not_longer_than_multiple(Xs, Mult, C, [_|Ys]) :-
C #> 1,
C1 #= C - 1,
not_longer_than_multiple(Xs, Mult, C1, Ys).
Or something along those lines...
But then, if we're going to go through all that nonsense to cover the sins of this use of maplist, then perhaps hitting the problem head-on makes the cleanest solution:
:- use_module(library(clpfd)). % provides #> and #=
split_list(List, SubSize, SubLists) :-
split_list(List, SubSize, SubSize, SubLists).
split_list([], _, _, []).
split_list([X|Xs], SubSize, 1, [[X]|S]) :- % X completes the current sublist
split_list(Xs, SubSize, SubSize, S).
split_list([X|Xs], SubSize, C, [[X|T]|S]) :-
C #> 1,
C1 #= C - 1,
split_list(Xs, SubSize, C1, [T|S]).

Prolog: removing all elements that occur in list A from list B

I'm trying to define a predicate "delete(L1, L2, L3)" that is valid when L3 equals L2 minus any of these elements that are contained in L1. E.g. delete([1], [1,2,3], X) => would unify for X = [2,3]. My code is as follows:
isNonElement(_, []).
isNonElement(X, [Y|Z]) :- X \= Y, isNonElement(X,Z).
delete(_, [], []).
delete(Y, [X|W], Z) :- \+(isNonElement(X, Y)), delete(Y, W, Z).
delete(Y, [X|W], [X|Z]) :- isNonElement(X, Y), delete(Y, W, Z).
However it seems not to work for every test case. Can anyone help me out on what could be wrong with my code?
Thanks in advance!
Best regards,
Skyfe.
P.S. I can't tell the cases for which my predicate doesn't work correctly since it's tested by a school system which doesn't tell me which test cases it failed for.
You are very close!
One remaining problem is that you are applying a perfectly sound logical reasoning to predicates that do not admit such a reading. Your code works exactly as intended if you simply apply the following extremely straight-forward changes:
instead of (\=)/2, use dif/2
instead of \+(isNonElement(X, Y)), simply write: member(X, Y).
The first change is always advisable: It typically makes your programs usable in more directions. The second change avoids the use of impure negation by using a pure predicate instead.
In total, we now have:
isNonElement(_, []).
isNonElement(X, [Y|Z]) :- dif(X, Y), isNonElement(X,Z).
delete(_, [], []).
delete(Y, [X|W], Z) :- member(X, Y), delete(Y, W, Z).
delete(Y, [X|W], [X|Z]) :- isNonElement(X, Y), delete(Y, W, Z).
Now check this out: First, your test case:
?- delete([1], [1,2,3], X).
X = [2, 3] ;
false.
Works as expected!
Second, a case with a variable for L2:
?- delete([], L2, []).
L2 = [] ;
false.
This seems also very nice.
Third, another variable:
?- delete([X], [1,2,3], Ls3).
X = 1,
Ls3 = [2, 3] ;
X = 2,
Ls3 = [1, 3] ;
X = 3,
Ls3 = [1, 2] ;
Ls3 = [1, 2, 3],
dif(X, 3),
dif(X, 2),
dif(X, 1) ;
false.
Note now the different possibilities for X, and how dif/2 is used in answers to express that X must be different from certain integers in this case.
The use of impure predicates precludes such more general uses, and your grading system possibly tries such cases too.
Note that you can of course easily implement member/2 yourself. It is one of the most straight-forward relations. However, note also that delete/3 is terribly named: An imperative always implies a particular direction of use, but the relation we are considering here also admits many other usage modes!
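As a sketch of how short such a definition is, here is the usual two-clause formulation. It is named my_member/2 only so that it does not clash with the library predicate.

```prolog
% my_member(?X, ?List): X is an element of List.
my_member(X, [X|_]).           % X is the first element, or
my_member(X, [_|Xs]) :-        % X is an element of the rest
    my_member(X, Xs).
```

Like the library version, it can be used both to test membership and to enumerate elements on backtracking, for example ?- my_member(X, [a,b]). yields X = a and then X = b.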

Remove unique elements only

There are many resources on how to remove duplicates and similar issues but I can't seem to be able to find any on removing unique elements. I'm using SWI-Prolog but I don't want to use built-ins to achieve this.
That is, calling remove_unique([1, 2, 2, 3, 4, 5, 7, 6, 7], X). should happily result in X = [2, 2, 7, 7].
The obvious solution is something along the lines of
count(_, [], 0) :- !.
count(E, [E | Es], A) :- !, % E matches the head: count the rest first, then add one
count(E, Es, S),
A is S + 1.
count(E, [_ | Es], A) :-
count(E, Es, A).
is_unique(E, Xs) :-
count(E, Xs, 1).
remove_unique(L, R) :- remove_unique(L, L, R).
remove_unique([], _, []) :- !.
remove_unique([X | Xs], O, R) :-
is_unique(X, O), !,
remove_unique(Xs, O, R).
remove_unique([X | Xs], O, [X | R]) :-
remove_unique(Xs, O, R).
It should become quickly apparent why this isn't an ideal solution: count is O(n) and so is is_unique as it just uses count. I could improve this by failing when we find more than one element but worst-case is still O(n).
So then we come to remove_unique. For every element we check whether the current element is_unique in O. If the test fails, the element gets added to the resulting list in the next branch. Running in O(n²), we get a lot of inferences. While I don't think we can speed it up in the worst case, can we do better than this naïve solution? The only improvement that I can clearly see is to change count to something that fails as soon as >1 elements are identified.
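The early-exit idea mentioned at the end could look roughly like the sketch below. The names has_at_least/3 and is_unique_early/2 are invented for this example, and it assumes ground lists, since elements are compared with ==/2.

```prolog
% has_at_least(+N, +E, +Xs): E occurs at least N times in Xs.
% Scanning stops as soon as N occurrences have been seen.
has_at_least(0, _, _) :- !.
has_at_least(N, E, [X|Xs]) :-
    N > 0,
    (   E == X
    ->  N1 is N - 1
    ;   N1 = N
    ),
    has_at_least(N1, E, Xs).

% is_unique_early(+E, +Xs): E, assumed to be taken from Xs, does not
% occur a second time in Xs.
is_unique_early(E, Xs) :-
    \+ has_at_least(2, E, Xs).
```

This only shortens the scan for elements that repeat early in the list; for a genuinely unique element the whole list is still traversed, so the worst case stays O(n) per element.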
Using tpartition/4 in tandem with if_/3 and (=)/3 (all from library(reif)), we define remove_unique/2 like this:
remove_unique([], []).
remove_unique([E|Xs0], Ys0) :-
tpartition(=(E), Xs0, Es, Xs),
if_(Es = [], Ys0 = Ys, append([E|Es], Ys, Ys0)),
remove_unique(Xs, Ys).
Here's the sample query, as given by the OP:
?- remove_unique([1,2,2,3,4,5,7,6,7], Xs).
Xs = [2,2,7,7]. % succeeds deterministically
As long as you don't know that the list is sorted in any way, and you want to keep the sequence of the non-unique elements, it seems to me you can't avoid making two passes: first count occurrences, then pick only repeating elements.
What if you use a (self-balancing?) binary tree for counting occurrences and look-up during the second pass? Definitely not O(n²), at least...
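A rough sketch of that two-pass idea, using the AVL trees from SWI-Prolog's library(assoc) as the balanced tree. It does rely on a library, which the question wanted to avoid, so treat it as an illustration of the approach only. The predicate names remove_unique_assoc/2, count_all/3 and keep_repeated/3 are made up here, and the code assumes a ground input list.

```prolog
:- use_module(library(assoc)).   % empty_assoc/1, get_assoc/3, put_assoc/4

% remove_unique_assoc(+Xs, -Ys): Ys contains the elements of Xs that
% occur more than once, in their original order.
remove_unique_assoc(Xs, Ys) :-
    empty_assoc(Counts0),
    count_all(Xs, Counts0, Counts),      % pass 1: count occurrences
    keep_repeated(Xs, Counts, Ys).       % pass 2: keep repeated elements

count_all([], Counts, Counts).
count_all([X|Xs], Counts0, Counts) :-
    (   get_assoc(X, Counts0, N0)
    ->  N is N0 + 1
    ;   N = 1
    ),
    put_assoc(X, Counts0, N, Counts1),
    count_all(Xs, Counts1, Counts).

keep_repeated([], _, []).
keep_repeated([X|Xs], Counts, Ys) :-
    get_assoc(X, Counts, N),
    (   N > 1
    ->  Ys = [X|Ys1]
    ;   Ys = Ys1
    ),
    keep_repeated(Xs, Counts, Ys1).
```

Each lookup and update in the tree is logarithmic in the number of distinct elements, so the two passes together take O(n log n). For the sample query, ?- remove_unique_assoc([1,2,2,3,4,5,7,6,7], Xs). gives Xs = [2, 2, 7, 7].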