There are many resources on how to remove duplicates and similar issues but I can't seem to be able to find any on removing unique elements. I'm using SWI-Prolog but I don't want to use built-ins to achieve this.
That is, calling remove_unique([1, 2, 2, 3, 4, 5, 7, 6, 7], X). should happily result in X = [2, 2, 7, 7].
The obvious solution is something along the lines of
count(_, [], 0) :- !.
count(E, [E | Es], A) :- !,
    % count the tail first, then add one for the matching head
    count(E, Es, S),
    A is S + 1.
count(E, [_ | Es], A) :-
    count(E, Es, A).
is_unique(E, Xs) :-
    count(E, Xs, 1).
remove_unique(L, R) :- remove_unique(L, L, R).
remove_unique([], _, []) :- !.
remove_unique([X | Xs], O, R) :-
    is_unique(X, O), !,
    remove_unique(Xs, O, R).
remove_unique([X | Xs], O, [X | R]) :-
    remove_unique(Xs, O, R).
It should become quickly apparent why this isn't an ideal solution: count is O(n) and so is is_unique as it just uses count. I could improve this by failing when we find more than one element but worst-case is still O(n).
So then we come to remove_unique. For every element we check whether the current element is_unique in O. If the test fails, the element gets added to the resulting list in the next branch. Running in O(n²), we get a lot of inferences. While I don't think we can speed it up in the worst case, can we do better than this naïve solution? The only improvement that I can clearly see is to change count to something that fails as soon as more than one occurrence is found.
Using tpartition/4 in tandem with if_/3 and (=)/3, we define remove_unique/2 like this:
remove_unique([], []).
remove_unique([E|Xs0], Ys0) :-
tpartition(=(E), Xs0, Es, Xs),
if_(Es = [], Ys0 = Ys, append([E|Es], Ys, Ys0)),
remove_unique(Xs, Ys).
Here's the sample query, as given by the OP:
?- remove_unique([1,2,2,3,4,5,7,6,7], Xs).
Xs = [2,2,7,7]. % succeeds deterministically
As long as you don't know that the list is sorted in any way, and you want to keep the sequence of the non-unique elements, it seems to me you can't avoid making two passes: first count occurrences, then pick only repeating elements.
What if you use a (self-balancing?) binary tree for counting occurrences and look-up during the second pass? Definitely not O(n²), at least...
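The two-pass idea can be sketched with the AVL trees from library(assoc), assuming SWI-Prolog and a list of ground elements. The predicate names remove_unique_assoc/2, count_all/3 and keep_repeated/3 are made up for this illustration, and the sketch deliberately uses library predicates, unlike the question's no-built-ins constraint:

```prolog
:- use_module(library(assoc)).

% Hypothetical name: two-pass variant of remove_unique/2.
% Pass 1 counts occurrences, pass 2 keeps elements seen more than once.
remove_unique_assoc(Xs, Ys) :-
    empty_assoc(Counts0),
    count_all(Xs, Counts0, Counts),
    keep_repeated(Xs, Counts, Ys).

% count_all(+Xs, +Counts0, -Counts): bump the counter of every element.
count_all([], Counts, Counts).
count_all([X|Xs], Counts0, Counts) :-
    (   get_assoc(X, Counts0, N0)
    ->  N is N0 + 1
    ;   N = 1
    ),
    put_assoc(X, Counts0, N, Counts1),
    count_all(Xs, Counts1, Counts).

% keep_repeated(+Xs, +Counts, -Ys): keep X only if its count exceeds 1.
keep_repeated([], _, []).
keep_repeated([X|Xs], Counts, Ys) :-
    get_assoc(X, Counts, N),
    (   N > 1
    ->  Ys = [X|Ys1]
    ;   Ys = Ys1
    ),
    keep_repeated(Xs, Counts, Ys1).
```

Each look-up and update is logarithmic in the number of distinct elements, so the whole thing should run in roughly O(n log n) while preserving the order of the remaining elements.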
I'm having a hard time coming up with an efficient clause set for the following problem: given a list X find its maximum prefix consisting of same elements along with the remaining suffix. That is:
| ?- trim([a,a,a,b,b,c], [a,a,a], [b,b,c]).
yes
| ?- trim([a,a,a,a,b,b,c,c], X, Y).
X = [a,a,a,a],
Y = [b,b,c,c]
Here is what I have so far:
same([]).
same([_]).
same([X,X|T]) :- same([X|T]).
trim([], [], []).
trim(L, L, []) :- same(L).
trim(L, [A|B], [C|D]) :- append([A|B], [C|D], L), A \= C, same([A|B]).
The append part doesn't seem very efficient though. Is there a simple, iterative way to accomplish this?
Thinking about this problem from the start, we know we want the trivial case to be true:
trim([], [], []).
Then we want the longest repeated element prefix case:
trim([X], [X], []). % Trivial case
trim([X,Y|T], [X], [Y|T]) :- % Non-repeating element, ends recursion
dif(X, Y).
trim([X,X|T], [X|Xs], S) :- % Repeating element, recursive case
trim([X|T], Xs, S).
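If purity is not a concern and the input list is ground, a single left-to-right pass without leftover choice points is also possible. The following is only a sketch under those assumptions; trim_prefix/3 and same_prefix/4 are hypothetical names, and the comparison uses (==)/2 rather than dif/2:

```prolog
% trim_prefix(+List, -Prefix, -Suffix)
% Prefix is the longest run of identical elements at the front of List.
trim_prefix([], [], []).
trim_prefix([X|Xs], [X|Prefix], Suffix) :-
    same_prefix(Xs, X, Prefix, Suffix).

% same_prefix(+Rest, +X, -Prefix, -Suffix)
% Walk Rest while its head is identical to X; stop at the first difference.
same_prefix([], _, [], []).
same_prefix([Y|Ys], X, Prefix, Suffix) :-
    (   Y == X
    ->  Prefix = [Y|Prefix1],
        same_prefix(Ys, X, Prefix1, Suffix)
    ;   Prefix = [],
        Suffix = [Y|Ys]
    ).
```

Unlike the dif/2 version, this one is not meant to be run "backwards" or with partially instantiated lists.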
For example:
createlistoflists([1,2,3,4,5,6,7,8,9], NewLists)
NewLists = [[1,2,3], [4,5,6], [7,8,9]].
So basically my first argument is a list, my second argument a new list consisting of lists with the proper length (the proper length being 3). My first idea was to use append of some sort. But I have literally no idea how to do this, any thoughts?
thanks in advance
If you don't mind using the nice facilities Prolog provides, there's a simple approach:
list_length(Size, List) :- length(List, Size).
split_list(List, SubSize, SubLists) :-
maplist(list_length(SubSize), SubLists),
append(SubLists, List).
And you can query it as:
?- split_list([1,2,3,4,5,6,7,8,9], 3, L).
L = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
It will fail if List is instantiated in such a way that its length is not a multiple of SubSize.
As pointed out by Will Ness in the comments, the above simple solution has a flaw: maplist(list_length(SubSize), SubLists) will keep generating longer and longer lists of sublists, unconstrained. Thus, on backtracking, the query above will not terminate.
The temptation would be to use a cut like so:
split_list(List, SubSize, SubLists) :-
    maplist(list_length(SubSize), SubLists),
    append(SubLists, List), !.   % commit only once a complete answer is found
The cut here assumes you just want to get a single answer as if you were writing an imperative function.
A better approach is to constrain, in a logical way, the SubLists argument to maplist. A simple approach would be to ensure that the length of SubLists doesn't exceed the length of List since, logically, it should never be greater. Adding in this constraint:
list_length(Size, List) :- length(List, Size).
not_longer_than([], []).
not_longer_than([], [_|_]).
not_longer_than([_|X], [_|Y]) :-
not_longer_than(X, Y).
split_list(List, SubSize, SubLists) :-
not_longer_than(SubLists, List),
maplist(list_length(SubSize), SubLists),
append(SubLists, List).
Then the query terminates without losing generality of the solution:
?- split_list([1,2,3,4,5,6,7,8,9], 3, L).
L = [[1, 2, 3], [4, 5, 6], [7, 8, 9]] ;
false.
?-
One could be more precise in the implementation of not_longer_than/2 and have it use the SubSize as a multiple. That would be more efficient but not required to get termination.
not_longer_than_multiple(L1, Mult, L2) :-
not_longer_than_multiple(L1, Mult, Mult, L2).
not_longer_than_multiple([], _, _, []).
not_longer_than_multiple([], _, _, [_|_]).
not_longer_than_multiple([_|Xs], Mult, 1, [_|Ys]) :-
not_longer_than_multiple(Xs, Mult, Mult, Ys).
not_longer_than_multiple(Xs, Mult, C, [_|Ys]) :-
C #> 1,
C1 #= C - 1,
not_longer_than_multiple(Xs, Mult, C1, Ys).
Or something along those lines...
But then, if we're going to go through all that nonsense to cover the sins of this use of maplist, then perhaps hitting the problem head-on makes for the cleanest solution:
:- use_module(library(clpfd)).   % for (#>)/2 and (#=)/2
split_list(List, SubSize, SubLists) :-
    split_list(List, SubSize, SubSize, SubLists).
split_list([], _, _, []).
split_list([X|Xs], SubSize, 1, [[X]|S]) :-
    split_list(Xs, SubSize, SubSize, S).
split_list([X|Xs], SubSize, C, [[X|T]|S]) :-
    C #> 1,
    C1 #= C - 1,
    split_list(Xs, SubSize, C1, [T|S]).
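For comparison, here is a sketch that peels off one fixed-length chunk at a time with length/2 and append/3. It assumes List is a proper list and the chunk size is a positive integer; chunks/3 and chunks_/3 are hypothetical names, not part of any library:

```prolog
:- use_module(library(lists)).

% chunks(+List, +N, -Chunks): split List into consecutive sublists of length N.
% Fails if the length of List is not a multiple of N.
chunks(List, N, Chunks) :-
    integer(N),
    N > 0,                       % N = 0 would never consume any input
    chunks_(List, N, Chunks).

chunks_([], _, []).
chunks_([X|Xs], N, [Chunk|Chunks]) :-
    length(Chunk, N),            % a list of N fresh variables
    append(Chunk, Rest, [X|Xs]), % bind them to the next N elements
    chunks_(Rest, N, Chunks).
```

Because the recursion is driven by the input list rather than by the output, there is no unconstrained generation of sublists to guard against.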
I'm trying to create a rule F(C,L) where C and L are integer lists. L contains the index number (starting from 1) of all the elements of C that are equal to 43. My code is shown below. When I try F([43,42,43,42,42,43],L). it returns true. What have I done wrong? Thanks in advance!
F(C,L) :-
forall(
(
member(X,C),
X=43,
nth1(N,C,X)
),
member(N,L)
).
The code by @CapelliC works, but only when used with sufficient instantiation.
?- f([43,42,43,42,42,43], Ps).
Ps = [1,3,6]. % ok
?- f([A,B], Ps).
Ps = [1,2]. % BAD
?- f(_, _).
**LOOPS** % BAD: doesn't terminate
To safeguard against problems like these we can use iwhen/2 like so:
f_safe(C, L) :-
iwhen(ground(C), findall(X,nth1(X,C,43),L)).
Let's re-run above queries with SWI-Prolog:
?- f_safe([43,42,43,42,42,43], Ps).
Ps = [1,3,6]. % still ok
?- f_safe([A,B], Ps). % BETTER
ERROR: Arguments are not sufficiently instantiated
?- f_safe(_, _). % BETTER
ERROR: Arguments are not sufficiently instantiated
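iwhen/2 may not be available in every installation. As a rough stand-in that covers only the ground(Term) condition used here, something along these lines could serve; iwhen_ground/2 and f_safe2/2 are hypothetical names invented for this sketch and are not the original definition of iwhen/2:

```prolog
:- use_module(library(lists)).

:- meta_predicate iwhen_ground(?, 0).

% iwhen_ground(?Term, :Goal)
% Call Goal if Term is ground, otherwise raise an instantiation error.
iwhen_ground(Term, Goal) :-
    (   ground(Term)
    ->  call(Goal)
    ;   throw(error(instantiation_error, _))
    ).

% Same idea as f_safe/2, built on the simplified guard.
f_safe2(C, L) :-
    iwhen_ground(C, findall(X, nth1(X, C, 43), L)).
```

With this guard, a query such as f_safe2([A,B], Ps) raises an error instead of silently returning an answer that depends on later bindings.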
Take it step by step:
:- use_module(library(clpfd)).
list_contains_at1s(Elements, Member, Positions) :-
list_contains_at_index1(Elements, Member, Positions, 1).
list_contains_at_index1([], _, [], _).
list_contains_at_index1([E|Es], E, [I1|Is], I1) :-
I2 #= I1+1,
list_contains_at_index1(Es, E, Is, I2).
list_contains_at_index1([E|Es], X, Is, I1) :-
dif(X, E),
I2 #= I1+1,
list_contains_at_index1(Es, X, Is, I2).
Sample query with SWI-Prolog:
?- list_contains_at1s([43,42,43,42,42,43], 43, Positions).
Positions = [1,3,6]
; false. % left-over choicepoint
Syntax error aside, you're making this more complex than needed. Keep it simple, and use findall/3 instead of forall/2. The latter cannot be used to instantiate variables outside its scope.
f(C,L) :- findall(X, nth1(X,C,43), L).
While this previous answer preserves logical-purity, it shows some inefficiency in queries like:
?- list_contains_at1s([43,42,43,42,42,43], 43, Ps).
Ps = [1,3,6] ; % <------ SWI toplevel indicates lingering choicepoint
false.
In above query the lingering choicepoint is guaranteed to be useless: we know that above use case can never yield more than one solution.
Method 1: explicit indexing and extra helper predicate
The earlier definition of list_contains_at_index1/4 has two recursive clauses—one covering the "equal" case, the other one covering the "not equal" case.
Note that these two recursive clauses of list_contains_at_index1/4 are mutually exclusive, because (=)/2 and dif/2 are mutually exclusive.
How can we exploit this?
By utilizing first-argument indexing together with the reified term equality predicate (=)/3!
:- use_module(library(reif)).
list_contains_at_index1([], _, [], _).
list_contains_at_index1([E|Es], X, Is0, I1) :-
=(E, X, T), % (=)/3
equal_here_at0_at(T, I1, Is0, Is),
I2 #= I1+1,
list_contains_at_index1(Es, X, Is, I2).
equal_here_at0_at(true , I1, [I1|Is], Is). % index on the truth value ...
equal_here_at0_at(false, _, Is , Is). % ... of reified term equality
Method 2: implicit indexing, no extra helper predicate, using if_/3
For more concise code we can put if_/3 to good use:
list_contains_at_index1([], _, [], _).
list_contains_at_index1([E|Es], X, Is0, I1) :-
if_(E = X, Is0 = [I1|Is], Is0 = Is),
I2 #= I1+1,
list_contains_at_index1(Es, X, Is, I2).
If we re-run above query with new improved code ...
?- list_contains_at1s([43,42,43,42,42,43], 43, Positions).
Positions = [1, 3, 6].
... we see that the query now succeeds deterministically. Mission accomplished!
I am trying to build a list function in Prolog which will hopefully do the following:
split(1, 4, [1, 2, 3, 4]). [2, 3]
split(2, 4, [1, 2, 3, 4, 5]). [3]
That is, it will collect into a list all the items which appear in between the two values provided.
What I have tried:
split(Start, Finish, List) :- append(List, _, [Start|Xs]),
append([Finish|Xs], _, List).
I can just never seem to get it working! I am new to prolog so please be relatively kind!!
Thanks
EDIT
Ok so I have a solution and would like to know if it could be improved. The solution is below,
% Split a list at a specified index
split(List, Index, Front, Back) :-
length(Front, Index),
append(Front, Back, List).
% Get list items inbetween members
inbetween(List, From, To, Result) :-
nth1(FromI, List, From),
nth0(ToI, List, To),
split(List, FromI, _, List1),
split(List, ToI, _, List2),
subtract(List1, List2, Result).
As you can see I followed the advice in the comments and tweaked it a little. Are there any improvements to this?
Thanks, (Could it even be possible in one predicate?)
EXAMPLE
inbetween([1,2,3,4,5,6], 2, 5, Result). % [3,4]
inbetween([a,b,c,d,e,f], a, e, Result). % [b,c,d]
I think the solution you came up with is interesting and it only needs a small adjustment to make it work correctly:
% Split a list at a specified index
split(List, Index, Front, Back) :-
length(Front, Index),
append(Front, Back, List).
% Get list items inbetween members
inbetween(List, From, To, Result) :-
nth1(FromI, List, From),
split(List, FromI, _, List1),
nth0(ToI, List1, To),
split(List1, ToI, Result, _).
The split/4 predicate is unchanged from what you have. The inbetween/4 main predicate I modified a little so that first it finds everything after the From, then it uses that result and finds everything before the To yielding the final result.
| ?- inbetween([a,b,c,a,x,b,e,f], a, b, L).
L = [] ? ;
L = [b,c,a,x] ? ;
L = [x] ? ;
(1 ms) no
A shorter version, using append/3, would be:
betwixt2(List, A, B, Result) :-
append(_, [A|T], List),
append(Result, [B|_], T).
Another approach which is more recursively based and not using library calls would be:
inbetween(List, A, B, Result) :-
split_left(List, A, R),
split_right(R, B, Result).
split_left([X|T], X, T).
split_left([_|T], X, R) :- split_left(T, X, R).
split_right([X|_], X, []).
split_right([H|T], X, [H|R]) :- split_right(T, X, R).
And finally, there's an interesting, concise solution, I hadn't considered when making my comments, using a DCG which is more transparent:
betwixt(A, B, M) --> anything, [A], collect(M), [B], anything.
anything --> [].
anything --> [_], anything.
collect([]) --> [].
collect([H|T]) --> [H], collect(T).
inbetween(List, A, B, Result) :- phrase(betwixt(A, B, Result), List).
The DCG in this case nicely spells out exactly what's happening, with the same results as above. For brevity, I could also use collect(_) in place of anything in the first clause, but didn't want to waste the unused argument.
To use a nice notation credited to @false, we can use ... as a term as shown below:
betwixt(A, B, M) --> ..., [A], collect(M), [B], ... .
... --> [].
... --> [_], ... .
collect([]) --> [].
collect([H|T]) --> [H], collect(T).
I am trying to write prolog code that will delete all punctuation (.,!? etc) from all lists in a list of lists. This is what I have so far:
delete_punctuation(_,[],_).
delete_punctuation(Character,[List|Tail],Resultlist) :-
delete(List,Character,NewList),
delete_punctuation(Character,Tail,[NewList|Resultlist]).
whereas 'Character' will be 33 for ! or 46 for . and so on since I will be using this only on lists of character codes. (I know, that the function would actually work for other elements that I would like to delete from the lists too.)
The results I receive when asking:
delete_punctuation(33,[[45,33,6],[4,55,33]],X).
is just
|: true.
However, I want it to be:
|: X = [[45,6],[4,55]].
What do I need to improve?
For this problem, I'd tackle it by addressing the two sub-problems separately, namely:
Filter/exclude a character code from a single list;
Applying the solution to the above to a list of lists of character codes.
To this end, I'd approach it like this:
exclude2(_, [], []).
exclude2(Code, [Code|Xs], Ys) :-
!, % ignore the next clause if codes match
exclude2(Code, Xs, Ys).
exclude2(Code, [X|Xs], [X|Ys]) :-
% else, Code != X here
exclude2(Code, Xs, Ys).
Note that some implementations like SWI-Prolog provide exclude/3 as a built-in, so you mightn't actually need to define it yourself.
Now, to apply the above predicate to a list of lists:
delete_punctuation(_, [], []).
delete_punctuation(Code, [L|Ls], [NewL|NewLs]) :-
    exclude2(Code, L, NewL),   % exclude2/3 as defined above
    delete_punctuation(Code, Ls, NewLs).
However, again, depending on the implementation, a built-in like maplist/3 could be used to achieve the same effect without having to define a new predicate:
?- maplist(exclude2(33), [[45,33,6],[4,55,33]], X).
X = [[45, 6], [4, 55]] ;
false.
n.b. if you want to use all SWI built-ins, exclude/3 requires the test to be a goal, like so:
?- maplist(exclude(==(33)), [[45,33,6],[4,55,33]], X).
X = [[45, 6], [4, 55]] ;
false.
For a more general approach, you could even add all the codes you want to exclude (such as any and all punctuation character codes) to a list to use as the filter:
excludeAll(_, [], []).
excludeAll(Codes, [Code|Xs], Ys) :-
member(Code, Codes),
!,
excludeAll(Codes, Xs, Ys).
excludeAll(Codes, [X|Xs], [X|Ys]) :-
excludeAll(Codes, Xs, Ys).
Then you can add a list with all the codes to delete:
?- maplist(excludeAll([33,63]), [[45,33,6],[4,55,33,63]], X).
X = [[45, 6], [4, 55]] ;
false.
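If the goal really is "all punctuation", one could avoid listing codes by hand and ask the system to classify each code instead. The sketch below assumes SWI-Prolog, where code_type/2 has a punct class and exclude/3 comes from library(apply); is_punct/1 and delete_all_punctuation/2 are hypothetical names:

```prolog
:- use_module(library(apply)).

% is_punct(+Code): Code is classified as punctuation by code_type/2.
is_punct(Code) :-
    code_type(Code, punct).

% delete_all_punctuation(+Lists, -Cleaned)
% Remove every punctuation code from each list of character codes.
delete_all_punctuation(Lists, Cleaned) :-
    maplist(exclude(is_punct), Lists, Cleaned).
```

Note that the punct class is fairly broad (it also covers symbols such as - and +), so an explicit list as in excludeAll/3 may still be preferable when only a few characters should go.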