Find repetitions in a list

How would I go about finding consecutive repetitions of a string in a list in Prolog?
What I'm exactly trying to find, for example, is this:
input => output
AAAAAA => 6*(A)
ABABAB => 3*(AB)
ABCABCABC => 3*(ABC)
I wrote a DCG grammar for this and I'm trying to have it give me this as a result.
Here's the grammar, if needed:
exp --> term.
exp --> term, [+], exp.
term --> factor.
term --> digit, [*], exp.
factor --> elem.
factor --> ['S'], ['['], sym, [']']. %S[(A)(B),(C)]
factor --> ['<'], alt, ['>'], ['/'], ['<'], alt, ['>']. %<(A)>/<(B)(C)(D)>
factor --> ['('], exp, [')'].
sym --> factor.
sym --> factor, [','], factor.
sym --> factor, sym.
alt --> factor.
alt --> factor, alt.
elem --> char.
elem --> char, elem.
char --> [D], {is_alnum(D)}.
digit --> [D], {is_alnum(D)}.
digit --> [D], {number(D)}.
nbr_to_char(N, Cs) :-
name(Cs, [N]).
str_to_list(S, Cs) :-
name(S, Xs),
maplist(nbr_to_char, Xs, Cs).
eval(L) :-
str_to_list(L, X),
exp(X, []).
Thanks for any help.

I think what you're after is pack(dcg_util).
But also consider append/2:
?- A=`ababab`.
A = [97, 98, 97, 98, 97, 98].
?- append([X,X,X],$A).
X = [97, 98],
A = [97, 98, 97, 98, 97, 98] ;
false.
Now, if we just find an easy way to make lists of repeated variables, we have a fairly powerful construct that we can use to tackle the problem. Let's try:
?- length(L,3),maplist(=(X),L).
L = [X, X, X].
So:
?- length(L,_),maplist(=(X),L),append(L,$A).
L = [[97, 98, 97, 98, 97, 98]],
X = A, A = [97, 98, 97, 98, 97, 98] ;
L = [[97, 98], [97, 98], [97, 98]],
X = [97, 98],
A = [97, 98, 97, 98, 97, 98] ;
^CAction (h for help) ? abort
% Execution Aborted
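Oops, a never-ending (and rather boring) story: after the last solution, length(L,_) keeps proposing ever longer lists, none of which can match. We need a bit more code, enforcing the domain knowledge (bagging, really...)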
?- length($A,U),between(1,U,N),length(L,N),maplist(=(X),L),append(L,$A).
U = 6,
N = 1,
L = [[97, 98, 97, 98, 97, 98]],
X = A, A = A, A = [97, 98, 97, 98, 97, 98] ;
U = 6,
N = 3,
L = [[97, 98], [97, 98], [97, 98]],
X = [97, 98],
A = A, A = [97, 98, 97, 98, 97, 98] ;
false.
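The bounded query above can be packaged as a predicate. What follows is only a minimal sketch, assuming SWI-Prolog; repetition/3 and max_repetition/3 are illustrative names, not part of any library. The extra divisibility check skips values of N that cannot possibly fit.

```prolog
:- use_module(library(lists)).
:- use_module(library(apply)).

% repetition(+List, -N, -Unit): List consists of N consecutive copies of Unit.
% Hypothetical helper; enumerates N in ascending order on backtracking.
repetition(List, N, Unit) :-
    length(List, Len),
    between(1, Len, N),
    Len mod N =:= 0,            % Unit must fit exactly N times
    length(Copies, N),
    maplist(=(Unit), Copies),   % Copies = [Unit, Unit, ...]
    append(Copies, List).

% max_repetition(+List, -N, -Unit): the solution with the largest N,
% i.e. the shortest repeating unit. Fails for the empty list.
max_repetition(List, N, Unit) :-
    findall(N0-U0, repetition(List, N0, U0), Solutions),
    last(Solutions, N-Unit).
```

For example, ?- max_repetition([a,b,c,a,b,c,a,b,c], N, U). yields N = 3, U = [a,b,c], which corresponds to the 3*(ABC) form asked for.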

Related

Grouping list elements based on values in list

I have a list of elements (tuples):
tuples(
[
(a,b,[1,3,5,7]),
(a,b,[9,11,13,15]),
(a,b,[17,19,21,23]),
(c,d,[0,2,4,6]),
(c,d,[8,10,12,14]),
(c,d,[16,18,20,22]),
(e,f,[100,200,300,400]),
(e,f,[500,600,700,800]),
(e,f,[900,1000,1100,1200])
]
).
How to group them so it becomes:
[
(a,b,[1,3,5,7,9,11,13,15,17,19,21,23]),
(c,d,[0,2,4,6,8,10,12,14,16,18,20,22]),
(e,f,[100,200,300,400,500,600,700,800,900,1000,1100,1200])
]
As we can see, we grouped the (a,b) (c,d) (e,f) and concatenated their respective lists.
thank you for the help.
How about this somewhat SQL-ish way:
Define a predicate to access data in the original deep and nasty datastructure (which uses conjunction (_,_) to create what looks like n-tuples but are actually "nearly" lists, except at the final position
of the backbone). On backtracking, it will pull out the individual
records one by one and present the information therein in the head variables:
some_tuple(V,W,Values) :-
tuples(Tuples),
member((V,W,Values), Tuples).
And then collect using a 2-level bagof/3 call.
solution(Bag) :-
    bagof((V,W,FlatBagForVW),                               % one entry per distinct V,W
          BagForVW^( bagof(L,some_tuple(V,W,L),BagForVW),   % collect lists for a given V,W (backtracks over V,W)
                     flatten(BagForVW,FlatBagForVW) ),      % flatten it
          Bag).                                             % what we want
Done!
If you are in SWI-Prolog, first tell the toplevel printer to not elide long lists so much:
?- set_prolog_flag(answer_write_options,[max_depth(100)]).
Then:
?- solution(Bag).
Bag = [(a,b,[1,3,5,7,9,11,13,15,17,19,21,23]),
(c,d,[0,2,4,6,8,10,12,14,16,18,20,22]),
(e,f,[100,200,300,400,500,600,700,800,900,1000,1100,1200])].
A variation using foldl/4 and then library(yall):
'Grouping list elements based on values in list'(Gs) :-
tuples(Ts),
foldl([(X,Y,L),V0,V1]>>(
( append(N,[(X,Y,L0)|M],V0)
-> append(L0,L,L1)
; N=V0, L1=L
),
append(N,[(X,Y,L1)|M],V1)
)
,Ts,[],Gs).
a day after...
There is a bug going unnoticed, namely M remains unbound where a pair of keys (that is, X,Y in a tuple) is not present in V0 (have attempted to keep the variables naming coherent to the docs for foldl/4). A possible correction, that illustrates a single unification call (=/2) to perform multiple 'assignments' at once:
...
foldl([(X,Y,L),V0,V1]>>(
( append(N,[(X,Y,L0)|M],V0)
-> append(L0,L,L1)
; (N,M,L1)=(V0,[],L)
),
append(N,[(X,Y,L1)|M],V1)
...
Still, it's not clear to me why the bug didn't materialize... for instance, here it's clearly visible...
?- append([1,2,3],[4,5,6|_],R).
R = [1, 2, 3, 4, 5, 6|_19156].
another way...
library(solution_sequences) is a recent addition to the SWI-Prolog arsenal, providing more 'SQL like' constructs, for instance group_by/4:
by_group_by(Gs) :-
tuples(Ts),
findall((X,Y,All_xy), (
group_by([X,Y],L,member((X,Y,L),Ts),Lt),
append(Lt,All_xy)
),Gs).
The documentation is too terse, but overall the library is worth a try. There are some examples posted in the SWI-Prolog discourse group, but I find the data used there boring and difficult to understand.
Note: I used the syntax [X,Y] for the free variables, to make clear the 'shape' of this important specification is unrelated to the pattern.
First we need to be able to find the list of unique A,B pairings in the list of tuples:
foo(X,A,B) :-
setof( p(A,B), C^member( (A,B,C), X ), T),
member( p(A,B), T).
/* 48 ?- tuples(_X), foo(_X,A,B).
A = a,
B = b ;
A = c,
B = d ;
A = e,
B = f. */
Next we collect the tuples for each pair of the A and B values:
bar(X,A,B,G) :-
foo(X,A,B), % for each unique (A,B,_) in X
findall( C, % find all Cs such that (A,B,C) is in X
member( (A,B,C), X),
CS),
append( CS,G). % and append them together
/* 66 ?- tuples(_X), bar(_X,A,B,G).
A = a,
B = b,
G = [1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23] ;
A = c,
B = d,
G = [0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22] ;
A = e,
B = f,
G = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200].*/
So that the grouping is done with
groups(X,GS):-
findall( (A,B,G), bar(X,A,B,G), GS).
/* 68 ?- tuples(_X), groups(_X, GS).
GS = [(a, b, [1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23]),
(c, d, [0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22]),
(e, f, [100, 200, 300, 400, 500, 600, 700, 800,
900, 1000, 1100, 1200])].
*/
update: from David Tonhofer's answer, we see that it can be done with just one nested bagof call,
solution( Sol ) :-
tuples(TS),
bagof( (A,B,FLS), % (* collect (A,B,FLS) for each (A,B) *)
LS^( % (* such that (A,B,L) is in TS *)
bagof( L, member((A,B,L), TS), LS ),
flatten( LS, FLS)
), % (* with the lists LS flattened *)
Sol ).
The outer bagof's goal predicate (i.e. the inner bagof call) is automatically backtracking to produce its LS result for each distinct pair of the A,B values, automagically achieving the same effect as we had "manually", in this answer.
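The backtracking over free variables that both bagof solutions rely on can be seen in isolation with a toy example. This is just a sketch: item/2, per_key/2 and all_values/1 are made-up names for illustration.

```prolog
% Hypothetical toy facts: item(Key, Value).
item(a, 1).
item(b, 2).
item(a, 3).

% K is free in the goal and absent from the template, so bagof/3
% produces one bag per distinct K on backtracking.
per_key(K, Vs) :- bagof(V, item(K, V), Vs).

% K^ marks K as existentially quantified: a single bag for all keys.
all_values(Vs) :- bagof(V, K^item(K, V), Vs).
```

Here ?- per_key(K, Vs). answers K = a, Vs = [1,3] and then K = b, Vs = [2], while ?- all_values(Vs). answers Vs = [1,2,3] once.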
group([],L,L).
group([(A,B,L)|T],Acc,Out):-
( append(L1,[(A,B,LL)|L2], Acc)
-> append(LL,L,LLL),
append(L1,[(A,B,LLL)|L2], NewAcc),
group(T,NewAcc,Out)
; append(Acc,[(A,B,L)],NewAcc),
group(T,NewAcc,Out)
).
?- tuples(L), group(L,[],G).
G = [
(a,b,[1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23]),
(c,d,[0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22]),
(e,f,[100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200])
],
L = [
(a,b,[1, 3, 5, 7]),
(a,b,[9, 11, 13, 15]),
(a,b,[17, 19, 21, 23]),
(c,d,[0, 2, 4, 6]),
(c,d,[8, 10, 12, 14]),
(c,d,[16, 18, 20, 22]),
(e,f,[100, 200, 300, 400]),
(e,f,[500, 600, 700, 800]),
(e,f,[900, 1000, 1100, 1200])
].
Basically, I strip an element from the (tuple) list ([(A,B,L)|T]) and check whether there is a similar entry in my bag list (the second argument, via append(L1,[(A,B,LL)|L2], Acc)). I use append/3 for this because it can find an element and, at the same time, split the original list into the parts before and after the found element.
If I find an entry in the bag list, I append the two number lists (LL and L) into a new one (LLL) and replace the entry in the bag list. Then I call the predicate again with the rest of the list and the altered bag list.
If no element fits, I just add the element to the bag list (as the last element, append(Acc,[(A,B,L)],NewAcc)) and call the predicate again. So each time group/3 is called, its first argument loses an element, until there is no element left (group([],L,L).), in which case the bag list is the output.
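The find-and-split use of append/3 described above can be tried on its own. A small sketch, assuming SWI-Prolog's library(lists); select_split/4 and replace_first/4 are hypothetical helper names, not predicates from the answer.

```prolog
:- use_module(library(lists)).

% select_split(?Elem, +List, -Before, -After): List is Before, then Elem,
% then After. Enumerates every matching position on backtracking.
select_split(Elem, List, Before, After) :-
    append(Before, [Elem|After], List).

% replace_first(+Old, +New, +List, -NewList): replace the first element
% unifying with Old, using the same split-and-rebuild idea as group/3.
replace_first(Old, New, List, NewList) :-
    once(append(Before, [Old|After], List)),
    append(Before, [New|After], NewList).
```

For instance, ?- select_split(x, [a,x,b], B, A). gives B = [a], A = [b].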
After playing around a bit with group_by/4 and/or bagof/3 as suggested by Guy coder, and using a variant of David's "extraction" predicate, here is another solution:
one_tuple(V,W,Value) :-
tuples(Tuples),
member((V,W,Values), Tuples),
member(Value, Values).
?- findall((A,B,Cs), bagof(C, one_tuple(A, B, C), Cs), O).
O = [(a,b,[1,3,5,7,9,11,13,15,17,19,21,23]),
(c,d,[0,2,4,6,8,10,12,14,16,18,20,22]),
(e,f,[100,200,300,400,500,600,700,800,900,1000,1100,1200])].

Nested if statements that return boolean in prolog

I want to compare two date structures that return true if the first date is earlier than the second one, and false otherwise.
The logic is simple.
Let the first date D1, and second date D2.
If year of D1 > year of D2, return false.
If year of D1 == year of D2, compare month.
If year of D1 < year of D2, return true.
This process is repeated till the comparison of seconds.
Sample code I tried:
date_earlier_than(date(SY, SM, SD, SH, SMm, SS, _, _, _),
                  date(EY, EM, ED, EH, EMm, ES, _, _, _)) :-
    (   SY > EY
    ->  fail
    ;   SY =:= EY
    ->  (   SM > EM
        ->  fail
        ;   SM =:= EM
        ->  (   SD > ED
            ->  fail
            ;   SD =:= ED
            ->  (   SH > EH
                ->  fail
                ;   SH =:= EH
                ->  (   SMm > EMm
                    ->  fail
                    ;   SMm =:= EMm
                    ->  (   SS >= ES
                        ->  fail
                        ;   true
                        )
                    ;   true
                    )
                ;   true
                )
            ;   true
            )
        ;   true
        )
    ;   true
    ).
This predicate always returns false whatever dates passed as arguments.
I don't know where it goes wrong, and even if I use trace to keep track of the process, the predicate immediately returns false after the execution.
UPDATE:
The sample code works as expected, but I did not use date/9 when I tested it, leading to failure.
Here's revised form:
date_earlier_than(date(SY, SM, SD, SH, SMm, SS, SOff, STZ, SDST),
date(EY, EM, ED, EH, EMm, ES, EOff, ETZ, EDST)) :-
Start = date(SY, SM, SD, SH, SMm, SS, SOff, STZ, SDST),
End = date(EY, EM, ED, EH, EMm, ES, EOff, ETZ, EDST),
Start @< End, !.
This is all completely unnecessary.
If you represent the date as a compound term, those will compare correctly using the built-in comparison predicates and the "standard order of terms". See for example the SWI-Prolog docs:
https://www.swi-prolog.org/pldoc/man?section=standardorder
or GNU-Prolog:
http://www.gprolog.org/manual/gprolog.html#sec77
Try, for example:
?- date(2020, 10, 26, 10, 34, 43) @< date(2020, 10, 26, 6, 34, 43).
or:
?- date(2020, 10, 26, 10, 34, 43) @> date(2020, 10, 26, 6, 34, 43).
But this is also not necessary. Just represent your time as a timestamp (seconds since 1970-01-01T0:0:0Z, for example) and compare those as numbers (probably floats). See for example here:
https://www.swi-prolog.org/pldoc/man?section=timedate
This predicate always returns false whatever dates passed as arguments.
Not for me:
?- Today = date(2020, 10, 26, 12, 13, 14, _, _, _), Agincourt = date(1415, 10, 25, 09, 10, 11, _, _, _), date_earlier_than(Today, Agincourt).
false.
?- Today = date(2020, 10, 26, 12, 13, 14, _, _, _), Agincourt = date(1415, 10, 25, 09, 10, 11, _, _, _), date_earlier_than(Agincourt, Today).
Today = date(2020, 10, 26, 12, 13, 14, _2890, _2892, _2894),
Agincourt = date(1415, 10, 25, 9, 10, 11, _2916, _2918, _2920).
The battle of Agincourt did indeed take place before today, so your predicate seems to be working, at least to some extent. This is a good argument for always (always) including your test data in your questions.
Other than that, I agree with the other answer's suggestion of setting your data up such that you can just use @< to compare them. I don't necessarily agree with the other answer's suggestion of using an opaque seconds-since-epoch representation. It depends on what you actually need to do, and if it has to take time zones into account.
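As a rough sketch of the standard-order approach applied to the question's date/9 terms: the predicate below compares only the first six fields, so unbound offset, time zone and DST arguments play no role. The name earlier/2 and the helper functor t/6 are made up for illustration, and the sketch assumes all six fields are bound numbers and that both dates are in the same time zone.

```prolog
% earlier(+Date1, +Date2): hypothetical name. Succeeds if Date1 is strictly
% before Date2, looking only at year, month, day, hour, minute and second.
% Compound terms with the same functor are ordered argument by argument,
% left to right, so one @< does the work of the nested if-then-else.
earlier(date(Y1,Mo1,D1,H1,Mi1,S1,_,_,_), date(Y2,Mo2,D2,H2,Mi2,S2,_,_,_)) :-
    t(Y1,Mo1,D1,H1,Mi1,S1) @< t(Y2,Mo2,D2,H2,Mi2,S2).
```

With the dates from the test above, earlier(Agincourt, Today) succeeds and earlier(Today, Agincourt) fails.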

Prolog - List manipulation, if--then--else, recursion

The goal:
rovarsprak([104, 101, 106], B).
B = [104,111,104,101,106,111,106]
[104, 101, 106] = "hej"
B = "hohejoj"
Currently:
rovarsprak([104, 101, 106], B).
B = [104,111,104,106,111,106]
[104, 101, 106] = "hej"
B = "hohjoj"
(I.e., I'm losing the middle, unaffected element (101 = "e").)
The code:
voal(97). % 97 = a
voal(111). % 111 = o
voal(117). % 117 = u
voal(101). % 101 = e
voal(105). % 105 = i
voal(121). % 121 = y
isLowerCase(A) :-
A > 96, % small A is 97
A < 123. % small Z is 122
rovarsprak([], []).
rovarsprak([A|AS], [X,Y,Z|ZS]) :-
A = X, A = Z, Y = 111,
isLowerCase(A),
(\+ voal(A)) % if
-> rovarsprak(AS, ZS) % then
; rovarsprak(AS, [X,Y,Z|ZS]). % else
I feel like I'm missing the point here...
The problem is in the else case, where you call your predicate recursively without having instantiated any of X, Y, Z. You should instantiate X = A so that A is not dropped from the output, as it is now. So the only change you need is to add A=X after the ; and call rovarsprak(AS, [Y,Z|ZS]), since X is already instantiated:
rovarsprak([], []).
rovarsprak([A|AS], [X,Y,Z|ZS]) :-
A = X, A = Z, Y = 111,
isLowerCase(A),
(\+ voal(A)) % if
-> rovarsprak(AS, ZS) % then
; A=X,rovarsprak(AS, [Y,Z|ZS]). % else
Example:
?- rovarsprak([104, 101, 106], B).
B = [104, 111, 104, 101, 106, 111, 106].
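One caveat: the corrected clause still unifies the output with [X,Y,Z|ZS] in the head, so the else branch demands at least two more output elements and the predicate fails when the input ends in a vowel (for example "ja"). A possible restructuring, offered only as a sketch, decides the shape of the output inside the if-then-else. The names rovar/2, vowel/1 and lower_case/1 are hypothetical stand-ins for rovarsprak/2, voal/1 and isLowerCase/1.

```prolog
:- use_module(library(lists)).

% vowel(+Code): same vowels as voal/1 in the question.
vowel(C) :- memberchk(C, [0'a, 0'e, 0'i, 0'o, 0'u, 0'y]).

% lower_case(+Code): a to z.
lower_case(C) :- C >= 0'a, C =< 0'z.

rovar([], []).
rovar([C|Cs], Out) :-
    (   lower_case(C), \+ vowel(C)
    ->  Out = [C, 0'o, C|Rest]     % consonant: C, o, C
    ;   Out = [C|Rest]             % anything else is copied unchanged
    ),
    rovar(Cs, Rest).
```

?- rovar([104, 101, 106], B). gives B = [104,111,104,101,106,111,106] as before, and ?- rovar([106, 97], B). gives B = [106,111,106,97].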

swi-prolog: how to sort list of lists by NTH element of sublist, ALLOWING DUPLICATES

(I have used 'asserta' to put a large csv file with several columns into the database.) Is there a way to sort numerically by column without removing duplicates?
As you can see from my simple example (which sorts by the second column / element), the predsort method removes duplicates.
I could work around this by switching and removing some columns and using msort, but am asking you specifically here for an alternative.
Any advice would be v much appreciated !
mycompare(X,E1,E2):-
E1=[_,A1],E2=[_,A2],compare(X, A1, A2).
?- predsort(mycompare,[ [[95, 97], 11], [[97, 99], 4], [[97, 98], 4]],X).
X = [[[97, 99], 4], [[95, 97], 11]].
?- msort([ [[95, 97], 11], [[97, 99], 4], [[97, 98], 4]],X).
X = [[[95, 97], 11], [[97, 98], 4], [[97, 99], 4]].
%What I want is:
?- wanted_sort(...<as above>...).
X = [[[97, 98], 4], [[97, 99], 4], [[95, 97], 11] ].
The standard way to do this would be to use keysort/2. So first you start by mapping the elements accordingly, then keysorting, and mapping back the values.
list_pairs([], []).
list_pairs([E|Es], [B-E|Ps]) :-
E = [_,B],
list_pairs(Es, Ps).
pairs_values([], []).
pairs_values([_-V|Ps], [V|Vs]) :-
pairs_values(Ps, Vs).
andrew_sort(Xs, Ys) :-
list_pairs(Xs, Ps),
keysort(Ps, PsS),
pairs_values(PsS, Ys).
For other uses of keysort/2 see this list.
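In SWI-Prolog the same map, keysort, unmap steps can be written with library(pairs), which also makes the column number a parameter. A sketch under that assumption; sort_by_nth/3 is a made-up name.

```prolog
:- use_module(library(pairs)).
:- use_module(library(lists)).

% sort_by_nth(+N, +Rows, -Sorted): sort Rows by their N-th element (1-based),
% keeping duplicates. keysort/2 is stable, so rows with equal keys keep
% their original relative order.
sort_by_nth(N, Rows, Sorted) :-
    map_list_to_pairs(nth1(N), Rows, Pairs),   % Key-Row, Key = N-th element
    keysort(Pairs, SortedPairs),
    pairs_values(SortedPairs, Sorted).
```

With the data from the question, ?- sort_by_nth(2, [[[95,97],11],[[97,99],4],[[97,98],4]], S). gives S = [[[97,99],4],[[97,98],4],[[95,97],11]]: both rows with key 4 survive, in their input order.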
Imho predsort/3 provides a very general and fairly efficient way to do it: it's as simple as never returning = from the comparison predicate. Example:
?- [user].
|: comparer(<, A, B) :- A @< B.
|: comparer(>, _, _).
(^D here)
true.
?- predsort(comparer, [1,2,1,a,b,a], L).
L = [1, 1, 2, a, a, b].
Your test case:
mycompare(<,[_,A1|_],[_,A2|_]) :- A1 < A2.
mycompare(>, _, _).
yields
?- predsort(mycompare,[ [[95, 97], 11], [[97, 99], 4], [[97, 98], 4]],X).
X = [[[97, 98], 4], [[97, 99], 4], [[95, 97], 11]].
I slightly generalized the pattern matched, from [_,N] to [_,N|_]...
edit: it's funny, I didn't read the title...
to generalize comparing for nth argument:
?- predsort(nthcompare(2),[ [[95, 97], 11], [[97, 99], 4], [[97, 98], 4]],X).
X = [[[97, 98], 4], [[97, 99], 4], [[95, 97], 11]].
and nthcompare/4 itself:
nthcompare(N,<,A,B) :- nth1(N,A,X),nth1(N,B,Y), X @< Y.
nthcompare(_,>,_,_).
that is...

Split a number into a list of digits in Prolog

I've been having trouble trying to split numbers into lists using Prolog, e.g. 123456 becomes [1,2,3,4,5,6].
Can you please help me work out how to do this?
the builtins available are ISO standard:
?- number_codes(123456,X),format('~s',[X]).
123456
X = [49, 50, 51, 52, 53, 54].
?- number_chars(123456,X),format('~s',[X]).
123456
X = ['1', '2', '3', '4', '5', '6'].
I also have some very old code I developed for my interpreter. To run it with standard Prologs, := must be renamed to is (and the division in iptoa/3 written as integer division, //). But then you are best served by the builtins above...
itoa(N, S) :-
N < 0, !,
NN := 0 - N,
iptoa(NN, SR, _),
reverse(SR, SN),
append("-", SN, S).
itoa(N, S) :-
iptoa(N, SR, _),
reverse(SR, S).
iptoa(V, [C], 1) :-
V < 10, !,
C := V + 48.
iptoa(V, [C|S], Y) :-
M := V / 10,
iptoa(M, S, X),
Y := X * 10,
C := V - M * 10 + 48.
edit here the additional call required to get numbers:
?- number_codes(123456,X), maplist(plus(48),Y,X).
X = [49, 50, 51, 52, 53, 54],
Y = [1, 2, 3, 4, 5, 6].
You could first create a reverse list:
% Base step
splitRev(0,[]).
% Recursive step (the N > 0 guard keeps this clause from also matching 0)
splitRev(N,[A|As]) :- N > 0, N1 is floor(N/10), A is N mod 10, splitRev(N1,As).
The recursive step works like this:
N1 is floor(N/10)
divides N by 10 and rounds it down. So 538 becomes 53.8 becomes 53.
It cuts off the last digit.
A is N mod 10
takes the remainder of N divided by 10. 538 mod 10 equals 8.
So you get only the last digit.
Now for splitting the list you only need to reverse the list created by splitRev/2.
So predicate split/2 is defined as:
split(N,L1) :- splitRev(N,L2), reverse(L2,L1).
Note that reverse/2 is a built-in predicate.
I hope this helps!
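For completeness, the builtin route mentioned earlier can be wrapped into a single predicate. A minimal sketch for non-negative integers; number_digits/2 and code_digit/2 are illustrative names. Unlike the arithmetic version, it also yields [0] for 0.

```prolog
:- use_module(library(apply)).

% code_digit(+Code, -Digit): character code of a decimal digit to its value.
code_digit(Code, Digit) :- Digit is Code - 0'0.

% number_digits(+N, -Digits): Digits are the decimal digits of the
% non-negative integer N, most significant first.
number_digits(N, Digits) :-
    integer(N),
    N >= 0,
    number_codes(N, Codes),
    maplist(code_digit, Codes, Digits).
```

?- number_digits(123456, Ds). gives Ds = [1,2,3,4,5,6].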