tokenizing with matching list in prolog - list

I am working with chemical compound names. I used DCG rules to define the name format, but the input has to be a list of tokens, which is not convenient because the elements are separated by a separator. I want to pass the name as one string, like "1-butene", and have it sent to the grammar as ['1','-',but,ene]. My code is here:
stem11-->[but]|[pent]|[hex]|[hept].
suf --> [ene]|[yne].
seperater-->['-'].
numerals-->['1']|['2']|['3']|['4']|['5']|['6']|['7'].
main-->numerals,seperater,stem11,suf.
check(S):-tokenize(S,L),main(L,[]).
Here, the query
?- tokenize("1-butene",L).
should give
L = ['1','-',but,ene].
I want the tokenizing code; the input may be like [1-butene]. I tried many ways but couldn't get the proper code. Please help me.

To keep things simple, I would do tokenization inline:
stem11 -->"but"|"pent"|"hex"|"hept".
suf -->"ene"|"yne".
seperator -->"-".
numerals -->"1"|"2". % etc
main -->numerals,seperator,stem11,suf.
Then use ?- phrase(main, "1-butene"). or ?- main("1-butene", []).
Edit: the following version also returns the list of tokens:
stem11(A) --> atom(["but","pent","hex","hept"], A).
suf(A) --> atom(["ene","yne"], A).
separator(-) -->"-".
numerals(A) --> atom(["1","2","3","4","5","6","7"], A).
atom(L, A) --> {member(S, L)}, atom_match(S), {atom_codes(A, S)}.
atom_match([]) --> [].
atom_match([C|Cs]) --> [C], atom_match(Cs).
tokenize([A,B,C,D]) --> numerals(A), separator(B), stem11(C), suf(D).
check(S,L) :- phrase(tokenize(L), S, []).
yields (tested with GnuProlog)
?- check("1-butene",L).
L = ['1',-,but,ene] ?
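The answer above was tested with GNU Prolog, where double-quoted text in a program is read as a list of character codes by default. Under a system with a different default (SWI-Prolog 7 and later, for example, reads it as a string object), the same grammar can still be used if the flag is set explicitly. The following is a minimal sketch under that assumption; valid_name/1 is a hypothetical wrapper, not part of the original answer, that takes an atom so that the query itself does not depend on the flag:

```prolog
% Read "..." in this file as a list of character codes.
:- set_prolog_flag(double_quotes, codes).

stem11 --> "but" | "pent" | "hex" | "hept".
suf --> "ene" | "yne".
separator --> "-".
numerals --> "1" | "2" | "3" | "4" | "5" | "6" | "7".
main --> numerals, separator, stem11, suf.

% valid_name(+Atom): hypothetical helper; true if Atom spells a name
% accepted by main//0.
valid_name(Atom) :-
    atom_codes(Atom, Codes),
    phrase(main, Codes).
```

With this, ?- valid_name('1-butene'). succeeds and ?- valid_name('1-butane'). fails.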

Here is my answer. The input is a string like "1-butene":
stem11-->[but]|[pent]|[hex]|[hept].
suf --> [ene]|[yne].
seperater-->['-'].
numerals-->['1']|['2']|['3']|['4']|['5']|['6']|['7'].
main-->numerals,seperater,stem11,suf.
:-set_prolog_flag(double_quotes, codes).
any(A,K) --> {member(S,K)}, S, {atom_codes(A, S)}.
words(A) --> any(A,["but","pent","hex","hept","ene","yne","-","1","2","3"]).
split([]) --> "".
split([X|Xs]) --> words(X), split(Xs).
tokenize(S,L):-phrase(split(L),S).
check(S):-tokenize(S,L),main(L,[]).
Query it like this:
?- tokenize("1-butene",L).
L = ['1', -, but, ene] ;
?- check("1-butene").
true ;

Create a list from a Prolog DCG

I can't figure out how to create a list containing the elements that have been used to form a sentence using a defined DCG.
Suppose we have the following DCGs:
father --> [Peter].
mother --> [Isabel].
child --> [Guido].
child --> [Claudia].
verb --> [is].
relation --> [father, of].
relation --> [mother, of].
pronoun --> [he].
pronoun --> [she].
adjective --> [a, male].
adjective --> [a, female].
s --> father, verb, relation, child.
s --> mother, verb, relation, child.
s --> pronoun, verb, adjective.
Sentences can be queried as follows:
phrase(s, [peter, is, father, of, guido]), phrase(s, [he, is, a, male]). which returns true.
How can I create and maintain a list of the elements of these executed sentences in order to get false when executing the following sentences (because Peter is a male; notice the she instead of he):
phrase(s, [peter, is, father, of, guido]), phrase(s, [she, is, a, female]).
This question uses the same example as here.
The proper interface to DCGs is via phrase/2, a simplified version of phrase/3:
?- phrase(s, X).
X = [_8304, is, father, of, _8328] ;
X = [_8304, is, father, of, _8328] ;
X = [_8304, is, mother, of, _8328] ;
% etc
Variables such as _8304 come from rules like father --> [Peter]. because Peter in there is also a variable (variables start with _ or an upper case letter). You can fix this by quoting the atom as 'Peter' (see also the other question you asked).
The first argument of phrase/2 is the DCG nonterminal, the second argument is the list. When you use a particular list as the second argument, the answer substitution is empty and Prolog just reports that it could derive the list. In my example, I used the variable X and obtained the possible substitutions for it that can be derived.
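For illustration, here is a sketch of the question's grammar with the names written as lowercase atoms, so that the derived lists no longer contain unbound variables. Only the rules needed for the first sentence form are shown; nothing else is changed:

```prolog
% Names are plain atoms now, not variables.
father --> [peter].
mother --> [isabel].
child --> [guido].
child --> [claudia].
verb --> [is].
relation --> [father, of].
relation --> [mother, of].
s --> father, verb, relation, child.
s --> mother, verb, relation, child.
```

With these definitions, ?- phrase(s, [peter, is, father, of, guido]). succeeds, and ?- phrase(s, X). enumerates ground lists, beginning with X = [peter, is, father, of, guido].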
Constraints can be added as goals enclosed in curly brackets:
dupnum(X) -->
    { member(X, [0,1,2,3,4,5,6,7,8,9]) },
    [X,X].
leads to
?- phrase(dupnum(X), Y).
X = 0,
Y = [0, 0] ;
X = 1,
Y = [1, 1] ;
% etc
The example also shows that DCG productions can have arguments which you could use to propagate a parse tree or some general parsing context.
To address the extra question that was added later: you can pass information around by adding additional arguments, but you need to describe the sequence of sentences. The example phrase(s, [peter, is, father, of, guido]), phrase(s, [he, is, a, male]) should succeed; what should not succeed is uttering the two sentences after each other, phrase(ss, [peter, is, father, of, guido, '.', he, is, a, male,'.']) (or it might succeed, leaving "he" as a reference to someone we do not know; it all depends on how strict we are with the context).
To do this properly, we need to jump through quite a number of hoops. First we need to add parsing information to the DCG rules. For example, np(np([A,O],object)) --> %... will parse an article followed by an object into a structure np([A,O],object). Let's parse [a, male] and [guido] with it:
?- phrase(np(NP), [a, male]).
NP = np([article(a), object(male, male)], object) ;
false.
?- phrase(np(NP), [guido]).
NP = np([name(guido, male)], name) ;
false.
The first argument of np is a list because other np rules have only one component. Note that we added the gender as an attribute to name and object. In other languages, for example French, the article also has a gender that needs to agree with the object, but in English we can set this aside. More sophisticated implementations would also take into account whether the object is singular or plural (the same goes for the different modes of a verb).
For verbs, we need to distinguish how many object noun phrases they need. This is done by the checks is_transitive/1, is_intransitive/1 and is_bitransitive/1.
Finding a good solution for demonstrative pronouns is hard: the pronoun does not need to refer to the previous subject, for example in "Gaile is married to Peter. He is older than her.". It doesn't even need to refer to the last sentence at all, for example in "Peter is where he wants to be.". This means that a) you should decide beforehand, which cases you actually want to cover and b) it is best to make these decisions in a second parsing run, when you have the full structured information. This mirrors the linguistic distinction between syntactic, semantic and pragmatic reasoning, where I would categorize the problem you would like to solve as pragmatic, which depends on the other two steps.
My solution here just incorporates the particular decision you wanted to make into a single parsing run, at the cost of readability of the ss DCG rule: we add an additional argument that collects the sentences that have been parsed already, a so-called accumulator. When we start parsing, the history is empty, which is reflected by the rule ss(S) --> ss(S,[]). For the actual rules, we need to distinguish whether the current sentence starts with a demonstrative pronoun or not. In the first case we need to resolve it, which we do here by looking at possible noun phrases in the previous sentence that agree in gender. With this machinery in place, we can parse the sentence [peter, is, a, father, '.', he, is, a, father, '.']:
?- phrase(ss(Tree), [peter,is,a,father,'.', he, is, a, father, '.']).
Tree = [s(np([name(peter, male)], name), vp([verb(is), np([article(a), object(father, male)], object)])), s(np([pronoun(he, male)], dpronoun), vp([verb(is), np([article(a), object(..., ...)], object)]))] ;
but we cannot parse [peter,is,a,father,'.', she, is, a, father, '.']:
?- phrase(ss(Tree), [peter,is,a,father,'.', she, is, a, father, '.']).
false.
In a proper semantic/pragmatic analysis, we would enrich the pronoun phrase with the actual referenced noun phrase but this would be done as a rewriting of the original parse tree. Here is the code for this:
%%%% utility predicates
% gender_of(X,Y) is true if X is the gender of the syntax tree node Y
gender_of(X,name(_,X)).
gender_of(X,pronoun(_,X)).
gender_of(X,object(_,X)).
gender_of(G,np([X],_)) :-
    gender_of(G,X).
gender_of(G,np([_,X],_)) :-
    gender_of(G,X).

% nps_of(X,Y) is true if X is the list of nps occurring in the syntax tree node Y
nps_of([],vp([_])).
nps_of([NP],vp([_,NP])).
nps_of([NP|Rest],s(NP,VP)) :-
    nps_of(Rest, VP).

% nountype_of(X,Y) is true if X is the type of the np node Y
nountype_of(X, np(_,X)).

% is_intransitive(X) is true if the verb X does not require an object phrase
is_intransitive(is).
is_intransitive(walk).

% is_transitive(X) is true if the verb X requires an object phrase
is_transitive(is).

% is_bitransitive(X) is true if the verb X requires two object phrases
is_bitransitive(is).

%%%% DCG rules
% name are distinct from objects because they do not require articles
name(name(peter,male)) --> [peter].
name(name(isabel,female)) --> [isabel].
name(name(guido,male)) --> [guido].
name(name(claudia,female)) --> [claudia].

% nouns that require an article
object(object(mother,female)) --> [mother].
object(object(father,male)) --> [father].
object(object(male,male)) --> [male].
object(object(female,female)) --> [female].

% verbs
verb(verb(is)) --> [is].
verb(verb(walk)) --> [walks].

% pronouns
pronoun(pronoun(he,male)) --> [he].
pronoun(pronoun(she,female)) --> [she].

% articles
article(article(a)) -->
    [a].
article(article(the)) -->
    [the].

% noun phrases
np(np([A,O],object)) -->
    article(A),
    object(O).
np(np([N],name)) -->
    name(N).
np(np([PN], dpronoun)) -->
    pronoun(PN).

% verb phrases
vp(vp([V,NP])) -->
    verb(V),
    { V = verb(Name), is_transitive(Name) },
    np(NP).
vp(vp([V])) -->
    verb(V),
    { V = verb(Name), is_intransitive(Name) }.

end -->
    ['.'].

% a single sentence
s(s(NP,VP)) -->
    np(NP),
    vp(VP),
    end.

% a list of sentences, with accumulator
ss([],_Acc) -->
    [].
ss([S|Sentences],[]) -->
    s(S),
    ss(Sentences, [S]).
ss([S|Sentences], [LastS | Acc]) -->
    { S = s(np([Pronoun], dpronoun),_) },
    s(S),
    { gender_of(G, Pronoun), nps_of(LastNPS, LastS), member(LNP, LastNPS), gender_of(G,LNP) },
    ss(Sentences, [S, LastS | Acc]).
ss([S|Sentences], [LastS | Acc]) -->
    { S = s(NP,_), nountype_of(NT,NP), dif(NT,dpronoun) },
    s(S),
    ss(Sentences, [S, LastS | Acc]).

% wrapper of ss with empty accumulator
ss(S) -->
    ss(S,[]).

mirroring a list in prolog

My wanted output is this:
?- mirror([1,2], [], X).
X = [1,2,2,1]
What I have so far:
mirror(L,R,X):- L is R , [R| revertList(L,X)] .
I can't think of how this works, please help me.
It is not very different from reversing a list, but what you wrote is not going to work. I googled "Prolog is" and after maybe 10 seconds I saw that is/2 is for arithmetic expressions. I also don't know why you think you can put a predicate call inside a list like that; it is not possible. If you just want to append, you can use append/3 to append the reversed list to the end of the original list and get the final "mirror" result:
mirror(X, Y) :- reverse(X, R), append(X, R, Y).
but this is too easy? So I wonder maybe there is more to this question? I don't know why you have three arguments when you only need two arguments? Maybe you thought that you can use an accumulator to reverse the list because to reverse a list you use accumulator like this?
list_rev(L, R) :- list_rev(L, [], R).

list_rev([], R, R).
list_rev([X|Xs], Ys, R) :-
    list_rev(Xs, [X|Ys], R).
But this is very easy to google, I just googled it and found it, so maybe you googled it too and you didn't like it? To get "mirrored" you just need to keep the original list too, like so:
list_mirrored(L, M) :- list_mirrored(L, [], M).

list_mirrored([], M, M).
list_mirrored([X|Xs], Ys, [X|Zs]) :-
    list_mirrored(Xs, [X|Ys], Zs).
I wasn't sure if this is correct and I googled "Prolog append" and this is how it is done.
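If the three-argument call from the question is really wanted, the same accumulator idea can be written directly in that shape. This is only a sketch; it assumes that the second argument is meant to be the accumulator, which starts out as the empty list, something the question does not state explicitly:

```prolog
% mirror(+List, +Acc, -Mirrored): Mirrored is List, followed by the
% reverse of List, followed by Acc. Call it with Acc = [].
mirror([], Acc, Acc).
mirror([X|Xs], Acc, [X|Zs]) :-
    mirror(Xs, [X|Acc], Zs).
```

With this definition, ?- mirror([1,2], [], X). gives X = [1, 2, 2, 1].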
To describe lists in Prolog, always also consider DCG notation.
For example, in this concrete case:
mirror([]) --> [].
mirror([M|Ms]) --> [M], mirror(Ms), [M].
Your test case:
?- phrase(mirror([1,2]), Ls).
Ls = [1, 2, 2, 1].
It also works in the other direction. For example:
?- phrase(mirror(Ls), [a,b,c,c,b,a]).
Ls = [a, b, c] ;
false.
The most general query yields:
?- phrase(mirror(Ls), Ms).
Ls = Ms, Ms = [] ;
Ls = [_5988],
Ms = [_5988, _5988] ;
Ls = [_5988, _6000],
Ms = [_5988, _6000, _6000, _5988] ;
Ls = [_5988, _6000, _6012],
Ms = [_5988, _6000, _6012, _6012, _6000, _5988] ;
etc.
See dcg for more information.
Note that with the definition above, we have:
?- phrase(mirror(Ls), [a,b,a]).
false.
I leave generalizing this definition (if necessary) as an easy exercise.
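One possible generalization, sketched here under the assumption that odd-length results such as [a,b,a] should be accepted too, adds a rule in which the last element of the list appears only once. The name mirror_odd//1 is made up for this sketch:

```prolog
mirror_odd([]) --> [].
mirror_odd([M]) --> [M].                          % middle element, emitted once
mirror_odd([M|Ms]) --> [M], mirror_odd(Ms), [M].
```

With it, ?- phrase(mirror_odd(Ls), [a,b,a]). has the single solution Ls = [a, b]. The price is that the relation is no longer one-to-one: ?- phrase(mirror_odd([1,2]), Ms). now has two answers, Ms = [1, 2, 1] and Ms = [1, 2, 2, 1].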
This would get you the desired "output":
mirror(_,_,[1,2,2,1]).
That probably won't work for most inputs, but since you haven't explained the relationship between input and output for anything but this one case, that's as good as I can do.

How to perform long concatenation of lists in Prolog?

I am interested in performing a long concatenation of lists, using Prolog language.
The objective is to define a predicate that gets an unknown number of lists, and concatenates them all into one list (that is given as the second argument to the predicate).
I know I should first understand how Prolog supports arguments with unbounded size, but I think the answer to that is using lists, for example:
[a | [[b,c,d] | [[e,f,g] | [h,i,j,k]]]].
If so, I thought about writing the predicate somewhat like this:
l_conc([ ],[ ]).
l_conc([[ ]|Tail],L):-
    l_conc(Tail,L).
l_conc([[Head|L1]|Tail],[Head|L2]):-
    l_conc([L1|Tail],L2).
However, it only concatenates empty lists to one another.
Please help me here (both regarding the arguments representation and the predicate itself) :-) Thanks!
Before I answer the actual question, I have a couple of comments.
First, the term you give as an example [a | [[b,c,d] | [[e,f,g] | [h,i,j,k]]]] can be written more compactly in Prolog. You can use the Prolog toplevel itself to see what this term actually is:
?- Ls = [a | [[b,c,d] | [[e,f,g] | [h,i,j,k]]]].
Ls = [a, [b, c, d], [e, f, g], h, i, j, k].
From this, you see that this is just a list. However, it is not a list of lists, because the atoms a, h, i etc. are not lists.
Therefore, your example does not match your question. There are ways to "flatten" lists, but I can only recommend against flattening because it is not a pure relation. The reason is that [X] is considered a "flat" list, but X = [a,b,c] makes [[a,b,c]] not flat, so you will run into logical inconsistencies if you use flatten/2.
What I recommend instead is append/2. The easiest way to implement it is to use a DCG (dcg). Consider for example:
together([]) --> [].
together([Ls|Lss]) -->
    list(Ls),
    together(Lss).

list([]) --> [].
list([L|Ls]) --> [L], list(Ls).
Example query:
?- phrase(together([[a],[b,c],[d]]), Ls).
Ls = [a, b, c, d].
In this example, precisely one level of nesting has been removed.
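A thin wrapper makes this usable as an ordinary predicate. The name lists_concatenated/2 is hypothetical; the DCG is the one shown above, repeated here only so that the sketch is self-contained:

```prolog
together([]) --> [].
together([Ls|Lss]) --> list(Ls), together(Lss).

list([]) --> [].
list([L|Ls]) --> [L], list(Ls).

% lists_concatenated(?Lss, ?Ls): Ls is the concatenation of the lists in Lss.
lists_concatenated(Lss, Ls) :-
    phrase(together(Lss), Ls).
```

For example, ?- lists_concatenated([[a],[b,c],[d]], Ls). gives Ls = [a, b, c, d]. Because the description is pure, it can also be used to split a list: ?- lists_concatenated([Xs,Ys], [a,b]). enumerates Xs = [], Ys = [a, b], then Xs = [a], Ys = [b], then Xs = [a, b], Ys = [].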

Prolog - copy a piece of list

I need to duplicate list in prolog.
I have list:
L = [a(string1,value1),a(string2,value2),a(string3,value3),a(string4,value4)].
Output will be: L = [string1, string2, string3, string4].
How can I do this?
I can copy whole list by code:
copy([],[]).
copy([H|L1],[H|L2]) :- copy(L1,L2).
I have tried something like:
copy2([],[]).
copy2([H|L1],[K|L2]) :- member(f(K,_),H), copy2(L1,L2).
But it does not work properly.
But I need only strings from my original list. Can anyone help?
Pattern matching is used to decompose arguments. You can do:
copy([],[]).
copy([a(H,_)|L1],[H|L2]) :- copy(L1,L2).
It is uncommon to use a structure a/2 for this purpose. More frequently, (-)/2 is used for this. Key-Value is called a (key-value) pair.
Also, the name itself is not very self-revealing. This is no copy at all. Instead, start with a name for the first argument, and then a name for the second. Let's try: list_list/2. The name is a bit too general, so maybe apairs_keys/2.
?- apairs_keys([a(string1,value1),a(string2,value2)], [string1, string2]).
Here are some definitions for that:
apairs_keys([], []).
apairs_keys([a(K,_)|As], [K|Ks]) :-
    apairs_keys(As, Ks).
Or, rather using maplist:
apair_key(a(K,_),K).
?- maplist(apair_key, As, Ks).
Or, using lambdas:
?- maplist(\a(K,_)^K^true, As, Ks).
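If the data is first converted to the more common Key-Value representation mentioned above, library predicates for pairs become available. The sketch below assumes SWI-Prolog, where pairs_keys/2 is provided by library(pairs); apair_pair/2 and apairs_keys_via_pairs/2 are hypothetical helper names:

```prolog
:- use_module(library(apply)).   % maplist/3
:- use_module(library(pairs)).   % pairs_keys/2 (SWI-Prolog assumed)

% apair_pair(?APair, ?Pair): relates a(K,V) to the pair K-V.
apair_pair(a(K, V), K-V).

% apairs_keys_via_pairs(+As, -Ks): Ks are the first arguments of the a/2 terms.
apairs_keys_via_pairs(As, Ks) :-
    maplist(apair_pair, As, Pairs),
    pairs_keys(Pairs, Ks).
```

For example, ?- apairs_keys_via_pairs([a(string1,value1),a(string2,value2)], Ks). gives Ks = [string1, string2].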
Declarative debugging techniques
Maybe you also want to understand how you can quite rapidly localize the error in your original program. For this purpose, start with the problematic program and query:
copy2([],[]).
copy2([H|L1],[K|L2]) :-
    member(f(K,_),H),
    copy2(L1,L2).
?- copy2([a(string1,value1),a(string2,value2),a(string3,value3),a(string4,value4)], [string1, string2, string3, string4]).
false.
Now, generalize the query. That is, replace terms by fresh new variables:
?- copy2([a(string1,value1),a(string2,value2),a(string3,value3),a(string4,value4)], [A, B, C, D]).
false.
?- copy2([a(string1,value1),a(string2,value2),a(string3,value3),a(string4,value4)], L).
false.
?- copy2([a(string1,value1),B,C,D], L).
false.
?- copy2([a(string1,value1)|J], L).
false.
?- copy2([a(S,V)|J], L).
false.
?- copy2([A|J], L).
A = [f(_A,_B)|_C], L = [_A|_D]
; ... .
So we hit bottom... It seems Prolog does not like a term a/2 as first argument.
Now, add
:- op(950,fx, *).
*_.
to your program. It is kind of a simplistic debugger. And generalize the program:
copy2([],[]).
copy2([H|L1],[K|L2]) :-
    member(f(K,_),H),
    * copy2(L1,L2).
member/2 only succeeds with H being of the form [_|_]. But we expect it to be a(_,_).

Prolog transform and separate term (atom) into a list

I have a term (more accurately an atom) like this:
name1(value1),name2(value2)
and I would like to have instead a "real" list like this:
[name1(value1), name2(value2)]
or separated terms like this:
name1(value1) and name2(value2)
any idea on how to do it?
What about:
ands((A,B)) --> !, ands(A), ands(B).
ands(X) --> [X].
Example:
?- phrase(ands((a,b,c,d)), Ls).
Ls = [a, b, c, d].
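If the input really is an atom rather than a term, it has to be parsed first. A minimal sketch, assuming SWI-Prolog's term_to_atom/2 is available (it is not part of the ISO standard); atom_ands/2 is a hypothetical name, and ands//1 is repeated from above so that the sketch is self-contained:

```prolog
ands((A,B)) --> !, ands(A), ands(B).
ands(X) --> [X].

% atom_ands(+Atom, -List): parse Atom as a term, then split it at commas.
atom_ands(Atom, List) :-
    term_to_atom(Term, Atom),
    phrase(ands(Term), List).
```

For example, ?- atom_ands('name1(value1),name2(value2)', Ls). gives Ls = [name1(value1), name2(value2)].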
In Prolog we use pattern matching to apply different processing to list elements:
change_list([], []).                          % end of recursion
change_list([(T1,T2)|Ri], [T1,T2|Ro]) :-      % recursive step doing required transformation
    !, change_list(Ri, Ro).
change_list([E|Ri], [E|Ro]) :-                % catch all rule: copy elem
    change_list(Ri, Ro).
Simple: for your list [H|T] with T=[] and H the compound term in question, which is its head element, use H=','(A,B), List=[A,B].
You can even just write H=(A,B), List=[A,B]. The parentheses are mandatory here.
In other words, the data term you're talking about is just an ordinary compound term with ',' as its functor. If you don't know the structure of these terms in advance, you can inspect it with =../2:
H =.. [Functor | Arguments].
(I see you got the same advice from @mat).