AIML pattern for singular and plural - aiml

I would like to know how to handle both singular and plural to match the same pattern.
Ex: “get Statement” and “get Statements”
<category>
<pattern>get Statement</pattern>
<template>Please get it from:</template>
</category>
<category>
<pattern>get Statements</pattern>
<template>Please get it from:</template>
</category>
I want to show the same result without writing two separate patterns and without using the SET tag.
Please advise me on this.

You can't do this in standard AIML v1 or v2 if you insist on one pattern only. The closest you could get is a pattern GET * which matches both cases, but that is too general to be very useful. If you were happy to use two patterns, you could do this:
<category>
<pattern>GET STATEMENT</pattern>
<template><srai>GET STATEMENTS</srai></template>
</category>
<category>
<pattern>GET STATEMENTS</pattern>
<template>Please get it from:</template>
</category>
Your custom logic is confined to one place (the plural GET STATEMENTS pattern), and the singular pattern calls the plural pattern.

It's very late now so probably you have found the answer, but in Program-Y you can write it as...
<pattern>get <regex pattern="STATEMENTS?" /></pattern>
<template>Please get it from:</template>
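
One thing to watch for with a regex like this: a character class such as [T|TS] matches exactly one character (T, | or S), so it does not mean "T or TS", whereas an optional S? does. Below is a minimal sketch using Python's re module to compare the two; the helper name matches is made up for illustration, and Program-Y's own matching may differ in detail (for example in case handling).

```python
import re

# Two candidate patterns: a character class versus an optional trailing S.
BROKEN = re.compile(r"GET STATEMEN[T|TS]")  # class matches ONE of: T, |, S
FIXED = re.compile(r"GET STATEMENTS?")      # T followed by an optional S


def matches(pattern, text):
    """Return True only if the whole (upper-cased) input matches the pattern."""
    return pattern.fullmatch(text.upper()) is not None


if __name__ == "__main__":
    for text in ("get statement", "get statements", "get statemen|"):
        print(text, matches(BROKEN, text), matches(FIXED, text))
```

The character-class version rejects the plural and accepts the nonsense input ending in "|", which is why the optional-S form is the safer choice.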

Related

Regex - Verify multiline content

I am trying to verify an XML structure, where I want to check that the ns22:statement true tag is found after the postcode DataItem.
<ns21:DataItem name="country" default="false" />
<ns21:DataItem name="postcode" default="false">
<ns22:statement disabled>true</ns22:statement>
</ns21:DataItem>
I have tried this
(?m)\b.*:DataItem name="postcode" (?s)\b.*>$\n.*\bstatement disabled>true\b
but when I change postcode to country (where it is supposed not to return anything) it catches all the tags: country, postcode and statement true.
I have also created this: https://regexr.com/3quso
Any suggestions on how to match only the postcode + statement true?
XPath really does look like the best tool for the job given you're trying to validate XML structure as well as content. So, ignoring namespaces, you could use the following XPath in a soapUI XPath Match assertion:
boolean(//*[local-name()='DataItem'][@name='postcode']/*[local-name()='statement' and .='true'])
Also, in <ns22:statement disabled>true</ns22:statement>, is disabled meant to be part of the element name or an attribute? As it stands, it makes the XML invalid, so I've ignored it.
For good reasons not to use regular expressions to parse XML/HTML, see Why it's not possible to use regex to parse HTML/XML: a formal explanation in layman's terms
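
The same structural check can be sketched outside soapUI. The example below uses Python's xml.etree.ElementTree and, like the XPath above, compares local names only. The namespace URIs and the wrapping root element are placeholders (the original snippet does not show them), the stray "disabled" token is dropped so that the XML parses, and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Placeholder namespaces and root element; "disabled" removed to keep the XML well-formed.
SAMPLE = """\
<root xmlns:ns21="urn:example:ns21" xmlns:ns22="urn:example:ns22">
  <ns21:DataItem name="country" default="false" />
  <ns21:DataItem name="postcode" default="false">
    <ns22:statement>true</ns22:statement>
  </ns21:DataItem>
</root>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def has_true_statement(xml_text, item_name):
    """True if a DataItem with this name has a child 'statement' whose text is 'true'."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if local_name(elem.tag) == "DataItem" and elem.get("name") == item_name:
            for child in elem:
                # Unlike the XPath comparison, surrounding whitespace is ignored here.
                if local_name(child.tag) == "statement" and (child.text or "").strip() == "true":
                    return True
    return False


if __name__ == "__main__":
    print(has_true_statement(SAMPLE, "postcode"))
    print(has_true_statement(SAMPLE, "country"))
```

Because the check is tied to the parent element rather than to line order, switching the name to country no longer picks up the statement that belongs to postcode.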

Match particular CDATA sections in XML data

I am trying to write a PowerShell regex. I want to match against the following page (further below); the two parts in bold are the information I want to capture and assign to variables, so I need two regexes. From the text below, the two values I need to find are exactly King and Years & Years. Please note that these two values change (which is why I need to capture them); the rest of the code stays the same.
This is the regex I have at the moment, but it's not working for me.
\s+artist\s*>\s*<\s*!\s*[CDATA\s*[(.*)\s*]\s*]\s*>\s*<\s*/artist
And here is the page (or data) I am trying to use regex with.
<on_air>
<publishedInfo publishedDate="2015-07-18 16:24:28" />
<stationName><![CDATA[Mix 106.5]]></stationName>
<stationPrefix><![CDATA[mix1065]]></stationPrefix>
<generic_coverart><![CDATA[http://media.arn.com.au/images/getImage.aspx?i=generic_mix1065.jpg]]></generic_coverart>
<now_playing>
<audio ID="id_1705168034_30458146" type="song">
<title generic="False"><![CDATA[King*]]></title>
<artist><![CDATA[Years & Years]]></artist>
<number><![CDATA[46029]]></number>
<cut><![CDATA[1]]></cut>
<ref><![CDATA[]]></ref>
<played_datetime><![CDATA[2015-07-18 16:24:27]]></played_datetime>
<length><![CDATA[00:03:28]]></length>
<coverart generic="true"><![CDATA[http://media.arn.com.au/images/getImage.aspx?i=generic_mix1065.jpg]]></coverart>
<options>
<option><![CDATA[KIIS S Integrated]]></option>
</options>
</audio>
</now_playing>
If it is valid XML, then you do not need to use regular expressions. PowerShell adapts XML objects, so you can use standard property syntax to navigate them:
$xml=[xml]@'
<on_air>
<publishedInfo publishedDate="2015-07-18 16:24:28" />
<stationName><![CDATA[Mix 106.5]]></stationName>
<stationPrefix><![CDATA[mix1065]]></stationPrefix>
<generic_coverart><![CDATA[http://media.arn.com.au/images/getImage.aspx?i=generic_mix1065.jpg]]></generic_coverart>
<now_playing>
<audio ID="id_1705168034_30458146" type="song">
<title generic="False"><![CDATA[King*]]></title>
<artist><![CDATA[Years & Years]]></artist>
<number><![CDATA[46029]]></number>
<cut><![CDATA[1]]></cut>
<ref><![CDATA[]]></ref>
<played_datetime><![CDATA[2015-07-18 16:24:27]]></played_datetime>
<length><![CDATA[00:03:28]]></length>
<coverart generic="true"><![CDATA[http://media.arn.com.au/images/getImage.aspx?i=generic_mix1065.jpg]]></coverart>
<options>
<option><![CDATA[KIIS S Integrated]]></option>
</options>
</audio>
</now_playing>
</on_air>
'@
$xml.on_air.now_playing.audio.title.'#cdata-section'
$xml.on_air.now_playing.audio.artist.'#cdata-section'
You want to escape bracket literals.
Also, it's a good practice to avoid using the dot "match almost any character" metacharacter when your intentions are more specific. In your case, what you really want to do is match until you hit the closing bracket, so it's safer to specify that:
'<\s*artist\s*>\s*<\s*!\s*\[CDATA\s*\[([^]]*)\s*\]\s*\]\s*>\s*<\s*\/artist'
Note: Regex is contextual, so the reason I don't have to escape the closing bracket within the character class is because of its position, i.e., being the first character specified in the negated class--in that context, it cannot be the closing bracket for the character class. In other words, it's not ambiguous.
To help you get off the ground, here is a suggestion for the Years & Years case (insert whitespace selectors wherever needed):
artist><!\[CDATA\[Years & Years\]\]></artist
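
For comparison, here is a minimal sketch of the escaped-bracket idea in Python's re module rather than PowerShell. The helper name cdata_of and the shortened sample are assumptions for illustration; the pattern is a variant of the ones above, not a copy, and it stops at the first closing bracket, so a payload that itself contains "]" would not be captured.

```python
import re

# Shortened copy of the sample data, kept to the two elements of interest.
SAMPLE = """<audio ID="id_1705168034_30458146" type="song">
<title generic="False"><![CDATA[King*]]></title>
<artist><![CDATA[Years & Years]]></artist>
</audio>"""


def cdata_of(tag, text):
    """Return the CDATA payload of the first <tag ...> element, or None."""
    pattern = (
        r"<\s*" + re.escape(tag) + r"\b[^>]*>"   # opening tag, attributes allowed
        r"\s*<!\[CDATA\["                        # literal '<![CDATA[' with escaped brackets
        r"([^\]]*)"                              # payload: everything up to the next ']'
        r"\]\]>\s*<\s*/\s*" + re.escape(tag)     # literal ']]>' followed by the closing tag
    )
    match = re.search(pattern, text)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(cdata_of("title", SAMPLE))
    print(cdata_of("artist", SAMPLE))
```

Passing the tag name as a parameter means one pattern covers both the title and the artist, instead of maintaining two nearly identical regexes.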

Ant, replaceregexp, match and replace specific element in the list of matches

In my Ant build file, I have a task that needs to replace a specific element of an XML file.
Here is the target XML that I am trying to modify:
<foo>
<sub>
<elem>name1</elem>
</sub>
<sub>
<elem>name2</elem>
</sub>
<sub>
<elem>name3</elem>
</sub>
</foo>
Ant build task:
<replaceregexp file="myfoo.xml"
match="&lt;elem&gt;(.*)&lt;/elem&gt;"
replace="&lt;elem&gt;${replace_only_second_match}&lt;/elem&gt;"
byline="true"
/>
The problem with the above task is that all the tags get replaced. However, I want only the second element to be modified, not the first or third match. (Such a thing is quite easy with normal regular expressions.)
I don't know how to do it with Ant's regular expressions. This is where I need help or suggestions on how best to solve this problem.
You should use xmltask for XML-related tasks. For your problem, use it like this:
Modify the file in place:
<xmltask source="whatever.xml" dest="whatever.xml">
<replace path="//sub[2]/elem/text()" withText="newname2"/>
</xmltask>
Create a new file:
<xmltask source="whatever.xml" dest="newfile.xml">
<replace path="//sub[2]/elem/text()" withText="newname2"/>
</xmltask>
The replace element also provides withXml / withFile / withBuffer.
See xmltask manual and tutorial for details.
Some XPath essentials here.
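
To illustrate what the positional path selects, here is a minimal sketch in Python using xml.etree.ElementTree instead of xmltask. The function name replace_second_elem is made up for illustration, and ElementTree only supports a subset of XPath, so the path is written relative to the root element.

```python
import xml.etree.ElementTree as ET

SAMPLE = (
    "<foo>"
    "<sub><elem>name1</elem></sub>"
    "<sub><elem>name2</elem></sub>"
    "<sub><elem>name3</elem></sub>"
    "</foo>"
)


def replace_second_elem(xml_text, new_text):
    """Replace the text of <elem> inside the second <sub> and return the new XML."""
    root = ET.fromstring(xml_text)
    target = root.find("./sub[2]/elem")  # positions are 1-based, as in XPath
    if target is None:
        raise ValueError("no second <sub>/<elem> found")
    target.text = new_text
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(replace_second_elem(SAMPLE, "newname2"))
```

Selecting by position in the tree avoids counting regex matches line by line, which is the part that replaceregexp cannot express.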

Regex for matching a complete element in a xml file

I would like to know a regex that matches this kind of sequence:
<person name="the name I want" ....[other things]>
.... [other tags]
</person>
I tried with something like this:
<person +name="the name I want" +.*
But I'm not getting any further: I can only match the first line, not the complete element.
Would you like to help me?
Try this:
<person[^>]*name="the name I want"[^>]*>(.|[\r\n])*?<\/person>
If your language supports the "dotall" flag, you can use that and change (.|[\r\n])*? to just .*?.
I found this in another Stack Overflow thread:
<person(.|\r\n)*?<\/person>
I hope it is useful
Edit:
I forgot to add the name attribute
(<person name="the name I want"(.|\r\n)*?<\/person>)
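
As a rough illustration of the dotall variant, here is a sketch in Python's re module. The sample document and the function name find_person are invented for the example. The approach assumes person elements are not nested and that attribute values contain no ">"; for anything less regular, an XML parser is the safer route.

```python
import re

# Invented sample: two person elements, only the second has the wanted name.
SAMPLE = """<people>
<person name="someone else" age="30">
<phone>111</phone>
</person>
<person name="the name I want" age="41">
<phone>222</phone>
</person>
</people>"""


def find_person(xml_text, name):
    """Return the full <person ...>...</person> text for the given name, or None."""
    pattern = re.compile(
        # [^>]* keeps the attribute search inside one opening tag;
        # .*? with DOTALL then spans lines up to the nearest closing tag.
        r'<person\b[^>]*\bname="' + re.escape(name) + r'"[^>]*>.*?</person>',
        re.DOTALL,
    )
    match = pattern.search(xml_text)
    return match.group(0) if match else None


if __name__ == "__main__":
    print(find_person(SAMPLE, "the name I want"))
```

The lazy quantifier matters here: a greedy .* would run on to the last closing person tag in the document.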

Regex to get a URL containing a keyword

Due to redbubble.com's lack of an API, I'm using an ATOM feed to steal information about a user's pictures.
This is what the XML looks like:
<entry>
<id>ID</id>
<published>Date Published</published>
<updated>Date Updated</updated>
<link type="text/html" rel="alternate" href="http://www.redbubble.com/link/to/post"/>
<title>Title</title>
<content type="html">
Blah blah blah stuff about the image..
<a href="http://www.redbubble.com/products/configure/config-id"><img src="http://ih1.redbubble.net/path-to-image" alt="" />
</content>
<author>
<name>Author Name</name>
<uri>http://www.redbubble.com/people/author-user-name</uri>
</author>
<link type="image/jpeg" rel="enclosure" href="http://ih0.redbubble.net/path-to-the-original-image"/>
<category term="1"/>
<category term="2"/>
</entry>
Basically using regex... how would I go about getting the href property inside the link in the content tag?
One thing we know for sure is that it will always have configure in the path, i.e. http://somesite.com/configure/id
So basically I just need to find the URL with configure in and grab the whole thing...
The following regex will extract the href content based on your requirements. It seems to work for the sample code.
href="(\w[^"]+/configure/\w[^"]+)
Whatever programming language you're using, don't try to parse the whole thing with a regex. Use an XML parser first to extract the href="...". Then, sure, use a regex to make sure the URL contains configure.
As @KARASZI commented, XPath is another good approach.
If you have to use regex try this one:
href="(?=[^"]*configure)([^"]*)
rubular.com
I am using a lookahead to find if it contains configure.
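
Here is a minimal sketch of the lookahead approach in Python's re module (the question's own language turns out to be Ruby, so treat this as an illustration only). The sample is cut down to the three lines that carry URLs, and the names CONFIGURE_HREF and configure_urls are made up for the example; the lookahead is tightened slightly to require /configure/ as a path segment.

```python
import re

# The three URL-bearing lines from the feed entry in the question.
SAMPLE = '''<link type="text/html" rel="alternate" href="http://www.redbubble.com/link/to/post"/>
<a href="http://www.redbubble.com/products/configure/config-id"><img src="http://ih1.redbubble.net/path-to-image" alt="" />
<link type="image/jpeg" rel="enclosure" href="http://ih0.redbubble.net/path-to-the-original-image"/>'''

# The lookahead checks that '/configure/' occurs before the closing quote,
# then the group captures the whole attribute value.
CONFIGURE_HREF = re.compile(r'href="(?=[^"]*/configure/)([^"]*)"')


def configure_urls(text):
    """Return every href value whose path contains a 'configure' segment."""
    return CONFIGURE_HREF.findall(text)


if __name__ == "__main__":
    print(configure_urls(SAMPLE))
```

The lookahead only inspects the text up to the closing quote, so the other href attributes in the entry are skipped rather than partially matched.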
Thanks for your awesome answers, but my colleague solved it for me!
This is what I ended up using:
/http:\/\/([^"\/]*\/)*configure\/([^"]*)/
(Ruby regex by the way)