XSLT Select all nodes containing a specific substring

I'm trying to write an XPath expression that will select certain nodes that contain a specific word.
In this case the word is "Lockwood". The correct answer is 3. Both of these paths give me 3.
count(//*[contains(./*,'Lockwood')])
count(BusinessLetter/*[contains(../*,'Lockwood')])
But when I try to output the text of each specific node
//*[contains(./*,'Lockwood')][1]
//*[contains(./*,'Lockwood')][2]
//*[contains(./*,'Lockwood')][3]
Node 1 ends up containing all the text and nodes 2 and 3 are blank.
Can someone please tell me what's happening or what I'm doing wrong?
Thanks.
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="XPathFunctions.xsl"?>
<BusinessLetter>
<Head>
<SendDate>November 29, 2005</SendDate>
<Recipient>
<Name Title="Mr.">
<FirstName>Joshua</FirstName>
<LastName>Lockwood</LastName>
</Name>
<Company>Lockwood &amp; Lockwood</Company>
<Address>
<Street>291 Broadway Ave.</Street>
<City>New York</City>
<State>NY</State>
<Zip>10007</Zip>
<Country>United States</Country>
</Address>
</Recipient>
</Head>
<Body>
<List>
<Heading>Along with this letter, I have enclosed the following items:</Heading>
<ListItem>two original, execution copies of the Webucator Master Services Agreement</ListItem>
<ListItem>two original, execution copies of the Webucator Premier Support for Developers Services Description between Lockwood &amp; Lockwood and Webucator, Inc.</ListItem>
</List>
<Para>Please sign and return all four original, execution copies to me at your earliest convenience. Upon receipt of the executed copies, we will immediately return a fully executed, original copy of both agreements to you.</Para>
<Para>Please send all four original, execution copies to my attention as follows:
<Person>
<Name>
<FirstName>Bill</FirstName>
<LastName>Smith</LastName>
</Name>
<Address>
<Company>Webucator, Inc.</Company>
<Street>4933 Jamesville Rd.</Street>
<City>Jamesville</City>
<State>NY</State>
<Zip>13078</Zip>
<Country>USA</Country>
</Address>
</Person>
</Para>
<Para>If you have any questions, feel free to call me at <Phone>800-555-1000 x123</Phone> or e-mail me at <Email>bsmith@webucator.com</Email>.</Para>
</Body>
<Foot>
<Closing>
<Name>
<FirstName>Bill</FirstName>
<LastName>Smith</LastName>
</Name>
<JobTitle>VP of Operations</JobTitle>
</Closing>
</Foot>
</BusinessLetter>

But when I try to output the text of
each specific node
//*[contains(./*,'Lockwood')][1]
//*[contains(./*,'Lockwood')][2]
//*[contains(./*,'Lockwood')][3]
Node 1 ends up containing all the text
and nodes 2 and 3 are blank
This is a FAQ.
//SomeExpression[1]
is not equivalent to
(//SomeExpression)[1]
The former selects every //SomeExpression node that is the first such child of its parent.
The latter selects the first (in document order) of all //SomeExpression nodes in the whole document.
How does this apply to your problem?
//*[contains(./*,'Lockwood')][1]
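In XPath 1.0, contains(./*,'Lockwood') converts the node-set ./* to a string by taking the string value of its first node, so the predicate tests only the first child element. The expression therefore selects all elements whose first child element has a string value containing 'Lockwood' and that are the first such child of their parent. Exactly three elements satisfy the predicate, and each of them is the first such child of its parent, so all three are selected. Outputting the value of this node-set gives the string value of its first node, the top element, which is the text of the whole document.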
//*[contains(./*,'Lockwood')][2]
No element whose first child element has a string value containing 'Lockwood' is the second such child of its parent. No nodes are selected.
//*[contains(./*,'Lockwood')][3]
No element whose first child element has a string value containing 'Lockwood' is the third such child of its parent. No nodes are selected.
Solution:
Use:
(//*[contains(./*,'Lockwood')])[1]
(//*[contains(./*,'Lockwood')])[2]
(//*[contains(./*,'Lockwood')])[3]
Each of these selects exactly the Nth element (N = 1, 2, 3) selected by //*[contains(./*,'Lockwood')]: BusinessLetter, Recipient and Body, respectively.
Remember:
The [] operator has higher priority (precedence) than the // abbreviation.
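As a minimal sketch of the fix, assuming an XSLT 1.0 processor and the BusinessLetter document from the question (with its ampersands escaped as &amp; so that it parses), the following stylesheet prints the name of the element selected by each parenthesized expression. The text output method and the use of name() are illustration choices, not part of the original answer.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- The parentheses build the whole node-set first; [N] then indexes into it -->
    <xsl:value-of select="name((//*[contains(./*,'Lockwood')])[1])"/>
    <xsl:text>&#10;</xsl:text>
    <xsl:value-of select="name((//*[contains(./*,'Lockwood')])[2])"/>
    <xsl:text>&#10;</xsl:text>
    <xsl:value-of select="name((//*[contains(./*,'Lockwood')])[3])"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

With the sample document this should print BusinessLetter, Recipient and Body on separate lines. Dropping the parentheses makes the second and third lines empty, which is the behaviour described in the question.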

Related

XSLT: Return the value of a specific child if it is present in one of the parents' in ancestor tree

I have an XML with the structure similar to below.
<root>
<randomElement>..</randomElement>
<EFFECT>..</EFFECT>
<parent2>
<randomElement>..</randomElement>
<EFFECT>..</EFFECT>
<parent>
<randomElement>..</randomElement>
<EFFECT>..</EFFECT>
<randomElement>..</randomElement>
<ITEM>..</ITEM>
</parent>
</parent2>
</root>
Note: there can be any number of <randomElement>s at the places
where it's specified.
So, right now, my pointer is at the <ITEM> tag. I need to return the value inside of the <EFFECT> tag, but, here's the catch.
If it's present, I must return the value of the <EFFECT> tag which is inside <parent> tag. If it's not present there, I must return value of <EFFECT> tag which is inside the <parent2> tag. Again, if it is not present there too, I need to finally return the value of the <EFFECT> tag which is inside <root>. The <EFFECT> inside the <root> will always be present and there can be any number of parents for the <ITEM> element.
Sorry if it's confusing.
To go to any <ITEM>, this is sufficient.
//ITEM
Now, <EFFECT> is a sibling to <ITEM>, i.e. it's on the same level. Another way of thinking about siblings is that they are children of the same parent.
In fact, all <EFFECT> elements in question are children of some ancestor of <ITEM>. This means we can move upwards along the ancestor:: axis and grab all those ancestor elements in one step:
//ITEM/ancestor::*
This will give us <parent>, <parent2> and <root>, in this order.
And from those we only need to take one step down to grab all <EFFECT> elements:
//ITEM/ancestor::*/EFFECT
This will give us three EFFECT elements, this time in document order again (only reverse axes such as ancestor:: work from the inside out).
We are interested in the last one of those, because this will be closest to the <ITEM> we started from. The last() function will help here:
(//ITEM/ancestor::*/EFFECT)[last()]
The parentheses around the path are necessary, because otherwise the condition ([last()]) is tested on every <EFFECT> individually, and each one of them is the last one of its kind for its parent, giving us three matches. The parentheses make it so that first a node-set of three <EFFECT> elements is constructed and then the last one is taken, giving us only one match.
When the context node is already the <ITEM> element, the relative version of the path selects the last effect for that particular item:
(ancestor::*/EFFECT)[last()]
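A minimal sketch of how the relative path could be used from a template, assuming an XSLT 1.0 processor and assuming, as in the sample, that each <EFFECT> appears before the nested ancestor inside its parent (otherwise the last one in document order is not necessarily the closest one). The text output method and the newline after each value are illustration choices.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Visit every ITEM; inside the ITEM template, ITEM is the context node -->
  <xsl:template match="/">
    <xsl:apply-templates select="//ITEM"/>
  </xsl:template>

  <xsl:template match="ITEM">
    <!-- Collect the EFFECT children of all ancestors, then keep the last one
         in document order, which is the one nearest to this ITEM -->
    <xsl:value-of select="(ancestor::*/EFFECT)[last()]"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

If <parent> has no <EFFECT>, the same expression falls back to the one in <parent2>, and then to the one in <root>, without any explicit conditional.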

return any node with child nodes name matching multiple

I want to retrieve any Node with the child nodes childTypeA, childTypeB or childTypeC, but not return nodes with only other child nodes (like stepChildA).
Once I have that node, I can retrieve any of the child nodes and their attributes. But I can't figure how to filter out those nodes that do not have any child nodes matching childTypeA, childTypeB or childTypeC.
My efforts either return all nodes with children, or return a node for each matching child, which means the same node is returned one, two or three times, depending upon if there is one, two or all three of the desired child nodes.
With xml data as shown
<parent Name="Item one">
<OtherData Name="Data one"/>
<childTypeA>
<someData Name="Child A"/>
</childTypeA>
<childTypeB>
<someData Name="Child B"/>
</childTypeB>
</parent>
<parent Name="Item two">
<OtherData Name="Data two"/>
<childTypeB>
<someData Name="Child B"/>
</childTypeB>
<childTypeC>
<someData Name="Child C"/>
</childTypeC>
</parent>
<parent Name="Item three">
<OtherData Name="Data three"/>
<stepChildA>
<someData Name="Step Child A"/>
</stepChildA>
</parent>
The actual data under each child type is different and I'm trying to assemble it into a table, where each parent node with the desired child type appear on a single row and the child data align under the appropriate columns. Currently I've either had all parent nodes where the data desired appears as intended, but also includes rows with the other parent nodes which have no data, or, I am getting multiple rows when there is more than one of the desired child type. The specific child type data fall into the proper columns, but are not on one row.
My approach was correct, but I needed to re-order my code. I had the "if test" before the "for-each". By swapping them, I was able to return all parent nodes, but then use the "xsl:if test=..." to ignore the unwanted parents and build each table row as I parse through the child nodes. Note I added a sort to the returned parent nodes, based upon the value of their attribute @Name.
<xsl:for-each select="parent">
<xsl:sort select="@Name"/>
<xsl:if test="childTypeA or childTypeB or childTypeC">
<tr>.........</tr>
</xsl:if>
</xsl:for-each>
My XML data is very verbose and I am trying not to overwhelm the post. I am also struggling with the formatting rules.
As far as I can tell, you simply need to use the XPath expression parent[childTypeA | childTypeB | childTypeC], like this:
<xsl:for-each select="parent[childTypeA | childTypeB | childTypeC]">
<xsl:sort..etc.
</xsl:for-each>
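A fuller sketch of the predicate approach, assuming an XSLT 1.0 processor and assuming the <parent> elements sit directly under a single root element (the sample does not show one). The table layout and the choice of someData/@Name as cell content are hypothetical, chosen only to show one row per parent.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="yes"/>

  <!-- Assumption: the parent elements are children of the document element -->
  <xsl:template match="/*">
    <table>
      <!-- The predicate keeps a parent only if it has at least one wanted child;
           each parent is visited once, however many of those children it has -->
      <xsl:for-each select="parent[childTypeA | childTypeB | childTypeC]">
        <xsl:sort select="@Name"/>
        <tr>
          <td><xsl:value-of select="@Name"/></td>
          <!-- A missing child type simply yields an empty cell -->
          <td><xsl:value-of select="childTypeA/someData/@Name"/></td>
          <td><xsl:value-of select="childTypeB/someData/@Name"/></td>
          <td><xsl:value-of select="childTypeC/someData/@Name"/></td>
        </tr>
      </xsl:for-each>
    </table>
  </xsl:template>
</xsl:stylesheet>
```

Because the filtering happens in the select expression, no xsl:if is needed inside the loop, and "Item three" never produces a row.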

remove parent node tag but keep children as is,xslt

My input XML is shown below. I want to delete the nodes <multimap:Message1> and
<multimap:Messages xmlns:multimap="http://sap.com/xi/XI/SplitAndMerge">
but keep their children as they are.
Because of the special character ":" between multimap and Message, I am not able to delete these nodes.
<?xml version="1.0" encoding="UTF-8"?>
<multimap:Messages xmlns:multimap="http://sap.com/xi/XI/SplitAndMerge">`This one need to be removed`
<multimap:Message1> `This one need to be removed`
<EmployeeTime>
<EmployeeTime>
<externalCode>e82baef39</externalCode>
<timeType>UK_MATERNITY</timeType>
<userId>101046</userId>
<Holiday>
<date>2016-03-25</date>
<date>2015-04-06</date>
<date>2015-05-25</date>
</Holiday>
</EmployeeTime>
</EmployeeTime>
</multimap:Message1>`This one need to be removed`
</multimap:Messages>`This one need to be removed`
Assuming you use an XSLT 2 or 3 processor you can simply use
<xsl:template match="/">
<xsl:copy-of select="*/*/*" copy-namespaces="no"/>
</xsl:template>
http://xsltransform.net/naZXpYb
With XSLT 1.0 you will need to run the EmployeeTime elements and their descendants through a transformation that strips the multimap namespace, which is in scope on them because it is declared on the root element.
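One way such an XSLT 1.0 transformation might look, as a sketch rather than the answerer's own code: instead of copying elements (which would carry the in-scope namespace node along), each element is recreated by its local name. It assumes the wanted content starts at the grandchildren of the root, as in the sample.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Start at the grandchildren of the root, skipping both multimap wrappers -->
  <xsl:template match="/">
    <xsl:apply-templates select="*/*/*"/>
  </xsl:template>

  <!-- Recreate each element by name so that namespace nodes are not copied -->
  <xsl:template match="*">
    <xsl:element name="{local-name()}">
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

Text nodes are copied by the built-in template rules. Note that if the second wrapper held more than one child element, the output would have several top-level elements and would no longer be a well-formed document on its own.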

XSLT Get First Element Node

<SMRCRLT_XML>
<AREA>
<DETAILS>
<DETAIL_REQUIREMENT>
<RULE_REQUIREMENT>
<DETAIL_REQUIREMENT>
<COURSE_ROWSET>
<COURSE_SET>
<COURSE_AREA>TESTSELECT</COURSE_AREA>
<COURSE_KEY_RULE>1200</COURSE_KEY_RULE>
<COURSE_SET>A</COURSE_SET>
<COURSE_SUBSET>1</COURSE_SUBSET>
<COURSE_SUBJ_CODE>CHEM</COURSE_SUBJ_CODE>
<COURSE_CRSE_NUMB_LOW>345A</COURSE_CRSE_NUMB_LOW>
</COURSE_SET>
</COURSE_ROWSET>
</DETAIL_REQUIREMENT>
<DETAIL_REQUIREMENT>
<COURSE_ROWSET>
<COURSE_SET>
<COURSE_KEY_RULE>1200</COURSE_KEY_RULE>
<COURSE_SET>A</COURSE_SET>
<COURSE_SUBSET>2</COURSE_SUBSET>
<COURSE_SUBJ_CODE>CHEM</COURSE_SUBJ_CODE>
<COURSE_CRSE_NUMB_LOW>476A</COURSE_CRSE_NUMB_LOW>
</COURSE_SET>
</COURSE_ROWSET>
</DETAIL_REQUIREMENT>
<DETAIL_REQUIREMENT>
<COURSE_ROWSET>
<COURSE_SET>
<COURSE_AREA>TESTSELECT</COURSE_AREA>
<COURSE_KEY_RULE>1200</COURSE_KEY_RULE>
<COURSE_SET>A</COURSE_SET>
<COURSE_SUBSET>3</COURSE_SUBSET>
<COURSE_SUBJ_CODE>PHIL</COURSE_SUBJ_CODE>
<COURSE_CRSE_NUMB_LOW>432</COURSE_CRSE_NUMB_LOW>
</COURSE_SET>
</COURSE_ROWSET>
</DETAIL_REQUIREMENT>
<DETAIL_REQUIREMENT>
<COURSE_ROWSET>
<COURSE_SET>
<COURSE_AREA>TESTSELECT</COURSE_AREA>
<COURSE_KEY_RULE>1200</COURSE_KEY_RULE>
<COURSE_SET>B</COURSE_SET>
<COURSE_SUBSET>4</COURSE_SUBSET>
<COURSE_SUBJ_CODE>PHIL</COURSE_SUBJ_CODE>
<COURSE_SUBJ_DESC>Philosophy</COURSE_SUBJ_DESC>
<COURSE_CRSE_NUMB_LOW>433</COURSE_CRSE_NUMB_LOW>
</COURSE_SET>
</COURSE_ROWSET>
</DETAIL_REQUIREMENT>
<DETAIL_REQUIREMENT>
<COURSE_ROWSET>
<COURSE_SET>
<COURSE_AREA>TESTSELECT</COURSE_AREA>
<COURSE_KEY_RULE>1200</COURSE_KEY_RULE>
<COURSE_SET>B</COURSE_SET>
<COURSE_SUBSET>5</COURSE_SUBSET>
<COURSE_SUBJ_CODE>ZOOL</COURSE_SUBJ_CODE>
<COURSE_CRSE_NUMB_LOW>321</COURSE_CRSE_NUMB_LOW>
</COURSE_SET>
</COURSE_ROWSET>
</DETAIL_REQUIREMENT>
<DETAIL_REQUIREMENT>
<COURSE_ROWSET>
<COURSE_SET>
<COURSE_AREA>TESTSELECT</COURSE_AREA>
<COURSE_KEY_RULE>1200</COURSE_KEY_RULE>
<COURSE_SET>B</COURSE_SET>
<COURSE_SUBSET>6</COURSE_SUBSET>
<COURSE_SUBJ_CODE>BIOC</COURSE_SUBJ_CODE>
<COURSE_CRSE_NUMB_LOW>456</COURSE_CRSE_NUMB_LOW>
</COURSE_SET>
</COURSE_ROWSET>
</DETAIL_REQUIREMENT>
</RULE_REQUIREMENT>
</DETAIL_REQUIREMENT>
</DETAILS>
</AREA>
</SMRCRLT_XML>
I am trying to get the first element from the XML for each COURSE_SET, but it returns all the values. Can someone please help. This is my template that I applied:
<xsl:apply-templates select="//SMRCRLT_XML/AREA/DETAILS/DETAIL_REQUIREMENT/RULE_REQUIREMENT/DETAIL_REQUIREMENT/COURSE_ROWSET/COURSE_SET[COURSE_AREA='TESTSELECT' and COURSE_KEY_RULE='1200'][1]"/>
The results I am getting are:
CHEM345A
PHIL432
PHIL433
ZOOL321
BIOC456
The result I am looking for is CHEM 345A and then PHIL433
You have several problems here.
First, the [1] in your XPath expression is filtering the XPath value by requiring that the COURSE_SET elements selected be the first child of their parent. Without that [1], your XPath expression reads:
//SMRCRLT_XML
/AREA
/DETAILS
/DETAIL_REQUIREMENT
/RULE_REQUIREMENT
/DETAIL_REQUIREMENT
/COURSE_ROWSET
/COURSE_SET
[COURSE_AREA='TESTSELECT' and COURSE_KEY_RULE='1200']
But every COURSE_SET that matches that path expression is the first child of its parent. (The only COURSE_SET elements which are not first children are children of COURSE_SET, not children of COURSE_ROWSET.)
The second problem is that it appears, judging by your question and your attempt at formulating the XPath expression you want, that you would like the courses to be grouped somehow (at first I thought you might want them grouped by department but now I expect you want them grouped by the value of the nested COURSE_SET element, which in your example has values A or B), so that by selecting the first COURSE_SET in some suitable context you can get the first course listed for each group. But the XML you show doesn't in fact group the courses by department or by course set; it provides a flat list of courses with no groupings at all. There are no elements here for which CHEM 345A and PHIL 433 are the first courses.
If your design calls for the courses to be grouped by department or course set, then your data source is not providing the data you want, and you will want to fix it.
If on the other hand you're stuck with this XML and want to use XPath to try to provide the structure that your data source is not capable of providing, then you don't want "the first element for each COURSE_SET", you want "each COURSE_SET which is in a department (or a COURSE_SET) different from the immediately preceding COURSE_SET". And your XPath expression can be something like the following (eq is an XPath 2.0 operator; in XPath 1.0, use = instead):
//COURSE_ROWSET/COURSE_SET
[not(COURSE_SET eq preceding::COURSE_SET[1])]
Your third problem is that your XML seems to be too fond of using the same name for different constructs: one set of COURSE_SET elements each of which contains a description of a course, with department and course number and so on, and a second set of COURSE_SET elements which contain the strings 'A' and 'B'; two sets of DETAIL_REQUIREMENT with different content; and so on. It's confusing for people not familiar with the data, and it will make every single discussion of detail an opportunity for miscommunication and error.
The efficient way to handle a task like this in XSLT 1.0 is to use Muenchian grouping, like this:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
<xsl:key name="kSet" match="COURSE_ROWSET/COURSE_SET" use="COURSE_SET" />
<xsl:template match="/">
<root>
<xsl:apply-templates
select="//COURSE_ROWSET/COURSE_SET[generate-id() =
generate-id(key('kSet', COURSE_SET)[1])]" />
</root>
</xsl:template>
<xsl:template match="COURSE_ROWSET/COURSE_SET">
<item>
<xsl:value-of select="concat(COURSE_SUBJ_CODE, COURSE_CRSE_NUMB_LOW)"/>
</item>
</xsl:template>
</xsl:stylesheet>
When this XSLT is applied to your sample input, the result is:
<root>
<item>CHEM345A</item>
<item>PHIL433</item>
</root>
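If an XSLT 2.0 (or later) processor is available, the same grouping can be sketched with xsl:for-each-group instead of a key. This is an alternative to the Muenchian approach above, not part of the original answer; like that stylesheet, it groups on the nested COURSE_SET value and does not filter on COURSE_AREA.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>

  <xsl:template match="/">
    <root>
      <!-- One group per distinct value of the nested COURSE_SET ('A', 'B') -->
      <xsl:for-each-group select="//COURSE_ROWSET/COURSE_SET" group-by="COURSE_SET">
        <!-- Inside the loop, the context item is the first member of the group -->
        <item>
          <xsl:value-of select="concat(COURSE_SUBJ_CODE, COURSE_CRSE_NUMB_LOW)"/>
        </item>
      </xsl:for-each-group>
    </root>
  </xsl:template>
</xsl:stylesheet>
```

On the sample input this should produce the same two items as the XSLT 1.0 version.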

Finding the first Preceding Sibling with a particular element

Given the following XML:
<Root>
<NodeA>
<ChildNodeA/>
<ChildNodeB/>
</NodeA>
<NodeB>
<ChildNodeB/>
</NodeB>
<NodeC>
</NodeC>
</Root>
How do I find the first preceding sibling of a particular node that contains a particular element?
For example, if I am at "NodeC", how do I find the first sibling with "ChildNodeA" (in this instance "NodeA")?
Thanks in advance.
Finding the first preceding sibling that contains a given child element is quite straightforward, and indeed closely matches the way you describe it:
<xsl:apply-templates select="preceding-sibling::*[ChildNodeA][1]" />
Assuming you were positioned on NodeC, this would indeed return NodeA in your case:
<NodeA>
<ChildNodeA></ChildNodeA>
<ChildNodeB></ChildNodeB>
</NodeA>
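A minimal sketch of the expression in context, assuming an XSLT 1.0 processor and the sample XML from the question. The text output and the use of name() are illustration choices.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Go straight to NodeC so that it becomes the context node -->
  <xsl:template match="/">
    <xsl:apply-templates select="Root/NodeC"/>
  </xsl:template>

  <xsl:template match="NodeC">
    <!-- preceding-sibling is a reverse axis, so [1] means the nearest sibling
         (counting backwards) that has a ChildNodeA child -->
    <xsl:value-of select="name(preceding-sibling::*[ChildNodeA][1])"/>
  </xsl:template>
</xsl:stylesheet>
```

With the sample input this should print NodeA: NodeB is skipped because it has no ChildNodeA child.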