Finding the first Preceding Sibling with a particular element - xslt

Given the following XML:
<Root>
<NodeA>
<ChildNodeA/>
<ChildNodeB/>
</NodeA>
<NodeB>
<ChildNodeB/>
</NodeB>
<NodeC>
</NodeC>
</Root>
How do I find the first preceding sibling of a particular node that contains a particular element?
I.e., if I am at "NodeC", how do I find the first sibling with "ChildNodeA" (in this instance, "NodeA")?
Thanks in advance.

Finding the first preceding sibling that contains a given child element is quite straightforward, and indeed closely matches the way you describe it:
<xsl:apply-templates select="preceding-sibling::*[ChildNodeA][1]" />
Assuming you were positioned on NodeC, this would indeed select NodeA in your case:
<NodeA>
<ChildNodeA></ChildNodeA>
<ChildNodeB></ChildNodeB>
</NodeA>
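As a minimal sketch, the expression can be placed in a complete XSLT 1.0 stylesheet. The template layout below is an assumption for illustration (the question does not show its stylesheet): it moves to NodeC and copies the nearest preceding sibling that has a ChildNodeA child.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>

  <!-- Assumed entry point: go straight to NodeC in the sample document. -->
  <xsl:template match="/">
    <xsl:apply-templates select="Root/NodeC"/>
  </xsl:template>

  <!-- From NodeC, copy the closest preceding sibling that has a ChildNodeA child. -->
  <xsl:template match="NodeC">
    <xsl:copy-of select="preceding-sibling::*[ChildNodeA][1]"/>
  </xsl:template>
</xsl:stylesheet>
```

Because preceding-sibling is a reverse axis, the [1] picks the matching sibling closest to NodeC, not the first one in document order.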

Related

XSLT: Return the value of a specific child if it is present in one of the parents' in ancestor tree

I have an XML with the structure similar to below.
<root>
<randomElement>..</randomElement>
<EFFECT>..</EFFECT>
<parent2>
<randomElement>..</randomElement>
<EFFECT>..</EFFECT>
<parent>
<randomElement>..</randomElement>
<EFFECT>..</EFFECT>
<randomElement>..</randomElement>
<ITEM>..</ITEM>
</parent>
</parent2>
</root>
Note: there can be any number of <randomElement>s at the places
where it's specified.
So, right now, my pointer is at the <ITEM> tag. I need to return the value inside of the <EFFECT> tag, but, here's the catch.
If it's present, I must return the value of the <EFFECT> tag which is inside <parent> tag. If it's not present there, I must return value of <EFFECT> tag which is inside the <parent2> tag. Again, if it is not present there too, I need to finally return the value of the <EFFECT> tag which is inside <root>. The <EFFECT> inside the <root> will always be present and there can be any number of parents for the <ITEM> element.
Sorry if it's confusing.
To go to any <ITEM>, this is sufficient.
//ITEM
Now, <EFFECT> is a sibling to <ITEM>, i.e. it's on the same level. Another way of thinking about siblings is that they are children of the same parent.
In fact, all <EFFECT> elements in question are children of some ancestor of <ITEM>. This means we can move upwards along the ancestor:: axis and grab all those ancestor elements in one step:
//ITEM/ancestor::*
This will give us <parent>, <parent2> and <root>, in this order.
And from those we only need to take one step down to grab all <EFFECT> elements:
//ITEM/ancestor::*/EFFECT
This will give us three EFFECT elements, this time in document order again (only reverse axes such as ancestor:: work inside out).
We are interested in the last one of those, because this will be closest to the <ITEM> we started from. The last() function will help here:
(//ITEM/ancestor::*/EFFECT)[last()]
The parentheses around the path are necessary, because otherwise the condition ([last()]) is tested on every <EFFECT> individually, and each one of them is the last one of its kind for its parent, giving us three matches. The parentheses make it so that first a node-set of three <EFFECT> elements is constructed and then the last one is taken, giving us only one match.
When you are already at the <ITEM> element, the relative version of the path selects the last effect for that particular item:
(ancestor::*/EFFECT)[last()]
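A minimal sketch of the relative path in use, assuming an XSLT 1.0 stylesheet whose template layout is invented here for illustration:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Assumed entry point: visit every ITEM in the document. -->
  <xsl:template match="/">
    <xsl:apply-templates select="//ITEM"/>
  </xsl:template>

  <!-- For each ITEM, print the EFFECT that comes last in document order
       among the EFFECT children of all its ancestors. -->
  <xsl:template match="ITEM">
    <xsl:value-of select="(ancestor::*/EFFECT)[last()]"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

This relies on each EFFECT appearing before the nested parent element, as in the sample. If that ordering is not guaranteed, ancestor::*[EFFECT][1]/EFFECT selects the EFFECT of the nearest ancestor that has one, regardless of where it sits among its siblings.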

pre-processing script to switch product codes

I have a snippet of code I've inherited and I'm trying to get it to work on multiples of the match pattern and set a tag from looking up a value from a table using another tag. What happens is that, for every item, the same lookup is performed and not the relative one for the node. I can't work out the syntax to work thru all entries and substitute the correct one. It's got to be simple it's just that I am simpler :)
My source xml contains this (within an outer /oomsdoc document node not shown):
<item>
<lineqty> 1</lineqty>
<linesku>BNLP5008 </linesku>
<linecustprod>xxxxxxxxxxxxxxx</linecustprod>
<linedesc>London Pride (Bot500mlx8) </linedesc>
</item>
<item>
<lineqty> 1</lineqty>
<linesku>BNBL5008 </linesku>
<linecustprod>xxxxxxxxxxxxxxx</linecustprod>
<linedesc>Bengal Lancer (Bot500mlx8) </linedesc>
</item>
I want to substitute the xxxxxxxxxxxxxxx in each linecustprod tag with the material from the lookup table using the value of the linesku tag.
This is my lookup table:
<Materials>
<product sku='BNLP5008 ' material='LONDON PRIDE'/>
<product sku='BNBL5008 ' material='BENGAL LANCER'/>
</Materials>
and this is my xslt code.
<xsl:variable name="SkuList" select="document('d:\test\transforms\catalogue.xml')/Materials"/>
<xsl:template match="/oomsdoc/item/linecustprod">
<xsl:variable name="MySku" select="/oomsdoc/item/linesku"/>
<linecustprod>
<xsl:value-of select="$SkuList/product[@sku=$MySku]/@material"/>
</linecustprod>
</xsl:template>
I'm guessing some kind of xsl foreach would work but just can't find a usable example to crib :)
Your guidance again would be appreciated at this point in my frustration :)
Thanks,
Brian.
Changing the variable definition to
<xsl:variable name="MySku" select="../linesku"/>
should be sufficient. This pulls out the linesku that is a sibling of the linecustprod you're currently looking at. As currently defined, the variable contains a node-set of all the linesku elements in the document, so the value-of gives you the first entry from $SkuList that matches any entry in the main input file.
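Put together, a sketch of the corrected stylesheet could look like the following. The identity template is an assumption added so that the rest of the document is copied through unchanged; the lookup path is the one from the question.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Lookup table, loaded once (path as given in the question). -->
  <xsl:variable name="SkuList" select="document('d:\test\transforms\catalogue.xml')/Materials"/>

  <!-- Assumed identity template: copy everything not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Replace each linecustprod with the material looked up by its own item's linesku. -->
  <xsl:template match="/oomsdoc/item/linecustprod">
    <xsl:variable name="MySku" select="../linesku"/>
    <linecustprod>
      <xsl:value-of select="$SkuList/product[@sku=$MySku]/@material"/>
    </linecustprod>
  </xsl:template>
</xsl:stylesheet>
```

The comparison is an exact string match, so any trailing whitespace in linesku has to be present in the sku attribute as well, as it is in the sample data.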
In addition to Ian Roberts' answer, please change
<xsl:variable name="SkuList" select="document('d:\test\transforms\catalogue.xml')/Materials"/>
to
<xsl:variable name="SkuList" select="document('/d:\test\transforms\catalogue.xml')/Materials"/>
for some reason, the first throws an error (malformed URL).

XSLT Get First Element Node

<SMRCRLT_XML>
<AREA>
<DETAILS>
<DETAIL_REQUIREMENT>
<RULE_REQUIREMENT>
<DETAIL_REQUIREMENT>
<COURSE_ROWSET>
<COURSE_SET>
<COURSE_AREA>TESTSELECT</COURSE_AREA>
<COURSE_KEY_RULE>1200</COURSE_KEY_RULE>
<COURSE_SET>A</COURSE_SET>
<COURSE_SUBSET>1</COURSE_SUBSET>
<COURSE_SUBJ_CODE>CHEM</COURSE_SUBJ_CODE>
<COURSE_CRSE_NUMB_LOW>345A</COURSE_CRSE_NUMB_LOW>
</COURSE_SET>
</COURSE_ROWSET>
</DETAIL_REQUIREMENT>
<DETAIL_REQUIREMENT>
<COURSE_ROWSET>
<COURSE_SET>
<COURSE_KEY_RULE>1200</COURSE_KEY_RULE>
<COURSE_SET>A</COURSE_SET>
<COURSE_SUBSET>2</COURSE_SUBSET>
<COURSE_SUBJ_CODE>CHEM</COURSE_SUBJ_CODE>
<COURSE_CRSE_NUMB_LOW>476A</COURSE_CRSE_NUMB_LOW>
</COURSE_SET>
</COURSE_ROWSET>
</DETAIL_REQUIREMENT>
<DETAIL_REQUIREMENT>
<COURSE_ROWSET>
<COURSE_SET>
<COURSE_AREA>TESTSELECT</COURSE_AREA>
<COURSE_KEY_RULE>1200</COURSE_KEY_RULE>
<COURSE_SET>A</COURSE_SET>
<COURSE_SUBSET>3</COURSE_SUBSET>
<COURSE_SUBJ_CODE>PHIL</COURSE_SUBJ_CODE>
<COURSE_CRSE_NUMB_LOW>432</COURSE_CRSE_NUMB_LOW>
</COURSE_SET>
</COURSE_ROWSET>
</DETAIL_REQUIREMENT>
<DETAIL_REQUIREMENT>
<COURSE_ROWSET>
<COURSE_SET>
<COURSE_AREA>TESTSELECT</COURSE_AREA>
<COURSE_KEY_RULE>1200</COURSE_KEY_RULE>
<COURSE_SET>B</COURSE_SET>
<COURSE_SUBSET>4</COURSE_SUBSET>
<COURSE_SUBJ_CODE>PHIL</COURSE_SUBJ_CODE>
<COURSE_SUBJ_DESC>Philosophy</COURSE_SUBJ_DESC>
<COURSE_CRSE_NUMB_LOW>433</COURSE_CRSE_NUMB_LOW>
</COURSE_SET>
</COURSE_ROWSET>
</DETAIL_REQUIREMENT>
<DETAIL_REQUIREMENT>
<COURSE_ROWSET>
<COURSE_SET>
<COURSE_AREA>TESTSELECT</COURSE_AREA>
<COURSE_KEY_RULE>1200</COURSE_KEY_RULE>
<COURSE_SET>B</COURSE_SET>
<COURSE_SUBSET>5</COURSE_SUBSET>
<COURSE_SUBJ_CODE>ZOOL</COURSE_SUBJ_CODE>
<COURSE_CRSE_NUMB_LOW>321</COURSE_CRSE_NUMB_LOW>
</COURSE_SET>
</COURSE_ROWSET>
</DETAIL_REQUIREMENT>
<DETAIL_REQUIREMENT>
<COURSE_ROWSET>
<COURSE_SET>
<COURSE_AREA>TESTSELECT</COURSE_AREA>
<COURSE_KEY_RULE>1200</COURSE_KEY_RULE>
<COURSE_SET>B</COURSE_SET>
<COURSE_SUBSET>6</COURSE_SUBSET>
<COURSE_SUBJ_CODE>BIOC</COURSE_SUBJ_CODE>
<COURSE_CRSE_NUMB_LOW>456</COURSE_CRSE_NUMB_LOW>
</COURSE_SET>
</COURSE_ROWSET>
</DETAIL_REQUIREMENT>
</RULE_REQUIREMENT>
</DETAIL_REQUIREMENT>
</DETAILS>
</AREA>
</SMRCRLT_XML>
I am trying to get the first element from the XML for each COURSE_SET, but it returns all the values. Can someone please help. This is my template that I applied:
<xsl:apply-templates select="//SMRCRLT_XML/AREA/DETAILS/DETAIL_REQUIREMENT/RULE_REQUIREMENT/DETAIL_REQUIREMENT/COURSE_ROWSET/COURSE_SET[COURSE_AREA='TESTSELECT' and COURSE_KEY_RULE='1200'][1]"/>
The results I am getting are:
CHEM345A
PHIL432
PHIL433
ZOOL321
BIOC456
The result I am looking for is CHEM 345A and then PHIL433
You have several problems here.
First, the [1] in your XPath expression is filtering the XPath value by requiring that the COURSE_SET elements selected be the first child of their parent. Without that [1], your XPath expression reads:
//SMRCRLT_XML
/AREA
/DETAILS
/DETAIL_REQUIREMENT
/RULE_REQUIREMENT
/DETAIL_REQUIREMENT
/COURSE_ROWSET
/COURSE_SET
[COURSE_AREA='TESTSELECT' and COURSE_KEY_RULE='1200']
But every COURSE_SET that matches that path expression is the first child of its parent. (The only COURSE_SET elements which are not first children are children of COURSE_SET, not children of COURSE_ROWSET.)
The second problem is that it appears, judging by your question and your attempt at formulating the XPath expression you want, that you would like the courses to be grouped somehow (at first I thought you might want them grouped by department but now I expect you want them grouped by the value of the nested COURSE_SET element, which in your example has values A or B), so that by selecting the first COURSE_SET in some suitable context you can get the first course listed for each group. But the XML you show doesn't in fact group the courses by department or by course set; it provides a flat list of courses with no groupings at all. There are no elements here for which CHEM 345A and PHIL 433 are the first courses.
If your design calls for the courses to be grouped by department or course set, then your data source is not providing the data you want, and you will want to fix it.
If on the other hand you're stuck with this XML and want to use XPath to try to provide the structure that your data source is not capable of providing, then you don't want "the first element for each COURSE_SET", you want "each COURSE_SET which is in a department (or a COURSE_SET) different from the immediately preceding COURSE_SET". And your XPath expression can be something like
//COURSE_ROWSET/COURSE_SET
[not(COURSE_SET eq preceding::COURSE_SET[1])]
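Note that the eq operator belongs to XPath 2.0. Under an XSLT 1.0 processor, a sketch of the same idea with the general = comparison might look like this (the surrounding stylesheet is illustrative and not taken from the question):

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Keep each course record whose nested COURSE_SET value differs from
         the nearest preceding COURSE_SET element in the document. -->
    <xsl:for-each select="//COURSE_ROWSET/COURSE_SET[not(COURSE_SET = preceding::COURSE_SET[1])]">
      <xsl:value-of select="concat(COURSE_SUBJ_CODE, COURSE_CRSE_NUMB_LOW)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

This only detects a change from the immediately preceding record. If a set value reappears later in the list, its first record in the new run is selected again.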
Your third problem is that your XML seems to be too fond of using the same name for different constructs (one set of COURSE_SET elements each of which contains a description of a course, with department and course number and so on, and a second set of COURSE_SET elements which contain the strings 'A' and 'B', two sets of DETAIL_REQUIREMENT with different content, and so on). It's confusing for people not familiar with the data, and it will make every single discussion of detail an opportunity for miscommunication and error.
The efficient way to handle a task like this in XSLT 1.0 is to use Muenchian grouping, like this:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
<xsl:key name="kSet" match="COURSE_ROWSET/COURSE_SET" use="COURSE_SET" />
<xsl:template match="/">
<root>
<xsl:apply-templates
select="//COURSE_ROWSET/COURSE_SET[generate-id() =
generate-id(key('kSet', COURSE_SET)[1])]" />
</root>
</xsl:template>
<xsl:template match="COURSE_ROWSET/COURSE_SET">
<item>
<xsl:value-of select="concat(COURSE_SUBJ_CODE, COURSE_CRSE_NUMB_LOW)"/>
</item>
</xsl:template>
</xsl:stylesheet>
When this XSLT is applied to your sample input, the result is:
<root>
<item>CHEM345A</item>
<item>PHIL433</item>
</root>

Question on XSLT following-sibling

My XML structure looks like this
<COMPANY>
<COMPANY-DATA>ABC</COMPANY-DATA>
<ID>10800</ISSUE-ID>
<PROJECT-ID/>
</COMPANY-ISSUE-INFO>
</COMPANY>
"COMPANY Node repeats"
What I want to do is check for COMPANY-DATA='ABC' and get its ID.
I tried using
<xsl:value-of select="//COMPANY-DATA/.='ABC'/following-sibling::ID/."/>
But this doesn't seem to work and throws an error:
Expression must evaluate to node-set
//COMPANY-DATA/.= -->'ABC'<--
/following-sibling::ID/.
Thanks,
Karthik
Edit: I found the solution
<xsl:value-of select="//COMPANY-DATA[.='ABC']/following-sibling::ID/."/>
Thanks
The first thing I noticed is that the snippet you posted is not valid XML. The element <ID> is closed by the end tag </ISSUE-ID>, and there is a stray closing tag </COMPANY-ISSUE-INFO>.
But if I understand you correctly, you want to find the ID of the <COMPANY> element whose <COMPANY-DATA> is ABC. The expression in the edit to your question should do this. But you could also use
<xsl:value-of select="//COMPANY[COMPANY-DATA='ABC']/ID"/>
This removes the need to have COMPANY-DATA and ID in a specific sequence.
<xsl:value-of select="//COMPANY-DATA[.='ABC']/following-sibling::ID/."/>

XSLT Select all nodes containing a specific substring

I'm trying to write an XPath that will select certain nodes that contain a specific word.
In this case the word is, "Lockwood". The correct answer is 3. Both of these paths give me 3.
count(//*[contains(./*,'Lockwood')])
count(BusinessLetter/*[contains(../*,'Lockwood')])
But when I try to output the text of each specific node
//*[contains(./*,'Lockwood')][1]
//*[contains(./*,'Lockwood')][2]
//*[contains(./*,'Lockwood')][3]
Node 1 ends up containing all the text and nodes 2 and 3 are blank.
Can someone please tell me what's happening or what I'm doing wrong?
Thanks.
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="XPathFunctions.xsl"?>
<BusinessLetter>
<Head>
<SendDate>November 29, 2005</SendDate>
<Recipient>
<Name Title="Mr.">
<FirstName>Joshua</FirstName>
<LastName>Lockwood</LastName>
</Name>
<Company>Lockwood &amp; Lockwood</Company>
<Address>
<Street>291 Broadway Ave.</Street>
<City>New York</City>
<State>NY</State>
<Zip>10007</Zip>
<Country>United States</Country>
</Address>
</Recipient>
</Head>
<Body>
<List>
<Heading>Along with this letter, I have enclosed the following items:</Heading>
<ListItem>two original, execution copies of the Webucator Master Services Agreement</ListItem>
<ListItem>two original, execution copies of the Webucator Premier Support for Developers Services Description between Lockwood &amp; Lockwood and Webucator, Inc.</ListItem>
</List>
<Para>Please sign and return all four original, execution copies to me at your earliest convenience. Upon receipt of the executed copies, we will immediately return a fully executed, original copy of both agreements to you.</Para>
<Para>Please send all four original, execution copies to my attention as follows:
<Person>
<Name>
<FirstName>Bill</FirstName>
<LastName>Smith</LastName>
</Name>
<Address>
<Company>Webucator, Inc.</Company>
<Street>4933 Jamesville Rd.</Street>
<City>Jamesville</City>
<State>NY</State>
<Zip>13078</Zip>
<Country>USA</Country>
</Address>
</Person>
</Para>
<Para>If you have any questions, feel free to call me at <Phone>800-555-1000 x123</Phone> or e-mail me at <Email>bsmith@webucator.com</Email>.</Para>
</Body>
<Foot>
<Closing>
<Name>
<FirstName>Bill</FirstName>
<LastName>Smith</LastName>
</Name>
<JobTitle>VP of Operations</JobTitle>
</Closing>
</Foot>
</BusinessLetter>
But when I try to output the text of each specific node
//*[contains(./*,'Lockwood')][1]
//*[contains(./*,'Lockwood')][2]
//*[contains(./*,'Lockwood')][3]
Node 1 ends up containing all the text and nodes 2 and 3 are blank
This is a FAQ.
//SomeExpression[1]
is not equivalent to
(//SomeExpression)[1]
The former selects every SomeExpression node that is the first such node among the children of its parent.
The latter selects the first (in document order) of all //SomeExpression nodes in the whole document.
How does this apply to your problem?
//*[contains(./*,'Lockwood')][1]
This selects all elements whose first child element has a string value containing 'Lockwood' (contains() converts the node-set ./* to the string value of its first node) and that are the first such child of their parent. All three elements that satisfy the predicate are the first such child of their parents, so the result is that three elements are selected.
//*[contains(./*,'Lockwood')][2]
There is no element that has a child with string value containing the string 'Lockwood' and is the second such child of its parent. No nodes are selected.
//*[contains(./*,'Lockwood')][3]
There is no element that has a child with string value containing the string 'Lockwood' and is the third such child of its parent. No nodes are selected.
Solution:
Use:
(//*[contains(./*,'Lockwood')])[1]
(//*[contains(./*,'Lockwood')])[2]
(//*[contains(./*,'Lockwood')])[3]
Each of these selects exactly the Nth element (N = {1,2,3}) selected by //*[contains(./*,'Lockwood')], correspondingly: BusinessLetter, Recipient and Body.
Remember:
The [] operator has higher priority (precedence) than the // abbreviation.
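The difference can be checked with a small XSLT 1.0 stylesheet. This is a sketch for illustration only; the output labels are invented here, and the input is assumed to be the BusinessLetter document from the question in well-formed form.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- [1] bound to the child step: at most one hit per parent. -->
    <xsl:text>per-parent count: </xsl:text>
    <xsl:value-of select="count(//*[contains(./*,'Lockwood')][1])"/>
    <xsl:text>&#10;</xsl:text>

    <!-- [1] applied to the whole node-set: at most one hit in total. -->
    <xsl:text>whole-set count: </xsl:text>
    <xsl:value-of select="count((//*[contains(./*,'Lockwood')])[1])"/>
    <xsl:text>&#10;</xsl:text>

    <!-- Names of the first three matches, in document order. -->
    <xsl:for-each select="(//*[contains(./*,'Lockwood')])[position() &lt;= 3]">
      <xsl:value-of select="name()"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Following the explanation above, the first count should come out as 3 and the second as 1 for the sample letter, with the listed names being the three elements named in the solution.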