XSLT Filter comparing two values in same sub node - xslt

I need to use XSLT (version 1, unfortunately) and am trying to filter out certain nodes by comparing two properties in the sub node.
Here is the XML:
<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
<SOAP-ENV:Header xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
</SOAP-ENV:Header>
<SOAP-ENV:Body xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
<QueriesResponse xmlns="http://schemas.Movies.com/Movies">
<Films xmlns="http://schemas.Movies.com/Movies">
<Film>
<FilmPostings>
<FilmPosting>
<FilmPostingDates>
<FilmPostDate>2017-01-04T19:44:25.9530000-05:00</FilmPostDate>
<FilmActiveDate>2017-01-04T19:44:25.9530000-05:00</FilmActiveDate>
</FilmPostingDates>
</FilmPosting>
</FilmPostings>
</Film>
<Film>
<FilmPostings>
<FilmPosting>
<FilmPostingDates>
<FilmPostDate>2017-01-04T19:50:06.3830000-05:00</FilmPostDate>
<FilmActiveDate>2017-01-04T19:50:06.3100000-05:00</FilmActiveDate>
</FilmPostingDates>
</FilmPosting>
</FilmPostings>
</Film>
<Film>
<FilmPostings>
<FilmPosting>
<FilmPostingDates>
<FilmPostDate>2016-12-05T18:03:14.9830000-05:00</FilmPostDate>
<FilmActiveDate>2017-01-02T00:16:52.7570000-05:00</FilmActiveDate>
</FilmPostingDates>
</FilmPosting>
</FilmPostings>
</Film>
</Films>
</QueriesResponse>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>
And here is my transform:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="http://schemas.Movies.com/Movies"
xmlns:m="http://schemas.Movies.com/Movies"
exclude-result-prefixes="m"
>
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" />
<!-- standard identity template -->
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<!-- This is meant to compare PostDate and ActiveDate, matching if different. It doesn't match any nodes in the XML (but should match the final one). -->
<xsl:template match="m:Film[m:FilmPostings/m:FilmPosting/m:FilmPostingDates/m:FilmPostDate[1] [.!= m:FilmPostings/m:FilmPosting/m:FilmPostingDates/m:FilmActiveDate[1]]]">
</xsl:template>
<!-- This code does match the hard coded value (the second node). -->
<!--<xsl:template match="m:Film[m:FilmPostings/m:FilmPosting/m:FilmPostingDates/m:FilmPostDate[1] [.!= '2017-01-04T19:50:06.3830000-05:00']]">
</xsl:template>-->
</xsl:stylesheet>
So, you'll see in the commented-out bit that I can do a match with hard-coded values, so I'm obviously in the right area - I know that the code will exclude a whole Film node if it finds the match. But it's comparing the two node values that doesn't work.
I've tried all kinds of variations on the right side of the comparison, but it doesn't seem able to pick up the value of the ActiveDate.

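Your predicate fails because inside the [...] on m:FilmPostDate the context node is that FilmPostDate element, so the relative path m:FilmPostings/... is evaluated from there and selects nothing; a != comparison against an empty node-set is always false. To remove Film elements where FilmPostDate and FilmActiveDate are equal, you can nest the condition in the match attribute: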
<xsl:template
match="m:Film[m:FilmPostings/m:FilmPosting/m:FilmPostingDates[m:FilmPostDate = m:FilmActiveDate]]" />
This assumes only one set of FilmPostDate and FilmActiveDate elements per Film. If there were more than one set, you could remove Film elements where all occurrences are the same (or rather, where there are no occurrences that differ):
<xsl:template
match="m:Film[not(m:FilmPostings/m:FilmPosting/m:FilmPostingDates[m:FilmPostDate != m:FilmActiveDate])]" />
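For the opposite case described in the question (dropping a Film when the two dates differ), a minimal sketch of a complete stylesheet might look like the following. It assumes the dates can be compared as plain strings, which is all XSLT 1.0 offers here, so two values that differ only in their fractional seconds count as different.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:m="http://schemas.Movies.com/Movies"
    exclude-result-prefixes="m">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <!-- identity template: copy everything that no other template handles -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- drop any Film that has a FilmPostingDates whose two dates differ;
       the inner predicate is evaluated with FilmPostingDates as context,
       so both dates are reachable as direct children -->
  <xsl:template match="m:Film[m:FilmPostings/m:FilmPosting/m:FilmPostingDates[m:FilmPostDate != m:FilmActiveDate]]"/>
</xsl:stylesheet>
```

The key point is that both sides of the comparison are written relative to the same context node (FilmPostingDates), rather than restarting the path from Film inside a predicate on FilmPostDate.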

Related

I need to remove a node from an XML based on a Condition using Group by

I have the following XML:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<EMPLOYEE_ASSIGNMENT>
<REFRESH_DATE>2022-03-10 10:55:35.000</REFRESH_DATE>
<PERSON_ID>11189</PERSON_ID>
<EMPLOYEE_ID>032656300</EMPLOYEE_ID>
<EFFECTIVE_START_DATE>2020-08-19 00:00:00.000</EFFECTIVE_START_DATE>
<EFFECTIVE_END_DATE>4712-12-31 00:00:00.000</EFFECTIVE_END_DATE>
<BUSINESS_PROCESS>Absence Return for XXXXXXXXX last day of absence on 08/18/2020, first day back at work on 08/19/2020</BUSINESS_PROCESS>
<ACT_ASSIGNMENT_STATUS_TYPE_ID>1</ACT_ASSIGNMENT_STATUS_TYPE_ID>
<ACT_ORGANIZATION_ID>601</ACT_ORGANIZATION_ID>
<ACT_JOB_QUINTIQ_POSITION>Trainee</ACT_JOB_QUINTIQ_POSITION>
<ACT_HOURS_PER_WEEK>37.5</ACT_HOURS_PER_WEEK>
<ACT_HOURS_FREQUENCY>W</ACT_HOURS_FREQUENCY>
<ACT_BARGAINING_UNIT_CODE>C</ACT_BARGAINING_UNIT_CODE>
<ACT_PRIMARY_PROVINCE>BC</ACT_PRIMARY_PROVINCE>
</EMPLOYEE_ASSIGNMENT>
<EMPLOYEE_ASSIGNMENT>
<REFRESH_DATE>2022-03-10 10:55:35.000</REFRESH_DATE>
<PERSON_ID>11189</PERSON_ID>
<EMPLOYEE_ID>032656300</EMPLOYEE_ID>
<EFFECTIVE_START_DATE>2020-08-19 00:00:00.000</EFFECTIVE_START_DATE>
<EFFECTIVE_END_DATE>4712-12-31 00:00:00.000</EFFECTIVE_END_DATE>
<BUSINESS_PROCESS>Data Change: XXXXXXXXXXXX</BUSINESS_PROCESS>
<ACT_ASSIGNMENT_STATUS_TYPE_ID>1</ACT_ASSIGNMENT_STATUS_TYPE_ID>
<ACT_ORGANIZATION_ID>856</ACT_ORGANIZATION_ID>
<ACT_JOB_QUINTIQ_POSITION>Employee</ACT_JOB_QUINTIQ_POSITION>
<ACT_HOURS_PER_WEEK>37.5</ACT_HOURS_PER_WEEK>
<ACT_HOURS_FREQUENCY>W</ACT_HOURS_FREQUENCY>
<ACT_BARGAINING_UNIT_CODE>C</ACT_BARGAINING_UNIT_CODE>
<ACT_PRIMARY_PROVINCE>MB</ACT_PRIMARY_PROVINCE>
</EMPLOYEE_ASSIGNMENT>
</root>
The two nodes have the same EFFECTIVE_START_DATE but different BUSINESS_PROCESS values for a single EMPLOYEE_ID. I need to transform that XML so that when two (or more) BUSINESS_PROCESS values are present for an EMPLOYEE_ID on the same EFFECTIVE_START_DATE, only the one with the value Data Change: XXXXXXXXX is shown.
I need to transform it to:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<EMPLOYEE_ASSIGNMENT>
<REFRESH_DATE>2022-03-10 10:55:35.000</REFRESH_DATE>
<PERSON_ID>11189</PERSON_ID>
<EMPLOYEE_ID>032656300</EMPLOYEE_ID>
<EFFECTIVE_START_DATE>2020-08-19 00:00:00.000</EFFECTIVE_START_DATE>
<EFFECTIVE_END_DATE>4712-12-31 00:00:00.000</EFFECTIVE_END_DATE>
<BUSINESS_PROCESS>Data Change: XXXXXXXXXXXX</BUSINESS_PROCESS>
<ACT_ASSIGNMENT_STATUS_TYPE_ID>1</ACT_ASSIGNMENT_STATUS_TYPE_ID>
<ACT_ORGANIZATION_ID>856</ACT_ORGANIZATION_ID>
<ACT_JOB_QUINTIQ_POSITION>Employee</ACT_JOB_QUINTIQ_POSITION>
<ACT_HOURS_PER_WEEK>37.5</ACT_HOURS_PER_WEEK>
<ACT_HOURS_FREQUENCY>W</ACT_HOURS_FREQUENCY>
<ACT_BARGAINING_UNIT_CODE>C</ACT_BARGAINING_UNIT_CODE>
<ACT_PRIMARY_PROVINCE>MB</ACT_PRIMARY_PROVINCE>
</EMPLOYEE_ASSIGNMENT>
</root>
Thanks a lot
This is difficult to follow. If (!) I understand your description correctly, you want to do:
XSLT 2.0
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/root">
<xsl:copy>
<xsl:for-each-group select="EMPLOYEE_ASSIGNMENT" group-by="concat(EMPLOYEE_ID,'|', EFFECTIVE_START_DATE)">
<xsl:copy-of select="current-group()[starts-with(BUSINESS_PROCESS, 'Data Change:')]"/>
</xsl:for-each-group>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
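Note that the stylesheet above also drops any group that contains no "Data Change:" entry at all, including employees with a single assignment. If those should be kept (an assumption, since the question does not say), a possible variant falls back to the whole group when no preferred entry exists. The variable name preferred is made up for this sketch.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <xsl:template match="/root">
    <xsl:copy>
      <!-- one group per employee and start date -->
      <xsl:for-each-group select="EMPLOYEE_ASSIGNMENT"
          group-by="concat(EMPLOYEE_ID, '|', EFFECTIVE_START_DATE)">
        <!-- the entries we would rather keep, if the group has any -->
        <xsl:variable name="preferred"
            select="current-group()[starts-with(BUSINESS_PROCESS, 'Data Change:')]"/>
        <!-- keep the preferred entries; otherwise keep the group unchanged -->
        <xsl:copy-of select="if (exists($preferred)) then $preferred else current-group()"/>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```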

How do I match on any node that itself or any child has an attribute with a value in XSLT Template?

Say I have XML data like this:
<root>
<subs>
<sub>
<values>
<value attribute="a">1</value>
<value attribute="a">2</value>
<value attribute="c">3</value>
<value attribute="c">4</value>
</values>
</sub>
<subOther>
<otherValues attribute="c">
<otherValue attribute="a">1</otherValue>
<otherValue attribute="a">2</otherValue>
<otherValue attribute="b">3</otherValue>
<otherValue attribute="a">4</otherValue>
</otherValues>
</subOther>
</subs>
</root>
I am trying to create an XSLT template that matches all the nodes in the path to /root/subs/subOther/otherValues/otherValue[@attribute="b"].
So far, this is the closest I have gotten:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes" />
<xsl:strip-space elements="*" />
<!--IDENTITY TEMPLATE -->
<xsl:template match="@*|node()">
<xsl:apply-templates select="node()" />
</xsl:template>
<xsl:template match="//*[ancestor-or-self::[@attribute='b']]">
<xsl:copy>
<xsl:apply-templates select="node()" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
But that throws an error saying there is an unexpected token [. I have tried several combinations but they either don't match anything at all, match too much (i.e. everything), or they throw some sort of error.
Edit: I updated the example and expected to be a little more clear. Also note that this is a highly-simplified XML. In my actual file the attribute in question can be at any leaf node on any valid element for that level, so I have to use a more generic path using * and unknown paths with //. So, for instance, one of the value elements could be the one with attribute="b" and it would trigger the same result.
Edit 2: The expected result is to select the nodes whose path leads to any leaf child with an attribute equal to a specific value. In my XSD schema there are about 100 possible leaf nodes spread all over the place. The use case is that the attribute in question marks which data elements have changed, and I need to create a "diff" where the full file is whittled down to only the items that have changed and their parents. In the small example above, attribute="b" is the indication that I need to copy that node, and thus I would expect this exact result:
<root> <!-- Copied because part of the path -->
<subs> <!-- Copied because part of the path -->
<sub> <!-- Copied because part of the path -->
<values> <!-- Copied because part of the path -->
<value attribute="b">3</value> <!-- Copied because it matches the attribute -->
</values>
</sub>
</subs>
</root>
I hope that makes better sense. Also, I fixed the typo on the xsl:stylesheet being self-closing.
It looks like you have changed the identity template to ignore elements (the change will also drop attributes and text nodes), and added a template to copy the elements you need.
I think you need to reverse your logic. Instead of thinking about things you want to copy, think of it as removing things you don't want to copy.
So, you keep the identity template to do the generic copying, and add a second template to remove the things you don't want (the elements that don't have a "b" attribute either on themselves or on a descendant).
Try this XSLT
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes" />
<xsl:strip-space elements="*" />
<!--IDENTITY TEMPLATE -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()" />
</xsl:copy>
</xsl:template>
<xsl:template match="*[not(descendant-or-self::*[@attribute = 'b'])]" />
</xsl:stylesheet>
See it in action at http://xsltfiddle.liberty-development.net/ncntCS6

XSLT - Get String between commas

How can I get the value 'four' in XSLT?
<root>
<entry>(one,two,three,four,five,six)</entry>
</root>
Thanks in advance.
You didn't specify the XSLT version, so I assume version 2.0.
I also assume that the word four is only a "marker" indicating which part of the string to take (the text between the 3rd and 4th comma).
To get the fragment you want, you can:
Use the tokenize function to "cut" the whole content of entry into pieces, using a comma as the separator.
Take the fourth item of the resulting sequence.
This expression can be used e.g. in a template matching entry.
So the example script can look like below:
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:transform version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" encoding="UTF-8" indent="yes" />
<xsl:template match="entry">
<xsl:copy>
<xsl:value-of select="tokenize(., ',')[4]"/>
</xsl:copy>
</xsl:template>
<xsl:template match="@*|node()">
<xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
</xsl:template>
</xsl:transform>
For your input XML it gives:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<entry>four</entry>
</root>
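Since the XSLT version was not stated, here is a rough XSLT 1.0 alternative, where tokenize is not available. It is only a sketch under the same assumption (the wanted text sits between the 3rd and 4th comma): it skips past three commas with nested substring-after calls and then cuts at the next comma with substring-before.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" indent="yes" />

  <!-- identity template for everything except entry -->
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>

  <xsl:template match="entry">
    <xsl:copy>
      <!-- drop everything up to and including the 3rd comma,
           then keep what precedes the following comma -->
      <xsl:value-of select="substring-before(
          substring-after(substring-after(substring-after(., ','), ','), ','),
          ',')"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

For the sample input this also yields an entry element containing four. It does not generalise well to an arbitrary position; for that, a recursive named template would be the usual XSLT 1.0 route.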

Replace xsi:nil="true" with open and close tags

I need to do the following transformation in order to get a message to pass through an integration broker which does not understand xsi:nil="true". I know that for strings, something like <abc></abc> is not the same as <abc xsi:nil="true"/>, but I have no option.
My input XML:
<PART>
<LENGTH_UOM xsi:nil="1"/>
<WIDTH xsi:nil="1"/>
</PART>
Expected outcome:
<PART>
<LENGTH_UOM></LENGTH_UOM>
<WIDTH></WIDTH>
</PART>
Please let me know your suggestions.
To remove all xsi:nil attributes combine the identity template with an empty template matching xsi:nil.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="node()|@*"> <!-- identity template -->
<xsl:copy>
<xsl:apply-templates select="node()|@*" />
</xsl:copy>
</xsl:template>
<xsl:template match="@xsi:nil" /> <!-- empty template -->
</xsl:stylesheet>
If you only want to remove those whose value is true use the following empty template instead.
<xsl:template match="@xsi:nil[.='1' or .='true']" />
Concerning the opening and closing tag topic I suggest reading this SO question in which Martin Honnen states that (in the comments of the answer):
I am afraid whether an empty element is marked up as <foo/> or <foo /> or <foo></foo> is not something that matters with XML and is usually not something you can control with XSLT processors.

XSLT: How to remove elements of a resulting result tree fragment while copying?

My goal is to extract the contents of the SOAP body, e.g. the ElementsToExtract node, but the node name can basically be arbitrary:
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
<soap:Header>
<MessageId>52DF2371-4094-4408-A3EA-42D73FD1B7A3</MessageId>
</soap:Header>
<soap:Body>
<ElementsToExtract>
...
<RemoveMe>...</RemoveMe>
<RemoveMeAlso>...</RemoveMeAlso>
...
</ElementsToExtract>
</soap:Body>
</soap:Envelope>
While I'm extracting the contents, I want to get rid of two elements that all my source documents have in common - say RemoveMe and RemoveMeAlso. As there's a chance that the deeper nested nodes may be called the same, they must only be stripped from the layer below the ElementsToExtract node. How would I formulate that expression?
Here's what I did up to now:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
xmlns:exsl="http://exslt.org/common"
exclude-result-prefixes="soap exsl">
<xsl:output method="xml" indent="yes" omit-xml-declaration="no"/>
<xsl:strip-space elements="*"/>
<xsl:variable name="SoapHeaderContents" select="exsl:node-set(soap:Envelope/soap:Header/*)"/>
<xsl:variable name="SoapBodyContents" select="exsl:node-set(soap:Envelope/soap:Body/*)"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="/">
<xsl:apply-templates select="$SoapBodyContents"/>
</xsl:template>
<!-- This is global, how to restrict to the ElementsToExtract element? -->
<xsl:template match="node()[name() = 'RemoveMe']"/>
<xsl:template match="node()[name() = 'RemoveMeAlso']"/>
</xsl:stylesheet>
I also played with the node-set() function, having read that one can not modify result tree fragments (they're only text nodes?), but I don't quite understand how to address the resulting nodes of that set. So the nodes weren't removed:
<xsl:template match="/">
<xsl:apply-templates select="$SoapBodyContents"/>
<xsl:apply-templates select="$SoapBodyContents/RemoveMe" mode="m1"/>
</xsl:template>
<xsl:template name="StripRemoveMe" match="RemoveMe" mode="m1"/>
I also read some parts of the specification, but to no avail. I'm lost for clues. Can someone direct me to the right approach?
Would this work for you:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
exclude-result-prefixes="soap">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<!-- identity transform -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<!-- skip soap wrappers -->
<xsl:template match="/soap:Envelope">
<xsl:apply-templates select="soap:Body/ElementsToExtract"/>
</xsl:template>
<!-- remove unwanted elements -->
<xsl:template match="ElementsToExtract/RemoveMe | ElementsToExtract/RemoveMeAlso"/>
</xsl:stylesheet>
In the (unlikely) case you don't know the name of the ElementsToExtract element, you could use:
<!-- skip soap wrappers -->
<xsl:template match="/soap:Envelope">
<xsl:apply-templates select="soap:Body/*"/>
</xsl:template>
<!-- remove unwanted elements -->
<xsl:template match="soap:Body/*/RemoveMe | soap:Body/*/RemoveMeAlso"/>
Some quick thoughts.
You create variables for storing the SOAP header and body. These are already in the input document, so it makes more sense to just write templates that match these.
Although you create a variable for the SOAP header, you never use it anywhere.
If you try to apply templates in succession, as in your sample XSL code, you will get all the output nodes from the first apply-templates, and then all the output nodes from the next apply-templates. If these nodes are meant to be interleaved in any way, this approach will not produce viable output.
Here's a revised version of your sample input XML, adding in a couple elements that we want to keep.
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
<soap:Header>
<MessageId>52DF2371-4094-4408-A3EA-42D73FD1B7A3</MessageId>
</soap:Header>
<soap:Body>
<ElementsToExtract>
<KeepMe>This text will persist in the output.</KeepMe>
<RemoveMe>This is text that will be removed.</RemoveMe>
<RemoveMeAlso>This will also vanish from the output.</RemoveMeAlso>
<OtherElementToKeep>And this one will also be kept.</OtherElementToKeep>
</ElementsToExtract>
</soap:Body>
</soap:Envelope>
Here's what we'd want as output:
<?xml version="1.0" encoding="utf-8"?>
<ElementsToExtract>
<KeepMe>This text will persist in the output.</KeepMe>
<OtherElementToKeep>And this one will also be kept.</OtherElementToKeep>
</ElementsToExtract>
This XSL 1.0 code will do the job. I'm guessing from your post that you're not familiar with XSL processing flow, so I've added comments to help explain what's going on.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
version="1.0"
exclude-result-prefixes="soap">
<xsl:strip-space elements="*"/>
<xsl:output method="xml" indent="yes"/>
<!-- The `/` matches the _logical root_ of the input file. This is
basically equivalent to the start of the file, NOT the first element.
This is a common place to start processing in XSL. -->
<xsl:template match="/">
<!-- We just apply templates. In your case, we know already that
we DON'T want to process everything: we want to leave certain
things out, including a lot of the outermost elements. So
we specify what to target in the `select` statement. -->
<xsl:apply-templates select="soap:Envelope/soap:Body/ElementsToExtract"/>
</xsl:template>
<!-- This is the "identity" template, so called because it
just copies over applicable matches identically.
A template with a more-specific match statement takes
precedence. -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<!-- Here, we specify exactly those elements that are in the
processing flow, and that we want to exclude from the
output. Since `soap:Header` etc. are NOT in the processing
flow (their element trees were never included in a preceding
call to `apply-templates`), we don't need to worry about those. -->
<xsl:template match="RemoveMe | RemoveMeAlso"/>
</xsl:stylesheet>
Note that the outermost element in the output is ElementsToExtract. This element will include the xmlns:soap="http://www.w3.org/2003/05/soap-envelope" namespace declaration, even though this namespace isn't used in any of the output elements (at least, for this small sample input XML).
If you can use XSL 2.0+ and you want to remove this namespace from the output, you could add the copy-namespaces="no" attribute to the <xsl:copy> element.
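If you are limited to XSL 1.0, one common workaround is to rebuild each element with xsl:element instead of copying it, since xsl:copy always carries the in-scope namespace nodes along. The following is a sketch only, reusing the element names from the sample input; it replaces the identity template with two templates.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
    version="1.0"
    exclude-result-prefixes="soap">
  <xsl:strip-space elements="*"/>
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <xsl:apply-templates select="soap:Envelope/soap:Body/ElementsToExtract"/>
  </xsl:template>

  <!-- Recreate each element by name and namespace URI. Unlike xsl:copy,
       xsl:element does not copy unused namespace declarations. -->
  <xsl:template match="*">
    <xsl:element name="{name()}" namespace="{namespace-uri()}">
      <xsl:apply-templates select="@*|node()"/>
    </xsl:element>
  </xsl:template>

  <!-- Attributes, text, comments and processing instructions are copied as-is. -->
  <xsl:template match="@*|text()|comment()|processing-instruction()">
    <xsl:copy/>
  </xsl:template>

  <!-- Named matches have a higher default priority than "*", so these still win. -->
  <xsl:template match="RemoveMe | RemoveMeAlso"/>
</xsl:stylesheet>
```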