Find an XPath element using a variable (XSLT)

I'm looking to find an element in a schema based on the value of a variable (which changes on each iteration). The catch is that the element could be anywhere inside the schema.
For instance:
<...
<foo>
<bar>
<bar1>BB</bar1>
<bar2>CC</bar2>
</bar>
<rab>
<rab1>DD</rab1>
</rab>
</foo>
/...>
$attribute = bar1
(On the next iteration, $attribute may equal rab1.)
How would I write an expression that finds .../foo/bar/$attribute?
The closest thing I can find is ...//*[name()=$attribute], but it doesn't work. Is there any other way?
Thanks for your help!

Although the question leaves out a lot of details that may be important, you could try changing name() to local-name():
...//*[local-name()='bar1']
and see if that fixes the problem. The return value of name() includes any prefix the element name has, which could cause it not to match the value of $attribute. (@Kirill was hinting at this.)
If that doesn't solve the problem, provide more context: What is the full XPath expression? How is it being used in XSLT? How do you know it "doesn't work"? (Give expected results and actual results.)
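To illustrate the name() versus local-name() point outside XSLT, here is a minimal Python sketch using only the standard library. The sample document, the urn:example namespace, and the helper names local_name and find_by_local_name are assumptions made for illustration; they are not from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: bar1 carries a namespace prefix, the other elements do not.
XML = (
    '<foo xmlns:p="urn:example">'
    "<bar><p:bar1>BB</p:bar1><bar2>CC</bar2></bar>"
    "<rab><rab1>DD</rab1></rab>"
    "</foo>"
)
root = ET.fromstring(XML)

def local_name(tag):
    # ElementTree stores namespaced tags as "{uri}local"; keep only the local part.
    return tag.rsplit("}", 1)[-1]

def find_by_local_name(root, wanted):
    # Search the whole tree, comparing only the local part of each element name.
    return [e for e in root.iter() if local_name(e.tag) == wanted]

attribute = "bar1"  # plays the role of $attribute
by_local_name = [e.text for e in find_by_local_name(root, attribute)]
by_plain_name = root.findall(".//" + attribute)  # ignores namespaced elements

print(by_local_name)
print(by_plain_name)
```

The plain lookup misses the element because its full name includes a namespace, which is the same kind of mismatch that name() can produce when the input uses prefixes.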

Related

TinyXML2: Replace Node function?

I am having a hard time using TinyXML2 (https://github.com/leethomason/tinyxml2) to write a C/C++ method that replaces a given node like:
<doc>
<replace>Foo</replace>
</doc>
...with another node:
<replacement>Bar</replacement>
...so that the outcome is:
<doc>
<replacement>Bar</replacement>
</doc>
However, the node to be replaced may appear multiple times, and I would like to keep the order in case I replace the second node with something else.
This should be straightforward, but I keep ending up with endless recursion.
Is there an example around of how to do that? Any help would be greatly appreciated.

Do you have sample code?
You could try calling tinyxml2::XMLNode::InsertAfterChild to insert <replacement> followed by a deletion of <replace>.
This answer also seems related: Updating Data in tiny Xml element

I'd recommend copying the source XML to a new document using the visitor pattern, making substitutions as you go. Substituting in place is very likely to lead to broken chains and the endless loops that you're experiencing.
You can find an example of using the visitor pattern to make substitutions (in element attributes and text, but it's the same principle) here. See the xcopy function and associated code near the bottom.

How to (nicely) template match multiple specific child elements (union) within wider XPATH

I'm trying to match a set of particular elements, but only ones which are descendants of another element structure (let's say input or select elements anywhere inside divs with the class "special-sauce"). Normally, this would be easy in XPath: we could parenthetically union the targeted children, like so:
div[contains(@class, 'special-sauce')]//(input | select)
But this is where XSLT throws a curve ball, when we try to use this as a template match (at least in Saxon):
<xsl:template match="div[contains(@class, 'special-sauce')]//(input | select)">
{"error":"The xsl file (/section-settings.xsl) could not be parsed.
Failed to compile stylesheet. 1 error
detected.","code":"TRANSFORM_ERROR","location":null,"causes":["Fatal
Error: Token \"(\" not allowed here in an XSLT pattern"]}
Basically, parentheticals aren't allowed as part of a template match at the main pathing level (they still work fine inside of conditionals/etc, obviously).
So what to do?
Well, technically, using a union can still work, but we would have to repeat the ancestor XPATH each time, since we can't parenthetically enclose the children:
<xsl:template match="div[contains(@class, 'special-sauce')]//input
| div[contains(@class, 'special-sauce')]//select">
This is doable (not very pretty, but sure, we can handle that! line breaks can work here to help our sanity yay) in our simple example here, but it gets problematic with more complex XPATH, especially if the parenthetical union would have been in the middle of a longer xpath, or for a lot of elements.
e.g.
div[contains(@class, 'major-mess')]/div[contains(@class, 'special-sauce')]//(dataset | optgroup | fieldset)//(button | option | label)
becomes
a crazy mess.
Ok, that quickly becomes less of an option in more complex examples. And while structuring our XSLT differently might help (intermediary matches, using modality, etc), the question remains:
How can we gracefully template match using unions of individual child elements within a larger XPATH pattern when parentheticals won't work?
An example sheet for the first example:
<div class="special-sauce">
<input class="form-control" type="text" value="" placeholder="INHERITED:" />
<select class="form-control">
<option value="INHERITED: ">INHERIT: </option>
<option value=""></option>
</select>
<div class="radio">
<label>
<input type="radio" name="param3vals" value="INHERITED: " />
INHERIT:
</label>
</div>
</div>
<div class="not-special"><input type="text" id="contact-info-include-path" size="90">
<label>contact</label>
</input></div>
<div class="sad-panda"><input type="text" id="sidenav-include-path" size="90">
<label>sidenav</label>
</input></div>
Note: this does assume that an identity transform is running as the primary method of handling the input document.
While there are other questions which could validly receive similar answers as, for example, the one I give below, I felt the context of those questions was usually more general (such that a top level union would be fine as their answer without complication), more specific in ways that didn't match, or simply too different. Hence the Q&A format.
XSLT 1.0 vs 2.0 vs 3.0
Michael Kay correctly notes in his answer below that while the original pattern attempted here doesn't work in XSLT 1.0 or 2.0, it should work in a (fully) XSLT 3.0 compatible processor. I'm currently on a system using Saxon 9.3, which is technically XSLT 2.0. I just want to call extra attention to that answer for those who are on a 3.0 system.

I looked all over and most answers to similar problems involved copying the repeated portion of the XPATH to each element and unioning it all together. But there is a better way! It's easy to forget that matching a particular element is relatively equivalent to matching that element's name within XPATH.
Use name() or local-name() instead of matching on the element directly within the template match pattern*.
Be aware of your namespace issues/needs when picking which to use. This still allows for advanced conditionals on attributes/etc of those elements.
The first match, for example, becomes:
<xsl:template match="div[contains(@class, 'special-sauce')]//
element()[local-name() = ('input', 'select')]">
There's not a huge gain here in terms of space or time to write this out, but we do reduce redundancy and the associated data consistency errors that can result (all too often, especially if later making changes).
Where this really shines is the last example in the question (the mess):
<xsl:template match="div[contains(@class, 'major-mess')]/
div[contains(@class, 'special-sauce')]//
element()[local-name() = ('dataset', 'optgroup', 'fieldset')]//
element()[local-name() = ('button', 'option', 'label')]">
The sequence comparison used above, such as local-name() = ('input', 'select'), and the element() node test are XPath 2.0 features. If you need XSLT/XPath 1.0 compatibility, use * as the node test together with the "contains() with bracketing separator tokens" pattern (the separators reduce the chance of a false positive from one element name being a substring of another):
<xsl:template match="div[contains(@class, 'major-mess')]/
div[contains(@class, 'special-sauce')]//
*[contains('|dataset|optgroup|fieldset|', concat('|', local-name(), '|'))]//
*[contains('|button|option|label|', concat('|', local-name(), '|'))]">
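The idea behind the name-based match (one ancestor condition, then a membership test on the element name) can be sketched outside XSLT. The following Python snippet is only an analogy using the standard library; the trimmed sample markup, the WANTED set, and the function match_in_special are assumptions for illustration, and this is not how an XSLT processor evaluates patterns.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical version of the sample sheet from the question.
HTML = """<root>
  <div class="special-sauce">
    <input type="text"/>
    <select><option/></select>
    <div class="radio"><label><input type="radio"/></label></div>
  </div>
  <div class="not-special"><input type="text"/></div>
</root>"""

WANTED = {"input", "select"}  # plays the role of ('input', 'select')

def match_in_special(root):
    """Collect wanted elements that sit anywhere inside a special-sauce div."""
    hits = []
    for div in root.iter("div"):
        # Same idea as the predicate contains(@class, 'special-sauce').
        if "special-sauce" in div.get("class", ""):
            # The ancestor condition is written once; the name test is a set lookup.
            hits.extend(e for e in div.iter() if e.tag in WANTED)
    return hits

matched = match_in_special(ET.fromstring(HTML))
print([e.tag for e in matched])
```

Adding another element name means extending WANTED, in the same way that the XSLT version only needs another entry in the name list rather than another copy of the ancestor path.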
* = "match pattern" vs "XPath"
If you're struggling to understand why the naive approach (the first thing I attempted in the question) fails in XSLT, it helps to know that the match attribute of a template must be an XSLT pattern, and patterns are only a subset of valid XPath expressions. This is easy to confuse and forget, especially when many sources pretend it's all XPath. In XSLT 1.0 and 2.0 patterns, parentheses are only allowed inside expressions within predicates, not in any other portion of the location path or its steps.
Final Considerations
Performance: I have no idea whether there are notable performance differences between this approach and unioning each separate element as a full path, or whether there is even a real performance difference between addressing an element natively and using a predicate on the anonymous element() selector. My suspicion is that most XSLT processors can achieve a faster tree search when a single match is written using the native path structure rather than a name() predicate on the anonymous selector, while the union cases may be faster or slower depending on how well the processor pre-compiles and optimizes such patterns. I will leave benchmarking to someone else, because ultimately the real hurdle is developer sanity and maintenance (the likelihood of human error). In complex matches, I feel that any small performance penalty is easily outweighed by the legibility and the reduced or eliminated redundancy of this approach.

I think that your pattern is legal in XSLT 3.0 as written. But I guess you want an XSLT 2.0 solution...
One great way that people often overlook is to use schema-aware patterns. If you want to match a choice of elements, it's quite likely that they are closely related in the schema, for example by having a common type T or by being members of a substitution group S. You can then write
div[contains(@class, 'special-sauce')]//schema-element(S)
or
div[contains(@class, 'special-sauce')]//element(*, T)
But I guess you want a solution that isn't schema-aware...
In that case, I don't think I can offer anything better than what you've got.
Sometimes multiple modes are the answer: for example something like
<xsl:template match="div[contains(@class, 'special-sauce')]">
<xsl:apply-templates mode="special"/>
</xsl:template>
<xsl:template match="select|input" mode="special">
Generally I think modes are greatly under-used.

Why not split this template into two or three (one for each level) with modes? Something like
<xsl:template match="div[contains(@class, 'special-sauce')]">
<xsl:apply-templates select=".//select | .//input" mode="special-sauce"/>
</xsl:template>
<xsl:template match="select|input" mode="special-sauce">
<!-- ... -->
</xsl:template>
In my opinion this way it reads clearer.

position()=1 working correctly, but not position()<5

I'm new to XSLT, and I'm carrying out a few tests using w3schools "Try it yourself" pages. I'm using the following demo:
http://www.w3schools.com/xsl/tryxslt.asp?xmlfile=cdcatalog&xsltfile=tryxsl_choose
This contains the following line:
<xsl:for-each select="catalog/cd">
I'm testing filtering the rendered HTML by position(), but I'm having issues when using the < operator.
I've tried the following:
<xsl:for-each select="catalog/cd[position()=1]">
And this returns the first item from the XML data (as expected).
I then tried:
<xsl:for-each select="catalog/cd[position()<5]">
I was expecting this to return the first 4 items, but instead I get no results.
My guess is that perhaps position()=1 is doing a string comparison, which is why it returns the first item, but it cannot understand position()<5 as a string cannot be compared in this way?
Why is this happening, and what would be the correct syntax to get the results I wish to achieve?
Update: After reading @joocer's response and testing this myself, I found that the > operator does work, for the opposite result:
<xsl:for-each select="catalog/cd[(position()>5)]">

It looks very much like a bug in the version of libxslt that w3schools is using.

Even inside quotes, you must type < as &lt; so it won't be confused with the start of an element tag. I think this was done to make it easier for tolerant parsers to recover from errors and for streaming parsers to skip content faster. They can always look for < outside CDATA and know that it begins an element start or end tag.
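The escaping rule can be checked with any conforming XML parser. Below is a minimal Python sketch using the standard library's ElementTree; the unprefixed for-each element and the helper read_select are assumptions for illustration and do not reproduce the w3schools setup.

```python
import xml.etree.ElementTree as ET

def read_select(xml_text):
    """Return the select attribute, or None if the text is not well-formed XML."""
    try:
        return ET.fromstring(xml_text).get("select")
    except ET.ParseError:
        return None

# A raw "<" inside an attribute value is not well-formed XML.
raw = read_select('<for-each select="catalog/cd[position()<5]"/>')

# Written as &lt;, the parser accepts it and hands back the unescaped expression.
escaped = read_select('<for-each select="catalog/cd[position()&lt;5]"/>')

print(raw)
print(escaped)
```

A stylesheet is itself an XML document, so the XPath engine only ever sees the unescaped expression once the parser has accepted the attribute.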

I don't know why, but inverting the condition works, so instead of looking for less than 5, look for not more than 4
<xsl:for-each select="catalog/cd[not(position()>4)]">

XSLT Get the first occurrence of a specific tag

Let's say I have a full HTML document as XML input.
What would the XSLT file look like if I only want to output the first (or any) image from the HTML?

One XPath expression that selects the first <img> element in a document is:
(//img)[1]
Do note that a frequent mistake -- as made by @Oded in his answer -- is to suggest the following XPath expression, which in general may select more than one element:
//img[1] (: WRONG !!! :)
This selects all <img> elements in the document, each one of which is the first <img> child of its parent.
Here is the exact explanation of this frequent mistake -- in the W3C XPath 1.0 Recommendation:
NOTE: The location path //para[1] does not mean the same as the location path /descendant::para[1]. The latter selects the first descendant para element; the former selects all descendant para elements that are the first para children of their parents.
A further problem exists if the document has defined a default namespace, which must be the case with XHTML. XPath treats any unprefixed name as belonging to no namespace and the expression (//img)[1] selects no node, because there is no element in the document that belongs to no namespace and has name img.
In this case there are two ways to specify the wanted XPath expression:
(//x:img)[1] -- where the prefix x is associated (by the hosting language) with the specific default namespace (in this case, the XHTML namespace).
(//*[name()='img'])[1]
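The difference between the two expressions can be observed with Python's standard-library ElementTree, whose limited XPath subset also treats [1] as a position among siblings. This is a minimal sketch: the sample document and variable names are assumptions for illustration, and ElementTree does not support the parenthesized form (//img)[1], so find() stands in for "first match in document order".

```python
import xml.etree.ElementTree as ET

# Hypothetical page with three images under two different parents.
DOC = ET.fromstring(
    "<html><body>"
    "<div><img src='a.png'/><img src='b.png'/></div>"
    "<p><img src='c.png'/></p>"
    "</body></html>"
)

# Like //img[1]: every img that is the first img child of its own parent.
first_per_parent = [img.get("src") for img in DOC.findall(".//img[1]")]

# Like (//img)[1]: only the first img in document order.
first_in_document = DOC.find(".//img").get("src")

print(first_per_parent)
print(first_in_document)
```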

The XPath expression that will retrieve the first image from an HTML page is (//img)[1].
See the answer from @Dimitre Novatchev for more information on problems with it.

Element-in-List testing

For a stylesheet I'm writing (actually for a set of them, each generating a different output format), I have the need to evaluate whether a certain value is present in a list of values. In this case, the value being tested is taken from an element's attribute. The list it is to be tested against comes from the invocation of the stylesheet, and is taken as a top-level <xsl:param> (to be provided on the command-line when I call xsltproc or a Saxon equivalent invocation). For example, the input value may be:
v0_01,v0_10,v0_99
while the attribute values will each look very much like one such value. (Whether a comma is used to separate values, or a space, is not important-- I chose a comma for now because I plan on passing the value via command-line switch to xsltproc, and using a space would require quoting the argument, and I'm lazy-enough to not want to type the extra two characters.)
What I am looking for is something akin to Perl's grep, wherein I can see if the value I currently have is contained in the list. It can be done with sub-string tests, but this would have to be clever so as not to get a false-positive (v0_01 should not match a string that contains v0_011). It seems that the only non-scalar data-type that XSL/XSLT supports is a node-set. I suppose it's possible to convert the list into a set of text nodes, but that seems like over-kill, even compared to making a sub-string test with extra boundaries-checking to prevent false matches.

Actually, using XPath string functions is the right way to do it. All you have to make sure is that you test for the delimiters as well:
contains(concat(',', $list, ','), concat(',', $value, ','))
would return a Boolean value. Or you might use one of these:
substring-before(concat('|,', $list, ',|'), concat(',', $value, ','))
or
substring-after(concat('|,', $list, ',|'), concat(',', $value, ','))
If you get an empty string as the result, $value is not in the list.
EDIT:
@Dimitre's comment is correct: substring-before() (or substring-after()) would also return the empty string if the found string is the first (or the last) in the list. To avoid that, I added something at the start and the end of the list. Still, contains() is the recommended way of doing this.

In addition to the XPath 1.0 solution provided by Tomalak, with XPath 2.0 one can tokenize the list of values:
exists(tokenize($list, ',')[. = $value])
evaluates to true() if and only if $value is contained in the list of values $list
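The two approaches from these answers (wrapping both strings in delimiters, and tokenizing the list) can be compared with a plain substring test in a few lines of Python. This is a minimal sketch; the function names and the sample list are assumptions for illustration.

```python
def naive_contains(value, values):
    # Plain substring test: prone to false positives.
    return value in values

def delimited_contains(value, values, sep=","):
    # Mirrors contains(concat(',', $list, ','), concat(',', $value, ',')).
    return f"{sep}{value}{sep}" in f"{sep}{values}{sep}"

def tokenized_contains(value, values, sep=","):
    # Mirrors exists(tokenize($list, ',')[. = $value]).
    return value in values.split(sep)

VERSIONS = "v0_011,v0_10,v0_99"  # hypothetical stylesheet parameter value

print(naive_contains("v0_01", VERSIONS))      # True: matches inside v0_011
print(delimited_contains("v0_01", VERSIONS))  # False
print(tokenized_contains("v0_01", VERSIONS))  # False
print(delimited_contains("v0_10", VERSIONS))  # True
```

Both safe variants agree on every input as long as the value itself does not contain the separator.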
