I have a text node from which I need to extract a particular substring based on a pattern.
Below is the actual string:
//Some_string,"some_string_1":"target_string"},some_other_string//
Following is the regex pattern I am trying to use:
<xsl:analyze-string select="text_node/text()" regex=",("some_string_1":.*?)"}">
<xsl:matching-substring>
<xsl:value-of select="substring-after(regex-group(1),':"')"/>
</xsl:matching-substring>
</xsl:analyze-string>
My extracted substring should be "target_string".
But I am getting the following error:
Fatal error during transformation using //my_path: Closing curly brace in attribute value template ",("some_string_1":.*?)"}" must be doubled;
I also tried doubling the curly braces, but that didn't work either.
Note - I am using Ant script to run the XSLT with saxon-he-10.1.jar
Thanks in advance!
I tend to put regular expressions into an xs:string typed parameter or variable e.g.
<xsl:param name="pattern" as="xs:string">,("some_string_1":(.*?))"}</xsl:param>
then I can use
<xsl:analyze-string select="." regex="{$pattern}">
<xsl:matching-substring>
<xsl:value-of select="regex-group(2)"/>
</xsl:matching-substring>
</xsl:analyze-string>
where I have a lot less to worry about escaping things in an attribute value template like the regex attribute.
Note that if you use XSLT 3, it is safer to have
<xsl:param name="pattern" as="xs:string" expand-text="no">,("some_string_1":(.*?))"}</xsl:param>
to avoid any text value template setting higher up in the tree kicking in.
You need to double the curly bracket because it has special meaning in XSLT and escape it because it has special meaning in regex:
",("some_string_1":.*?)"\}}"
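For reference, here is a minimal sketch of the whole instruction with the doubled and escaped brace applied. It assumes an XSLT 2.0 or later processor (such as the Saxon HE mentioned in the question) and a hypothetical input element named text_node holding the sample string. Two details are my own choices, not part of the answers above: the regex attribute is delimited with single quotes so the pattern can contain plain double quotes, and the capture group sits inside the quotes so that regex-group(1) should come out as target_string without further trimming.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical input:
       <text_node>//Some_string,"some_string_1":"target_string"},some_other_string//</text_node> -->
  <xsl:template match="text_node">
    <!-- "}}" is the attribute value template escape for one literal "}";
         the backslash then escapes that "}" for the regex engine. -->
    <xsl:analyze-string select="." regex=',"some_string_1":"(.*?)"\}}'>
      <xsl:matching-substring>
        <!-- Group 1 is the text between the quotes after the key. -->
        <xsl:value-of select="regex-group(1)"/>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>
```

Since there is no xsl:non-matching-substring, everything outside the match is dropped from the output.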
I have gone through several examples and tried to modify my search to understand what is what.
Input:
<Description>Ottelu pelattu 22.4.2018. Gagarin Cupin 5. finaaliottelu. Selostus: Antti Mäkinen.</Description>
<Description>Ottelu pelattu 20.4.2018. Gagarin Cupin 1. finaaliottelu. Selostus: Antti Mäkinen.</Description>
<Description>Ottelu pelattu 22.4.2018. Gagarin Cupin 2. puolivälierä. Selostus: Antti Mäkinen.</Description>
What I want to do is select these to my output:
"Gagarin Cupin 5. finaaliottelu"
"Gagarin Cupin 2. puolivälierä"
Without the dot there in the middle.
I could use substring-before/after, but I understand it could be useful to use regex?
I have made this that fetches what I need: Gagarin Cupin\s\d\W\s\w*[a-zA-ZäöüÄÖÜß]*
But now how can I use this in XSLT? Is it analyze-string() that I should use? or matches() in some way?
XSLT:
<xsl:variable name="episode" select="Description"/>
<xsl:variable name="fetchcup">
<xsl:analyze-string select="$episode" regex="Gagarin Cupin\s\d\W\s\w*[a-zA-ZäöüÄÖÜß]*">
<xsl:matching-substring>
<xsl:value-of select="regex-group(1)"/>
</xsl:matching-substring>
</xsl:analyze-string>
</xsl:variable>
<Cup><xsl:value-of select="$fetchcup"/></Cup>
But honestly, I feel like I am missing some basics of how it works despite looking through tutorial pages and examples. If I get a foot in the door I can apply it further.
Your regular expression works in the context of a single Description element. Inside xsl:matching-substring, if you want to output the matched string, you can simply use . for the context item or regex-group(0) (see https://www.w3.org/TR/xslt-30/#func-regex-group). Using regex-group(1) doesn't make sense in your case, as your regular expression does not have any subexpressions.
<xsl:template match="Description">
<cup>
<xsl:analyze-string select="." regex="Gagarin Cupin\s\d\W\s\w*[a-zA-ZäöüÄÖÜß]*">
<xsl:matching-substring>
<xsl:value-of select="."/>
</xsl:matching-substring>
</xsl:analyze-string>
</cup>
</xsl:template>
That template in https://xsltfiddle.liberty-development.net/nc4NzQH outputs
<cup>Gagarin Cupin 5. finaaliottelu</cup>
<cup>Gagarin Cupin 1. finaaliottelu</cup>
<cup>Gagarin Cupin 2. puolivälierä</cup>
for your three Description elements, I hope I understood that correctly as the desired output.
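If "without the dot" in the question means dropping the period after the round number, parenthesized subexpressions are what make regex-group(1) and regex-group(2) meaningful. The following is only a sketch under that assumption; the grouping is mine, not the asker's pattern. Note that \w in XPath regular expressions already covers letters such as ä, so the extra character class is not needed here.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="Description">
    <cup>
      <!-- Group 1: "Gagarin Cupin" plus the round number.
           \W consumes the dot and is not captured.
           Group 2: the following space and word. -->
      <xsl:analyze-string select="." regex="(Gagarin Cupin\s\d)\W(\s\w*)">
        <xsl:matching-substring>
          <xsl:value-of select="concat(regex-group(1), regex-group(2))"/>
        </xsl:matching-substring>
      </xsl:analyze-string>
    </cup>
  </xsl:template>
</xsl:stylesheet>
```

For the first sample Description this should give <cup>Gagarin Cupin 5 finaaliottelu</cup>.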
I wrote a function to strip all special characters.
<xsl:function name="lancet:stripSpecialChars">
<xsl:param name="string" />
<xsl:variable name="AllowedSymbols"
select="'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789'"/>
<xsl:value-of select="
translate(
$string,
translate($string, $AllowedSymbols, ' '),
' ')
"/>
</xsl:function>
<xsd:element xtt:fixedLength="14" xtt:required="true" xtt:severity="error" xtt:align="left">
<xsl:value-of select="lancet:stripSpecialChars(upper-case(replace(normalize-unicode(translate($emp/wd:First_Name, ',', ' '), 'NFKD'), '⁄', '/')))"/>
</xsd:element>
Now I am required to keep the apostrophe ('). When I try to include it in AllowedSymbols, I get an error.
The output right now is D AGOSTINO. I need something like D'AGOSTINO.
I am not sure how to handle this. Could someone please help me out? Thanks
You don't say what the error is, but you probably just need to escape the apostrophe in your variable.
This is done by doubling up the apostrophe:
<xsl:variable name="AllowedSymbols" select="'''ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789'"/>
Since you're using XSLT 2.0, you should be able to use replace() instead of translate()...
<xsl:function name="lancet:stripSpecialChars">
<xsl:param name="string"/>
<xsl:value-of select="replace($string,'[^A-Z0-9'']','')"/>
</xsl:function>
I'm not replacing lowercase letters since the string you're passing is already forced to uppercase, but if you use the function elsewhere you can add a-z to the character class.
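A small usage sketch of the replace() version, assuming an XSLT 2.0 processor. The namespace URI urn:example:lancet is a placeholder of mine (the question does not show the real one), and the as="xs:string" type declarations are an addition as well.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:lancet="urn:example:lancet"
    exclude-result-prefixes="xs lancet">
  <xsl:output method="text"/>

  <!-- Remove every character that is not A-Z, 0-9 or an apostrophe.
       Inside the XPath string literal the apostrophe is written twice. -->
  <xsl:function name="lancet:stripSpecialChars" as="xs:string">
    <xsl:param name="string" as="xs:string"/>
    <xsl:sequence select="replace($string, '[^A-Z0-9'']', '')"/>
  </xsl:function>

  <xsl:template match="/">
    <!-- Hypothetical hard-coded name instead of $emp/wd:First_Name -->
    <xsl:value-of select="lancet:stripSpecialChars(upper-case('D''Agostino'))"/>
  </xsl:template>
</xsl:stylesheet>
```

This should print D'AGOSTINO. Be aware that this character class also removes spaces; add a space inside the brackets if they need to survive.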
Encode it as &apos; (or &#8217; for the typographic apostrophe ’).
Enclose the value in a CDATA section (recommended, as you get rid of encoding problems):
<data><![CDATA[some stuff including D'Agostino & other reserved/problematic characters :-) ]]></data>
I've the below XML
<?xml version="1.0" encoding="UTF-8"?>
<body>
<p>Industrial drawing: Any creative composition</p>
<p>Industrial drawing: Any creative<fn>
<fnn>4</fnn>
<fnt>
<p>ftn1"</p>
</fnt>
</fn> composition
</p>
</body>
and the below XSL.
<xsl:template match="p">
<xsl:choose>
<xsl:when test="contains(substring-before(./text(),' '),'Article')">
<xsl:text>sect3</xsl:text>
<xsl:value-of select="./text()"/>
</xsl:when>
<xsl:when test="contains(substring-before(./b/text(),' '),'Section')">
<xsl:text> Sect 2</xsl:text>
<xsl:value-of select="./text()"/>
</xsl:when>
<xsl:when test="contains(substring-before(./b/text(),' '),'Chapter')">
<xsl:text> Sect 1</xsl:text>
<xsl:value-of select="./text()"/>
</xsl:when>
<xsl:otherwise>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
My XSL works fine for <p>Industrial drawing: Any creative composition</p>, but for the case below
<p>Industrial drawing: Any creative<fn>
<fnn>4</fnn>
<fnt>
<p>ftn1"</p>
</fnt>
</fn> composition
</p>
it throws the error below.
XSLT 2.0 Debugging Error: Error: file:///C:/Users/u0138039/Desktop/Proview/ASAK/DIFC/XSLT/tabel.xslt:38: Wrong occurrence to match required sequence type - Details: - XPTY0004: The supplied sequence ('2' item(s)) has the wrong occurrence to match the sequence type xs:string ('zero or one')
Please let me know how I can fix this and grab the required text.
Thanks
The second p element in your example XML has two child text nodes, one containing "Industrial drawing: Any creative" and the other containing a space, "composition", a newline and another six spaces. In XSLT 1.0 it is legal to apply a function that expects a string to an argument that is a set of more than one node; the behaviour is to take the string value of the first node and ignore all the others. In 2.0, however, it is a type error to pass two nodes to a function that expects a single value for its parameter.
But in this case I doubt that you really need to use text() at all - if all you care about is seeing whether the string "Article" occurs anywhere within the first word under the p (including when this is nested inside another element) then you can simply use .:
<xsl:when test="contains(substring-before(.,' '),'Article')">
(or better still, use predicates to separate the different conditions into their own templates, with one template matching "Article" paragraphs, another matching "Section" paragraphs, etc.)
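A rough sketch of that predicate-per-template layout, assuming XSLT 2.0. The output strings are taken from the question; using b[1] instead of b/text() is my own guard against the same "more than one item" error when a paragraph has several b children.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Paragraphs whose first word contains "Article" -->
  <xsl:template match="p[contains(substring-before(., ' '), 'Article')]">
    <xsl:text>sect3</xsl:text>
    <xsl:value-of select="text()"/>
  </xsl:template>

  <!-- Paragraphs whose first b child starts with a word containing "Section" -->
  <xsl:template match="p[contains(substring-before(b[1], ' '), 'Section')]">
    <xsl:text> Sect 2</xsl:text>
    <xsl:value-of select="text()"/>
  </xsl:template>

  <!-- Paragraphs whose first b child starts with a word containing "Chapter" -->
  <xsl:template match="p[contains(substring-before(b[1], ' '), 'Chapter')]">
    <xsl:text> Sect 1</xsl:text>
    <xsl:value-of select="text()"/>
  </xsl:template>

  <!-- Fallback for all other paragraphs: output nothing, like the empty xsl:otherwise -->
  <xsl:template match="p"/>
</xsl:stylesheet>
```

The three predicate templates have the same default priority, so if one paragraph could satisfy more than one condition you would want explicit priority attributes to reproduce the first-match behaviour of xsl:choose.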
The p element in your example has several text nodes, so the expression ./text() creates a sequence. You cannot apply a string function to a sequence; you must convert it to a string first. Instead of:
test="contains(substring-before(./text(),' '),'Article')"
try:
test="contains(substring-before(string-join(text(), ''), ' '), 'Article')"
Sorry for this trivial question, but at the moment I cannot figure it out.
I have a node with child elements, and I want to print the value of each child directly into an attribute. Please take a look at the code:
<fo:declarations>
<xsl:for-each select="//lb">
<xsl:for-each select="./dv-group/dv/download">
<xsl:value-of select="." />
<pdf:embedded-file filename="<xsl:value-of select="." />" src="url(test:///C:/Users/muster/Desktop/template_test/data/Mappe1.xlsx)"/>
</xsl:for-each>
</xsl:for-each>
</fo:declarations>
I have tried it with a variable, but that doesn't work either.
Any suggestions?
Thanks.
The concept you're looking for is called an attribute value template: in attribute values on a literal result element (and in certain attributes of some xsl: instructions too) you can enclose XPath expressions in braces and they will be evaluated and their result substituted in the output:
<pdf:embedded-file filename="{.}" src="url(test:///C:/Users/muster/Desktop/template_test/data/Mappe1.xlsx)"/>
If you want a literal brace character in an attribute that is interpreted as an AVT you must double it.
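To make both rules concrete, here is a tiny sketch. The file element and its note attribute are made up for illustration, and download stands in for the element from the question.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="download">
    <!-- {.} is evaluated as XPath: the string value of the current download element.
         {{ and }} are emitted as single literal braces. -->
    <file name="{.}" note="{{not evaluated}}"/>
  </xsl:template>
</xsl:stylesheet>
```

For an input of <download>Mappe1.xlsx</download> this should produce <file name="Mappe1.xlsx" note="{not evaluated}"/>.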
My xsl has a parameter
<xsl:param name="halfPath" select="'halfPath'"/>
I want to use it inside match
<xsl:template match="Element[@at1='value1' and not(@at2='{$halfPath}/another/half/of/the/path')]"/>
But this doesn't work. I guess I cannot use parameters inside ''. How can I fix or work around that?
The XSLT 1.0 W3C Specification forbids referencing variables/parameters inside a match pattern:
"It is an error for the value of the match attribute to contain a VariableReference"
There is no such limitation in XSLT 2.0, so use XSLT 2.0.
If, for insurmountable reasons, using XSLT 2.0 isn't possible, put the complete body of the <xsl:template> instruction inside an <xsl:if> where the test in conjunction with the match pattern is equivalent to the XSLT 2.0 match pattern that contains the variable/parameter reference(s).
In a more complicated case where you have more than one template matching the same kind of node but with different predicates that reference variables/parameters, then a wrapping <xsl:choose> will need to be used instead of a wrapping <xsl:if>.
Well, you could use a conditional instruction inside the template:
<xsl:template match="Element[@at1='value1']">
<xsl:if test="not(@at2=concat($halfPath,'/another/half/of/the/path'))">
.. do something
</xsl:if>
</xsl:template>
You just need to be aware that this template will handle all elements that satisfy the first condition. If you have a different template that handles elements that match the first, but not the second, then use an <xsl:choose>, and put the other template's body in the <xsl:otherwise> block.
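A sketch of that xsl:choose variant for XSLT 1.0, with attribute tests written using @. The xsl:copy-of and the comment in xsl:otherwise are only placeholders for the two real template bodies, which the question does not show.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:param name="halfPath" select="'halfPath'"/>

  <!-- The match pattern keeps only the part that needs no parameter. -->
  <xsl:template match="Element[@at1='value1']">
    <xsl:choose>
      <!-- The parameter-dependent condition moves into the test. -->
      <xsl:when test="not(@at2 = concat($halfPath, '/another/half/of/the/path'))">
        <!-- placeholder: body of the template that needed the parameter -->
        <xsl:copy-of select="."/>
      </xsl:when>
      <xsl:otherwise>
        <!-- placeholder: body of the other template for Element[@at1='value1'] -->
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```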
Or, XSLT2 can handle it as is if you can switch to an XSLT2 processor.
This topic had the answer to my question, but the proposed solution by Flynn1179 was not quite correct for me (YMMV). So try it the way it is suggested by people more expert than me, but if it doesn't work for you, consider how I solved it. I am using xsltproc that only handles XSL version 1.0.
I needed to match <leadTime hour="0024">, but use a param: <xsl:param name="hour">0024</xsl:param>. I found that:
<xsl:if test="@hour='{$hour}'"> did not work, despite statements here and elsewhere that this is the required syntax for XSL v.1.0.
Instead, the simpler <xsl:if test="@hour=$hour"> did the job.
One other point: Dimitre suggests above putting the template inside an if statement. xsltproc complained about this, so instead I put the if statement inside the template:
<xsl:template match="leadTime">
<xsl:if test="@hour=$hour">
<xsl:copy>
<xsl:apply-templates select="node() | @*"/>
</xsl:copy>
</xsl:if>
</xsl:template>
In XSLT 2.0 you can refer to global variables within a match pattern, but the syntax is simpler than your guess:
<xsl:template match="Element[@at1='value1' and
not(@at2=$halfPath/another/half/of/the/path)]"/>
rather than
<xsl:template match="Element[@at1='value1' and
not(@at2='{$halfPath}/another/half/of/the/path')]"/>
Also, the semantics are not what you appear to be expecting: a variable referenced on the lhs of "/" must contain a node-set, not a fragment of an XPath expression.
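If $halfPath really holds a string (as the select="'halfPath'" in the question suggests) and the intent is to compare the attribute against a longer string built from it, then a sketch along these lines may be closer to what was wanted in XSLT 2.0. It uses concat() to build the comparison string; the identity template is only there to make the example complete.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:param name="halfPath" select="'halfPath'"/>

  <!-- Identity template: copy everything that is not handled below. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Empty template, as in the question: suppress the matching elements.
       The global parameter is referenced directly inside the pattern. -->
  <xsl:template match="Element[@at1='value1' and
                       not(@at2 = concat($halfPath, '/another/half/of/the/path'))]"/>
</xsl:stylesheet>
```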