XSLT global variable to reuse text in an attribute specifically - xslt

What is the most basic syntax in XSLT to declare a global variable holding a string of text, and then reference that value in attributes you output later in the stylesheet? It sounds simple, but it requires a specific syntax.

After two similar questions were missing the tiny detail that makes this work, I thought it might be useful to share this. Answer:
Variable declaration (near beginning of the XSL):
<xsl:variable name="defaultIconStyle" select="'Icon - Style'"/>
Note the single quotes inside the double quotes for the text string.
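This method was also proposed, but it may be more resource-expensive, because the variable then holds a temporary tree (a result tree fragment in XSLT 1.0) rather than a plain string: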
<xsl:variable name="defaultIconStyle">Icon - Style</xsl:variable>
Calling this into an attribute value later:
(in this case, to set a character style for a tag targeted for InDesign)
<xsl:template match="note-mytype">
<xsl:copy><ph aid:cstyle="{$defaultIconStyle}"><image href="file:///myIcon.ai"/><xsl:text> </xsl:text></ph><xsl:apply-templates/></xsl:copy>
</xsl:template>
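Putting the pieces together, a minimal sketch of a complete stylesheet might look like the following. The aid namespace URI and the identity template are assumptions added for illustration; they are not part of the snippets above, so adjust them to match your own files.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:aid="http://ns.adobe.com/AdobeInDesign/4.0/">
  <!-- The aid namespace URI above is an assumption; use the one your InDesign XML declares. -->

  <!-- Global variable: the inner single quotes make the value a string literal.
       Without them, select="Icon - Style" would be parsed as an XPath expression
       (child element Icon minus child element Style), not as text. -->
  <xsl:variable name="defaultIconStyle" select="'Icon - Style'"/>

  <!-- Identity template (assumed here) so everything else is copied unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- The curly braces form an attribute value template: the variable is
       evaluated and its string value becomes the attribute value. -->
  <xsl:template match="note-mytype">
    <xsl:copy>
      <ph aid:cstyle="{$defaultIconStyle}">
        <image href="file:///myIcon.ai"/>
        <xsl:text> </xsl:text>
      </ph>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Because the variable is a child of xsl:stylesheet, it is visible in every template, so the same style name can be reused in as many attributes as needed.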

Related

Duplicate line and replace string

I have an XML file that contains more than 10,000 items. Each item contains a line like this.
<g:id><![CDATA[FBM00101816_BLACK-L]]></g:id>
For each item I need to add another line below like this:
<sku><![CDATA[FBM00101816]]></sku>
So I need to duplicate each g:id line, replace the g:id with sku and trim the value to delete all characters after the underscore (including it). The final result would be like this:
<g:id><![CDATA[FBM00101816_BLACK-L]]></g:id>
<sku><![CDATA[FBM00101816]]></sku>
Any ideas how to accomplish this?
Thanks in advance.
In XSLT, it's
<xsl:template match="g:id">
<xsl:copy-of select="."/>
<sku><xsl:value-of select="substring-before(., '_')"/></sku>
</xsl:template>
Or using Saxon's Gizmo (https://www.saxonica.com/documentation11/index.html#!gizmo) it's
follow //g:id with <sku>{substring-before(., '_')}</sku>
Don't try to do this sort of thing in a text editor (or any other tool that doesn't involve a real XML parser) unless it's a one-off. Your code will be too sensitive to trivial variations in the way the source XML is written and will almost inevitably have bugs - which might not matter for a one-off, but do matter if it's going to be used repeatedly over a period of time.
Note also, the CDATA tags in your input (and output) are a waste of space. CDATA tags have no significance unless the element content includes special characters like < and &, which isn't the case in your examples.
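For completeness, here is a rough sketch of how the XSLT template above could sit in a full stylesheet. The namespace URI bound to the g prefix is an assumption (use whatever your feed actually declares), and the identity template and the cdata-section-elements setting are additions for illustration only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:g="http://base.google.com/ns/1.0">
  <!-- The g namespace URI is assumed; it must match the one in the source file. -->

  <!-- Optional: write the text of g:id and sku as CDATA sections again.
       Drop this attribute if the CDATA wrappers are not needed. -->
  <xsl:output method="xml" cdata-section-elements="g:id sku"/>

  <!-- Identity template: copy everything that has no more specific rule. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Copy each g:id, then emit a sku holding the text before the first underscore. -->
  <xsl:template match="g:id">
    <xsl:copy-of select="."/>
    <sku>
      <xsl:value-of select="substring-before(., '_')"/>
    </sku>
  </xsl:template>
</xsl:stylesheet>
```

One thing to watch: substring-before() returns an empty string when the value contains no underscore, so ids without one would produce an empty sku.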
Okay, so after commenting, I couldn't help myself. This seemed to do what you asked for.
find: <g:id><!\[CDATA\[([^\_]+)?(.+)?\]\]></g:id>
replace: $0\n<sku><![CDATA[$1]]></sku>
I don't have BBEdit, so I tried this in TextMate.

XSL "not" not working as expected attempting to compare two variables

Using XSL v3.0 I'm trying to compare two variables. One was created from .txt format directory listings imported as unparsed text. The other was created by querying xml files. Both contain references to jpgs.
I want to create a third variable using select with not() to find out which jpg references are present in one variable but not the other. I know the syntax for this, $var1[not(.=$var2)], and I was able to use it successfully in another place in this same XSL file.
I can output the values from each of the two variables and they look just like a .txt file would, and the values are what I would expect to see.
But for the life of me I cannot get the "not" to work. As far as I can tell it just returns the entire value of one of the variables.
Is there a way to just brute force these two variables into the same format so I can do this? I just want each variable to be a flat file that I can compare to the other and output another boring old flat file. I've tried all the combinations of tokenize and string-join etc. that I've stumbled across and nothing seems to work.
If I was using a bash script I would just pipe the dirs to two .txt files and use diff to do this, but achieving the same thing in XSL is killing me.
Clearly I am a novice at XSL. Any assistance appreciated.
per Michael Kay's suggestion
complete xsl available at this dropbox link
https://www.dropbox.com/s/fsltr34f5l3ci5a/jpg_report_stack.xsl?dl=0
variable with all jpg names - $jpg_all_distinct_joined
<xsl:variable name="jpg_all_distinct_joined" as="xs:string" select="string-join((distinct-values(($token_full, $token_800, $token_thumb))),'&#10;')"/>
variable with all jpg references from xml - $jpg_all_links
<xsl:variable name="jpg_all_links" select="($jpg_link_pb, $jpg_link_bibl, $jpg_link_ref)"/>
not statement
<xsl:variable name="jpgs_in_xml_not_directories" select="($jpg_all_links)[not(.=$jpg_all_distinct_joined)]"/>
outputs the value of $jpg_all_links - this is not what I want - I want the output to be all jpg references from $jpg_all_links that are not in $jpg_all_distinct_joined
The variable $jpg_all_links is a sequence concatenation of the three variables ($jpg_link_pb, $jpg_link_bibl, $jpg_link_ref). All three of these variables are constructed using string-join() with newline as the separator, so it seems likely that $jpg_all_links is a sequence of three strings, each comprising multiple strings separated by newlines. The variable $jpg_all_distinct_joined is also formed using string-join() with a newline separator. So my suspicion is that (writing # to represent a newline character), you are doing something like
("A#B#C", "D#E#F")[not(. = "C#D")]
and nothing is being eliminated because none of the strings is equal to "C#D". You want to compare the sets of individual strings, not the composite strings formed using string-join().
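As a rough sketch of what comparing the individual strings could look like in XSLT 3.0: the variable names and sample values below are hypothetical stand-ins, not taken from the linked stylesheet. The idea is to split each newline-joined string back into a sequence with tokenize() before applying the not() predicate.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- Hypothetical stand-ins for the newline-joined variables in the question. -->
  <xsl:variable name="links_joined" as="xs:string*"
      select="('a.jpg&#10;b.jpg', 'c.jpg&#10;d.jpg')"/>
  <xsl:variable name="dirs_joined" as="xs:string"
      select="'b.jpg&#10;c.jpg'"/>

  <!-- Split every composite string into individual names, dropping blank entries. -->
  <xsl:variable name="links" as="xs:string*"
      select="$links_joined ! tokenize(., '\n')[normalize-space()]"/>
  <xsl:variable name="dirs" as="xs:string*"
      select="tokenize($dirs_joined, '\n')[normalize-space()]"/>

  <!-- Now each item is a single file name, so the predicate compares name with name. -->
  <xsl:variable name="only_in_links" as="xs:string*"
      select="$links[not(. = $dirs)]"/>

  <xsl:template match="/">
    <xsl:value-of select="$only_in_links" separator="&#10;"/>
  </xsl:template>
</xsl:stylesheet>
```

With these sample values, only the names that appear in the links but not in the directory listing should survive the filter. It would be simpler still to keep the variables as sequences from the start and apply string-join() only when writing the report.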

Dynamic Namespace for Input XML

I have searched many posts on dynamically setting namespaces but they all seem to refer to setting the namespace of the output XML.
The issue I have is that the namespace (defined at root and same for all child nodes) of the input XML differs, and the same stylesheet needs to be able to handle both inputs.
For example, one input XML is:
<root xmlns="aaa">
<body>xxx</body>
</root>
And the other input XML is:
<root xmlns="bbb">
<body>yyy</body>
</root>
In the stylesheet, my XPath obviously needs to use the defined namespace, which is declared with a prefix, i.e.:
xmlns:one="aaa"
But as soon as the second input XML is being transformed, it of course fails to work.
I could define another namespace, e.g.
xmlns:two="bbb"
But the only way to use that namespace at the right time is to duplicate all the XSLT code and have the other namespace as the prefix for all the XPath (even then I would still need to identify which set of XPath to use which may be fun..)
My stylesheet currently uses the following XPath:
&lt;SOMETHING>
<xsl:value-of select="one:body" />
&lt;/SOMETHING>
As you can see it uses the "one" namespace prefix. Is there any way to just get the value of either "body" tag, regardless of namespace? As mentioned in a comment below, although I appreciate they are different elements based on namespace, I know that the information in each will be the same so I can treat them as such.
I have seen posts on using xsl:element with a namespace attribute but from what I can tell that just defines the namespace of the output XML, not the input. (To make matters worse, what I am outputting is actually escaped XML, e.g. &lt;SOMETHING> so I couldn't use xsl:element anyway).
My current solution (since posting this) is to have two extra stylesheets included in the main stylesheet. Each one is specific to either namespace "one" or namespace "two", each line of XPath uses the relevant namespace prefix.
I am hoping there is a way to avoid having two separate stylesheets that are almost identical except for the namespace prefix.
Thanks in advance.
If I get you right, you want to process the XML ignoring the elements' namespaces. Actually, the purpose of namespaces is to distinguish between elements from different contexts. So from an XML point of view, <one:body> has absolutely nothing to do with <two:body>, besides the fact that they happen to have the same local name.
If you want to do it anyway, instead of:
<xsl:template match="one:body">
<xsl:template match="two:body">
you should match on the elements' local name only:
<xsl:template match="*[local-name()='body']">
In order to give a little more background: If you say
<xsl:template match="one:body">
then this is only a short notation of
<xsl:template match="*[namespace-uri()='aaa'][local-name()='body']">
(i.e. "match any element whose namespace is 'aaa' and whose name is 'body'")
Thus, ignoring the namespace by leaving away the
[namespace-uri()='aaa']
makes it
<xsl:template match="*[local-name()='body']">
Instead, you had better say
<xsl:template match="*[namespace-uri()='aaa' or namespace-uri()='bbb'][local-name()='body']">
or
<xsl:template match="*[namespace-uri()='aaa' or namespace-uri()='bbb' or namespace-uri()='ccc'][local-name()='body']">
and so on, provided that, as dret states, you know all possible namespaces in advance.
I would suggest you define both namespaces, then use paths such as:
one:body | two:body
to address the elements in the source XML.
For example, instead of:
<xsl:value-of select="one:body" />
use:
<xsl:value-of select="one:body | two:body" />
As I already wrote! Instead of
<xsl:value-of select="one:body" />
you can write
<xsl:value-of select="*[local-name()='body']" />
Or, if you have XPath 2.0, then
<xsl:value-of select="*:body" />
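To tie this back to the question, a minimal XSLT 1.0 sketch that should handle both sample inputs might look like this. The wrapper element is a hypothetical name added only so the output has a root element; SOMETHING is taken from the question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- Match the document element whatever its namespace is. -->
  <xsl:template match="/*">
    <!-- "wrapper" is a hypothetical container element, not from the question. -->
    <wrapper>
      <!-- These are text nodes, so the xml output method escapes the angle brackets again. -->
      <xsl:text>&lt;SOMETHING&gt;</xsl:text>
      <!-- Select the child named body by local name only, ignoring its namespace. -->
      <xsl:value-of select="*[local-name()='body']"/>
      <xsl:text>&lt;/SOMETHING&gt;</xsl:text>
    </wrapper>
  </xsl:template>
</xsl:stylesheet>
```

Since no namespace prefix appears in the XPath, the stylesheet does not need to declare either "aaa" or "bbb". The trade-off is that it will also accept a body element from any other namespace.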

xslt test with 2 parameters

<xsl:if test="count($currentPage/..//$itemType) > 0">
I'm trying to use the if statement with 2 param values and I get the error:
"unexpected token '$' in the expression..."
Is it possible to do what I'm trying to do?
In XSLT, like in most programming languages (excluding macro languages), variables represent values, not fragments of expression text. I suspect $itemType holds an element name, and you are imagining that you can use it anywhere you could use an element name. If that's what you are trying to do, use ..//*[name()=$itemType].
This is invalid (and @Michael Kay explained it well):
//$varName
If I guess correctly what you are up to, then you may try this:
//*[name() = $varName]
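Applied to the test in the question, a small sketch could look like the following. The parameter default, the way $currentPage is set, and the element names site, page and article are all made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical parameter: it holds an element NAME as a string. -->
  <xsl:param name="itemType" select="'article'"/>

  <xsl:template match="/">
    <!-- Hypothetical way of choosing the current page; adapt to your own input. -->
    <xsl:variable name="currentPage" select="/site/page[1]"/>

    <!-- Compare each element's name with the string in $itemType
         instead of writing //$itemType, which is not valid XPath 1.0. -->
    <xsl:if test="count($currentPage/..//*[name() = $itemType]) &gt; 0">
      <xsl:text>found</xsl:text>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

If the elements are in a namespace, comparing against local-name() instead of name() avoids depending on the prefix used in the source document.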

XSLT: keeping whitespaces when copying attributes

I'm trying to sort Microsoft Visual Studio's vcproj so that a diff would show something meaningful after e.g. deleting a file from a project. Besides the sorting, I want to keep everything intact, including whitespaces. The input looks like
space<File
spacespaceRelativePath="filename"
spacespace>
...
The xslt fragment below can add the spaces around elements, but I can't find out how to deal with those around attributes, so my output looks like
space<File RelativePath="filename">
xslt I use for the msxsl 4.0 processor:
<xsl:for-each select="File">
<xsl:sort select="@RelativePath"/>
<xsl:value-of select="preceding-sibling::text()[1]"/>
<xsl:copy>
<xsl:for-each select="text()|@*">
<xsl:copy/>
</xsl:for-each>
</xsl:copy>
</xsl:for-each>
Those spaces are always insignificant in XML, and I believe that there is no option to control this behavior in a general way for any XML/XSLT library.
XSLT works on a tree representation of the input XML. Much of the irrelevant detail of the original XML has been abstracted away in this tree - for example the order of attributes, insignificant whitespace between attributes, or the distinction between " and ' as an attribute delimiter. I can't see any conceivable reason for wanting to write a program that treats these distinctions as significant.
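If the real goal is a meaningful diff, one possible alternative is to give up on preserving the original layout and normalize it instead: run every version of the file through the same sorting stylesheet and diff the results. The sketch below assumes that File elements can safely be moved after their non-File siblings; that is an assumption about the vcproj format, not something stated above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Discard the original whitespace-only text nodes and re-indent consistently. -->
  <xsl:strip-space elements="*"/>
  <xsl:output method="xml" indent="yes"/>

  <!-- Identity template: copy everything as-is by default. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Any element with File children: attributes first, then the other
       children in their original order, then the File children sorted. -->
  <xsl:template match="*[File]">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates select="node()[not(self::File)]"/>
      <xsl:apply-templates select="File">
        <xsl:sort select="@RelativePath"/>
      </xsl:apply-templates>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

The line breaks between attributes are still lost, but because both sides of the diff go through the same serializer, the remaining differences should reflect real changes to the project.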