String length of an XML structure - XSLT

I have a large XSD I process using several templates to get a new XSD.
In one of the last steps I would like to determine the length of the xml (actually an XSD) that was captured in a variable xsdresult.
Using the string-length function, I get a length that does not match the actual size of xsdresult: the string/XSD is over 52000 characters, but I see Length: 9862. What am I doing wrong?
<!-- Catch output in variable -->
<xsl:variable name="xsdresult">
<xsl:call-template name="start"/>
</xsl:variable>
<xsl:template name="start">
<xsl:apply-templates/>
</xsl:template>
<!-- Build required doc parts -->
<xsl:variable name="docparts">
<xsl:call-template name="builddocparts"/>
</xsl:variable>
<xsl:template name="builddocparts">
Length: <xsl:value-of select="string-length(normalize-unicode($xsdresult))"/>
</xsl:template>
...

A call to string-length() is equivalent to a call to string-length(.), which in turn coerces the current node to a string, so it's equivalent to string-length(string(.)). The value of the string() function is the string value of the node, which for an element node is the string formed by the concatenation of all descendant text nodes.
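To see the difference concretely, here is a small sketch in Python (chosen only because it runs standalone; it is not XSLT). The SAMPLE document is a made-up miniature schema, not the asker's XSD, and string_value is a hypothetical helper that mimics string() on an element. It compares the length of the string value, which is what string-length() measures, with the length of the markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature schema standing in for the much larger XSD in the question.
SAMPLE = (
    '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">'
    '<xs:element name="note" type="xs:string">'
    '<xs:annotation><xs:documentation>A short note</xs:documentation></xs:annotation>'
    '</xs:element>'
    '</xs:schema>'
)


def string_value(xml_text):
    """Concatenate all descendant text nodes, like string() applied to the root element."""
    root = ET.fromstring(xml_text)
    return "".join(root.itertext())


value = string_value(SAMPLE)
string_value_length = len(value)   # what string-length() would report
markup_length = len(SAMPLE)        # size of the markup as written above

print(repr(value), string_value_length, markup_length)
```

Tags and attributes contribute nothing to the string value, which is why a schema (mostly markup, very little text) reports a length far smaller than its size on disk.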
If you want to know the minimum amount of space the serialized XML document will take on disk, given a simple serialization, then you must add:
For each non-empty element, the length of its start-tag: the length of the element type name, plus 2 for the start-tag delimiters < ... >, plus the sum of the lengths of the attribute-value specifications.
For each attribute-value specification, you will need one character for leading whitespace, plus the length of the attribute name, plus the string length of the attribute's value, plus three for the equal sign and quotation marks, plus five characters for each time a quotation mark is replaced by &apos; or &quot;.
For each non-empty element, the length of its end-tag (length of its element type name plus 3).
For each empty element, the length of its sole tag (length of its element type name, plus length of its attribute-value specifications, plus 3).
For each occurrence of < in the data or in attribute values, three characters for the escaping as &lt;.
For each occurrence of ampersand in the data or in attribute values, four characters for escaping as &amp;.
Not part of the minimum amount, but possibly part of the space you'll need on disk:
The total width of any whitespace added, if you indent the XML structurally.
The number of CDATA marked sections you serialize, times 12 (for <![CDATA[ + ]]>).
The number of characters saved by using CDATA marked sections instead of &lt; and &amp;.
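The counting rules for the minimum size can be sketched as follows. This is a Python illustration under stated assumptions, not a serializer: it ignores namespaces, comments, processing instructions and the XML declaration, it always delimits attribute values with double quotes, and it counts characters rather than bytes. The function names and the SAMPLE document are made up for the example.

```python
import xml.etree.ElementTree as ET


def escaped_text_length(text):
    """Length of character data once < is written as &lt; (+3) and & as &amp; (+4)."""
    if not text:
        return 0
    return len(text) + 3 * text.count("<") + 4 * text.count("&")


def attribute_length(name, value):
    """Leading space + name + equal sign and two quotation marks (3) + escaped value."""
    length = 1 + len(name) + 3 + escaped_text_length(value)
    # Assumption: double quotes delimit the value, so each " becomes &quot; (+5).
    return length + 5 * value.count('"')


def minimum_serialized_length(element):
    """Apply the per-element rules recursively to an ElementTree element."""
    attributes = sum(attribute_length(n, v) for n, v in element.attrib.items())
    if len(element) == 0 and not element.text:
        # Empty element: <name .../> costs the name, the attributes, and 3.
        return len(element.tag) + attributes + 3
    total = len(element.tag) + 2 + attributes       # start-tag
    total += escaped_text_length(element.text)      # text before the first child
    for child in element:
        total += minimum_serialized_length(child)
        total += escaped_text_length(child.tail)    # text after this child
    total += len(element.tag) + 3                   # end-tag
    return total


# Hypothetical sample that is already written in the minimal form.
SAMPLE = '<order id="7"><item sku="A&amp;B">5 &lt; 6</item><note/></order>'
estimate = minimum_serialized_length(ET.fromstring(SAMPLE))
print(estimate, len(SAMPLE))
```

Because SAMPLE contains no indentation or CDATA sections, the estimate equals the length of the markup itself; indentation would add to it.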

Related

how to count the value of multiple elements in xslt

I have this xslt code to count the abstract length of type main:
<xsl:variable name="mainAbstract">
<xsl:value-of select="string-length(normalize-space(abstract[@type = 'main']))"/>
</xsl:variable>
but I ran into a problem: when multiple XML elements match the same pattern, it does not work.
How should I count the number of characters in multiple elements from this select statement?
If you have multiple abstract elements of type main, then you need to select which one you want to process.
If, as it seems, you want the sum of their individual string lengths, then (in XSLT 2.0 or later) do:
<xsl:value-of select="sum(abstract[@type = 'main']/string-length(normalize-space()))"/>

fetch value from the input message (this message doesn't have any space)

I have an input string like this, without any spaces:
51=2MA011362X17=MG127AJ4015AG1A20=022=M35=U48=9CVRVC449
Here, the number before = is the key and the text after it is the value. From this string I have to fetch the value of 17= (that is, MG127AJ4015AG1A).
I used <xsl:value-of select="substring-before(substring-after(.,'17='), '=')"/>, which gives me MG127AJ4015AG1A20. Now I am stuck on removing the last two digits (20), and I am confused about how this can be achieved.
Final output string should be - MG127AJ4015AG1A
If the number at the end will always be two digits, you can put your current expression in a variable and use substring to remove the last two characters, like so:
<xsl:variable name="match" select="substring-before(substring-after(.,'17='), '=')" />
<xsl:value-of select="substring($match, 1, string-length($match) - 2)"/>
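The same three steps can be traced outside XSLT. Below is a Python sketch in which substring_after and substring_before are hypothetical helpers written to behave like the XPath functions of the same name (both return an empty string when the separator is missing). It relies on the same assumption as the answer: the key that follows the value is always two digits long.

```python
def substring_after(text, separator):
    """Like XPath substring-after(): text following the first separator, else ''."""
    _, found, tail = text.partition(separator)
    return tail if found else ""


def substring_before(text, separator):
    """Like XPath substring-before(): text preceding the first separator, else ''."""
    head, found, _ = text.partition(separator)
    return head if found else ""


message = "51=2MA011362X17=MG127AJ4015AG1A20=022=M35=U48=9CVRVC449"

# Step 1: everything after the key "17=".
after_key = substring_after(message, "17=")
# Step 2: cut at the next "=", which still leaves the next key (20) attached.
match = substring_before(after_key, "=")
# Step 3: drop the trailing two-digit key, as substring($match, 1, string-length($match) - 2) does.
value = match[:len(match) - 2]

print(match, value)
```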

Meaning of this XPath expression

I am new to XML and XSLT programming.
Can anybody explain the meaning of below XPath expression?
<xsl:apply-templates select="//Order[Header/string-length(ORDERID) > 0]/Header/SAP_WBSELEMENT[not(. = following::SAP_WBSELEMENT)]" />
Meaning: Select SAP_WBSELEMENT elements, with each distinct string value included only once, that are children of Header elements that are children of any Order element in the document that has a Header child with an ORDERID with a non-empty string value.
Breakdown: Working from the end of the XPath back to the front...
Select SAP_WBSELEMENT elements, excluding any whose string value also appears in a later SAP_WBSELEMENT in the document,
SAP_WBSELEMENT[not(. = following::SAP_WBSELEMENT)]
that are children of Header elements,
Header/
that are children of those Order elements with a Header child with an ORDERID with a non-empty string value,
Order[Header/string-length(ORDERID) > 0]/
anywhere in the document,
//
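To check the reading above against data, here is a Python sketch that models the expression step by step with the standard library. It is an approximation, not an XPath engine: the SAMPLE document and the helper names are made up, and it assumes SAP_WBSELEMENT elements are never nested inside one another, so that "following" simply means "later in document order".

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the value A occurs twice, and the second Order has an empty ORDERID.
SAMPLE = """<Orders>
  <Order><Header><ORDERID>1</ORDERID><SAP_WBSELEMENT>A</SAP_WBSELEMENT></Header></Order>
  <Order><Header><ORDERID/><SAP_WBSELEMENT>B</SAP_WBSELEMENT></Header></Order>
  <Order><Header><ORDERID>3</ORDERID><SAP_WBSELEMENT>A</SAP_WBSELEMENT></Header></Order>
  <Order><Header><ORDERID>4</ORDERID><SAP_WBSELEMENT>C</SAP_WBSELEMENT></Header></Order>
</Orders>"""


def string_value(element):
    """String value of an element: all descendant text concatenated."""
    return "".join(element.itertext())


def has_order_id(header):
    """Models string-length(ORDERID) > 0 for a single Header."""
    order_id = header.find("ORDERID")
    return order_id is not None and len(string_value(order_id)) > 0


def select_wbs_elements(root):
    all_wbs = list(root.iter("SAP_WBSELEMENT"))             # document order
    position = {id(e): i for i, e in enumerate(all_wbs)}
    selected = []
    for order in root.iter("Order"):                         # //Order
        headers = order.findall("Header")
        if not any(has_order_id(h) for h in headers):        # [Header/string-length(ORDERID) > 0]
            continue
        for header in headers:                               # /Header
            for wbs in header.findall("SAP_WBSELEMENT"):     # /SAP_WBSELEMENT
                later = all_wbs[position[id(wbs)] + 1:]      # following::SAP_WBSELEMENT
                if not any(string_value(wbs) == string_value(o) for o in later):
                    selected.append(wbs)                     # [not(. = following::...)]
    return selected


root = ET.fromstring(SAMPLE)
selected = select_wbs_elements(root)
values = [string_value(e) for e in selected]
print(values)
```

In this sample the A that survives is the one in the third Order: the first A is dropped because an equal value follows it, and B is dropped because its Order has no non-empty ORDERID.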

How to remove character from string using xslt?

<Scheduled>
<xsl:value-of select="//RequestParameters/Identifier/DepartureDate">
</xsl:value-of>
</Scheduled>
In this XSLT code I am getting 'z' as the last character of "//RequestParameters/Identifier/DepartureDate". I want to remove the z; please help with this.
If the value of //RequestParameters/Identifier/DepartureDate contains 'z' only at the end, you can use the substring-before function (the match is case-sensitive, so use 'Z' if the value ends in an uppercase Z).
<xsl:value-of select="substring-before(//RequestParameters/Identifier/DepartureDate, 'z')"/>
edit:
If you want to get the first 10 characters of the value, you can use the substring function.
<xsl:value-of select="substring(//RequestParameters/Identifier/DepartureDate, 1, 10)"/>
In general, you may want to convert an element value in ISO 8601 date format to another format by adding a JavaScript function to your XSLT (where the processor supports script extensions) and calling that function in your XPath expression.
For instance, once you have added a (JavaScript) function convertToDate that extracts the date part of the input value in the format yyyymmdd, the XPath expression
convertToDate (//RequestParameters/Identifier/DepartureDate)
will result in a value
20111016
assuming there is only one DepartureDate element in the input, having value
2011-10-16T09:40:00.000Z

<address1/> What value is in address

I have an XSLT statement as follows:
<xsl:when test="address1 != ' '">
In my incoming XML, the address node is as follows:
<address1/>
The node exists and the XSL statement seems to work sometimes, but not always; it gives me inconsistent results. I check the address1 node, and if it is spaces, I check the address2 node; if address2 is not spaces, I move it up to the address1 output field. Our customers are very inconsistent when entering addresses, and our vendor requires address1 to be valid. Thanks for any help.
The problem with checking against a string is that you actually check for text within all descendants of the element, so <foo><bar>test</bar></foo> would fail the test foo = '', because the text test exists within the tree.
A more conclusive test is:
address1[not(text()) and not(*)]
This passes only where there is neither text nor child elements within the address element.
The node is empty.
You are testing for it not being a single space ' '; if it is an empty node, the test will succeed.
To test that the node is empty you can do this:
<xsl:when test="address1 = ''">
A self-closing tag has no content, so checking for '' would be the proper way to go. Testing against '[space]' implies that the tag is actually <address1>[space]</address1>, which is no longer self-closing.
You're not telling us enough about your code for us to reliably tell you what's wrong with it. Don't be so reticent! There could be all sorts of problems that aren't evident from a tiny snippet, e.g. using the wrong context item.
One piece of advice, though: avoid the "!=" operator (which appeared in your example). Usually you want not(author='') rather than author!=''. They mean the same thing if there is exactly one author element, but they have different meanings if there is no author element or if there is more than one. The expression author!='' is true if there is at least one author element whose value is not the empty string; the expression not(author='') is true if there is no author element whose value is the empty string.
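The difference between the two forms comes from the "at least one" rule that XPath applies when a node-set is compared with a string. A Python sketch of that rule, with the list standing in for the string values of the selected author elements (the function names are hypothetical):

```python
def general_not_equal(values, literal):
    """Models author != '': true if at least one value differs from the literal."""
    return any(value != literal for value in values)


def not_general_equal(values, literal):
    """Models not(author = ''): true if no value equals the literal."""
    return not any(value == literal for value in values)


cases = {
    "no author element": [],
    "one non-empty author": ["Smith"],
    "one empty author": [""],
    "one empty, one non-empty": ["", "Smith"],
}

for label, values in cases.items():
    print(label, general_not_equal(values, ""), not_general_equal(values, ""))
```

The two agree when there is exactly one element and disagree in the first and last cases: with no element at all, != is false while not(=) is true; with one empty and one non-empty element, != is true while not(=) is false.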
To check if an element's string value is non-empty and non-whitespace-only, use:
string-length(normalize-space(address1)) > 0
The standard XPath function normalize-space($s) takes a string $s as an argument and returns another string produced from $s in which all leading and trailing whitespace characters are removed and any group of adjacent interim whitespace characters is replaced by a single space-character.
This means that the result of normalize-space() when applied on a string that contains only white-space characters, is the empty string (having string-length() of 0).
The XPath expression above is testing if the result of applying the normalize-space() function to the string-value of address1 has positive (> 0) length -- this means that the string value of address1 contains at least one non-whitespace character.
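For readers who want to try the rule on sample strings, here is a Python sketch of the same test. normalize_space below is a hypothetical re-implementation, not a library call; it treats only space, tab, carriage return and line feed as whitespace, which are the whitespace characters XML defines.

```python
import re


def normalize_space(value):
    """Collapse each run of XML whitespace to one space, then trim both ends."""
    collapsed = re.sub(r"[ \t\r\n]+", " ", value)
    return collapsed.strip(" ")


def has_visible_text(value):
    """Models string-length(normalize-space(address1)) > 0."""
    return len(normalize_space(value)) > 0


for sample in ["", " ", " \t\n", "  12  Main\n St "]:
    print(repr(sample), repr(normalize_space(sample)), has_visible_text(sample))
```

An empty element, a single space and a whitespace-only value all fail the test, so the stylesheet can fall back to address2 in each of those cases.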