Simple XSL transform to extract single value from Source XML - xslt

I'm trying to retrieve a certain value from an XML document and output that value into a new XML document. The source XML is full of data I don't need; I only want one specific part.
Source XML :-
<dpp:Programme xmlns:dpp="http://www.digitalproductionpartnership.co.uk/ns/as11/2012" xmlns:itv="http://dpp.itv.com/timecodes/v1">
<dpp:Editorial>
<dpp:SeriesTitle>test</dpp:SeriesTitle>
<dpp:ProgrammeTitle>test</dpp:ProgrammeTitle>
<dpp:EpisodeTitleNumber>test</dpp:EpisodeTitleNumber>
<dpp:ProductionNumber>2/1993/0022#001</dpp:ProductionNumber>
<dpp:Synopsis>None</dpp:Synopsis>
<dpp:Originator>None</dpp:Originator>
<dpp:CopyrightYear>2013</dpp:CopyrightYear>
</dpp:Editorial>
<dpp:Technical>
<dpp:ShimName>UK DPP HD</dpp:ShimName>
<dpp:Video>
<dpp:VideoBitRate unit="Mbps">100</dpp:VideoBitRate>
<dpp:VideoCodec>AVCI</dpp:VideoCodec>
<dpp:VideoCodecParameters>High 4:2:2 level 4.1</dpp:VideoCodecParameters>
<dpp:PictureFormat>1080i50 16:9</dpp:PictureFormat>
<dpp:AFD>10</dpp:AFD>
<dpp:PictureRatio>16:9</dpp:PictureRatio>
<dpp:ThreeD>false</dpp:ThreeD>
<dpp:ProductPlacement>false</dpp:ProductPlacement>
<dpp:FPAPass>Not tested</dpp:FPAPass>
</dpp:Video>
<dpp:Audio>
<dpp:AudioSamplingFrequency unit="kHz">48</dpp:AudioSamplingFrequency>
<dpp:AudioBitDepth>24</dpp:AudioBitDepth>
<dpp:AudioCodecParameters>PCM</dpp:AudioCodecParameters>
<dpp:AudioTrackLayout>EBU R 123: 4b</dpp:AudioTrackLayout>
<dpp:PrimaryAudioLanguage>eng</dpp:PrimaryAudioLanguage>
<dpp:SecondaryAudioLanguage>zxx</dpp:SecondaryAudioLanguage>
<dpp:TertiaryAudioLanguage>eng</dpp:TertiaryAudioLanguage>
<dpp:AudioLoudnessStandard>EBU R 128</dpp:AudioLoudnessStandard>
</dpp:Audio>
<dpp:Timecodes>
<dpp:LineUpStart>09:58:00:00</dpp:LineUpStart>
<dpp:IdentClockStart>09:59:20:00</dpp:IdentClockStart>
<dpp:Parts>
<dpp:Part>
<dpp:PartNumber>1</dpp:PartNumber>
<dpp:PartTotal>1</dpp:PartTotal>
<dpp:PartSOM>10:30:41:11</dpp:PartSOM>
<dpp:PartDuration>00:00:30:13</dpp:PartDuration>
</dpp:Part>
</dpp:Parts>
<dpp:TotalNumberOfParts>1</dpp:TotalNumberOfParts>
<dpp:TotalProgrammeDuration>00:00:30:13</dpp:TotalProgrammeDuration>
</dpp:Timecodes>
<dpp:AccessServices>
<dpp:AudioDescriptionPresent>false</dpp:AudioDescriptionPresent>
<dpp:ClosedCaptionsPresent>false</dpp:ClosedCaptionsPresent>
<dpp:OpenCaptionsPresent>false</dpp:OpenCaptionsPresent>
<dpp:SigningPresent>No</dpp:SigningPresent>
</dpp:AccessServices>
<dpp:Additional>
<dpp:CompletionDate>2014-01-07</dpp:CompletionDate>
<dpp:TextlessElementExist>false</dpp:TextlessElementExist>
<dpp:ProgrammeHasText>true</dpp:ProgrammeHasText>
<dpp:ProgrammeTextLanguage>eng</dpp:ProgrammeTextLanguage>
<dpp:AssociatedMediaFilename>2-1993-0022-001.mxf</dpp:AssociatedMediaFilename>
<dpp:MediaChecksumType>MD5</dpp:MediaChecksumType>
<dpp:MediaChecksumValue>6154fd9cf312492e2dea68bee656ded7</dpp:MediaChecksumValue>
</dpp:Additional>
<dpp:ContactInformation>
<dpp:ContactEmail>None</dpp:ContactEmail>
<dpp:ContactTelephoneNumber>None</dpp:ContactTelephoneNumber>
</dpp:ContactInformation>
</dpp:Technical>
<itv:AdditionalTimeCodes>
<itv:Element>
<itv:ElementType>Essence</itv:ElementType>
<itv:ElementSOM>10:30:41:11</itv:ElementSOM>
<itv:Duration>00:00:30:13</itv:Duration>
<itv:Fade>false</itv:Fade>
<itv:Mix>false</itv:Mix>
<itv:Property>Essence</itv:Property>
</itv:Element>
</itv:AdditionalTimeCodes>
</dpp:Programme>
This is the XSL I have created :-
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" indent="yes"/>
<xsl:template match="/">
<html>
<body>
<xsl:for-each select="Programme/Technical/Timecodes">
<tr>
<td>
<xsl:value-of select="TotalProgrammeDuration"/>
</td>
</tr>
</xsl:for-each>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
But all I'm getting back is a blank page.
All I need is the timecode value (TotalProgrammeDuration) from Programme/Technical/Timecodes.
What am I doing wrong? (I'm very new to this, if you can't tell already.)
J.

The elements in your input XML have a namespace. You need to declare this namespace in your XSLT stylesheet too - and prefix any element names you mention.
Namespaces are an important concept in XSLT (as with XML technologies in general) so I recommend you spend some time understanding the basics. For instance, start with a previous answer of mine.
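The same rule applies in any namespace-aware XML API, not only in XSLT. As a rough illustration outside XSLT, here is a minimal Python sketch using the standard library's xml.etree.ElementTree on a cut-down copy of the source document. The variable names are mine, and the reduced document is an assumption made to keep the example short.

```python
import xml.etree.ElementTree as ET

DPP_NS = "http://www.digitalproductionpartnership.co.uk/ns/as11/2012"

# Cut-down version of the source document from the question.
SOURCE = f"""
<dpp:Programme xmlns:dpp="{DPP_NS}">
  <dpp:Technical>
    <dpp:Timecodes>
      <dpp:TotalProgrammeDuration>00:00:30:13</dpp:TotalProgrammeDuration>
    </dpp:Timecodes>
  </dpp:Technical>
</dpp:Programme>
"""

root = ET.fromstring(SOURCE)  # root is the dpp:Programme element

# Unprefixed names mean "element in no namespace", so nothing matches.
without_ns = root.find("Technical/Timecodes/TotalProgrammeDuration")

# With a prefix bound to the document's namespace URI, the element is found.
prefixes = {"dpp": DPP_NS}
with_ns = root.find(
    "dpp:Technical/dpp:Timecodes/dpp:TotalProgrammeDuration", prefixes
)

print(without_ns)    # None
print(with_ns.text)  # 00:00:30:13
```

The prefix itself is arbitrary; what has to match is the namespace URI it is bound to.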
Stylesheet
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:dpp="http://www.digitalproductionpartnership.co.uk/ns/as11/2012">
<xsl:output method="text" indent="yes"/>
<xsl:template match="/">
<html>
<body>
<xsl:for-each select="dpp:Programme/dpp:Technical/dpp:Timecodes">
<tr>
<td>
<xsl:value-of select="dpp:TotalProgrammeDuration"/>
</td>
</tr>
</xsl:for-each>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
Also note that you are obviously outputting XHTML. Then, it makes more sense to set
<xsl:output method="text">
to
<xsl:output method="html">
Further, indent="yes" only has an effect with the xml and html (or xhtml) output methods; with text it is ignored.
Below is a second attempt at writing your stylesheet that uses separate templates (which is generally a better idea than using xsl:for-each).
Stylesheet (a better approach)
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:dpp="http://www.digitalproductionpartnership.co.uk/ns/as11/2012">
<xsl:output method="html" indent="yes"/>
<xsl:template match="/">
<html>
<body>
<xsl:apply-templates/>
</body>
</html>
</xsl:template>
<xsl:template match="dpp:Timecodes">
<tr>
<td>
<xsl:value-of select="dpp:TotalProgrammeDuration"/>
</td>
</tr>
</xsl:template>
<xsl:template match="text()"/>
</xsl:stylesheet>
Output
<html xmlns:dpp="http://www.digitalproductionpartnership.co.uk/ns/as11/2012">
<body>
<tr>
<td>00:00:30:13</td>
</tr>
</body>
</html>

You are missing namespace declarations:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:dpp="http://www.digitalproductionpartnership.co.uk/ns/as11/2012" exclude-result-prefixes="dpp">
<xsl:output method="html" indent="yes"/>
<xsl:template match="/">
<html>
<body>
<xsl:for-each select="dpp:Programme/dpp:Technical/dpp:Timecodes">
<tr>
<td>
<xsl:value-of select="dpp:TotalProgrammeDuration"/>
</td>
</tr>
</xsl:for-each>
</body>
</html>
</xsl:template>
</xsl:stylesheet>

Related

XSLT: Reference to entity must end with the ';' delimiter. SXXP0003

Source:
<!DOCTYPE html>
<html>
<head/>
<body>
<p>Here Link</p>
</body>
</html>
Transform:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<!-- output just the <body> (without <body>) -->
<xsl:output method="html" omit-xml-declaration="yes" indent="no"/>
<xsl:template match="/">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="body/*">
<xsl:copy-of select="."/>
</xsl:template>
<xsl:template match="head | meta | text()"/>
</xsl:stylesheet>
Desired Output:
<p>Here Link</p>
Error:
The reference to entity "node" must end with the ';' delimiter. SXXP0003
Solution Constraint:
I need to keep the @href as node= because that is what the web address is. I cannot delimit it or otherwise change it.
A later step is an Identity Transform, so I'll need to address the problem there as well.

XSLT: find first element above selected element

I want to get the first heading (h1) before a table in a docx.
I can get all headings with:
<xsl:template match="w:p[w:pPr/w:pStyle[@w:val='berschrift1']]">
<p>
<context>
<xsl:value-of select="." />
</context>
</p>
</xsl:template>
and I can also get all tables
<xsl:template match="w:tbl">
<p>
<table>
<xsl:value-of select="." />
</table>
</p>
</xsl:template>
But unfortunately the processor does not accept
<xsl:template match="w:tbl/preceding-sibling::w:p[w:pPr/w:pStyle[@w:val='berschrift1']]">
<p>
<table>
<xsl:value-of select="." />
</table>
</p>
</xsl:template>
Here is a reduced XML file extracted from a docx: http://pastebin.com/KbUyzRVv
I want something like that as a result:
<context>Let’s get it on</context> <- my heading
<table>data</table>
<context>Let’s get it on</context> <- my heading
<table>data</table>
<context>We’re in the middle of something</context> <- my heading
<table>data</table>
Thanks to Daniel Haley I was able to find a solution for that problem. I'll post it here, so it is independent of the pastebin I linked.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
xmlns:v="urn:schemas-microsoft-com:vml" exclude-result-prefixes="xsl w v">
<xsl:output method="xml" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="w:tbl">
<context>
<xsl:value-of select="(preceding-sibling::w:p[w:pPr/w:pStyle[@w:val = 'berschrift1']])[last()]"/>
</context>
<table>
<xsl:value-of select="."/>
</table>
</xsl:template>
<xsl:template match="text()"/>
</xsl:stylesheet>
Hard to answer without a Minimal, Complete, and Verifiable example, but try this:
<xsl:template match="w:tbl">
<p>
<table>
<xsl:value-of select="(preceding::w:p[w:pPr/w:pStyle[@w:val='berschrift1']])[last()]"/>
</table>
</p>
</xsl:template>
Assuming you can use XSLT 2.0 (and most people can, nowadays), I find a useful technique here is to have a global variable that selects all the relevant nodes:
<xsl:variable name="special"
select="//w:tbl/preceding-sibling::w:p[w:pPr/w:pStyle[@w:val='berschrift1']][1]"/>
and then use this variable in a template rule:
<xsl:template match="w:p[. intersect $special]"/>
In XSLT 3.0 you can reduce this to
<xsl:template match="$special"/>
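The "nearest heading before each table" logic that these answers express with preceding-sibling and [last()] can also be written as a single pass over the siblings. The following is only a sketch in Python (standard library xml.etree.ElementTree), assuming headings and tables are direct children of w:body. The sample body and its texts are hypothetical and are not taken from the pastebin.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hypothetical, heavily reduced document body: two headings, three tables.
BODY = f"""
<w:body xmlns:w="{W}">
  <w:p><w:pPr><w:pStyle w:val="berschrift1"/></w:pPr><w:r><w:t>First heading</w:t></w:r></w:p>
  <w:tbl><w:tr><w:tc><w:p><w:r><w:t>data A</w:t></w:r></w:p></w:tc></w:tr></w:tbl>
  <w:p><w:r><w:t>ordinary paragraph</w:t></w:r></w:p>
  <w:tbl><w:tr><w:tc><w:p><w:r><w:t>data B</w:t></w:r></w:p></w:tc></w:tr></w:tbl>
  <w:p><w:pPr><w:pStyle w:val="berschrift1"/></w:pPr><w:r><w:t>Second heading</w:t></w:r></w:p>
  <w:tbl><w:tr><w:tc><w:p><w:r><w:t>data C</w:t></w:r></w:p></w:tc></w:tr></w:tbl>
</w:body>
"""


def is_heading(elem):
    """True for a w:p whose w:pStyle has w:val='berschrift1'."""
    if elem.tag != f"{{{W}}}p":
        return False
    style = elem.find("w:pPr/w:pStyle", NS)
    return style is not None and style.get(f"{{{W}}}val") == "berschrift1"


def headings_for_tables(body):
    """Pair every table with the closest heading that precedes it."""
    pairs = []
    current_heading = None
    for child in body:  # children are visited in document order
        if is_heading(child):
            current_heading = "".join(child.itertext())
        elif child.tag == f"{{{W}}}tbl":
            pairs.append((current_heading, "".join(child.itertext())))
    return pairs


pairs = headings_for_tables(ET.fromstring(BODY))
for heading, table in pairs:
    print(heading, "|", table)
```

Remembering the last heading seen while walking forward gives the same pairing as looking backwards from each table, and a table with no heading before it simply gets None.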

Avoid newline within mixed content elements

I need control over the output of an XSL transformation process in terms of (not) setting newlines before certain result elements. Take this input
<text>
<line>My text uses <hi>highlighting</hi> methods</line>
<line>Next line uses <hi>two </hi><hi>highlighter</hi> elements...</line>
</text>
transformed by this simple stylesheet:
<?xml version="1.0" encoding="utf-8"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:output indent="yes" method="xml"/>
<xsl:template match="line">
<p>
<xsl:apply-templates/>
</p>
</xsl:template>
<xsl:template match="hi">
<span>
<xsl:apply-templates/>
</span>
</xsl:template>
</xsl:transform>
The undesirable result of the transformation is:
<p>My text uses <span>highlighting</span> methods</p>
<p>Next line uses <span>two </span>
<span>highlighter</span> elements...</p>
The second <span> within <p> produces a newline, which is not what I want.
What's the reason for this behaviour and how to avoid it, meaning: how to achieve this result:
<p>My text uses <span>highlighting</span> methods</p>
<p>Next line uses <span>two </span><span>highlighter</span> elements...</p>
(Yes, I need <xsl:output indent="yes"> and the transformation method has to be "xml".)
The only way I can see to get around this with the constraints you specify in the last line of your question (method="xml" and indent="yes") is to add xml:space="preserve" to the p elements you create, as
Whitespace characters MUST NOT be inserted in a part of the result document that is controlled by an xml:space attribute with value preserve.
(Source)
<?xml version="1.0" encoding="utf-8"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:output indent="yes" method="xml"/>
<xsl:template match="line">
<p xml:space="preserve"><xsl:apply-templates/></p>
</xsl:template>
<xsl:template match="hi">
<span>
<xsl:apply-templates/>
</span>
</xsl:template>
</xsl:transform>
Note that because of the xml:space="preserve" you also have to remove the whitespace between the opening and closing tags of the p element and the child xsl:apply-templates. When run on your example input using Saxon 9 HE this produces the output
<?xml version="1.0" encoding="UTF-8"?>
<p xml:space="preserve">My text uses <span>highlighting</span> methods</p>
<p xml:space="preserve">Next line uses <span>two </span><span>highlighter</span> elements...</p>
If you could instead use the xhtml output method (and the XHTML namespace) then the XHTML indenter is not allowed to add space around tags that start or end elements that XHTML specifies to be "inline" markup, and this includes span.
<?xml version="1.0" encoding="utf-8"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0"
xmlns="http://www.w3.org/1999/xhtml">
<xsl:output indent="yes" method="xhtml"/>
<xsl:template match="/">
<html><body><xsl:apply-templates/></body></html>
</xsl:template>
<xsl:template match="line">
<p>
<xsl:apply-templates/>
</p>
</xsl:template>
<xsl:template match="hi">
<span>
<xsl:apply-templates/>
</span>
</xsl:template>
</xsl:transform>
when run on the same input would produce
<?xml version="1.0" encoding="UTF-8"?><html xmlns="http://www.w3.org/1999/xhtml">
<body>
<p>My text uses <span>highlighting</span> methods
</p>
<p>Next line uses <span>two </span><span>highlighter</span> elements...
</p>
</body>
</html>
without space between the two span elements.

XML namespace reference in XSL stylesheet problem

I am trying to extract data out of the following XML fragment:
<?xml version= "1.0" ?>
<Stmts xmlns="http://tempuri.org/Statement.xsd" Generation="2011-08-01T12:41:41" >
<StatementDetail AccountStatus="Open" CompanyID="" TransactionCount="182" >
<Transactions>
<Manual.../>
...
</Transactions>
</StatementDetail>
</Stmts>
Notice that the Stmts element has an xmlns attribute.
When I try to use the following XSL I get no data.
<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- Edited by XMLSpy® -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<body>
<h2>CabCharge</h2>
<table border="1">
<tr bgcolor="#9acd32">
<th>Batch</th>
<th>TransNo</th>
</tr>
<xsl:for-each select="Stmt/StatementDetail/Transactions/Manual">
<tr>
<td><xsl:value-of select="@Batch"/></td>
<td><xsl:value-of select="@TransNo"/></td>
</tr>
</xsl:for-each>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
BUT! If I remove the xmlns attribute from the element, I do get data.
What do I need to specify in the XSL to recognise the namespace?
Thanks.
Make sure to bind the namespace of the document to a prefix in your stylesheet, like:
<xsl:stylesheet version="1.0"
xmlns:s="http://tempuri.org/Statement.xsd"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
exclude-result-prefixes="s">
and use that prefix on every element name in your paths, for example s:Stmts/s:StatementDetail/s:Transactions/s:Manual. Declaring the namespace as the default one (xmlns="...") in the stylesheet is not enough in XSLT 1.0: unprefixed names in XPath expressions always refer to elements in no namespace, so the path would still select nothing.
Moreover, exclude-result-prefixes keeps the namespace declaration out of the output document, so the root literal element html needs no special treatment:
<xsl:template match="/">
<html>
<!-- your stuff -->
</html>
</xsl:template>
Do note also that:
In the xsl:for-each you are selecting the wrong element (Stmt in place of Stmts)
The attributes Batch and TransNo do not exist in your input documents.
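The effect of the xmlns attribute can be reproduced outside XSLT as well. Here is a minimal Python sketch with the standard library's xml.etree.ElementTree. The s prefix is an arbitrary choice, and the Manual elements with their Batch and TransNo values are hypothetical, since the question elides them.

```python
import xml.etree.ElementTree as ET

STMT_NS = "http://tempuri.org/Statement.xsd"

# The Manual elements and their attribute values are made up for illustration.
SOURCE = f"""
<Stmts xmlns="{STMT_NS}">
  <StatementDetail AccountStatus="Open">
    <Transactions>
      <Manual Batch="7" TransNo="1001"/>
      <Manual Batch="7" TransNo="1002"/>
    </Transactions>
  </StatementDetail>
</Stmts>
"""

root = ET.fromstring(SOURCE)

# The default namespace applies to every element, although no prefix is visible.
print(root.tag)  # {http://tempuri.org/Statement.xsd}Stmts

# Unprefixed path steps look for elements in no namespace: no match.
unprefixed = root.findall("StatementDetail/Transactions/Manual")

# Binding a prefix to the same URI makes the path match.
prefixes = {"s": STMT_NS}
prefixed = root.findall("s:StatementDetail/s:Transactions/s:Manual", prefixes)

# Unprefixed attributes are in no namespace, so they need no prefix.
rows = [(m.get("Batch"), m.get("TransNo")) for m in prefixed]
print(len(unprefixed), rows)
```

Only the element names need the prefix. The default namespace does not apply to attributes, which is why Batch and TransNo are read by their plain names.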

Change font in XML using XSLT

I'm new to XSLT. I'm trying to change the font size of a specific piece of text in an XML file using XSLT. For example, I have the CDCatalog.xml file with the following data.
<?xml version="1.0" encoding="ISO-8859-1" ?>
<?xml-stylesheet type="text/xsl" href="cdcat.xsl"?>
<catalog>
<cd>
<title>Empire Burlesque</title>
<artist><SmallText>Bob Dylan</SmallText><LineBreak/>*</artist>
<country>USA</country>
<company>Columbia</company>
<price>10.90</price>
<year>1985</year>
</cd>
</catalog>
and the cdCat.XSL file is-
<?xml version="1.0" encoding="ISO-8859-1" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" version="1.0" encoding="UTF-8" indent="yes" />
<xsl:include href="cdCatalog.xsl" /> <!-- I added this -->
<xsl:template match="/">
<html>
<body>
<h2>My CD Collection</h2>
<table border="1">
<tr bgcolor="#9acd32">
<th align="left">Title</th>
<th align="left">Artist</th>
</tr>
<xsl:for-each select="catalog/cd">
<tr>
<td>
<xsl:value-of select="title" />
</td>
<td>
<xsl:value-of select="artist" />
</td>
</tr>
</xsl:for-each>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
I added a new XSL file, cdCatalog.XSL, with the following content:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="LineBreak">
<br/>
</xsl:template>
<xsl:template match="Superscript">
<sup>
<xsl:value-of select="."/>
</sup>
</xsl:template>
<xsl:template match="SmallText">
<font size="1">
<xsl:value-of select="."/>
</font>
</xsl:template>
</xsl:stylesheet>
I included this file in the CDCat.xsl file and added the <SmallText> and <LineBreak> tags to the CdCatalog.xml file. Now when I open the XML file I see neither the line break nor the font size difference. Can anyone please suggest what I'm missing?
Thanks in advance
Sai
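You need to use xsl:apply-templates to indicate where your template matches should take effect. xsl:value-of select="artist" only outputs the string value of artist, so the SmallText and LineBreak templates are never invoked; xsl:apply-templates select="artist" hands the child nodes over to them.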
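The difference between taking an element's string value and dispatching its children to rules can be sketched outside XSLT. The following Python analogy (standard library only) is not how an XSLT processor is implemented. The function names and the RULES table are mine, and the sketch skips HTML escaping.

```python
import xml.etree.ElementTree as ET

ARTIST = ET.fromstring(
    "<artist><SmallText>Bob Dylan</SmallText><LineBreak/>*</artist>"
)


def string_value(elem):
    """Like xsl:value-of: concatenated text, all child markup is lost."""
    return "".join(elem.itertext())


def render(elem):
    """Like xsl:apply-templates: each child is handed to the rule for its tag."""
    out = [elem.text or ""]
    for child in elem:
        rule = RULES.get(child.tag, render)  # no rule: just descend
        out.append(rule(child))
        out.append(child.tail or "")  # text that follows the child
    return "".join(out)


# One rule per element name, mirroring the templates in cdCatalog.XSL.
RULES = {
    "LineBreak": lambda e: "<br/>",
    "SmallText": lambda e: f'<font size="1">{string_value(e)}</font>',
}

print(string_value(ARTIST))  # Bob Dylan*
print(render(ARTIST))        # <font size="1">Bob Dylan</font><br/>*
```

The first call shows what the stylesheet in the question produces for the artist cell. The second shows what it produces once the children are actually dispatched to the matching rules.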
XML says nothing about presentation, that's the whole point. It's a data format.
If you want your XSLT to output to something where presentation matters, I suggest you transform to HTML and let HTML/CSS handle the styling.
Having seen your actual code now (hint: use the formatting when creating questions) don't use the font tag. What you want semantically and in practice is just headers <h1>, <h2>, <h3> etc, and I'd still suggest you add a CSS link in there. Oh and <xsl:output method="html" />
In-between these two opening tags:
<html>
<body>
...I'd place a link to a style sheet that defines the font sizes. Alternatively (and useful if you want a self contained HTML file to email around) you could put a style block there instead.