Sample <AAA:BBB CCC:DDD EEE:FFF><GGG:HHH III:JJJ><KKK>
What I want is a substitution that removes everything except <BBB><HHH><KKK>
I've tried loads of things and just keep falling over.
If it's easier to do one brace at a time, that would be fine.
As you can probably guess, it's XML using LibXML, and I'm parsing all the elements against a list of paths and nodes in arrays. I just want the node name, not things like
<com.fnf:NodeName/> needs to be <NodeName/>
or worse still <\com.com.com:NodeName xmlns:com.com.com="http://www.some.domain"> just needs to say <NodeName>
I think this short program will do what you need. It uses XML::Twig to process the XML data, and defines a twig handler which is called for all elements in the data, and removes the element's namespace prefix and all attributes.
I've had to make a guess at what your XML data really looks like, as what you show in your question is far from being valid XML.
use strict;
use warnings;
use XML::Twig;
my $twig = XML::Twig->new;
$twig->setTwigHandler(_all_ => sub {
$_->set_name($_->local_name);
$_->del_atts;
});
$twig->parse( \*DATA );
$twig->print(pretty_print => 'indented');
__DATA__
<root>
<aaa:bbb ccc="ddd" eee="fff">
<ggg:hhh iii="jjj">
<kkk></kkk>
</ggg:hhh>
</aaa:bbb>
</root>
output
<root>
<bbb>
<hhh>
<kkk></kkk>
</hhh>
</bbb>
</root>
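For comparison, the same idea (keep only the local element name and drop every attribute) can be sketched with Python's standard-library ElementTree. This is an illustration rather than a port of the XML::Twig code: the sample document and the urn:example:* namespace URIs are made up, and the input has to be well-formed XML with its prefixes declared.

```python
import xml.etree.ElementTree as ET


def strip_namespaces(root):
    """Reduce every element to its local name and remove all attributes."""
    for elem in root.iter():
        # ElementTree stores qualified names as "{namespace-uri}local".
        if isinstance(elem.tag, str) and elem.tag.startswith("{"):
            elem.tag = elem.tag.split("}", 1)[1]
        elem.attrib.clear()
    return root


# Hypothetical sample; the namespace URIs are placeholders.
SAMPLE = """<root xmlns:aaa="urn:example:a" xmlns:ggg="urn:example:g">
  <aaa:bbb ccc="ddd" eee="fff">
    <ggg:hhh iii="jjj">
      <kkk/>
    </ggg:hhh>
  </aaa:bbb>
</root>"""

root = strip_namespaces(ET.fromstring(SAMPLE))
result = ET.tostring(root, encoding="unicode")
print(result)
```

The xmlns declarations are consumed by the parser, so they do not reappear once no element uses a namespace any more.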
Use XML::Parser and set Namespaces to true:
Namespaces
This is an Expat option. If this is set to a true value, then namespace processing is done during the parse. See "Namespaces" in XML::Parser::Expat for further discussion of namespace processing.
…
When this option is given with a true value, then the parser does namespace processing. By default, namespace processing is turned off. When it is turned on, the parser consumes xmlns attributes and strips off prefixes from element and attributes names where those prefixes have a defined namespace. A name's namespace can be found using the "namespace" method and two names can be checked for absolute equality with the "eq_name" method.
An idea: this can be done with an xsl transformation:
the xsl file:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes" method="xml" encoding="utf-8" omit-xml-declaration="yes"/>
<!-- template for all elements -->
<xsl:template match="*">
<!-- local-name() gets the tagname without namespace -->
<xsl:element name="{local-name()}">
<xsl:apply-templates select="node()"/>
</xsl:element>
</xsl:template>
<!-- template to copy all that is not a tag or an attribute -->
<xsl:template match="comment() | text() | processing-instruction()">
<xsl:copy/>
</xsl:template>
</xsl:stylesheet>
The perl code:
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXSLT;
use XML::LibXML;
my $xslt = XML::LibXSLT->new();
my $source = XML::LibXML->load_xml(location => 'removens.xml');
my $style_doc = XML::LibXML->load_xml(location => 'removens.xsl');
my $stylesheet = $xslt->parse_stylesheet($style_doc);
my $results = $stylesheet->transform($source);
print $stylesheet->output_as_bytes($results);
Or, instead of using Perl, you can run xsltproc directly in a terminal:
xsltproc removens.xsl removens.xml
I'm writing an experiment where I take an XML file, that has XPaths embedded in it, and try to process it against another XML file with data in it, where the XPaths refer to elements within some predefined nodeset inside the data....basically binding a view to a list of data.
I've basically got it working, except for how to evaluate the XPaths themselves. Clearly I can do it in Saxon with 3.0 (maybe I should try there first), but it would be initially convenient if this worked in MSXML. I've read stuff about "extensions" and embedding JavaScript, but I can't really see how it would work (it didn't work for me).
Any ideas?
(I could make the xslt create an xslt that creates the output, but this is a rough and ready proof of concept, and that might make my head hurt).
I can if necessary create an explicit example, but my actual scenario is quite convoluted.
Within .NET XPathNavigator has an Evaluate method: https://learn.microsoft.com/en-us/dotnet/api/system.xml.xpath.xpathnavigator.evaluate?view=netcore-3.1. You can expose it from an extension object to XslCompiledTransform.
The Mvp.Xml library does this to expose a dyn2:evaluate extension function in the namespace xmlns:dyn2="http://gotdotnet.com/exslt/dynamic" to its wrapper MvpXslTransform class around XslCompiledTransform.
The library is available on NuGet in a .NET standard 2.0 compatible (i.e. both .NET framework and .NET Core compatible) package: https://www.nuget.org/packages/Mvp.Xml.NetStandard.
Simplest code would be (using Mvp.Xml.Common.Xsl;):
var processor = new MvpXslTransform();
processor.SupportedFunctions = Mvp.Xml.Exslt.ExsltFunctionNamespace.GdnDynamic;
processor.Load("XSLTFile1.xslt");
processor.Transform(new XmlInput("XMLFile1.xml"), null, new XmlOutput(Console.Out));
where the stylesheet uses e.g. dyn2:evaluate(., path) with xmlns:dyn2="http://gotdotnet.com/exslt/dynamic" where the path child element of the context node contains an XPath expression.
The source code is at https://github.com/keimpema/Mvp.Xml.NetStandard/blob/master/library/Mvp.Xml/Exslt/GDNDynamic.cs.
In the .NET framework you can also embed .NET code directly in the XSLT and XslCompiledTransform will compile and run it with the proper XsltSettings (new XsltSettings() { EnableScript = true }):
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt"
xmlns:my-ext="http://example.com/my-ext"
exclude-result-prefixes="msxsl my-ext">
<msxsl:script implements-prefix="my-ext" language="C#">
public object evaluate(XPathNodeIterator context, string expression)
{
if (context.MoveNext()) {
return context.Current.Evaluate(expression);
}
else {
return null;
}
}
</msxsl:script>
<xsl:output method="xml" indent="yes"/>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="xpath-test">
<xsl:copy>
<result>
<xsl:value-of select="my-ext:evaluate(., path)"/>
</result>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
I don't know whether Visual Studio allows you to enable that setting for its transformation menu.
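To make the "XPath stored in the data" idea concrete outside .NET, here is a small sketch using Python's standard-library ElementTree. The document layout (an xpath-test element with a path child) mirrors the stylesheet above, but the item/name content is invented for illustration, and ElementTree supports only a limited subset of XPath, so this is not equivalent to XPathNavigator.Evaluate.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: each <xpath-test> carries its own expression in <path>.
DOC = """<root>
  <xpath-test>
    <path>item/name</path>
    <item><name>first</name></item>
  </xpath-test>
</root>"""

root = ET.fromstring(DOC)
results = []
for test in root.iter("xpath-test"):
    expression = test.findtext("path")  # the expression is read from the data
    # Evaluate the stored expression with the <xpath-test> element as context.
    results.append(test.findtext(expression))

print(results)
```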
I'm trying to construct a link with
<xsl:element name="a">
<xsl:attribute name="href">
<xsl:value-of select="concat('file:///', substring-before('%RolesPath%', 'roles'),'Flores.chm')"/>
</xsl:attribute>
Help
</xsl:element>
but I get the error:
File file:///Flores.chm not found
I'm pretty sure that the variable %RolesPath% works fine; I use it elsewhere in the code without problems. And if I use only
<xsl:value-of select="concat('file:///', substring-before('%RolesPath%', 'roles'),'Flores.chm')"/>
I get
file:///C:\Flores\Flores.chm
which is the right path. Where am I making a mistake?
Edit: %RolesPath% stores the path to a specific folder of the program that works with this code. In my case %RolesPath% stores "C:\Flores\roles\".
To clarify my problem: I need to open a file (Flores.chm) in the root folder of the program. The program can be installed anywhere on the PC, and probably the only way I can get the path is via %RolesPath%.
What you are passing to substring-before() is just a string ('%RolesPath%'). It appears that you are trying to use a Windows environment variable. This isn't going to work the way you're using it.
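The reported error follows directly from how substring-before() treats a literal string. The sketch below mimics the XPath function in Python; substring_before and help_link are illustrative helper names, not part of any library. The literal '%RolesPath%' contains "Roles" with a capital R but never the lowercase marker 'roles', so the function returns an empty string and only the prefix and file name remain.

```python
def substring_before(text, marker):
    """Mimic XPath substring-before(): "" when the marker is empty or absent."""
    if not marker:
        return ""
    head, sep, _ = text.partition(marker)
    return head if sep else ""


def help_link(roles_path):
    # Same expression as in the stylesheet, written as plain string handling.
    return "file:///" + substring_before(roles_path, "roles") + "Flores.chm"


literal = help_link("%RolesPath%")          # what the XSLT processor actually sees
expanded = help_link("C:\\Flores\\roles\\")  # what was intended

print(literal)
print(expanded)
```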
I think you have 2 options:
Option 1
Pass the value of the environment variable as an xsl:param when you call the stylesheet. This would work in either XSLT 1.0 or 2.0.
You would need the xsl:param:
<xsl:param name="RolesPath"/>
and this is how you would reference it:
<a href="{concat('file:///', substring-before($RolesPath, 'roles'),'Flores.chm')}"/>
Option 2
Use the environment-variable() function. This would only work with an XSLT 3.0 processor, such as Saxon-PE or EE.
Example:
<a href="{concat('file:///', substring-before(environment-variable('RolesPath'), 'roles'),'Flores.chm')}"/>
Here's another example of environment-variable() to show the function actually working:
XSLT 3.0
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<environment-variable name="TEMP" value="{environment-variable('TEMP')}"/>
</xsl:template>
</xsl:stylesheet>
Output (when applied to any well-formed XML)
<environment-variable name="TEMP" value="C:\Users\dhaley\AppData\Local\Temp"/>
Use this shorter expression:
<a href="file:///{substring-before($RolesPath, 'roles')}Flores.chm"/>
where $RolesPath is passed as an external, global parameter to the transformation.
How exactly to pass an external parameter to the transformation varies from one XSLT processor to another -- read your XSLT processor documentation. Some XSLT processors also allow string-typed parameters to be passed to the transformation from a command-line execution utility.
With the program BaseX I was able to use XPath and XQuery in order to query an XML document located at my home directory, but I have a problem with doing the same in XSLT.
The document I'm querying is BookstoreQ.xml.
XPath version, running totally fine:
doc("/home/ioannis/Desktop/BookstoreQ.xml")/Bookstore/Book/Title
XSLT code which I want to execute:
<xsl:stylesheet version = "2.0" xmlns:xsl = "http://www.w3.org/1999/XSL/Transform">
<xsl:output method= "xml" indent = "yes" omit-xml-declaration = "yes" />
<xsl:template match = "Book"></xsl:template>
</xsl:stylesheet>
I read BaseX' documentation on XSLT, but didn't manage to find a solution. How can I run given XSLT?
BaseX has no direct support for XSLT, you have to call it using XQuery functions (which is easy, though). There are two functions for doing this, one for returning XML nodes (xslt:transform(...)), one for returning text as a string (xslt:transform-text(...)). You need the second one.
xslt:transform-text(doc("/home/ioannis/Desktop/BookstoreQ.xml"),
<xsl:stylesheet version = "2.0" xmlns:xsl = "http://www.w3.org/1999/XSL/Transform">
<xsl:output method= "xml" indent = "yes" omit-xml-declaration = "yes" />
<xsl:template match = "Book"></xsl:template>
</xsl:stylesheet>
)
Both can either be called with the XSLT as nodes (used here), by passing it as a string or giving a path to a file containing the XSLT code.
Greetings!
I want to extract some properties from different Maven POMs in a XSLT via the document function. The script itself works fine but the document function returns an empty result for the POM as long as I have the xmlns="http://maven.apache.org/POM/4.0.0" in the project tag. If I remove it, everything works fine.
Any idea how to make this work while leaving the xmlns attribute where it belongs, or why this doesn't work with the attribute in place?
Here comes the relevant portion of my XSLT:
<xsl:template match="abcs">
<xsl:variable name="artifactCoordinate" select="abc"/>
<xsl:choose>
<xsl:when test="document(concat($artifactCoordinate,'-pom.xml'))">
<abc>
<ID><xsl:value-of select="$artifactCoordinate"/></ID>
<xsl:copy-of select="document(concat($artifactCoordinate,'-pom.xml'))/project/properties"/>
</abc>
</xsl:when>
<xsl:otherwise>
<xsl:message terminate="yes">
Transformation failed: POM "<xsl:value-of select="concat($artifactCoordinate,'-pom.xml')"/>" doesn't exist.
</xsl:message>
</xsl:otherwise>
</xsl:choose>
And, for completeness, a POM extract with the "bad" attribute:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<!-- ... -->
<properties>
<proalpha.version>[5.2a]</proalpha.version>
<proalpha.openedge.version>[10.1B]</proalpha.openedge.version>
<proalpha.optimierer.version>[1.1]</proalpha.optimierer.version>
<proalpha.sonic.version>[7.6.1]</proalpha.sonic.version>
</properties>
</project>
Your problem is that the POM extract uses a default namespace. This means that the elements, although unprefixed, are in the "http://maven.apache.org/POM/4.0.0" namespace, not in "no namespace".
However, in this XPath expression, in the XSLT code:
document(concat($artifactCoordinate,'-pom.xml'))/project/properties
the names project and properties are unprefixed. XPath always treats unprefixed names as belonging to "no namespace". Hence, no such elements are found and no node is selected.
Solution: Add a namespace definition to your <xsl:stylesheet>, let's say:
xmlns:p="http://maven.apache.org/POM/4.0.0"
Then rewrite element names in any expressions referencing POM nodes from someElement to p:someElement. For example:
document(concat($artifactCoordinate,'-pom.xml'))/p:project/p:properties
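The same behaviour can be reproduced outside XSLT. The following sketch uses Python's standard-library ElementTree on a trimmed copy of the POM from the question; the prefix p is an arbitrary choice, and the {*} wildcard form assumes Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the POM from the question.
POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <properties>
    <proalpha.version>[5.2a]</proalpha.version>
  </properties>
</project>"""

# The prefix is local to the query; only the namespace URI has to match.
NS = {"p": "http://maven.apache.org/POM/4.0.0"}

project = ET.fromstring(POM)

unprefixed = project.find("properties")      # looks in "no namespace"
prefixed = project.find("p:properties", NS)  # looks in the POM namespace
wildcard = project.find("{*}properties")     # any namespace (Python 3.8+)

print(unprefixed)    # None: nothing matches without the namespace
print(prefixed.tag)  # the tag carries the namespace URI in braces
```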
This is a namespace problem. The xmlns="http://maven.apache.org/POM/4.0.0" in the source document means that all the elements are by default put into the "http://maven.apache.org/POM/4.0.0" namespace in the XML document.
If you want to get ahold of them in your xslt, you need to declare that namespace in your xslt (with or without a prefix to use) and then use that namespace when selecting your elements.
For example, I'm guessing that the template in your example is meant to match an "abcs" element in your POM, yes? Try adding a namespace declaration in your xsl:stylesheet, e.g.:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:pom="http://maven.apache.org/POM/4.0.0" version="1.0">
That says to the XSL "I want to add 'pom' as a prefix that identifies the 'http://maven.apache.org/POM/4.0.0' namespace in this document."
Then when selecting elements or matching templates, use that prefix, e.g.:
<xsl:template match="pom:abcs">
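Note that simply declaring the POM namespace as the stylesheet's default namespace (xmlns="http://maven.apache.org/POM/4.0.0") does not help: that declaration applies to literal result elements, not to unprefixed names in XPath expressions and match patterns. If you want to do without prefixes and can use XSLT 2.0, set xpath-default-namespace instead, something like:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xpath-default-namespace="http://maven.apache.org/POM/4.0.0" version="2.0">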
Nodes can (if using XSLT 2.0+) also be addressed via the *: wildcard prefix, which matches a local name in any namespace.
<xsl:copy-of select="document(concat($artifactCoordinate,'-pom.xml'))/*:project/*:properties"/>
This can be just convenient, or especially useful if the namespace is unknown. In this case a nice side effect is that, with the namespace matched this way, the nodes from the other namespace don't get an annotation, which is not wanted in our case.
I have an XML document that needs to pass text inside an element with an '&' in it.
This is called from .NET to a Web Service and comes over the wire with the correct encoding &amp;
e.g.
T&amp;O
I then need to use XSLT to create a transform, but need to query SQL Server through an SP without the encoding on the ampersand, e.g. T&O would go to the DB.
(Note this all has to be done through XSLT, I do have the choice to use .NET encoding at this point)
Anyone have any idea how to do this from XSLT?
Note my XSLT knowledge isn’t the best to say the least!
Cheers
<xsl:text disable-output-escaping="yes">&amp;<!--&--></xsl:text>
More info at: http://www.w3schools.com/xsl/el_text.asp
If you have the choice to use .NET you can convert between an HTML-encoded and regular string using (this code requires a reference to System.Web):
string htmlEncodedText = System.Web.HttpUtility.HtmlEncode("T&O");
string text = System.Web.HttpUtility.HtmlDecode(htmlEncodedText);
Update
Since you need to do this in plain XSLT you can use xsl:value-of to decode the HTML encoding:
<xsl:variable name="test">
<xsl:value-of select="'T&amp;O'"/>
</xsl:variable>
The variable string($test) will have the value T&O. You can pass this variable as an argument to your extension function then.
Supposing your XML looks like this:
<root>T&amp;O</root>
you can use this XSLT snippet to get the text out of it:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" />
<xsl:template match="root"> <!-- Select the root element... -->
<xsl:value-of select="." /> <!-- ...and extract all text from it -->
</xsl:template>
</xsl:stylesheet>
Output (from Saxon 9, that is):
T&O
The point is the <xsl:output/> element. The default would be to output XML, where the ampersand would still be encoded.
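The distinction between the parsed value and the serialized form can be seen in a few lines of Python with the standard-library ElementTree. This is only an illustration of the principle, not of the XSLT processor itself: the parser decodes &amp; to a plain ampersand, XML serialization escapes it again, and text serialization does not.

```python
import xml.etree.ElementTree as ET

# In the XML source the ampersand has to be written as &amp;.
root = ET.fromstring("<root>T&amp;O</root>")

decoded = root.text                                              # parsed value
as_xml = ET.tostring(root, encoding="unicode")                   # XML output
as_text = ET.tostring(root, encoding="unicode", method="text")   # text output

print(decoded)
print(as_xml)
print(as_text)
```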