How to create Source that results in empty node sequence - xslt

Is there a way to create a javax.xml.transform.Source implementation that Saxon 10 will interpret as an empty node sequence? There are various ways of constructing a Source that is a document node with no children, i.e. an empty document, but in this case the Source should yield an empty node sequence.
The use case is a javax.xml.transform.URIResolver implementation that can return a Source for an empty node sequence when the XSLT document() function is used. This would allow the URIResolver to mimic recoverable-error behaviour when the target resource is not available.

No sorry, I think I led you up the garden path on this one. It can't be done; a Source must either resolve to a single Node, or fail.

Related

How to skip a self closing tag in a ST function on an SAP system?

I have a problem handling an XML file in my SAP ABAP-based software with a Simple Transformation.
The file I receive normally has no empty tags like <test></test>, but sometimes I receive a self-closing tag like <test/>.
This is an example of what I am trying now. The first condition skips the element if ref('test') is initial. The second one takes the value if there is one.
<tt:cond check="initial(ref('test'))">
<tt:skip count="*" name="test"/>
</tt:cond>
<tt:cond check="not-initial(ref('test'))">
<test tt:value-ref="test"/>
</tt:cond>
The idea is: if we have the tag <test/> we need to skip it, otherwise we need to assign the data. This works in the first case, because no data is taken, but not in the second, because the data is not taken there either.
Can someone help?
Thanks in advance.
The XDM tree representations of <test></test> and <test/> are 100% identical, so there is no way an XSLT stylesheet can distinguish them or treat them differently. The idea of attaching different meanings to the two constructs is completely misguided: you can never be sure which representation an XML library will choose to use.
It is of course possible to distinguish an element that contains a value (such as <test>value</test>) from one that is empty - but both the above examples represent empty elements and must be treated as equivalent.
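The equivalence is easy to check with any conforming parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree rather than Simple Transformations or XSLT; the <root> wrapper and the describe helper are hypothetical names introduced only for the illustration.

```python
import xml.etree.ElementTree as ET

# Three hypothetical inputs: self-closing, start/end pair, and a real value.
self_closing = ET.fromstring("<root><test/></root>")
start_end = ET.fromstring("<root><test></test></root>")
with_value = ET.fromstring("<root><test>value</test></root>")


def describe(root):
    """Return (tag, text, number of children) for the <test> child of root."""
    test = root.find("test")
    return (test.tag, test.text, len(test))


# Both empty forms produce the same parsed element, so nothing downstream
# of the parser can tell them apart.
print(describe(self_closing) == describe(start_end))

# An element with content, on the other hand, is distinguishable.
print(describe(with_value) != describe(self_closing))
```

After parsing, the only distinction left is "empty" versus "has content", which is the distinction the transformation can legitimately test for.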

RegEx to remove specific XML elements

I'm using Kate to process text to create an XML file but I've hit a roadblock. The text now contains additional data that I need to remove based on its content.
To be specific, I have an XML element called <officers> that contains 0 or more <officer> elements, which contain further elements such as <title>, <name>, etc. While I could probably exclude these at run time using XSL, the file also drives another process that I don't want to touch: it's a general-purpose data importer for Scribus, so I don't want to change its code.
What I want to do is remove an <officer> element if the <title> content isn't what I want. For example, I don't want the First VP, so I'd like to remove:
<officer>
<title>First VP</title>
<incumbent>Joe Somebody</incumbent>
<address>....</address>
<address>....</address>
......
</officer>
I don't know how many lines will be in any <officer> element, nor what position it will be in within the <officers> element.
The easy part is getting to the start of the content I want removed. The hard part is getting to the </officer> end tag. All the solutions I've found so far just result in Kate deciding that the RegEx is invalid.
Any suggestions are appreciated.
Regex is the wrong tool for this job; never process XML without a proper parser, except possibly for a one-off job on a single document where you will throw the code away after running it and checking the results by hand. You might find a regex that works on one sample document, but you'll never get it to work properly on a well-designed set of 100 test documents.
And it's easily done using XSLT. It's a stylesheet with two template rules: a default "identity template" rule to copy elements unchanged, and a second rule to delete the elements you don't want. In fact in XSLT 3.0 it gets even simpler:
<xsl:mode on-no-match="shallow-copy"/>
<xsl:template match="officer[title='First VP']"/>
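If an XSLT processor is not to hand, the same "parse, drop the unwanted elements, serialize" approach works with any real XML parser. Below is a minimal sketch using Python's standard xml.etree.ElementTree, not the XSLT solution itself; the sample document, the second officer and the remove_officers function name are assumptions made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data modelled on the structure described in the question.
SAMPLE = """<officers>
  <officer><title>President</title><incumbent>Jane Somebody</incumbent></officer>
  <officer><title>First VP</title><incumbent>Joe Somebody</incumbent></officer>
</officers>"""


def remove_officers(xml_text, unwanted_title):
    """Parse xml_text and drop every <officer> whose <title> equals unwanted_title."""
    root = ET.fromstring(xml_text)
    # Snapshot the element list first so removals do not disturb the iteration.
    for parent in list(root.iter()):
        for officer in parent.findall("officer"):
            if officer.findtext("title") == unwanted_title:
                parent.remove(officer)
    return root


cleaned = remove_officers(SAMPLE, "First VP")
print(ET.tostring(cleaned, encoding="unicode"))
```

Because the parser works on whole elements, it does not matter how many lines an <officer> spans or where it sits inside <officers>.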

Pugixml: No document element found

I'm having some trouble loading the document (see link http://pastebin.com/FE3nDX9h) in pugixml.
I'm getting an error code of 16 (No document element found), which indicates that the XML file is invalid or empty, but I don't think it is either.
I am using the default parsing method. Is there something I am missing?
Edit: as requested, here's some source code: http://pastebin.com/USUjLC4q (you will need to edit the paths).
You need xml_document::load_file, not xml_document::load.
From the pugixml documentation:
There is also a simple helper function, xml_document::load, for cases when you want to load the XML document from null-terminated character string.
So the argument to load has to be the XML text itself, not a file name.

Reading XML with xerces: Getting type where <nodeName type="typeName">

I'm using xerces-c-3.1.1 to read xml files into a C++ program.
I have located a node aNode of type
DOMNode* aNode;
and can get the node name using
name=aNode->getNodeName();
However when I try to use
type=aNode->getNodeType();
to get the type, the type returned is an integer: ELEMENT_NODE.
I would be most grateful if someone could tell me how to write code that enables me to tell whether a node is of name "nodeName" and of type "typeName". I know how to do the former part using
if(wcscmp(name, L"nodeName")==0)
but do not know how to do the latter part.
DOMNode::getNodeType is not shorthand for "get the attribute named type and return it as a string." It does exactly what it says: retrieves the DOM type of the DOM node. DOM nodes are typed objects: elements, text, attributes, processing instructions, comments, CDATA, etc.
The DOM type of the DOM node has nothing to do with what just happens to be stored in the type attribute of an element node. That's for you to get for yourself, using regular attribute accessing syntax.
I guess you want the type from the schema of the XML instance. It would be easier to get the type information while Xerces is parsing the file than after the XML has already been parsed and is available as a DOM tree. If this is an option for you, take a look here: get-schema-data-types-from-xerces. The answer at the link describes how to get access to the schema types while parsing the file with the Xerces SAX parser.
If this is not an option for you, you need to keep the (DOM) parser you used to load the XML and also get access to the grammar that was generated from the schema when validating the file. In the end that is much more effort.
Edit: Ok, after looking at the title of the question I'm more confused if you just want to get the type attribute or the schema type... However, if you want to get access to the attributes just use getAttributes and then getNamedItem to get the attribute you are looking for.
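The difference between the DOM node type and an attribute that happens to be called type can be sketched outside Xerces as well. The example below uses Python's standard xml.dom.minidom, which follows the same W3C DOM model, not the Xerces-C++ API; the sample document and the is_named_and_typed helper are hypothetical names chosen for the illustration.

```python
from xml.dom import minidom, Node

# Hypothetical document shaped like the <nodeName type="typeName"> in the title.
doc = minidom.parseString('<root><nodeName type="typeName">text</nodeName></root>')
a_node = doc.documentElement.firstChild


def is_named_and_typed(node, name, type_value):
    """True if node is an element called `name` whose `type` attribute equals type_value."""
    # nodeType is the DOM kind of the node (element, text, comment, ...).
    if node.nodeType != Node.ELEMENT_NODE:
        return False
    # The `type` attribute is ordinary data and is read through the attribute map.
    attr = node.attributes.getNamedItem("type")
    return node.nodeName == name and attr is not None and attr.value == type_value


print(a_node.nodeType == Node.ELEMENT_NODE)
print(is_named_and_typed(a_node, "nodeName", "typeName"))
```

The same two-step check (confirm the node is an element, then look up the attribute by name) is what getAttributes followed by getNamedItem gives you on a Xerces DOMNode.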
You need to translate the name from XMLCh* to char*
char* temp2 = XMLString::transcode(aNode->getNodeName());
std::cout << "The current node name is " << temp2 << std::endl;
// transcode allocates the returned string, so release it when done
XMLString::release(&temp2);

MarkLogic: Trying to understand error "Node has complex type with non-mixed complex content"

I'm getting this error during pipeline processing of an xml document, the processing does an xslt transform. It appears to be telling me that the document is in some way invalid, however the document passes validation against the xsd in Oxygen.
First, the error is not telling me the line number in the offending data file, just the line number in the pipeline xqy file, from what I can tell.
Second: the error is grammatically nonsensical to me. It seems to say that a node in the document is defined as a complex type, but that the content in the document is non-mixed. Why would that matter? Most content is non-mixed, right? As I see it, non-mixed content is the norm in most XML. Thanks.
The error can also occur when a function expects a simpler value as its argument but receives an element with a complex type.
Actually, searching in the archives at http://marklogic.markmail.org/ the error seems to be coming from fn:data() if it is passed 'too' complex values to put it briefly. I think the message is meant to say that the node that is being passed in doesn't have a typed value. See also here: http://www.w3.org/TR/xpath-functions/#func-data
If you provide the full error message, we might be able to help you out.
The document is likely valid, but it doesn't conform to expectations in your XSLT code. Without seeing code and document, my hunch is that the XSLT is expecting the matching document node to be an element (or similar) but it is an attribute or text node.