How to prevent the XSLT message function from emitting "Warning!"? - xslt

I am using Ant's xslt task plus a suitable XSLT script to remove certain nodes from XML documents that we are generating. To get some feedback during processing I added a few message statements to the script. These all work fine, except that ALL emitted messages have a ": Warning!" prefix. Since the messages are informative only, I want to get rid of this Warning prefix so that users are not alarmed into thinking that something is wrong.
How can I avoid that prefix? The XSLT message instruction seems to have no option other than terminate="yes"|"no". Can one somehow control that message prefix? And if so, how?
I am using the default xslt task built into ant, i.e. my target reads:
<xslt in="${source.xml}" out="${output.xml}" style="${stylesheet.xsl}" processor="trax">
</xslt>
with the various properties set to point to the appropriate locations. I found that command in another Stack Overflow post.
My ant version reads:
C:\Users\mmo>ant -version
Apache Ant(TM) version 1.8.2 compiled on December 20 2010

I think you're probably using the default Xalan processor, and I'm afraid I can't advise you whether/how its xsl:message processing can be customized.
If you were to switch to using Saxon you would not only get the benefit of doubled productivity by use of XSLT 2.0 instead of 1.0, but you would also get (a) an interface for customising xsl:message output, and (b) membership of an active user community that can answer questions like this.
(Actually, now I think about it, I seem to recall that with Xalan, xsl:message output is sent to the warning() method of the registered ErrorListener, so if you don't want to switch you could try and write an ErrorListener.)

Related

eXist-DB transformation failure with XSLT - where to find error log?

Environment: eXist 4.2.1 - xquery 3.1 - xslt 3.0 - TEI-XML document
Using the eXide interface, I am attempting to do a transformation of a TEI-XML document with an XSL file, with an output of HTML.
Until now I have been developing XML documents and their XSL transformations in Oxygen. Running the transformations in Oxygen or from a terminal has been error free in both cases. Now I am preparing a web application using eXist (which will contain the thousands of TEI-XML docs).
I am trying to simply fire off the same transformation in eXist with the following xquery test:
let $result := transform:transform(doc("xmldb:exist://db/apps/deheresi/resources/documents/ms609_0001.xml"), doc("xmldb:exist://db/apps/deheresi/resources/documents/document_style.xsl"), ())
return $result?output
eXide returns me only this:
exerr:ERROR Unable to set up transformer: Stylesheet compilation failed: 62 errors reported [at line 3, column 16]
I'm new at eXist DB and have not been able to figure out how to get the reasons for errors.
How do I access the error details (detail log?) in eXist? (I have searched without success my books and online documentation; for example https://exist-db.org/exist/apps/doc/xsl-transform doesn't help at all on errors).
For Oxygen and terminal transformations, I use Saxon 9he. I understand that eXist uses the same?
NB: my documents are all organized in an eXist collection identical to the setup on my computer, thus all relative locations should function correctly?
First, when using the doc and collection functions with paths in the database, you don't need the XML:DB URI; you can just use:
transform:transform(doc("/db/apps/deheresi/resources/documents/ms609_0001.xml"),
doc("/db/apps/deheresi/resources/documents/document_style.xsl"), ())
The errors should be in exist.log; the default location for that is $EXIST_HOME/webapp/WEB-INF/logs. You might otherwise find them on the "Standard Out" of the terminal session which is running eXist-db.
If you are using the YAJSW (Service Wrapper) to run eXist-db you might also need to check $EXIST_HOME/tools/yajsw/logs.

How to tell Liquibase to ignore a db.changelog*.xml?

I would like liquibase to create a set of unit testing functions ONLY if the database is being created in a DEV environment.
I know I could create a "changeset" tag with a "context" attribute for every unit test function but I'd like to avoid that if possible.
What would be ideal is using "context" with the "includeAll" tag, like:
<includeAll path="./sql/UnitTest/" context="dev" />
but sadly that is not supported.
OR since I have several changelogs:
db.changelog.xml
include db.changelog-tables.xml
include db.changelog-functions.xml
...
include db.changelog-unit_test_functions.xml
If I could tell LiquiBase to skip running "db.changelog-unit_test_functions.xml" based on a command line parameter that would also work.
However, the "context" attribute is not allowed in the "include" element.
<include file="./sql/db.changelog-unit-test.xml" context="dev" />
I tried to attach a "preconditions" test to db.changelog-unit-test.xml but that fails ALL db.changelogs execution.
Does anyone have any clever ideas on how I can avoid writing a granular db.changelog-unit-test.xml?
Thanks!
The context attribute on include and includeAll is supported from Liquibase 3.5 onward.
Pay attention to the XSD definition in your file: you need to reference at least http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.5.xsd there, otherwise your file will not validate even when running a version newer than 3.5.

How to break caching on exist-db of included XSLs in Transform

I have a large set of XSLs that we recently went through and implemented a shared XSL template with common bits. We included an xsl:include in all the main XSLs now to pull these in. We had no issues at first until we started to make changes to the shared XSL.
For information, the whole system is web based, calling queries to dynamically format documents in the database given different XSLs through XSL FO and RenderX.
The main transform is:
let $fo := util:expand(transform:transform($articles, doc("/db/Customer/data/edit/xsl/Custbatch.xsl"), $parameters))
That XSL (Custbatch.xsl) has:
<xsl:include href="Custshared.v1.xsl"/>
If we make an edit to "Custshared.v1.xsl", it is not reflected in the result, because "Custshared.v1.xsl" is evidently being cached and the cached copy is used. We know this because, as you can see, the name now includes "v1". If we make a change and update all the references from, say, v1 to v2, it all works. But this seems a bit ridiculous, as it means we have to change the 18 XSLs that include this XSL or do something silly like restart the database.
So, what am I missing in the setup or in controller.xql (which has the following for all unmatched paths) to stop this caching? I assume the caching is all internal, so this setting likely does not matter. Is there some other setting in the config that does?
<dispatch xmlns="http://exist.sourceforge.net/NS/exist">
<cache-control cache="no"/>
</dispatch>
In reading the document here: http://exist-db.org/exist/apps/doc/xsl-transform.xml, it states:
"The stylesheet will be compiled into a template using the standard Java APIs (javax.xml.transform). The template is shared between all instances of the function and will only be reloaded if modified since its last invocation."
However, if I change an included XSL, it is not being used.
Update #1
I even went as far as creating a query that returns the XSL that is included, then I use:
<xsl:include href="http://localhost/get-include-xsl.xq"/>
This does work, as the formatting is not broken, but changing the underlying XSL yields the same result. So even that XQuery result is cached.
Update #2
And yes, through some simple test all is proven.
If I make any change to the root template (like add a meaningless space) and run, it does include the changes made in the include. If I only change the included XSL, no changes happen.
So, lacking anything else, we could always write an XQuery that basically touches all the main templates after a change is made to the included template. That seems so wrong as a workaround.
Update #3
So the workaround we are currently using is an unused "variable" in the XSL (version): when we update the shared template, we execute a query which basically updates the value of that variable. At least it's only one XQuery, and maybe we should attach it to a trigger.
There is a setting in $exist-db-root$/conf.xml for the XSL transformer where you can turn off caching: <transformer class="net.sf.saxon.TransformerFactoryImpl" caching="no"> (The default is 'yes')

Cannot include sub xsl files within main xsl file based on if statement [duplicate]

The <xsl:import> and <xsl:include> elements seem to behave quite restrictively.
What I am trying to do:
<xsl:import href="{$base}/themes/{/settings/active_theme}/styles.xsl" />
I want to allow loading different themes for my application. I have a setting in my app which stores the "currently active theme" folder name in an XML node.
Unfortunately the code above won't work.
Does anybody know about a workaround to achieve what I want to do?
edit:
Just confirmed with an XSLT guru via Twitter... there's no nice way of doing this. The easiest solution in my case will probably be to separate the frontend and backend stylesheets and load them individually into the XSLTProcessor...
xsl:import assembles the stylesheet prior to execution. The stylesheet can't modify itself while it is executing, which is what you are trying to achieve.
If you have three variants of a stylesheet for use in different circumstances, represented by three modules A.xsl, B.xsl, and C.xsl, then instead of trying to import one of these into the module common.xsl that contains all the common code, you need to invert the structure: each of A.xsl, B.xsl, and C.xsl should import common.xsl, and you should select A.xsl, B.xsl, or C.xsl as the principal stylesheet module when initiating the transformation.
What I am trying to do:
<xsl:import href="{$base}/themes/{/settings/active_theme}/styles.xsl" />
This isn't allowed in any version (1.0, 2.0, or 3.0) of XSLT.
In XSLT 2.0 (and up) one may use the use-when attribute, but the conditions that may be specified are very limited.
One non-XSLT solution is to load the importing XSLT stylesheet as an XmlDocument and use the DOM API to set href attribute to the really wanted value -- only then invoke the transformation.
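Since the rest of this page is about Node.js, here is a minimal sketch of that "patch the href before transforming" idea in JavaScript. It is deliberately simplified: it rewrites the stylesheet source as a string instead of going through a DOM API, so it only suits a stylesheet you control with a single, plainly written xsl:import. The function name setImportHref, the placeholder.xsl href and the activeTheme value are all hypothetical, and the actual transformation call is left out because it depends on your XSLT processor.

```javascript
// Hypothetical helper (not from the original answers): returns a copy of the
// stylesheet source in which the href of the first <xsl:import> is replaced.
function setImportHref(stylesheetSource, href) {
  // Escape characters that must not appear raw in a double-quoted attribute value.
  const escapedHref = href
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/"/g, '&quot;');
  // A function replacement is used so that "$" in the href is taken literally.
  return stylesheetSource.replace(
    /(<xsl:import\b[^>]*\bhref=)(["'])[^"']*\2/,
    (match, prefix) => `${prefix}"${escapedHref}"`
  );
}

// Assumed inputs: the stylesheet ships with a placeholder href, and the theme
// name comes from the application's settings.
const stylesheetSource =
  '<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">' +
  '<xsl:import href="placeholder.xsl"/>' +
  '</xsl:stylesheet>';
const activeTheme = 'dark';
const resolved = setImportHref(stylesheetSource, `themes/${activeTheme}/styles.xsl`);
console.log(resolved);
```

The patched source in resolved is what you would then hand to the processor. For anything less predictable than a single fixed placeholder, a real XML parser is the safer choice.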

Preventing XSS in Node.js / server side javascript

Any idea how one would go about preventing XSS attacks on a node.js app? Are there any libs out there that handle removing javascript in hrefs, onclick attributes, etc. from POSTed data?
I don't want to have to write a regex for all that :)
Any suggestions?
I've created a module that bundles the Caja HTML Sanitizer
npm install sanitizer
http://github.com/theSmaw/Caja-HTML-Sanitizer
https://www.npmjs.com/package/sanitizer
Any feedback appreciated.
One of the answers to Sanitize/Rewrite HTML on the Client Side suggests borrowing the whitelist-based HTML sanitizer in JS from Google Caja which, as far as I can tell from a quick scroll-through, implements an HTML SAX parser without relying on the browser's DOM.
Update: Also, keep in mind that the Caja sanitizer has apparently been given a full, professional security review while regexes are known for being very easy to typo in security-compromising ways.
Update 2017-09-24: There is also now DOMPurify. I haven't used it yet, but it looks like it meets or exceeds every point I look for:
Relies on functionality provided by the runtime environment wherever possible. (Important both for performance and to maximize security by relying on well-tested, mature implementations as much as possible.)
Relies on either a browser's DOM or jsdom for Node.JS.
Default configuration designed to strip as little as possible while still guaranteeing removal of javascript.
Supports HTML, MathML, and SVG
Falls back to Microsoft's proprietary, un-configurable toStaticHTML under IE8 and IE9.
Highly configurable, making it suitable for enforcing limitations on an input which can contain arbitrary HTML, such as a WYSIWYG or Markdown comment field. (In fact, it's the top of the pile here)
Supports the usual tag/attribute whitelisting/blacklisting and URL regex whitelisting
Has special options to sanitize further for certain common types of HTML template metacharacters.
They're serious about compatibility and reliability
Automated tests running on 16 different browsers as well as three different major versions of Node.JS.
To ensure developers and CI hosts are all on the same page, lock files are published.
All usual techniques apply to node.js output as well, which means:
Blacklists will not work.
You're not supposed to filter input in order to protect HTML output. It will not work or will work by needlessly malforming the data.
You're supposed to HTML-escape text in HTML output.
I'm not sure if node.js comes with a built-in for this, but something like this should do the job:
function htmlEscape(text) {
  // Escape & first so the entities produced below are not escaped again.
  return text.replace(/&/g, '&amp;')
    .replace(/</g, '&lt;') // it's not necessary to escape >
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#039;');
}
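To illustrate the "escape at output time" advice, here is a small sketch that applies the escaping automatically whenever a value is interpolated into markup, using a tagged template literal. The names escapeHtml and html are hypothetical helpers written for this example, not part of Node.js or of any library mentioned here, and this version also escapes > for good measure.

```javascript
// Hypothetical helper: escapes the characters that are significant in HTML
// text and in quoted attribute values.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical tagged template: the literal parts are treated as trusted
// markup, while every interpolated value is escaped before being inserted.
function html(strings, ...values) {
  return strings.reduce(
    (out, part, i) => out + part + (i < values.length ? escapeHtml(values[i]) : ''),
    ''
  );
}

// Untrusted input, e.g. a comment posted by a user.
const comment = '<img src=x onerror="alert(1)">';
const page = html`<p class="comment">${comment}</p>`;
console.log(page);
// prints: <p class="comment">&lt;img src=x onerror=&quot;alert(1)&quot;&gt;</p>
```

This only covers text and quoted attribute contexts. Values placed inside script blocks, URLs or CSS need context-specific handling, which is where the sanitizer libraries discussed in the other answers come in.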
I recently discovered node-validator by chriso.
Example
get('/', function (req, res) {
//Sanitize user input
req.sanitize('textarea').xss(); // No longer supported
req.sanitize('foo').toBoolean();
});
XSS Function Deprecation
The XSS function is no longer available in this library.
https://github.com/chriso/validator.js#deprecations
You can also look at ESAPI. There is a javascript version of the library. It's pretty sturdy.
In newer versions of the validator module you can use the following script to prevent XSS attacks:
var validator = require('validator');
var escaped_string = validator.escape(someString);
Try out the npm module strip-js. It performs the following actions:
Sanitizes HTML
Removes script tags
Removes attributes such as "onclick", "onerror", etc. which contain JavaScript code
Removes "href" attributes which contain JavaScript code
https://www.npmjs.com/package/strip-js
Update 2021-04-16: xss is a module used to filter input from users to prevent XSS attacks.
Sanitize untrusted HTML (to prevent XSS) with a configuration specified by a Whitelist.
Visit https://www.npmjs.com/package/xss
Project Homepage: http://jsxss.com
You should try the npm library "insane".
https://github.com/bevacqua/insane
I have used it in production and it works well. It is very small (around 3 kB gzipped).
Sanitizes HTML
Removes all attributes or tags that evaluate JS
You can allow attributes or tags that you don't want sanitized
The documentation is very easy to read and understand.