How can I share match patterns between keys? - xslt

I have two keys with the same match pattern. The pattern is long. The pattern itself doesn't matter; the problem is the long duplication:
<xsl:key name="narrow-things-by-columnset" match="p | p-cont |
heading[not(parent::section or parent::contents) and not(parent::p)] |
language-desc | country-desc | graphic[not(parent::section or parent::contents)] |
block-quote | bulleted-list | blank-line |
bibliography | language-name-index | language-code-index | country-index | table-of-contents"
use="sileth:columnset-id(.)"/>
<!-- TODO: DRY: I would love to be able to share the above match pattern instead of
duplicating it. -->
<xsl:key name="narrow-things-by-section" match="p | p-cont |
heading[not(parent::section or parent::contents) and not(parent::p)] |
language-desc | country-desc | graphic[not(parent::section or parent::contents)] |
block-quote | bulleted-list | blank-line |
bibliography | language-name-index | language-code-index | country-index | table-of-contents"
use="sileth:section-id(.)"/>
The DRY principle reminds us that when we have duplication of data, we run into problems keeping the multiple copies synchronized. Indeed, that just happened to me, causing a bug that took a while to track down.
So I would like to be able to share a single, common match pattern between the two keys. AFAIK you can't do that using a variable. Is there some other way to do it?

I'd define a general entity with the pattern, and refer to it from the two locations. So the stylesheet would begin
<!DOCTYPE xsl:stylesheet [
<!ENTITY match-elements "p | p-cont
| heading[not(parent::section or parent::contents)
and not(parent::p)]
| language-desc | country-desc
| graphic[not(parent::section or parent::contents)]
| block-quote | bulleted-list | blank-line
| bibliography | language-name-index | language-code-index
| country-index | table-of-contents">
]>
<xsl:stylesheet ...>
...
And the two key uses would be:
<xsl:key name="narrow-things-by-columnset"
match="&match-elements;"
use="sileth:columnset-id(.)"/>
<!-- DONE: DRY: Isn't it nice to be able to share the above
match pattern instead of duplicating it?
Hooray for general entities! -->
<xsl:key name="narrow-things-by-section"
match="&match-elements;"
use="sileth:section-id(.)"/>

How about a two-level hierarchy of keys, like so?
<xsl:key name="narrowable-things" match="p | p-cont |
heading[not(parent::section or parent::contents) and not(parent::p)] |
language-desc | country-desc | graphic[not(parent::section or parent::contents)] |
block-quote | bulleted-list | blank-line |
bibliography | language-name-index | language-code-index | country-index | table-of-contents"
use="'universe'"/>
<xsl:key name="narrow-things-by-columnset" match="key('narrowable-things','universe')" use="sileth:columnset-id(.)"/>
<xsl:key name="narrow-things-by-section" match="key('narrowable-things','universe')" use="sileth:section-id(.)" />


Tables in R markdown

I would like to create a table by hand in R Markdown, with a cell in the first column spanning several rows.
I tried the following code, but it did not work:
Authority | Responsibility | Period
:----- | :---- | :-----
MOIWR | Text 1 | 2010
^^ | Text 2 | 2011
^^ | Text 3 | 2012
IWC | Text 4 | 2013
SGB | Text 5 | |
Could you please help me figure out how to do that?
Pandoc, the converter used in R Markdown, does not yet support Markdown tables with cells spanning multiple rows and/or columns. A good workaround is to write the table in HTML and to parse it in a Lua filter.
The following filter detects HTML tables and makes sure they can be converted to different output formats:
function RawBlock (raw)
if raw.format:match 'html' and raw.text:match '^%s*%<table' then
return pandoc.read(raw.text, 'html').blocks
end
end
Use the filter like this:
---
output:
  html_document:
    pandoc_args:
      - '--lua-filter=html-table.lua'
---
``` {=html}
<table>
<tr>
<td>column 1</td>
<td>column 2</td>
</tr>
<tr>
<td colspan="2">column 1 and 2</td>
</tr>
</table>
```

How to get documents that are in more than one collection in XSLT?

let $stylesheet := "abc.xsl"
let $params := map:map()
let $_ := map:put ($params,"col1","abc")
return
xdmp:xslt-invoke(
$stylesheet, (), $params,
<options xmlns="xdmp:eval">
<template>a:schema</template>
</options>)
abc.xsl
<xsl:template name="a:schema">
<xsl:param name="collection-uri" as="xs:string" select="$col1"/>
<xsl:apply-templates select="collection($collection-uri)"/>
</xsl:template>
Currently, this processes every document in the collection "abc".
But I want to put more than one collection in the $params map, so that only the documents that are in both collection "abc" and collection "def" are selected.
For example:
| Document | Collections |
|:---------|:-----------:|
| Doc1 | abc, def |
| Doc2 | abc |
| Doc3 | abc, def |
It should pick Doc1 and Doc3.
collection() accepts a sequence of xs:string, but would return any of the documents in either of the collections specified.
If you want only the docs that are in all of the collections specified, you could use cts:search() with a sequence of cts:collection-query() inside of a cts:and-query().
<xsl:template match="/">
<xsl:param name="collection-uri" as="xs:string*" select="$col1"/>
<xsl:apply-templates select="cts:search(doc(), cts:and-query(( $collection-uri ! cts:collection-query(.) )))"/>
</xsl:template>
Enable the 1.0-ml dialect, so that you can use the cts built-in functions by adding the following attribute to your xsl:stylesheet element:
xdmp:dialect="1.0-ml"
The $collection-uri param is declared as xs:string, so it will only have one string value. You could change that to be a sequence of strings with either * or + quantifier:
<xsl:param name="collection-uri" as="xs:string*" select="$col1"/>
and then set the collections on the $col1 param:
let $_ := map:put ($params,"col1", ('abc', 'def'))

REGEX XML question - Finding value of a name value pair

I have always tried to run quick regex queries on XML tags. Yes, I have been told it is not a good idea and that you should load the XML into an object, but sometimes, as in this case, it is stored in an Oracle DB BLOB. I need to get the VALUE for a specific NAME in name/value pair XML like this:
<entry>
<string>NAME</string>
<string>VALUE</string>
</entry>
Is there a way to do this with a regex?
You should parse XML using an XML Parser and in Oracle you can use XMLQUERY:
SELECT XMLQUERY(
'/root/entry/string[1][text()="NAME2"]/../string[2]/text()'
PASSING XMLTYPE( xml, NLS_CHARSET_ID('UTF8') )
RETURNING CONTENT
) AS value
FROM table_name;
Or XMLTABLE:
SELECT value
FROM table_name
CROSS APPLY XMLTABLE(
'/root/entry'
PASSING XMLTYPE( xml, NLS_CHARSET_ID('UTF8') )
COLUMNS
name VARCHAR2(20) PATH './string[1]',
value VARCHAR2(20) PATH './string[2]'
)
WHERE name = 'NAME2';
Which for the sample data:
CREATE TABLE table_name ( xml BLOB );
DECLARE
value CLOB := '<root>
<entry>
<string>NAME1</string>
<string>VALUE1</string>
</entry>
<entry>
<string>NAME2</string>
<string>VALUE2</string>
</entry>
<entry>
<string>NAME3</string>
<string>VALUE3</string>
</entry>
<entry>
<string>NAME4</string>
<string>VALUE4</string>
</entry>
</root>';
dest_offset INTEGER := 1;
src_offset INTEGER := 1;
lang_context INTEGER := DBMS_LOB.DEFAULT_LANG_CTX;
result BLOB;
warning INTEGER;
warning_msg VARCHAR2(50);
BEGIN
DBMS_LOB.CreateTemporary(
lob_loc => result,
cache => TRUE
);
DBMS_LOB.CONVERTTOBLOB(
dest_lob => result,
src_clob => value,
amount => LENGTH( value ),
dest_offset => dest_offset,
src_offset => src_offset,
blob_csid => DBMS_LOB.DEFAULT_CSID,
lang_context => lang_context,
warning => warning
);
INSERT INTO table_name ( xml ) VALUES ( result );
END;
/
Both queries output:
| VALUE |
| :----- |
| VALUE2 |
Can you do it with a regular expression? Yes:
SELECT REGEXP_SUBSTR(
TO_CLOB( xml ),
'<entry>\s*<string>NAME2</string>\s*<string>([^<]*)</string>\s*</entry>',
1,
1,
'c',
1
) AS value
FROM table_name
Which outputs:
| VALUE |
| :----- |
| VALUE2 |
However, you shouldn't: the XML parsing functions take an XPath expression that specifies where in the document to look for the data, whereas the regular expression just treats the value as a string and returns the first match, even if it is not in the expected place in the XML hierarchy.
For example, if your data is:
<root>
<entry>
<string>NAME1</string>
<string>VALUE1</string>
<other><entry><string>NAME2</string><string>NOT THIS</string></entry></other>
</entry>
<entry>
<string>NAME2</string>
<string>VALUE2</string>
</entry>
</root>
Then XMLQUERY and XMLTABLE will find the correct value but the regular expression outputs:
| VALUE |
| :------- |
| NOT THIS |
Or, if your data suddenly has an attribute:
<root>
<entry>
<string>NAME1</string>
<string>VALUE1</string>
</entry>
<entry>
<string>NAME2</string>
<string attr="attr value">VALUE2</string>
</entry>
</root>
Then parsing with the regular expression will fail and return NULL.
So, don't use a regular expression, use a proper XML parser.

XSLT 2.0 / XPATH - choose-when testing node

In XPath under XSLT 2.0, I am unclear as to why an xsl:choose/xsl:when @test isn't working.
When I run this template testing for the element tei:del[@rend='expunctus'], the test DOES NOT return the result:
<xsl:template match="tei:del[@rend='expunctus'] |
tei:gap |
tei:sic |
tei:supplied[@reason='added'] |
tei:surplus[@reason='repeated' or @reason='surplus'] |
tei:unclear">
<xsl:choose>
<xsl:when test="tei:del[@rend='expunctus']">
[<xsl:text>EXPUNCTUS</xsl:text>]
</xsl:when>
</xsl:choose>
</xsl:template>
When I run this template with just the attribute @rend='expunctus' as the test, the test DOES return the result:
<xsl:template match="tei:del[@rend='expunctus'] |
tei:gap |
tei:sic |
tei:supplied[@reason='added'] |
tei:surplus[@reason='repeated' or @reason='surplus'] |
tei:unclear">
<xsl:choose>
<xsl:when test="@rend='expunctus'">
[<xsl:text>EXPUNCTUS</xsl:text>]
</xsl:when>
</xsl:choose>
</xsl:template>
Is this because of the current node already selected?
I prefer to test against the element, not just the attribute, to eliminate possible ambiguity.
Thanks.
Yes, it is because of the current node selected.
Your template matches tei:del[@rend='expunctus'] (amongst other things), so when you do <xsl:when test="tei:del[@rend='expunctus']">, the test is evaluated relative to the node you have matched: it looks for another tei:del as a child of the current node.
What you probably need to do is this...
<xsl:when test="self::tei:del[@rend='expunctus']">
Alternatively, consider using separate templates for each possible node and putting any shared code in a named template.
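A minimal sketch of that alternative, assuming the usual TEI namespace URI. The named template `bracketed`, its `label` parameter, and the 'GAP' label are hypothetical, chosen only for illustration:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="xs tei">

  <!-- One template per case: the match pattern does the testing,
       so no xsl:choose is needed. -->
  <xsl:template match="tei:del[@rend='expunctus']">
    <xsl:call-template name="bracketed">
      <xsl:with-param name="label" select="'EXPUNCTUS'"/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template match="tei:gap">
    <xsl:call-template name="bracketed">
      <!-- 'GAP' is a made-up label for illustration -->
      <xsl:with-param name="label" select="'GAP'"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Shared output code (hypothetical name and parameter). -->
  <xsl:template name="bracketed">
    <xsl:param name="label" as="xs:string" required="yes"/>
    <xsl:text>[</xsl:text>
    <xsl:value-of select="$label"/>
    <xsl:text>]</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

With this layout each match pattern already identifies the element, so there is no ambiguity about which node a test refers to.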

Flexible XSL Date / DateTime Transformation

We're using Tibco BusinessWorks to pass an XML document to a Tibco BusinessEvents process. Because BusinessEvents does not have a DATE format, only DATETIME, we must change Dates in the source XML document before sending to BusinessEvents, and map the response's DateTime values back to simple Dates.
This is both annoying and cumbersome.
In an attempt to improve BusinessWorks' performance, I'm writing a stylesheet to handle the mapping. Here's what I've got.
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:inf="http:/the.company.namespace">
<xsl:template match="node()">
<xsl:copy>
<xsl:apply-templates select="node()" />
</xsl:copy>
</xsl:template>
<xsl:template match="inf:PriorExpirationDate | inf:OrderDate | inf:SegmentEffectiveDate |
inf:SegmentExpirationDate | inf:CancelDate | inf:NewBusinessEffectiveDate |
inf:NewBusinessExpirationDate | inf:RenewalEffectiveDate | inf:RenewalExpirationDate |
inf:QuestionDate | inf:ViolationDate | inf:ConvictionDate |
inf:EffectiveDate | inf:RatingDate | inf:AdvanceDate |
inf:SIDRevisionDate | inf:DriverLicensedDate |
inf:ESignatureDate | inf:UploadDate | inf:CancelDate |
inf:CancelProcessedDate | inf:CancelEffectiveDate | inf:CreatedDate |
inf:QuoteCreationDate | inf:QuoteModifiedDate | inf:QuoteExpirationDate |
inf:RateStartDate | inf:RateEndDate | inf:ChangeEffectiveDate | inf:PostDate |
inf:EffectiveDate | inf:ExpirationDate | inf:BirthDate |
inf:InstallmentDueDate | inf:CommercialDriverLicenseDate ">
<xsl:element name="{name()}">
<xsl:value-of
select="concat(format-date(text(),'[Y0001]-[M01]-[D01]'), 'T00:00:00')" />
</xsl:element>
</xsl:template>
While functional, this is not ideal. I'd prefer NOT to have to enumerate each element I need transformed; I'd rather specify a TYPE that needs to be converted.
(1) Does XSL offer this functionality ?
(2) Alternatively, there are some element names that may be DATEs in one location and DATETIMEs in others. Is there an efficient way of excluding some DATETIME elements if I know the parent by name ?
(3) Lastly, does anyone see room for enhancement beyond the scope of the question ?
Some context: The original mapping is done inside BusinessWorks' editor, where the GUI generates its own mapping file, a several-hundred-line series of if/then/else statements. For a 50k document (our average) this amounts to nearly 20ms overhead per transformation for a web service that completes its actual work in fewer than 50ms. This is the bottleneck that must be improved upon.
Just use:
<xsl:template match="*[. castable as xs:date]">
<!-- Your code here -->
</xsl:template>
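Filled in, that template might look like the sketch below. It is only a sketch under some assumptions: the `not(*)` predicate (restricting the match to leaf elements) and the `inf:Audit` parent used to show an exclusion by parent name are not from the question, and the identity template here also copies attributes, which the original `match="node()"` version does not.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:inf="http:/the.company.namespace">

  <!-- Identity template: copy everything through unchanged. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Any leaf element whose content is a valid xs:date is rewritten
       as a dateTime at midnight. Elements that already hold a dateTime
       are not castable as xs:date, so they fall through to the
       identity template. inf:Audit is a hypothetical parent whose
       children should be left alone. -->
  <xsl:template match="*[not(*)][. castable as xs:date][not(parent::inf:Audit)]">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <!-- Casting xs:date to xs:dateTime yields T00:00:00 -->
      <xsl:value-of select="xs:dateTime(xs:date(.))"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Because the test is on the value rather than the name, the same element name can carry a date in one place and a dateTime in another without extra rules.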