One large XSLT over smaller, more granular ones

We have one large XSLT that renders a whole shop area, including products and manufacturers, and does filtering by price and category on top of that.
I'm using Sitecore as a CMS and I'm having problems with caching. I have about 9,000 items, and some pages take as long as 20s to render.
Would it be better to split the XSLT into smaller parts? Does that improve speed?
I think the XSLT engine Sitecore uses is called Nexus.
Update:
I think we need to optimise the XSLT. Even though there are only about 9k items, the Sitecore profiler showed we're actually traversing about 250k items while doing various checks.

You will probably get better performance by applying other changes than splitting the XSLT file. Without seeing the XSLT it is hard to spot bottlenecks, but you will find some best practices for XSLT performance here:
http://www.dpawson.co.uk/xsl/sect4/N9883.html#d15756e150
In addition, it might be very helpful to use an XSLT profiler in that case.
Some performance tricks also depend on the engine that you are using, so some additional information might be useful here as well.
If you could post your XSLT code, I might be able to help you find possible bottlenecks.
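One pattern worth checking in the meantime: filtering the full item set once per output row forces the engine to rescan everything repeatedly, which would match the "250k items traversed" symptom. An xsl:key turns that scan into an indexed lookup. A minimal sketch, with hypothetical element and attribute names:

<!-- Before: rescans every product node for each category rendered -->
<xsl:for-each select="//product[@category = current()/@id]">
  ...
</xsl:for-each>

<!-- After: declare the index once, at the top level of the stylesheet -->
<xsl:key name="products-by-category" match="product" use="@category"/>

<!-- ...then look products up by value instead of scanning -->
<xsl:for-each select="key('products-by-category', @id)">
  ...
</xsl:for-each>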

Sounds like the problem is with Sitecore, not XSLT (I've done faster transforms against tens of thousands of rows), but I'd advise splitting generally to enable code reuse.

Separating one huge rendering into smaller ones will help if you use Sitecore caching. Having multiple renderings will allow you to apply individual cache settings to each.
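As a rough sketch of what that enables, assuming the renderings are bound via Sitecore's sc:XslFile web control (the paths and vary-by choices here are hypothetical), each part can then be cached on its own terms:

<sc:XslFile runat="server" Path="/xsl/ProductList.xslt"
    Cacheable="true" VaryByData="true" />
<sc:XslFile runat="server" Path="/xsl/PriceFilter.xslt"
    Cacheable="true" VaryByQueryString="true" />

A stable product list can then be served from cache while only the filter rendering is re-evaluated per request.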

There are two different issues here:
1. Separating XSLT files for better readability, maintainability and code reuse
2. Making performance improvements to your XSLT transformations
The first should be done as a best practice; the latter should take care of the extended rendering times you are getting.

Definitely use XSLTs as small as makes sense. That's just good practice and can't hurt performance.

Related

Fastest XSLT 1.0 and XSLT 2.0 processors in terms of performance

Which XSLT processor would you suggest for XSLT 1.0 and XSLT 2.0 respectively, in terms of performance? Say one has a huge XML file and would like to make the conversion faster. Does the language the processor is implemented in play a very important role?
For instance, I would like to try a Go implementation, if anyone can suggest one, or a C processor.
To be clear: I have built things in both XSLT 1.0 and XSLT 2.0, so I need a way to make the process faster under each specification.
The implementation language for the processor isn't a very important factor. Sure, different languages pose different challenges: if you write an XSLT processor in C (or any other language without garbage collection) then you have to put a lot of effort into memory management, but that affects the cost and effort of writing the processor more than the performance it ends up achieving.
Performance isn't one dimensional. Different processors will compare differently depending on the actual stylesheet. In addition, it depends what you measure: processor start-up time, stylesheet compile time, source document parsing/building time, single-threaded performance or multi-threaded performance. There's no substitute for making your own measurements tailored to your particular workload.
If the performance of your workload isn't meeting requirements, then seeing how it fares on a different XSLT processor is one of the actions you can take to try and solve the problem. But there are many other actions you can take as well, and some of them may be less disruptive. Before trying a different XSLT processor, check to make sure that XSLT processing is actually a critical component of your overall system performance: I've seen plenty of cases where I've been called in to look at perceived XSLT performance issues and it turned out on investigation that the issues were nothing to do with XSLT processing. And where it is XSLT processing, sometimes a simple change to your XSLT code, or to the way you run the XSLT processor, will make a vast saving.

xsl:value-of vs xsl:apply-templates: which is faster?

Looking at all the examples of XSL, I've noticed that some people use xsl:value-of while others use xsl:apply-templates. Which method is faster than the other, and why?
Firstly, they don't do the same thing, so it's not a simple choice based on performance.
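A small illustration, with made-up element names: xsl:value-of emits the string value of the selected node, flattening any markup inside it, whereas xsl:apply-templates walks into the node and lets whatever template rules match decide the output.

<!-- source -->
<para>Plain text with <emphasis>inline markup</emphasis> inside.</para>

<!-- value-of: outputs "Plain text with inline markup inside." -->
<xsl:value-of select="para"/>

<!-- apply-templates: the <emphasis> child is rendered by its own rule -->
<xsl:apply-templates select="para"/>

<xsl:template match="emphasis">
  <em><xsl:apply-templates/></em>
</xsl:template>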
Secondly, you can't ask questions about the performance of language constructs except in the context of a specific processor. There are a dozen popular XSLT processors and another dozen that are less well known, and they all perform differently.
Thirdly, if anyone tells you that one is faster than the other, they probably won't tell you all the configuration variables that you need to reproduce their measurements, and if they do, your own circumstances will probably be quite different.
So the best thing to do is to measure it and see, and tell us what results you obtained.

MarkLogic XSLT performance

I have an XSLT that I'm executing via the xdmp:invoke() function, and I'm running into very long processing times before seeing any result (in some instances timing out completely after the maximum timeout of 3600s is reached). This XSLT runs in approximately 5 seconds in the Oxygen editor. Some areas I think may be impacting performance:
The XSLT produces multiple output files, using xsl:result-document. The MarkLogic XSLT processor outputs these as result XML nodes, as it cannot physically save these documents to a file system.
The XSLT builds variables that contain XML nodes, which are then processed by other template calls. At times these variables can hold a large set of XML nodes.
I've done some profiling on the XSLT and it seems that building the variables is the most time-consuming part of the execution. I'm wondering why that's the case, and why it runs a lot faster on the Saxon processor.
Any insight is much appreciated.
My understanding is that there are some XSLT performance optimizations that are difficult or impossible to implement in the context of a database in comparison to a filesystem. Also, Saxon is the industry leader in XSLT and is significantly faster than almost anything on the market, although that probably doesn't account for the large discrepancy you describe.
You don't say which version of MarkLogic you're running, but version 8.0 has made significant improvements in XSLT performance. A few simple tests I ran suggested 3-4x speed improvement, depending on the XSLT.
I have run into some rare but serious performance edge cases for XSLT when running MarkLogic on Windows. Linux and OS X builds don't appear to have this problem. It is also far more pronounced when the XSLT tasks are running on multiple threads.
It is possible, however, to save data directly to the filesystem instead of the database using xdmp:save.
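For instance, since MarkLogic lets stylesheets call its built-in functions directly, a result that would otherwise come back as an in-memory node can be written out as it is produced. A sketch, assuming the xdmp namespace declaration and a hypothetical output path:

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xdmp="http://marklogic.com/xdmp">
  <xsl:template match="/">
    <xsl:variable name="report">
      <report><xsl:apply-templates select="//order"/></report>
    </xsl:variable>
    <!-- write straight to the filesystem instead of returning the node -->
    <xsl:sequence select="xdmp:save('/space/output/report.xml', $report)"/>
  </xsl:template>
</xsl:stylesheet>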
Unless your XSLTs involve very complex templating rules, I would recommend at least testing some of the performance-sensitive XSLT logic in XQuery. It may be possible to port the slowest parts and pass the results of those queries to the XSLT. It's not ideal, but you might be able to achieve acceptable performance without rewriting the XSLTs.
Another idea, if the problem is simply the construction of variables in a multi-pass XSLT, is to break the XSLT into multiple XSLTs and make multiple calls to xdmp:xslt-invoke from XQuery. However, I know there is some overhead to making an xdmp:xslt-invoke call, so it may be a wash, or it may be worse.
I have come across similar performance issues with stylesheets in ML 7. Come to think of it, I had stylesheets similar to the ones you have mentioned, i.e. variables holding sequences of nodes. It seems XSLT cannot be optimised as well as XQuery is. If you are not satisfied with the performance of your stylesheets, I would recommend converting the XSLT to its equivalent XQuery. I did this and achieved performance gains of about 1 to 1.5 seconds. It may be worth the effort :)
Well, in my case it seems that using the fn:not() function in template match rules was causing the slow performance. Perhaps if someone else is experiencing the same problem, this might be a good starting point.

What are the advantages of using XSL in Sitecore instead of C#?

While learning Sitecore I have found that the majority of Sitecore sample code on the web is in XSL instead of .NET.
What would be the advantage of choosing XSL over the processes I have become accustomed to as a .NET developer?
Are there processing speed advantages to using XSL?
Is XSL actually easier once you are comfortable with the syntax?
I'll just add my 2 cents too:
I find that there are too many limitations in XSLT that need to be overcome either with external "libraries" or by developing a method in C# that can be used from the XSLT.
So I find using ASP.NET simpler. But then I'm also a lot better with ASP.NET than with XSLT.
But XSLT has some good things:
good when getting fields from the current context item
good with simple content etc.
doesn't force the solution to recycle/rebuild
usually fails in a nice way, i.e. the page still works, and the XSLT that failed reports that it failed
When I first started working with Sitecore, my company used quite a bit of XSLT, but we've slowly moved away from that, because of its limitations and because most people here are more familiar with ASP.NET/C#.
Some folks prefer XSL because of existing team skill set, the availability of XSL talent, or the belief that XSL is easier or cheaper to learn.
In Sitecore, ASP.NET-based sublayouts actually perform much better than XSL renderings. If that's what you are comfortable with, go for it. I've never created an XSL rendering myself.
XSLT is a powerful language; its main advantages over languages like ASP.NET tend to come when you want to reuse and customize logic over a wide variety of different pages or different source document structures with common shared elements and other variable structures. To achieve this it uses a rule-based processing model which some people find quite difficult to get to grips with on first encounter. Learning it is an investment that will pay off over time, but it can be daunting at first.
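The reuse point is easiest to see with xsl:import, where a page-specific stylesheet pulls in the site-wide rules and overrides only what it needs; a small sketch with hypothetical file and element names:

<!-- base.xsl: shared, site-wide rendering rules -->
<xsl:template match="product-name">
  <h2><xsl:apply-templates/></h2>
</xsl:template>

<!-- compact.xsl: imports the shared rules and overrides just one -->
<xsl:import href="base.xsl"/>
<xsl:template match="product-name">
  <h3 class="compact"><xsl:apply-templates/></h3>
</xsl:template>

Every other rule from base.xsl continues to apply unchanged, which is exactly the kind of selective customization that is awkward to express in page-oriented frameworks.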
As for performance, I've never come across a site where it isn't fast enough for the job, and that includes some pretty high-stress services; when people have had performance problems they've usually turned out to be in other parts of the processing pipeline (or simply due to bad coding).
The choice between XSLT and .Net components in Sitecore is largely one of taste and skillset. XSLT in Sitecore does have some drawbacks though - it tends to be outperformed by .NET components for all but the most simple renderings and the places where it might seem most logical to use it, such as replicating content tree structure as a site menu, are actually those that tend to take the biggest performance hit. In the right situations XSLT is an incredibly powerful tool and well worth learning, but I've yet to see a convincing argument for making much use of it in Sitecore. It's also worth noting that some of the standard patterns of XSLT programming aren't the most efficient in Sitecore.
The only real advantage I can think of, would be that XSLT renderings are easier to deploy in isolation. Say, for instance, that you're updating your "News Spots" rendering and you want to deploy this change to test/production right away - it would be a simple case of uploading the .xsl file itself.
Using .NET development (and enduring the Web Application Project model), a deployment of the code base would implicitly deploy any and all changes to the affected assemblies - including whatever work you have in progress.
There are, of course, ways you can manage this. Source code branching/merging and so on - but that's an additional layer of complexity to your solution.
That being said, I use .NET for well over 95% of all my Sitecore development myself :-)
"In summary, a primary goal of software design and coding is conquering complexity. The motivation behind many programming practices is to reduce a program's complexity. Reducing complexity is a key to being an effective programmer." -Steve McConnell (1993)
Let that guide when to use XSLT over C#.

Memory-efficient XSLT Processor

I need a tool to execute XSLTs against very large XML files. To be clear, I don't need anything to design, edit, or debug the XSLTs, just execute them. The transforms that I am using are already well optimized, but the large files are causing the tool I have tried (Saxon v9.1) to run out of memory.
I found a good solution: Apache's Xalan C++. It provides a pluggable memory manager, allowing me to tune allocation based on the input and transform.
In multiple cases it is consuming ~60% less memory (I'm looking at private bytes) than the others I have tried.
You may want to look into STX for streaming-based XSLT-like transformations. Alternatively, I believe StAX can integrate with XSLT nicely through the Transformer interface.
It sounds like you're sorted - but often, another potential approach is to split the data first. Obviously this only works with some transformations (i.e. where different chunks of data can be treated in isolation from the whole) - but then you can use a simple parser (rather than a DOM) to do the splitting into manageable pieces, then process each chunk separately and reassemble.
Since I'm a .NET bod, things like XmlReader can do the chunking without a DOM; I'm sure there are equivalents for every language.
Again - just for completeness.
[edit re question]
I'm not aware of any specific name; maybe Divide and Conquer.
For an example; if your data is actually a flat list of like objects, then you could simply split the first-level children - i.e. rather than having 2M rows, you split it into 10 lots of 200K rows, or 100 lots of 20K rows. I've done this before lots of times for working with bulk data (for example, uploading in chunks of data [all valid] and re-assembling at the server so that each individual upload is small enough to be robust).
For what it's worth, I suspect that for Java, Saxon is as good as it gets, if you need to use XSLT. It is quite efficient (both CPU and memory) for larger documents, but XSLT itself essentially forces a full in-memory tree of the contents to be created and retained, except in limited cases. Saxon-SA (the for-fee version) supposedly has extensions that allow taking advantage of such "streaming" cases, so that might be worth checking out.
But the advice to split up the contents is the best one: if you are dealing with independent records, just split the input using other techniques (like, use StAX! :-) )
I have found that a custom tool built to run the XSLT using earlier versions of MSXML makes it very fast, but it also consumes incredible amounts of memory and will not actually complete if the input is too large. You also lose out on some advanced XSLT functionality, as the earlier versions of MSXML don't support the full XPath feature set.
It is worth a try if your other options take too long.
That's an interesting question. XSLT could potentially be optimized for space, but I expect all but the most obscure implementations start by parsing the source document into a DOM, which is bound to use a low multiple of the document size in memory.
Unless the stylesheet is specially designed to support a single-pass transformation, reasonable time performance would probably require parsing the source document into a disk-based hierarchical database.
I do not have an answer, though.
It appears that Saxon 9.2 may provide an answer to your problem. If your document can be transformed without using predicates (i.e. it does not reference any siblings of the current node), you may be able to use streaming XSLT.
See this link
I have not tried this myself, I am just reading about it. But I hope it works.
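For reference, this idea became standard streaming in XSLT 3.0, which current Saxon releases support; the 9.2-era extension syntax differed. A minimal sketch with hypothetical element names:

<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- ask the processor to evaluate the default mode in one forward pass -->
  <xsl:mode streamable="yes"/>
  <!-- each <record> is processed and then discarded, so memory stays flat -->
  <xsl:template match="record">
    <row id="{@id}"/>
  </xsl:template>
</xsl:stylesheet>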
Are you using the Java version of Saxon, or the .NET port? You can assign more memory to the Java VM running Saxon if you are running out of memory (using the -Xmx command-line parameter).
I've also found that the .Net version of Saxon runs out of memory less easily than the Java version.
For .NET you can use solution suggestion on Microsoft Knowledge Base:
http://support.microsoft.com/kb/307494
// XPathDocument is a read-only store optimised for XPath/XSLT input;
// it uses far less memory than loading the source into an XmlDocument
XPathDocument srcDoc = new XPathDocument(srcFile);
// XslCompiledTransform compiles the stylesheet once; reuse this instance
// across documents if you run the same transform repeatedly
XslCompiledTransform myXslTransform = new XslCompiledTransform();
myXslTransform.Load(xslFile);
// stream the result straight to disk rather than building it in memory
using (XmlWriter destDoc = XmlWriter.Create(destFile))
{
    myXslTransform.Transform(srcDoc, destDoc);
}
Take a look at Xselerator