XSL-FO with FOP: check whether content overflows and call a different template?

I have a question about XSL-FO; the FO processor is FOP. Here is what I want to do:
In the PDF I want to generate an item list in which each item sits in a box with a fixed width and height. If the content does not fit in this box, it should be displayed in a bigger box (also with fixed dimensions).
I do not see any way to achieve that in XSL-FO, especially with FOP.
Does anyone have an idea how to solve this?
Thanks for every idea!

There are two separate, independent processing steps involved here:
Generation of XSL-FO markup (using a stylesheet and an XSLT processor).
Rendering of XSL-FO markup as PDF (using a FO processor, such as FOP).
The second step cannot influence the first. It is not possible to test for overflow conditions during rendering and somehow decide what template to invoke. There is no feedback loop. What you are asking for is not possible.
It is possible to do crude text fitting by estimating the length of text strings in XSLT. That is the idea behind "Saxon Extension for Guessing Composed Text String Length".
I have not used this extension, and it may not even be available anymore (the announcement about it is from 2004). In any case, this is very far from an actual layout feedback mechanism.
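To illustrate what "crude text fitting" means, here is a minimal sketch in Python rather than XSLT. It is not the Saxon extension mentioned above. The average glyph width, the line height factor and the names `estimated_lines`, `fits` and `choose_box` are all assumptions made for this example. The idea is to guess the number of lines before the FO markup is generated and pick the box size from that guess.

```python
import textwrap

# Assumption: an average glyph is about half the font size wide. Real fonts vary.
AVG_CHAR_WIDTH_EM = 0.5
# Assumption: line height is 1.2 times the font size.
LINE_HEIGHT_EM = 1.2


def estimated_lines(text, box_width_pt, font_size_pt):
    """Guess how many lines `text` needs when wrapped to the box width."""
    chars_per_line = max(1, int(box_width_pt / (font_size_pt * AVG_CHAR_WIDTH_EM)))
    return max(1, len(textwrap.wrap(text, width=chars_per_line)))


def fits(text, box_width_pt, box_height_pt, font_size_pt):
    """Return True if the estimated text height stays within the box height."""
    lines = estimated_lines(text, box_width_pt, font_size_pt)
    return lines * font_size_pt * LINE_HEIGHT_EM <= box_height_pt


def choose_box(text, small, large, font_size_pt=10.0):
    """Pick "small" or "large"; each box is a (width_pt, height_pt) pair."""
    return "small" if fits(text, small[0], small[1], font_size_pt) else "large"
```

The result of `choose_box` would then decide which template or parameter set is used when the FO markup is written. Because the estimate ignores real font metrics, hyphenation and kerning, it should leave a safety margin.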

Related

MFC: what is the best way to generate and display large document?

I am still new to this area. My current project requires generating and displaying a large report (over a few hundred pages). The structure of the document is quite simple, but it still contains row-column formatting with a few colors, fonts and lines. It also needs to be printable, which is quite a headache. The approach I am taking is to use a browser control plus HTML. One issue is that when the document gets big, the UI lags badly. Is there another way of doing this?

Is there a way to count tags on a physical (PDF) page using XSL-FO?

Here is the scenario. I have an XML document which contains tags. I want to create a transform that does this:
<tag>content A</tag>            1. content A
<tag>content B</tag>   ---->    2. content B
<tag>content C</tag>            3. content C
but only if the tag contents appear on the same physical page. The numbering should restart on each new page. Is there any way to do this using XSL-FO? I know that with LaTeX the only way to accomplish something like this is to run LaTeX twice, with the interim document used to determine content page placement.
As far as I can tell (and as confirmed by the Antenna House tech support team), there is no way to do this using standard XSL-FO. Antenna House offers <axf:footnote*/> extensions which include the ability to set an axf:footnote-number-reset="page" attribute, and as suggested in the comments, RenderX offers a generic mechanism which might be used for this purpose, but both of these involve vendor-specific extensions to the language.
This points to a number of shortcomings in XSL-FO that really should have been addressed a long time ago with a 2.0 version of the specification. A W3C committee to develop an XSL-FO 2.0 spec was formed and then disbanded quite some time ago; I have no idea why, as I find the tool indispensable for a large class of document-to-PDF conversions.
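The two-pass idea mentioned in the question can be sketched outside XSL-FO. This minimal Python sketch assumes that a first rendering pass has already told us on which page each tag landed. How that information is obtained is left open here, and the function name `number_per_page` is made up for this example.

```python
from collections import defaultdict


def number_per_page(placements):
    """Map each item id to a number that restarts on every page.

    `placements` is a list of (item_id, page_number) pairs in document order,
    assumed to come from a first rendering pass.
    """
    counters = defaultdict(int)
    numbers = {}
    for item_id, page in placements:
        counters[page] += 1
        numbers[item_id] = counters[page]
    return numbers
```

The numbers would then be written into the source before a second rendering pass. Inserting them can shift content across page boundaries, so the placement may need to be checked again after the second pass.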

XSL FO Repeat contents of split spanned cell

I am using Apache FOP 0.95 (with DocBook on top of it) and I would like to repeat the content of a table cell spanning multiple rows whenever a page break happens. At the moment the cell content is only displayed on the first page, while an empty cell is displayed on all other pages.
I know this is part of the XSL-FO 2.0 requirements, which I believe are not yet final.
I am a beginner with xsl transformations and I was wondering if there is a way to define a template to achieve this.
Thanks,
Pierpaolo
"I am a beginner with xsl transformations and I was wondering if there is a way to define a template to achieve this."
The answer is almost certainly no. You are referring to a suggested new feature that might be included in a future XSL (XSL-FO) specification. Conformant XSL-FO processors will implement the feature if it is considered valuable enough.
XSLT and XSL-FO are related in the sense that the former is the most common way to generate the latter. But in general, you can't enhance the functionality of an XSL-FO processor by writing a clever XSLT stylesheet.

XSLT to convert an XML element containing RTF data to HTML?

OK, so here's the background:
We have a third-party piece of software that does a lot of complicated stuff to generate an XML file from a lot of tables based on a wide array of business rules. The software allows you to apply an XSL transformation by supplying an XSLT file as part of its workflow, before continuing on in the process, which is usually an upload to one or more servers, based on more business rules.
Here's the problem:
One of the elements (with more on the way) this application is processing contains RTF text, and needs to be converted into formatted HTML before being uploaded. There are no means of transforming the XML inside the application other than through an XSLT file, and once we output the file, we cannot resume the workflow. My original thought was, "Easy! Someone must have written a few XSL transforms for converting RTF to formatted HTML!" Hours of searching later, I must conclude I either suck at searching or it's awfully obscure.
Disclaimers:
I know the software is pretty darned limited; I'm stuck with it.
I know there are a lot of third-party tools to do this; they are not available to me because I would need to run them externally.
I know that this is not a pretty or efficient thing to do with XSLT. Changing that is not an option for me at this point.
If I cannot find a means to do this through pure XSL transforms, I will need to output the files locally, run the extra process, and take the destination routing on through a custom process. I really don't want to do that.
Does anyone have access to an XSL transformation function or scheme that will allow me to do this natively in the application? Perhaps a series of regular expressions I could use, or something similar?
So it turns out that external scripts can be invoked from the XSLT. It seems I will be using another scripting language to get this to work. I'm a little bummed there was no other answer available.

Html renderer with limited resources (good memory management)

I'm creating a Linux program in C++ for a portable device in order to render HTML files.
The problem is that the device is limited in RAM, which makes it impossible to open big files (with existing software).
One solution is to dynamically load/unload parts of the file, but I'm not sure how to implement that.
The ability to scroll is a must, with a smooth experience if possible.
I would like to hear from you what the best approach is for such a situation.
You can suggest an algorithm, an open-source project to take a look at, or a library that supports what I'm trying to do (WebKit?).
EDIT:
I'm writing an ebook reader, so I just need pure HTML rendering, no JavaScript, no CSS, ...
To be able to browse a tree document (like HTML) without fully loading, you'll have to make a few assumptions - like the document being an actual tree. So, don't bother checking close tags. Close tags are designed for human consumption anyway, computers would be happy with <> too.
The first step is to assume that the first part of your document is represented by the first part of your document. That sounds like a tautology, but with "modern" HTML and certainly JS this is technically no longer true. Still, if any line of HTML can affect any pixel, you simply cannot partially load a page.
So, if there's a simple relation between position in the HTML file and pages on screen, the next step is to define the parse state at the end of each page. This will then include a single file offset, probably (but not necessarily) at the end of a paragraph. Also part of this state is a stack of open tags.
To make paging easier, it's smart to keep this "page boundary" state for each page you've encountered so far. This makes paging back easy.
Now, when rendering a new page, the previous page boundary state will give you the initial rendering state. You simply read HTML and render it element by element until you overflow a single page. You then backtrack a bit and determine the new page boundary state.
Smooth scrolling is basically a matter of rendering two adjacent pages and showing x% of the first and 100-x% of the second. Once you've implemented this bit, it may become smart to finish a paragraph when rendering each page. This will give you slightly different page lengths, but you don't have to deal with broken paragraphs, and that in turn makes your page boundary state a bit smaller.
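The page boundary state described above can be sketched as follows. This is a minimal illustration in Python, not the C++ code an actual reader would use. It assumes well-nested tags with no void elements such as `<br>`, measures a "page" in fixed-width text lines, and uses made-up names (`PageBoundary`, `render_page`, `paginate`).

```python
import re
from dataclasses import dataclass

# A token is either a tag (groups 1 and 2 set) or a run of text.
TOKEN = re.compile(r"<(/?)(\w+)[^>]*>|[^<]+")


@dataclass(frozen=True)
class PageBoundary:
    offset: int        # file offset at which the next page starts
    open_tags: tuple   # stack of tags still open at that offset


def render_page(html, start, lines_per_page=3, chars_per_line=20):
    """Render one page from `start`; return (lines, boundary of the next page)."""
    stack = list(start.open_tags)
    lines = []
    pos = start.offset
    for m in TOKEN.finditer(html, start.offset):
        if m.group(2):                      # tag: maintain the open-tag stack
            if m.group(1):                  # close tag; the name is not checked
                if stack:
                    stack.pop()
            else:
                stack.append(m.group(2))
        else:
            text = m.group(0).strip()
            if text:
                chunk = [text[i:i + chars_per_line]
                         for i in range(0, len(text), chars_per_line)]
                if lines and len(lines) + len(chunk) > lines_per_page:
                    break                   # overflow: stop before this text
                lines.extend(chunk)
        pos = m.end()
    return lines, PageBoundary(pos, tuple(stack))


def paginate(html, **layout):
    """Walk the whole document, keeping the boundary state of every page."""
    boundaries = [PageBoundary(0, ())]
    pages = []
    while boundaries[-1].offset < len(html):
        lines, nxt = render_page(html, boundaries[-1], **layout)
        if nxt.offset == boundaries[-1].offset:
            break                           # no progress; avoid looping forever
        pages.append(lines)
        boundaries.append(nxt)
    return pages, boundaries
```

Because each page always ends on a whole text run, this corresponds to the "finish a paragraph" variant. Paging back is just a call to `render_page` with an earlier entry from `boundaries`, so only the boundary list has to stay in memory, not the rendered pages.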
Dillo is the lightest weight Linux web browser that I'm aware of.
Edit: If it (or its rendering component) won't meet your needs, then you might find Wikipedia's list of and comparison of layout engines to be helpful.
Edit 2: I suspect that dynamically loading and unloading parts of an HTML file would be tricky; for example, how would you know that a randomly chosen chunk of the file isn't in the middle of a tag? You'd probably have to use something like SAX to parse the file into an intermediate representation, saving discrete chunks of the intermediate representation to persistent storage so that they won't take up too much RAM. Or you could parse the file with SAX to show whatever fits in RAM at once, then re-parse it whenever the user scrolls too far. (Stylesheets and JavaScript would ruin this approach; some plain HTML might too.) If it were me, I'd try to find a simple markup language or some kind of rich text viewer rather than going to all of that difficulty.
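A rough sketch of the SAX idea, in Python for brevity: the file is fed to an incremental SAX parser piece by piece, so a piece may end in the middle of a tag without causing trouble, and finished paragraphs are handed off in small chunks. It assumes the input is well-formed XHTML, since a SAX parser rejects tag soup. The class and function names are made up for this example, and the `chunks` list merely stands in for persistent storage.

```python
import xml.sax


class ChunkingHandler(xml.sax.ContentHandler):
    """Collect the text of <p> elements and flush it in fixed-size chunks."""

    def __init__(self, paragraphs_per_chunk=2):
        super().__init__()
        self.paragraphs_per_chunk = paragraphs_per_chunk
        self.chunks = []       # stand-in for persistent storage
        self._current = []
        self._text = None      # None while outside a <p>

    def startElement(self, name, attrs):
        if name == "p":
            self._text = []

    def characters(self, content):
        # May be called several times for one text run.
        if self._text is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == "p" and self._text is not None:
            self._current.append("".join(self._text))
            self._text = None
            if len(self._current) == self.paragraphs_per_chunk:
                self._flush()

    def endDocument(self):
        if self._current:
            self._flush()

    def _flush(self):
        self.chunks.append(self._current)
        self._current = []


def parse_in_pieces(pieces, paragraphs_per_chunk=2):
    """Feed raw pieces of the file to the parser; return the stored chunks."""
    handler = ChunkingHandler(paragraphs_per_chunk)
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    for piece in pieces:
        parser.feed(piece)     # a piece may end mid-tag or mid-word
    parser.close()
    return handler.chunks
```

In a real reader the pieces would come from reading the file in fixed-size blocks, and each flushed chunk would be written to storage instead of being kept in a list.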