XSL display embedded pdf from xml source - xslt

I have an XML document which contains embedded PDF documents in base64 format. I'm using XSL-FO to create a PDF view of the XML, but I have absolutely no idea how to display the embedded documents as part of the overall output using XSL. Could someone help, please? Apologies if this is a very simple question, but I'm brand new to XSL and cannot seem to find any example of this anywhere.

PDF documents are vector images in some sense, so an XSL-FO rendering engine can embed them in its PDF output, although so far only the first page of the embedded document is shown.
RenderX XEP accepts data: as a URI scheme for embedded images, so a base64-encoded PDF file placed as a string in fo:external-graphic/@src should work fine:
src="url('data:application/pdf;base64,encodedpdffilegoeshere...')"
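As a rough sketch of how this could look in a stylesheet, the template below copies the base64 text of an element into the src attribute. The element name attachment is hypothetical (the question does not show the XML), and the template is meant to be merged into an existing stylesheet so that it is reached from inside an fo:flow. Whether a data: URI is accepted depends on the formatter; the answer above only vouches for RenderX XEP.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- "attachment" is a hypothetical element holding the base64-encoded PDF. -->
  <xsl:template match="attachment">
    <fo:block>
      <!-- normalize-space() collapses line breaks and tabs to single spaces,
           translate() then removes those spaces, so the data: URI is one
           unbroken string. -->
      <fo:external-graphic
          src="url('data:application/pdf;base64,{translate(normalize-space(.), ' ', '')}')"/>
    </fo:block>
  </xsl:template>

</xsl:stylesheet>
```

The whitespace stripping is there because base64 content in XML files is often wrapped across lines.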

Related

Apache POI - word file to msxml

I have done a bunch of searching and have not found a simple answer to my question.
MS Word allows you to Save As, and then select from a variety of formats. What I want to do is have POI open a Word file, save it as msxml (a truly hideous looking format) and then, in a subsequent step, run an XSLT transformation on the msxml file.
I saw posts about reading a Word file, looping over all the text elements and building an XML doc from scratch that way, but I would prefer to do the XML transformation using XSLT.
Perl OLE allows you to do this. Is there a list of basic commands like that which can be run from POI?
thanks.

Any Decent Open Source XSLT designers for XSL-FO output [ WYSIWYG style]

We are planning to render millions of PDFs with Apache FOP, using XSL-FO as input.
Is there a decent WYSIWYG XSLT designer that makes it easy to design an XSLT that transforms the XML input data into the XSL-FO that FOP needs?
I see a lot of commercial ones - Ecrion , Antenna House.. Any open source ones?
The only somewhat decent editor that I have found is MiniScribus Scribe, but I got stuck with it at the point of wanting to add a horizontal line, and the ODT file I opened lost its table format in Scribe. It says that it doesn't yet support headers/footers and table borders... not so decent.
There are some converters that could be of good use, like HTML-to-FO and ODT-to-FO converters, but the FO code they generated caused a lot of exceptions in Apache's FOP processor. The ODT/HTML file I was testing with was a single page with only a table, two horizontal lines and some unformatted text.
These tools, the converters as well as the editor, are in beta now, so maybe there will be a decent solution; so far I have not been able to find it.

how to get xsl from existing pdf?

Is it possible to get the .xsl file from an existing .pdf file?
I know that with Apache FOP you can get a .pdf file from a .xml and .xsl but I would like to go in the other direction. Any idea?
XML+XSL->PDF works with Apache FOP, but is PDF->XSL somehow possible?
The reason I would like to do that is that I want to open a PDF that has a form inside, edit it by adding some information to the form, and then save it again as PDF.
I already have the edited form as .xml and I'm trying to generate the PDF, but then I need a .xsl file for the layout... so I thought that maybe I could reuse the layout from the original PDF, as they will be the same. Is there a better approach? I would like to avoid creating a specific XSL file for every form.
Thanks
Definitely not the XSLT file, since XSLT is not even part of what FOP does. FOP only works with FO documents; the fact that it lets you use XML+XSLT to produce the FO source is just a nice usability feature. Once it has the FO file, it doesn't know how that file was obtained, so it cannot embed the XSLT file in any way.
You could post-process the PDF file using another tool, like PDFBox, to embed any metadata you want.

Using XSL-FO and HTML?

I'm trying to transform some XML-data to HTML with XSLT for my bachelor thesis.
My professor wants me to consider XSL-FO too, or at least to write a few words about it. But I'm very new to this.
So my questions are:
Can I combine FO with HTML? Can I use FO instead of HTML and CSS? If yes, how will my browser render this? Are there any examples/tutorials on how to transform XML into web pages with FO?
Generating XSL-FO and XHTML from XSLT is not necessarily an either-or choice.
XSL-FO is generally used to generate PDF. For that you will need an XSL-FO engine, such as FOP, RenderX, Antenna House, IBex, etc. However, you can convert XSL-FO into XHTML and then render in the browser.
Generally, it wouldn't be worth the hassle to have your XSLT create XSL-FO and then convert into XHTML (just generate XHTML directly), unless you want to create both output formats (PDF and XHTML) with reduced effort.
It is possible to create both **XSL-FO and XHTML at the same time without maintaining two complete sets of stylesheets to create similar output in different vocabularies**.
Rather than choosing between one format or the other, or having to maintain two distinctly different (but similar) stylesheet libraries, you can create your main stylesheet library to generate either XSL-FO or XHTML and then use a second transform to convert from XSL-FO to XHTML or vice versa. There are existing XSLT stylesheets that you can leverage to do this.
In the past I have developed XSL-FO stylesheets and then used the RenderX FO2HTML stylesheet to convert the XSL-FO to XHTML output. It converts <fo:block> elements into <div>, <fo:inline> into <span>, etc.
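To illustrate the kind of mapping involved, here is a minimal sketch of an FO-to-XHTML transform. It is not the RenderX stylesheet, only a toy version of the same idea that handles two elements and ignores all formatting properties.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="fo">

  <!-- Block-level formatting objects become <div> elements. -->
  <xsl:template match="fo:block">
    <div><xsl:apply-templates/></div>
  </xsl:template>

  <!-- Inline formatting objects become <span> elements. -->
  <xsl:template match="fo:inline">
    <span><xsl:apply-templates/></span>
  </xsl:template>

</xsl:stylesheet>
```

Elements without a template of their own fall through to XSLT's built-in rules, which simply descend into their children and copy text, so a real conversion stylesheet needs many more templates (page layout, tables, lists, properties mapped to CSS).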
I haven't used them before, but you could also try using HTML2FO stylesheets to convert XHTML output into XSL-FO.
Out of the box, you can get amazingly similar output in both formats while maintaining one XSLT library dedicated to one particular output format.
If you happen to need to customize the output slightly (e.g. different header content for XHTML) then you just need to import/extend the conversion stylesheets and override the appropriate template(s) for the divergent content. This makes maintenance much easier, so you don't have to worry about updating multiple sets of stylesheets with essentially the same information.
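A sketch of what such an override could look like follows. The file name fo2html.xsl and the choice of the page header as the divergent content are assumptions for illustration; the real conversion stylesheet may use modes or different match patterns, in which case the overriding template has to mirror them.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="fo">

  <!-- Hypothetical file name for the FO-to-XHTML conversion stylesheet.
       xsl:import must come before any other top-level element. -->
  <xsl:import href="fo2html.xsl"/>

  <!-- Templates in the importing stylesheet have higher import precedence
       than imported ones, so this replaces the default handling of the
       header region (an assumed example of divergent content). -->
  <xsl:template match="fo:static-content[@flow-name = 'xsl-region-before']">
    <div class="header">Header shown only in the XHTML version</div>
  </xsl:template>

</xsl:stylesheet>
```

Everything not overridden here is still handled by the imported stylesheet, which is what keeps the maintenance burden low.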
XSL-FO is for PDF display only (this is not strictly true, but you can take it as a guideline). So HTML output and FO output are not related. From your XML source, you can use XSLT to generate XHTML or XSL-FO, but not both at the same time.
See DocBook, for example. It comes with several XSLT stylesheets ready to use, one for HTML output and one for PDF (via Apache FOP). Whether you are satisfied with the result is a different question.

Any good XSLT sheets for XML to PDF?

Does anyone know of a solid non-encumbered suite of XSLT sheets that can generate PDF via an XSL transform?
I really like XSL transforms, I use them to make HTML output from dbms's.
Now I have some data in a DBMS that I like to generate PDF output from. I've not written a dbms to XML script yet for this data, so I have no commitment to any given set of XML tags.
Thanks in advance!
Depends on your XML. If you have, e.g., DocBook, look at David Pawson's site; for HTML, IBM has a nice example over at developerWorks. Both go the way of XML -> via XSLT -> XSL-FO -> via Apache FOP -> PDF.
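Since you have no commitment to any XML vocabulary yet, a hand-written stylesheet can be quite small. Below is a minimal sketch of the XSLT step, assuming a hypothetical input of the form records/record with name and value children (these element names are made up for the example).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical input:
       <records><record><name>..</name><value>..</value></record>..</records> -->
  <xsl:template match="/records">
    <fo:root>
      <!-- One A4 page layout with 2cm margins on every side. -->
      <fo:layout-master-set>
        <fo:simple-page-master master-name="A4"
            page-height="29.7cm" page-width="21cm"
            margin-top="2cm" margin-bottom="2cm"
            margin-left="2cm" margin-right="2cm">
          <fo:region-body/>
        </fo:simple-page-master>
      </fo:layout-master-set>
      <fo:page-sequence master-reference="A4">
        <fo:flow flow-name="xsl-region-body">
          <!-- A heading block, so the flow is never empty. -->
          <fo:block font-weight="bold" space-after="12pt">Records</fo:block>
          <xsl:apply-templates select="record"/>
        </fo:flow>
      </fo:page-sequence>
    </fo:root>
  </xsl:template>

  <!-- Each record becomes one line of the form "name: value". -->
  <xsl:template match="record">
    <fo:block space-after="6pt">
      <xsl:value-of select="name"/>
      <xsl:text>: </xsl:text>
      <xsl:value-of select="value"/>
    </fo:block>
  </xsl:template>

</xsl:stylesheet>
```

With Apache FOP installed, something like fop -xml records.xml -xsl records2fo.xsl -pdf records.pdf should run both steps in one go (the file names are placeholders).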
Cheers,