Converting HTML file to PDF using Win32/MFC - c++

As part of my application, my client has requested that I include an automated e-mailing system. As part of this system, I generate HTML code and use automation to send it via Outlook.
However, they also require a PDF copy of the HTML document to be sent as an attachment. My initial attempts used libHaru, which proved difficult to use efficiently: I had to create the PDF document from scratch, computing the position of every line in a table, positioning all of the text, and so on.
I was wondering if there would be a way to programmatically convert HTML code (or an HTML file if need be) into a PDF document either by using Win32/MFC itself or an external library.
Thanks in advance!
EDIT: Just to clarify, I am looking for solutions which minimize external dependencies.

You should evaluate this utility wkhtmltopdf:
http://code.google.com/p/wkhtmltopdf/
You can call it from the command line without the need to run a setup.
I use it by generating my output documents as HTML and then calling ShellExecute(...) to convert them to PDF. It's great!
Internally it uses WebKit + Qt, so compatibility with modern HTML is OK.
Hope it helps.
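As a rough sketch of the ShellExecute approach described above, the parameter string for wkhtmltopdf could be assembled like this. The basic invocation is `wkhtmltopdf <input> <output>`; the helper names `quoteArg` and `buildWkhtmltopdfArgs` and the paths in the usage note are made up for illustration, and the code assumes the paths contain no double quotes (which Windows file names cannot).

```cpp
#include <cassert>
#include <string>

// Wrap a path in double quotes so that paths containing spaces survive
// command-line parsing. Assumes the path itself contains no double quotes.
std::string quoteArg(const std::string& path) {
    return "\"" + path + "\"";
}

// Build the argument string "<input.html> <output.pdf>" for wkhtmltopdf.
// The executable path is kept separate, because ShellExecute takes the
// program (lpFile) and its arguments (lpParameters) as two strings.
std::string buildWkhtmltopdfArgs(const std::string& htmlPath,
                                 const std::string& pdfPath) {
    return quoteArg(htmlPath) + " " + quoteArg(pdfPath);
}
```

In a Win32/MFC application the result would be passed as lpParameters, with the path to wkhtmltopdf.exe as lpFile, for example `buildWkhtmltopdfArgs("C:\\out\\mail body.html", "C:\\out\\mail.pdf")`. Since the conversion runs as a separate process, the caller probably needs to wait for it to finish before attaching the PDF; ShellExecuteEx with SEE_MASK_NOCLOSEPROCESS is one way to get a process handle to wait on.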

I'd take a look at PDF Creator, which can be used as a COM object (that acts pretty much like a printer). I haven't used it to print HTML, so I'm not sure, but my guess is that you'll probably end up having to instantiate a web browser control to render the HTML, and then feed it from there to the PDF control.

Some possible answers are in this thread:
C++ Library to Convert HTML to PDF?
Not sure if they will satisfy your particular requirements, but these might at least get you started.
Edit:
Some other possible options here.

Not MFC, but you can try QtWebKit. It can render HTML and export it to PDF, PNG, or JPEG.

Related

Printing of HTML markup with C++

I was curious if I can embed printing of HTML markup in my C++ application? Here's what I need:
Ability to specify which printer to print to.
Ability to change paper size.
Ability to specify margins/gutter, etc.
Ability to let end-user preview the result.
It would be easier to virtually print the HTML page to a PDF file using wkHTMLtoPDF C++ Library, and then print it.
Pros:
It allows you to keep a draft copy for future use.
Cons:
It's not a print-HTML-directly library
Have a look at this library: http://www.terrainformatica.com/htmlayout/ . It does everything you need, assuming you want to print the rendered html, and not syntax-highlighted html source code, which is not entirely clear from your question - but
MFC has a CHtmlView class that is part of their Document/View architecture. Hence, you can create a rather simple MDI "Web Browser" in MFC pretty easily.

How to open and display a PDF file using Qt/C++?

I am trying to open and read a PDF file using Qt, but there is no specific way to do that.
I know the subject is a bit old, but...
I found a really simple way to render PDFs in Qt via QtWebKit using pdf.js (http://mozilla.github.com/pdf.js/).
Here is my realization of the idea for Qt5 and the WebEngine: https://github.com/Archie3d/qpdf
Qt itself does not include PDF reading/rendering functionality as far as I know. You might want to have a look at libpoppler which has Qt bindings.
I found this very interesting article on qt-project.org - "Handling PDF - Qt Project".
This page discusses various available options for working with PDF documents in a Qt application. The page does not exactly show how to "open and display an existing PDF document" but it can help you deduce something useful out of all that is explained there.
Here, the page says:
For rendering pages or elements from existing PDF documents to image
files or in-memory pixmaps (useful e.g. for thumbnail generation or
implementing custom viewers), third-party libraries can be used (for
example: poppler-qt4 (freedesktop.org) and muPDF (mupdf.com)).
Alternatively, the task can be delegated to existing command-line
tools (like poppler-utils (freedesktop.org) and muPDF (mupdf.com)).
You can use PdfViewer which is a lightweight PDF viewer that only uses Qt. It contains a PdfView widget which can be easily embedded in your application.
Simple answer : it is not supported in the Qt API.
Other answer : you can code it, I suggest you have a look at this Qt application which uses Ghostscript
The best way I have found to open a pdf is using QProcess in Qt.
You may want to use Okular for PDF processing.
I know this is an old post, but I stumbled on it during my initial search so I figured I would post some documentation from the solutions I used.
As of Qt 5.10
Check out the QPdfDocument Class. This class can open a PDF and you can use the render function to render a page to an image. I use the QQuickPaintedItem to then "draw" this image but I am sure there are more ways to handle the QImage output.
Prior to Qt 5.10
I used libpoppler to do a VERY similar process.
#include <poppler/qt5/poppler-qt5.h>
Use the Poppler::Document Class to load and handle the entire PDF document and look at the Poppler::Page::renderToImage function to output the page as a QImage.
Qt does not support reading PDF files out of the box and among many approaches you can use Adobe's PDF Reader ActiveX object along with a QAxObject.
You may want to check out this link which describes how to read PDF files in Qt/C++ using ActiveX and has a downloadable example project.

Create PDFs with editable forms in Qt

I'm trying to find out if there's a way to embed an editable text cell in a PDF generated in a Qt application. I'm currently using QPrinter to generate the PDF, but if there's another library that could do this, that would be fine. The environment is limited, though, to C or C++, so libraries like iText are out. In terms of form capabilities, this pdf,
http://examples.itextpdf.com/results/part2/chapter08/text_fields.pdf, is a good example with the exception that I don't need a password text field.
Thanks,
Frank
This may not be terribly helpful, but I'll throw it out there anyway.
wkhtmltopdf is based on QtWebKit.
One of its command line options is to convert HTML fields into PDF fields (off by default).
There's almost no pdf-related code within wkhtmltopdf. Certainly nothing dealing with fields. Something upstream is doing the PDF conversion for them.
So find out what that "something" is and you're golden.
EDIT: That or spend a lot of time writing JNI wrappers for iText. :/ Having done so myself, I can say it'd be much more interesting to write a JNI generator tailored to iText, but far more practical to write a Java app that uses iText and then make JNI calls from your C/C++ app to pass the data it'll need and retrieve any response.
The form field borders are a part of the page, not the field itself. Odd, but that's not the first time I've encountered it. Our own software, LiquidOffice, used to generate fields with backgrounds AcroForms couldn't support the same way (now we use an icon-only button).
Those real PDF fields have their visibility flags set to "visible but doesn't print" within the PDF. I doubt wkhtmltopdf will let you control that directly. Patch time.
BUT, you could make a second pass with some PDF manipulation library to go through and change the visibility settings on your fields. I'm partial to iText, but there are many other fish in that particular sea.

Exporting *.png sequence from *.fla with C++

I need an animation in my program. My designer draws animation in Flash and provides me with *.fla file. All I need is to grab 30-40 PNGs from this file and store them within my internal storage.
Is it possible to grab resources from *.fla with C++? Perhaps some Adobe OLE objects can help?
Please advise.
Thanks in advance.
If I asked an artist to make me an icon I wouldn't expect to need to write code to convert a .3DS model into a usable icon format.
You can save yourself a lot of time and hassle by having your designer use File->Export and give you PNGs of the layers and frames instead of a .FLA file if that's the format you require for your implementation.
If that's not possible for some reason, you can probably find a Flash decompiler with a command-line option that you could launch from your program to extract assets as part of your loading sequence. That is generally frowned upon, though, because it is not the intended use of the proprietary .swf/.fla format, any more than you should design applications to extract source code from a binary executable.
Assuming
You are using CS5
The assets used internally in the FLA are already PNGs, as you want them to be.
Then simply get the FLA saved as an XFL file, and you will be able to grab them from the library folder (but then why not just get them to mail you the PNGs?).
So if for some reason you only have access to the FLA and not to the designer, you can do it programmatically by renaming the .fla to .zip and extracting it, which gives you the XFL format.
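Before renaming and extracting, a program could check whether a given .fla really is a ZIP container, since the trick above assumes a CS5-style file. The following is a minimal sketch: it only looks for the standard ZIP local-file-header signature (the bytes "PK", 0x03, 0x04) at the start of the file and does not unpack anything. The function names are made up for illustration; actual extraction would need a ZIP library or an external unzip tool.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

// True if the buffer starts with the ZIP local-file-header signature
// "PK\x03\x04".
bool hasZipSignature(const unsigned char* data, std::size_t size) {
    return size >= 4 && data[0] == 'P' && data[1] == 'K' &&
           data[2] == 0x03 && data[3] == 0x04;
}

// Read the first four bytes of a file and apply the check above.
// Returns false if the file cannot be opened or is shorter than four bytes.
bool fileLooksLikeZip(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    unsigned char header[4] = {0, 0, 0, 0};
    in.read(reinterpret_cast<char*>(header), sizeof(header));
    return in.gcount() == 4 && hasZipSignature(header, 4);
}
```

If `fileLooksLikeZip("animation.fla")` returns false (the file name is hypothetical), the file is presumably not in the ZIP-based format, and the rename-and-extract approach will not work on it.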

Is there such a thing like a Printer-Markup-Language

I would like to print a document. Its content is tables and text in different colors. Does a lightweight printer file format exist that can be used like a template?
PS, PDF, and DOC files are, in my opinion, too heavy to parse. Is there perhaps some XML or YAML file format which supports:
Easy creation (maybe with a WYSIWYG-Editor)
Parsing and manipulation with Library-Support
Easy sending to the printer (maybe with Library-Support)
Or do I have to do it the usual way and paint within a CDC?
I noticed you’re using MFC (so, Windows). In that case the answer is a qualified yes. In recent versions of Windows, Microsoft offers the XPS Document API which lets you create and manipulate a PDF-like document using XML, which can then be printed using the XPS Print API.
(For earlier versions of Windows that don’t support this API, you could try to deal with the XPS file format directly, but that is probably a lot harder than using CDC. Even with the API you will be working at a fairly low level.)
End users can generate XPS documents using the XPS print driver that is available for free from Microsoft (and bundled with certain MS products—they probably already have it on their system).
There is no universal language that is supported across all (or even many) printers. While PCL and PS are the most used, there are also printers which only work with specific printer drivers because they only support a proprietary data format (often pre-rendered on the client).
However, you could use XSL-FO to create documents which can then be rendered to a printer driver using library support.
I think something like TeX or LaTeX (or even troff or groff) may meet your needs. Google them and see.
There are also libraries to render documents for print from HTML source. Look at http://libharu.sourceforge.net/ for example. This outputs a printer-ready .PDF
I think that PostScript is a really good choice for that.
It is actually a very simple language, and it must be very easy to parse because it is stack-oriented. Also, most printers support it, and even if yours does not, you can use Ghostscript to convert it to many different formats (consider GS a "virtual PS-supporting printer").
Finally there are a lot of books and tutorials for the language.
About the parsing -- you can actually define new variables and functions in PS. So, maybe, your problem can be solved (almost) entirely using PS.
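To illustrate the stack-oriented style mentioned above, here is a minimal sketch of a C++ helper that emits a one-page PostScript program. Operands come first and the operator follows them, as in `72 720 moveto`. The function names, the font choice (Helvetica, 12 pt), and the coordinates are assumptions for the example; only parentheses and backslashes are escaped, which is enough for plain ASCII text.

```cpp
#include <cassert>
#include <string>

// Escape the characters that are special inside a PostScript (...) string
// literal: parentheses and the backslash.
std::string psEscape(const std::string& text) {
    std::string out;
    for (char c : text) {
        if (c == '(' || c == ')' || c == '\\') out += '\\';
        out += c;
    }
    return out;
}

// Emit a one-page program: select a font, move to (x, y) measured in
// points from the bottom-left corner, draw the text, and eject the page.
std::string makePostScriptPage(const std::string& text, int x, int y) {
    std::string ps = "%!PS\n";
    ps += "/Helvetica findfont 12 scalefont setfont\n";
    ps += std::to_string(x) + " " + std::to_string(y) + " moveto\n";
    ps += "(" + psEscape(text) + ") show\n";
    ps += "showpage\n";
    return ps;
}
```

The returned string could be written to a file and sent to a PostScript-capable printer, or run through Ghostscript as the answer suggests. A template in this spirit would be a fixed PostScript prologue with only the text lines generated by the program.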
HTML + CSS can be printed -- properly. CSS was designed to support this with the media attribute to specify that your CSS is for printer layout, not for screen layout. Tools like PRINCE (free + commercial versions) exist to render this for printing.
I think postscript is the markup language used by printers. I read this somewhere, so correct me if postscript is now outdated.
http://en.wikipedia.org/wiki/PostScript
For a more powerful suite, you can use LaTeX. It lets you create templates into which you can just copy the text.
On a more GUI friendly note, MS-Word and other word processors have templates. The issue is they are not of a common standard or markup.
You can also use HTML to render stuff in a common markup but it will not be very printer friendly.