Help programmatically add text to an existing PDF - c++

I need to write a program that displays a PDF supplied by a third party. I need to insert text data into the form before displaying it to the user. I have the option of converting the PDF into another format, but the result has to look exactly like the original PDF. C++ is the preferred language, but other languages (e.g. C#) can be investigated. It needs to work on a Windows desktop machine.
What libraries, tools, strategies, or other programming languages do you suggest I investigate to accomplish this task? Are there any online examples you could direct me to?
Thank you in advance.

What about PoDoFo:
The PoDoFo library is a free, portable C++ library which includes classes to parse PDF files and modify their contents into memory. The changes can be written back to disk easily. The parser can also be used to extract information from a PDF file (for example the parser could be used in a PDF viewer). Besides parsing PoDoFo includes also very simple classes to create your own PDF files. All classes are documented so it is easy to start writing your own application using PoDoFo.

iTextSharp is a free library that you can use in .NET applications. Take a look at the iText page; it covers the iText project, which is a Java library. iTextSharp is part of that project and is a port to C# and .NET.

Consider Python. It has a lot of PDF libraries (for both creating and extracting PDFs), e.g.:
http://pypi.python.org/pypi/pdfsplit/0.4.2
http://pypi.python.org/pypi/JagPDF/1.4.0
http://pypi.python.org/pypi/pdfminer/20091129
http://pypi.python.org/pypi/podofo/0.0.1
http://pypi.python.org/pypi/pyFPDF/1.52
There are also good tools for using C/C++ code from Python and for creating .exe files from Python scripts. Even if you decide to use a different language, consider Python as a prototyping language!

Related

How to add a new row to Excel file using unmanaged C++?

How can I add a new row (with contents) to an existing Excel .xls file using unmanaged C++ running on Windows?
I don't mind using OLE, COM, or any external free library, whatever is the easiest way.
There is a COM interface which is well documented.
I'd suggest you start with the Workbooks.Open method to open an existing Excel file.
If you only need basic features (no formatting, formulas, ...), you can also use BasicExcel: a C++ library which doesn't have any dependencies (it reads and writes the Excel file as a compound file) and is much easier to use than the COM interface (at least from C++).
I've used SQL to do this. I don't have sample code handy, but a quick Google search brought this up: Link
Hope it's helpful.
If you have no restrictions on using managed libraries, you can check out NPOI, a managed library for handling the Excel file format.
Since it is managed it should be possible to register it as a COM server. If, for any reason, it proves hard/impossible to register it as a COM server you can write a thin COM server (either in C++ or C# or whatever you prefer) to expose just the functionality you need to your unmanaged C++ code.
I've used this one: ExcelFormatLib, it's great and simple to use, C++, well maintained, compiles and works without any trouble.

Small library for generating HTML files in C++

Is there a library that will allow easier generation of a simple website using C++ code? This 'website' will then be compiled into a CHM help file (which is the final goal here). Ideally, it will make it easy to generate pages and to generate links between pages. I can do all of this by hand, but that is going to be very tedious and error-prone.
I know about bigger libraries such as Wt, but I am more interested in smaller ones with few or no dependencies and no need for installation.
You can try the CTPP template engine. It is written in C++, is small, and is quite fast.
Do you need this project to be written in C++? If you just need to prepare documentation in CHM, I would go with Sphinx. Sphinx is a set of tools written in Python that generates manuals in several formats (CHM, HTML, LaTeX, PDF) from text files (formatted using the reStructuredText markup language). Those text files can be created by hand or by some application and then combined into one manual using Sphinx. At work we currently use this solution to write documentation, because text files are much easier to maintain (merging, tracking changes, etc.) than, for example, HTML or DOC files. Sphinx is used to generate the Python language documentation (CHM), so it is capable of handling really large projects.
I've used the FLATE library every day for ten years and it works flawlessly. It's a piece of cake to use; I can't recommend it enough.
It will definitely do the trick, though probably at a much lower level than you have in mind. It is a C-language source library that you can link with a C++ caller. It's also available as a Perl module, but I haven't used that.
FLATE library
Flate is a template library used to deal with html code in CGI applications. The library includes C and Perl support. All html code is put in an external file (the template) and printed using the library functions: variables, zones (parts to be displayed or not) and tables (parts to be displayed 0 to n times). Using this method you don't need to modify/recompile your application when modifying html code, printing order doesn't matter in your CGI code, and your CGI code is much cleaner.
HTH and good luck!
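The template idea described above, where the HTML lives in an external file and the program only fills in named variables, can be illustrated with a minimal C++ sketch. This is not FLATE's API or syntax: the `{{name}}` placeholder form and the `render_template` function are hypothetical, chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper: replaces every {{key}} in `tmpl` with its value from `vars`.
// Unknown keys are left untouched so that missing variables are easy to spot.
std::string render_template(const std::string& tmpl,
                            const std::map<std::string, std::string>& vars) {
    std::string out;
    std::size_t pos = 0;
    while (pos < tmpl.size()) {
        std::size_t open = tmpl.find("{{", pos);
        if (open == std::string::npos) {
            out.append(tmpl, pos, std::string::npos);  // no more placeholders
            break;
        }
        std::size_t close = tmpl.find("}}", open + 2);
        if (close == std::string::npos) {
            out.append(tmpl, pos, std::string::npos);  // unterminated placeholder
            break;
        }
        assert(close >= open + 2);  // the search for "}}" started at open + 2
        out.append(tmpl, pos, open - pos);  // literal text before the placeholder
        const std::string key = tmpl.substr(open + 2, close - open - 2);
        const auto it = vars.find(key);
        if (it != vars.end()) {
            out += it->second;
        } else {
            out.append(tmpl, open, close + 2 - open);  // keep unknown placeholder
        }
        pos = close + 2;
    }
    return out;
}
```

With a template such as `<a href="{{url}}">{{title}}</a>`, links between generated pages reduce to filling in two variables per link, and the HTML can be changed without recompiling. Values are inserted verbatim, so anything that may contain `<` or `&` would need escaping first.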
Is this CHM lib, together with the related links, what you're looking for?

What library should I use to get the data from a photo of a QR Code?

Similar: Does anyone know of a C/C++ Unix QR-Code library?
I tried libqrencode but apparently it's only able to generate a QR-Code. However, I need a library that reads the data from a photo of a printed QR-Code.
It must be a C, C++ or Objective-C library and it has to compile on BSD systems. On my platform, Java and .NET are not available.
What libraries can I use?
Thanks.
Try using libdecodeqr. It doesn't seem to have been updated for over a year, but a Google search suggests that it still works.
zxing (http://code.google.com/p/zxing/) is probably the best known and is used in a number of barcode/QR-code apps. The original/primary code is Java, but it includes a C++ port that is pretty actively maintained, particularly for QR codes.
The C++ library does not (currently) have an encoder, but it sounds like you want a decoder.

Is there such a thing as a printer markup language?

I would like to print a document. The content of the document is tables and text in different colors. Does a lightweight printer file format exist which can be used like a template?
PS, PDF, and DOC files are, in my opinion, too heavy to parse. Is there some XML or YAML file format which supports:
Easy creation (maybe with a WYSIWYG-Editor)
Parsing and manipulation with Library-Support
Easy sending to the printer (maybe with Library-Support)
Or do I have to do it the usual way and paint within a CDC?
I noticed you’re using MFC (so, Windows). In that case the answer is a qualified yes. In recent versions of Windows, Microsoft offers the XPS Document API which lets you create and manipulate a PDF-like document using XML, which can then be printed using the XPS Print API.
(For earlier versions of Windows that don’t support this API, you could try to deal with the XPS file format directly, but that is probably a lot harder than using CDC. Even with the API you will be working at a fairly low level.)
End users can generate XPS documents using the XPS print driver that is available for free from Microsoft (and bundled with certain MS products—they probably already have it on their system).
There is no universal language that is supported across all (or even many) printers. While PCL and PS are the most used, there are also printers which only work with specific printer drivers because they only support a proprietary data format (often pre-rendered on the client).
However, you could use XSL-FO to create documents which can then be rendered to a printer driver using library support.
I think something like TeX or LaTeX (or even troff or groff) may meet your needs. Google them and see.
There are also libraries that generate print-ready documents directly from code. Look at http://libharu.sourceforge.net/ for example. libHaru is a library for generating PDF files (it does not render HTML itself), and its output is a printer-ready PDF.
I think that PostScript is a really good choice for that.
It is actually a very simple language, and it should be easy to parse because it is stack-oriented. Most printers support it, and even if yours does not, you can use Ghostscript to convert it to many different formats (consider Ghostscript a "virtual PS-supporting printer").
Finally, there are a lot of books and tutorials for the language.
About the parsing: you can actually define new variables and functions in PS. So your problem can maybe be solved (almost) entirely using PS.
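To illustrate how little is needed to drive PostScript from a program, here is a minimal C++ sketch that builds a one-page PostScript document with colored text lines. The `TextLine` struct and the `ps_escape` and `make_postscript_page` functions are hypothetical names introduced for this example; the operators used (`findfont`, `scalefont`, `setfont`, `setrgbcolor`, `moveto`, `show`, `showpage`) are standard PostScript. The page position assumes US Letter paper.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one line of colored text (RGB components in 0..1).
struct TextLine {
    std::string text;
    double r, g, b;
};

// Escapes the characters that are special inside a PostScript (...) string.
std::string ps_escape(const std::string& s) {
    std::string out;
    for (char c : s) {
        if (c == '(' || c == ')' || c == '\\') out += '\\';
        out += c;
    }
    return out;
}

// Builds a one-page PostScript program; coordinates are in points (1/72 inch).
std::string make_postscript_page(const std::vector<TextLine>& lines, int font_size) {
    assert(font_size > 0);
    std::ostringstream ps;
    ps << "%!PS\n";
    ps << "/Helvetica findfont " << font_size << " scalefont setfont\n";
    int y = 720;  // one inch below the top of a US Letter page (792 pt tall)
    for (const TextLine& line : lines) {
        ps << line.r << ' ' << line.g << ' ' << line.b << " setrgbcolor\n";
        ps << "72 " << y << " moveto\n";  // one inch from the left edge
        ps << '(' << ps_escape(line.text) << ") show\n";
        y -= font_size + 4;  // move down for the next line
    }
    ps << "showpage\n";
    return ps.str();
}
```

The returned string can be written to a file and sent to a PostScript-capable printer, or handed to Ghostscript for conversion. Tables would need explicit `moveto`/`lineto`/`stroke` drawing on top of this, which is where the "low level" nature of the format shows.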
HTML + CSS can be printed properly. CSS was designed to support this: a media type of print (set with the media attribute on a stylesheet, or with an @media print rule) specifies that your CSS is for printer layout, not for screen layout. Tools like Prince (free and commercial versions) exist to render this for printing.
I think PostScript is the markup language used by printers. I read this somewhere, so correct me if PostScript is now outdated.
http://en.wikipedia.org/wiki/PostScript
For a more powerful suite, you can use LaTeX. It lets you create templates into which you can just copy the text.
On a more GUI friendly note, MS-Word and other word processors have templates. The issue is they are not of a common standard or markup.
You can also use HTML to render stuff in a common markup but it will not be very printer friendly.

Load Excel data into Linux / wxWidgets C++ application?

I'm using wxWidgets to write cross-platform applications. In one of the applications I need to be able to load data from Microsoft Excel (.xls) files, but I need this to work on Linux as well, so I assume I cannot use OLE or whatever technology is available on Windows.
I see that there are many open source programs that can read excel files (OpenOffice, KOffice, etc.), so I wonder if there is some library that I could use?
Excel files it needs to support are very simple, straight tabular data. I don't need to extract any formatting except column/row position and the data itself.
Suggested reference: What is a simple and reliable C library for working with Excel files?
I came across other libraries (chicago on sf.net, xlsLib) but they seem to be outdated.
I can say that I know of a wxWidgets application that reads Excel .xls and .xlsx files on any platform. For the .xlsx files we used an XML parser and a zip stream reader to grab the data we needed, which was pretty easy to get going. For the .xls files we used ExcelFormat, which works well, and we found the author to be very generous with his support.
Maybe just some encouragement to give it a go? It was a couple of days' work to get working.
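For the .xlsx route, each cell element in the worksheet XML carries an A1-style reference in its `r` attribute (for example `r="B12"`), so recovering the column/row position comes down to decoding that string. Below is a minimal C++ sketch, assuming the attribute value has already been extracted by whatever XML parser is in use. `CellPosition` and `parse_cell_reference` are hypothetical names and not part of any library mentioned here.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical result type: zero-based row and column indices.
struct CellPosition {
    int row;
    int col;
};

// Decodes an A1-style reference such as "B12" into zero-based indices.
// Returns false if the text is not 1-3 letters followed by a valid row number.
bool parse_cell_reference(const std::string& ref, CellPosition& out) {
    std::size_t letters = 0;
    while (letters < ref.size() &&
           std::isalpha(static_cast<unsigned char>(ref[letters]))) {
        ++letters;
    }
    // .xlsx columns run from A to XFD, so at most three letters are valid.
    if (letters == 0 || letters > 3 || letters == ref.size()) return false;

    // Column letters form a bijective base-26 number: A=1 ... Z=26, AA=27.
    int col = 0;
    for (std::size_t i = 0; i < letters; ++i) {
        col = col * 26 + (std::toupper(static_cast<unsigned char>(ref[i])) - 'A' + 1);
    }
    assert(col >= 1);

    int row = 0;
    for (std::size_t i = letters; i < ref.size(); ++i) {
        if (!std::isdigit(static_cast<unsigned char>(ref[i]))) return false;
        row = row * 10 + (ref[i] - '0');
        if (row > 1048576) return false;  // .xlsx sheets have at most 1,048,576 rows
    }
    if (row == 0) return false;  // rows are numbered from 1

    out.row = row - 1;
    out.col = col - 1;
    return true;
}
```

For straight tabular data, the decoded indices can be used directly to place each cell value into a grid or a wxWidgets table control.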
Maybe http://www.libxl.com/ can help?
I think that it is not something easy to do. xls files are quite complex and it is a proprietary format.
Maybe this is a stupid idea, but why don't you upload your document to Google Docs and access it there? There are some APIs for accessing your documents.
Two potential problems:
- Your app needs internet access.
- Currently there is no C++ API.
But there are APIs for several languages, including Python; see http://code.google.com/intl/fr/apis/gdata/articles/python_client_lib.html