pdf/a-1 to any raster image format - file-conversion

I'm looking for a very simple open source C or C++ converter from PDF/A-1 to any image format. I've looked at MuPDF, but more than 200 source files are too many for my application. PDF/A-1 is a subset of PDF, so my hope is to find something simpler than a generic PDF converter like MuPDF. I don't need to display anything on screen. Any suggestion appreciated,
BR Tommaso

PDF is a complicated format, even with the PDF/A-1 subset. I doubt you can find anything smaller than MuPDF.

Related

why is my libharu pdf oversized with .png images?

I am creating a pdf using libharu in C++ (compiled as a .cgi) that features .png images.
The code is fine, but my PDFs are ridiculously oversized.
Each page features one image of around 30kb and around 4 text characters in libharu's system font. If I open a 20 page output file of 25mb and "print" it to a file in my operating system, it becomes 256kb or so with no visible change to the images.
I think the issue is related to libharu because this guy sees it too, here. He is using PHP, so libharu as a compiled .cgi (my C++ code is also a compiled .cgi, linked to libharu).
Another guy here on stack overflow has also seen size issues with libharu, but his problem does not mention anything to do with .png so it may be unrelated.
Code for reference:
WorkingGraphic = HPDF_LoadPngImageFromMem (*gPdfPtr,
                                           PngAssets[AssetIndex],  // image data ptr
                                           PngSizes[AssetIndex]);  // data length
// Render appropriately
HPDF_Page_DrawImage (*BlitParams->page,
                     WorkingGraphic,
                     BlitParams->OutputRect->X,
                     BlitParams->OutputRect->Y,
                     BlitParams->OutputRect->Width,
                     BlitParams->OutputRect->Height);
Does anyone know how to drive libharu so it creates sensibly sized PDFs when you use .png images?
Right I don't know how to remove a question but maybe this info will be useful to others anyway.
I may have had the same issue as this fellow here where I have duplicated this answer.
What I needed to do was enable compression of the .pdf, which I had not done.
Documentation link
C Code:
HPDF_SetCompressionMode (pdf, HPDF_COMP_ALL);
It's because I didn't do enough research to know that the PDF format does not natively support .png (or, if it has been updated to do so, libharu still doesn't). So this option tells libharu to use zlib to compress everything it can, including your images.
The implementation is not perfect (you will still see a size difference if you zip your output .pdf) but it is acceptable for my use case.
If you don't need the full-size image in the PDF, you can reduce the image to a thumbnail using GDI+ APIs, equal in size to however big you want the image to appear in the PDF.
Save the scaled PNG to a temporary file, and pass the thumbnail PNG to Haru PDF. This will reduce the size of the PDF file.
The image will be pixellated when the viewer zooms in.

How can I pass TIFF image data to JUCE (which does not support TIFF)?

I am learning GUI programming using the C++ JUCE library. That library supports some image file formats (png, jpg), but I want to learn how I can use other file formats, for example tiff.
After googling I found libtiff.
My question is: what is the right approach for displaying a TIFF? Do I need to convert the .tiff file into jpeg/png to do this?
But I think this will require double processing.
Can anyone explain the raw/native/basic image file format, so that I can convert all images into that type and use it directly?
I found something in the WinAPI for dealing with images that uses the image data from the file format.
It would be very helpful if someone could let me know the approach for handling image data and displaying it.
Can anyone explain the raw/native/basic image file format, so that I can convert all images into that type and use it directly?
There is no "native" image file format, but RGB comes close (especially if you strip the headers to give just a Width×Height×Channels array of pixel values). You probably wouldn't want to use this for storing everything though as your buffers will be very large. Let your libraries handle storage.
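As a minimal sketch of what such a headerless Width×Height×Channels array could look like in C++ (the RawImage type and its member names are made up for illustration, and 8 bits per channel with interleaved, row-major storage is an assumption):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper type: a headerless, interleaved 8-bit pixel buffer.
struct RawImage {
    std::size_t width;
    std::size_t height;
    std::size_t channels;            // 3 for RGB, 4 for RGBA
    std::vector<std::uint8_t> data;  // width * height * channels bytes, row-major

    RawImage(std::size_t w, std::size_t h, std::size_t c)
        : width(w), height(h), channels(c), data(w * h * c, 0) {}

    // Byte offset of channel `ch` of the pixel at column x, row y.
    std::size_t offset(std::size_t x, std::size_t y, std::size_t ch) const {
        return (y * width + x) * channels + ch;
    }

    // Bounds-checked (in debug builds) access to a single channel value.
    std::uint8_t& at(std::size_t x, std::size_t y, std::size_t ch) {
        assert(x < width && y < height && ch < channels);
        return data[offset(x, y, ch)];
    }
};
```

A 1920×1080 RGB image in this layout already takes about 6 MB, which is why you would normally keep files compressed on disk and only decode into a buffer like this for display or processing.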
It would be very helpful if someone could let me know the approach for handling image data and displaying it.
There is no "the" approach. C++ itself doesn't say anything about images, and there are loads of ways you can go about working with them. Your design will depend on your functional requirements specification and on what libraries you have available.
I am learning GUI programming using the C++ JUCE library. That library supports some image file formats (png, jpg), but I want to learn how I can use other file formats, for example tiff.
After googling I found libtiff.
My question is: what is the right approach for displaying a TIFF? Do I need to convert the .tiff file into jpeg/png to do this?
But I think this will require double processing.
If you mean using libtiff to convert TIFF-format images to formats that JUCE supports, you're right in saying that this introduces an extra initial processing step. However, as far as you've said, it sounds like any possible performance hit through this will be vastly, wildly and hugely outweighed by the benefit of simplicity and clarity. So I'd just do that.
To read *.tiff images and use them in an application built with the JUCE framework, I would suggest creating a new class derived from the base interface ImageFileFormat.
// Exact signatures differ between JUCE versions; check ImageFileFormat in your headers.
class MyTiffFormat : public ImageFileFormat
{
private:
    // Non-copyable: declared but never defined
    MyTiffFormat( const MyTiffFormat& );
    MyTiffFormat& operator=( const MyTiffFormat& );
public:
    MyTiffFormat();
    ~MyTiffFormat();
    const String getFormatName();
    bool canUnderstand( InputStream& input );   // true if the stream looks like a TIFF
    Image decodeImage( InputStream& input );    // this is where libtiff does the work
    bool writeImageToStream( const Image& source, OutputStream& dest );
};
Implementing the function "Image decodeImage( InputStream& input )" is the point where you need something like libtiff. In the JUCE source tree you will find the implementation for PNG and the other supported formats in the folder: \juce\src\gui\graphics\imaging
More information on extending JUCE features can be found in the JUCE user forum.
Juce works great with pngs, jpgs, and gifs (not animated), and they can be read from file, or even "compiled" with the BinaryBuilder.
For example to load it from compiled c++ with BinaryBuilder.
someImage = ImageFileFormat::loadFrom (AppResources::image_png, AppResources::image_pngSize);
Check out the doxygen docs, they are quite helpful. To compile your images with BinaryBuilder, the syntax is:
./BinaryBuilder someFolder otherFolder ClassName

C++ - How could I do some operation on bmp file?

I am interested in doing some transformations, like changing one color to another, counting all used colors, and resizing the image. I DO NOT want to use any existing library; I would like to write all the code myself.
Summing up: How could I open BMP file and change it?
Start by learning the bitmap file format. It is very easy to understand and implement.
You can get any file format by going to www.wotsit.org and searching for the file type you want. In your case BMP. There are different types of bitmaps so you can figure out which ones you want to implement.
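To give an idea of how little is needed for the common case, here is a rough sketch that pulls the main fields out of a BMP header. It assumes the whole file has already been read into a byte vector and that the file uses the 40-byte BITMAPINFOHEADER (or a later, larger header). The BmpInfo struct and the function names are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type holding the header fields most code needs.
struct BmpInfo {
    std::uint32_t pixelOffset;   // file offset where the pixel data starts
    std::int32_t  width;
    std::int32_t  height;        // negative means rows are stored top-down
    std::uint16_t bitsPerPixel;
    std::uint32_t compression;   // 0 means uncompressed (BI_RGB)
};

// BMP stores multi-byte fields in little-endian order.
static std::uint32_t readLe32(const std::vector<std::uint8_t>& b, std::size_t i) {
    assert(i + 4 <= b.size());
    return std::uint32_t(b[i]) | (std::uint32_t(b[i + 1]) << 8) |
           (std::uint32_t(b[i + 2]) << 16) | (std::uint32_t(b[i + 3]) << 24);
}

static std::uint16_t readLe16(const std::vector<std::uint8_t>& b, std::size_t i) {
    assert(i + 2 <= b.size());
    return static_cast<std::uint16_t>(b[i] | (b[i + 1] << 8));
}

// Returns false if the data is too short, lacks the "BM" signature,
// or uses the older 12-byte core header that this sketch does not handle.
bool parseBmpHeader(const std::vector<std::uint8_t>& bytes, BmpInfo& info) {
    if (bytes.size() < 54) return false;           // 14-byte file header + 40-byte info header
    if (bytes[0] != 'B' || bytes[1] != 'M') return false;
    if (readLe32(bytes, 14) < 40) return false;    // size of the info header

    info.pixelOffset  = readLe32(bytes, 10);
    info.width        = static_cast<std::int32_t>(readLe32(bytes, 18));
    info.height       = static_cast<std::int32_t>(readLe32(bytes, 22));
    info.bitsPerPixel = readLe16(bytes, 28);
    info.compression  = readLe32(bytes, 30);
    return true;
}
```

Reading field by field like this, instead of casting the bytes to a struct, avoids problems with struct padding and byte order.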
I would start with reading some documentation. Maybe go to Wikipedia for an overview.
You need to read in the binary file, figure out what all the bits mean, do your transformation, and write out a new binary file. For figuring out the format of various binary files, wotsit is the best resource I've found. They have links to 5 specs for BMP format files.
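For the "do your transformation" step, a sketch of two of the operations from the question (counting colors and swapping one color for another) could look like this. It assumes an uncompressed 24-bit BMP whose pixel array is already in memory. In that layout each pixel is stored as blue, green, red, and each row is padded to a multiple of 4 bytes. The function names are invented for the example, and colors are passed as 0xRRGGBB.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Bytes per row of an uncompressed 24-bit BMP (rows are padded to a multiple of 4).
std::size_t bmpRowStride(std::size_t width) {
    return (width * 3 + 3) / 4 * 4;
}

// Pack the B, G, R bytes found at p into a single 0xRRGGBB value.
static std::uint32_t packBgr(const std::uint8_t* p) {
    return (std::uint32_t(p[2]) << 16) | (std::uint32_t(p[1]) << 8) | p[0];
}

// Number of distinct colors in the pixel array (padding bytes are skipped).
std::size_t countColors(const std::vector<std::uint8_t>& pixels,
                        std::size_t width, std::size_t height) {
    const std::size_t stride = bmpRowStride(width);
    assert(pixels.size() >= stride * height);
    std::set<std::uint32_t> seen;
    for (std::size_t y = 0; y < height; ++y)
        for (std::size_t x = 0; x < width; ++x)
            seen.insert(packBgr(&pixels[y * stride + x * 3]));
    return seen.size();
}

// Replace every pixel equal to `from` with `to`; returns how many pixels changed.
std::size_t replaceColor(std::vector<std::uint8_t>& pixels,
                         std::size_t width, std::size_t height,
                         std::uint32_t from, std::uint32_t to) {
    const std::size_t stride = bmpRowStride(width);
    assert(pixels.size() >= stride * height);
    std::size_t changed = 0;
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            std::uint8_t* p = &pixels[y * stride + x * 3];
            if (packBgr(p) == from) {
                p[0] = static_cast<std::uint8_t>(to & 0xFF);          // blue
                p[1] = static_cast<std::uint8_t>((to >> 8) & 0xFF);   // green
                p[2] = static_cast<std::uint8_t>((to >> 16) & 0xFF);  // red
                ++changed;
            }
        }
    }
    return changed;
}
```

Since neither operation changes the image dimensions, the original header can be written back unchanged in front of the modified pixel array. Resizing is different: the width, height, and size fields in the header have to be updated as well.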

C++ Importing and Renaming/Resaving an Image

Greetings all,
I am currently a rising Sophomore (CS major), and this summer, I'm trying to teach myself C++ (my school codes mainly in Java).
I have read many guides on C++ and gotten to the part with ofstream, saving and editing .txt files.
Now, I am interested in simply importing an image (jpeg, bitmap, not really important) and renaming the aforementioned image.
I have googled, asked around but to no avail.
Is this process possible without the download of external libraries (I dled CImg)?
Any hints or tips on how to expedite my goal would be much appreciated
Renaming an image is typically about the same as renaming any other file.
If you want to do more than that, you can also change the data in the Title field of the IPTC metadata. This does not require JPEG decoding, or anything like that -- you need to know the file format well enough to be able to find the IPTC metadata, and study the IPTC format well enough to find the Title field, but that's about all. Exactly how you'll get to the IPTC metadata will vary -- navigating a TIFF (for one example) takes a fair amount of code all by itself.
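For the plain rename case, the standard library is enough. A minimal sketch, where renameFile is just an illustrative wrapper name around std::rename from <cstdio>:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Rename (or move within the same filesystem) a file of any type.
// Returns true on success. What happens when `to` already exists is
// implementation-defined: POSIX systems replace it, Windows reports an error.
bool renameFile(const std::string& from, const std::string& to) {
    assert(!from.empty() && !to.empty());
    return std::rename(from.c_str(), to.c_str()) == 0;
}
```

Something like renameFile("holiday.jpg", "beach.jpg") would then rename the picture without touching its contents, so no image library is involved at all.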
When you say "renaming the aforementioned image," do you mean changing metadata in the image file, or just changing the file name? If you are referring to metadata, then you need to either understand the file format or use a library that understands the file format. It's going to be different for each type of image file. If you basically just want to copy a file, you can either stream the contents from one file stream to another, or use a file system API.
std::ifstream infs("input.txt", std::ios::binary);
std::ofstream outfs("output.txt", std::ios::binary);
outfs << infs.rdbuf();
An example of a file system API is CopyFile on Win32.
It's possible without libraries - you just need the image specs and 'C', the question is why?
Targa or bmp are probably the easiest, it's just a header and the image data as a binary block of values.
Gif, jpeg and png are more complex - the data is compressed
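To illustrate the "header plus binary block" point, here is a sketch that builds an uncompressed 24-bit Targa file in memory from tightly packed RGB data. The function name encodeTga24 is invented for this example, and it only covers the simplest variant of the format (image type 2, no color map, no image ID).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Build an uncompressed 24-bit TGA from width*height RGB triples (top row first).
std::vector<std::uint8_t> encodeTga24(const std::vector<std::uint8_t>& rgb,
                                      std::uint16_t width, std::uint16_t height) {
    assert(rgb.size() == std::size_t(width) * height * 3);

    std::vector<std::uint8_t> out(18, 0);                // 18-byte header, mostly zero
    out[2]  = 2;                                         // image type 2: uncompressed true-color
    out[12] = static_cast<std::uint8_t>(width & 0xFF);   // width, little-endian
    out[13] = static_cast<std::uint8_t>(width >> 8);
    out[14] = static_cast<std::uint8_t>(height & 0xFF);  // height, little-endian
    out[15] = static_cast<std::uint8_t>(height >> 8);
    out[16] = 24;                                        // bits per pixel
    out[17] = 0x20;                                      // bit 5 set: first row is the top row

    // Pixel data follows the header directly, stored as blue, green, red.
    out.reserve(18 + rgb.size());
    for (std::size_t i = 0; i < rgb.size(); i += 3) {
        out.push_back(rgb[i + 2]);
        out.push_back(rgb[i + 1]);
        out.push_back(rgb[i]);
    }
    return out;
}
```

Writing the returned bytes to a file opened in binary mode should give an image that common viewers can open.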

Decode JPEG to obtain uncompressed data

I want to decode JPEG files and obtain uncompressed decoded output in BMP/RGB format. I am using GNU/Linux and C/C++.
I had a look at libjpeg, but there seemed not to be any good documentation available.
So my questions are:
Where is documentation on libjpeg?
Can you suggest other C-based jpeg-decompression libraries?
The documentation for libjpeg comes with the source-code. Since you haven't found it yet:
Download the source-code archive and open the file libjpeg.doc. It's a plain ASCII file, not a Word document, so it is better to open it in Notepad or another plain-text editor.
There are some other .doc files as well. Most of them aren't that interesting though.
Unfortunately I cannot recommend any other library besides libjpeg. I tried a couple of alternatives, but libjpeg always won. It is pretty easy to work with once you have the basics done. Also, it's the most complete and most stable JPEG library out there.
MagickWand is the C API for ImageMagick:
http://imagemagick.org/script/magick-wand.php
I have not used it, but the documentation looks quite extensive.
You should check out Qt's QImage. It has a pretty easy interface that makes this task really easy. Setup is pretty simple for every platform.
If Qt is overkill, you can try Magick++ http://www.imagemagick.org/Magick++/. It supports similar operations and is also well suited for that sort of task. The last time I used it, I struggled a bit with dependencies for it on Windows, but don't recall much trouble on Linux.
For Magick++'s Image class, the function you probably want is getConstPixels.
I have code that you can copy ( or just use as a reference ) for loading a jpeg image using the libjpeg library.
You can browse the code here: http://code.google.com/p/kgui/source/browse/trunk/kguiimage.cpp
Just look for the function LoadJPGImage.
The code is setup to handle c++ binding of my DataHandle class to it for loading the image, that way the image can be a file or data already in memory or whatever.
A slightly out of the box solution is to acquire a copy of the netpbm tools, which transform images from pretty much any format to any other format via one of several very simple intermediate formats. They work well from the shell, and are most often used in pipes to read some arbitrary image, perform an operation on it, and write it out to some other format.
The pbm formats can be as simple as a plain ASCII header followed by the RGB data in ASCII or binary. They are intended to be simple enough to use without requiring a library to implement.
JPEG is supported in netpbm by read and write filters that are implemented on top of libjpeg.
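To show how simple these intermediate formats are, here is a sketch that produces a binary PPM (the "P6" variant, 8 bits per channel) from decoded RGB data without any library. The function name encodePpm is made up for this example, and the input is assumed to be tightly packed RGB with the top row first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Encode width*height RGB triples as a binary PPM (P6) with a maximum value of 255.
std::string encodePpm(const std::vector<std::uint8_t>& rgb,
                      std::size_t width, std::size_t height) {
    assert(rgb.size() == width * height * 3);

    // ASCII header: magic number, dimensions, maximum channel value.
    std::string out = "P6\n" + std::to_string(width) + " " +
                      std::to_string(height) + "\n255\n";

    // The raw RGB bytes follow the header with no padding or compression.
    out.append(rgb.begin(), rgb.end());
    return out;
}
```

Written to a file in binary mode, the result can be fed to the netpbm tools or opened by many image viewers, which makes it a convenient way to dump whatever a JPEG decoder hands back.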