How to save pages in a PDF as images using Python - python-2.7

I want to save all the pages in the PDF document as images using python.
I already tried Imagemagick and pypdf. The results are not good with my type of document (containing graphs, old scanned documents).

When using Imagemagick to convert PDF to say PNG, one can specify the density to rasterize the PDF at high resolution. Then you can resize back down if you want. For example,
convert -density 300 image.pdf -resize WxH image.png
If your PDF is CMYK rather RGB, then add -colorspace sRGB right after -density 300 and before reading the image.pdf.
If that is not good enough increase the density to 600 and try again.
WxH is the final images size that you want.
If using Imagemagick 7, then replace convert with magick

Related

Create a grayscale image from 16-bit data using C++

I have an array of 16-bit values and I want to create an image (such as BMP) so that I can view it. Does anyone have suggestions of how to do this?
Thanks
Say your image is 200px by 100px. Write the 20,000 16-bit vales to a binary file.
At the commandline, run ImageMagick:
magick -depth 16 -size 200x100 gray:yourFile.bin image.png
Or, if you want a JPG
magick ... image.jpg
For a tiny bit more effort, you could write a 16-bit PGM file which would have the benefit that it contains its dimensions and bit depth in a small ASCII header so the conversion in ImageMagick is simpler, and other programs such as GIMP can read it:
magick yourFile.pgm image.jpg
The header would be:
P5
200 100
65535
... binary data as above ...
See Wikipedia NetPBM article.

Shrink the size of a .png file

There are many programs that claim to reduce the size of a .png file but none of the well known ones, optipng , pngcrush , pngquant, allow me to shrink to a specified size. pngcrush tried its hardest, but the result was still way to big for my needs. For .jpg files, jpegoptim has an -m option that does allow me to shrink to the size I need. The obvious solution seemed to be to convert to jpg, shrink to the right size, then convert back, but that doesn't work either, the reconstituted .png file just jumps back to its original size.
Presumably, this has something to do with the structure of .png files.
Is there any way to get a small png file? This png file is an example of something i need to shrink to below 1K bytes.
Thanks for any suggestions!
Use ImageMagick to reduce the colors, then pngcrush to get rid of ancillary chunks:
magick in.png -colors 8 temp.png
pngcrush -rem alla temp.png out.png
results in a 1621-byte file. If you have an older version of ImageMagick, use "convert" instead of "magick". Using "-colors 4" instead of "-colors 8" gets you a 1015-byte file, but the dithering looks very spotty.
Note that these preserve the transparency in the image, while converting to JPEG loses the transparency and makes the background a solid color.
The only solution to your problem that I can think of is to use .jpg instead of .png. The .jpg format was mainly created for its high lossy compression but still gets a good enough image. On the other hand, .png is going for the full transparency and no quality loss. To sum it all up, .jpg is ideal for getting smaller files if quality doesn't matter, and .png is perfect for high-quality images that quality and colour really matter.
Sources:
http://www.labnol.org/software/tutorials/jpeg-vs-png-image-quality-or-bandwidth/5385/, http://www.interactivesearchmarketing.com/jpeg-png-proper-image-formatting/
I can get that 9.5 KB file down to 3.4 KB using the 8-bit palette PNG format. The image has a transparent boundary, which adds unnecessary pixels and an alpha channel for the whole image which isn't needed, since it's rectangular. After stripping the transparent boundary, eliminating the alpha channel, and using a palette, I can get it down to 3.2 KB.
To get any further, I have to use JPEG for lossy compression. At a very low image quality of 5 (out of 100), I can get it down to 1 KB. It shows some artifacts from the severe compression (look around the prompt > and _ to see some of those):

Get image info from header (width, height, etc) without opening it

I have been using OpenCV to read and manage images. But now, given the image path, I want to QUICKLY store only the image path, width and height (I do not want to load the image buffer), for a wide range of image formats.
OpenCV does not allow to do this without loading the whole image. I could implement image header parsers for a set of formats, but this might take a while.
Is there any library or so that can do this?
Yes, ImageMagick at the command line like this:
identify -ping image.png
image.png PNG 510x500 510x500+0+0 8-bit sRGB 2.36KB 0.000u 0:00.000
Or, if you have lots of images to do in a single invocation, use it like this:
identify -ping -format "%f %w %h\n" *.png
a.png 500 500
image.png 510 500
n.png 500 510
s.png 510 500
sw.png 510 550
w.png 510 500
C/C++, Python and Perl bindings are also available. It is installed on most Linux distros, and available for OSX and Windows from here.

How can I compress jpeg image with compression rate 4 bpp or less?

I am trying to compress my .jpeg image in Photoshop.
WHat is the best way to do this?
I am now calculating the bpp taking the image size in kb, calculating how many bits that is. Then I take the image size in pixel*pixel to get the amount of pixels in the image. After that I divide bits/pixels, to find how many bits per pixel the image has.
But How can I change this number? My guess is to change how many kb the image is, but how do i do this?
Thanks for any help!!
Yes, you can achieve higher compression ratio than 4 bits per pixel. Images with solid color can have rate as low as 0.13bpp.
In fact 4bpp is quite poor compression — it's same as uncompressed 16-color image or half of 256-color image, which even GIF can manage. JPEG can look decent at 1-2bpp.
in general, you cannot "compress" a jpeg image. all you can do is to reduce the image quality further in order to achieve a lower bpp value. jpeg streams are always compressed and they use a lossy compression method. it means that the original image will never ever be reconstructed from a jpeg image. the smaller the file the more information you have lost.
a specific "bpp value" is not, and should never be your target. especially with lossy compression. you should always look at your current image and decide whether it is still good enough or not.
if you still have the original image, try a lossless compression format, like zip-compressed or lzw-compressed tiff or compressed png. i'm sure PhotoShop can handle these formats as well. another softwares like IrfanView (https://www.irfanview.com/) or XnView MP (https://www.xnview.com/en/xnviewmp/) will convert your images too.
if you want manual (eg. full) control over your images, you should use command line utilities, like ImageMagick (https://imagemagick.org/) or NConvert (please find the XnView MP link above)
if you have only the jpeg images do not touch (edit & save) them. with every single save operation you lose another bunch of information. you should always work on file copies.
you should always keep your master image (the very picture you took with your phone or your camera).
of course, these rules of thumb will not answer your original question.

What options for convert (ImageMagick or GraphicsMagick) produce the smallest (filesize) PNG?

ImageMagick creates some pretty large PNGs. GraphicsMagick is a lot better, but I'm still looking for the best options to use with convert to obtain the smallest filesize png.
I have here a large png with a small filesize, and passing this through IM convert I have been unable to reach that filesize, let alone get it smaller. With GM convert I can get it slightly smaller but I'm looking for improvements, generically for any image I come across.
gm convert -quality 95 a_png.png gm.png
convert -quality 95 -depth 8 a_png.png im.png
gm identify *
a_png.png PNG 2560x2048+0+0 PseudoClass 256c 8-bit 60.1K 0.000u 0:01
gm.png[1] PNG 2560x2048+0+0 PseudoClass 256c 8-bit 60.0K 0.000u 0:01
im.png[2] PNG 2560x2048+0+0 DirectClass 8-bit 130.2K 0.000u 0:01
What options for convert produce the smallest PNG filesize?
(Yes, I'm familiar with OptiPNG, PNGOUT and Pngcrush. But I'm after something that will be available without question on every *nix box I happen to be on.)
Looks like you and me are looking for the same answer. Unfortunately there doesn't seem to be many people out there with a good knowledge of GraphicsMagick. This is what I have learned so far,
The quality operator doesn't properly work for any image other than JPEG's. For me it just made the file size bigger when used on PNG's and GIF's.
I have done this to my PNG and GIF files to reduce their size:
gm convert myImage.png +dither -depth 8 -colors 50 myImage.png
+dither stops any dithering of the image when the colors are reduced. (this reduces the file size)
-depth 8 is probably unnecessary as most PNG files are already depth 8.
-colors 50 reduces the number of colors in the image to 50, this is the only way to really reduce the size of a image stored in a lossless format like PNG or GIF.
Obviously for the best image quality/size ratio you cant just reduce the image depth or number of colors without knowing the current depth and number of colors. To determine this information I am doing the following
gm identify -format "file_size:%b,unique_colors:%k,bit_depth:%q" myImage.png
For my image; this returns
file_size:100.7k,unique_colors:13455,bit_depth:8
The problem is when GraphicsMagick reduces colors it always reduces to at least 255, so you can't set the number of colors to 300 for example. Also there seems to be an issue with the alpha channel for PNG files; If the image has transparency in it, reducing colors replaces these colors with transparent; with imagemagick it does not do this.
I just came across this question again so I'll update, GraphicsMagick and ImageMagick have a serious problem. They cannot write out PNG images using a tRNS chunk which means if you try to read in an image that has a tRNS chunk and then write it out, the image will be much bigger. GM is not the best tool for compressing images. You need to use a separate tool such as OptiPNG to compress PNG's again after using Image/GraphicsMagick. I am getting up to 60% smaller files when using OptiPNG after running GraphicsMagick on an image.
Also I was wondering if you have encountered a problem regarding RGBA images and bit depth. For some images I am getting an "Invalid bit depth" exception. I can't see any reason why.
I haven't found a way to do it in the command line, but i did find this free website (http://tinypng.org/) that does an excellent job, my test image got a 71% reduction, final size was only 29% of the original. It looks like you can give it 20 images at a time. I'm looking into how they do it.
http://tinypng.org/