What is the raw form of a compressed image file format (JPEG, PNG, GIF)?

As we know, JPEG, PNG and GIF are all compressed file formats. My question is: what is the original input we provide to these compression algorithms, and in what form is image data stored before it gets converted into one of these file formats?

That depends.
PNG is generally lossless, but it does have a limit on the number of bits/pixel. GIF is lossless too, but it is limited to a palette of 256 colors per frame, so getting a higher number of colors is more complicated. These formats are still compressed, but use a compression that doesn't lose data.
JPEG is lossy. If you save as a JPEG, you will not be able to convert back to another format without losing some clarity. By approximating the image mathematically (as quantized frequency coefficients) the file can get quite small, but the image can start to look "blurry" as the approximations get worse.
There are other image formats, like TIFF, RAW and BMP, which generally don't do any compression. They are really more like containers and technically can contain compressed data, but they usually don't.
The original, uncompressed data depends on what generates it. Photoshop saves a file as a PSD but may represent the image differently in memory. Every digital camera may have a different way of laying out its internal memory, and the photo sensors tend to map 1 to 1 from a sensor to a memory location of a set number of bits.
The common pattern, however, is that each pixel of the image is stored as 3 (sometimes 4) color values, each one between 8 and 16 bits. The 3 values may represent Red, Green and Blue, or alternatively Hue, Saturation and Value. For design, it could be CMYK (Cyan, Magenta, Yellow and blacK). There could also be an alpha value. It's unusual to use more than 16 bits for each color channel and most common to use 8. Using 12 bits is considered by most to be full color, but that doesn't align very well on 32 bit or even 64 bit machines. Still, 12 bit is used sometimes in digital video signals since when broadcast serially the color values don't need to fit into words.
Different formats will go in a different order. Usually rows first, but some formats start at the bottom row and some start at the top.
So, the real answer is that it depends on what the particular compressor is looking for. Most software that saves as JPEG or PNG will accept multiple formats, and the most common is probably 32 bits/pixel, with 8 bits each for RGB (red, green, blue) and one byte either unused or alpha. It will need the width and height of the image, so the image data should be width*height*4 bytes. You generally pass in a defined constant that tells it the byte order: RGBA, ARGB, BGR, RGB, etc.
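To make that layout concrete, here is a minimal Python sketch of such a raw buffer. It assumes RGBA byte order, rows stored top to bottom, and no row padding; those choices and the helper names are assumptions for illustration, not something any particular encoder requires.

```python
def make_rgba_buffer(width, height, fill=(0, 0, 0, 255)):
    """Return a raw, uncompressed buffer: 4 bytes per pixel, rows top to bottom."""
    return bytearray(bytes(fill) * (width * height))


def pixel_offset(x, y, width, bytes_per_pixel=4):
    """Byte offset of pixel (x, y) in a row-major buffer with no row padding."""
    return (y * width + x) * bytes_per_pixel


def set_pixel(buf, x, y, width, rgba):
    """Overwrite the 4 bytes (R, G, B, A) that belong to pixel (x, y)."""
    off = pixel_offset(x, y, width)
    buf[off:off + 4] = bytes(rgba)


width, height = 4, 3
buf = make_rgba_buffer(width, height)
set_pixel(buf, 1, 2, width, (255, 0, 0, 255))  # one opaque red pixel
print(len(buf))  # width * height * 4 bytes
```

A buffer like this, plus the width, the height and a byte-order constant, is roughly what an encoder is handed before any compression happens.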

Related

jpeg compression - lossy or lossless

I have a few questions on JPEG compression.
On my Windows system, I have some image processing applications, for example Windows MS Paint, which provides an option to convert a BMP image to JPEG format.
Can anybody please tell me which JPEG compression MS Paint uses here: is it lossy or lossless?
If somebody refers to "JPEG Standard compression", which compression does it use internally: lossy or lossless?
Thanks in advance.
Alvin
JPEG is a family of related compression techniques. There is a lossless JPEG, but it is generally relegated to 12-bit medical applications.
Any JPEG that you are likely to use creates loss. This occurs at several steps.
The transformation from RGB to YCbCr. The two color spaces intersect but do not have the same gamut of colors. RGB colors outside of YCbCr get clamped into range. Also, the transformation from RGB to YCbCr is a floating-point operation whose results are stored as integers, so there are rounding errors.
The Discrete Cosine Transform is usually performed on the data using scaled integers. This introduces small rounding errors. Even if you do this in floating point there will be some small errors and the values have to be rounded to integers for the final output.
Quantization is the big one. This divides the DCT output by integer values. You can eliminate rounding at this step by making all the quantization values 1.
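Two of these steps are easy to try in isolation. The Python sketch below uses the JFIF-style full-range conversion constants (an assumption; an encoder may use a different matrix or fixed-point arithmetic) and a single made-up coefficient rather than a real 8x8 DCT block.

```python
def clamp8(v):
    """Round to the nearest integer and clamp into the 0..255 byte range."""
    return max(0, min(255, round(v)))


def rgb_to_ycbcr(r, g, b):
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return clamp8(y), clamp8(cb), clamp8(cr)


def ycbcr_to_rgb(y, cb, cr):
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return clamp8(r), clamp8(g), clamp8(b)


def quantize_roundtrip(coeff, q):
    """Quantize one DCT coefficient with step q, then dequantize it."""
    return round(coeff / q) * q


# Pure red does not survive the color round trip unchanged.
print(ycbcr_to_rgb(*rgb_to_ycbcr(255, 0, 0)))

# A step of 10 changes the coefficient; a step of 1 leaves an integer intact.
print(quantize_roundtrip(37, 10), quantize_roundtrip(37, 1))
```

The color conversion loses a little through rounding and clamping, while the quantization step loses as much as the chosen step size allows.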
JPEG compression is considered lossy because the exact original binary data cannot be rebuilt by decompressing the file.
Even at the highest quality, JPEG works by discarding data. You control the quality to trade off what you think is an acceptable loss to still have a fair representation of your image. Although data is lost, what can be seen might still be identical to the untrained eye - and that is the point. The same as what minidisc used to do for audio.
The intent for JPEG is to make photographic images smaller in file size for internet transmission, you get to decide how small, but if you want absolute quality a format like TIFF is better suited.
Incidentally, TIFF offers lossless compression, but the file sizes are still massive!
One more thing: if you take a 300 x 500 bitmap, convert it to JPEG, and then convert it back, the file size will still be the same, because a bitmap stores a fixed number of bits per pixel. But the contents of the file will be quite different. In this regard it might naively be viewed as lossless, but in practical terms it is far from it.
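The fixed size is simple arithmetic. The sketch below assumes an uncompressed BMP with the common 14-byte file header plus 40-byte info header and no color table; the function names are made up for illustration.

```python
def bmp_row_stride(width, bits_per_pixel):
    """Bytes per row; BMP pads every row to a multiple of 4 bytes."""
    return (width * bits_per_pixel + 31) // 32 * 4


def bmp_file_size(width, height, bits_per_pixel=24, header_bytes=14 + 40):
    """Size of an uncompressed BMP with no color table (assumed layout)."""
    return header_bytes + bmp_row_stride(width, bits_per_pixel) * height


# The result depends only on dimensions and bit depth, never on the pixels.
print(bmp_file_size(300, 500))
```

Since nothing in the calculation looks at pixel values, a JPEG round trip can change every pixel without changing the BMP's size.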

Can't find logic behind png file sizes

I'm saving a large number of small png files for use in a game on a phone, so space is at a premium.
I'm trying to figure out the logic behind the file sizes so I can save things most efficiently, but even after using pngcrush the sizes are totally inconsistent.
I saved a 1x1 image and it takes 3kb. I have another 23x21 image which takes only 2kb. I have two images which are almost the same size, but one takes 6kb and the other takes 13kb. I doubled the image height and copied one image into the empty space of the other and saved that. The combined image is only 11kb!
Why is a 1x1 image larger than a 23x21 image? Why can I combine a 13kb image and a 6kb image and get an 11kb image?
Here are the images I'm talking about (there's a 1x1 pixel in between the 1st and second images. It's difficult to see, so I'll just give the URL: http://g42.org/temp/png/1x1.png):
example http://g42.org/temp/png/hat.png
example http://g42.org/temp/png/1x1.png
example http://g42.org/temp/png/helmet1.png
example http://g42.org/temp/png/helmet2.png
example http://g42.org/temp/png/helmet1_2.png
It's not a compression thing. The problem with the 1x1 image is that it carries metadata (added by Photoshop, it seems): a color profile (iCCP chunk). If you look inside the binary, it's the data between the strings "iCCP" and "IDAT". It could be removed, and you would get a 69-byte file.
If you reopen and save the file with most image viewers (e.g. XnView), or use pngcrush, you can strip that chunk. See it here: http://i.stack.imgur.com/fmOdA.png
And regarding the helmet images: besides other informational chunks (ImageReady adds some informational text, as you can see), the difference is due to different formats: the two-helmet image is paletted (8 bits per pixel), while the single helmet is RGB with alpha (32 bits per pixel).
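If you want to see which chunks a PNG carries without opening a hex editor, a few lines of Python will do. This is only a sketch: it builds its own minimal 1x1 image so that it runs without any file, it does not verify CRCs, and the helper names are invented for the example.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type, data):
    """Serialize one chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def make_1x1_png(rgba=(255, 0, 0, 255)):
    """Build a minimal 1x1 RGBA PNG with only the mandatory chunks."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 6, 0, 0, 0)  # 8 bits, color type 6 = RGBA
    scanline = b"\x00" + bytes(rgba)  # filter type 0 (None) + one pixel
    return (PNG_SIGNATURE
            + make_chunk(b"IHDR", ihdr)
            + make_chunk(b"IDAT", zlib.compress(scanline))
            + make_chunk(b"IEND", b""))


def list_chunks(png_bytes):
    """Return (type, data_length) for every chunk in a PNG byte string."""
    if png_bytes[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    chunks, pos = [], 8
    while pos < len(png_bytes):
        length, chunk_type = struct.unpack(">I4s", png_bytes[pos:pos + 8])
        chunks.append((chunk_type.decode("ascii"), length))
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
    return chunks


png = make_1x1_png()
print(len(png), list_chunks(png))
```

Running list_chunks on the bytes of a real file (for example the 1x1.png from the question) should show any extra chunks such as iCCP or tEXt along with their sizes.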
PNG compression is based on the same algorithm as zlib and is highly sensitive to the data that is being compressed so you won't see a consistent relationship between image size and file size. In the case of the combined image, it is still bigger than the smaller image and given the similarity of the two halves of the image, the compressor was probably able to reuse a lot of the Huffman tree. I don't know enough about the algorithm to say for certain how it ended up smaller than the other half.
As long as you are not seeing oddities like the 1x1 image, which you seem to have figured out in the comments, I don't think this will make a lot of sense without extensive study of image compression.
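The effect of repeated content is easy to reproduce with zlib directly. The sketch below uses pseudo-random bytes as a stand-in for one image's pixel data (an assumption; real images behave less extremely) and compresses it once alone and once doubled.

```python
import random
import zlib

rng = random.Random(42)  # fixed seed so the demo is repeatable
half = bytes(rng.randrange(256) for _ in range(4000))  # stand-in pixel data

one = len(zlib.compress(half))
both = len(zlib.compress(half + half))  # second half repeats the first

# Expect 'both' to be well under 2 * 'one': the repeat is encoded as
# back-references into the first half instead of being stored again.
print(one, both)
```

Doubling the input does not come close to doubling the output, which is the kind of inconsistency between image size and file size described above.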
There is a great utility called pngcrush
http://pmt.sourceforge.net/pngcrush/
Compressing to PNG is a rather difficult task. There are lots of assumptions and strategies to try: do we create a palette, or are we better off without it?
PNGcrush essentially bruteforces 100+ different compression strategies, while at the same time trimming useless tags and sections.
PNG has several sub-formats: 24-bit with or without alpha, 8-bit (includes alpha), grayscale, etc. which use different amount of bytes per pixel and have different "compressibility".
Plus PNG supports several compression tricks (filters and gzip settings) which affect how well image data is compressed.
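As a rough illustration of why filters matter, the sketch below compresses the same synthetic 8-bit grayscale gradient twice: once with filter type 0 (None) and once with filter type 1 (Sub), which stores each byte as the difference from its left neighbor. The image data is made up for the demo, and real encoders choose filters per row with their own heuristics.

```python
import zlib

WIDTH, HEIGHT = 256, 64

# Synthetic 8-bit grayscale gradient: each row ramps up by 1 per pixel.
rows = [bytes((x + y) % 256 for x in range(WIDTH)) for y in range(HEIGHT)]


def sub_filter(row):
    """PNG filter type 1 (Sub) for 1 byte per pixel: difference from the left neighbor."""
    return bytes((row[i] - (row[i - 1] if i else 0)) % 256 for i in range(len(row)))


unfiltered = b"".join(b"\x00" + row for row in rows)            # filter type 0 (None)
filtered = b"".join(b"\x01" + sub_filter(row) for row in rows)  # filter type 1 (Sub)

# Same pixels, different byte streams handed to the compressor.
print(len(zlib.compress(unfiltered)), len(zlib.compress(filtered)))
```

On smooth gradients like this one the filtered stream is mostly a run of identical small values, which deflate handles much better than a ramp of distinct bytes.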
On top of that PNG can contain metadata, which sometimes can be pretty large, like some embedded color profiles.
ImageAlpha converts images to the most space-efficient PNG8+alpha variant.
ImageOptim removes junk metadata and finds best compression parameters.
With a combination of those two your images can be reduced by 30-50%.

Writing 10,12 bit TIFF files with LibTIFF C++

I'm trying to write 10,12 bit RGB TIFF files with LibTIFF.
The pixel data is saved locally in an unsigned short buffer (16bits)
1) If I set TIFFTAG_BITSPERSAMPLE to 10 or 12, not enough bits are being read from the buffer, and the output is incorrect. (I understand that it is just reading 10 or 12 bits per component, instead of 16 and this is the problem)
2) I tried packing the bits in the buffer, so that it is really 12-R, 12-G, 12-B. In this case, I think the file is being written correctly but no viewer I could find could display this image properly.
3) If I set TIFFTAG_BITSPERSAMPLE to 16, viewers can display the TIFF image, but then I have a problem that I don't know if the image was originally 10 or 12 bits (If I want to later read it with LibTIFF). Also, the viewer expects the dynamic range to be 16 bits and not 10 or 12, also resulting in a bad view.
4) The most annoying part is that I couldn't find one 10, 12, or 14 bit TIFF image on the web to see what the header is supposed to look like.
So finally, what is the proper way to write 10 or 12 bit image data to a TIFF file?
The TIFF specification does not specify a way to store 10, 12 or 14 bits per channel in an image. Depending on the encoder and decoder, it may still be possible to work with such images, but it is effectively an implementation detail, as they are not required to do this.
If you want more than 8 bits of precision in a TIFF, your only choice is 16 (or floating point, but that's a different story).
I'm not aware of any image format with specific support for these bit depths, so viewers will likely be a problem anyway if you must store the image with that specific bit depth. The simplest workaround I can think of would be to just store the data as 16 bits per channel and put the original bit depth in metadata (e.g. in an ImageDescription tag), but it all depends on what the images will be used for and why you need this information.
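One possible way to do that, sketched in Python rather than LibTIFF calls: stretch each sample to the full 16-bit range so viewers see the expected dynamic range, and remember the source depth in a small text string. The function names and the "source_bits=12" description format are assumptions for illustration, not a TIFF convention.

```python
def scale_up_to_16(value, source_bits):
    """Stretch a sample to 16 bits by bit replication, so full scale maps
    to 65535. Assumes 8 <= source_bits < 16."""
    shift = 16 - source_bits
    return (value << shift) | (value >> (source_bits - shift))


def scale_down_from_16(value16, source_bits):
    """Recover the original sample: it sits in the top source_bits bits."""
    return value16 >> (16 - source_bits)


source_bits = 12
description = f"source_bits={source_bits}"  # hypothetical ImageDescription text

sample = 0x123
stored = scale_up_to_16(sample, source_bits)
print(hex(stored), scale_down_from_16(stored, source_bits) == sample, description)
```

Because the original value occupies the top bits of the stored sample, the conversion is exactly reversible as long as the reader knows the source depth.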
You can store the image as a multi-image file. For example, with a 12-bit source, one image would be an RGB(8) image using the upper 8 bits, and a second would be a 16-bit grayscale image combining the low four bits with four bits of padding. This gives a TIFF that can be viewed on a monitor with standard programs, while the extra precision can be retrieved with custom software.
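The split itself is plain bit arithmetic. A minimal sketch for one 12-bit sample follows (helper names invented; writing the two planes as TIFF pages is left out):

```python
def split_12bit(sample):
    """Split a 12-bit sample into (upper 8 bits, lower 4 bits)."""
    return sample >> 4, sample & 0x0F


def join_12bit(high8, low4):
    """Rebuild the 12-bit sample from the two stored parts."""
    return (high8 << 4) | low4


high, low = split_12bit(0xABC)
print(hex(high), hex(low), hex(join_12bit(high, low)))
```

The upper part is what a standard viewer would display; custom software would read both parts and call something like join_12bit to get the full precision back.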
I disagree that 'exotic' bit depths are not good. This format would reduce the image size by 5/6. You could even just store the second image as a re-scaled version that would have the 4 bits tightly packed without padding for a 3/4 size reduction. These savings can be significant with very large data sets, where compression is not an option due to the nature of the data; i.e., many scientific and machine vision applications may want the unadulterated bits. The ability to convert from the multi-image TIFF to a 16-bit TIFF would allow the use of standard programs and image libraries.

Reading a BMP into memory using the correct structures

I'm currently doing a steganography project (for myself). I have done a bit of code already but after thinking about it, I know there are better ways of doing what I want.
Also - this is my first time using dynamic memory allocation and binary file I/O.
Here is my code to hide a text file within a BMP image: Link to code
Also note that I'm not using the LSB to store the message in this code, but rather replacing the alpha byte, assuming it's a 32 bit per pixel (bpp) image. That is another reason why this won't be very flexible if there are 1, 4, 8, 16 or 24 bpp in the image. For example, if it were 24 bpp, the alpha channel would be 6 bits, not 1 byte.
My question is what is the best way to read the entire BMP into memory using structures?
This is how I see it:
Read BITMAPFILEHEADER
Read BITMAPINFOHEADER
Read ColorTable (if there is one)
Read PixelArray
I know how to read in the two headers, but the ColorTable is confusing me. I don't know what size the ColorTable is, or if there is one in an image at all.
Also, after the PixelArray, Wikipedia says that there could be an ICC Color Profile, how do I know one exists? Link to BMP File Format (Wikipedia)
Another thing, since I need to know the header info in order to know where the PixelArray starts, I would need to make multiple reads like I showed above, right?
Sorry for all the questions in one, but I'm really unsure at the moment on what to do.
The size of the color table is determined by bV5ClrUsed.
An ICC color profile is present in the file only if bV5CSType == PROFILE_EMBEDDED.
The documentation here provides all that information.
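As a rough sketch of the reading order, here is how the two headers and the color table size could be decoded in Python with struct (the question's code is not shown, so this is not a rewrite of it). It reads only the first 40 bytes of the info header, which the V4/V5 headers share with BITMAPINFOHEADER, so the colors-used field below corresponds to bV5ClrUsed. It ignores the extra bit masks that BI_BITFIELDS images can place after a 40-byte header, and the dictionary keys are invented for the example.

```python
import struct

FILE_HEADER = struct.Struct("<2sIHHI")        # BITMAPFILEHEADER, 14 bytes
INFO_HEADER = struct.Struct("<IiiHHIIiiII")   # BITMAPINFOHEADER, 40 bytes


def parse_bmp_headers(data):
    """Return the fields needed to locate the color table and pixel array."""
    magic, _file_size, _r1, _r2, pixel_offset = FILE_HEADER.unpack_from(data, 0)
    if magic != b"BM":
        raise ValueError("not a BMP file")
    (header_size, width, height, _planes, bit_count, compression,
     _image_size, _xppm, _yppm, colors_used, _important) = INFO_HEADER.unpack_from(data, 14)
    if colors_used:
        palette_entries = colors_used
    elif bit_count <= 8:
        palette_entries = 1 << bit_count   # 0 means "the maximum for this depth"
    else:
        palette_entries = 0
    return {
        "width": width, "height": height, "bit_count": bit_count,
        "compression": compression, "header_size": header_size,
        "palette_offset": 14 + header_size,
        "palette_bytes": palette_entries * 4,   # one 4-byte RGBQUAD per entry
        "pixel_offset": pixel_offset,
    }


# Demo on a synthetic 2x2, 8bpp header (no real file needed).
sample = (FILE_HEADER.pack(b"BM", 1086, 0, 0, 1078)
          + INFO_HEADER.pack(40, 2, 2, 1, 8, 0, 8, 2835, 2835, 0, 0))
print(parse_bmp_headers(sample))
```

With the headers decoded, the pixel array can be read in one more step by seeking to pixel_offset, so a few small reads (or one read of the whole file followed by slicing) are enough.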
Then, 24-bit color means 8 red, 8 green, 8 blue, 0 alpha. You'd have to convert that to 32-bit RGBA in order to have any alpha channel at all.
Finally, the alpha channel DOES affect the display of the image, so you can't use it freely for steganography. You really are better off using the least significant bits of all channels (and maybe not from all pixels).
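A minimal sketch of the least-significant-bit idea, written in Python for brevity and operating on a plain byte buffer rather than a parsed BMP. It uses every byte in order, which is the simplest variant and not the most discreet one; the function names are made up.

```python
def embed_lsb(pixels, message):
    """Hide message bytes in the least significant bit of successive pixel bytes."""
    bits = [(byte >> i) & 1 for byte in message for i in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("message too long for this pixel buffer")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit
    return out


def extract_lsb(pixels, length):
    """Read back `length` bytes from the low bits written by embed_lsb."""
    message = bytearray()
    for n in range(length):
        byte = 0
        for bit_index in range(8):
            byte = (byte << 1) | (pixels[n * 8 + bit_index] & 1)
        message.append(byte)
    return bytes(message)


cover = bytes(range(256))          # stand-in for a pixel array
stego = embed_lsb(cover, b"hi")
print(extract_lsb(stego, 2))
```

Each channel value changes by at most 1, which is why this is far less visible than overwriting a whole alpha byte.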

image color conversion

I need to convert 24bppRGB to 16bppRGB, 8bppRGB, 4bppRGB, 8bpp grayscale and 4bpp grayscale. Any good link or other suggestions?
preferably using Windows/GDI+
[EDIT] speed is more critical than quality. source images are screenshots
[EDIT1] color conversion is required to minimize space
You're better off getting yourself a library, as others have suggested. Aside from ImageMagick, there are others, such as OpenCV. The benefits of leaving this to a library are:
Save yourself some time -- by cutting out dev and testing time for the algorithm
Speed. Most libraries out there are optimized to a level far greater than a standard developer (such as ourselves) could achieve
Standards compliance. There are many image formats out there, and using a library cuts the problem of standards compliance out of the equation.
If you're doing this yourself, then your problem can be divided into the following sub-problems:
1. Simple color quantization. As @Alf P. Steinbach pointed out, this is just "downscaling" the number of colors. RGB24 has 8 bits for each of the R, G and B channels. For RGB16 you can do a number of conversions:
   - Equal number of bits for each of R, G, B. This typically means 4 or 5 bits each.
   - Favor the green channel (human eyes are more sensitive to green) and give it 6 bits. R and B get 5 bits each.
   You can even do the same thing for RGB24 to RGB8, but the results won't be as pretty as a palettized image:
   - 4 bits green, 2 red, 2 blue.
   - 3 bits green, 5 bits split between red and blue.
2. Palettization (indexed color). This is for going from RGB24 to RGB8 and RGB4. This is a hard problem to solve by yourself.
3. Color to grayscale conversion. Very easy. Convert your RGB24 to the Y'UV color space and keep the Y' channel. That will give you 8bpp grayscale. If you want 4bpp grayscale, then you either quantize or palettize.
Also be sure to check out chroma subsampling. Often, you can decrease the bitrate by a third without visible losses to image quality.
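The quantization and grayscale steps come down to a few shifts and one weighted sum. Here is a minimal per-pixel sketch in Python (the question asks about GDI+, so treat this as an illustration of the arithmetic only). The 3-3-2 split and the BT.601 luma weights are one common choice among several, and the function names are made up.

```python
def rgb888_to_rgb565(r, g, b):
    """5 bits red, 6 bits green, 5 bits blue, packed into 16 bits."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)


def rgb888_to_rgb332(r, g, b):
    """3 bits red, 3 bits green, 2 bits blue, packed into one byte."""
    return ((r >> 5) << 5) | ((g >> 5) << 2) | (b >> 6)


def rgb888_to_gray8(r, g, b):
    """Luma (Y') with the BT.601 weights, rounded to 8 bits."""
    return min(255, round(0.299 * r + 0.587 * g + 0.114 * b))


def gray8_to_gray4(gray):
    """Keep the top 4 bits: 16 gray levels."""
    return gray >> 4


pixel = (200, 120, 40)
print(hex(rgb888_to_rgb565(*pixel)), hex(rgb888_to_rgb332(*pixel)),
      rgb888_to_gray8(*pixel), gray8_to_gray4(rgb888_to_gray8(*pixel)))
```

Dropping low bits like this is fast, which matters when speed is the priority, but it produces visible banding on gradients; dithering or a palette would be the next step if the quality is not good enough.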
With that breakdown, you can divide and conquer. Problems 1 and 3 you can solve pretty quickly. That will allow you to see the quality you can get simply by doing coarser color quantization.
Whether or not you want to solve Problem 2 will depend on the result from above. You said that speed is more important, so if the quality of color quantization alone is good enough, don't bother with palettization.
Finally, you never mentioned WHY you are doing this. If this is for reducing storage space, then you should be looking at image compression. Even lossless compression will give you better results than reducing the color depth alone.
EDIT
If you're set on using PNG as the final format, then your options are quite limited, because both RGB16 and RGB8 are not valid combinations in the PNG header.
So what this means is: regardless of bit depth, you will have to switch to indexed color if you want RGB color images below 24bpp (8 bits per channel). This means you will NOT be able to take advantage of the color quantization and chroma decimation that I mentioned above -- it's not supported in PNG. So this means you will have to solve Problem 2 -- palettization.
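For reference, the combinations the PNG header accepts can be written down as a small lookup table. The sketch below encodes the color type and bit depth pairs from the PNG specification; the bit depth is per sample (or per palette index), and the function name is made up.

```python
# Bit depths the PNG specification allows for each color type.
PNG_ALLOWED_DEPTHS = {
    0: (1, 2, 4, 8, 16),  # grayscale
    2: (8, 16),           # truecolor (RGB)
    3: (1, 2, 4, 8),      # indexed color (palette)
    4: (8, 16),           # grayscale + alpha
    6: (8, 16),           # truecolor + alpha (RGBA)
}


def is_valid_png_format(color_type, bit_depth):
    """True if the IHDR color type / bit depth pair is allowed."""
    return bit_depth in PNG_ALLOWED_DEPTHS.get(color_type, ())


print(is_valid_png_format(2, 5))   # RGB with 5 bits per channel
print(is_valid_png_format(3, 4))   # 16-color palette
print(is_valid_png_format(0, 4))   # 4bpp grayscale
```

So 4bpp and 8bpp grayscale can be stored directly, while any RGB image below 8 bits per channel has to go through a palette.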
But before you think about that, some more questions:
What are the dimensions of your images?
What sort of ideal file-size are you after?
How close to that ideal file-size do you get with straight RGB24 + PNG compression?
What is the source of your images? You've mentioned screenshots, but since you're so concerned about disk space, I'm beginning to suspect that you might be dealing with image sequences (video). If this is so, then you could do better than PNG compression.
Oh, and if you're serious about doing things with PNG, then definitely have a look at this library.
Find yourself a copy of the ImageMagick [sic] library. It's very configurable, so you can teach it about the details of some binary format that you need to process...
See: ImageMagick, which has a very practical license.
I got acceptable (preliminary) results with GDI+ v1.1, which ships with Vista and Win7. It allows conversion to 16bpp (I used PixelFormat16bppRGB565) and to 8bpp and 4bpp using standard palettes. Better quality is possible with the "optimal palette" option, where GDI+ calculates an optimal palette for each screenshot, but the conversion is two times slower. Grayscale was obtained by specifying a simple custom palette, e.g. as demonstrated here, except that I didn't need to modify pixels manually; Bitmap::ConvertFormat() did it for me.
[EDIT] results were really acceptable until I decided to check the solution on WinXP. Surprisingly, Microsoft decided to not ship GDI+ v.1.1 (required for Bitmap::ConvertFormat) to WinXP. Nice move! So I continue researching...
[EDIT] had to reimplement this on clean GDI hardcoding palettes from GDI+