Understanding PNG file format IDAT segment - c++

In the sample image below, the yellow border is for display purposes only.
The actual .png file is a simple black/white image, 3 pixels by 3 pixels. I originally thought of trying a 2x2, but that would not help me work out whether the stream is drawn low-to-high or high-to-low. This way, I would have two black, one white from the top, or one white, two black from the bottom.
So I read the chunks of data, get to the IDAT chunk, decode that (zlib) and come up with 12 bytes as follows
00 20 00 40 00 80
So, my question: how does the above get broken down into the 3x3 black and white sample? Also, it is saved in palette format, and I properly recognize the bit depth of 1 and a color palette of 2 entries. Palette[0] is RGBA all zeros. Palette[1] has RGBA of 255, 255, 255, 0.
I'll eventually get into the multiple other depth formats later; I just wanted to start with what I would expect to be the easiest.
Part II. Any guidance on handling the other depth formats would help if anything special to be considered especially regarding alpha channel (which I am already looking for in the palette) that might trip me up.

It would be easier if you used libpng, so I guess this is for learning purposes.
The thing is if you decompress the IDAT chunk directly, you get some data that is not supposed to be displayed and/or may need to be transformed (because a filter was applied) to get the actual bytes. In PNG format each line starts with an extra byte that tells you which filter was applied to that line, the remaining bytes contain the line pixels.
BTW, 00 20 00 40 00 80 are 6 bytes only (not 12, as you think). Now if you see this data as binary, your 3 lines would look like this:
00000000 00100000
00000000 01000000
00000000 10000000
Now, your image is 1 bit per pixel, so 1 byte is required to save a line of 3 pixels. Only the 3 highest bits are actually used (the 5 lower bits are ignored). I replaced the ignored bits with an x, which I think makes it easier to see the actual pixels (0 is black, 1 is white):
00000000 001xxxxx
00000000 010xxxxx
00000000 100xxxxx
In this case, no filter was applied to any line, because the first byte of each line is zero (0 means no filter was applied; values from 1 to 4 mean a filter was applied).
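To make the breakdown above concrete, here is a minimal sketch of unpacking the inflated bytes into one palette index per pixel. The function name unpackIndexedScanlines is made up for illustration, and the sketch assumes a non-interlaced image with a bit depth of 1, 2, 4 or 8 where every scanline uses filter type 0; it simply refuses anything else.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper: turns inflated IDAT data into one palette index per pixel.
// Assumes non-interlaced data, bit depth 1/2/4/8, and filter type 0 on every row.
std::vector<std::uint8_t> unpackIndexedScanlines(
    const std::vector<std::uint8_t>& inflated,
    std::size_t width, std::size_t height, unsigned bitDepth)
{
    const std::size_t rowBytes = (width * bitDepth + 7) / 8; // packed pixel bytes per row
    const std::size_t stride = rowBytes + 1;                 // plus the leading filter byte
    if (inflated.size() < stride * height)
        throw std::runtime_error("inflated data too short");

    std::vector<std::uint8_t> pixels;
    pixels.reserve(width * height);
    const unsigned mask = (1u << bitDepth) - 1u;

    for (std::size_t y = 0; y < height; ++y) {
        const std::uint8_t* row = &inflated[y * stride];
        if (row[0] != 0)
            throw std::runtime_error("filtered scanline; unfiltering not implemented");
        for (std::size_t x = 0; x < width; ++x) {
            const std::size_t bitPos = x * bitDepth;
            const std::uint8_t byte = row[1 + bitPos / 8];
            // The leftmost pixel lives in the highest bits of the byte.
            const unsigned shift = 8 - bitDepth - static_cast<unsigned>(bitPos % 8);
            pixels.push_back(static_cast<std::uint8_t>((byte >> shift) & mask));
        }
    }
    return pixels;
}
```

Feeding it the six bytes from the question with width 3, height 3 and bit depth 1 gives the indices row by row from the top; each index is then looked up in the PLTE palette. Scanlines with filter types 1 to 4 would need to be unfiltered first, which this sketch does not attempt.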

Related

How can I store each pixel in an image as a 16 bit index into a colortable?

I have a 2D array of float values:
float values[1024][1024];
that I want to store as an image.
The values are in the range: [-range,+range].
I want to use a colortable that goes from red(-range) to white(0) to black(+range).
So far I have been storing each pixel as a 32 bit RGBA using the BMP file format. The total memory for storing my array is then 1024*1024*4 bytes = 4MB.
This seems very wasteful knowing that my colortable is "1 dimensional" whereas the 32 RGBA is "4 dimensional".
To see what I mean, let's assume that my colortable went from black(-range) to blue(+range).
In this case the only component that varies is clearly the B, all the others are fixed.
So I am only getting 8bits of precision whereas I am "paying" for 32 :-(.
I am therefore looking for a "palette" based file format.
Ideally I would like each pixel to be a 16 bit index (unsigned short int) into a "palette" consisting of 2^16 RGBA values.
The total memory used for storing my array in this case would be: 1024*1024*2 bytes + 2^16*4bytes = 2.25 MB.
So I would get twice as good precision for almost half the "price"!
Which image formats support this?
At the moment I am using Qt's QImage to write the array to file as an image. QImage has an internal 8 bit indexed ("palette") format. I would like a 16 bit one. Also I did not understand from Qt's documentation which file formats support the 8 bit indexed internal format.
Store it as a 16 bit greyscale PNG and do the colour table manually yourself.
You don't say why your image can be decomposed in 2^16 colours but using your knowledge of this special image you could make an algorithm so that indices that are near each other have similar colours and are therefore easier to compress.
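As a rough sketch of the "16 bit greyscale" idea, the float values could be quantized to 16-bit indices like this. The name toGrey16 is hypothetical, the code assumes range > 0 and C++17 (for std::clamp), and writing the actual PNG is left to whatever library you use.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper: maps a value in [-range, +range] to a 16-bit grey level.
// -range -> 0, +range -> 65535; out-of-range input is clamped. Assumes range > 0.
std::uint16_t toGrey16(float value, float range)
{
    const float t = std::clamp((value + range) / (2.0f * range), 0.0f, 1.0f);
    return static_cast<std::uint16_t>(std::lround(t * 65535.0f));
}
```

The colour table then only exists in your own viewer code: read the 16-bit sample back and map it to whatever colour you like. Note that PNG stores 16-bit samples most significant byte first, so a hand-written encoder has to swap bytes on little-endian machines.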
"I want to use a colortable that goes from red(-range) to white(0) to black(+range)."
Okay, so you've got FF,00,00 (red) to FF,FF,FF (white) to 00,00,00 (black). In 24 bit RGB, that looks to me like 256 values from red to white and then another 256 from white to black. So you don't need a palette size of 2^16 (65536); you need 2^9 (512).
If you're willing to compromise and use a palette size of 2^8 then the GIF format could work. That's still relatively fine resolution: 128 shades of red on the negative side, plus 128 shades of grey on the positive. Each of a GIF's 256 palette entries can be an RGB value.
PNG is another candidate for palette-based color. You have more flexibility with PNG, including RGBA if you need an alpha channel.
You mentioned RGBA in your question but the use of the alpha channel was not explained.
So independent of file format, if you can use a 256 entry palette then you will have a very well compressed image. Back to your mapping requirement (i.e. mapping floats [-range -> 0.0 -> +range] to [red -> white -> black]), here is a 256 entry palette that covers the range red-white-black you wanted:
float    entry#     color   rgb
------   -------    -----   --------
-range     0  00    red     FF,00,00
           1  01            FF,02,02
           2  02            FF,04,04
         ...  ...
         ...  ...
         127  7F            FF,FD,FD
0.0      128  80    white   FF,FF,FF
         129  81            FD,FD,FD
         ...  ....
         ...  ...
         253  FD            04,04,04
         254  FE            02,02,02
+range   255  FF    black   00,00,00
If you double the size of the color table to be 9 bits (512 values) then you can make the increments between RGB entries more fine: increments of 1 instead of 2. Such a 9-bit palette would give you full single-channel resolution in RGB on both the negative and positive sides of the range. It's not clear that allocating 16 bits of palette would really be able to store any more visual information given the mapping you want to do. I hope I understand your question and maybe this is helpful.
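For what it's worth, a palette like the one tabulated above can be generated rather than typed in. This is only a sketch: the names Rgb, buildRedWhiteBlackPalette and paletteIndex are invented here, white is placed at index entries / 2 as in the table, and the float-to-index mapping is one plausible piecewise-linear choice, not the only one. It assumes range > 0 and C++17.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

struct Rgb { std::uint8_t r, g, b; };

// Hypothetical helper: red at index 0, white at entries / 2, black at the last index.
std::vector<Rgb> buildRedWhiteBlackPalette(std::size_t entries)
{
    if (entries < 3)
        throw std::invalid_argument("need at least 3 palette entries");
    std::vector<Rgb> palette(entries);
    const std::size_t mid = entries / 2;
    const std::size_t last = entries - 1;
    for (std::size_t i = 0; i < entries; ++i) {
        if (i <= mid) {
            // red -> white: green and blue rise from 0 to 255
            const auto v = static_cast<std::uint8_t>(std::lround(255.0 * i / mid));
            palette[i] = Rgb{255, v, v};
        } else {
            // white -> black: all three channels fall from 255 to 0
            const auto v = static_cast<std::uint8_t>(
                std::lround(255.0 * (last - i) / (last - mid)));
            palette[i] = Rgb{v, v, v};
        }
    }
    return palette;
}

// Hypothetical helper: -range -> 0, 0.0 -> entries / 2, +range -> entries - 1.
std::size_t paletteIndex(float value, float range, std::size_t entries)
{
    const std::size_t mid = entries / 2;
    const std::size_t last = entries - 1;
    const double t = std::clamp(static_cast<double>(value) / range, -1.0, 1.0);
    if (t <= 0.0)
        return static_cast<std::size_t>(std::lround((t + 1.0) * mid));
    return mid + static_cast<std::size_t>(std::lround(t * (last - mid)));
}
```

Calling buildRedWhiteBlackPalette(256) reproduces the 256-entry layout, and passing 512 gives the finer 9-bit variant, although a 512-entry palette no longer fits the 8-bit indexed modes of GIF or PNG.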
PNG format supports paletted format up to 8-bits, but should also support grayscale images up to 16-bits. However, 16-bit modes are less used, and software support may be lacking. You should test your tools first.
But you could also test with plain 24-bit RGB truecolor PNG images. They are compressed and should produce better result than BMP in any case.

How to transform rgb (three bytes) to one byte for bitmap format?

I have data for every pixel red one byte, green one byte, blue one byte. I need to pack this to 8 bits bitmap, so I have only one byte for pixel. How to transform rgb (three bytes) to one byte for bitmap format ?
( I am using C++ and cannot use any external libraries)
I think you misunderstood how to form a bitmap structure. You do not need to pack (somehow) 3 bytes into one. That is not possible after all, unless you throw away information (like using special image formats GL_R3_G3_B2).
The BMP file format wiki page shows the BMP format in detail: it is a header, followed by data. Depending on what you set in your header, it is possible to form a BMP image containing RGB data, where each component is one byte.
First you need to decide how many bits you want to allocate for each color.
3 bits per color will overflow a byte (9 bits).
2 bits per color will underflow (6 bits, leaving 2 bits unused).
In a three-byte RGB bitmap you have one byte to represent each color's intensity, where 0 is the minimum and 255 is the maximum intensity. When you convert it to a 1-byte bitmap (assuming you choose 2 bits per color), the transform for each channel should be:
2-bit red = 1-byte red / 64 (integer division), and likewise for green and blue
i.e. you will get only 4 shades out of a spectrum of 256 shades per color.
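A minimal sketch of that 2-bits-per-channel transform, with no external libraries. The names packRgb222 and paletteEntry222 and the bit layout (red in bits 5-4, green in bits 3-2, blue in bits 1-0) are my own choices for illustration, not something the BMP format dictates.

```cpp
#include <cstdint>

struct Rgb { std::uint8_t r, g, b; };

// Hypothetical helper: keeps the top 2 bits of each channel (value / 64)
// and packs them as 00RRGGBB.
std::uint8_t packRgb222(std::uint8_t r, std::uint8_t g, std::uint8_t b)
{
    return static_cast<std::uint8_t>(((r / 64) << 4) | ((g / 64) << 2) | (b / 64));
}

// Hypothetical helper: the colour a packed value stands for.
// Each 2-bit level 0..3 is stretched back to 0, 85, 170 or 255.
Rgb paletteEntry222(std::uint8_t packed)
{
    const auto level = [](unsigned v) { return static_cast<std::uint8_t>(v * 85); };
    return Rgb{level((packed >> 4) & 3u), level((packed >> 2) & 3u), level(packed & 3u)};
}
```

Since an 8-bit BMP is palette-indexed, the packed byte is really an index: the file's color table would need entry i filled with paletteEntry222(i) for the 64 values that can occur.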
First you have to produce 256 colors palette that best fits your source image.
Then you need to dither the image using the palette you've generated.
Both problems have many well-known solutions. However, it's impossible to produce a high-quality result completely automatically: for different source images, different approaches work best. Photoshop, for example, exposes a UI for tuning the parameters of the process.

C++ Unicode for Color16 Values

This has been one huge headache. I've googled everything and found very little, and I have little knowledge of Unicode beyond the bit I learned from searching. What I need is really simple, right? A struct I am using requires COLOR16.
So I know 0x0000 and 0x00FF is 0 to 255, which for COLOR16 is useless.
From what I've seen, each of the four zeros can represent 0 to 15.
I know COLOR16 represents all 16^4 colors.
But I cannot for the life of me figure out how to convert say, (R:100; G:35; B:42) to a unicode value.
I could really use some info on this, or a tutorial or anything.
Thanks.
I know what you're looking for. You're just asking the wrong way. You mean a short value, not a Unicode value. The common name for this is RGB565. That means 5-bits for red, 6 for green and 5 for blue.
That adds up to 16 bits. You pack the bits in like this:
unsigned short val = ((r<<8) & 0xf800) | ((g<<3) & 0x07e0) | (b>>3);
The bits are like this:
R 00000000 12345xxx -> 12345000 00000000 (shift left by 8 and masked)
G 00000000 123456xx -> 00000123 45600000 (shift left by 3 and masked)
B 00000000 12345xxx -> 00000000 00012345 (shift right by 3, no mask required)
Obviously information is lost in this process. You are just taking the most significant bits of the colour and using that. It's like lossy compression, but pretty good for video when you don't notice the loss of colour definition as much. The reason green gets the extra bit is because human eyes are more sensitive to colours in that spectrum.
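Here is the same packing wrapped in a function, plus a sketch of going back the other way. The names packRgb565 and unpackRgb565 are made up for this example, and the bit-replication used when expanding back to 8 bits is just one common convention.

```cpp
#include <cstdint>

struct Rgb { std::uint8_t r, g, b; };

// Hypothetical helper: same shifts and masks as the one-liner above.
std::uint16_t packRgb565(std::uint8_t r, std::uint8_t g, std::uint8_t b)
{
    return static_cast<std::uint16_t>(((r & 0xF8) << 8) | ((g & 0xFC) << 3) | (b >> 3));
}

// Hypothetical helper: expands back to 8 bits per channel by replicating the
// top bits into the low bits, so that full intensity maps back to 255.
Rgb unpackRgb565(std::uint16_t v)
{
    const unsigned r5 = (v >> 11) & 0x1Fu;
    const unsigned g6 = (v >> 5) & 0x3Fu;
    const unsigned b5 = v & 0x1Fu;
    return Rgb{static_cast<std::uint8_t>((r5 << 3) | (r5 >> 2)),
               static_cast<std::uint8_t>((g6 << 2) | (g6 >> 4)),
               static_cast<std::uint8_t>((b5 << 3) | (b5 >> 2))};
}
```

For the colour in the question, packRgb565(100, 35, 42) gives 0x6105. Unpacking it will not return exactly (100, 35, 42), which is the loss of information described above.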
Finally found a random example that had the solution in it. This takes a COLORREF and extracts 16bit colors for the TRIVERTEX struct:
vertex[1].Red = GetRValue(clrStart)<<8;
vertex[1].Green = GetGValue(clrStart)<<8;
vertex[1].Blue = GetBValue(clrStart)<<8;
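The Windows API is misleading here, if that is what you are asking. Each "COLOR16" channel (R, G, B, A) in the TRIVERTEX structure carries what amounts to a byte value [0-255], but COLOR16 itself is a 16-bit unsigned short and the byte goes in its upper 8 bits, so the stored values run from 0x0000 to 0xFF00:
Red [0-255]; <= byte, stored shifted left by 8
Green [0-255]; <= byte, stored shifted left by 8
Blue [0-255]; <= byte, stored shifted left by 8
Alpha [0-255]; <= byte, stored shifted left by 8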
Microsoft: what can you say

Compressing BMP methods

I am working on a project to losslessly compress a specific style of BMP images that look like this
I have thought about doing pattern recognition to find repetitive blocks of N x N pixels, but I feel like it won't execute fast enough.
Any suggestions?
EDIT: I have access to the dataset that created these images too, I just use the image to visualize my data.
Optical illusions make it hard to tell for sure, but are the colors only black/blue/red/green? If so, the most straightforward compression would be to simply make more efficient use of pixels. Pixels use a fixed amount of space regardless of what color they are, and a pixel can represent a lot more colors than just those four, so chances are you are using 12x as many pixels as you really need.
A simple way to do that would be to label the pixels with the following base 4 numbers:
Black = 0
Red = 1
Green = 2
Blue = 3
Example:
The first four colors of the image seem to be Blue-Red-Blue-Blue. This is equal to 3133 in base 4, which is simply DF in base 16 or 223 in base 10. This is enough to define what the red component of the new pixel should be. The next 4 would define the green component and the final 4 would define the blue component, thus turning 12 pixels into a single pixel.
Beyond that you'll probably want to look into more conventional compression software.
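The base-4 packing can be sketched in a few lines. The names packFour and unpackFour are hypothetical, and the codes follow the Black = 0, Red = 1, Green = 2, Blue = 3 labelling above, with the first pixel in the most significant digit.

```cpp
#include <array>
#include <cstdint>

// Colour codes as labelled in the answer above.
enum Colour : std::uint8_t { Black = 0, Red = 1, Green = 2, Blue = 3 };

// Hypothetical helper: packs four 2-bit colour codes into one byte,
// first colour in the two highest bits.
std::uint8_t packFour(Colour a, Colour b, Colour c, Colour d)
{
    return static_cast<std::uint8_t>((a << 6) | (b << 4) | (c << 2) | d);
}

// Hypothetical helper: recovers the four 2-bit codes from one byte.
std::array<std::uint8_t, 4> unpackFour(std::uint8_t packed)
{
    return {static_cast<std::uint8_t>((packed >> 6) & 3),
            static_cast<std::uint8_t>((packed >> 4) & 3),
            static_cast<std::uint8_t>((packed >> 2) & 3),
            static_cast<std::uint8_t>(packed & 3)};
}
```

Three such bytes fill the R, G and B channels of one 24-bit pixel, which is where the 12-to-1 figure comes from. Since the original dataset is available, packing it directly this way and then running a general-purpose compressor over the result may be simpler than going through an image at all.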

C++ Bitmap Bit per pixel

I'm trying to understand building a bmp based on raw data in c++ and I have a few questions.
My bmp can be black and white, so I figured that in the bits per pixel field I should go with 1. However, in a lot of guides I see that the padding field adds the number of bits needed to keep 32 bit alignment, meaning my bmp will be the same file size as a 24 bit per pixel bmp.
Is this understanding correct or in some way is the 1 bit per pixel smaller than 24, 32 etc?
Thanks
Monochrome bitmaps are aligned too, but they will not take as much space as 24/32-bpp ones.
A row of 5-pixel wide 24-bit bitmap will take 16 bytes: 5*3=15 for pixels, and 1 byte of padding.
A row of 5-pixel wide 32-bit bitmap will take 20 bytes: 5*4=20 for pixels, no need for padding.
A row of 5-pixel wide monochrome bitmap will take 4 bytes: 1 byte for pixels (it is not possible to use less than a byte, so whole byte is taken but 3 of its 8 bits are not used), and 3 bytes of padding.
So, monochrome bitmap will of course be smaller than 24-bit one.
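The three row sizes above all follow from one rounding rule, sketched here. The function name bmpRowStride is made up for the example; it assumes uncompressed rows padded to a 4-byte boundary.

```cpp
#include <cstddef>

// Hypothetical helper: bytes per BMP scanline, rounded up to a multiple of 4.
std::size_t bmpRowStride(std::size_t widthPixels, unsigned bitsPerPixel)
{
    return ((widthPixels * bitsPerPixel + 31) / 32) * 4;
}
```

For a 5-pixel-wide row this gives 16 bytes at 24 bpp, 20 bytes at 32 bpp and 4 bytes at 1 bpp. The pixel data size is then the stride times the image height, so the padding only costs a few bytes per row and the 1 bpp file stays much smaller.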
The answer is already given above (that bitmap rows are aligned/padded to 32-bit boundary), however if you want more information, you might want to read DIBs and Their Uses, the "DIB Header" section - it explains in detail.
Every scanline is DWORD-aligned. The scanline is buffered to alignment; the buffering is not necessarily 0.
The scanlines are stored upside down, with the first scan (scan 0) in memory being the bottommost scan in the image. (See Figure 1.) This is another artifact of Presentation Manager compatibility. GDI automatically inverts the image during the Set and Get operations. Figure 1. (Embedded image showing memory and screen representations.)