Use png8 instead of png24 as RGB height map - compression

I have a 1 band DEM geotiff, and a formula to convert altitude -> RGB and RGB -> altitude (something like this: https://docs.mapbox.com/help/troubleshooting/access-elevation-data).
Using the formula (and GDAL/Python), I converted my geotiff to a 3 bands (R, G & B) geotiff, each band having values in the 0-255 range.
Using mapnik / mod_tile, I'm then serving my geotiff as PNG tiles to a web client. Everything is fine if I set up mod_tile to serve the tiles as 24- or 32-bit PNGs. But if I serve them as 8-bit PNGs (to reduce their size), then the decoded values are a bit off (I can't see the difference when looking at the image, but the RGB values are not exactly the same, which messes up my decoded altitudes).
Am I right in expecting to be able to do what I want (retrieving the exact RGB values) with 8-bit PNGs instead of 24/32, or is there something I don't understand about 8-bit PNGs? (If so, I'll have to dive into the mod_tile code; I guess that when we ask for 8 bits, it generates 24 or 32 and then compresses.)

No, you are not right in expecting that you can compress any ensemble of 24-bit values losslessly to 8-bit values. If there are more than 256 different 24-bit values in the original, then some of those different values will necessarily map to the same 8-bit value.

Related

What kind of image is used for training in Mask RCNN( only 8 bit or 16 bit images or any depth)?

I have a small doubt regarding the images Mask RCNN uses for training. Does MRCNN take only 8-bit images for training? If it takes 16-bit or 32-bit images, how will that help the training?
Usually visualization is done with 8-bit images. I am unsure how processing 16-bit images would help in classification and mapping.
As long as you keep the data type the same and the image intensity range "consistent" for all input images, it should be fine. For example, if you prefer 8-bit images, you should rescale 16- and 32-bit images to 8 bit, i.e. input images should be of type uint8 with values in [0,255]. This type of "preprocessing" is required when training and doing inference with most machine learning models.
In one of the examples by matterport/Mask_RCNN, the input images are of type uint8.
Alternatively, why not just cast the images to be of type float and range [0,1], thereby preserving the pixel resolution for 16 and 32bit images? Hope this helps.

DXT3 (BC2) compression format alpha data

I'm trying to read image info from a dds file. I managed to get the DXT1 and DXT5 formats working fine, however I have a question concerning the alpha data of the DXT3 format (also known as BC2).
When looking at the layout of a compressed BC2 block, it shows the alpha data for the 16-pixel block is stored in the first 8 bytes of the data, with each value taking up 4 bits.
Does this mean that, since the stored alpha value can only be 0-15, the actual alpha data is calculated as follows:
unsigned char bitvalue = GetAlphaBitValue(); // assume this works and gets the 4-bit value i am looking for
unsigned char alpha = (bitvalue / 15.0f) * 255;
Is this correct, or am I looking at it wrong?
That's what this specification seems to say:
The alpha component for a texel at location (x,y) in the block is
given by alpha(x,y) / 15.
It divides by 15 only because the result there is supposed to be in [0 .. 1], not [0 .. 255].
Since 255 is divisible by 15, it's probably easier to think of the transformation to [0 .. 255] as
uint8_t alpha = bitvalue * 17;
It is now more obvious that what's going on is the usual "replicate" mapping (just like eg CSS short color codes) that gives a nice spreading of output values (allows both the minimum and the maximum values to be encoded, and has equal steps between all values).
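The replication can be sketched as follows. This is an illustrative sketch rather than code from any DDS loader: the function names are made up, and the nibble order (texel 0 in the low 4 bits of the first byte, texel 1 in the high 4 bits, and so on) is an assumption you should check against the format documentation you are working from.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Expands a 4-bit alpha value (0..15) to 8 bits (0..255).
// Multiplying by 17 is the same as (nibble << 4) | nibble.
inline std::uint8_t expand_alpha4(std::uint8_t nibble) {
    assert(nibble <= 15);
    return static_cast<std::uint8_t>(nibble * 17);
}

// Decodes the 8 alpha bytes of one BC2 block into 16 8-bit alpha values.
// Assumed layout: even texels in the low nibble, odd texels in the high nibble.
inline std::array<std::uint8_t, 16> decode_bc2_alpha(
    const std::array<std::uint8_t, 8>& bytes) {
    std::array<std::uint8_t, 16> alpha{};
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        alpha[2 * i] = expand_alpha4(static_cast<std::uint8_t>(bytes[i] & 0x0F));
        alpha[2 * i + 1] = expand_alpha4(static_cast<std::uint8_t>(bytes[i] >> 4));
    }
    return alpha;
}
```

Staying in integer arithmetic also avoids the float-to-unsigned-char truncation in the question's version.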

Converting 12 bit color values to 8 bit color values C++

I'm attempting to convert 12-bit RGGB color values into 8-bit RGGB color values, but with my current method it gives strange results.
Logically, I thought that simply dividing the 12-bit RGGB into 8-bit RGGB would work and be pretty simple:
// raw_color_array contains R,G1,G2,B in a bayer pattern with each element
// ranging from 0 to 4095 (12 bits)
for (int i = 0; i < array_size; i++)
{
    raw_color_array[i] /= 16; // 4095 becomes 255 and so on
}
However, in practice this actually does not work. Given, for example, a small image with water and a piece of ice in it you can see what actually happens in the conversion (right most image).
Why does this happen? and how can I get the same (or close to) image on the left, but as 8-bit values instead? Thanks!
EDIT: going off of @MSalters' answer, I get a better quality image but the colors are still drastically skewed. What resources can I look into for converting 12-bit data to 8-bit data without a steep loss in quality?
It appears that your raw 12-bit data isn't on a linear scale. That is quite common for images. For a non-linear scale, you can't use a linear transformation like dividing by 16.
A non-linear transform like sqrt(x*16) would also give you an 8-bit value. So would std::pow(x, 8.0/12.0). Both map 0..4095 to just under 256.
A known problem with low-gradient images is that you get banding. If your image has an area where the original value varies from say 100 to 200, the 12-to-8 bit reduction will shrink that to far fewer than 100 different values (only about 7 with a plain division by 16). You get rounding, and with naive (local) rounding you get bands. Linear or non-linear, there will then be some inputs x that all map to y, and some that map to y+1. This can be mitigated by doing the transformation in floating point, and then adding a random value between -1.0 and +1.0 before rounding. This effectively breaks up the band structure.
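A minimal sketch of the two steps above, assuming the square-root curve is a reasonable fit for your sensor data (it may not be) and that each sample is converted independently. The function names and the choice of random engine are illustrative assumptions, not part of the original code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <random>

// Square-root curve from the answer: maps 0..4095 to 0..255.97 in floating point.
inline double curve_sqrt(std::uint16_t x12) {
    assert(x12 <= 4095);
    return std::sqrt(x12 * 16.0);
}

// Plain truncation to 8 bits; low-gradient areas may show bands.
inline std::uint8_t to8_truncate(std::uint16_t x12) {
    return static_cast<std::uint8_t>(curve_sqrt(x12));
}

// Dithered conversion: add uniform noise before rounding, then clamp to 0..255.
// The default amplitude of 1.0 follows the answer; smaller values add less noise.
template <class Rng>
std::uint8_t to8_dithered(std::uint16_t x12, Rng& rng, double amplitude = 1.0) {
    std::uniform_real_distribution<double> noise(-amplitude, amplitude);
    const double v = std::round(curve_sqrt(x12) + noise(rng));
    return static_cast<std::uint8_t>(std::min(255.0, std::max(0.0, v)));
}
```

A caller would create one engine, for example std::mt19937 rng(42);, and reuse it for every sample so that neighbouring pixels get different noise.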
After you clarified that this 12-bit data is only for one colour, here is my simple answer:
Since you want to convert each value to its 8-bit equivalent, you necessarily lose some of the data (4 bits). This is the reason why you are not getting the same output.
After clarification:
If you want to retain the actual colour values, apply demosaicing to the 12-bit image first and then scale the resulting data to 8 bits. The colour loss due to demosaicing will then be smaller than with the previous approach.
You say that your 12 bits represent 2^12 shades of one colour. That is incorrect. There are reds, greens and blues in your image. Look at the histogram. I made this with ImageMagick at the command line:
convert cells.jpg histogram:png:h.png
If you want 8 bits per pixel, rather than trying to blindly/statically apportion 3 bits to green, 2 bits to red and 3 bits to blue, you would probably be better off going with an 8-bit palette, so you can have 250+ colours of all variations rather than restricting yourself to just 8 blue shades, 4 reds and 8 greens. So, like this:
convert cells.jpg -colors 254 PNG8:result.png
Here is the result of that beside the original:
The process above is called "quantisation" and if you want to implement it in C/C++, there is a writeup here.

OpenCV convertTo()

I came across this code:
image.convertTo(temp_image,CV_16SC3);
I saw the description of the convertTo() function from here, but what confuses me is image. How can we read the above code? What would be the relation between image and temp_image?
Thanks.
The other answers here are correct, but lack some details. Let me try.
image.convertTo(temp_image,CV_16SC3);
You have a source image image, and a destination image temp_image. You didn't specify the type of image, but it is probably CV_8UC3 or CV_32FC3, i.e. a 3-channel image (since convertTo doesn't change the number of channels), where each channel has depth 8 bit (unsigned char, CV_8UC3) or 32 bit (float, CV_32FC3).
This line of code will change the depth of each channel, so that temp_image has each channel of depth 16 bit (short). Specifically it's a signed short, since the type specifier has the S: CV_16SC3.
Note that if you are narrowing down the depth, as in the case from float to signed short, then saturate_cast will make sure that all the values in temp_image will be in the correct range, i.e. in [–32768, 32767] for signed short.
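To make the saturation behaviour concrete without depending on OpenCV, here is a plain C++ sketch. It only approximates what cv::saturate_cast<short> does (round, then clamp); the function names are made up for illustration, and the NaN handling is an arbitrary choice of this sketch, not a statement about OpenCV.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Rough stand-in for a saturating float -> signed 16-bit conversion:
// round to the nearest integer, then clamp to [-32768, 32767].
inline std::int16_t saturate_to_int16(float v) {
    if (std::isnan(v)) return 0;  // arbitrary choice for this sketch
    const float r = std::nearbyint(v);
    if (r <= -32768.0f) return std::numeric_limits<std::int16_t>::min();
    if (r >= 32767.0f) return std::numeric_limits<std::int16_t>::max();
    return static_cast<std::int16_t>(r);
}

// Difference of two 8-bit samples kept in signed 16 bits: negative results survive.
inline std::int16_t diff16(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::int16_t>(a - b);
}

// The same difference forced into unsigned 8 bits: negative results clip to 0.
inline std::uint8_t diff8_saturated(std::uint8_t a, std::uint8_t b) {
    const int d = a - b;
    return static_cast<std::uint8_t>(d < 0 ? 0 : d);
}
```

The last two functions show why a wider signed depth such as CV_16SC3 is useful for intermediate results: the 8-bit version cannot represent a negative difference at all.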
Why would you need to change the depth of an image?
Some OpenCV functions require input images with a specific depth.
You need a matrix to hold a different range of values. E.g. if you need to sum (or subtract) some CV_8UC3 images (typically BGR images), you'd better store the result in a CV_16SC3 or you'll probably get wrong results due to saturation, since the range for CV_8U images is [0,255].
You read with imread, or want to store with imwrite, images with 16-bit depth. These are usually used (AFAIK) in medical or graphics applications to allow a wider range of colors. However, most monitors do not support 16-bit image visualization.
There may be other cases, let me know if I miss the one important to you.
An image is a matrix of pixel information (i.e. a 1080p image will be a 1,920 × 1,080 matrix where each entry contains RGB values for that pixel). All you are doing is converting that matrix (each pixel entry, iteratively) into a new type (CV_16SC3) so it can be read by different programs.
The temp_image is a new matrix of pixel information based off of image formatted into CV_16SC3.
The first one is a source, the second one - destination. So, it takes image, converts it into type CV_16SC3 and stores in temp_image.

How can I store each pixel in an image as a 16 bit index into a colortable?

I have a 2D array of float values:
float values[1024][1024];
that I want to store as an image.
The values are in the range: [-range,+range].
I want to use a colortable that goes from red(-range) to white(0) to black(+range).
So far I have been storing each pixel as a 32 bit RGBA using the BMP file format. The total memory for storing my array is then 1024*1024*4 bytes = 4MB.
This seems very wasteful knowing that my colortable is "1 dimensional" whereas the 32 RGBA is "4 dimensional".
To see what I mean, let's assume that my colortable went from black(-range) to blue(+range).
In this case the only component that varies is clearly the B, all the others are fixed.
So I am only getting 8bits of precision whereas I am "paying" for 32 :-(.
I am therefore looking for a "palette" based file format.
Ideally I would like each pixel to be a 16 bit index (unsigned short int) into a "palette" consisting of 2^16 RGBA values.
The total memory used for storing my array in this case would be: 1024*1024*2 bytes + 2^16*4bytes = 2.25 MB.
So I would get twice as good precision for almost half the "price"!
Which image formats support this?
At the moment I am using Qt's QImage to write the array to file as an image. QImage has an internal 8 bit indexed ("palette") format. I would like a 16 bit one. Also I did not understand from Qt's documentation which file formats support the 8 bit indexed internal format.
Store it as a 16 bit greyscale PNG and do the colour table manually yourself.
You don't say why your image can be decomposed in 2^16 colours but using your knowledge of this special image you could make an algorithm so that indices that are near each other have similar colours and are therefore easier to compress.
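The "16 bit greyscale plus manual colour table" idea boils down to quantising each float to a 16-bit index and applying the colour mapping only when displaying. A minimal sketch of the quantisation step follows; the function names are hypothetical, and writing the resulting indices to a PNG (for example through an image library of your choice) is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Maps v in [-range, +range] to a 16-bit grey level 0..65535.
// Values outside the interval are clamped.
inline std::uint16_t to_index16(float v, float range) {
    assert(range > 0.0f);
    const double t = (static_cast<double>(v) + range) / (2.0 * range);  // 0..1
    const double clamped = std::min(1.0, std::max(0.0, t));
    return static_cast<std::uint16_t>(std::lround(clamped * 65535.0));
}

// Inverse mapping: recovers an approximation of the original float.
inline float from_index16(std::uint16_t index, float range) {
    return static_cast<float>((index / 65535.0) * 2.0 * range - range);
}
```

The round trip is accurate to about half a quantisation step, i.e. roughly range / 65535.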
"I want to use a colortable that goes from red(-range) to white(0) to black(+range)."
Okay, so you've got FF,00,00 (red) to FF,FF,FF (white) to 00,00,00 (black). In 24 bit RGB, that looks to me like 256 values from red to white and then another 256 from white to black. So you don't need a palette size of 2^16 (65536); you need 2^9 (512).
If you're willing to compromise and use a palette size of 2^8 then the GIF format could work. That's still relatively fine resolution: 128 shades of red on the negative side, plus 128 shades of grey on the positive. Each of a GIF's 256 palette entries can be an RGB value.
PNG is another candidate for palette-based color. You have more flexibility with PNG, including RGBA if you need an alpha channel.
You mentioned RGBA in your question but the use of the alpha channel was not explained.
So independent of file format, if you can use a 256 entry palette then you will have a very well compressed image. Back to your mapping requirement (i.e. mapping floats [-range -> 0.0 -> +range] to [red -> white -> black]), here is a 256 entry palette that covers the range red-white-black you wanted:
float    entry#    color  rgb
------   -------   -----  --------
-range     0 00    red    FF,00,00
           1 01           FF,02,02
           2 02           FF,04,04
         ... ...
         ... ...
         127 7F           FF,FD,FD
0.0      128 80    white  FF,FF,FF
         129 81           FD,FD,FD
         ... ...
         ... ...
         253 FD           04,04,04
         254 FE           02,02,02
+range   255 FF    black  00,00,00
If you double the size of the color table to be 9 bits (512 values) then you can make the increments between RGB entries more fine: increments of 1 instead of 2. Such a 9-bit palette would give you full single-channel resolution in RGB on both the negative and positive sides of the range. It's not clear that allocating 16 bits of palette would really be able to store any more visual information given the mapping you want to do. I hope I understand your question and maybe this is helpful.
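For illustration, the 9-bit variant can be generated with a few lines. This is a sketch under the mapping described above (red to white in the first 256 entries, white to black in the last 256); the struct and function names are made up.

```cpp
#include <cassert>
#include <cstdint>

struct Rgb {
    std::uint8_t r, g, b;
};

// Entry i (0..511) of a red -> white -> black palette with steps of 1 per entry.
inline Rgb palette512(int i) {
    assert(i >= 0 && i < 512);
    if (i < 256) {
        // FF,00,00 (red) up to FF,FF,FF (white): green and blue rise together.
        const auto c = static_cast<std::uint8_t>(i);
        return {255, c, c};
    }
    // FF,FF,FF (white) down to 00,00,00 (black): all channels fall together.
    const auto v = static_cast<std::uint8_t>(511 - i);
    return {v, v, v};
}
```

Note that white appears twice (entries 255 and 256), so the palette holds 511 distinct colours.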
The PNG format supports paletted images only up to 8 bits, but it also supports grayscale images up to 16 bits. However, 16-bit modes are less used, and software support may be lacking. You should test your tools first.
But you could also test with plain 24-bit RGB truecolor PNG images. They are compressed and should produce better result than BMP in any case.