DXT3 (BC2) compression format alpha data - c++

I'm trying to read image info from a DDS file. I managed to get the DXT1 and DXT5 formats working fine; however, I have a question concerning the alpha data of the DXT3 format (also known as BC2).
When looking at the layout of a compressed BC2 block, it shows the alpha data for the 16-pixel block is stored in the first 8 bytes of the data, with each value taking up 4 bits.
Does this mean that, since the stored alpha value can only be 0-15, the actual alpha data is calculated as follows:
unsigned char bitvalue = GetAlphaBitValue(); // assume this works and gets the 4-bit value I am looking for
unsigned char alpha = (bitvalue / 15.0f) * 255;
Is this correct, or am I looking at it wrong?

That's what this specification seems to say:
The alpha component for a texel at location (x,y) in the block is
given by alpha(x,y) / 15.
The result there is supposed to be in [0 .. 1], not [0 .. 255], which is what your extra multiplication by 255 accounts for.
Since 255 is divisible by 15, it's probably easier to think of the transformation to [0 .. 255] as
uint8_t alpha = bitvalue * 17;
It is now more obvious that what's going on is the usual "replicate" mapping (just like, e.g., CSS shorthand color codes) that gives a nice spread of output values (it allows both the minimum and the maximum values to be encoded, and has equal steps between all values).
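For example, a minimal decode of one texel's alpha could look like this; GetAlphaBitValue() is the hypothetical helper from the question, and the +0.5f rounding is added so that both forms give identical results:

unsigned char bitvalue = GetAlphaBitValue();   // 4-bit value in [0, 15]
unsigned char alpha1 = (unsigned char)(bitvalue / 15.0f * 255.0f + 0.5f); // normalize to [0, 1], then rescale
unsigned char alpha2 = bitvalue * 17;          // bit replication: 0x0 -> 0x00, 0xA -> 0xAA, 0xF -> 0xFF
// alpha1 == alpha2 for every possible bitvalue, since 255 / 15 == 17.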

Related

Use png8 instead of png24 as RGB height map

I have a 1-band DEM GeoTIFF, and a formula to convert altitude -> RGB and RGB -> altitude (something like this: https://docs.mapbox.com/help/troubleshooting/access-elevation-data).
Using the formula (and GDAL/Python), I converted my GeoTIFF to a 3-band (R, G & B) GeoTIFF, each band having values in the 0-255 range.
Using mapnik / mod_tile, I'm then serving my GeoTIFF as PNG tiles to a web client. Everything is fine if I set up mod_tile to serve the tiles as 24- or 32-bit PNGs. But if I serve them as 8-bit PNGs (to reduce their size), then the decoded values are a bit off (I can't see the difference when looking at the image, but the RGB values are not exactly the same, and that messes up my decoded altitudes).
Am I right to expect that I can do what I want (retrieve the exact RGB values) with 8-bit PNGs instead of 24/32, or is there something I don't understand about 8-bit PNGs (if so, I'll have to dive into the mod_tile code; I guess that when we ask for 8 bits, it generates 24 or 32 and then compresses)?
No, you are not right in expecting that you can compress any ensemble of 24-bit values losslessly to 8-bit values. If there are more than 256 different 24-bit values in the original, then some of those different values will necessarily map to the same 8-bit value.
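If you want to check up front whether a given tile can survive the conversion losslessly, counting the distinct colours is enough. A minimal sketch (plain C++; the packed 8-bit RGB input buffer is an assumption about your pipeline, not mod_tile's API):

#include <cstdint>
#include <unordered_set>

// Returns true if the RGB tile uses at most 256 distinct colours,
// i.e. it could in principle be stored losslessly as an 8-bit palette PNG.
bool FitsIn8BitPalette(const uint8_t* rgb, int width, int height)
{
    std::unordered_set<uint32_t> colours;
    for (int i = 0; i < width * height; ++i)
    {
        uint32_t c = (uint32_t(rgb[3 * i]) << 16) |
                     (uint32_t(rgb[3 * i + 1]) << 8) |
                      uint32_t(rgb[3 * i + 2]);
        colours.insert(c);
        if (colours.size() > 256)
            return false; // more than 256 distinct 24-bit values: PNG8 must remap some of them
    }
    return true;
}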

Converting 12 bit color values to 8 bit color values C++

I'm attempting to convert 12-bit RGGB color values into 8-bit RGGB color values, but my current method gives strange results.
Logically, I thought that simply dividing the 12-bit values down to 8 bits would work and be pretty simple:
// raw_color_array contains R,G1,G2,B in a Bayer pattern, with each element
// ranging from 0 to 4095
for(int i = 0; i < array_size; i++)
{
    raw_color_array[i] /= 16; // 4095 becomes 255, and so on
}
However, in practice this actually does not work. Given, for example, a small image with water and a piece of ice in it, you can see what actually happens in the conversion (rightmost image).
Why does this happen, and how can I get the same (or close to the same) image as the one on the left, but as 8-bit values instead? Thanks!
EDIT: going off of MSalters' answer, I get a better quality image, but the colors are still drastically skewed. What resources can I look into for converting 12-bit data to 8-bit data without a steep loss in quality?
It appears that your raw 12-bit data isn't on a linear scale. That is quite common for images. For a non-linear scale, you can't use a linear transformation like dividing by 16.
A non-linear transform like sqrt(x*16) would also give you an 8-bit value. So would std::pow(x, 8.0/12.0).
A known problem with low-gradient images is that you get banding. If your image has an area where the original value varies from, say, 100 to 200, the 12-to-8 bit reduction will shrink that to only a dozen or so different values. You get rounding, and with naive (local) rounding you get bands. Linear or non-linear, there will then be some inputs x that all map to y, and some that map to y+1. This can be mitigated by doing the transformation in floating point, and then adding a random value between -1.0 and +1.0 before rounding. This effectively breaks up the band structure.
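A rough sketch of that idea (this is not the poster's code; the power-law transfer curve and the ±1 dither range are assumptions you would tune for your sensor):

#include <cmath>
#include <cstdint>
#include <random>

// Convert one 12-bit raw sample (0..4095) to 8 bits using a non-linear
// transfer curve plus a small random dither to break up banding.
uint8_t To8Bit(uint16_t raw12, std::mt19937& rng)
{
    std::uniform_real_distribution<float> dither(-1.0f, 1.0f);
    float v = std::pow(float(raw12), 8.0f / 12.0f);  // maps 0..4095 to roughly 0..255
    v += dither(rng);                                // randomised rounding breaks up the bands
    if (v < 0.0f)   v = 0.0f;
    if (v > 255.0f) v = 255.0f;
    return uint8_t(v + 0.5f);
}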
After you clarified that this 12-bit data is only for one color, here is my simple answer:
Since you want to convert its value to its 8-bit equivalent, you obviously lose some of the data (4 bits). This is the reason why you are not getting the same output.
After clarification:
If you want to retain the actual colour values, apply de-mosaicking to the 12-bit image and then scale the resulting data to 8 bits. That way the colour loss due to de-mosaicking will be smaller than with the previous approach.
You say that your 12 bits represent 2^12 values of one colour. That is incorrect. There are reds, greens and blues in your image. Look at the histogram. I made this with ImageMagick at the command line:
convert cells.jpg histogram:png:h.png
If you want 8 bits per pixel, rather than trying to blindly/statically apportion 3 bits to green, 2 bits to red and 3 bits to blue, you would probably be better off going with an 8-bit palette so you can have 250+ colours of all variations rather than restricting yourself to just 8 blue shades, 4 reds and 8 greens. So, like this:
convert cells.jpg -colors 254 PNG8:result.png
Here is the result of that beside the original:
The process above is called "quantisation" and if you want to implement it in C/C++, there is a writeup here.

How can I store each pixel in an image as a 16 bit index into a colortable?

I have a 2D array of float values:
float values[1024][1024];
that I want to store as an image.
The values are in the range: [-range,+range].
I want to use a colortable that goes from red(-range) to white(0) to black(+range).
So far I have been storing each pixel as a 32 bit RGBA using the BMP file format. The total memory for storing my array is then 1024*1024*4 bytes = 4MB.
This seems very wasteful knowing that my colortable is "1-dimensional" whereas the 32-bit RGBA is "4-dimensional".
To see what I mean, let's assume that my colortable went from black(-range) to blue(+range).
In this case the only component that varies is clearly the B; all the others are fixed.
So I am only getting 8 bits of precision whereas I am "paying" for 32 :-(.
I am therefore looking for a "palette" based file format.
Ideally I would like each pixel to be a 16 bit index (unsigned short int) into a "palette" consisting of 2^16 RGBA values.
The total memory used for storing my array in this case would be: 1024*1024*2 bytes + 2^16*4bytes = 2.25 MB.
So I would get twice as good precision for almost half the "price"!
Which image formats support this?
At the moment I am using Qt's QImage to write the array to file as an image. QImage has an internal 8 bit indexed ("palette") format. I would like a 16 bit one. Also I did not understand from Qt's documentation which file formats support the 8 bit indexed internal format.
Store it as a 16 bit greyscale PNG and do the colour table manually yourself.
You don't say why your image can be decomposed into 2^16 colours, but using your knowledge of this special image you could make an algorithm so that indices that are near each other have similar colours and are therefore easier to compress.
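A minimal sketch of that approach (the linear mapping below is an assumption; the actual red-white-black colours would live in the lookup table you apply when displaying the greyscale image):

#include <cstdint>

// Map a value in [-range, +range] to a 16-bit index (0..65535),
// suitable for storing as one sample of a 16-bit greyscale PNG.
uint16_t ToIndex16(float value, float range)
{
    float t = (value + range) / (2.0f * range);  // -range..+range  ->  0.0..1.0
    if (t < 0.0f) t = 0.0f;
    if (t > 1.0f) t = 1.0f;
    return uint16_t(t * 65535.0f + 0.5f);
}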
"I want to use a colortable that goes from red(-range) to white(0) to black(+range)."
Okay, so you've got FF,00,00 (red) to FF,FF,FF (white) to 00,00,00 (black). In 24-bit RGB, that looks to me like 256 values from red to white and then another 256 from white to black. So you don't need a palette size of 2^16 (65536); you need 2^9 (512).
If you're willing to compromise and use a palette size of 2^8 then the GIF format could work. That's still relatively fine resolution: 128 shades of red on the negative side, plus 128 shades of grey on the positive. Each of a GIF's 256 palette entries can be an RGB value.
PNG is another candidate for palette-based color. You have more flexibility with PNG, including RGBA if you need an alpha channel.
You mentioned RGBA in your question but the use of the alpha channel was not explained.
So, independent of file format, if you can use a 256-entry palette then you will have a very well compressed image. Back to your mapping requirement (i.e. mapping floats [-range -> 0.0 -> +range] to [red -> white -> black]), here is a 256-entry palette that covers the range red-white-black you wanted:
float    entry# (dec)  entry# (hex)  color   rgb
------   ------------  ------------  ------  --------
-range        0             00       red     FF,00,00
              1             01               FF,02,02
              2             02               FF,04,04
            ...            ...
            127             7F               FF,FD,FD
 0.0        128             80       white   FF,FF,FF
            129             81               FD,FD,FD
            ...            ...
            253             FD               04,04,04
            254             FE               02,02,02
+range      255             FF       black   00,00,00
If you double the size of the color table to be 9 bits (512 values) then you can make the increments between RGB entries more fine: increments of 1 instead of 2. Such a 9-bit palette would give you full single-channel resolution in RGB on both the negative and positive sides of the range. It's not clear that allocating 16 bits of palette would really be able to store any more visual information given the mapping you want to do. I hope I understand your question and maybe this is helpful.
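As a sketch, generating such a 256-entry palette and mapping a float into it could look like this (the increments of 2 follow the table above, though exact values near the endpoints may differ by one):

#include <cstdint>

struct Rgb { uint8_t r, g, b; };

// Build the 256-entry red -> white -> black palette described above.
void BuildPalette(Rgb palette[256])
{
    for (int i = 0; i < 128; ++i)          // -range .. just below 0.0 : red towards white
    {
        uint8_t gb = uint8_t(2 * i);
        palette[i] = { 255, gb, gb };
    }
    palette[128] = { 255, 255, 255 };      // 0.0 : white
    for (int i = 129; i <= 255; ++i)       // above 0.0 .. +range : greys down to black
    {
        uint8_t g = uint8_t(2 * (255 - i));
        palette[i] = { g, g, g };
    }
}

// Map a float in [-range, +range] to a palette index 0..255.
uint8_t ToIndex(float value, float range)
{
    float t = (value + range) / (2.0f * range);  // 0.0 .. 1.0
    if (t < 0.0f) t = 0.0f;
    if (t > 1.0f) t = 1.0f;
    return uint8_t(t * 255.0f + 0.5f);
}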
The PNG format supports paletted images up to 8 bits, but it also supports grayscale images up to 16 bits. However, 16-bit modes are less used, and software support may be lacking. You should test your tools first.
But you could also test with plain 24-bit RGB truecolor PNG images. They are compressed and should produce a better result than BMP in any case.

Getting RGB from WIC image c++

I am using WIC to load in an image and then I want to be able to get the RGB values of each pixel. I have got as far as using getDataPointer() to create a byte buffer (which I cast to a COLORREF array) but after this things get strange.
I have a 10x10 24bit png that I'm testing with. If I look at the size value getDataPointer() gives me it says it's 300, which makes sense because 10 * 10 * 3 (for 3 bytes per pixel) = 300. If I do getStride() it gives me 30, which also makes sense.
So I create a loop to go through the COLORREF with an iterator and the condition is i < size/3 because I know there are only 100 pixels in the array. I then use the macros GetRValue(), GetGValue() and GetBValue() to get the rgb values out.
This is when things go weird - my small image is just a test image with solid red, green, blue and black pixels in it, but my RGB values come out as (255, 0, 0), (0, 255, 0), (27, 255, 36), (0, 0, 0), etc. It seems that some values don't come out properly and are corrupted somehow. Also, the last 20/30 pixels of the image are either loads of crazy colors or all black, making me think there is some sort of corruption.
I also tested it with a larger actual photograph, and this comes out all grayscale and repeating the same pattern, making me think it's a stride issue, but I don't understand how, because when I call getPixelFormat() WIC says it's 24bppBGR or 24bppRGB depending on the image.
Does anyone have any idea what I am doing wrong? Shouldn't I be using COLORREF and the macros or something like that?
Thanks for your time.
EDIT
Hmm, I've still got a problem here. I tried with another 24-bit PNG that PixelFormat() was reporting as 24bppBGR, but it seems like the stride is off or something because it is drawing skewed (obligatory nyan cat test):
EDIT 2
Okay, so now it seems I have some images that work and some that don't. Some reporting themselves as 24bppBGR work, while others look like the image above; if I compare the stride it gives me with what I calculate it should be, they are different, and the size of the buffer is different too. I also have some 32bppBGR images, and some of those work and others don't. Am I missing something here? What could make up the extra bytes in the buffer?
Here is an example image:
24bppBGR JPEG:
width = 126
height = 79
buffer size = 30018
stride = 380
If I calculate this:
buffer size should be: width * height * 3 = 126 * 79 * 3 = 29862
difference between calculation and actual buffer size: 30018 - 29862 = 156 bytes
stride size should be: width * 3 = 378
difference between calculation and actual stride: 380 - 378 = 2 bytes
I was thinking that we had 2 extra bytes per line but 79 * 2 is 158 not 156 hmm.
If I do these calculations for the images that have worked so far I find no difference in the calculations and the values the code gives me...
Am I understanding what is happening here wrong? Should these calculations now work as I have thought?
Thanks again
You shouldn't be using COLORREF and related macros. COLORREF is a 4-byte type, and you have 3-byte pixels; accessing the data as an array of COLORREF values won't work. Instead, you should access it as an array of bytes, with each pixel starting at byte offset y * stride + x * 3, where stride is the value reported by getStride() (rows can be padded, which is why the stride may be larger than width * 3). The order of the individual channels is indicated by the format name. So if it's 24bppBGR, data[y * stride + x * 3] is the blue channel, data[y * stride + x * 3 + 1] is green, and data[y * stride + x * 3 + 2] is red.
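A minimal sketch of that indexing, using the names from the question (data from getDataPointer(), stride from getStride()). Note that for the 126-pixel-wide JPEG above, 380 * 78 + 378 = 30018, so the reported buffer size is consistent with every row except the last carrying 2 padding bytes:

// 24bppBGR: three bytes per pixel, rows padded out to 'stride' bytes.
BYTE blue  = data[y * stride + x * 3 + 0];
BYTE green = data[y * stride + x * 3 + 1];
BYTE red   = data[y * stride + x * 3 + 2];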
If you really want an array of pixels, you can make a structure with 3 BYTE fields, but since the meaning of those fields depends on the pixel format, that may not be useful.
In fact, you can't assume an arbitrary image will load as a 24-bit format at all. The number of formats you could get is larger than you could reasonably be expected to support.
Instead, you should use WICConvertBitmapSource to convert the data to a format you can work with. If you prefer to work with a COLORREF array and related macros, use GUID_WICPixelFormat32bppBGR.
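A rough sketch of that conversion (error handling trimmed; source stands for whatever IWICBitmapSource the decoder gave you, e.g. the decoded frame):

#include <wincodec.h>
#include <vector>

// Convert the decoded image to 32bppBGR and copy the pixels out.
std::vector<BYTE> ReadAs32bppBGR(IWICBitmapSource* source, UINT& width, UINT& height)
{
    std::vector<BYTE> pixels;
    IWICBitmapSource* converted = nullptr;
    if (FAILED(WICConvertBitmapSource(GUID_WICPixelFormat32bppBGR, source, &converted)))
        return pixels;

    converted->GetSize(&width, &height);
    const UINT stride = width * 4;                 // 4 bytes per pixel: B, G, R, unused
    pixels.resize(size_t(stride) * height);
    converted->CopyPixels(nullptr, stride, (UINT)pixels.size(), pixels.data());
    converted->Release();

    // Pixel (x, y): blue = pixels[y*stride + x*4], green = +1, red = +2.
    return pixels;
}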

How to read direct3d texture pixels

So I have a x8r8g8b8 formatted IDirect3DSurface9 that contains the contents of the back buffer. When I call LockRect on it I get access to a struct containing pBits, a pointer to the pixels I assume, and an integer Pitch (whose purpose I am very unclear about).
How to read the individual pixels?
Visual Studio 2008 C++
The locked area is described by a D3DLOCKED_RECT. I haven't ever used this, but the documentation says Pitch is the "number of bytes in one row of the surface". Actually, people would normally call this "stride" (some terms are explained in the MSDN).
For example, if one pixel has 4 bytes (8 bits for each component of XRGB) and the texture width is 7, each row is usually stored as 8*4 bytes instead of 7*4 bytes, because the memory can be accessed faster if the data is DWORD-aligned.
So, in order to read pixel [x, y] you would have to read
uint8_t *pixels = (uint8_t*)rect.pBits;
uint32_t *mypixel = (uint32_t*)&pixels[rect.Pitch*y + 4*x];
where 4 is the size of a pixel in bytes. *mypixel would be the content of the pixel in my example.
Yep, you would access the individual RGB components of the pixel like that.
The X byte of the pixel is not used, but it is more efficient to use 4 bytes per pixel, so that each pixel is aligned on a 32-bit boundary (that's also why there's the pitch).
In your example the X is not used, but note that there are also other pixel formats, for example ARGB, which stores the alpha value (transparency) in that byte. Sometimes the colors are also reversed (BGR instead of RGB). If you're unsure which byte corresponds to which color, a good trick is to create a texture that is entirely red, green or blue and then check which of the 4 bytes has the value 255.
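To make the channel order concrete, here is a small continuation of the snippet above for the x8r8g8b8 case (other formats, such as ARGB or BGR variants, would need different shifts):

// *mypixel holds one x8r8g8b8 texel as a 32-bit value laid out as 0xXXRRGGBB.
uint32_t value = *mypixel;
uint8_t  red   = (value >> 16) & 0xFF;
uint8_t  green = (value >> 8)  & 0xFF;
uint8_t  blue  =  value        & 0xFF;   // the X byte (bits 24-31) is unused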