Getting RGB from WIC image (C++)

I am using WIC to load in an image and then I want to be able to get the RGB values of each pixel. I have got as far as using getDataPointer() to create a byte buffer (which I cast to a COLORREF array) but after this things get strange.
I have a 10x10 24bit png that I'm testing with. If I look at the size value getDataPointer() gives me it says it's 300, which makes sense because 10 * 10 * 3 (for 3 bytes per pixel) = 300. If I do getStride() it gives me 30, which also makes sense.
So I create a loop to go through the COLORREF with an iterator and the condition is i < size/3 because I know there are only 100 pixels in the array. I then use the macros GetRValue(), GetGValue() and GetBValue() to get the rgb values out.
This is when things go weird. My small image is just a test image with solid red, green, blue and black pixels in it, but my RGB values come out as (255, 0, 0), (0, 255, 0), (27, 255, 36), (0, 0, 0) etc. It seems that some values don't come out properly and are corrupted somehow. Also, the last 20/30 pixels of the image are either loads of crazy colors or all black, making me think there is some sort of corruption.
I also tested it with a larger actual photograph and this comes out all grayscale and repeating the same pattern, making me think it's a stride issue but I don't understand how because when I call getPixelFormat() WIC says it's 24bppBGR or 24bppRGB depending on the images.
Does anyone have any idea what I am doing wrong? Shouldn't I be using COLORREF and the macros or something like that?
Thanks for your time.
EDIT
Hmm I've still got a problem here. I tried with another 24bit PNG that PixelFormat() was reporting as 24bppBGR but it seems like the stride is off or something because it is drawing as skewed (obligatory nyan cat test):
EDIT 2
Okay so now it seems I have some that work and some that don't. Some reporting themselves as 24bpp BGR work while others look like the image above and if I calculate the stride it gives me compared to what it should be they are different and also the size of the buffer is different too. I also have some 32bpp BGR images and some of those work and others don't. Am I missing something here? What could make up the extra bytes in the buffer?
Here is an example image:
24bppBGR JPEG:
width = 126
height = 79
buffer size = 30018
stride = 380
If I calculate this:
buffer size should be: width * height * 3 = 126 * 79 * 3 = 29862
difference between calculation and actual buffer size: 30018 - 29862 = 156 bytes
stride size should be: width * 3 = 378
difference between calculation and actual stride: 380 - 378 = 2 bytes
I was thinking that we had 2 extra bytes per line but 79 * 2 is 158 not 156 hmm.
If I do these calculations for the images that have worked so far I find no difference in the calculations and the values the code gives me...
Am I misunderstanding what is happening here? Should these calculations not work as I have thought?
Thanks again

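You shouldn't be using COLORREF and related macros. COLORREF is a 4-byte type, and you have 3-byte pixels, so accessing the data as an array of COLORREF values won't work. Instead, access it as an array of bytes, with each pixel located at (y * stride + x * 3), where stride is the value WIC reports. Don't compute the row offset as y * width * 3: rows can be padded, which is what skews the image. In your JPEG example the stride of 380 is 378 rounded up to a multiple of 4, and the buffer size is 78 * 380 + 378 = 30018 because the last row carries no padding. The order of individual channels is indicated by the format name. So if it's 24bppBGR you'd do data[y * stride + x * 3] to get the blue channel, data[y * stride + x * 3 + 1] for green, and data[y * stride + x * 3 + 2] for red.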
If you really want an array of pixels, you can make a structure with 3 BYTE fields, but since the meaning of those fields depends on the pixel format, that may not be useful.
In fact, you can't assume an arbitrary image will load as a 24-bit format at all. The number of formats you could get is larger than you could reasonably be expected to support.
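Instead, you should use WICConvertBitmapSource to convert the data to a format you can work with. If you prefer to work with a COLORREF array and related macros, use a 32-bit format such as GUID_WICPixelFormat32bppBGR. Note that COLORREF keeps red in its lowest byte, while 32bppBGR stores blue first, so with that format GetRValue() returns blue and GetBValue() returns red. Either swap the two macros or convert to a format that stores red first, such as GUID_WICPixelFormat32bppRGBA.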
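As a rough illustration of addressing pixels by stride, here is a minimal sketch that works on a plain byte buffer and does not call WIC itself. The names Rgb, minBufferSize and pixelAtBgr24 are made up for this example. It assumes a 24bppBGR buffer, and that the stride comes from WIC rather than being computed from the width.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper type for this sketch.
struct Rgb { std::uint8_t r, g, b; };

// Smallest buffer that holds the image when every row except the last
// occupies a full stride (assumption: the last row carries no padding).
inline std::size_t minBufferSize(std::size_t width, std::size_t height,
                                 std::size_t stride, std::size_t bytesPerPixel) {
    assert(height > 0);
    assert(stride >= width * bytesPerPixel);
    return stride * (height - 1) + width * bytesPerPixel;
}

// Read one pixel from a 24bppBGR buffer. The row offset uses the stride,
// not width * 3, so padded rows are handled correctly.
inline Rgb pixelAtBgr24(const std::uint8_t* data, std::size_t stride,
                        std::size_t x, std::size_t y) {
    const std::uint8_t* p = data + y * stride + x * 3;
    return Rgb{p[2], p[1], p[0]};  // memory order is B, G, R
}
```

With the numbers from the question (width 126, height 79, stride 380), minBufferSize gives 30018, which matches the reported buffer size. For the 10x10 image with stride 30 it gives 300.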

Related

DXT3 (BC2) compression format alpha data

I'm trying to read image info from a dds file. I managed to get the DXT1 and DXT5 formats working fine, however I have a question concerning the alpha data of the DXT3 format (also known as BC2).
When looking at the layout of a compressed BC2 block, it shows the alpha data for the 16-pixel block is stored in the first 8 bytes of the data, with each value taking up 4 bits.
Does this mean that, since the stored alpha value can only be 0-15, the actual alpha data is calculated as follows:
unsigned char bitvalue = GetAlphaBitValue(); // assume this works and gets the 4-bit value i am looking for
unsigned char alpha = (bitvalue / 15.0f) * 255;
Is this correct, or am I looking at it wrong?
That's what this specification seems to say:
"The alpha component for a texel at location (x,y) in the block is given by alpha(x,y) / 15."
It divides by 15 only because the result there is supposed to be in [0 .. 1], not [0 .. 255].
Since 255 is divisible by 15, it's probably easier to think of the transformation to [0 .. 255] as
uint8_t alpha = bitvalue * 17;
It is now more obvious that what's going on is the usual "replicate" mapping (just like eg CSS short color codes) that gives a nice spreading of output values (allows both the minimum and the maximum values to be encoded, and has equal steps between all values).
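The replicate mapping can be sketched as below. The function names are invented for this example, and the nibble order (first texel in the low 4 bits of the first byte) is an assumption that should be checked against the format documentation you are following.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Expand a 4-bit alpha value to 8 bits by replicating the nibble.
// (a << 4) | a is the same as a * 17, so 0 maps to 0 and 15 maps to 255.
inline std::uint8_t expandAlpha4(std::uint8_t a4) {
    assert(a4 <= 15);
    return static_cast<std::uint8_t>((a4 << 4) | a4);
}

// Decode the 8 alpha bytes of one BC2 block into 16 8-bit alpha values.
// Assumption: texel i lives in byte i / 2, even texels in the low nibble.
inline std::array<std::uint8_t, 16> decodeBc2Alpha(const std::uint8_t* block) {
    std::array<std::uint8_t, 16> out{};
    for (int i = 0; i < 16; ++i) {
        const std::uint8_t byte = block[i / 2];
        const std::uint8_t nibble = static_cast<std::uint8_t>(
            (i % 2 == 0) ? (byte & 0x0F) : (byte >> 4));
        out[static_cast<std::size_t>(i)] = expandAlpha4(nibble);
    }
    return out;
}
```

Using integer arithmetic also avoids the float round trip in the question's version.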

How can I store each pixel in an image as a 16 bit index into a colortable?

I have a 2D array of float values:
float values[1024][1024];
that I want to store as an image.
The values are in the range: [-range,+range].
I want to use a colortable that goes from red(-range) to white(0) to black(+range).
So far I have been storing each pixel as a 32 bit RGBA using the BMP file format. The total memory for storing my array is then 1024*1024*4 bytes = 4MB.
This seems very wasteful knowing that my colortable is "1 dimensional" whereas the 32 RGBA is "4 dimensional".
To see what I mean, let's assume that my colortable went from black(-range) to blue(+range).
In this case the only component that varies is clearly the B, all the others are fixed.
So I am only getting 8bits of precision whereas I am "paying" for 32 :-(.
I am therefore looking for a "palette" based file format.
Ideally I would like each pixel to be a 16 bit index (unsigned short int) into a "palette" consisting of 2^16 RGBA values.
The total memory used for storing my array in this case would be: 1024*1024*2 bytes + 2^16*4bytes = 2.25 MB.
So I would get twice as good precision for almost half the "price"!
Which image formats support this?
At the moment I am using Qt's QImage to write the array to file as an image. QImage has an internal 8 bit indexed ("palette") format. I would like a 16 bit one. Also I did not understand from Qt's documentation which file formats support the 8 bit indexed internal format.
Store it as a 16 bit greyscale PNG and do the colour table manually yourself.
You don't say why your image can be decomposed in 2^16 colours but using your knowledge of this special image you could make an algorithm so that indices that are near each other have similar colours and are therefore easier to compress.
"I want to use a colortable that goes from red(-range) to white(0) to black(+range)."
Okay, so you've got FF,00,00 (red) to FF,FF,FF (white) to 00,00,00 (black). In 24 bit RGB, that looks to me like 256 values from red to white and then another 256 from white to black. So you don't need a palette size of 2^16 (65536); you need 2^9 (512).
If you're willing to compromise and use a palette size of 2^8 then the GIF format could work. That's still relatively fine resolution: 128 shades of red on the negative side, plus 128 shades of grey on the positive. Each of a GIF's 256 palette entries can be an RGB value.
PNG is another candidate for palette-based color. You have more flexibility with PNG, including RGBA if you need an alpha channel.
You mentioned RGBA in your question but the use of the alpha channel was not explained.
So independent of file format, if you can use a 256 entry palette then you will have a very well compressed image. Back to your mapping requirement (i.e. mapping floats [-range -> 0.0 -> +range] to [red -> white -> black]), here is a 256 entry palette that covers the range red-white-black you wanted:
float    entry#     color  rgb
------   -------    -----  --------
-range     0  00    red    FF,00,00
           1  01           FF,02,02
           2  02           FF,04,04
         ...  ...
         127  7F           FF,FD,FD
0.0      128  80    white  FF,FF,FF
         129  81           FD,FD,FD
         ...  ...
         253  FD           04,04,04
         254  FE           02,02,02
+range   255  FF    black  00,00,00
If you double the size of the color table to be 9 bits (512 values) then you can make the increments between RGB entries more fine: increments of 1 instead of 2. Such a 9-bit palette would give you full single-channel resolution in RGB on both the negative and positive sides of the range. It's not clear that allocating 16 bits of palette would really be able to store any more visual information given the mapping you want to do. I hope I understand your question and maybe this is helpful.
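For illustration, the 256 entry palette and the float-to-index mapping could be built along these lines. This is only a sketch: the names are invented here, the entries are produced by linear interpolation with rounding, and out-of-range values are clamped, which is an assumption about what you would want.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical helper type for this sketch.
struct Rgb { std::uint8_t r, g, b; };

// Entries 0..128 run from red to white, entries 128..255 from white to black.
inline std::array<Rgb, 256> makeRedWhiteBlackPalette() {
    std::array<Rgb, 256> pal{};
    for (int i = 0; i <= 128; ++i) {
        const auto v = static_cast<std::uint8_t>(std::lround(i * 255.0 / 128.0));
        pal[static_cast<std::size_t>(i)] = Rgb{255, v, v};
    }
    for (int i = 129; i < 256; ++i) {
        const auto v = static_cast<std::uint8_t>(std::lround((255 - i) * 255.0 / 127.0));
        pal[static_cast<std::size_t>(i)] = Rgb{v, v, v};
    }
    return pal;
}

// Map a value in [-range, +range] to a palette index:
// -range -> 0, 0.0 -> 128, +range -> 255. Values outside are clamped.
inline std::uint8_t valueToIndex(float value, float range) {
    assert(range > 0.0f);
    const float t = std::fmax(-1.0f, std::fmin(1.0f, value / range));
    const float idx = (t <= 0.0f) ? (t + 1.0f) * 128.0f : 128.0f + t * 127.0f;
    return static_cast<std::uint8_t>(std::lround(idx));
}
```

Each pixel is then stored as the one-byte index, and the palette is written once into the file's color table (for example via an 8-bit indexed image format).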
PNG supports palettes of up to 8 bits per index, and it also supports grayscale images of up to 16 bits. However, 16-bit modes are less used, and software support may be lacking. You should test your tools first.
But you could also test with plain 24-bit RGB truecolor PNG images. They are compressed and should produce better result than BMP in any case.

Compressing BMP methods

I am working on a project to losslessly compress a specific style of BMP images that look like this
I have thought about doing pattern recognition to find repetitive blocks of N x N pixels, but I feel like it won't execute fast enough.
Any suggestions?
EDIT: I have access to the dataset that created these images too, I just use the image to visualize my data.
Optical illusions make it hard to tell for sure, but are the colors only black/blue/red/green? If so, the most straightforward compression would be to simply make more efficient use of pixels. I'm assuming pixels use a fixed amount of space regardless of what color they are. Since a pixel can represent far more colors than just those four, chances are you are using 12x as many pixels as you really need.
A simple way to do that would be to label the pixels with the following base 4 digits:
Black = 0
Red = 1
Green = 2
Blue = 3
Example:
The first four colors of the image seem to be Blue-Red-Blue-Blue. This is equal to 3133 in base 4, which is simply DF in base 16 or 223 in base 10. This is enough to define what the red channel of the new pixel should be. The next 4 would define the green channel and the final 4 the blue channel, thus turning 12 pixels into a single pixel.
Beyond that you'll probably want to look into more conventional compression software.
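The base 4 packing described above could look roughly like this. It is a sketch with invented names, assuming exactly the four labels listed and that the first pixel of each group of four goes into the highest two bits of the byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Labels from the list above: one base 4 digit (2 bits) per pixel.
enum Label : std::uint8_t { Black = 0, Red = 1, Green = 2, Blue = 3 };

// Pack four 2-bit labels into each output byte, first label in the top bits.
// A trailing partial group is padded with zero bits.
inline std::vector<std::uint8_t> packLabels(const std::vector<std::uint8_t>& labels) {
    std::vector<std::uint8_t> out((labels.size() + 3) / 4, 0);
    for (std::size_t i = 0; i < labels.size(); ++i) {
        assert(labels[i] <= 3);
        const unsigned shift = 6 - 2 * static_cast<unsigned>(i % 4);
        out[i / 4] |= static_cast<std::uint8_t>(labels[i] << shift);
    }
    return out;
}

// Recover the label of pixel i from the packed bytes.
inline std::uint8_t labelAt(const std::vector<std::uint8_t>& packed, std::size_t i) {
    const unsigned shift = 6 - 2 * static_cast<unsigned>(i % 4);
    return static_cast<std::uint8_t>((packed[i / 4] >> shift) & 0x3);
}
```

Three consecutive packed bytes can then be written as the red, green and blue channels of one output pixel. Since you have the original dataset, packing the labels straight from the data may be simpler than going through the image at all.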

How does QImage with Format_Mono store informations ?

I'm trying to copy values from a QImage to my own image structure (because of school work), and I'm not able to figure out how the pixels are stored.
API says that when using Format_Mono, The image is stored using 1-bit per pixel.
I created the following code:
QImage image(10,10,QImage::Format_Mono); // create 10x10 image
image.fill(1); // whiten the image
QPainter p;
p.begin(&image);
p.setPen(QPen(QColor(Qt::black)));
p.drawPoint(10,1); // make ONE point black
p.end();
const uchar* pixels = image.constBits(); // constBits() returns a const pointer
int count = image.byteCount(); // returns 40 !!
First thing: I don't understand why 40 bytes are used (I expected 20 to be more than enough, as BufferedImage in Java would use).
Second thing: When iterating through the pixels, every fourth byte starting at the third (indexes 2, 6, 10...) is set to 173, and every fourth byte starting at the fourth (indexes 3, 7, 11...) is set to 186.
The other bytes are correctly(??) set to 255 (white).
I expected 20 bytes, so 19 would be set to 255, and one (the one containing the colored pixel [10,1]) set to another value.
What am I missing? Thank you
API: The scanline data is aligned on a 32-bit boundary.
That was the reason ... the Qt documentation of the method bits() forgot to mention it...
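To make the 40 bytes concrete, here is a small sketch of the scanline arithmetic. It works on a plain byte buffer rather than on a QImage, the function names are invented for the example, and it assumes the MSB-first bit order that the documentation describes for Format_Mono.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bytes per scanline when each line is padded to a 32-bit boundary.
inline std::size_t bytesPerLine(std::size_t width, std::size_t bitsPerPixel) {
    return ((width * bitsPerPixel + 31) / 32) * 4;
}

// Read the bit of pixel (x, y) from a 1-bit-per-pixel buffer.
// Assumption: most significant bit first within each byte.
inline bool monoBit(const std::uint8_t* bits, std::size_t stride,
                    std::size_t x, std::size_t y) {
    assert(x < stride * 8);
    const std::uint8_t byte = bits[y * stride + (x >> 3)];
    return ((byte >> (7 - (x & 7))) & 1) != 0;
}
```

For a 10 pixel wide mono image this gives 4 bytes per line, so 10 lines take 40 bytes. Only the first 10 bits of each line carry pixels; bytes 2 and 3 of every line are padding, which is where the 173 and 186 values sit. When copying into your own structure, step through the rows using the stride and ignore the padding.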

Load Images from memory (libharu) from Magick++ images

I am working on some pdf generation software in c++ based on libharu and I would like to be able to first manipulate images using Magick++ and then load them from memory using libharu function:
HPDF_LoadRawImageFromMem()
Which according to the documentation essentially load images from some void *buffer.
My goal is to be able to get this void* data out of a Magick::Image instance and load this image into my haru pdf based on this data.
I have tried writing to a void*or to a Magick::Blob but the only achievement I have had so far was some black rectangle instead of the image I am expecting.
Does anyone have any experience in converting Raw image data from one library into another one ?
The reason I am trying to do this from memory is because so far I am writing Magick::Image instances into a file and then reading from this file to load then in haru, which is a huge performance hit in the context of my Application.
I'm a little late to answer I guess, but here's a real-life answer.
I successfully added an itk::Image to my pdf using LibHaru so it should work about the same for you. First, you need to know if the library you use is row major or column major. LibHaru (and all the libraries I know) works in row major, so your library should too, or you will need to "transpose" your data.
// Black and white image (8 bits per pixel)
itk::Image<unsigned char, 2>::Pointer image = ...;
const unsigned char *imageData = image->GetBufferPointer();
// Named pdfImage so it does not clash with the itk image above
const HPDF_Image pdfImage = HPDF_LoadRawImageFromMem(m_Document,
imageData, width, height, HPDF_CS_DEVICE_GRAY, 8);
// Or, alternatively, a color image (24 bits per pixel, 8 bits per color component)
itk::Image<RGBPixel, 2>::Pointer image = ...;
const RGBPixel *imageData = image->GetBufferPointer();
const HPDF_Image pdfImage = HPDF_LoadRawImageFromMem(m_Document,
reinterpret_cast<const unsigned char *>(imageData),
width, height, HPDF_CS_DEVICE_RGB, 8);
// Usual LibHaru code. EndText, Position, Draw, StartText, etc.
// This code should not be dependent on the pixel type
InsertImage(pdfImage);
I think the only complicated part is the reinterpret_cast. The black and white image doesn't need one because its data is already defined as bytes. For example, if you have this image
102 255 255
99 200 0
255 0 100
imageData == {102, 255, 255, 99, 200, 0, 255, 0, 100};
However, if you have this color image
( 0, 0, 255) (0, 255, 255) ( 42, 255, 242)
(200, 200, 255) (0, 199, 199) (190, 190, 190)
imageData == {0, 0, 255, 0, 255, 255, 42, 255, 242, 200, 200, 255, ... }
which LibHaru will understand because you tell it to use HPDF_CS_DEVICE_RGB, which means that it will group the data in (R, G, B).
Of course, using ImageMagick, you need to find how to access the first pixel. It's probably a method like data(), begin(), pointer(), etc.
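If the source library turned out to hand back column-major data, the "transpose" mentioned above could be sketched like this for a single-channel 8-bit image. The function name is made up for the example, and nothing here calls LibHaru or ImageMagick.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert a column-major 8-bit image (columns stored one after another)
// into the row-major layout that LibHaru expects for raw data.
inline std::vector<std::uint8_t> toRowMajor(const std::vector<std::uint8_t>& colMajor,
                                            std::size_t width, std::size_t height) {
    assert(colMajor.size() == width * height);
    std::vector<std::uint8_t> rowMajor(colMajor.size());
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            rowMajor[y * width + x] = colMajor[x * height + y];
        }
    }
    return rowMajor;
}
```

For the 3x3 grayscale example above, the column-major input {102, 99, 255, 255, 200, 0, 255, 0, 100} becomes the row-major {102, 255, 255, 99, 200, 0, 255, 0, 100}.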
Unfortunately I have worked with neither ImageMagick nor libharu, however I have some experience with image processing and since nobody has answered yet, maybe I can be of some help.
The problem is probably that there is a plethora of raw image formats, and I'm quite sure that the two libraries do not have the same understanding of these. What makes things worse is that the raw image interpretation of libharu is virtually undocumented. However, the parameters of "HPDF_LoadRawImageFromMem" suggest that libharu handles raw data in a quite straightforward way.
Width and height are pretty much self-explanatory, the only question being the unit used (probably pixels). More interesting is "bits_per_component". This parameter probably describes how many bits are used to define one pixel (common values are 8: indexed from a palette of 256 values, 16: indexed from a palette of 65536 values, 24: one byte each for red, green, and blue [RGB], 32: as 24 but with an alpha channel, or 8 bits each for cyan, magenta, yellow, and black [CMYK], 36: as 32 but with 9 bits per value for easier transposition...). A problem is the lousy documentation of the type HPDF_ColorSpace, since it probably describes how color values with "bits_per_component" bits are to be interpreted.
A totally different approach seems to be implemented by ImageMagick. An image object always seems to have an image format (JPEG, PNG, GIF), therefore an image object probably never has a "straightforward" memory representation but is encoded.
My recommendation would be to switch the ImageMagick image to the TIFF format, since it can do without compression and therefore has a similar approach to the assumed raw interpretation by libharu.
Hope this helped at least a bit...
Cheers
Mark.
It is never late to answer.
I have used a PNG blob as intermediate step:
Image image;
image.read("file.jpg");
Blob blob;
image.write(blob, "PNG");
HPDF_Image pdfImg = HPDF_LoadPngImageFromMem(doc, (const HPDF_BYTE*)blob.data(), blob.length());
HPDF_Page_DrawImage(page, pdfImg, 0, 0, image.columns(), image.rows()); // page is the HPDF_Page to draw on, not the document
PDF document and page creation omitted for brevity.