When using libjpeg to feed images into OpenCL, in order to treat channels as normalized uint8s with CL_UNORM_INT8 (floats in the range [0.0, 1.0]), you can only feed it buffers with 4 channel components. This is problematic, because libjpeg only outputs 3 (by default in RGB order), since JPEG has no notion of opacity.
The only workaround I see is to read scanlines with libjpeg, make a duplicate buffer of the appropriate length (with the fourth channel component added for each pixel in the scanlines), and then memcpy the values over, setting the alpha component to 255 for each pixel. You could even do this in place if you are tricky: allocate the row buffer at width * 4 bytes up front, have libjpeg write the 3-component scanline into its first width * 3 bytes, and then walk backwards from byte index width * 3 - 1 down to 0, moving components to their proper places in the full RGBA row (and inserting 255 for alpha where necessary).
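For what it's worth, a minimal sketch of that in-place expansion; the function name is mine, and it assumes the row buffer was allocated at width * 4 bytes with the decoded RGB data sitting in its first width * 3 bytes:

#include <stddef.h>
#include <stdint.h>

// Expand a tightly packed RGB row into RGBA in place.  Walking backwards from
// the last pixel means no source byte is overwritten before it has been read.
static void expand_rgb_to_rgba_inplace(uint8_t *row, size_t width)
{
    for (size_t x = width; x-- > 0; ) {
        row[x * 4 + 3] = 255;             // opaque alpha
        row[x * 4 + 2] = row[x * 3 + 2];  // B
        row[x * 4 + 1] = row[x * 3 + 1];  // G
        row[x * 4 + 0] = row[x * 3 + 0];  // R
    }
}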
However, this feels hacky and if you're dealing with large images (I am), it's unacceptable to have this extra pass over (what will be in aggregate) the entire image.
So, is there a way to get libjpeg to just extend the number of components to 4? I've tried setting properties on cinfo like output_components to no avail. I've read that the only workaround is to compile a custom version of libjpeg with RGB_PIXELSIZE defined as 4 in jmorecfg.h, but that certainly doesn't feel portable, or for that matter necessary for such a common change of output.
So it turns out that the best solution (at least, the one that doesn't require any custom library builds or extra passes over the buffer) is to use libjpeg-turbo. As of 1.1.90 it provides the colorspace constant JCS_EXT_RGBX, which adds a fake alpha channel. To my knowledge this is only documented in the release notes of a beta version on SourceForge, so in case that URL changes or disappears (read: the internet revolts against SF for its shady insertion of code into "inactive" popular repos and it is forced to shut down), here is the relevant bit reproduced:
When decompressing a JPEG image using an output colorspace of
JCS_EXT_RGBX, JCS_EXT_BGRX, JCS_EXT_XBGR, or JCS_EXT_XRGB, libjpeg-turbo will
now set the unused byte to 0xFF, which allows applications to interpret that
byte as an alpha channel (0xFF = opaque).
Note that this also allows for alternate orderings such as BGR should you need them.
To use it, after your jpeg_read_header() call (because that call resets the relevant cinfo member to its default) but before your jpeg_start_decompress() call (because that call uses the member's value), add:
cinfo.out_color_space = JCS_EXT_RGBX; // or JCS_EXT_XRGB, JCS_EXT_BGRX, etc.
And now each scanline returned during decompression will carry an extra fourth component per pixel, set to 255.
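For context, a minimal decompression sketch, assuming you are building against libjpeg-turbo 1.1.90 or later; the function name is mine, and error handling is left to libjpeg's default error_exit (which aborts):

#include <stdio.h>
#include <stdlib.h>
#include <jpeglib.h>

// Decode `infile` as RGBX: 4 bytes per pixel, with X forced to 0xFF by libjpeg-turbo.
unsigned char *decode_rgbx(FILE *infile, int *width, int *height)
{
    struct jpeg_decompress_struct cinfo;
    struct jpeg_error_mgr jerr;

    cinfo.err = jpeg_std_error(&jerr);
    jpeg_create_decompress(&cinfo);
    jpeg_stdio_src(&cinfo, infile);
    jpeg_read_header(&cinfo, TRUE);

    cinfo.out_color_space = JCS_EXT_RGBX;   // requires libjpeg-turbo >= 1.1.90
    jpeg_start_decompress(&cinfo);          // output_components is now 4

    *width  = (int)cinfo.output_width;
    *height = (int)cinfo.output_height;
    size_t row_bytes = (size_t)cinfo.output_width * (size_t)cinfo.output_components;
    unsigned char *pixels = (unsigned char *)malloc(row_bytes * cinfo.output_height);

    while (cinfo.output_scanline < cinfo.output_height) {
        JSAMPROW row = pixels + (size_t)cinfo.output_scanline * row_bytes;
        jpeg_read_scanlines(&cinfo, &row, 1);
    }

    jpeg_finish_decompress(&cinfo);
    jpeg_destroy_decompress(&cinfo);
    return pixels;   // ready for clCreateImage with CL_RGBA / CL_UNORM_INT8
}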
Related
I want to add support for 10-bit color in my DirectX C++ Windows app.
I'm talking about 10 bits per RGB channel (30 bits total for RGB), using DXGI_FORMAT_R10G10B10A2_UNORM.
How can I detect whether the system supports that, i.e. whether the display monitor actually supports this mode?
For example, I'm enumerating the display mode list:
IDXGIOutput *output = nullptr;
for (UINT i = 0; SUCCEEDED(adapter->EnumOutputs(i, &output)); i++)
{
   DXGI_FORMAT mode = DXGI_FORMAT_R10G10B10A2_UNORM;
   UINT descs_elms = 0;
   output->GetDisplayModeList(mode, 0, &descs_elms, nullptr); // get the number of mode descs
   output->Release();
}
And even though my laptop display doesn't support 10-bit, I still get valid results, including the list of resolutions.
Later on I can create a full-screen swap chain with the 10-bit format, and everything works OK.
However, because I don't know whether the monitor is 10-bit or 8-bit, I don't know whether I need to manually apply some dithering to simulate 10-bit.
So what I want to know is whether the display is actually 10-bit (no dithering needed) or 8-bit (in which case I will apply my custom dithering shader).
I'm working with both classic Win32/WinAPI and the new Universal Windows Platform (so I need solution for both platforms).
You are probably better off having some kind of user-setting to turn on/off dithering and just using DXGI_FORMAT_R10G10B10A2_UNORM for your swapchain.
Note that DXGI_FORMAT_R10G10B10A2_UNORM is also a valid HDR format, but requires manual application of the ST.2084 color-curve and setting the color-space appropriately to indicate the use of wide-gamut (i.e. HDR10). You could use DXGI_FORMAT_R16G16B16A16_FLOAT and render in linear color-space, but you are leaving it up to the system to do the 'correct' thing which may or may not match your expectations.
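For reference, a sketch of that color-space signaling, assuming you hold an IDXGISwapChain3 (or later) pointer named swapChain3 on an R10G10B10A2 swap chain; the check-then-set arrangement here is my own:

// Signal HDR10 (BT.2020 primaries + ST.2084 curve) on a 10-bit swap chain.
DXGI_COLOR_SPACE_TYPE hdr10 = DXGI_COLOR_SPACE_RGB_FULL_G2084_NONE_P2020;
UINT support = 0;
if (SUCCEEDED(swapChain3->CheckColorSpaceSupport(hdr10, &support)) &&
    (support & DXGI_SWAP_CHAIN_COLOR_SPACE_SUPPORT_FLAG_PRESENT))
{
    swapChain3->SetColorSpace1(hdr10); // content must already be ST.2084-encoded
}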
For non-HDR/4kUHD scenarios, you can't really get anything more than 8-bit with windowed mode or UWP CoreWindows swapchains anyhow because DWM is converting you to 8-bit. 10-bit scan out for Win32 'classic' desktop apps is possible with exclusive full-screen mode but there are a lot of variables (DVI cables don't support it for example).
See High Dynamic Range and Wide Color Gamut, as well as the D3D12HDR samples for Win32 and UWP.
Figuring out whether a monitor really supports 10-bit mode seems to be very tricky, since a lot of 10-bit models are actually 8-bit + dither (and a lot of 8-bit models are just 6-bit + dither). At least in some software I'm familiar with, such a check was implemented through a whitelist of display models. So I think it would be better for you to always use DXGI_FORMAT_R16G16B16A16_FLOAT (this one is probably supported by most adapters) or DXGI_FORMAT_R32G32B32A32_FLOAT as the output format and let the system convert it according to the display's expectations. Custom dithering can be left as an option for the user.
I am trying to use vkCreateImage with a 3-component image (rgb).
But all of the RGB formats give:
vkCreateImage format parameter (VK_FORMAT_R8G8B8_xxxx) is an unsupported format
Does this mean that I have to reshape the data in memory, i.e. add a padding byte after every 3 bytes and then load it as RGBA?
I also noticed the R8 and R8G8 formats do work, so I would guess the only reason RGB is not supported is that 3 bytes per pixel is not a power of two.
Before I actually do this reshaping of the data, I'd like to know for sure that it is the only way, because it is not very good for performance, and maybe there is some offset or padding value somewhere that will help load the RGB data into an RGBA format. So can somebody confirm that reshaping into RGBA is a necessary step to load RGB formats (albeit with 33% overhead)?
Thanks in advance.
First, you're supposed to check to see what is supported before you try to create an image. You shouldn't rely on validation layers to stop you; that's just a debugging aid to catch something when you forgot to check. What is and is not supported is dynamic, not static. It's based on your implementation. So you have to ask every time your application starts whether the formats you intend to use are available.
And if they are not, then you must plan accordingly.
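As an illustration of that check, a minimal sketch using vkGetPhysicalDeviceFormatProperties; the choice of VK_FORMAT_R8G8B8A8_UNORM as the fallback is just an example:

// Query whether tightly packed RGB is usable for sampled images on this
// implementation; fall back to RGBA (and expand the pixel data) if not.
VkFormatProperties props = {};
vkGetPhysicalDeviceFormatProperties(physicalDevice, VK_FORMAT_R8G8B8_UNORM, &props);

VkFormat chosen = VK_FORMAT_R8G8B8A8_UNORM; // widely supported fallback
if (props.optimalTilingFeatures & VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT)
    chosen = VK_FORMAT_R8G8B8_UNORM;        // plain RGB is supported here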
Second, yes, if your implementation does not support 3-channel formats, then you'll need to emulate them with a 4-channel format. You will have to re-adjust your data to fit your new format.
If you don't like doing that, I'm sure there are image editors you can use to load your image, add an opaque alpha of 1.0, and save it again.
I'm trying to convert a 2D array to a DDS and save it to a file. The array is full of Color structs (each having a red, green, blue and alpha component). Once I get the array into the correct format, I'm sure saving it to a file won't be a problem.
I'm fine with either using a lib for this (as long as its license allows me to use it in a closed-source project and it works on both Linux and Windows) or doing it manually, if I can find a nice resource explaining how to do it.
If anyone can point me in the right direction, I'd really appreciate it.
In DirectDraw you can create a surface from the data in memory, by setting up certain fields in the DDSURFACEDESC structure and passing it to the CreateSurface method of the IDirectDraw interface.
First you need to tell DirectDraw which fields of the DDSURFACEDESC structure contain the correct information by setting the dwFlags field to the following set of flags: DDSD_WIDTH | DDSD_HEIGHT | DDSD_PIXELFORMAT | DDSD_LPSURFACE | DDSD_PITCH.
Oh, and this only works for system-memory surfaces, so you will probably need to add the DDSCAPS_SYSTEMMEMORY flag in the ddsCaps.dwCaps field (and DDSD_CAPS to dwFlags), if DirectDraw doesn't do it by default.
Then you specify the address of the beginning of your pixel data array in the lpSurface field. If your buffer is contiguous, just set lPitch to 0. Otherwise, set the correct pitch there (the distance in bytes between the beginnings of two subsequent scanlines).
Set the correct pixel format in ddpfPixelFormat field, with correct bit depth in dwRGBBitCount and RGB masks in dwRBitMask, dwGBitMask and dwBBitMask.
The number of bytes per pixel (3 for 24-bit RGB) follows from the pixel format you set, via dwRGBBitCount.
Then pass the filled structure into CreateSurface and see if it works.
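Roughly, the setup described above might look like the following, assuming an IDirectDraw7 pointer named ddraw and a tightly packed 24-bit RGB buffer named pixels with dimensions width and height; the field values are illustrative, not tested:

// Wrap an existing system-memory pixel buffer in a DirectDraw surface.
DDSURFACEDESC2 ddsd = {};
ddsd.dwSize         = sizeof(ddsd);
ddsd.dwFlags        = DDSD_WIDTH | DDSD_HEIGHT | DDSD_PIXELFORMAT |
                      DDSD_LPSURFACE | DDSD_PITCH | DDSD_CAPS;
ddsd.dwWidth        = width;
ddsd.dwHeight       = height;
ddsd.lPitch         = width * 3;     // bytes per scanline for tightly packed RGB
ddsd.lpSurface      = pixels;        // your buffer; you keep ownership of this memory
ddsd.ddsCaps.dwCaps = DDSCAPS_OFFSCREENPLAIN | DDSCAPS_SYSTEMMEMORY;

ddsd.ddpfPixelFormat.dwSize        = sizeof(ddsd.ddpfPixelFormat);
ddsd.ddpfPixelFormat.dwFlags       = DDPF_RGB;
ddsd.ddpfPixelFormat.dwRGBBitCount = 24;
ddsd.ddpfPixelFormat.dwRBitMask    = 0x00FF0000;
ddsd.ddpfPixelFormat.dwGBitMask    = 0x0000FF00;
ddsd.ddpfPixelFormat.dwBBitMask    = 0x000000FF;

IDirectDrawSurface7 *surface = nullptr;
HRESULT hr = ddraw->CreateSurface(&ddsd, &surface, nullptr);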
When you create the surface this way, keep in mind that DirectDraw will not manage its data buffer itself, and won't free this memory once you call Release on your surface. You need to free this memory yourself when it's no longer used by the surface.
If you want the pixel data to be placed in video memory instead, you need to create an offscreen surface in the usual way, lock it, copy your pixels into its own buffer in video memory (you'll find its address in the lpSurface field, and remember to take lPitch into account!), and then unlock it.
I'm currently doing a steganography project (for myself). I have done a bit of code already but after thinking about it, I know there are better ways of doing what I want.
Also - this is my first time using dynamic memory allocation and binary file I/O.
Here is my code to hide a text file within a BMP image: Link to code
Also note that I'm not using the LSB to store the message in this code, but rather replacing the alpha byte, assuming it's a 32 bits-per-pixel (bpp) image. Which is another reason why this won't be very flexible if the image is 1, 4, 8, 16 or 24 bpp. For example, if it were 24 bpp, the alpha channel would be 6 bits, not 1 byte.
My question is what is the best way to read the entire BMP into memory using structures?
This is how I see it:
Read BITMAPFILEHEADER
Read BITMAPINFOHEADER
Read ColorTable (if there is one)
Read PixelArray
I know how to read in the two headers, but the ColorTable is confusing me; I don't know what size the ColorTable is, or whether the image has one at all.
Also, after the PixelArray, Wikipedia says that there could be an ICC Color Profile, how do I know one exists? Link to BMP File Format (Wikipedia)
Another thing, since I need to know the header info in order to know where the PixelArray starts, I would need to make multiple reads like I showed above, right?
Sorry for all the questions in one, but I'm really unsure at the moment on what to do.
The size of the color table is determined by bV5ClrUsed (if that is zero and the bit depth is 8 or less, the table has 2^bV5BitCount entries).
An ICC color profile is present in the file only if bV5CSType == PROFILE_EMBEDDED.
The documentation here provides all that information.
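To tie those pieces together, a sketch of the read flow using the Windows structures; it assumes the file really carries a BITMAPV5HEADER (real code should check the header size field first, since many files only have the 40-byte BITMAPINFOHEADER):

#include <windows.h>
#include <cstdio>

// Read the BMP headers, work out the color table size, then seek to the PixelArray.
bool read_bmp_headers(FILE *f, BITMAPFILEHEADER &bfh, BITMAPV5HEADER &bv5)
{
    if (fread(&bfh, sizeof(bfh), 1, f) != 1 || bfh.bfType != 0x4D42) return false; // 'BM'
    if (fread(&bv5, sizeof(bv5), 1, f) != 1) return false; // assumes a V5 header

    // Size of the color table that sits between the headers and the pixels.
    DWORD colors = bv5.bV5ClrUsed;
    if (colors == 0 && bv5.bV5BitCount <= 8)
        colors = 1u << bv5.bV5BitCount;            // 0 means "full table for this depth"
    DWORD color_table_bytes = colors * sizeof(RGBQUAD);
    (void)color_table_bytes;                       // read the palette here if you need it

    // bfOffBits is the absolute offset of the PixelArray, so seek straight to it.
    return fseek(f, (long)bfh.bfOffBits, SEEK_SET) == 0;
}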
Then, 24-bit color means 8 red, 8 green, 8 blue, 0 alpha. You'd have to convert that to 32-bit RGBA in order to have any alpha channel at all.
Finally, the alpha channel DOES affect the display of the image, so you can't use it freely for steganography. You really are better off using the least significant bits of all channels (and maybe not from all pixels).
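To make the LSB suggestion concrete, a tiny sketch; the helper name and one-payload-bit-per-pixel-byte layout are my own, and real code would also store the message length and respect BMP row padding:

#include <cstdint>
#include <cstddef>

// Hide the message in the least significant bit of each pixel byte.
// The caller must check capacity: one payload bit per pixel byte.
void embed_lsb(std::uint8_t *pixels, std::size_t pixel_bytes,
               const std::uint8_t *msg, std::size_t msg_bytes)
{
    for (std::size_t i = 0; i < msg_bytes * 8 && i < pixel_bytes; ++i) {
        std::uint8_t bit = (msg[i / 8] >> (7 - (i % 8))) & 1u;  // next payload bit
        pixels[i] = (std::uint8_t)((pixels[i] & ~1u) | bit);    // overwrite the LSB
    }
}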
I need to convert 24bppRGB to 16bppRGB, 8bppRGB, 4bppRGB, 8bpp grayscale and 4bpp grayscale. Any good links or other suggestions?
preferably using Windows/GDI+
[EDIT] speed is more critical than quality. source images are screenshots
[EDIT1] color conversion is required to minimize space
You're better off getting yourself a library, as others have suggested. Aside from ImageMagick, there are others, such as OpenCV. The benefits of leaving this to a library are:
Save yourself some time -- by cutting out dev and testing time for the algorithm
Speed. Most libraries out there are optimized to a level far greater than a standard developer (such as ourselves) could achieve
Standards compliance. There are many image formats out there, and using a library cuts the problem of standards compliance out of the equation.
If you're doing this yourself, then your problem can be divided into the following sub-problems:
Simple color quantization. As #Alf P. Steinbach pointed out, this is just "downscaling" the number of colors. RGB24 has 8 bits each for the R, G and B channels. For RGB16 you can do a number of conversions:
Equal number of bits for each of R, G, B. This typically means 4 or 5 bits each.
Favor the green channel (human eyes are more sensitive to green) and give it 6 bits. R and B get 5 bits (this 5-6-5 packing is shown in the sketch after this list).
You can even do the same thing for RGB24 to RGB8, but the results won't be as pretty as a palettized image:
4 bits green, 2 red, 2 blue.
3 bits green, with the remaining 5 bits split between red and blue.
Palettization (indexed color). This is for going from RGB24 to RGB8 and RGB4. This is a hard problem to solve by yourself.
Color-to-grayscale conversion. Very easy. Convert your RGB24 to the Y'UV color space and keep the Y' channel (also shown in the sketch after this list). That will give you 8bpp grayscale. If you want 4bpp grayscale, then you either quantize or do palettization.
Also be sure to check out chroma subsampling. Often, you can decrease the bitrate by a third without visible losses to image quality.
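A minimal sketch of the two easy pieces mentioned above, 5-6-5 packing and BT.601 luma for grayscale (function names are mine; the integer luma weights are the usual approximation):

#include <cstdint>

// Pack 8-bit R, G, B into RGB565: 5 bits red, 6 bits green, 5 bits blue.
inline std::uint16_t pack_rgb565(std::uint8_t r, std::uint8_t g, std::uint8_t b)
{
    return (std::uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

// 8bpp grayscale via the BT.601 luma approximation Y' = 0.299 R + 0.587 G + 0.114 B.
inline std::uint8_t to_gray(std::uint8_t r, std::uint8_t g, std::uint8_t b)
{
    return (std::uint8_t)((299 * r + 587 * g + 114 * b) / 1000);
}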
With that breakdown, you can divide and conquer. Problems 1 and 3 you can solve pretty quickly. That will allow you to see the quality you can get simply by doing coarser color quantization.
Whether or not you want to solve Problem 2 will depend on the result from above. You said that speed is more important, so if the quality from color quantization alone is good enough, don't bother with palettization.
Finally, you never mentioned WHY you are doing this. If this is for reducing storage space, then you should be looking at image compression. Even lossless compression will give you better results than reducing the color depth alone.
EDIT
If you're set on using PNG as the final format, then your options are quite limited, because PNG truecolor only allows 8 or 16 bits per channel, so 16bpp and 8bpp RGB are not valid combinations in the PNG header.
So what this means is: regardless of bit depth, you will have to switch to indexed color if you want RGB color images below 24bpp (8 bits per channel). This means you will NOT be able to take advantage of the color quantization and chroma decimation that I mentioned above, since PNG doesn't support them. So this means you will have to solve Problem 2 -- palettization.
But before you think about that, some more questions:
What are the dimensions of your images?
What sort of ideal file-size are you after?
How close to that ideal file-size do you get with straight RGB24 + PNG compression?
What is the source of your images? You've mentioned screenshots, but since you're so concerned about disk space, I'm beginning to suspect that you might be dealing with image sequences (video). If this is so, then you could do better than PNG compression.
Oh, and if you're serious about doing things with PNG, then definitely have a look at this library.
Find yourself a copy of the ImageMagick library. It's very configurable, so you can teach it about the details of some binary format that you need to process...
See: ImageMagick, which has a very practical license.
I received acceptable (preliminary) results with GDI+ v1.1, which ships with Vista and Win7. It allows conversion to 16bpp (I used PixelFormat16bppRGB565) and to 8bpp and 4bpp using standard palettes. Better quality could be obtained with an "optimal palette": GDI+ would calculate the optimal palette for each screenshot, but the conversion is about twice as slow. Grayscale was obtained by specifying a simple custom palette, e.g. as demonstrated here, except that I didn't need to modify pixels manually; Bitmap::ConvertFormat() did it for me.
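For illustration, roughly what those GDI+ 1.1 calls look like (this assumes GdiplusStartup has already been called and, as the edit below notes, GDI+ 1.1 is not available on XP; the dither and palette choices are just examples):

#define GDIPVER 0x0110   // expose the GDI+ 1.1 APIs (ConvertFormat, InitializePalette)
#include <windows.h>
#include <gdiplus.h>
#include <cstdlib>
using namespace Gdiplus;

// 24bpp -> 16bpp (5-6-5); the palette arguments are ignored for non-indexed targets.
void to_rgb565(Bitmap &bmp)
{
    bmp.ConvertFormat(PixelFormat16bppRGB565, DitherTypeNone,
                      PaletteTypeCustom, nullptr, 0.0f);
}

// 24bpp -> 8bpp indexed with a fixed halftone palette and error diffusion.
// (Bitmap::InitializePalette with PaletteTypeOptimal is the slower, nicer route.)
void to_8bpp(Bitmap &bmp)
{
    ColorPalette *pal =
        (ColorPalette *)std::malloc(sizeof(ColorPalette) + 255 * sizeof(ARGB));
    pal->Flags = 0;
    pal->Count = 256;  // must be set before InitializePalette
    Bitmap::InitializePalette(pal, PaletteTypeFixedHalftone256, 0, FALSE, nullptr);
    bmp.ConvertFormat(PixelFormat8bppIndexed, DitherTypeErrorDiffusion,
                      PaletteTypeFixedHalftone256, pal, 0.0f);
    std::free(pal);
}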
[EDIT] results were really acceptable until I decided to check the solution on WinXP. Surprisingly, Microsoft decided not to ship GDI+ v1.1 (required for Bitmap::ConvertFormat) with WinXP. Nice move! So I continue researching...
[EDIT] I had to reimplement this with plain GDI, hard-coding the palettes taken from GDI+.