Understanding image dithering and how it helps blend CSM splits - OpenGL

So I wish to implement dithering as a blend mode between my cascade shadow map splits.
I had no idea what dithering was, so I watched this video to try and understand it. As far as I understand, it's a way to map an image's colors to a limited palette while trying to maintain a convincing gradient between differently colored pixels.
From the video I understand how to calculate the color my eye will perceive based on the weights of the dithering pattern. What I do not understand is how we take an image with 4 bytes of data per pixel and map it to, for example, 1 byte per pixel. How can we map each pixel color in the original image to a dither pattern whose weighted average looks like the original color if we're so limited? Say we were limited to only 5 colors; I'm guessing not every possible weighted-average combination of dither patterns using these 5 palette colors could reproduce the original pixel color, so how can this be achieved? Also, is a dither pattern calculated for each pixel to achieve a dithered image?
Besides these general questions about image dithering, I'm still having difficulties understanding how this technique helps us blend between cascade splits. As for actually implementing it in code, I've seen an example that takes the screen-space coordinates of a fragment and calculates a dither value (I'm not sure what it's calculating exactly, because it doesn't return a matrix, it returns a float):
float GetDither2(ivec2 p)
{
    float d = 0.0;
    if((p.x & 1) != (p.y & 1))
        d += 2.0;
    if((p.y & 1) == 1)
        d += 1.0;
    d *= 0.25;
    return d;
}

float GetDither4(ivec2 p)
{
    float d = GetDither2(p);
    d = d * 0.25 + GetDither2(p >> 1);
    return d;
}

float threshold = GetDither4(ivec2(gl_FragCoord.xy));
if(factor <= threshold)
{
    // sample current cascade
}
else
{
    // sample next cascade
}
And then it samples either cascade map based on this returned float.
My brain can't translate what I learned (that a dither pattern can simulate a larger color palette) into this example, which uses the returned float as a threshold and compares it to some blend factor just to decide which shadow map to sample from. So it made me more confused.
Would appreciate a good explanation of this 🙏
EDIT:
Ok, I see the correlation between the algorithm I was given and the Wikipedia article about ordered dithering, which as far as I understand is the preferred dithering algorithm here because, according to the article:
Additionally, because the location of the dithering patterns always
stays the same relative to the display frame, it is less prone to
jitter than error-diffusion methods, making it suitable for
animations.
Now I see that the code tries to compute this threshold value for a given screen coordinate, although it seems to me it got it slightly wrong, because the threshold should be calculated as:
Mpre(i,j) = (Mint(i,j) + 1) / n^2
so it needs to set float d = 1.0 instead of float d = 0.0, if I'm not mistaken.
Secondly, I'm not sure what right-shifting the ivec2 screen coordinate (p >> 1) does. I'm not even sure what the behavior of a bitwise shift on a vector is in GLSL, but I assume it's a component-wise operation; when I plugged in (by hand) the screen coordinate (2,1) under that assumption, I got a threshold different from what should be the value at this position in a 4x4 Bayer matrix.
So I'm skeptical about how well this code implements the ordered dithering algorithm.
Thirdly, I'm still not sure how this threshold value has anything to do with choosing between shadow map 1 or 2, rather than just reducing the color palette of a given pixel. This logic hasn't settled in my mind yet, as I don't understand how a dithering threshold for a given screen coordinate helps choose the right map to sample from.
Lastly, won't using screen coordinates cause jitter? Take a fragment at world position (x,y,z) that is shadowed, and say its screen coordinates for a given frame are (i,j). If the camera moves, won't this fragment's screen coordinates change, making the dither threshold calculated for it change with every movement and causing the dither pattern to jitter?
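To check my hand calculation, here is a small CPU-side sketch (my own test code, not part of the example I was given) that ports the two GLSL functions to C++ and prints the 4x4 block of thresholds they produce, so it can be compared entry by entry against the Bayer matrix from the article:
#include <cstdio>

// Direct C++ port of the GLSL GetDither2/GetDither4 above,
// applied component-wise to an (x, y) pair.
float GetDither2(int x, int y)
{
    float d = 0.0f;
    if ((x & 1) != (y & 1)) d += 2.0f;
    if ((y & 1) == 1)       d += 1.0f;
    return d * 0.25f;
}

float GetDither4(int x, int y)
{
    float d = GetDither2(x, y);
    return d * 0.25f + GetDither2(x >> 1, y >> 1);
}

int main()
{
    // Print the thresholds scaled by 16 so they read as integer ranks 0..15.
    for (int y = 0; y < 4; ++y)
    {
        for (int x = 0; x < 4; ++x)
            printf("%5.1f", GetDither4(x, y) * 16.0f);
        printf("\n");
    }
    return 0;
}
(From this check, the values do form a permutation of 0..15, just not laid out in the same order as the classic Bayer index matrix, which matches the mismatch I got by hand.)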
EDIT2:
I tried to blend the maps as follows, although the result doesn't look so good. Any ideas?
const int indexMatrix8x8[64] = int[](
     0, 32,  8, 40,  2, 34, 10, 42,
    48, 16, 56, 24, 50, 18, 58, 26,
    12, 44,  4, 36, 14, 46,  6, 38,
    60, 28, 52, 20, 62, 30, 54, 22,
     3, 35, 11, 43,  1, 33,  9, 41,
    51, 19, 59, 27, 49, 17, 57, 25,
    15, 47,  7, 39, 13, 45,  5, 37,
    63, 31, 55, 23, 61, 29, 53, 21
);

for (int i = 0; i < NR_LIGHT_SPACE; i++) {
    if (fs_in.v_FragPosClipSpaceZ <= u_CascadeEndClipSpace[i]) {
        shadow = isInShadow(fs_in.v_FragPosLightSpace[i], normal, lightDirection, i) * u_ShadowStrength;
        int x = int(mod(gl_FragCoord.x, 8));
        int y = int(mod(gl_FragCoord.y, 8));
        float threshold = (indexMatrix8x8[(x + y * 8)] + 1) / 64.0;
        if (u_CascadeBlend >= threshold)
        {
            shadow = isInShadow(fs_in.v_FragPosLightSpace[i + 1], normal, lightDirection, i + 1) * u_ShadowStrength;
        }
        break;
    }
}
Basically, if I understand what I'm doing: I get the threshold value from the matrix for the screen coordinate of each shadowed pixel, and if the blend factor is higher than that threshold (which happens with a probability proportional to the blend factor), then I sample the second map instead.
Here are the results:
The larger red box is where the split between maps occurs.
The smaller red box shows that there is some dither pattern, but the image isn't as blended as I think it should be.

First of all, I have no knowledge about CSM, so I'll focus on the dithering and blending. Firstly, see these:
my very simple dithering in C++ I came up with
image dithering routine that accepts an amount of dithering? and its variation
They basically answer your questions about how to compute the dithering pattern/pixels.
Also, it's important to have a good palette for dithering that reduces your 24/32 bpp into 8 bpp (or less). There are 2 basic approaches:
#1 reduce colors (color quantization)
Compute a histogram of the original image and pick significant colors from it that more or less cover the whole image information. For more info see:
Effective gif/image color quantization?
#2 dithering palette
Dithering uses averaging of pixels to generate the desired color, so we need colors that can generate all the possible colors we want. It's good to have a few (2..4) shades of each base color (R,G,B,C,M,Y) and some (>=4) shades of gray. From these you can combine any color and intensity you want (if you have enough pixels). See the palette sketch below.
#1 is the best, but it is per-image, so you need to compute the palette for each image. That can be a problem, as that computation is nasty CPU-hungry stuff. Also, on old 256-color modes you could not show 2 different palettes at the same time (with true color that is no longer a problem), so dithering is usually the better choice.
You can even combine the two for impressive results.
The better the used palette is, the less grainy the result is...
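For illustration, here is a small sketch of how such a dithering palette could be built programmatically (just one possible layout, a few shades of each base color plus a gray ramp, not a definitive recipe):
#include <vector>
#include <cstdint>

struct RGB { uint8_t r, g, b; };

// Build a small dithering palette: `shades` intensities of each base color
// (R, G, B, C, M, Y) plus `grays` (>= 2) shades of gray.
// With shades = 4 and grays = 8 this gives 4*6 + 8 = 32 entries.
std::vector<RGB> buildDitherPalette(int shades = 4, int grays = 8)
{
    std::vector<RGB> pal;
    const int base[6][3] = {                 // base colors R,G,B,C,M,Y
        {1,0,0},{0,1,0},{0,0,1},
        {0,1,1},{1,0,1},{1,1,0} };
    for (int s = 1; s <= shades; s++)        // shades of each base color
    {
        uint8_t v = uint8_t(255 * s / shades);
        for (int c = 0; c < 6; c++)
            pal.push_back({ uint8_t(base[c][0]*v),
                            uint8_t(base[c][1]*v),
                            uint8_t(base[c][2]*v) });
    }
    for (int s = 0; s < grays; s++)          // gray ramp including black and white
    {
        uint8_t v = uint8_t(255 * s / (grays - 1));
        pal.push_back({ v, v, v });
    }
    return pal;
}
Tune the shade and gray counts to taste; more entries mean less grain.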
The standard VGA 16- and 256-color palettes were specially designed for dithering, so it's a good idea to use them...
Standard VGA 16 color palette:
Standard VGA 256 color palette:
Here is also C++ code for the 256-color palette:
//---------------------------------------------------------------------------
//--- EGA VGA pallete -------------------------------------------------------
//---------------------------------------------------------------------------
#ifndef _vgapal_h
#define _vgapal_h
//---------------------------------------------------------------------------
unsigned int vgapal[256]=
{
0x00000000,0x00220000,0x00002200,0x00222200,
0x00000022,0x00220022,0x00001522,0x00222222,
0x00151515,0x00371515,0x00153715,0x00373715,
0x00151537,0x00371537,0x00153737,0x00373737,
0x00000000,0x00050505,0x00000000,0x00030303,
0x00060606,0x00111111,0x00141414,0x00101010,
0x00141414,0x00202020,0x00242424,0x00202020,
0x00252525,0x00323232,0x00303030,0x00373737,
0x00370000,0x00370010,0x00370017,0x00370027,
0x00370037,0x00270037,0x00170037,0x00100037,
0x00000037,0x00001037,0x00001737,0x00002737,
0x00003737,0x00003727,0x00003717,0x00003710,
0x00003700,0x00103700,0x00173700,0x00273700,
0x00373700,0x00372700,0x00371700,0x00371000,
0x00371717,0x00371727,0x00371727,0x00371737,
0x00371737,0x00371737,0x00271737,0x00271737,
0x00171737,0x00172737,0x00172737,0x00173737,
0x00173737,0x00173737,0x00173727,0x00173727,
0x00173717,0x00273717,0x00273717,0x00373717,
0x00373717,0x00373717,0x00372717,0x00372717,
0x00372525,0x00372531,0x00372536,0x00372532,
0x00372537,0x00322537,0x00362537,0x00312537,
0x00252537,0x00253137,0x00253637,0x00253237,
0x00253737,0x00253732,0x00253736,0x00253731,
0x00253725,0x00313725,0x00363725,0x00323725,
0x00373725,0x00373225,0x00373625,0x00373125,
0x00140000,0x00140007,0x00140006,0x00140015,
0x00140014,0x00150014,0x00060014,0x00070014,
0x00000014,0x00000714,0x00000614,0x00001514,
0x00001414,0x00001415,0x00001406,0x00001407,
0x00001400,0x00071400,0x00061400,0x00151400,
0x00141400,0x00141500,0x00140600,0x00140700,
0x00140606,0x00140611,0x00140615,0x00140610,
0x00140614,0x00100614,0x00150614,0x00110614,
0x00060614,0x00061114,0x00061514,0x00061014,
0x00061414,0x00061410,0x00061415,0x00061411,
0x00061406,0x00111406,0x00151406,0x00101406,
0x00141406,0x00141006,0x00141506,0x00141106,
0x00141414,0x00141416,0x00141410,0x00141412,
0x00141414,0x00121414,0x00101414,0x00161414,
0x00141414,0x00141614,0x00141014,0x00141214,
0x00141414,0x00141412,0x00141410,0x00141416,
0x00141414,0x00161414,0x00101414,0x00121414,
0x00141414,0x00141214,0x00141014,0x00141614,
0x00100000,0x00100004,0x00100000,0x00100004,
0x00100010,0x00040010,0x00000010,0x00040010,
0x00000010,0x00000410,0x00000010,0x00000410,
0x00001010,0x00001004,0x00001000,0x00001004,
0x00001000,0x00041000,0x00001000,0x00041000,
0x00101000,0x00100400,0x00100000,0x00100400,
0x00100000,0x00100002,0x00100004,0x00100006,
0x00100010,0x00060010,0x00040010,0x00020010,
0x00000010,0x00000210,0x00000410,0x00000610,
0x00001010,0x00001006,0x00001004,0x00001002,
0x00001000,0x00021000,0x00041000,0x00061000,
0x00101000,0x00100600,0x00100400,0x00100200,
0x00100303,0x00100304,0x00100305,0x00100307,
0x00100310,0x00070310,0x00050310,0x00040310,
0x00030310,0x00030410,0x00030510,0x00030710,
0x00031010,0x00031007,0x00031005,0x00031004,
0x00031003,0x00041003,0x00051003,0x00071003,
0x00101003,0x00100703,0x00100503,0x00100403,
0x00000000,0x00000000,0x00000000,0x00000000,
0x00000000,0x00000000,0x00000000,0x00000000,
};
//---------------------------------------------------------------------------
typedef unsigned char BYTE;     // 8-bit unsigned, as used below (matches the Windows BYTE typedef)
//---------------------------------------------------------------------------
class _vgapal_init_class
{
public: _vgapal_init_class();
} vgapal_init_class;
//---------------------------------------------------------------------------
_vgapal_init_class::_vgapal_init_class()
{
    int i;
    BYTE a;
    union { unsigned int dd; BYTE db[4]; } c;
    // scale the 6-bit VGA DAC values up to 8 bits and swap the R/B byte order
    for (i=0;i<256;i++)
    {
        c.dd=vgapal[i];
        c.dd=c.dd<<2;
        a      =c.db[0];
        c.db[0]=c.db[2];
        c.db[2]=a;
        vgapal[i]=c.dd;
    }
}
//---------------------------------------------------------------------------
#endif
//---------------------------------------------------------------------------
//--- end. ------------------------------------------------------------------
//---------------------------------------------------------------------------
Now back to your question about blending by dithering.
Blending is merging 2 images of the same resolution together by some amount (weights), so each pixel color is computed like this:
color = w0*color0 + w1*color1;
where color? are the pixels of the source images and w? are weights, and all the weights together sum up to 1:
w0 + w1 = 1;
Here is an example:
Draw tbitmap with scale and alpha channel faster
and a preview (the dots are dithering from my GIF encoder):
But blending by dithering is done differently. Instead of blending colors, we use some percentage of pixels from one image and the rest from the second one. So:
if (Random()<w0) color = color0;
else color = color1;
where Random() returns a pseudo-random number in the range <0,1>. As you can see, no combining of colors is done; you simply choose which image you copy the pixel from. Here is a preview:
Now the dots are caused by the blending by dithering: the intensities of the two images are very far from each other, so it does not look good, but if you dither relatively similar images (like your shadow map layers) the result should be good enough (with almost no performance penalty).
To speed this up, it is usual to precompute the Random() outputs for some box (8x8, 16x16, ...) and reuse it for the whole image (it is a bit blocky, but that is sometimes even used as a fun effect...). This way it can also be done branchlessly (if you store pointers to the source images instead of random values). It can also be done fully on integers (without fixed-point precision) if the weights are integers, for example <0..255>...
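Here is a small CPU-side sketch of that idea (my simplified illustration, assuming 8-bit grayscale images stored as flat row-major arrays; for RGB just do the same per pixel):
#include <cstdlib>
#include <cstdint>

// Blend two same-sized 8-bit grayscale images by dithering:
// w0 in <0..255> is the weight of img0 (255 = only img0, 0 = only img1).
void ditherBlend(const uint8_t *img0, const uint8_t *img1, uint8_t *out,
                 int width, int height, int w0)
{
    // precompute an 8x8 box of random thresholds in <0..255>
    uint8_t box[8][8];
    for (int y = 0; y < 8; y++)
        for (int x = 0; x < 8; x++)
            box[y][x] = uint8_t(rand() & 255);

    // per pixel: copy from img0 or img1 depending on the threshold
    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++)
        {
            int i = y * width + x;
            out[i] = (box[y & 7][x & 7] < w0) ? img0[i] : img1[i];
        }
}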
Now, to make the cascade/transition from image0 to image1 (or whatever), just do something like this:
for (w0=1.0;w0>=0.0;w0-=0.05)
{
    w1=1.0-w0;
    render blended images;
    Sleep(100);
}
render image1;

I got the dither blend to work in my code as follows:
for (int i = 0; i < NR_LIGHT_SPACE; i++) {
    if (fs_in.v_FragPosClipSpaceZ <= u_CascadeEndClipSpace[i])
    {
        float fade = fadedShadowStrength(fs_in.v_FragPosClipSpaceZ, 1.0 / u_CascadeEndClipSpace[i], 1.0 / u_CascadeBlend);
        if (fade < 1.0) {
            int x = int(mod(gl_FragCoord.x, 8));
            int y = int(mod(gl_FragCoord.y, 8));
            float threshold = (indexMatrix8x8[(x + y * 8)] + 1) / 64.0;
            if (fade < threshold)
            {
                shadow = isInShadow(fs_in.v_FragPosLightSpace[i + 1], normal, lightDirection, i + 1) * u_ShadowStrength;
            }
            else
            {
                shadow = isInShadow(fs_in.v_FragPosLightSpace[i], normal, lightDirection, i) * u_ShadowStrength;
            }
        }
        else
        {
            shadow = isInShadow(fs_in.v_FragPosLightSpace[i], normal, lightDirection, i) * u_ShadowStrength;
        }
        break;
    }
}
First, check whether we're close to the cascade split with a fading factor, taking into account the fragment's clip-space position and the end of the cascade in clip space, via fadedShadowStrength (I use this function for normal blending between cascades to know when to start blending; basically, if the blending factor u_CascadeBlend is set to 0.1, for example, then we blend once we're at least 90% into the current cascade, z clip-space wise).
Then, if we need to fade (if (fade < 1.0)), I just compare the fade factor to the threshold from the matrix and choose the shadow map accordingly.
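For reference, fadedShadowStrength isn't shown above; here is a sketch of what such a helper can look like (consistent with how it is called and with the 90% example, though not necessarily the exact shader code), written as plain C-style code:
// Sketch of the fade helper (illustrative, not the original shader source).
// Called as fadedShadowStrength(z, 1.0 / cascadeEnd, 1.0 / blend): returns 1.0
// while we are far from the cascade end and falls off linearly to 0.0 over the
// last `blend` fraction of the cascade.
float fadedShadowStrength(float z, float scale, float fade)
{
    float f = (1.0f - z * scale) * fade;
    return f < 0.0f ? 0.0f : (f > 1.0f ? 1.0f : f);   // clamp to [0,1]
}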
Results:

Related

Fast, good quality pixel interpolation for extreme image downscaling

In my program, I am downscaling an image of 500px or larger to an extreme level of approx 16px-32px. The source image is user-specified so I do not have control over its size. As you can imagine, few pixel interpolations hold up and inevitably the result is heavily aliased.
I've tried bilinear, bicubic and square average sampling. The square average sampling actually provides the most decent results but the smaller it gets, the larger the sampling radius has to be. As a result, it gets quite slow - slower than the other interpolation methods.
I have also tried an adaptive square average sampling so that the smaller it gets the greater the sampling radius, while the closer it is to its original size, the smaller the sampling radius. However, it produces problems and I am not convinced this is the best approach.
So the question is: What is the recommended type of pixel interpolation that is fast and works well on such extreme levels of downscaling?
I do not wish to use a library so I will need something that I can code by hand and isn't too complex. I am working in C++ with VS 2012.
Here's some example code I've tried as requested (hopefully without errors from my pseudo-code cut and paste). This performs a 7x7 average downscale and although it's a better result than bilinear or bicubic interpolation, it also takes quite a hit:
// Sizing control
ctl(0): "Resize",Range=(0,800),Val=100

// Variables
float fracx,fracy;
int Xnew,Ynew,p,q,Calc;
int x,y,z,p1,q1,i,j;

// New image dimensions
Xnew=image->width*ctl(0)/100;
Ynew=image->height*ctl(0)/100;

for (y=0; y<image->height; y++){        // rows
    for (x=0; x<image->width; x++){     // columns
        p1=(int)x*image->width/Xnew;
        q1=(int)y*image->height/Ynew;
        for (z=0; z<3; z++){            // channels
            Calc=0;                     // reset the accumulator per channel
            for (i=-3;i<=3;i++) {
                for (j=-3;j<=3;j++) {
                    Calc += (int)(src(p1-i,q1-j,z));
                } //j
            } //i
            Calc /= 49;
            pset(x, y, z, Calc);
        } // channels
    } // columns
} // rows
Thanks!
The first point is to use pointers to your data. Never use indexes at every pixel. When you write src(p1-i,q1-j,z) or pset(x, y, z, Calc), how much computation is being done? Use pointers to the data and manipulate those.
Second: your algorithm is wrong. You don't want an average filter; you want to lay a grid over your source image and, for every grid cell, compute the average and put it in the corresponding pixel of the output image.
The specific solution should be tailored to your data representation, but it could be something like this:
std::vector<uint32_t> accum(Xnew);
std::vector<uint32_t> count(Xnew);
uint32_t *paccum, *pcount;
uint8_t* pin  = /*pointer to input data*/;
uint8_t* pout = /*pointer to output data*/;
for (int dr = 0, sr = 0, w = image->width, h = image->height; sr < h; ++dr) {
    memset(paccum = accum.data(), 0, Xnew*4);
    memset(pcount = count.data(), 0, Xnew*4);
    // accumulate all source rows that map to output row dr
    while (sr * Ynew / h == dr) {
        paccum = accum.data();
        pcount = count.data();
        for (int dc = 0, sc = 0; sc < w; ++sc) {
            *paccum += *pin;    // add the source pixel to its grid cell
            *pcount += 1;
            ++pin;
            if (sc * Xnew / w > dc) {
                ++dc;
                ++paccum;
                ++pcount;
            }
        }
        sr++;
    }
    // divide each cell sum by its pixel count and write the output row
    std::transform(begin(accum), end(accum), begin(count), pout, std::divides<uint32_t>());
    pout += Xnew;
}
This was written using my own library (still in development) and it seems to work, but I changed the variable names afterwards to make it simpler here, so I don't guarantee anything!
The idea is to have a local buffer of 32-bit ints which can hold the partial sums of all the source pixels whose rows fall into one row of the output image. Then you divide by the cell count and save the output to the final image.
The first thing you should do is set up a performance evaluation system to measure how much any change impacts performance.
As said previously, you should not use indexes but pointers, for a (probably) substantial speed-up, and you should not simply average, as a basic averaging of pixels is basically a blur filter.
I would highly advise you to rework your code to use "kernels". A kernel is the matrix representing the weight of each pixel used. That way, you will be able to test different strategies and optimize quality.
Example of kernels:
https://en.wikipedia.org/wiki/Kernel_(image_processing)
Upsampling/downsampling kernel:
http://www.johncostella.com/magic/
Note, from the code it seems you apply a 3x3 kernel but initially done on a 7x7 kernel. The equivalent 3x3 kernel as posted would be:
[1 1 1]
[1 1 1] * 1/9
[1 1 1]
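As a rough illustration of the kernel idea (a generic sketch, not tied to your plugin API and not optimized with pointers):
#include <algorithm>
#include <vector>

// Apply a k x k kernel (weights should sum to 1) centered on (cx, cy) of a
// single-channel image stored row-major in src with dimensions w x h.
// Border pixels are clamped.
float applyKernel(const std::vector<float>& src, int w, int h,
                  const std::vector<float>& kernel, int k, int cx, int cy)
{
    int r = k / 2;
    float sum = 0.0f;
    for (int j = -r; j <= r; ++j)
        for (int i = -r; i <= r; ++i)
        {
            int x = std::min(std::max(cx + i, 0), w - 1);   // clamp to the image
            int y = std::min(std::max(cy + j, 0), h - 1);
            sum += kernel[(j + r) * k + (i + r)] * src[y * w + x];
        }
    return sum;
}
For the 3x3 box kernel above, kernel would hold nine values of 1/9; swapping in a different kernel (e.g. one from the links) needs no other code change.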

Flag undefined colors and no sample present from color matching system

By way of a color sensor, I am matching plastic color swatches to a pre-defined palette of colors in an array using the Euclidean distance (closest distance) approach. When a color is identified, a linear actuator moves. This works well, even for fairly similar pastel colors.
However, how do I code for those situations where 1. no color swatch is in front of the sensor or 2. the color is not in the array? I need to generate a "No sample" (1.) or "No match found" (2.) message and have the actuator not moving in both cases.
As it is now, when no swatch is over the sensor, the code finds a closest equivalent from the ambient light and the actuator moves (1.), when a non-matching swatch is over the sensor, the code finds a closest equivalent and the actuator moves (2.). In both cases, nothing should happen apart from outputting the messages mentioned above.
Thanks for some hints!
const int SAMPLES[12][5] = { // Values from colour "training" (averaged raw r, g and b; averaged raw c; actuator movement)
{8771, 6557, 3427, 19408, 10},
{7013, 2766, 1563, 11552, 20},
{4092, 1118, 1142, 6213, 30},
{4488, 1302, 1657, 7357, 40},
{3009, 1846, 2235, 7099, 50},
{2650, 3139, 4116, 10078, 60},
{ 857, 965, 1113, 2974, 70},
{ 964, 2014, 2418, 5476, 80},
{1260, 2200, 1459, 5043, 90},
{4784, 5898, 3138, 14301, 100},
{5505, 5242, 2409, 13642, 110},
{5406, 3893, 1912, 11457, 120}, // When adding more samples no particular order is required
};
byte findColour(int r, int g, int b) {
    int distance = 10000; // Raw distance from white to black (change depending on selected integration time and gain)
    byte foundColour;
    for (byte i = 0; i < samplesCount; i++) {
        int temp = sqrt(pow(r - SAMPLES[i][0], 2) + pow(g - SAMPLES[i][1], 2) + pow(b - SAMPLES[i][2], 2)); // Calculate Euclidean distance
        if (temp < distance) {
            distance = temp;
            foundColour = i + 1;
        }
    }
    return foundColour;
}
Whether a colour is present in the table or not can be decided by the distance of the best match: when that distance is bigger than a certain threshold, return a value that indicates "not found", for example -1 or 255.
Also, store whatever the sensor senses with no sample present (during calibration), and when that is the best match, return a value that indicates "no sample", for example 0.
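A rough sketch of how that could look (illustration only: it assumes the first SAMPLES row is the calibrated "no sample" ambient reading, and DISTANCE_THRESHOLD is a made-up constant you would tune):
const float DISTANCE_THRESHOLD = 1500.0;   // tune to your sensor, gain and integration time

// Returns 0 for "no sample", 255 for "no match found", 1..N for a matched swatch.
byte findColour(int r, int g, int b) {
    float bestDistance = 1e9;
    byte best = 0;
    for (byte i = 0; i < samplesCount; i++) {
        float temp = sqrt(pow(r - SAMPLES[i][0], 2) + pow(g - SAMPLES[i][1], 2) + pow(b - SAMPLES[i][2], 2));
        if (temp < bestDistance) {
            bestDistance = temp;
            best = i;
        }
    }
    if (bestDistance > DISTANCE_THRESHOLD) return 255;  // 2. colour not in the table
    if (best == 0)                         return 0;    // 1. best match is the ambient / no-sample row
    return best;                                        // rows 1..N are the real swatches
}
The caller then moves the actuator only for return values 1..N and just prints the "No sample" / "No match found" message for 0 or 255.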

colorbalance in an image using c++ and opencv

I'm trying to score the color balance of an image using C++ and OpenCV.
To do this, the easiest way is to count the number of pixels of each color and then see if one of the colors is more prevalent.
I figured I should probably use calcHist, and with the split function I can split an image into R, G, and B histograms. However, I am unsure about what to do next. I could probably walk through all the bins and just see how many pixels are in there, but this seems like a lot of work (I currently use 256 bins).
Is there a faster way to count the pixels in a color range? Also, I am not sure how it would work if white or black is the more prevalent color?
Automatic color balance algorithm is described in this link http://web.stanford.edu/~sujason/ColorBalancing/simplestcb.html
For C++ Code you can refer to this link : https://www.morethantechnical.com/2015/01/14/simplest-color-balance-with-opencv-wcode/
/// perform the Simplest Color Balancing algorithm
void SimplestCB(Mat& in, Mat& out, float percent) {
    assert(in.channels() == 3);
    assert(percent > 0 && percent < 100);
    float half_percent = percent / 200.0f;

    vector<Mat> tmpsplit; split(in,tmpsplit);
    for(int i=0;i<3;i++) {
        //find the low and high percentile values (based on the input percentile)
        Mat flat; tmpsplit[i].reshape(1,1).copyTo(flat);
        cv::sort(flat,flat,CV_SORT_EVERY_ROW + CV_SORT_ASCENDING);
        int lowval  = flat.at<uchar>(cvFloor(((float)flat.cols) * half_percent));
        int highval = flat.at<uchar>(cvCeil (((float)flat.cols) * (1.0 - half_percent)));
        cout << lowval << " " << highval << endl;

        //saturate below the low percentile and above the high percentile
        tmpsplit[i].setTo(lowval,tmpsplit[i] < lowval);
        tmpsplit[i].setTo(highval,tmpsplit[i] > highval);

        //scale the channel
        normalize(tmpsplit[i],tmpsplit[i],0,255,NORM_MINMAX);
    }
    merge(tmpsplit,out);
}

// Usage example
int main() {
    Mat tmp, im = imread("lily.png");
    SimplestCB(im,tmp,1);
    imshow("orig",im);
    imshow("balanced",tmp);
    waitKey(0);
    return 0;
}
Colour balance normally means looking at a white (or gray) surface and checking the ratios of red and blue to green. A perfectly balanced system would have equal signal levels in red and blue.
You can then simply work out the average red/blue scaling from the test gray-card image and apply the same scaling to your real image.
Doing it on a live image with no reference is trickier: you have to find areas that are probably white (i.e. bright and nearly r=g=b) and use them as the reference.
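A minimal sketch of that gray-card approach with OpenCV (my own illustration; it assumes 8-bit BGR images of the gray card and of the scene):
#include <opencv2/opencv.hpp>
#include <vector>

// Estimate per-channel gains from a gray-card shot so the card comes out
// neutral (B = G = R), then apply the same gains to the real image.
cv::Mat whiteBalanceFromGrayCard(const cv::Mat& card, const cv::Mat& image)
{
    cv::Scalar m = cv::mean(card);                  // average B, G, R of the card
    double gray  = (m[0] + m[1] + m[2]) / 3.0;

    std::vector<cv::Mat> ch;
    cv::split(image, ch);
    ch[0].convertTo(ch[0], -1, gray / m[0]);        // scale blue  (with saturation)
    ch[1].convertTo(ch[1], -1, gray / m[1]);        // scale green
    ch[2].convertTo(ch[2], -1, gray / m[2]);        // scale red

    cv::Mat balanced;
    cv::merge(ch, balanced);
    return balanced;
}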
There's no definitive algorithm for colour balance, so anything you might implement, however good it is, will probably fail in some conditions.
One of the simplest algorithms is called Grey World, and assumes that statistically the average colour of a scene should be grey. And if it isn't, it means that it needs to be corrected to grey. So, very simply (in pseudo-python), if you have an image RGB:
cc[0] = np.mean(RGB[:,0]) # calculating channel-wise average
cc[1] = np.mean(RGB[:,1])
cc[2] = np.mean(RGB[:,2])
cc = cc / np.sqrt((cc**2).sum()) # normalise the light (you might want to
                                 # play with this a bit)
RGB /= cc # divide every pixel by the estimated light
Note that here I'm assuming that RGB is an array of floats with values between 0 and 1. Something else that helps is to exclude from the average pixels that contain values below and above certain thresholds (e.g., below 0.05 and above 0.95). This way you ignore pixels whose value is heavily influenced by noise (small values) and pixels that saturated the camera sensor and whose colour may not be reliable (large values).

OpenCV: Calculating new red pixel value

I'm currently aiming to adjust the red pixels in an image (more specifically, an eye region, to remove red eyes caused by flash), and this works well, but the issue I'm getting is that sometimes green patches appear on the skin.
This is a good result (before and after):
I realize why this is happening, but when I adjust the threshold to a higher value (meaning the red intensity must be stronger), fewer red pixels are picked up and changed, i.e.:
The lower the threshold, the more green shows up on the skin.
I was wondering if there is an alternative method to what I'm currently doing to change the red pixels?
int lcount = 0;
for(int y=0;y<lcroppedEye.rows;y++)
{
    for(int x=0;x<lcroppedEye.cols;x++)
    {
        double b = lcroppedEye.at<cv::Vec3b>(y, x)[0];
        double g = lcroppedEye.at<cv::Vec3b>(y, x)[1];
        double r = lcroppedEye.at<cv::Vec3b>(y, x)[2];
        double redIntensity = r / ((g + b) / 2);
        //currently causes issues with non-red-eye images
        if (redIntensity >= 1.8)
        {
            double newRedValue = (g + b) / 2;
            cv::Vec3b pixelColor(newRedValue,g,b);
            lroi.at<cv::Vec3b>(cv::Point(x,y)) = pixelColor;
            lcount++;
        }
    }
}
EDIT: I could possibly add a check to ensure the new RGB values are low enough and that the R, G, B values are similar/close, so that only black/grey pixels are written out... or have a range of (greenish) RGB values which aren't allowed... would that work?
Adjusting color in RGB space has caveats like the greenish areas you faced. Convert the R,G,B values to a better color space, like HSV or LUV.
I suggest you go for HSV to detect and change the red-eye colors. R/(G+B) is not a good way of calculating red intensity: it would call (R=10,G=1,B=0) a very red color, but it is practically black. Take a look at the comparison below:
So, you'd better check that Saturation and Value are high, which is the case for a red-eye color. If you encounter other high-intensity colors, you may additionally check that the Hue is in a range of something like [0-20] and [340-359]. But even without this, you are still safe against white itself, as it has a very low saturation, so you won't select white areas anyway.
That was for selecting. For changing the color, it is again better not to use RGB, as changes in that space are not linear in the way we perceive colors. Looking at the image above, you can see that lowering both the saturation and value would be a good start. But you may experiment with it and see what looks better. Maybe you'll be fine with a dark gray always; that would mean setting Saturation to zero and lowering the Value a bit. You may think a dark brown would be better: go for a low saturation and value, but set Hue to something about 30 degrees.
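A minimal OpenCV sketch of that selection/correction idea (illustration only; the S/V thresholds are made up, and OpenCV stores Hue as 0-179, so the [0-20]/[340-359] ranges become roughly 0-10 and 170-179):
#include <opencv2/opencv.hpp>

// Desaturate and darken red-eye-like pixels inside the cropped eye region.
void fixRedEyeHSV(cv::Mat& eyeBGR)
{
    cv::Mat hsv;
    cv::cvtColor(eyeBGR, hsv, cv::COLOR_BGR2HSV);

    for (int y = 0; y < hsv.rows; y++)
        for (int x = 0; x < hsv.cols; x++)
        {
            cv::Vec3b& p = hsv.at<cv::Vec3b>(y, x);
            bool redHue = (p[0] <= 10 || p[0] >= 170);   // Hue near red
            bool strong = (p[1] > 120 && p[2] > 80);     // high Saturation and Value (tune these)
            if (redHue && strong)
            {
                p[1] = 0;             // saturation -> 0 (gray)
                p[2] = p[2] / 3;      // darken
            }
        }

    cv::cvtColor(hsv, eyeBGR, cv::COLOR_HSV2BGR);
}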
References that may help you:
Converting color values in OpenCV
An online tool to experiment with RGB and HSV colors
It may be better to change
double redIntensity = r / ((g + b) / 2);
to
double redIntensity = r / ((g + b + 1) / 2);
because g + b can be equal to 0 and the division blows up.
Also take a look at the cv::floodFill method.
Maybe it is better to ignore the color information in the red zones altogether, since the color information in an extra-red area is too distorted by the excess red. So the new values could be:
newRedValue = (g+b)/2; newGreenValue = newRedValue; newBlueValue = newRedValue;
Even if you detect a wrong red area, desaturating it will give a better result than a greenish area.
You can also use a morphological closing operation (with a circular structuring element) to avoid gaps in your red-area mask. So you will need to perform 3 steps: 1. find the red areas and create a mask for them, 2. do the morphological closing on the red-area mask, 3. desaturate the image using this mask.
And yes, don't use "r / ((g+b)/2)" as it can lead to a division-by-zero error.
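A sketch of those 3 steps with OpenCV (illustration only; the 1.8 ratio threshold is just the one from the question and the kernel size would need tuning):
#include <opencv2/opencv.hpp>

// 1. build a red-area mask, 2. close it, 3. desaturate the masked pixels.
void desaturateRedAreas(cv::Mat& eyeBGR)
{
    // Step 1: mask of pixels whose red channel dominates
    cv::Mat mask = cv::Mat::zeros(eyeBGR.size(), CV_8U);
    for (int y = 0; y < eyeBGR.rows; y++)
        for (int x = 0; x < eyeBGR.cols; x++)
        {
            cv::Vec3b p = eyeBGR.at<cv::Vec3b>(y, x);
            double r = p[2], g = p[1], b = p[0];
            if (r / ((g + b + 1) / 2.0) >= 1.8)
                mask.at<uchar>(y, x) = 255;
        }

    // Step 2: morphological closing with a circular structuring element
    cv::Mat kernel = cv::getStructuringElement(cv::MORPH_ELLIPSE, cv::Size(5, 5));
    cv::morphologyEx(mask, mask, cv::MORPH_CLOSE, kernel);

    // Step 3: desaturate the masked pixels: r = g = b = (g + b) / 2
    for (int y = 0; y < eyeBGR.rows; y++)
        for (int x = 0; x < eyeBGR.cols; x++)
            if (mask.at<uchar>(y, x))
            {
                cv::Vec3b& p = eyeBGR.at<cv::Vec3b>(y, x);
                uchar v = uchar((p[0] + p[1]) / 2);
                p = cv::Vec3b(v, v, v);
            }
}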
Prepare a mask the same size as your lcroppedEye image, which is initially all black (I'll call this image maskImage here onwards).
For every pixel in lcroppedEye(row, col) that pass your (redIntensity >= 1.8) condition, set the maskImage(row, col) pixel to white.
When you are done with all the pixels in lcroppedEye, maskImage will have all redeye-like pixels in white.
If you perform a connected component analysis on this maskImage, you should be able to filter out other regions by considering circle or disk-like-features etc.
Now you can use this maskImage as a mask to apply the color transformation to the ROI of the original image
(You may have to do some preprocessing on maskImage before moving on to connected component analysis. Also you can replace the code segment in the question with split, divide and threshold functions unless there's special reason to iterate through pixels)
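A sketch of the connected-component filtering step (illustration only; the aspect-ratio and fill-ratio limits are made-up values you would tune):
#include <opencv2/opencv.hpp>

// Keep only roughly disk-like blobs of the red-eye mask, drop everything else.
cv::Mat filterDiskLikeRegions(const cv::Mat& maskImage)
{
    cv::Mat labels, stats, centroids;
    int n = cv::connectedComponentsWithStats(maskImage, labels, stats, centroids, 8, CV_32S);

    cv::Mat filtered = cv::Mat::zeros(maskImage.size(), CV_8U);
    for (int i = 1; i < n; i++)                           // label 0 is the background
    {
        int w    = stats.at<int>(i, cv::CC_STAT_WIDTH);
        int h    = stats.at<int>(i, cv::CC_STAT_HEIGHT);
        int area = stats.at<int>(i, cv::CC_STAT_AREA);
        double aspect   = double(w) / double(h);
        double fillness = double(area) / double(w * h);   // a disk fills ~78% of its bounding box
        if (aspect > 0.5 && aspect < 2.0 && fillness > 0.6)
            filtered.setTo(255, labels == i);             // keep this blob
    }
    return filtered;
}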
The problem seems to be that you replace pixels regardless of the presence of any red eye, so you must somehow test whether there are any high red values (more red than your skin).
My guess is that in the areas where there is a reflection there will also be specific blue and green values, either high or low, that should be checked, so that you, for example, require high red values combined with low blue and/or low green values.
// first pass, getting the highest red value
double highRed = 0;
cv::Point redPos = cv::Point(0,0);
int lcount = 0;
for(int y=0;y<lcroppedEye.rows;y++)
{
    for(int x=0;x<lcroppedEye.cols;x++)
    {
        double r = lcroppedEye.at<cv::Vec3b>(y, x)[2];
        if (r > highRed)
        {
            highRed = r;
            redPos = cv::Point(x,y);
        }
    }
}
// decide if it's red enough; you need to find a good minRed value.
if (highRed < minRed)
    return;
Original code here with the following changes:
// avoid division by zero, code from #AndreySmorodov
double redIntensity = r / ((g+b+1) / 2);

// add check for actual red colour.
if (redIntensity >= 1.8 && r > highRed*0.75)
// potentially add a check for low absolute r/b values.
{
    double newRedValue = (g + b) / 2;
    cv::Vec3b pixelColor(newRedValue,g,b);
    lroi.at<cv::Vec3b>(cv::Point(x,y)) = pixelColor;
    lcount++;
}

Generating a 3DLUT (.3dl file) for sRGB to CIELAB colorspace transformation

We already have a highly optimized class in our API to read 3D LUT (Nuke format) files and apply the transform to an image. So instead of iterating pixel by pixel and converting RGB values to Lab (RGB->XYZ->Lab) using the complex formulae, I think it would be better if I generated a lookup table for the RGB to Lab (or XYZ to Lab) transform. Is this possible?
I understand how the 3D LUT works for transformations from RGB to RGB, but I am confused about RGB to Lab, as L, a and b have different ranges. Any hints?
EDIT:
Can you please explain to me how the LUT works?
Here's one explanation: link
e.g. below is my understanding of a 3D LUT for an RGB->RGB transform:
a sample Nuke 3dl LUT file:
0 64 128 192 256 320 384 448 512 576 640 704 768 832 896 960 1023
R, G, B
0, 0, 0
0, 0, 64
0, 0, 128
0, 0, 192
0, 0, 256
.
.
.
0, 64, 0
0, 64, 64
0, 64, 128
.
.
Here, instead of generating a 1024*1024*1024 table for the source 10-bit RGB values, each R, G and B range is quantized to 17 values, generating a 17*17*17 = 4913 row table.
The first line gives the possible quantized values (I think here only the length and the max value matter). Now suppose the source RGB value is (20, 20, 190); the output would be line #4 (0, 0, 192) (using some interpolation technique). Is that correct?
This one is for a 10-bit source; could you generate a similar one for 8-bit by changing the range to 0..255?
Similarly, how would you proceed for the sRGB->Lab conversion?
An alternative approach makes use of graphics hardware, aka "general purpose GPU computing". There are some different tools for this, e.g. OpenGL GLSL, OpenCL, CUDA, ... You should gain an incredible speedup of about 100x and more compared to a CPU solution.
The most "compatible" solution is to use OpenGL with a special fragment shader with which you can perform computations. This means: upload your input image as a texture to the GPU, render it in a (target) framebuffer with a special shader program which converts your RGB data to Lab (or it can also make use of a lookup table, but most float computations on the GPU are faster than table / texture lookups, so we won't do this here).
First, port your RGB to Lab conversion function to GLSL. It should work on float numbers, so if you used integral values in your original conversion, get rid of them. OpenGL uses "clamp" values, i.e. float values between 0.0 and 1.0. It will look like this:
vec3 rgbToLab(vec3 rgb) {
vec3 lab = ...;
return lab;
}
Then, write the rest of the shader, which will fetch a pixel of the (RGB) texture, calls the conversion function and writes the pixel in the color output variable (don't forget the alpha channel):
uniform sampler2D texture;
varying vec2 texCoord;

void main() {
    vec3 rgb = texture2D(texture, texCoord).rgb;
    vec3 lab = rgbToLab(rgb);
    gl_FragColor = vec4(lab, 1.0);
}
The corresponding vertex shader should write texCoord values of (0,0) in the bottom left and (1,1) in the top right of a target quad filling the whole screen (framebuffer).
Finally, use this shader program in your application by rendering on a framebuffer with the same size than your image. Render a quad which fills the whole region (without setting any transformations, just render a quad from the 2D vertices (-1,-1) to (1,1)). Set the uniform value texture to your RGB image which you uploaded as a texture. Then, read back the framebuffer from the device, which should hopefully contain your image in Lab color space.
Assuming your source colorspace is a triplet of bytes (RGB, 8 bits each) and both color spaces are stored in structs with the names SourceColor and TargetColor respectively, and you have a conversion function given like this:
TargetColor convert(SourceColor color) {
return ...
}
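For completeness, such a convert function could look roughly like the following sketch of the usual sRGB -> linear RGB -> XYZ (D65) -> Lab chain (not your API's code; the struct layouts are assumptions):
#include <cmath>
#include <cstdint>

struct SourceColor { uint8_t r, g, b; };          // 8-bit sRGB (assumed layout)
struct TargetColor { float L, a, b; };            // CIELAB as floats (assumed layout)

TargetColor convert(SourceColor c)
{
    // sRGB (0..255) -> linear RGB (0..1)
    auto lin = [](double u) {
        u /= 255.0;
        return (u <= 0.04045) ? u / 12.92 : std::pow((u + 0.055) / 1.055, 2.4);
    };
    double R = lin(c.r), G = lin(c.g), B = lin(c.b);

    // linear RGB -> XYZ (sRGB primaries, D65 white)
    double X = 0.4124 * R + 0.3576 * G + 0.1805 * B;
    double Y = 0.2126 * R + 0.7152 * G + 0.0722 * B;
    double Z = 0.0193 * R + 0.1192 * G + 0.9505 * B;

    // XYZ -> Lab, relative to the D65 reference white
    auto f = [](double t) {
        const double e = 216.0 / 24389.0, k = 24389.0 / 27.0;
        return (t > e) ? std::cbrt(t) : (k * t + 16.0) / 116.0;
    };
    double fx = f(X / 0.95047), fy = f(Y / 1.0), fz = f(Z / 1.08883);

    TargetColor out;
    out.L = float(116.0 * fy - 16.0);             // L in [0, 100]
    out.a = float(500.0 * (fx - fy));             // a, b roughly in [-128, 127]
    out.b = float(200.0 * (fy - fz));
    return out;
}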
Then you can create a table like this:
TargetColor table[256][256][256]; // 16M * sizeof(TargetColor) => put it on the heap!

for (int r = 0; r < 256; ++r)
    for (int g = 0; g < 256; ++g)
        for (int b = 0; b < 256; ++b)
            table[r][g][b] = convert({uint8_t(r), uint8_t(g), uint8_t(b)}); // construct a SourceColor from r,g,b
Then, for the actual image conversion, use an alternative convert function (I'd suggest that you write an image conversion class which takes a function pointer / std::function in its constructor, so it's easily exchangeable):
TargetColor convertUsingTable(SourceColor source) {
    return table[source.r][source.g][source.b];
}
Note that the space consumption is 16M * sizeof(TargetColor) (assuming 32 bits for a Lab value, this will be 64 MBytes), so the table should be heap-allocated (it can be stored in-class if your class is going to live on the heap, but it's better to allocate it with new[] in the constructor and store it in a smart pointer).
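A sketch of such a wrapper (my own illustration, reusing the hypothetical SourceColor/TargetColor/convert from above; note that storing Lab as three floats makes the table ~192 MB rather than 64):
#include <cstddef>
#include <cstdint>
#include <memory>

class RgbToLabLut
{
public:
    RgbToLabLut()
        : table(new TargetColor[256u * 256u * 256u])   // ~16M entries on the heap
    {
        for (int r = 0; r < 256; ++r)
            for (int g = 0; g < 256; ++g)
                for (int b = 0; b < 256; ++b)
                    table[index(r, g, b)] = convert({ uint8_t(r), uint8_t(g), uint8_t(b) });
    }

    TargetColor lookup(SourceColor c) const
    {
        return table[index(c.r, c.g, c.b)];
    }

private:
    static std::size_t index(int r, int g, int b)
    {
        return (std::size_t(r) << 16) | (std::size_t(g) << 8) | std::size_t(b);
    }
    std::unique_ptr<TargetColor[]> table;
};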