JPEG: YCrCb <-> RGB conversion precision - c++

I've implemented rgb->ycrcb and ycrcb->rgb conversion using JPEG conversion formulae from
http://www.w3.org/Graphics/JPEG/jfif3.pdf
(the same at: http://en.wikipedia.org/wiki/YCbCr (JPEG conversion)).
When checking whether the results are correct (original -> YCrCb -> RGB), some pixels differ by one, e.g. 201 -> 200.
On average about 0.1% of pixels show such precision errors, so it's not critical.
/// converts RGB pixel to YCrCb using { en.wikipedia.org/wiki/YCbCr: JPEG conversion }
ivect4 rgb2ycrcb(int r, int g, int b)
{
    int y  = round(0.299*r + 0.587*g + 0.114*b);
    int cb = round(128.0 - (0.1687*r) - (0.3313*g) + (0.5*b));
    int cr = round(128.0 + (0.5*r) - (0.4187*g) - (0.0813*b));
    return ivect4(y, cr, cb, 255);
}
/// converts YCrCb pixel to RGB using { en.wikipedia.org/wiki/YCbCr: JPEG conversion }
ivect4 ycrcb2rgb(int y, int cr, int cb)
{
    int r = round(1.402*(cr-128) + y);
    int g = round(-0.34414*(cb-128) - 0.71414*(cr-128) + y);
    int b = round(1.772*(cb-128) + y);
    return ivect4(r, g, b, 255);
}
I use the rounding formula:
floor((x) + 0.5)
With other kinds of rounding, e.g. truncation via int(x) or std::ceil(), the results are even worse.
So, does there exist the way to do YCrCb <-> RGB conversion without loss in precision?

The problem isn't rounding modes.
Even if you converted your floating point constants to ratios and used only integer math, you'd still see different values after the inverse.
To see why, consider a function that maps the integers 0 through N onto the range 0 through N-2. This transform simply doesn't have an inverse. You can represent it more or less exactly with a floating point computation (f(x) = x*(N-2)/N), but in integer math some of the neighboring inputs must map to the same result (pigeonhole principle!). This example is a simplification that "compresses" the range, but the same thing happens in arbitrary affine transforms like the one you are using.
If you had r, g, b in floating point, and kept it that way until you quantized to integer, that would be a different story - but in integers you will necessarily always see some difference between the original and the inverse.
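To make the pigeonhole argument concrete, here is a self-contained sketch (the helper names are mine, not the asker's `ivect4` code) that applies the question's forward and inverse formulas over a sampled RGB cube and counts round-trip mismatches:

```cpp
#include <cassert>
#include <cmath>

// The question's rounding: floor(x + 0.5).
static int rnd(double x) { return (int)std::floor(x + 0.5); }

struct YCbCr { int y, cb, cr; };

// Forward conversion with the JPEG constants from the question.
YCbCr rgb_to_ycbcr(int r, int g, int b) {
    return { rnd(0.299*r + 0.587*g + 0.114*b),
             rnd(128.0 - 0.1687*r - 0.3313*g + 0.5*b),
             rnd(128.0 + 0.5*r - 0.4187*g - 0.0813*b) };
}

// Inverse conversion, again with the question's constants.
void ycbcr_to_rgb(const YCbCr& p, int& r, int& g, int& b) {
    r = rnd(p.y + 1.402*(p.cr - 128));
    g = rnd(p.y - 0.34414*(p.cb - 128) - 0.71414*(p.cr - 128));
    b = rnd(p.y + 1.772*(p.cb - 128));
}

// Sample the RGB cube with stride 5 and count triples that change
// after a forward + inverse round trip.
int count_mismatches() {
    int mismatches = 0;
    for (int r = 0; r < 256; r += 5)
        for (int g = 0; g < 256; g += 5)
            for (int b = 0; b < 256; b += 5) {
                int r2, g2, b2;
                ycbcr_to_rgb(rgb_to_ycbcr(r, g, b), r2, g2, b2);
                if (r2 != r || g2 != g || b2 != b) ++mismatches;
            }
    return mismatches;
}
```

The count is nonzero no matter which rounding mode you plug into `rnd` - the information loss happens at quantization, not in the rounding rule.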

Only about 60% of all RGB values can be represented in YCbCr space when using the same amount of bits for both triplets. This means the most damage happens in RGB->YCbCr when you take a 3*8 bit RGB triplet, convert and round it back to 3*8 bits of precision. The trick is to store the YCbCr triplet at a higher precision until it's time to do forward DCT. There, the data needs to be scaled up anyway, so you can do e.g. 16 bit * 16 bit -> MSB16 multiplies, which are well supported by various SIMD instruction sets.
At the decoder it's the reverse: The results of inverse DCT have to be stored at higher precision until it's time to do the YCbCr->RGB conversion.
This doesn't make the process lossless, but for JPEG, it may buy a few dB of PSNR at the extreme high end of the quality scale, i.e. where the difference can't be seen with a naked eye but can be measured.
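A sketch of that "keep higher precision until the DCT" idea, using fixed-point integer math (the 15-bit scale factor and function names here are my own choices, not libjpeg's exact scheme):

```cpp
#include <cassert>
#include <cstdint>

// Fixed-point luma with 15 fractional bits: instead of rounding Y to
// 8 bits immediately, keep the full-precision product and hand it to
// the next stage (e.g. the forward DCT) unrounded.
static const int FIX_SHIFT = 15;
static const int32_t C_R = (int32_t)(0.299 * (1 << FIX_SHIFT) + 0.5);
static const int32_t C_G = (int32_t)(0.587 * (1 << FIX_SHIFT) + 0.5);
static const int32_t C_B = (int32_t)(0.114 * (1 << FIX_SHIFT) + 0.5);

// Full-precision luma: still carries 15 fractional bits.
int32_t luma_fixed(int r, int g, int b) {
    return C_R * r + C_G * g + C_B * b;
}

// Late rounding to 8 bits, done only when the pipeline requires it.
int luma8(int r, int g, int b) {
    return (int)((luma_fixed(r, g, b) + (1 << (FIX_SHIFT - 1))) >> FIX_SHIFT);
}
```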

Yes, supposedly JPEG XR defines a color conversion that is reversible. The code is open source if you want to investigate in depth how they're doing it. The method is loosely described on the Wiki-page I linked to.
Also this SO post might give you some insights.

Another problem is that there is not a 1-to-1 mapping between RGB and YCbCr: there are YCbCr values with no corresponding RGB value, and RGB values with no corresponding YCbCr value.

Related

What is the correct gamma correction function?

Currently I use the following formula to gamma correct colors (convert them from RGB to sRGB color space) after the lighting pass:
output = pow(color, vec3(1.0/2.2));
Is this the correct formula for gamma correction? I ask because I have encountered a few people saying it's not, and that the correct formula is more complicated and involves a power of 2.4 rather than 2.2. I have also heard that the three color channels R, G and B should have different weights (something like 0.2126, 0.7152, 0.0722).
I am also curious which function does OpenGL use when GL_FRAMEBUFFER_SRGB is enabled.
Edit:
This is one of many topics covered in Guy Davidson's talk "Everything you know about color is wrong". The gamma correction function is covered here, but the whole talk is related to color spaces including sRGB and gamma correction.
Gamma correction may use any value, but for the linear RGB / non-linear sRGB conversion, 2.2 is an approximation, so your formula may be considered both wrong and correct:
https://en.wikipedia.org/wiki/SRGB#Theory_of_the_transformation
The real sRGB transfer function is based on a 2.4 gamma coefficient and has a linear segment at dark values, like this:
float Convert_sRGB_FromLinear (float theLinearValue) {
    return theLinearValue <= 0.0031308f
         ? theLinearValue * 12.92f
         : powf (theLinearValue, 1.0f/2.4f) * 1.055f - 0.055f;
}

float Convert_sRGB_ToLinear (float thesRGBValue) {
    return thesRGBValue <= 0.04045f
         ? thesRGBValue / 12.92f
         : powf ((thesRGBValue + 0.055f) / 1.055f, 2.4f);
}
In fact, you may find even rougher approximations in some GLSL code, using a 2.0 coefficient instead of 2.2 or 2.4 so that the expensive pow() can be avoided (x*x and sqrt() are used instead). This achieves maximum performance (in the context of old graphics hardware) and code simplicity at the cost of color reproduction. Practically speaking, the sacrifice is not that noticeable, and most games apply additional tone-mapping and a user-managed gamma correction coefficient anyway, so the result is not directly correlated with the sRGB standard.
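For reference, that cheap gamma-2.0 approximation looks like this (a sketch: the function names are my own, and it deliberately trades accuracy for speed):

```cpp
#include <cassert>
#include <cmath>

// Rough gamma-2.0 approximation: decode with x*x, encode with sqrt().
// Much cheaper than powf(), noticeably less accurate than the
// piecewise sRGB curve above.
float cheap_srgb_to_linear(float s) { return s * s; }
float cheap_linear_to_srgb(float l) { return std::sqrt(l); }
```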
GL_FRAMEBUFFER_SRGB and sampling from GL_SRGB8 textures are expected to use the more correct formula (in the texture-sampling case it is more likely a pre-computed lookup table on the GPU rather than the real formula, as there are only 256 values to convert). See, for instance, the comments in the GL_ARB_framebuffer_sRGB extension:
Given a linear RGB component, cl, convert it to an sRGB component, cs, in the range [0,1], with this pseudo-code:
if (isnan(cl)) {
    /* Map IEEE-754 Not-a-number to zero. */
    cs = 0.0;
} else if (cl > 1.0) {
    cs = 1.0;
} else if (cl < 0.0) {
    cs = 0.0;
} else if (cl < 0.0031308) {
    cs = 12.92 * cl;
} else {
    cs = 1.055 * pow(cl, 0.41666) - 0.055;
}
The NaN behavior in the pseudo-code is recommended but not specified in the actual specification language.
sRGB components are typically stored as unsigned 8-bit fixed-point values.
If cs is computed with the above pseudo-code, cs can be converted to a [0,255] integer with this formula:
csi = floor(255.0 * cs + 0.5)
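The extension's pseudo-code, including the final quantization step, translates almost directly to C++ (a sketch; the function name is mine):

```cpp
#include <cassert>
#include <cmath>

// Linear component -> 8-bit sRGB, following the GL_ARB_framebuffer_sRGB
// pseudo-code: handle NaN, clamp, apply the piecewise curve, round to [0,255].
int linear_to_srgb8(float cl) {
    float cs;
    if (std::isnan(cl))       cs = 0.0f;   // map NaN to zero
    else if (cl > 1.0f)       cs = 1.0f;
    else if (cl < 0.0f)       cs = 0.0f;
    else if (cl < 0.0031308f) cs = 12.92f * cl;
    else                      cs = 1.055f * std::pow(cl, 0.41666f) - 0.055f;
    return (int)std::floor(255.0f * cs + 0.5f);
}
```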
Here is another article describing sRGB usage in OpenGL applications, which you may find useful: https://unlimited3d.wordpress.com/2020/01/08/srgb-color-space-in-opengl/

c++ half library has lower precision for positive numbers

I know that I am using a capability not built into C++; however, this library seems to be so commonly used that I am surprised to see this error pop up.
For those of you who do not know about the library it can be found here. Essentially, it is supposed to allow the support of 16 bit floating point (lower precision) numbers.
My problem is that the precision of half floats appears to diminish for positive numbers.
In this code, I am generating a bunch of points to be rendered to the screen. {xs1, ys1} holds the sigmoid computed at full float precision; {xs3, ys3} holds the same values after a round trip through half precision.
vector<float> xs1, ys1, xs3, ys3;
int res = 200000;
for (int i = 0; i < res; i++)
{
    float perc = float(i) / float(res);
    float fx = ((perc - 0.5) * 2.0) * 8.0;
    half hx = half(fx);
    float fy = MFunctions::sigmoid(fx);
    half hy = half(fy);
    xs1.push_back(fx);
    ys1.push_back(fy);
    xs3.push_back(float(hx));
    ys3.push_back(float(hy));
}
Here are the results (looking at zoomed in portions of the graph this generates with a window width of 2.2 and a window height of 0.02 units):
When looking at the floating precision graph, {xs1, ys1} both of the corners of the sigmoid function are smooth:
However, when looking at the half precision graph {xs3, ys3} the corner in the positive x axis shows a stepping effect while the corner in the negative x axis shows a lower resolution but smooth graph:
I am not sure why this is happening since the only difference between positive and negative numbers should be a sign bit.
Is there something wrong that I am doing or is this a flaw in the half library?
Sigmoid output values lie in [0;1], so what you see is normal: in the bottom picture the values are around 1, where half precision is much coarser than around 0.
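You can see this numerically without the half library: in IEEE-754 binary16 the spacing between representable numbers grows with magnitude. A sketch (this computes the ULP of a normal, positive half value from its exponent; it is not code from the half library):

```cpp
#include <cassert>
#include <cmath>

// Spacing (ULP) of IEEE-754 binary16 around a normal, positive value x:
// with a 10-bit mantissa, ulp = 2^(floor(log2(x)) - 10).
double half_ulp(double x) {
    int e = (int)std::floor(std::log2(x));
    return std::ldexp(1.0, e - 10);
}
```

Near y = 1 the sigmoid output is quantized in steps of about 0.0005-0.001, which is exactly the stair-stepping visible at a window height of 0.02; near y = 0 the steps are orders of magnitude smaller, so the curve still looks smooth.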

matrix multiplication resulting in values greater than 255

If I am performing matrix multiplication on two 8UC1 images, or per element multiplication, what happens if one of the resulting pixel values is greater than 255? For example, if in image A a certain pixel has value 100, and in image B that same pixel has value 150 (for the per element multiplication case), then clearly 100*150 > 255 - so does that pixel simply get truncated to 255 value? And if so is there some transformation I can make to preserve that information without having it truncated?
OpenCV will saturate the result for a uchar img.
To avoid that, use e.g. the dtype flag in multiply and specify a type larger than your input:
Mat a, b; //input, CV_8U
Mat c; // output, yet unspecified
multiply( a,b, c, 1, CV_32S ); // c will be of int type, untruncated results
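The same behaviour is easy to reproduce without OpenCV (a sketch; `mul_saturate` and `mul_wide` are made-up names illustrating saturating vs. widening multiplication):

```cpp
#include <cassert>
#include <cstdint>
#include <algorithm>

// Saturating 8-bit multiply: compute in a wider type, then clamp to
// [0, 255] - the same idea as OpenCV's saturate_cast<uchar>.
uint8_t mul_saturate(uint8_t a, uint8_t b) {
    int p = (int)a * (int)b;
    return (uint8_t)std::min(p, 255);
}

// Widening multiply: keep the untruncated product, as with CV_32S output.
int32_t mul_wide(uint8_t a, uint8_t b) {
    return (int32_t)a * (int32_t)b;
}
```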

RGB value encoded from height

I have a program where I need to represent height as an RGBT (in float) value. That is:
[R, G, B, T (Transperancy)] -> [0.0f-1.0f, 0.0f-1.0f, 0.0f-1.0f, 0.0f-1.0f]
Conceptually I know that you can encode by normalizing height between the max and min height. I even have some code for greyscale height encoding:
double Heightmin = 0;
double Heightmax = 23;

osg::Vec4 getColourFromHeight(double height, double alpha = 1.0) {
    double t = (height - Heightmin) / (Heightmax - Heightmin);
    return osg::Vec4(t, t, t, alpha);
}
What I would like to know is whether there is an algorithm more complex than just using R and G like this:
double r = (height - Heightmin) / (Heightmax - Heightmin);
double b = 0.0;
double g = 1.0 - r;
(That is, G is the inverted form of R, so low values appear more green and high values more red.)
I would like to be able to utilise R, G and B to give realistic-looking height-encoded landscapes:
This is an image of a 72dpi RGB height-encoded topographic map. I would like to be able to achieve something similar to this. Is there a simple algorithm to create an RGB value based on a minimum and maximum height?
Thanks for your help.
Ben
You just need to come up with a suitable colour gradient that you like, and then put it in a lookup table (or similar).
Then all you need is something that will map a value in the range min_height -> max_height into the range 0 -> 255 (for example).
Of course, it's possible that you will find a colour gradient that can be expressed as mathematical functions, but that's less general.
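A minimal sketch of the gradient idea (the two-stop green-to-brown palette and the names here are my own; a real terrain palette would use more stops, typically stored in the lookup table mentioned above):

```cpp
#include <cassert>
#include <algorithm>

struct Rgb { int r, g, b; };

// Map height linearly between two gradient stops; out-of-range
// heights are clamped to the nearest end of the gradient.
Rgb height_to_rgb(double h, double h_min, double h_max) {
    double t = (h - h_min) / (h_max - h_min);   // normalise to [0,1]
    t = std::max(0.0, std::min(1.0, t));        // clamp out-of-range heights
    Rgb low  = {  34, 139,  34 };               // low ground: forest green
    Rgb high = { 139,  90,  43 };               // high ground: brown
    return { (int)(low.r + t * (high.r - low.r) + 0.5),
             (int)(low.g + t * (high.g - low.g) + 0.5),
             (int)(low.b + t * (high.b - low.b) + 0.5) };
}
```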

Mapping colors to an interval

I'm porting a MATLAB piece of code in C/C++ and I need to map many RGB colors in a graph to an integer interval.
Let [-1;1] be the interval a function's value can lie in. I need to map -1 and any number below it to one color, +1 and any number above it to another color, and any number between -1 and +1 to an intermediate color between the two boundaries. Obviously there are infinitely many numbers, so I'm not worried about how many colors I map, but it would be great if I could use at least 40-50 colors.
I thought of subdividing the [-1;1] interval into X sub-intervals and map every one of them to a RGB color, but this sounds like a terribly boring and long job.
Is there any other way to achieve this? And if there isn't, how should I do this in C/C++?
If performance isn't an issue, then I would do something similar to what High Performance Mark suggested, except maybe do it in HSV color space: Peg the S and V values at maximum and vary the H value linearly over a particular range:
s = 1.0; v = 1.0;
if (x <= -1)      { h = h_min; }
else if (x >= 1)  { h = h_max; }
else              { h = h_min + (h_max - h_min) * 0.5 * (x + 1.0); }
// then convert h, s, v back to r, g, b - see the wikipedia link
If performance is an issue (e.g., you're trying to process video in real-time or something), then calculate the rgb values ahead of time and load them from a file as an array. Then simply map the value of x to an index:
int r, g, b;
int R[NUM_COLORS];
int G[NUM_COLORS];
int B[NUM_COLORS];
// load R, G, B from a file, or define them in a header file, etc

// map x in [-1, 1] to an index in [0, NUM_COLORS-1]
int i = (int)((NUM_COLORS - 1) * 0.5 * (x + 1.0));
i = MAX(0, MIN(NUM_COLORS - 1, i));
r = R[i]; g = G[i]; b = B[i];
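The `h, s, v -> r, g, b` step that the first snippet leaves to the Wikipedia link can be sketched like this (standard HSV-to-RGB conversion; the function name is mine):

```cpp
#include <cassert>
#include <cmath>

// Standard HSV -> RGB conversion: h in degrees [0, 360), s and v in [0, 1].
void hsv_to_rgb(double h, double s, double v,
                double& r, double& g, double& b) {
    double c = v * s;                                         // chroma
    double hp = h / 60.0;                                     // hue sector
    double x = c * (1.0 - std::fabs(std::fmod(hp, 2.0) - 1.0));
    double m = v - c;                                         // lightness offset
    double rp, gp, bp;
    if      (hp < 1) { rp = c; gp = x; bp = 0; }
    else if (hp < 2) { rp = x; gp = c; bp = 0; }
    else if (hp < 3) { rp = 0; gp = c; bp = x; }
    else if (hp < 4) { rp = 0; gp = x; bp = c; }
    else if (hp < 5) { rp = x; gp = 0; bp = c; }
    else             { rp = c; gp = 0; bp = x; }
    r = rp + m; g = gp + m; b = bp + m;
}
```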
Here's a poor solution. Define a function which takes an input, x, which is a float (or double) and returns a triplet of integers each in the range 0-255. This triplet is, of course, a specification of an RGB color.
The function has 3 pieces;
if x <= -1    f[x] = {0, 0, 0}
if x >=  1    f[x] = {255, 255, 255}
if -1 < x < 1 f[x] = {floor(((x + 1)/2)*255), floor(((x + 1)/2)*255), floor(((x + 1)/2)*255)}
I'm not very good at writing C++ so I'll leave this as pseudocode, you shouldn't have too much problem turning it into valid code.
The reason it isn't a terribly good function is that there isn't a natural color gradient between the values that this plots through RGB color space. I mean, it is likely to produce a sequence of colors which is at odds with most people's expectations of how colors should change. If you are one of those people, I invite you to modify the function as you see fit.
For all of this I blame RGB color space, it is ill-suited to this sort of easy computation of 'neighbouring' colors.
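Turned into C++, the greyscale pseudocode above looks roughly like this (a sketch; the struct and function names are mine):

```cpp
#include <cassert>
#include <cmath>

struct Rgb { int r, g, b; };

// Grey ramp: clamp x to [-1, 1], then map it linearly to 0..255
// on all three channels, exactly as in the pseudocode above.
Rgb grey_ramp(double x) {
    if (x <= -1.0) return { 0, 0, 0 };
    if (x >=  1.0) return { 255, 255, 255 };
    int v = (int)std::floor((x + 1.0) / 2.0 * 255.0);
    return { v, v, v };
}
```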