BGR -> YCbCr conversion not working correctly - c++

I am trying to manually convert an image from RGB (BGR in OpenCV) to the YCbCr color space.
My image is a PNG color image, 800 pixels wide and 600 pixels high, with 3 channels and 16-bit depth.
Here's how I tried solving this.
cv::Mat convertToYCbCr(cv::Mat image) {
    // converts an RGB image to YCbCr
    // cv::Mat: B-G-R
    std::cout << "Converting image to YCbCr color space." << std::endl;
    int i, j;
    for (i = 0; i <= image.cols; i++) {
        for (j = 0; j <= image.rows; j++) {
            // R, G, B values
            auto R = image.at<cv::Vec3d>(j, i)[2];
            auto G = image.at<cv::Vec3d>(j, i)[1];
            auto B = image.at<cv::Vec3d>(j, i)[0];
            // Y'
            auto Y = image.at<cv::Vec3d>(j, i)[0] = 0.299 * R + 0.587 * G + 0.114 * B + 16;
            // Cb
            auto Cb = image.at<cv::Vec3d>(j, i)[1] = 128 + (-0.169 * R - 0.331 * G + 0.5 * B);
            // Cr
            auto Cr = image.at<cv::Vec3d>(j, i)[2] = 128 + (0.5 * R - 0.419 * G - 0.081 * B);
            std::cout << "At conversion: Y = " << Y << ", Cb = " << Cb << ", "
                      << Cr << std::endl;
        }
    }
    std::cout << "Converting finished." << std::endl;
    return image;
}
The image I receive looks like this:
What I am expecting is this (using OpenCV method):
Do the vertical lines hint at something? Is my loop wrong? Can I even just "replace" the RGB values with YCbCr values in place and expect the image to look like the example? typeid() returns the same value for both images, N2cv3MatE.

The primary reason for the incorrect results is the wrong data type used to access the image. The correct type for accessing 16-bit unsigned pixels is cv::Vec3w (not cv::Vec3d).
The next issue is that the coefficients being used for the conversion are designed for analog signals (YPbPr). For digital images, we have to use coefficients designed for digital images (YCbCr). You can find more details in the Wikipedia article on YCbCr, section "ITU-R BT.601 conversion".
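For reference, the 8-bit digital BT.601 equations from that section (with R, G and B already normalized to the 0..1 range) are:
Y  =  16 +  65.481 * R + 128.553 * G +  24.966 * B
Cb = 128 -  37.797 * R -  74.203 * G + 112.000 * B
Cr = 128 + 112.000 * R -  93.786 * G -  18.214 * B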
The piece of information missing from the article is how the coefficients change when the image has 16-bit unsigned or 32-bit floating-point depth. The answer is that we have to scale the coefficients according to the bit depth of the image.
For images with 16 bit unsigned depth, the scaling should be performed as follows:
auto Y = (R * 65.481f * scale) + (G * 128.553f * scale) + (B * 24.966f * scale) + (16.0f * offset);
auto Cb = (R * -37.797f * scale) + (G * -74.203f * scale) + (B * 112.0f * scale) + (128.0f * offset);
auto Cr = (R * 112.0f * scale) + (G * -93.786f * scale) + (B * -18.214f * scale) + (128.0f * offset);
where scale is equal to 257.0/65535.0 and offset is equal to 257.0. In other words, the 16-bit value is first normalized to 0..1 (divided by 65535) as the published coefficients expect, and the 8-bit-range result is stretched back to 16 bits by multiplying by 65535/255 = 257; the constant offsets 16 and 128 are likewise multiplied by 257.
This conversion technique has been adopted from the MATLAB source code for the rgb2ycbcr function, which references the following book describing the scaling:
C.A. Poynton, "A Technical Introduction to Digital Video", John Wiley & Sons, Inc., 1996, Chapter 9, Page 175
Now that the conversion is done, the third issue is visualizing the image the same way OpenCV does. When we perform color conversion with OpenCV, the output image is stored in the order YCrCb instead of the usual YCbCr, so to get the same image with our custom conversion logic we have to store the values in that order.
A sample conversion code may look like this:
if (image.type() == CV_16UC3)
{
    const float scale = 257.0f / 65535.0f;
    const float offset = 257.0f;
    for (int i = 0; i < image.cols; i++)
    {
        for (int j = 0; j < image.rows; j++)
        {
            auto R = image.at<cv::Vec3w>(j, i)[2];
            auto G = image.at<cv::Vec3w>(j, i)[1];
            auto B = image.at<cv::Vec3w>(j, i)[0];
            auto Y = (R * 65.481f * scale) + (G * 128.553f * scale) + (B * 24.966f * scale) + (16.0f * offset);
            auto Cb = (R * -37.797f * scale) + (G * -74.203f * scale) + (B * 112.0f * scale) + (128.0f * offset);
            auto Cr = (R * 112.0f * scale) + (G * -93.786f * scale) + (B * -18.214f * scale) + (128.0f * offset);
            image.at<cv::Vec3w>(j, i)[0] = (unsigned short)Y;
            image.at<cv::Vec3w>(j, i)[1] = (unsigned short)Cr;
            image.at<cv::Vec3w>(j, i)[2] = (unsigned short)Cb;
        }
    }
}

You should use cv::cvtColor
cvtColor(src, target_image, cv::COLOR_RGB2YCrCb);
Then just flip the second and third channels.
Though you could be getting that error because you're not casting the resulting values to ints.
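A minimal sketch of that approach (assuming a BGR image straight from cv::imread, hence cv::COLOR_BGR2YCrCb rather than the RGB variant; the function name toYCbCr is just illustrative):
#include <opencv2/imgproc.hpp>
#include <utility>
#include <vector>

cv::Mat toYCbCr(const cv::Mat& src) {
    cv::Mat ycrcb;
    cv::cvtColor(src, ycrcb, cv::COLOR_BGR2YCrCb);  // OpenCV outputs Y, Cr, Cb
    std::vector<cv::Mat> ch;
    cv::split(ycrcb, ch);
    std::swap(ch[1], ch[2]);                        // reorder to Y, Cb, Cr
    cv::Mat ycbcr;
    cv::merge(ch, ycbcr);
    return ycbcr;
}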


Cubic Interpolation with the official formula fails

I am trying to implement the cubic interpolation method using the formula below, with a = -0.5 as usual.
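The kernel with a = -0.5, as implemented by cubicEquationSolver further down, is presumably the standard cubic convolution kernel:
W(d) = (a + 2) * |d|^3 - (a + 3) * |d|^2 + 1      for |d| <= 1
W(d) = a * |d|^3 - 5a * |d|^2 + 8a * |d| - 4a     for 1 < |d| <= 2
W(d) = 0                                          otherwise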
My linear interpolation and nearest-neighbor interpolation are working great, but for some reason the cubic interpolation fails on white pixels, sometimes turning them turquoise and sometimes messing up other colors.
For example, using rotation (NOTE: please look carefully at the right image and you will notice the problems):
Another example, with many more black pixels. It almost seems to work perfectly, but look at the dog's tongue (strong white pixels turn turquoise again).
You can see that my implementation of the linear interpolation is working great:
Since the actual rotation worked, I think I have a small mistake in the code that I did not notice, or maybe it's a numeric error or a double / float error.
It is important to note that I read the image normally and store the destination image as follows:
cv::Mat img = cv::imread("../dogfails.jpeg");
cv::Mat rotatedImageCubic(img.rows,img.cols,CV_8UC3);
Clarifications:
Inside my cubic interpolation function, srcPoint (newX and newY) is the "landing point" from the inverse transformation.
In my inverse transformations I am not using matrix multiplication with the pixels; right now I am just using the formulas for rotation, which might be relevant to the "numerical errors". For example:
rotatedX = x * cos(angle * toRadian) + y * sin(angle * toRadian);
rotatedY = x * (-sin(angle * toRadian)) + y * cos(angle * toRadian);
Here is my code for the Cubic Interpolation
double cubicEquationSolver(double d, double a) {
    d = abs(d);
    if (0.0 <= d && d <= 1.0) {
        double score = (a + 2.0) * pow(d, 3.0) - ((a + 3.0) * pow(d, 2.0)) + 1.0;
        return score;
    }
    else if (1 < d && d <= 2) {
        double score = a * pow(d, 3.0) - 5.0 * a * pow(d, 2.0) + 8.0 * a * d - 4.0 * a;
        return score;
    }
    else
        return 0.0;
}
void Cubic_Interpolation_Helper(const cv::Mat& src, cv::Mat& dst, const cv::Point2d& srcPoint, cv::Point2i& dstPixel) {
    double newX = srcPoint.x;
    double newY = srcPoint.y;
    double dx = abs(newX - round(newX));
    double dy = abs(newY - round(newY));
    double sumCubicBValue = 0;
    double sumCubicGValue = 0;
    double sumCubicRValue = 0;
    double sumCubicGrayValue = 0;
    double uX = 0;
    double uY = 0;
    if (floor(newX) - 1 < 0 || floor(newX) + 2 > src.cols - 1 || floor(newY) < 0 || floor(newY) > src.rows - 1) {
        if (dst.channels() > 1)
            dst.at<cv::Vec3b>(dstPixel) = cv::Vec3b(0, 0, 0);
        else
            dst.at<uchar>(dstPixel) = 0;
    }
    else {
        for (int cNeighbor = -1; cNeighbor <= 2; cNeighbor++) {
            for (int rNeighbor = -1; rNeighbor <= 2; rNeighbor++) {
                uX = cubicEquationSolver(rNeighbor + dx, -0.5);
                uY = cubicEquationSolver(cNeighbor + dy, -0.5);
                if (src.channels() > 1) {
                    sumCubicBValue = sumCubicBValue + (double) src.at<cv::Vec3b>(
                            cv::Point2i(round(newX) + rNeighbor, cNeighbor + round(newY)))[0] * uX * uY;
                    sumCubicGValue = sumCubicGValue + (double) src.at<cv::Vec3b>(
                            cv::Point2i(round(newX) + rNeighbor, cNeighbor + round(newY)))[1] * uX * uY;
                    sumCubicRValue = sumCubicRValue + (double) src.at<cv::Vec3b>(
                            cv::Point2i(round(newX) + rNeighbor, cNeighbor + round(newY)))[2] * uX * uY;
                } else {
                    sumCubicGrayValue = sumCubicGrayValue + (double) src.at<uchar>(
                            cv::Point2i(round(newX) + rNeighbor, cNeighbor + round(newY))) * uX * uY;
                }
            }
        }
        if (dst.channels() > 1)
            dst.at<cv::Vec3b>(dstPixel) = cv::Vec3b((int) round(sumCubicBValue), (int) round(sumCubicGValue),
                                                    (int) round(sumCubicRValue));
        else
            dst.at<uchar>(dstPixel) = sumCubicGrayValue;
    }
}
I hope someone here will be able to help me, Thanks!

How to calculate the RGB values of a pixel from the luminance?

I want to compute the RGB values from the luminance.
The data that I know are:
the new luminance (the value that I want to apply)
the old luminance
the old RGB values.
We can compute the luminance from the RGB values like this:
uint8_t luminance = R * 0.21 + G * 0.71 + B * 0.07;
My code is:
// We create a function to set the luminance of a pixel
void jpegImage::setLuminance(uint8_t newLuminance, unsigned int x, unsigned int y) {
    // If the X or Y value is out of range, we throw an error
    if(x >= width) {
        throw std::runtime_error("Error : in jpegImage::setLuminance : The X value is out of range");
    }
    else if(y >= height) {
        throw std::runtime_error("Error : in jpegImage::setLuminance : The Y value is out of range");
    }
    // If the image is monochrome
    if(pixelSize == 1) {
        // We set the pixel value to the luminance
        pixels[y][x] = newLuminance;
    }
    // Else if the image is colored
    else if(pixelSize == 3) {
        // I don't know how to proceed
        // My image is stored in a std::vector<std::vector<uint8_t>> pixels;
        // This is a list that contains the lines of the image
        // Each line contains the RGB values of the following pixels
        // For example an image with 2 columns and 3 lines:
        // [[R, G, B, R, G, B], [R, G, B, R, G, B], [R, G, B, R, G, B]]
        // For example, the R value with x = 23, y = 12 is:
        // pixels[12][23 * pixelSize];
        // For example, the B value with x = 23, y = 12 is:
        // pixels[12][23 * pixelSize + 2];
        // (If the image is colored, the pixelSize will be 3 (R, G and B))
        // (If the image is monochrome, the pixelSize will be 1 (just the luminance value))
    }
}
How can I proceed?
Thanks!
You don't need the old luminance if you have the original RGB.
Referencing https://www.fourcc.org/fccyvrgb.php for YUV to RGB conversion.
Compute U and V from original RGB:
```
V = (0.439 * R) - (0.368 * G) - (0.071 * B) + 128
U = -(0.148 * R) - (0.291 * G) + (0.439 * B) + 128
```
Y is the new luminance normalized to a value between 0 and 255
Then just convert back to RGB:
B = 1.164(Y - 16) + 2.018(U - 128)
G = 1.164(Y - 16) - 0.813(V - 128) - 0.391(U - 128)
R = 1.164(Y - 16) + 1.596(V - 128)
Make sure you clamp your computed values of each equation to be in range of 0..255. Some of these formulas can convert a YUV or RGB value to something less than 0 or higher than 255.
There are also multiple formulas for converting between YUV and RGB (different constants). I noticed the page listed above has a different computation for Y than the one you cited. They are all relatively close, with different precisions and adjustments. For just changing the brightness of a pixel, almost any formula will do.
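Applied to the setLuminance function from the question, the pixelSize == 3 branch could look roughly like this (a sketch using the fourcc.org formulas above and the row layout described in the question's comments; clamp255 is a helper introduced here, not part of the original code):
else if(pixelSize == 3) {
    auto clamp255 = [](double v) -> uint8_t {
        return v < 0 ? 0 : (v > 255 ? 255 : static_cast<uint8_t>(v));
    };
    double R = pixels[y][x * pixelSize];
    double G = pixels[y][x * pixelSize + 1];
    double B = pixels[y][x * pixelSize + 2];
    // U and V from the original RGB (fourcc.org formulas)
    double V = (0.439 * R) - (0.368 * G) - (0.071 * B) + 128;
    double U = -(0.148 * R) - (0.291 * G) + (0.439 * B) + 128;
    double Y = newLuminance;  // the new luminance, 0..255
    pixels[y][x * pixelSize]     = clamp255(1.164 * (Y - 16) + 1.596 * (V - 128));                      // R
    pixels[y][x * pixelSize + 1] = clamp255(1.164 * (Y - 16) - 0.813 * (V - 128) - 0.391 * (U - 128));  // G
    pixels[y][x * pixelSize + 2] = clamp255(1.164 * (Y - 16) + 2.018 * (U - 128));                      // B
}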
Updated
I originally deleted this answer after the OP suggested it wasn't working for him. I was too busy for the last few days to investigate, but I wrote some sample code to confirm my hypothesis. At the bottom of this answer is a snippet of GDI+ based code that increases the luminance of an image by a variable amount. Along with the code is an image that I tested this on and two conversions: one at 130% brightness, another at 170% brightness.
Here's a sample conversion
Original Image
Updated Image (at 130% Y)
Updated Image (at 170% Y)
Source:
#define CLAMP(val) {val = (val > 255) ? 255 : ((val < 0) ? 0 : val);}

void Brighten(Gdiplus::BitmapData& dataIn, Gdiplus::BitmapData& dataOut, const double YMultiplier=1.3)
{
    if ( ((dataIn.PixelFormat != PixelFormat24bppRGB) && (dataIn.PixelFormat != PixelFormat32bppARGB)) ||
         ((dataOut.PixelFormat != PixelFormat24bppRGB) && (dataOut.PixelFormat != PixelFormat32bppARGB)))
    {
        return;
    }
    if ((dataIn.Width != dataOut.Width) || (dataIn.Height != dataOut.Height))
    {
        // images sizes aren't the same
        return;
    }
    const size_t incrementIn = dataIn.PixelFormat == PixelFormat24bppRGB ? 3 : 4;
    const size_t incrementOut = dataOut.PixelFormat == PixelFormat24bppRGB ? 3 : 4;
    const size_t width = dataIn.Width;
    const size_t height = dataIn.Height;
    for (size_t y = 0; y < height; y++)
    {
        auto ptrRowIn = (BYTE*)(dataIn.Scan0) + (y * dataIn.Stride);
        auto ptrRowOut = (BYTE*)(dataOut.Scan0) + (y * dataOut.Stride);
        for (size_t x = 0; x < width; x++)
        {
            uint8_t B = ptrRowIn[0];
            uint8_t G = ptrRowIn[1];
            uint8_t R = ptrRowIn[2];
            uint8_t A = (incrementIn == 3) ? 0xFF : ptrRowIn[3];
            auto Y = (0.257 * R) + (0.504 * G) + (0.098 * B) + 16;
            auto V = (0.439 * R) - (0.368 * G) - (0.071 * B) + 128;
            auto U = -(0.148 * R) - (0.291 * G) + (0.439 * B) + 128;
            Y *= YMultiplier;
            auto newB = 1.164*(Y - 16) + 2.018*(U - 128);
            auto newG = 1.164*(Y - 16) - 0.813*(V - 128) - 0.391*(U - 128);
            auto newR = 1.164*(Y - 16) + 1.596*(V - 128);
            CLAMP(newR);
            CLAMP(newG);
            CLAMP(newB);
            ptrRowOut[0] = newB;
            ptrRowOut[1] = newG;
            ptrRowOut[2] = newR;
            if (incrementOut == 4)
            {
                ptrRowOut[3] = A; // keep original alpha
            }
            ptrRowIn += incrementIn;
            ptrRowOut += incrementOut;
        }
    }
}

Bilinear interpolation in 2D transformation Qt

I'm currently working on 2D transformations (translation, scaling, shearing and rotation) in Qt. I have a problem with bilinear interpolation, which I want to use to cover the 'black pixels' in the output image. I use matrix calculations to get the new coordinates of the pixels of the input image. Then I use the reverse matrix calculation to check which pixel of the input image corresponds to a given output pixel. The result of that is a float coordinate, which I use for the interpolation: I check the four neighbouring points and calculate the value (color) of the output pixel. I have checked my calculations 'by hand' and they seem to be good.
Can anyone find a bug in that code? (I cut out the parts of the code responsible for the interface, such as sliders.)
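For reference, the weighting applied in the code below is the standard bilinear formula, with a and b the fractional parts of the source coordinate and c1, c2, c3, c4 the four neighbouring pixels (top-left, top-right, bottom-left, bottom-right):
col = (1 - b) * ((1 - a) * c1 + a * c2) + b * ((1 - a) * c3 + a * c4)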
Geometric::Geometric(QWidget* parent) : QWidget(parent) {
    resize(1000, 800);
    displayLogoDefault = true;
    a = shx = shy = x0 = y0 = 0;
    scx = scy = 1;
    tx = ty = 0;
    x = 200, y = 200;
    paintT = paintSc = paintR = paintShx = paintShy = false;
    img = new QImage(600, 600, QImage::Format_RGB32);
    img2 = new QImage("logo.jpeg");
}

Geometric::~Geometric() {
    delete img;
    delete img2;
    img = NULL;
    img2 = NULL;
}

void Geometric::makeChange() {
    displayLogoDefault = false;
    // iteration through whole input image
    for(int i = 0; i < img2->width(); i++) {
        for(int j = 0; j < img2->height(); j++) {
            // calculate new coordinates based on the given 2D transformation values
            // I calculated that formula earlier by multiplying/adding matrices
            x = cos(a)*scx*(i-x0) - sin(a)*scy*(j-y0) + shx*sin(a)*scx*(i-x0) + shx*cos(a)*scy*(j-y0);
            y = shy*(x) + sin(a)*scx*(i-x0) + cos(a)*scy*(j-y0);
            // tx and ty go for translation, scx and scy for scaling,
            // shx and shy for shearing, and a is the angle for rotation
            x += (x0 + tx);
            y += (y0 + ty);
            if(x >= 0 && y >= 0 && x < img->width() && y < img->height()) {
                // reverse matrix calculation formula to find the proper pixel from the input image
                float tmx = x - x0 - tx;
                float tmy = y - y0 - ty;
                float recX = 1/scx * ( cos(-a)*( (tmx + shx*shy*tmx - shx*tmx) ) + sin(-a)*( shy*tmx - tmy ) ) + x0;
                float recY = 1/scy * ( sin(-a)*(tmx + shx*shy*tmx - shx*tmx) - cos(-a)*(shy*tmx-tmy) ) + y0;
                // here the interpolation starts. I calculate the color based on four points from the input image;
                // those points are taken from the reverse matrix calculation
                float a = recX - floorf(recX);
                float b = recY - floorf(recY);
                if(recX + 1 > img2->width()) recX -= 1;
                if(recY + 1 > img2->height()) recY -= 1;
                QColor c1 = QColor(img2->pixel(recX, recY));
                QColor c2 = QColor(img2->pixel(recX + 1, recY));
                QColor c3 = QColor(img2->pixel(recX, recY + 1));
                QColor c4 = QColor(img2->pixel(recX + 1, recY + 1));
                float colR = b * ((1.0 - a) * (float)c3.red() + a * (float)c4.red()) + (1.0 - b) * ((1.0 - a) * (float)c1.red() + a * (float)c2.red());
                float colG = b * ((1.0 - a) * (float)c3.green() + a * (float)c4.green()) + (1.0 - b) * ((1.0 - a) * (float)c1.green() + a * (float)c2.green());
                float colB = b * ((1.0 - a) * (float)c3.blue() + a * (float)c4.blue()) + (1.0 - b) * ((1.0 - a) * (float)c1.blue() + a * (float)c2.blue());
                if(colR > 255) colR = 255; if(colG > 255) colG = 255; if(colB > 255) colB = 255;
                if(colR < 0) colR = 0; if(colG < 0) colG = 0; if(colB < 0) colB = 0;
                paintPixel(x, y, colR, colG, colB);
            }
        }
    }
    // x0 and y0 are the starting point of the image
    x0 = abs(x-tx);
    y0 = abs(y-ty);
    repaint();
}

// function painting a pixel. It works directly on memory
void Geometric::paintPixel(int i, int j, int r, int g, int b) {
    unsigned char *ptr = img->bits();
    ptr[4 * (img->width() * j + i)] = b;
    ptr[4 * (img->width() * j + i) + 1] = g;
    ptr[4 * (img->width() * j + i) + 2] = r;
}

void Geometric::paintEvent(QPaintEvent*) {
    QPainter p(this);
    p.drawImage(0, 0, *img);
    if (displayLogoDefault == true) p.drawImage(0, 0, *img2);
}

create 2D LoG kernel in openCV like fspecial in Matlab

My question is not how to filter an image using the Laplacian of Gaussian (basically using filter2D with the relevant kernel etc.).
What I want to know is how I generate the NxN kernel.
I'll give an example showing how I generated a [Winsize x WinSize] Gaussian kernel in openCV.
In Matlab:
gaussianKernel = fspecial('gaussian', WinSize, sigma);
In openCV:
cv::Mat gaussianKernel = cv::getGaussianKernel(WinSize, sigma, CV_64F);
cv::mulTransposed(gaussianKernel,gaussianKernel,false);
Where sigma and WinSize are predefined.
I want to do the same for a Laplacian of Gaussian.
In Matlab:
LoGKernel = fspecial('log', WinSize, sigma);
How do I get the exact kernel in openCV (exact up to negligible numerical differences)?
I'm working on a specific application where I need the actual kernel values, and simply finding another way of implementing LoG filtering by approximating it with a Difference of Gaussians is not what I'm after.
Thanks!
You can generate it manually, using the formula
LoG(x,y) = (1/(pi*sigma^4)) * (1 - (x^2+y^2)/(sigma^2)) * e^(-(x^2 + y^2) / (2*sigma^2))
http://homepages.inf.ed.ac.uk/rbf/HIPR2/log.htm
cv::Mat kernel(WinSize, WinSize, CV_64F);
int rows = kernel.rows;
int cols = kernel.cols;
double halfSize = (double) WinSize / 2.0;
for (size_t i = 0; i < rows; i++)
    for (size_t j = 0; j < cols; j++)
    {
        double x = (double)j - halfSize;
        double y = (double)i - halfSize;
        kernel.at<double>(j,i) = (1.0 /(M_PI*pow(sigma,4))) * (1 - (x*x+y*y)/(sigma*sigma))* (pow(2.718281828, - (x*x + y*y) / 2*sigma*sigma));
    }
If the function above is not OK, you can simply rewrite the MATLAB version of fspecial:
case 'log' % Laplacian of Gaussian
    % first calculate Gaussian
    siz = (p2-1)/2;
    std2 = p3^2;
    [x,y] = meshgrid(-siz(2):siz(2),-siz(1):siz(1));
    arg = -(x.*x + y.*y)/(2*std2);
    h = exp(arg);
    h(h<eps*max(h(:))) = 0;
    sumh = sum(h(:));
    if sumh ~= 0,
        h = h/sumh;
    end;
    % now calculate Laplacian
    h1 = h.*(x.*x + y.*y - 2*std2)/(std2^2);
    h = h1 - sum(h1(:))/prod(p2); % make the filter sum to zero
I want to thank old-ufo for nudging me in the correct direction.
I was hoping I wouldn't have to reinvent the wheel by doing a quick MATLAB-to-OpenCV conversion, but I guess this is the best quick solution I have.
NOTE - I did this for square kernels only (easy to modify otherwise, but I have no need for that so...).
Maybe this can be written in a more elegant form, but it was a quick job so I can carry on with more pressing matters.
From main function:
int WinSize(7); int sigma(1); // can be changed to other odd-sized WinSize and different sigma values
cv::Mat h = fspecialLoG(WinSize,sigma);
And the actual function is:
// return NxN (square kernel) of Laplacian of Gaussian as is returned by Matlab's: fspecial('log', WinSize, sigma)
cv::Mat fspecialLoG(int WinSize, double sigma){
    // I wrote this only for square kernels as I have no need for kernels that aren't square
    cv::Mat xx (WinSize,WinSize,CV_64F);
    for (int i=0;i<WinSize;i++){
        for (int j=0;j<WinSize;j++){
            xx.at<double>(j,i) = (i-(WinSize-1)/2)*(i-(WinSize-1)/2);
        }
    }
    cv::Mat yy;
    cv::transpose(xx,yy);
    cv::Mat arg = -(xx+yy)/(2*pow(sigma,2));
    cv::Mat h (WinSize,WinSize,CV_64F);
    for (int i=0;i<WinSize;i++){
        for (int j=0;j<WinSize;j++){
            h.at<double>(j,i) = pow(exp(1),(arg.at<double>(j,i)));
        }
    }
    double minimalVal, maximalVal;
    minMaxLoc(h, &minimalVal, &maximalVal);
    cv::Mat tempMask = (h>DBL_EPSILON*maximalVal)/255;
    tempMask.convertTo(tempMask,h.type());
    cv::multiply(tempMask,h,h);
    if (cv::sum(h)[0]!=0){h=h/cv::sum(h)[0];}
    cv::Mat h1 = (xx+yy-2*pow(sigma,2))/pow(sigma,4);
    cv::multiply(h,h1,h1);
    h = h1 - cv::sum(h1)[0]/(WinSize*WinSize);
    return h;
}
There is some difference between your function and the matlab version:
http://br1.einfach.org/tmp/log-matlab-vs-opencv.png.
Above is matlab fspecial('log', 31, 6) and below is the result of your function with the same parameters. Somehow the hat is more 'bent' - is this intended and what is the effect of this in later processing?
I can create a kernel very similar to the matlab one with these functions, which just directly reflect the LoG formula:
float LoG(int x, int y, float sigma) {
    float xy = (pow(x, 2) + pow(y, 2)) / (2 * pow(sigma, 2));
    return -1.0 / (M_PI * pow(sigma, 4)) * (1.0 - xy) * exp(-xy);
}

static Mat LOGkernel(int size, float sigma) {
    Mat kernel(size, size, CV_32F);
    int halfsize = size / 2;
    for (int x = -halfsize; x <= halfsize; ++x) {
        for (int y = -halfsize; y <= halfsize; ++y) {
            kernel.at<float>(x+halfsize, y+halfsize) = LoG(x, y, sigma);
        }
    }
    return kernel;
}
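A possible way to apply such a kernel (a sketch; src is assumed to be a single-channel image, and a 32-bit float destination keeps the negative responses):
cv::Mat kernel = LOGkernel(31, 6.0f);   // same parameters as the comparison above
cv::Mat response;
cv::filter2D(src, response, CV_32F, kernel);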
Here's a NumPy version that is directly translated from the fspecial function in MATLAB.
import numpy as np
import sys

def get_log_kernel(siz, std):
    x = y = np.linspace(-siz, siz, 2*siz+1)
    x, y = np.meshgrid(x, y)
    arg = -(x**2 + y**2) / (2*std**2)
    h = np.exp(arg)
    h[h < sys.float_info.epsilon * h.max()] = 0
    h = h/h.sum() if h.sum() != 0 else h
    h1 = h*(x**2 + y**2 - 2*std**2) / (std**4)
    return h1 - h1.mean()
The code below is the exact equivalent to fspecial('log', p2, p3):
def fspecial_log(p2, std):
    siz = int((p2-1)/2)
    x = y = np.linspace(-siz, siz, 2*siz+1)
    x, y = np.meshgrid(x, y)
    arg = -(x**2 + y**2) / (2*std**2)
    h = np.exp(arg)
    h[h < sys.float_info.epsilon * h.max()] = 0
    h = h/h.sum() if h.sum() != 0 else h
    h1 = h*(x**2 + y**2 - 2*std**2) / (std**4)
    return h1 - h1.mean()
I wrote an exact implementation of the MATLAB fspecial function in OpenCV.
Function:
Mat C_fspecial_LOG(double* kernel_size, double sigma)
{
    double size[2] = { (kernel_size[0]-1)/2 , (kernel_size[1]-1)/2 };
    double std = sigma;
    const double eps = 2.2204e-16;
    cv::Mat kernel(kernel_size[0], kernel_size[1], CV_64FC1, 0.0);
    int row = 0, col = 0;
    for (double y = -size[0]; y <= size[0]; ++y, ++row)
    {
        col = 0;
        for (double x = -size[1]; x <= size[1]; ++x, ++col)
        {
            kernel.at<double>(row,col) = exp( -( pow(x,2) + pow(y,2) ) / (2*pow(std,2)) );
        }
    }
    double MaxValue;
    cv::minMaxLoc(kernel, nullptr, &MaxValue, nullptr, nullptr);
    Mat condition = ~(kernel < eps*MaxValue)/255;
    condition.convertTo(condition, CV_64FC1);
    kernel = kernel.mul(condition);
    cv::Scalar SUM = cv::sum(kernel);
    if (SUM[0] != 0)
    {
        kernel /= SUM[0];
    }
    return kernel;
}
Usage of this function:
double kernel_size[2] = {4,4}; // kernel size set to 4x4
double sigma = 2.1;
Mat kernel = C_fspecial_LOG(kernel_size,sigma);
Comparing the OpenCV result with MATLAB:
OpenCV result:
[0.04918466596701741, 0.06170341496034986, 0.06170341496034986, 0.04918466596701741;
0.06170341496034986, 0.07740850411228289, 0.07740850411228289, 0.06170341496034986;
0.06170341496034986, 0.07740850411228289, 0.07740850411228289, 0.06170341496034986;
0.04918466596701741, 0.06170341496034986, 0.06170341496034986, 0.04918466596701741]
MATLAB result for fspecial('gaussian', 4, 2.1):
0.0492 0.0617 0.0617 0.0492
0.0617 0.0774 0.0774 0.0617
0.0617 0.0774 0.0774 0.0617
0.0492 0.0617 0.0617 0.0492
Just for the sake of reference, here is a Python implementation which creates the LoG filter kernel to detect blobs of a pre-defined radius in pixels.
import numpy as np

def create_log_filter_kernel(r_in_px: float):
    """
    Creates a LoG filter-kernel to detect blobs of a given radius r_in_px.
    \[
    LoG(x,y) = \frac{-1}{\pi\sigma^4}\left(1 - \frac{x^2 + y^2}{2\sigma^2}\right)e^{\frac{-(x^2+y^2)}{2\sigma^2}}
    \]
    Look for maxima if blob is black, minima if blob is white.
    :param r_in_px:
    :return: filter kernel
    """
    # sigma from radius: LoG has zero-crossing at $1 - \frac{x^2 + y^2}{2\sigma^2} = 0$
    # i.e. $r^2 = 2\sigma^2$ and thus $\sigma = r / \sqrt{2}$
    sigma = r_in_px/np.sqrt(2)
    # ksize such that filter covers $3\sigma$
    ksize = int(np.round(sigma*3))*2 + 1
    # setup filter
    xgv = np.arange(0, ksize) - ksize / 2
    ygv = np.arange(0, ksize) - ksize / 2
    x, y = np.meshgrid(xgv, ygv)
    kernel = -1 / (np.pi * sigma**4) * (1 - (x**2 + y**2) / (2*sigma**2)) * np.exp(-(x**2 + y**2) / (2 * sigma**2))
    # normalize to sum zero (does not change zero crossing, I tried it out for r < 100)
    kernel -= np.sum(kernel) / ksize**2
    # this is important: normalize such that positive/negative parts are comparable over different scales
    kernel /= np.sum(kernel[kernel > 0])
    return kernel

Complex Numbers and Naive Fourier Transform (C++)

I'm trying to get Fourier transforms to work for an assignment. I think I have it to where it should be working, and I'm not sure why it's not. I suspect it has something to do with the complex numbers, since 'i' is involved. I've looked at many references and I understand the formula, but I'm having trouble programming it. This is what I have so far:
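For reference, the formula being implemented appears to be the 2D DFT with a 1/(M*N) normalization, where M is the image width and N the height:
F(u,v) = (1 / (M*N)) * sum over x,y of f(x,y) * e^(-i * 2*pi * (u*x/M + v*y/N))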
void NaiveDFT::Apply( Image & img )
{
    //make the fourier transform using the naive method and set that to the image.
    Image dft(img);
    Pixel ** dftData = dft.GetImageData();
    Pixel ** imgData = img.GetImageData();
    for(unsigned u = 0; u < img.GetWidth(); ++u)
    {
        for(unsigned v = 0; v < img.GetHeight(); ++v)
        {
            std::complex<double> sum = 0;
            for(unsigned x = 0; x < img.GetWidth(); ++x)
            {
                for(unsigned y = 0; y < img.GetHeight(); ++y)
                {
                    std::complex<double> i = sqrt(std::complex<double>(-1));
                    std::complex<double> theta = 2 * M_PI * (((u * x) / img.GetWidth()) + ((v * y) / img.GetHeight()));
                    sum += std::complex<double>(imgData[x][y]._red) * cos(theta) + (-i * sin(theta));
                    //sum += std::complex<double>(std::complex<double>(imgData[x][y]._red) * pow(EULER, -i * theta));
                }
            }
            dftData[u][v] = (sum.imag() / (img.GetWidth() * img.GetHeight()));
        }
    }
    img = dft;
}
I have a few test images I'm testing this with, and I'm getting either an all-black image or an all-gray image.
I've also tried summing e^(-i*2*PI*(u*x/width + v*y/height)) scaled by 1/(width*height), which, as expected, gives the same result, although it's still not the desired output.
I've also tried sum.real(), and that doesn't look right either.
If anyone has any tips or can point me in the right direction, that'd be great. At this point I just keep trying different things and checking the output until I get what I should be getting.
Thanks.
I think the problem is in the multiplication with the complex term. The line:
sum += std::complex<double>(imgData[x][y]._red) * cos(theta) + (-i * sin(theta));
should be:
sum += std::complex<double>(imgData[x][y]._red) * ( cos(theta) + -i * sin(theta));
Moreover, while calculating theta you need to use double precision:
std::complex<double> theta = 2 * M_PI * ((((double)u * x) / (double)(img.GetWidth())) + (((double)v * y) / (double)(img.GetHeight())));
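Putting both fixes together, the innermost statement could be written along these lines (a sketch that keeps the original member names; std::exp of a complex argument is equivalent to the cos/sin form):
const std::complex<double> i(0.0, 1.0);
double theta = 2.0 * M_PI * (((double)u * x) / (double)img.GetWidth()
                           + ((double)v * y) / (double)img.GetHeight());
// f(x, y) * e^(-i*theta) == f(x, y) * (cos(theta) - i*sin(theta))
sum += std::complex<double>(imgData[x][y]._red) * std::exp(-i * theta);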