efficient way of accessing opencv Mat elements - c++

I'm trying to play around with some OpenCV and thought up an interesting little scenario to work on.
Basically, I want to take a pixel, add the colour values from its 3 neighbouring pixels (so (x, y), (x+1, y), (x, y+1) and (x+1, y+1)) and divide the result by 4 to get an average colour value. Then the next set of pixels I process is (x+2, y+2) with its 3 neighbours.
I then also want to be able to do a similar thing, but with 9 pixels (with the chosen co-ordinate to work from being the centre).
Initially I started with a Gaussian-blur-type mask, but that's not the result I want to achieve: from each of those calculations I just want to get 1 pixel value, so the output image will be 1/4 or 1/9 of the size. For now I've got it working by literally writing out the calculation in a for loop:
for (int i = 1; i < myImage.rows -1; i++)
{
b = 0;
for (int k = 1; k < myImage.cols -1; k++)
{
//9 pixel radius
Result.at<Vec3b>(a, b)[1] = (myImage.at<Vec3b>(i-1, k-1)[1]+myImage.at<Vec3b>(i-1, k)[1]+myImage.at<Vec3b>(i+1, k)[1] + myImage.at<Vec3b>(i, k)[1]+myImage.at<Vec3b>(i, k-1)[1]+myImage.at<Vec3b>(i, k+1)[1] + myImage.at<Vec3b>(i + 1, k+1)[1] + myImage.at<Vec3b>(i-1, k + 1)[1] + myImage.at<Vec3b>(i + 1, k - 1)[1]) / 9;
Result.at<Vec3b>(a, b)[2] = (myImage.at<Vec3b>(i-1, k-1)[2]+myImage.at<Vec3b>(i-1, k)[2]+myImage.at<Vec3b>(i+1, k)[2] + myImage.at<Vec3b>(i, k)[2]+myImage.at<Vec3b>(i, k-1)[2]+myImage.at<Vec3b>(i, k+1)[2] + myImage.at<Vec3b>(i + 1, k+1)[2] + myImage.at<Vec3b>(i-1, k + 1)[2] + myImage.at<Vec3b>(i + 1, k - 1)[2]) / 9;
Result.at<Vec3b>(a, b)[0] = (myImage.at<Vec3b>(i-1, k-1)[0]+myImage.at<Vec3b>(i-1, k)[0]+myImage.at<Vec3b>(i+1, k)[0] + myImage.at<Vec3b>(i, k)[0]+myImage.at<Vec3b>(i, k-1)[0]+myImage.at<Vec3b>(i, k+1)[0] + myImage.at<Vec3b>(i + 1, k+1)[0] + myImage.at<Vec3b>(i-1, k + 1)[0] + myImage.at<Vec3b>(i + 1, k - 1)[0]) / 9;
//4 pixel radius
// Result.at<Vec3b>(a, b)[1] = (myImage.at<Vec3b>(i, k)[1] + myImage.at<Vec3b>(i + 1, k)[1] + myImage.at<Vec3b>(i, k + 1)[1] + myImage.at<Vec3b>(i, k - 1)[1] + myImage.at<Vec3b>(i - 1, k)[1]) / 5;
// Result.at<Vec3b>(a, b)[2] = (myImage.at<Vec3b>(i, k)[2] + myImage.at<Vec3b>(i + 1, k)[2] + myImage.at<Vec3b>(i, k + 1)[2] + myImage.at<Vec3b>(i, k - 1)[2] + myImage.at<Vec3b>(i - 1, k)[2]) / 5;
// Result.at<Vec3b>(a, b)[0] = (myImage.at<Vec3b>(i, k)[0] + myImage.at<Vec3b>(i + 1, k)[0] + myImage.at<Vec3b>(i, k + 1)[0] + myImage.at<Vec3b>(i, k - 1)[0] + myImage.at<Vec3b>(i - 1, k)[0]) / 5;
b++;
}
a++;
}
Obviously, it's possible to set up the two options as different functions, but I'm wondering if there's a more efficient way of achieving this that would let the size of the mask be changed.
Thanks for any help!

I'm assuming that you want to do this all without built-in functions (like resize, mean, or filter2D) and just want to directly address the image using at. There are further optimizations that can be made, but this is intended as a reasonable and understandable improvement on the original code.
Also, it should be noted that I ignore any extra rows/columns when the image size is not exactly divisible by the scale factor. You'll need to specify the expected behavior if you want something different.
The first thing I'd do is change what you think of as the target pixel. Assume you have a 3x3 neighborhood like so:
1 2 3
4 5 6
7 8 9
We're going to take the mean value of all of these pixels anyway, so whether we call pixel 5 the target or pixel 1 makes no difference to the resulting image. I'm going to call pixel 1 the target because it makes the math cleaner.
The 1 pixel will always be on coordinates divisible by the scaling factor. If the scaling factor is 2, the coordinates of 1 will always be even.
Second, rather than loop over the original image dimensions, which actually results in recalculating the same pixel in Result numerous times, I'm going to loop over the dimensions of Result and figure out which pixels in the original image contribute to each pixel in the result.
So to find the neighborhood in the original image that corresponds to pixel (x, y) in the result image, we just have to look for pixel 1 of that neighborhood. Since its coordinates are multiples of the scaling factor, it's just
(x * scaleFactor, y * scaleFactor)
Finally, we need to add two more nested loops to loop over the scaleFactor x scaleFactor window. This is the part that avoids having to type out those long calculations.
In the 3x3 example above, for example, pixel 9 in the neighborhood of (x, y) will be:
(x * scaleFactor + 2, y * scaleFactor + 2)
I also do the mean calculation directly on a vector rather than on each channel individually. The sum over the window can exceed what a uchar holds (up to scaleFactor x scaleFactor x 255), so I accumulate in a Vec3i and cast back to a Vec3b after the division. This is one place where you should consider using the built-in function mean to calculate the average over the window, as it would remove the need for these new loops.
So, if our original image is myImage, we have:
int scaleFactor = 3;
Mat Result(myImage.rows/scaleFactor, myImage.cols/scaleFactor,
           myImage.type(), Scalar::all(0));
for (int i = 0; i < Result.rows; i++)
{
    for (int k = 0; k < Result.cols; k++)
    {
        // make sum an int vector so it can hold
        // value = scaleFactor x scaleFactor x 255
        Vec3i areaSum = Vec3i(0,0,0);
        for (int m = 0; m < scaleFactor; m++)
        {
            for (int n = 0; n < scaleFactor; n++)
            {
                areaSum += myImage.at<Vec3b>(i*scaleFactor+m, k*scaleFactor+n);
            }
        }
        Result.at<Vec3b>(i,k) = Vec3b(areaSum/(scaleFactor*scaleFactor));
    }
}
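The same loop structure can also be sketched without OpenCV, which makes it easy to try out in isolation. This is only an illustration under assumptions: the Image struct and blockAverage are hypothetical names, the buffer is assumed to be 8-bit with interleaved channels, and extra rows/columns are dropped as described above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal image type: 8-bit, interleaved channels, row-major.
struct Image {
    int rows, cols, channels;
    std::vector<std::uint8_t> data;

    Image(int r, int c, int ch)
        : rows(r), cols(c), channels(ch),
          data(static_cast<std::size_t>(r) * c * ch, 0) {}

    std::uint8_t& at(int r, int c, int ch) {
        return data[(static_cast<std::size_t>(r) * cols + c) * channels + ch];
    }
    std::uint8_t at(int r, int c, int ch) const {
        return data[(static_cast<std::size_t>(r) * cols + c) * channels + ch];
    }
};

// Average every scaleFactor x scaleFactor block of src into one output pixel.
Image blockAverage(const Image& src, int scaleFactor) {
    Image dst(src.rows / scaleFactor, src.cols / scaleFactor, src.channels);
    const int area = scaleFactor * scaleFactor;
    for (int i = 0; i < dst.rows; ++i) {
        for (int k = 0; k < dst.cols; ++k) {
            for (int c = 0; c < src.channels; ++c) {
                int sum = 0;  // int, because area * 255 does not fit in 8 bits
                for (int m = 0; m < scaleFactor; ++m)
                    for (int n = 0; n < scaleFactor; ++n)
                        sum += src.at(i * scaleFactor + m, k * scaleFactor + n, c);
                dst.at(i, k, c) = static_cast<std::uint8_t>(sum / area);
            }
        }
    }
    return dst;
}
```

Calling blockAverage(img, 2) gives the 4-pixel version and blockAverage(img, 3) the 9-pixel version, so the mask size is just a parameter.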
Here are a couple of samples (images not reproduced here): the original, followed by the results for scaleFactor = 2, scaleFactor = 3 and scaleFactor = 5.

Related

Transform images with bezier curves

I'm using this article, nonlingr, as a source for understanding non-linear transformations. In the section GLYPHS ALONG A PATH the author explains how to use a parametric curve to transform an image. I'm trying to apply a cubic Bezier to an image, but I have been unsuccessful. This is my code:
OUT.aloc(IN.width(), IN.height());
//get the control points...
wVector p0(values[vindex], values[vindex+1], 1);
wVector p1(values[vindex+2], values[vindex+3], 1);
wVector p2(values[vindex+4], values[vindex+5], 1);
wVector p3(values[vindex+6], values[vindex+7], 1);
//this is to calculate t based on x
double trange = 1 / (OUT.width()-1);
//curve coefficients
double A = (-p0[0] + 3*p1[0] - 3*p2[0] + p3[0]);
double B = (3*p0[0] - 6*p1[0] + 3*p2[0]);
double C = (-3*p0[0] + 3*p1[0]);
double D = p0[0];
double E = (-p0[1] + 3*p1[1] - 3*p2[1] + p3[1]);
double F = (3*p0[1] - 6*p1[1] + 3*p2[1]);
double G = (-3*p0[1] + 3*p1[1]);
double H = p0[1];
//apply the transformation
for(long i = 0; i < OUT.height(); i++){
for(long j = 0; j < OUT.width(); j++){
//t = x / width
double t = trange * j;
//apply the article given formulas
double x_path_d = 3*t*t*A + 2*t*B + C;
double y_path_d = 3*t*t*E + 2*t*F + G;
double angle = 3.14159265/2.0 + std::atan(y_path_d / x_path_d);
mapped_point.Set((t*t*t)*A + (t*t)*B + t*C + D + i*std::cos(angle),
(t*t*t)*E + (t*t)*F + t*G + H + i*std::sin(angle),
1);
//test if the point is inside the image
if(mapped_point[0] < 0 ||
mapped_point[0] >= OUT.width() ||
mapped_point[1] < 0 ||
mapped_point[1] >= IN.height())
continue;
OUT.setPixel(
long(mapped_point[0]),
long(mapped_point[1]),
IN.getPixel(j, i));
}
}
Applying this code to a 300x196 RGB image, all I get is a black screen no matter what control points I use. It is hard to find information about this kind of transformation: searching for parametric curves, all I find is how to draw them, not how to apply them to images. Can someone help me transform an image with a Bezier curve?
IMHO applying a curve to an image sounds like using a LUT (look-up table). You evaluate the curve for each possible image value and then replace every image value with the corresponding value on the curve. In other words, create a look-up table with one entry per possible value in the image (e.g. 0, 1, ..., 255 for an 8-bit grayscale image): a 2x256 matrix whose first column holds the values from 0 to 255 and whose second column holds the value of the curve.
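A minimal sketch of that LUT reading, under assumptions of my own: the curve is treated as a one-dimensional cubic Bezier tone curve with t = v / 255, the four control values are output intensities, and buildBezierLut and applyLut are hypothetical names. Note that this remaps intensities only; it does not move pixels along a path.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstdint>
#include <vector>

// Build a 256-entry table from a 1-D cubic Bezier curve.
// Assumption: t = v / 255 and p0..p3 are output intensities in [0, 255].
std::array<std::uint8_t, 256> buildBezierLut(double p0, double p1,
                                             double p2, double p3) {
    std::array<std::uint8_t, 256> lut{};
    for (int v = 0; v < 256; ++v) {
        const double t = v / 255.0;
        const double u = 1.0 - t;
        double y = u * u * u * p0 + 3 * u * u * t * p1 +
                   3 * u * t * t * p2 + t * t * t * p3;
        y = std::min(255.0, std::max(0.0, y));  // clamp to the 8-bit range
        lut[v] = static_cast<std::uint8_t>(std::lround(y));
    }
    return lut;
}

// Replace every 8-bit value in place with its table entry.
void applyLut(std::vector<std::uint8_t>& pixels,
              const std::array<std::uint8_t, 256>& lut) {
    for (std::uint8_t& p : pixels) p = lut[p];
}
```

With control values (0, 85, 170, 255) the curve is a straight line and the table is the identity; (255, 170, 85, 0) inverts the image.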

c++ YUYV 422 Horizontal and Vertical Flipping

I have a uint8_t YUYV 422 (Interleaved) image array in memory and I want to be able to flip it both vertically and horizontally. I have successfully implemented a vertical flip but I'm having a problem with flipping both horizontally and vertically at the same time.
My code for the vertical flip, below, works perfectly.
int counter = 0;
int array_width = 2; // YUYV
for (int h = (m_Width * m_Height * array_width) - m_Width * array_width; h > 0; h -= m_Width * array_width)
{
for (int w = 0; w < m_Width * array_width; w++)
{
flipped[counter] = buffer[h + w];
counter++;
}
}
However, the following vertical and horizontal flip code appears to work but there is a loss of definition. To better understand what I am referring to, please see my sample images.
int x = 0;
for (int n = m_Width * m_Height * 2 - 1; n >= 0; n -= 4)
{
flipped[x] = buffer[n - 3]; // Y0
flipped[x + 1] = buffer[n - 2]; // U
flipped[x + 2] = buffer[n - 1]; // Y1
flipped[x + 3] = buffer[n]; // V
x += 4;
}
As you can see, I am moving the YUYV components and keeping them in the same order. I don't believe that I am dropping pixels so I don't understand why I am losing definition. To reiterate, I don't see this problem when flipping vertically (Using the first code snippet).
Here is the reference image, please note the stem of the lamp:
This is the flipped image, the stem of the lamp has lost definition:
You also need to swap Y0 and Y1 in your loop.
int x = 0;
for (int n = m_Width * m_Height * 2 - 1; n >= 3; n -= 4)
{
    flipped[x] = buffer[n - 1]; // Y1->Y0
    flipped[x + 1] = buffer[n - 2]; // U
    flipped[x + 2] = buffer[n - 3]; // Y0->Y1
    flipped[x + 3] = buffer[n]; // V
    x += 4;
}
While I was at it, since you're accessing n - 3 I changed the loop condition to be absolutely sure it was safe.
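For experimenting outside the class, the swap can be wrapped in a small standalone function. This is a sketch, not the asker's code: flipBoth is a hypothetical name, and it assumes a packed YUYV buffer whose size is width * height * 2 with an even width.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Flip a packed YUYV 4:2:2 frame both horizontally and vertically.
// Assumes buffer.size() == width * height * 2 and that width is even.
std::vector<std::uint8_t> flipBoth(const std::vector<std::uint8_t>& buffer) {
    std::vector<std::uint8_t> flipped(buffer.size());
    const std::size_t groups = buffer.size() / 4;  // one group = Y0 U Y1 V
    for (std::size_t g = 0; g < groups; ++g) {
        const std::uint8_t* src = &buffer[(groups - 1 - g) * 4];
        std::uint8_t* dst = &flipped[g * 4];
        dst[0] = src[2];  // Y1 becomes the first luma sample
        dst[1] = src[1];  // U stays with the pair
        dst[2] = src[0];  // Y0 becomes the second luma sample
        dst[3] = src[3];  // V stays with the pair
    }
    return flipped;
}
```

Without the luma swap, each pair of pixels keeps its original left-to-right order inside a reversed row, which would explain the jagged look of thin vertical features like the lamp stem. Applying flipBoth twice should give back the original buffer.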
m_Width * m_Height * 2 is not a multiple of 4 (the number of data blocks in YUYV format). Try changing '2' into '4', and also array_width.

Sliding window with roi c++

I have a little indexing problem with the following for loop. I am simply scanning the image with a ROI, but I am not able to scan the whole image: some unscanned regions are left in the last rows and columns. Any suggestions?
Sorry for simple question.
// Sliding window for scanning the image
for (int rowIndex = 0; rowIndex <= lBPIIImage2.rows - roih; rowIndex = getNextIndex(rowIndex, lBPIIImage2.rows, roih, steprow))
{
for (int colindex = 0; colindex <=lBPIIImage2.cols - roiw; colindex = getNextIndex(colindex, lBPIIImage2.cols, roiw, stepcol))
{
searchRect = cvRect(colindex, rowIndex, roiw, roih);
frameSearchRect = lBPIIImage2(searchRect);
LoopDummy = frameSearchRect.clone();
rectangle(frame, searchRect, CV_RGB(255, 0, 0), 1, 8, 0);
//normalize(LoopDummy, LoopDummy, 0, 255, NORM_MINMAX, CV_8UC1);
//imshow("Track", LoopDummy);
//waitKey(30);
images.push_back(LoopDummy);
Coordinate.push_back(make_pair(rowIndex, colindex));
}
}
Instead of this
for (int rowindex = 0; rowindex <= frameWidth - roiw; rowindex += StepRow)
try something like this
for (int rowIndex = 0; rowIndex < frameWidth - roiw;
rowIndex = getNextIndex(rowIndex, frameWidth, roiw, StepRow))
where function getNextIndex is like this:
int getNextIndex(int currentIndex, int frameSize, int roi, int step)
{
    int temp = min(currentIndex + step, frameSize - roi - 1);
    return currentIndex != temp ? temp : frameSize;
}
Note that I replaced <= with < in the condition part of the loop. It makes sense to me to change that bit. In my comments index values should be 0,5,6, instead of 0,5,7. If it really should stand <= then just remove the - 1 part above.
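To see which start indices this produces, here is a small standalone sketch of the <= variant, i.e. with the - 1 removed as mentioned above. windowStarts is a hypothetical helper added only for illustration, and step is assumed to be positive.

```cpp
#include <algorithm>
#include <vector>

// <= variant: the last valid start index is frameSize - roi (no "- 1").
int getNextIndex(int currentIndex, int frameSize, int roi, int step) {
    const int temp = std::min(currentIndex + step, frameSize - roi);
    // Returning frameSize once the last window was visited ends the loop.
    return currentIndex != temp ? temp : frameSize;
}

// Collect every start index the loop visits along one dimension.
std::vector<int> windowStarts(int frameSize, int roi, int step) {
    std::vector<int> starts;
    for (int i = 0; i <= frameSize - roi;
         i = getNextIndex(i, frameSize, roi, step)) {
        starts.push_back(i);
    }
    return starts;
}
```

For a size of 14, a ROI of 8 and a step of 4, this visits 0, 4 and 6, so the last window ends exactly on the last pixel.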
Dialecticus has your problem covered, but I'd like to expand a little. The problem, as you've noticed, affects both rows and columns, but I'll only talk about width since height handling is analogous.
Parts of an array or range not iterated over by a loop, especially ones at the end, is a typical symptom that the loop's condition and/or increment do not cover the entire array/range and need adjustment. In your case, to scan the entire image width, you'd need frameWidth - roiw to be an exact number of StepRows, i.e. (frameWidth - roiw) % StepRow == 0. I strongly suspect that's not true in your case.
For example, if your image is 14 pixels wide, and your region of interest width is 8, in which case your step would be 4 pixels (I assume you meant roiw * 0.5), then this would happen:
Iterations #1 and #2 would scan pixels 0:7 and 4:11. Iteration #3 (rowindex == 8) would fail the rowindex <= frameWidth - roiw test (8 > 6) and exit the loop without scanning the last 2 pixels.
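The example can be checked with a few lines that mimic the original fixed-step loop. fixedStepStarts and uncoveredTail are hypothetical helper names used only for this sketch.

```cpp
#include <vector>

// Start offsets visited by the original fixed-step loop.
std::vector<int> fixedStepStarts(int frameWidth, int roiw, int stepRow) {
    std::vector<int> starts;
    for (int rowindex = 0; rowindex <= frameWidth - roiw; rowindex += stepRow)
        starts.push_back(rowindex);
    return starts;
}

// Number of trailing pixels that no window covers.
int uncoveredTail(int frameWidth, int roiw, int stepRow) {
    const std::vector<int> starts = fixedStepStarts(frameWidth, roiw, stepRow);
    if (starts.empty()) return frameWidth;  // the ROI does not fit at all
    return frameWidth - (starts.back() + roiw);
}
```

With frameWidth = 14, roiw = 8 and a step of 4, the starts are 0 and 4 and the tail is 2 pixels; the tail is 0 only when (frameWidth - roiw) % StepRow == 0.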
I see three options, each with its disadvantages:
Disclaimer: All code below is not tested and may contain typos and logical errors. Please test and adjust accordingly.
1. Change the offset of the last region so it fits the end of the image. In this case, the last rowindex increment would not be StepRow but the number of pixels between the end of your last scanning region and the end of the image. The drawback is that the last two regions may overlap more than those before them - in our case, region 1 would cover pixels 0:7, region 2 - 4:11 and region 3 - 6:13.
To do this, you can modify your loop increment to check if this is the last iteration (please think of how to write it prettier, I just used the ternary operator to make the change smaller):
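for (int rowindex = 0;
     /* Note that a region of (roiw) pixels ending at (frameWidth - 1) should start
        at (frameWidth - roiw), so the <= operator should be correct. To verify that,
        check the single-region case where rowindex == 0 and frameWidth == roiw */
     rowindex <= frameWidth - roiw;
     /* Step normally while a full step still fits, or when the current region
        already ends at the border (a zero increment would never leave the loop);
        otherwise jump to the last valid offset */
     rowindex += ((rowindex + roiw + StepRow <= frameWidth ||
                   rowindex + roiw == frameWidth) ?
                  StepRow :
                  (frameWidth - (rowindex + roiw))))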
2. Clamp the size of the last region to the end of the image. The drawback would be that your last region is smaller than those before it - in our case, regions 1 and 2 would have a size of 8, while region 3 would be 6 pixels.
To do this, you can put a check when constructing the cvRect to see if it needs to be smaller. You should also change your loop condition somehow, so that you actually enter it in this case:
/* (StepRow - 1) should make sure that the loop is entered when the
   last unscanned region has a width between 1 and StepRow, inclusive */
for (int rowindex = 0; rowindex <= frameWidth - roiw + (StepRow - 1); rowindex += StepRow)
/* ... column loop ... */
{
    if (rowindex > frameWidth - roiw)
    {
        roiw = frameWidth - rowindex;
    }
    searchRect = cvRect(rowindex, colindex, roiw, roih);
3. Use a varying step so that your regions are of equal width and overlap as equally as possible. In our case, the three regions would overlap equally, being at pixels 0:7, 3:10 and 6:13, but obviously that won't be true in most cases.
To do that, you can calculate how many iterations you expect to do with your current step, and then divide by that number the difference between your frame and ROI width, which would give you your fractional step. Since pixel offset is an integer, to construct the cvRect, you'd need an integer value. I suggest keeping and incrementing the rowindex and step as fractions and converting to int just before use in the cvRect constructor - in this way, your index will move as evenly as possible.
Note that I've used an epsilon value to (hopefully) avoid rounding errors in float-to-int conversion when the float value is very close to an int. See this blog post and comments for an example and an explanation of the technique. However, keep in mind that the code is not tested and there may still be rounding errors, especially if your int values are big!
/* To see how these calculations work out, experiment with
roiw == 8, StepRow == 4 and frameWidth == 14, 15, 16, 17 */
/* N.B.: Does not cover the case when frameWidth <= roiw
   (stepRowCount - 1 would then be 0 or the ROI would not fit)! */
int stepRowCount = (frameWidth - roiw + (StepRow - 1)) / StepRow + 1;
float step = ((float)(frameWidth - roiw)) / (stepRowCount - 1);
float eps = 0.005f;
for (float rowindex = 0.0f; (int)(rowindex + eps) <= frameWidth - roiw; rowindex += step)
/* ... column loop ... */
{
    searchRect = cvRect((int)(rowindex + eps), colindex, roiw, roih);
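To compare the three options side by side, here is a standalone sketch that only computes the window offsets and widths along one dimension. Everything in it is an assumption made for illustration: Window, shiftLast, clampLast and evenSpread are hypothetical names, stepRow is assumed to be positive, and option 3 uses rounded integer division instead of the float-plus-epsilon approach described above.

```cpp
#include <algorithm>
#include <vector>

struct Window { int start; int width; };

// Option 1: keep the width, shift the last window so it ends at the border.
std::vector<Window> shiftLast(int frameWidth, int roiw, int stepRow) {
    std::vector<Window> out;
    if (frameWidth < roiw) return out;  // the ROI does not fit
    for (int start = 0; start <= frameWidth - roiw; start += stepRow)
        out.push_back({start, roiw});
    const int last = frameWidth - roiw;
    if (out.back().start != last) out.push_back({last, roiw});
    return out;
}

// Option 2: keep the step, shrink the last window to fit.
std::vector<Window> clampLast(int frameWidth, int roiw, int stepRow) {
    std::vector<Window> out;
    for (int start = 0; start < frameWidth; start += stepRow) {
        const int width = std::min(roiw, frameWidth - start);
        out.push_back({start, width});
        if (start + width == frameWidth) break;  // reached the border
    }
    return out;
}

// Option 3: same window count as option 1, offsets spread evenly.
std::vector<Window> evenSpread(int frameWidth, int roiw, int stepRow) {
    std::vector<Window> out;
    if (frameWidth < roiw) return out;
    const int span = frameWidth - roiw;
    const int count = (span + stepRow - 1) / stepRow + 1;
    for (int i = 0; i < count; ++i) {
        // Rounded i * span / (count - 1); count == 1 implies span == 0.
        const int start =
            (count == 1) ? 0 : (i * span + (count - 1) / 2) / (count - 1);
        out.push_back({start, roiw});
    }
    return out;
}
```

For frameWidth = 14, roiw = 8 and a step of 4, these give starts 0, 4, 6 (all width 8), starts 0, 4, 8 (widths 8, 8, 6) and starts 0, 3, 6 (all width 8) respectively.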
P.S.: Shouldn't your index that iterates the image width be called colindex and vice versa, since in a 1920x1080 image you have 1920 columns and 1080 rows (hence terms like 1080p)?

Gaussian Pyramid Out of Bounds

I am trying to write my own code for a Gaussian pyramid in C++.
I tried both reduce and expand equations as stated in http://persci.mit.edu/pub_pdfs/pyramid83.pdf, the equation (1) and (2). However, my array index is out of bounds when I am trying to access
[2i + m][2j + n] and [(i - m) / 2][(j - n) / 2], respectively.
My Gaussian kernel is a 5x5 matrix. g1Image is the original image reduced by 1 level, so both its rows and columns are half of the original image's dimensions.
My m and n are set to -2 < m/n <= 2, thus when I access my Gaussian kernel, I add 2 to the index, giving
w[m + 2][n + 2] * original_image[2i + m][2j + n]
I did try to set my m and n to 0 < m/n <=4 as well, equation becomes
w[m][n] * original_image[2i + m][2j + n] or w[m][n] * original_image[2i + m - 2][2j + n - 2]
Any of the mentioned equations are out of bounds.
w[m][n] * original_image[2i][2j] for reduce equation and
w[m][n] * g1Image[i / 2][j / 2] for expand equation are working though.
However, the displayed image seems like there is no smoothing effect.
Can anyone explain how I should set the image dimensions for each Gaussian pyramid reduction and expansion, and what the boundaries for m and n should be?
I have solved the problem by adding this bounds check:
index1 = (2 * h) + m; index2 = (2 * w) + n;
if(index1 >= 0 && index1 < Height && index2 >= 0 && index2 < Width)
temp = w[m + 2][n + 2] * original_image[index1][index2];
More information at :
http://www.songho.ca/dsp/convolution/convolution.html
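For reference, a complete reduce step built around that bounds check might look like the sketch below. The details are assumptions rather than the asker's code: pyramidReduce and Gray are hypothetical names, the kernel is the separable binomial [1 4 6 4 1] / 16, the output size is ceil(H/2) x ceil(W/2), and taps that fall outside the image are skipped with the weight renormalised so the borders do not darken.

```cpp
#include <vector>

using Gray = std::vector<std::vector<float>>;

// One REDUCE step: 5x5 weighted average centred on source pixel (2i, 2j).
Gray pyramidReduce(const Gray& src) {
    // Assumed kernel: separable binomial weights, w[m][n] = k[m] * k[n].
    static const float k[5] = {1 / 16.f, 4 / 16.f, 6 / 16.f, 4 / 16.f, 1 / 16.f};
    const int H = static_cast<int>(src.size());
    const int W = H > 0 ? static_cast<int>(src[0].size()) : 0;
    const int outH = (H + 1) / 2;
    const int outW = (W + 1) / 2;
    Gray dst(outH, std::vector<float>(outW, 0.0f));
    for (int i = 0; i < outH; ++i) {
        for (int j = 0; j < outW; ++j) {
            float sum = 0.0f;
            float weight = 0.0f;
            for (int m = -2; m <= 2; ++m) {
                for (int n = -2; n <= 2; ++n) {
                    const int r = 2 * i + m;
                    const int c = 2 * j + n;
                    // Skip taps outside the image instead of reading out of bounds.
                    if (r < 0 || r >= H || c < 0 || c >= W) continue;
                    const float w = k[m + 2] * k[n + 2];
                    sum += w * src[r][c];
                    weight += w;
                }
            }
            dst[i][j] = sum / weight;  // the centre tap is always inside
        }
    }
    return dst;
}
```

Note that m and n run from -2 to 2 inclusive here, so the kernel index m + 2 covers 0 to 4.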

SSE optimization of Gaussian blur

I'm working on a school project where I have to optimize part of the code with SSE, but I've been stuck on one part for a few days now.
I don't see any smart way of using SSE vector instructions (inline assembler / intrinsics) in this code (it's part of a Gaussian blur algorithm). I would be glad if somebody could give me just a small hint.
for (int x = x_start; x < x_end; ++x) // vertical blur...
{
float sum = image[x + (y_start - radius - 1)*image_w];
float dif = -sum;
for (int y = y_start - 2*radius - 1; y < y_end; ++y)
{ // inner vertical Radius loop
float p = (float)image[x + (y + radius)*image_w]; // next pixel
buffer[y + radius] = p; // buffer pixel
sum += dif + fRadius*p;
dif += p; // accumulate pixel blur
if (y >= y_start)
{
float s = 0, w = 0; // border blur correction
sum -= buffer[y - radius - 1]*fRadius; // addition for fraction blur
dif += buffer[y - radius] - 2*buffer[y]; // sum up differences: +1, -2, +1
// cut off accumulated blur area of pixel beyond the border
// assume: added pixel values beyond border = value at border
p = (float)(radius - y); // top part to cut off
if (p > 0)
{
p = p*(p-1)/2 + fRadius*p;
s += buffer[0]*p;
w += p;
}
p = (float)(y + radius - image_h + 1); // bottom part to cut off
if (p > 0)
{
p = p*(p-1)/2 + fRadius*p;
s += buffer[image_h - 1]*p;
w += p;
}
new_image[x + y*image_w] = (unsigned char)((sum - s)/(weight - w)); // set blurred pixel
}
else if (y + radius >= y_start)
{
dif -= 2*buffer[y];
}
} // for y
} // for x
One more feature you can use is logical operations and masks:
for example instead of:
// process only 1
if (p > 0)
p = p*(p-1)/2 + fRadius*p;
you can write
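// processes 4 floats at once (p is a __m128)
const __m128 mask = _mm_cmpgt_ps(p, _mm_setzero_ps()); // all ones in lanes where p > 0
const __m128 p_tmp = _mm_add_ps(
    _mm_mul_ps(_mm_mul_ps(p, _mm_sub_ps(p, _mm_set1_ps(1.0f))), _mm_set1_ps(0.5f)),
    _mm_mul_ps(_mm_set1_ps(fRadius), p)); // p*(p-1)/2 + fRadius*p
p = _mm_or_ps(_mm_and_ps(mask, p_tmp), _mm_andnot_ps(mask, p)); // = (p_tmp & mask) | (p & ~mask)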
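As a self-contained sketch of the same mask-and-blend idea, assuming an x86 target with SSE: cutOffWeights is a hypothetical name, the function works on a plain float array whose length is assumed to be a multiple of 4, and it only covers the branch-free update of p, not the whole blur.

```cpp
#include <xmmintrin.h>  // SSE intrinsics

// For each float: if p > 0 then p = p*(p-1)/2 + fRadius*p, else p is unchanged.
// Assumes n is a multiple of 4; unaligned loads and stores are used.
void cutOffWeights(float* p, int n, float fRadius) {
    const __m128 zero = _mm_setzero_ps();
    const __m128 one = _mm_set1_ps(1.0f);
    const __m128 half = _mm_set1_ps(0.5f);
    const __m128 radius = _mm_set1_ps(fRadius);
    for (int i = 0; i < n; i += 4) {
        const __m128 v = _mm_loadu_ps(p + i);
        const __m128 mask = _mm_cmpgt_ps(v, zero);  // all ones where v > 0
        const __m128 tmp = _mm_add_ps(
            _mm_mul_ps(_mm_mul_ps(v, _mm_sub_ps(v, one)), half),
            _mm_mul_ps(radius, v));
        // Take tmp in the masked lanes and the old value everywhere else.
        const __m128 result =
            _mm_or_ps(_mm_and_ps(mask, tmp), _mm_andnot_ps(mask, v));
        _mm_storeu_ps(p + i, result);
    }
}
```

Both sides of the branch are computed for every lane and the mask picks the result, which is the usual trade-off when replacing an if with SIMD.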
I can also recommend using a special library that overloads operators for the vector instructions. For example: http://code.compeng.uni-frankfurt.de/projects/vc
The dif variable makes the iterations of the inner loop dependent on each other, so you should try to parallelize the outer loop instead. But without operator overloading the code will become unmanageable.
Also consider rethinking the whole algorithm. The current one doesn't look parallel. Maybe you can sacrifice some precision, or accept a slightly slower scalar version?