OpenCV resize returns empty image when aspect ratio is inverted - c++

I'm using the OpenCV library in a CUDA C++ environment to resize an image obtained by GPU processing.
A crucial step of the processing involves resampling the image and INVERTING the aspect ratio.
Example problem:
Resize a 2000 x 500 image into a 500 x 2000 image using CUDA.
This is attempted by the following OpenCV command:
cv::cuda::resize(d_src,d_dst,cv::Size(500,2000),cv::INTER_CUBIC);
Where d_src and d_dst are the proper GpuMats with 2000 x 500 and 500 x 2000 size.
The resize only works up to a square target of either 2000x2000 or 500x500: the function behaves as expected as long as the aspect ratio is not inverted. I have also attempted the interpolation in two steps, either by expansion followed by reduction or the other way round:
Going from 2000x500 to 2000x2000 to 500x2000.
cv::cuda::resize(d_src,d_buffer,cv::Size(2000,2000),cv::INTER_CUBIC);
cv::cuda::resize(d_buffer,d_dst,cv::Size(500,2000),cv::INTER_CUBIC);
Going from 2000x500 to 500x500 to 500x2000.
cv::cuda::resize(d_src,d_buffer,cv::Size(500,500),cv::INTER_CUBIC);
cv::cuda::resize(d_buffer,d_dst,cv::Size(500,2000),cv::INTER_CUBIC);
Both these approaches fail and are not preferred since they consume a considerable amount of extra GPU memory.
Has anyone experienced a similar problem with this function? Could somebody help me out?
Thank you in advance

Never mind. This problem can be solved by setting the scale factor parameters to auto (0) and passing the interpolation flag in its own position:
cv::cuda::resize(d_src,d_dst,d_dst.size(),0,0,cv::INTER_CUBIC);
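
A possible reading of why the fix works: cv::cuda::resize takes its parameters in the order dsize, fx, fy, interpolation, so in the four-argument call from the question cv::INTER_CUBIC lands in the fx slot and the interpolation stays at its default. The thread does not confirm that this alone explains the empty output. The sketch below only demonstrates the argument-position effect; captureResizeArgs, ResizeArgs and the local Size struct are hypothetical stand-ins written for this illustration, not OpenCV code.

```cpp
#include <cassert>

// Stand-in enum; the numeric values mirror OpenCV's interpolation flags.
enum InterpolationFlags { INTER_NEAREST = 0, INTER_LINEAR = 1, INTER_CUBIC = 2 };

// Hypothetical stand-in for cv::Size.
struct Size {
    int width;
    int height;
};

// Records what each parameter actually received.
struct ResizeArgs {
    Size dsize;
    double fx;
    double fy;
    int interpolation;
};

// Hypothetical function with the same parameter order and defaults as
// cv::cuda::resize (src, dst and stream left out for brevity).
ResizeArgs captureResizeArgs(Size dsize, double fx = 0, double fy = 0,
                             int interpolation = INTER_LINEAR) {
    assert(dsize.width >= 0 && dsize.height >= 0);
    return ResizeArgs{dsize, fx, fy, interpolation};
}
```

Calling captureResizeArgs(Size{500, 2000}, INTER_CUBIC) compiles without complaint because an unscoped enum converts implicitly to double: fx receives 2.0 and interpolation remains INTER_LINEAR. Passing 0, 0 before the flag, as in the answer, puts INTER_CUBIC where it was intended.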

How to use buildOpticalFlowPyramid?

I'm using OpenCV 3.3.1. I want to do a semi-dense optical flow operation using cv::calcOpticalFlowPyrLK, but I've been getting a really noticeable slowdown whenever my ROI is pretty big (partly because I am letting the user decide what the winSize should be, ranging from 10 to 100). Anyway, it seems like cv::buildOpticalFlowPyramid can mitigate the slowdown by building image pyramids? I'm somewhat familiar with what image pyramids are, but in the context of this function I'm especially confused about what parameters to pass in and how they affect my call to cv::calcOpticalFlowPyrLK. With that in mind, I have this set of questions:
The output, according to the documentation, is an OutputArrayOfArrays, which I take it can be a vector of cv::Mat objects. If so, what do I pass in to cv::calcOpticalFlowPyrLK for prevImg and nextImg (assuming that I need to make image pyramids for both)?
According to the docs for cv::buildOpticalFlowPyramid, you need to pass in a winSize parameter in order to calculate required padding for pyramid levels. If so, do you pass in the same winSize value when you eventually call cv::calcOpticalFlowPyrLK?
What exactly are the arguments for pyrBorder and derivBorder doing?
Lastly, and apologies if it sounds newbish, but what is the purpose of this function? I always assumed that cv::calcOpticalFlowPyrLK internally builds the image pyramids. Is it just to speed up the optical flow operation?
I hope my questions were clear, I'm still very new to OpenCV, and computer vision, but this topic is very interesting.
Thank you for your time.
EDIT:
I used the function to see if my guess was correct, so far it has worked, but I've seen no noticeable speed up. Below is how I used it:
// Building pyramids
int maxLvl = 3;
maxLvl = cv::buildOpticalFlowPyramid(imgPrev, imPyr1, cv::Size(searchSize, searchSize), maxLvl, true);
maxLvl = cv::buildOpticalFlowPyramid(tmpImg, imPyr2, cv::Size(searchSize, searchSize), maxLvl, true);
// LK optical flow call
cv::calcOpticalFlowPyrLK(imPyr1, imPyr2, currentPoints, nextPts, status, err,
cv::Size(searchSize, searchSize), maxLvl, termCrit, 0, 0.00001);
So now I'm wondering what's the purpose of preparing the image pyramids if calcOpticalFlowPyrLK does it internally?
So the point of your question is that you are trying to improve the speed of optical flow tracking by tuning your input parameters.
If you want a quick and dirty answer, here it is:
KLT (OpenCV's calcOpticalFlowPyrLK) defines a residual function that is a sum over the points inside the search window.
The main purpose is to find the vector that minimizes this residual function.
So if you increase the search window size (winSize), it becomes more difficult to find that minimum.
If you really want to dig into this, please read the original paper.
See section 2.4:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.185.585&rep=rep1&type=pdf
I took it from the official documentation:
https://docs.opencv.org/2.4/modules/video/doc/motion_analysis_and_object_tracking.html#bouguet00
Hope that helps.

The meaning of sigma_s and sigma_r in detailEnhance function on OpenCV

The detailEnhance function provided by OpenCV has the parameters InputArray, OutputArray, sigma_s and sigma_r. What do sigma_s and sigma_r mean, and what are they used for?
Here is the source: http://docs.opencv.org/3.0-beta/modules/photo/doc/npr.html#detailenhance
Thank you in advance.
sigma_s controls how much the image is smoothed - the larger its value, the more smoothed the image gets, but it's also slower to compute.
sigma_r is important if you want to preserve edges while smoothing the image. A small sigma_r means that only very similar colors are averaged (i.e. smoothed), while colors that differ a lot stay intact.
See also: https://www.learnopencv.com/non-photorealistic-rendering-using-opencv-python-c/
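
To make the two roles concrete, here is a minimal 1-D sketch that uses bilateral-style weights. This is an illustration only and is not the filter that detailEnhance runs internally; edgeAwareSmooth1D is a hypothetical helper, and in this sketch sigma_r is expressed in the same units as the signal values.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Smooth a 1-D signal with weights that fall off with spatial distance
// (controlled by sigma_s) and with difference in value (controlled by sigma_r).
std::vector<double> edgeAwareSmooth1D(const std::vector<double>& signal,
                                      double sigma_s, double sigma_r) {
    assert(sigma_s > 0.0 && sigma_r > 0.0);
    const int n = static_cast<int>(signal.size());
    const int radius = static_cast<int>(std::ceil(3.0 * sigma_s));
    std::vector<double> out(signal.size(), 0.0);
    for (int i = 0; i < n; ++i) {
        double acc = 0.0;
        double weightSum = 0.0;
        const int lo = std::max(0, i - radius);
        const int hi = std::min(n - 1, i + radius);
        for (int j = lo; j <= hi; ++j) {
            const double ds = static_cast<double>(j - i);  // spatial distance
            const double dr = signal[j] - signal[i];       // value difference
            const double w = std::exp(-(ds * ds) / (2.0 * sigma_s * sigma_s)
                                      - (dr * dr) / (2.0 * sigma_r * sigma_r));
            acc += w * signal[j];
            weightSum += w;
        }
        out[i] = acc / weightSum;  // weightSum >= 1 because j == i contributes 1
    }
    return out;
}
```

On a step signal (ten zeros followed by ten ones), a small sigma_r such as 0.05 leaves the sample next to the edge essentially at 0, while a large sigma_r such as 10 pulls it well toward the other side. A larger sigma_s widens the neighborhood, which is also why it costs more time in this sketch.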

How to filter a single column mat with Gaussian in OpenCV

I have a Mat with only one column and 1600 rows. I want to filter it using a Gaussian.
I tried the following:
Mat AFilt=Mat(palm_contour.size(),1,CV_32F);
GaussianBlur(A,AFilt,cv::Size(20,1),3);
But I get the exact same values in AFilt (the filtered mat) and A. It looks like GaussianBlur has done nothing.
What's the problem here? How can I smooth a single-column mat with a Gaussian kernel?
I read about BaseColumnFilter, but haven't seen any usage examples, so I'm not sure how to use it.
Any help given will be greatly appreciated as I don't have a clue.
I'm working with OpenCV 2.4.5 on Windows 8 using Visual Studio 2012.
Thanks
Gil.
You have a single column, but you are specifying a large width for the Gaussian instead of a large height! OpenCV uses row,col or x,y notation depending on the context. As a general rule, Point and Size behave like x,y, while separate parameter values behave like row,col.
The kernel size should also be odd. If you specify the kernel size you can set sigma to zero to let OpenCV compute a suitable sigma value.
To conclude, this should work better:
GaussianBlur(A,AFilt,cv::Size(1,21),0);
The documentation of GaussianBlur says the kernel size must be odd. I would try using an odd-sized kernel and see if that makes any difference.
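
For readers who want to see what a vertical Gaussian of odd size does to a single column, here is a minimal standard-library sketch. gaussianKernel1D and smoothColumn are hypothetical helpers, not OpenCV functions. When sigma is zero or negative the sketch derives it from the kernel size with the formula OpenCV documents for getGaussianKernel (sigma = 0.3*((ksize-1)*0.5 - 1) + 0.8, which gives 3.5 for ksize = 21). The borders are handled by repeating the edge value, which is a simplification and differs from GaussianBlur's default border mode.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Normalized 1-D Gaussian kernel of odd size ksize.
// sigma <= 0 means: derive sigma from ksize.
std::vector<double> gaussianKernel1D(int ksize, double sigma) {
    assert(ksize > 0 && ksize % 2 == 1);
    if (sigma <= 0.0) {
        sigma = 0.3 * ((ksize - 1) * 0.5 - 1.0) + 0.8;
    }
    std::vector<double> kernel(ksize);
    const int radius = ksize / 2;
    double sum = 0.0;
    for (int i = 0; i < ksize; ++i) {
        const double d = static_cast<double>(i - radius);
        kernel[i] = std::exp(-(d * d) / (2.0 * sigma * sigma));
        sum += kernel[i];
    }
    for (double& v : kernel) {
        v /= sum;  // make the weights sum to 1
    }
    return kernel;
}

// Smooth a single column along its length; edge values are repeated
// past the ends (an assumption of this sketch).
std::vector<double> smoothColumn(const std::vector<double>& column,
                                 int ksize, double sigma) {
    const std::vector<double> kernel = gaussianKernel1D(ksize, sigma);
    const int radius = ksize / 2;
    const int n = static_cast<int>(column.size());
    std::vector<double> out(column.size(), 0.0);
    for (int i = 0; i < n; ++i) {
        double acc = 0.0;
        for (int k = -radius; k <= radius; ++k) {
            const int j = std::min(std::max(i + k, 0), n - 1);
            acc += kernel[k + radius] * column[j];
        }
        out[i] = acc;
    }
    return out;
}
```

Calling smoothColumn(values, 21, 0.0) on the 1600 samples corresponds to the kernel orientation in cv::Size(1,21): the 21 taps run down the column, which is the direction in which neighboring samples actually exist.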

SuperResolution nextFrame bug

In the super-resolution sample (gpu/super_resolution.cpp), built with the vc11 compiler, the following line:
//Ptr<SuperResolution> superRes;
superRes->nextFrame(result);
results in the following error (tried with multiple test videos):
http://i.imgbox.com/abwNaL3z.jpg
and if I change the optical flow method to simple, it takes forever to run (I stopped it after 30 min on an i7 2600k).
Any idea?
The BTV SuperResolution algorithm is oriented toward small input videos, and it uses a lot of memory for its inner buffers. Your video has a large resolution [768 x 576] and you upscale it by a factor of 4. Try to reduce the scale factor, the temporal radius or the input resolution (for example, upscale only a part of the frame).
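
A rough back-of-the-envelope sketch of why the scale factor matters so much: the size of one upscaled frame grows with the square of the factor. upscaledFrameBytes is a hypothetical helper; the 3 channels and 4 bytes per channel used in the example are assumptions, and the thread does not say how many such buffers the algorithm keeps at once.

```cpp
#include <cassert>
#include <cstdint>

// Bytes needed for one frame after upscaling both dimensions by `scale`.
std::uint64_t upscaledFrameBytes(int width, int height, int scale,
                                 int channels, int bytesPerChannel) {
    assert(width > 0 && height > 0 && scale > 0);
    assert(channels > 0 && bytesPerChannel > 0);
    const std::uint64_t outWidth = static_cast<std::uint64_t>(width) * scale;
    const std::uint64_t outHeight = static_cast<std::uint64_t>(height) * scale;
    return outWidth * outHeight * channels * bytesPerChannel;
}
```

Under those assumptions, upscaledFrameBytes(768, 576, 4, 3, 4) gives 84,934,656 bytes (about 81 MiB) for a single 3072 x 2304 buffer, while a factor of 2 gives 21,233,664 bytes, a quarter of that.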

Fast/Efficient Pixel Access in Magick++

As an educational exercise for myself I'm writing an application that can average a bunch of images. This is often used in astrophotography to reduce noise.
The library I'm using is Magick++ and I've succeeded in actually writing the application. But, unfortunately, it's slow. This is the code I'm using:
for(row=0;row<rows;row++)
{
    for(column=0;column<columns;column++)
    {
        red.clear(); blue.clear(); green.clear();
        for(i=1;i<10;i++)
        {
            ColorRGB rgb(image[i].pixelColor(column,row));
            red.push_back(rgb.red());
            green.push_back(rgb.green());
            blue.push_back(rgb.blue());
        }
        redVal = avg(red);
        greenVal = avg(green);
        blueVal = avg(blue);
        redVal = redVal*MaxRGB; greenVal = greenVal*MaxRGB; blueVal = blueVal*MaxRGB;
        Color newRGB(redVal,greenVal,blueVal);
        stackedImage.pixelColor(column,row,newRGB);
    }
}
The code averages 10 images by going through each pixel and adding each channel's pixel intensity into a double vector. The function avg then takes the vector as a parameter and averages the result. This average is then used at the corresponding pixel in stackedImage, which is the resultant image. It works just fine but, as I mentioned, I'm not happy with the speed. It takes 2 minutes and 30 seconds on a Core i5 machine. The images are 8-megapixel, 16-bit TIFFs. I understand that it's a lot of data, but I have seen it done faster in other applications.
Is it my loop that's slow, or is pixelColor(x,y) a slow way to access pixels in an image? Is there a faster way?
Why use vectors/arrays at all?
Why not
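double red=0.0, blue=0.0, green=0.0;
int count=0; // number of images actually summed
for(i=1;i<10;i++)
{
    ColorRGB rgb(image[i].pixelColor(column,row));
    red+=rgb.red();
    blue+=rgb.blue();
    green+=rgb.green();
    count++;
}
// Divide by the number of images summed; the loop above (i=1..9) visits 9,
// so a hard-coded 10 would not match it.
red/=count;
blue/=count;
green/=count;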
This avoids 36 function calls on vector objects per pixel.
And you may get even better performance by using a PixelCache of the whole image instead of the original Image objects. See the "Low-Level Image Pixel Access" section of the online Magick++ documentation for Image.
Then the inner loop becomes
PixelPacket* pix = cache[i]+row*columns+column;
red+= pix->red;
blue+= pix->blue;
green+= pix->green;
Now you have also removed 10 calls to pixelColor, 10 ColorRGB constructors, and 30 accessor functions per pixel.
Note: this is all theory; I haven't tested any of it.
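
The running-sum idea can be sketched without Magick++ at all, on images held as flat row-major buffers. Rgb16 and averageImages are hypothetical names invented for this illustration, not Magick++ types or calls; the sketch assumes all images have the same number of pixels and rounds each channel average to the nearest integer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a 16-bit RGB pixel (not a Magick++ type).
struct Rgb16 {
    std::uint16_t red;
    std::uint16_t green;
    std::uint16_t blue;
};

// Average same-sized images stored as flat, row-major pixel buffers,
// using three running sums per pixel instead of per-pixel vectors.
std::vector<Rgb16> averageImages(const std::vector<std::vector<Rgb16>>& images) {
    assert(!images.empty());
    const std::size_t pixelCount = images.front().size();
    for (const auto& img : images) {
        assert(img.size() == pixelCount);
    }
    const double n = static_cast<double>(images.size());
    std::vector<Rgb16> stacked(pixelCount);
    for (std::size_t p = 0; p < pixelCount; ++p) {
        double red = 0.0, green = 0.0, blue = 0.0;
        for (const auto& img : images) {
            red += img[p].red;
            green += img[p].green;
            blue += img[p].blue;
        }
        // Adding 0.5 before the cast rounds to the nearest integer.
        stacked[p].red = static_cast<std::uint16_t>(red / n + 0.5);
        stacked[p].green = static_cast<std::uint16_t>(green / n + 0.5);
        stacked[p].blue = static_cast<std::uint16_t>(blue / n + 0.5);
    }
    return stacked;
}
```

The divisor comes from images.size() rather than a literal, so the average stays correct whatever number of frames is stacked. The inner loop touches only plain memory, which is the same effect the pixel cache suggestion aims for.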
Comments:
Why do you use vectors for red, blue and green? Using push_back can trigger reallocations and become a processing bottleneck. You could instead allocate three arrays of 10 colors just once.
Couldn't you declare rgb outside of the loops in order to relieve the stack of unnecessary constructions and destructions?
Doesn't Magick++ have a way to average images?
Just in case anyone else wants to average images to reduce noise, and doesn't feel like too much "educational exercise" ;-)
ImageMagick can do averaging of a sequence of images like this:
convert image1.tif image2.tif ... image32.tif -evaluate-sequence mean result.tif
You can also do median filtering and others by changing the word mean in the above command to whatever you want, e.g.:
convert image1.tif image2.tif ... image32.tif -evaluate-sequence median result.tif
You can get a list of the available operations with:
identify -list evaluate
Output
Abs
Add
AddModulus
And
Cos
Cosine
Divide
Exp
Exponential
GaussianNoise
ImpulseNoise
LaplacianNoise
LeftShift
Log
Max
Mean
Median
Min
MultiplicativeNoise
Multiply
Or
PoissonNoise
Pow
RightShift
RMS
RootMeanSquare
Set
Sin
Sine
Subtract
Sum
Threshold
ThresholdBlack
ThresholdWhite
UniformNoise
Xor