Understanding OpenCV image smoothing - c++

This question is about this tutorial http://docs.opencv.org/doc/tutorials/imgproc/gausian_median_blur_bilateral_filter/gausian_median_blur_bilateral_filter.html#smoothing
In that code, all the smoothing methods are running inside a loop for MAX_KERNEL_LENGTH times. What is this kernel?

When smoothing, a new value is computed for each pixel, for example as an average of its closest pixels. Which pixels are used, how many of them, and how they are weighted is exactly what the kernel specifies.
The kernel is most often represented as a matrix (in this case too), centered on the pixel whose average is being calculated. In pseudo-C++ the computation looks like this:
for (int i = 0; i < src.rows; i++) {
    for (int j = 0; j < src.cols; j++) {
        dst[i][j] = 0;
        // Center the kernel on pixel (i, j) and accumulate the weighted sum
        // (border handling omitted for clarity).
        for (int kernel_i = 0; kernel_i < kernel.rows; kernel_i++) {
            for (int kernel_j = 0; kernel_j < kernel.cols; kernel_j++) {
                dst[i][j] +=
                    src[i - kernel.rows/2 + kernel_i][j - kernel.cols/2 + kernel_j] *
                    kernel[kernel_i][kernel_j];
            }
        }
    }
}
The variable MAX_KERNEL_LENGTH is simply the largest size of the matrix making up one such kernel.

MAX_KERNEL_LENGTH is defined as a constant (31) in the code. It is used to grow the kernel size from 1x1 up to 31x31, to show the effect of different kernel sizes on the different blurring algorithms used in the tutorial.
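For reference, the tutorial drives the kernel size with a loop along these lines (a sketch of the linked code; odd sizes keep the kernel centered on a pixel, hence the step of 2):

    int MAX_KERNEL_LENGTH = 31;
    for (int i = 1; i < MAX_KERNEL_LENGTH; i = i + 2)
    {
        // i x i averaging kernel; Point(-1,-1) puts the anchor at the kernel center.
        cv::blur(src, dst, cv::Size(i, i), cv::Point(-1, -1));
    }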

Related

The meaning of sigma_s and sigma_r in detailEnhance function on OpenCV

The detailEnhance function provided by OpenCV has parameters InputArray, OutputArray, sigma_s and sigma_r. What do sigma_s and sigma_r mean, and what are they used for?
Here is the source: http://docs.opencv.org/3.0-beta/modules/photo/doc/npr.html#detailenhance
Thank you in advance.
sigma_s controls how much the image is smoothed: the larger its value, the more smoothed the image gets, but it is also slower to compute.
sigma_r is important if you want to preserve edges while smoothing the image. A small sigma_r causes only very similar colors to be averaged (i.e. smoothed), while colors that differ a lot stay intact.
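A minimal usage sketch (the values shown here are just the documented defaults):

    // src is a BGR image; cv::detailEnhance is declared in <opencv2/photo.hpp>.
    // Larger sigma_s -> stronger smoothing; smaller sigma_r -> better edge preservation.
    cv::Mat dst;
    cv::detailEnhance(src, dst, 10.0f, 0.15f);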
See also: https://www.learnopencv.com/non-photorealistic-rendering-using-opencv-python-c/

Function to determine all local maxima of a histogram

Is there an OpenCV function that can give me a list of all the local maxima for a histogram? Maybe there is a function that lets me specify a minimum peak/threshold and will tell me the bins of all those local maxima above that threshold.
If not, is there a function that can sort the bins from highest (most frequent) to lowest (least frequent)? I could then grab the first 20 or so bins and I'd have my 20 biggest local maxima.
OpenCV's minMaxLoc can be used in this context with a sliding window. If the location of the maximum is on an edge of the window, ignore it; otherwise record it as a maximum. You can use something like the function below (note: this is closer to pseudocode; it has not been tested):
/**
 * Assumes a 1-channel histogram stored as a single row with histbins bins.
 */
vector<int> findMaxima(Mat histogram, int windowsize, int histbins){
    vector<int> maximas;
    int lastmaxima = -1;
    for(int i = 0; i < histbins - windowsize; i++){
        // Local variables; only maxval and maxloc are really used.
        double minval, maxval;
        Point minloc, maxloc;
        // Crop the window
        Rect window(i, 0, windowsize, 1);
        // Get the maximum inside the window
        minMaxLoc(histogram(window), &minval, &maxval, &minloc, &maxloc);
        // Check that it is not on the window border
        if(maxloc.x != 0 && maxloc.x != windowsize - 1){
            // Translate from window coordinates to the real histogram position
            int originalposition = maxloc.x + i;
            // Check that this is a new maximum, not one already recorded
            if(lastmaxima != originalposition){
                maximas.push_back(originalposition);
                lastmaxima = originalposition;
            }
        }
    }
    return maximas;
}
Of course this is a very simplistic system. You might want to use a multiscale approach with different sliding window sizes. You may also need to apply Gaussian smoothing, depending on your data. Another approach could be to run this with a small window size like 3 or 4 (you need a minimum of 3), and then use something else for non-maxima suppression.
For your approach in which you suggested
Maybe there is a function that lets me specify a minimum peak/threshold and will tell me the bins of all those local maxima above that threshold.
You could simply perform a threshold before finding the maxima with the above function.
threshold(hist, res, ...additional parameters...);
vector<int> maximas = findMaxima(res, ...other parameters...);
AFAIK OpenCV doesn't have such functionality, but it is possible to implement something similar yourself.
In order to sort histogram bins you can use sortIdx, but as a result you will obtain a list of the largest bins, which is different from local maxima (those should be "surrounded" by smaller values).
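For example (assuming hist is the single-channel CV_32F output of calcHist):

    cv::Mat idx;
    // Bin indices sorted from most frequent to least frequent, returned as one row.
    cv::sortIdx(hist.reshape(1, 1), idx, cv::SORT_EVERY_ROW + cv::SORT_DESCENDING);
    int biggestBin = idx.at<int>(0);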
To obtain local maxima you can compare each bin with its neighbors (2 neighbors in the 1D case). A bin should be larger than its neighbors by some margin to be considered a local maximum (a minimal sketch follows below).
Depending on the size of the bins, you may want to filter the histogram before this step (for example convolve it with a Gaussian kernel), since otherwise you'd obtain too many of these maxima, especially for small bin sizes. If you've used a Gaussian kernel, its sigma is related to the size of the neighborhood in which the detected local maxima are "global".
Once you detect those points, you may want to perform non-maximal suppression, to replace groups of points that lie very close together with a single point. A simple strategy for that: sort the maxima according to some criterion (for example the difference with neighbors), take the strongest one and remove all points in its neighborhood (whose size can be related to the Gaussian kernel sigma), take the next remaining maximum and again remove points in its neighborhood, and so on until you run out of points or fall below some meaningful difference value.
Finally, you may want to sort remaining candidate points by their absolute values (to get "largest" local maxima), or by their differences with neighbors (to get "sharpest" ones).
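A minimal sketch of that bin-vs-neighbors comparison (margin is a hypothetical tuning parameter standing in for the "some margin" above):

    // Return the bins that exceed both direct neighbors by at least 'margin'.
    // Assumes hist is a 1-D CV_32F histogram as produced by cv::calcHist.
    std::vector<int> localMaxima(const cv::Mat& hist, float margin)
    {
        std::vector<int> peaks;
        for (int i = 1; i < (int)hist.total() - 1; i++)
        {
            float v = hist.at<float>(i);
            if (v >= hist.at<float>(i - 1) + margin &&
                v >= hist.at<float>(i + 1) + margin)
                peaks.push_back(i);
        }
        return peaks;
    }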
You may try another approach. We can use this definition of a local maximum to implement a simpler algorithm: just move a sliding window of size S along the histogram and pick the maximum in each position. This will have some problems:
- in locations with a prominent maximum, multiple window positions will generate points that correspond to the same maximum (can be fixed with non-maximum suppression),
- in locations with no or little variation it will return semi-random maxima (can be fixed with a threshold on the variance in the window, or on the difference between the maximum and its neighborhood),
- in regions where the histogram is monotonic it will return the largest value (which is not necessarily a maximum).
Once you handle all the "special cases", those two approaches would be quite similar, I believe.
Another thing to implement may be a "multi-scale" approach, which can be considered an extension of those two. Basically it boils down to detecting local maxima for different neighborhood sizes, and then storing them all along with the corresponding neighborhood size, which can be helpful for some purposes.
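As a rough sketch of that multi-scale idea, one could reuse the findMaxima function from the first answer with several window sizes (the sizes here are arbitrary):

    // Collect (window size, bin index) pairs across several scales.
    std::vector<std::pair<int, int> > allMaxima;
    for (int w : {3, 5, 9, 17})
        for (int bin : findMaxima(histogram, w, histbins))
            allMaxima.push_back(std::make_pair(w, bin));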
As you can see, this is quite a vague guide, and there's a reason for that: the type and number of local maxima you want to get will most likely depend on the problem you have in mind. There's no hard and fast rule for deciding whether a point should be considered a local maximum, so you should probably start with some simple approach and then refine it for your specific case.

OpenCV: Understanding Kernel

My book says this about the Image Kernel concept in OpenCV
When a computation is done over a pixel neighborhood, it is common to represent this with a kernel matrix. This kernel describes how the pixels involved in the computation are combined in order to obtain the desired result.
In image blur techniques, we use the kernel size.
cv::GaussianBlur(inputImage, outputImage, Size(1,1), 0, 0);
So, if I say the kernel size is Size(1,1) does that mean the kernel got only 1 pixel?
Please have a look at the following image
Here, what's the kernel size? Size(3,3)? If I say Size(1,1) for this image, does that mean the kernel has only 1 pixel, and its value is 0 (the first value in the image)?
The kernel size in the example image you gave is 3-by-3 (Size(3,3)), yes. A kernel size of 1-by-1 is valid, although it wouldn't be very interesting.
The generic name for the operation being performed by GaussianBlur is a convolution.
The GaussianBlur function is creating a Gaussian kernel, which is basically a matrix that represents how you should combine a window of n-by-n pixels to get a single pixel value (using a Gaussian-shaped blurring pattern in this case).
A kernel of size 1-by-1 can't do anything other than scalar multiplication of an image; that is, convolution by the 1-by-1 matrix [c] is just c * inputImage.
Typically, you'll want to choose an n-by-n Gaussian kernel that satisfies:
- a spread of the Gaussian (i.e. standard deviation or variance) such that it blurs by the amount you want (a larger number means more blurring; a smaller number means less blurring)
- an n sufficiently large as to not truncate the Gaussian too close to the mode
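For example (the numbers are illustrative, not from the question; a common rule of thumb is to pick n around 6 times sigma so the tails aren't cut off):

    // 7x7 kernel with sigma = 1.0 in both x and y.
    cv::GaussianBlur(inputImage, outputImage, cv::Size(7, 7), 1.0, 1.0);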
Links:
Convolution (Wikipedia)
Gaussian blur (Wikipedia)
this section in particular
The image you posted is a 3x3 kernel, which would be specified by cv::Size(3,3). You are correct in saying that cv::Size(1,1) corresponds to a single pixel, but saying "cv::Size(1,1)" in reference to the image is not meaningful. A 1x1 kernel would simply have the value [1].
This image is a kernel and its size is 3x3. Kernels are applied to an image by multiplying the corresponding pixel values and taking the sum of the 9 results. This is called convolution / filtering in the literature. You can look at the following resources for more information:
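As an illustration (not from the book), applying an arbitrary 3x3 kernel in OpenCV can be done with filter2D:

    // A 3x3 averaging kernel: each result pixel is the mean of its 3x3 neighborhood.
    cv::Mat kernel = cv::Mat::ones(3, 3, CV_32F) / 9.0f;
    cv::filter2D(inputImage, outputImage, -1 /* same depth as input */, kernel);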
http://en.wikipedia.org/wiki/Kernel_(image_processing)
http://homepages.inf.ed.ac.uk/rbf/HIPR2/filtops.htm
http://www.cse.usf.edu/~r1k/MachineVisionBook/MachineVision.files/MachineVision_Chapter4.pdf

Detect clusters of circular objects by iterative adaptive thresholding and shape analysis

I have been developing an application to count circular objects such as bacterial colonies from pictures.
What makes it easy is the fact that the objects are generally well distinct from the background.
However, a few difficulties make the analysis tricky:
- The background presents gradual as well as rapid intensity changes.
- At the edges of the container, the objects are elliptic rather than circular.
- The edges of the objects are sometimes rather fuzzy.
- The objects cluster.
- The objects can be very small (6 px in diameter).
Ultimately, the algorithms will be used (via a GUI) by people who do not have a deep understanding of image analysis, so the parameters must be intuitive and very few.
The problem has been addressed many times in the scientific literature and "solved", for instance using circular Hough transforms or watershed approaches, but I have never been satisfied by the results.
One simple approach that has been described is to get the foreground by adaptive thresholding and to split the clustered objects (as I described in this post) using the distance transform.
I have successfully implemented this method, but it could not always deal with sudden changes in intensity. Also, I have been asked by peers to come up with a more "novel" approach.
I was therefore looking for a new method to extract the foreground, and investigated other thresholding/blob detection methods.
I tried MSERs but found out that they were not very robust and quite slow in my case.
I eventually came up with an algorithm that, so far, gives me excellent results:
1. I split the three channels of my image and reduce their noise (blur/median blur). Then, for each channel:
2. I apply a manual implementation of the first step of adaptive thresholding by calculating the absolute difference between the original channel and a convolved (by a large-kernel blur) version of it. Then, for all the relevant values of threshold:
3. I apply a threshold on the result of 2)
4. find contours
5. validate or invalidate contours on the grounds of their shape (size, area, convexity...)
6. only the valid continuous regions (i.e. delimited by contours) are then redrawn in an accumulator (1 accumulator per channel)
7. After accumulating continuous regions over the values of threshold, I end up with a map of "scores of regions", where the regions with the highest intensity are those that fulfilled the morphology filter criteria most often. The three maps (one per channel) are then converted to grey-scale and thresholded (the threshold is controlled by the user).
Just to show you the kind of image I have to work with:
This picture shows parts of 3 sample images at the top and the result of my algorithm (blue = foreground) on the respective parts at the bottom.
Here is my C++ implementation of steps 3-7:
/*
 * cv::Mat dst[3] is the result of the absolute difference between the original and the convolved channel.
 * MCF(std::vector<cv::Point>, int, int) is a filter function that returns a positive int only if the input contour is valid.
 */
/* Allocate 3 matrices (1 per channel) */
cv::Mat accu[3];
/* We define the maximal threshold to be tried as half of the absolute maximal value in each channel */
int maxBGR[3];
for(unsigned int i=0; i<3; i++){
    double min, max;
    cv::minMaxLoc(dst[i], &min, &max);
    maxBGR[i] = max/2;
    /* In addition, we fill the accumulators with zeros */
    accu[i] = cv::Mat(compos[0].rows, compos[0].cols, CV_8U, cv::Scalar(0));
}
/* These loops are intended to be multithreaded using
   #pragma omp parallel for collapse(2) schedule(dynamic)
   For each channel: */
for(unsigned int i=0; i<3; i++){
    /* For each value of threshold (m_step can be > 1 in order to save time) */
    for(int j=0; j<maxBGR[i]; j += m_step){
        /* Temporary matrix */
        cv::Mat tmp;
        std::vector<std::vector<cv::Point> > contours;
        /* Threshold dst by j */
        cv::threshold(dst[i], tmp, j, 255, cv::THRESH_BINARY);
        /* Find continuous regions */
        cv::findContours(tmp, contours, CV_RETR_LIST, CV_CHAIN_APPROX_TC89_L1);
        if(contours.size() > 0){
            /* Test each contour */
            for(unsigned int k=0; k<contours.size(); k++){
                int valid = MCF(contours[k], m_minRad, m_maxRad);
                if(valid > 0){
                    /* I found that redrawing was very much faster if the given contour was copied into a smaller container.
                     * I do not really understand why, though. For instance,
                     *   cv::drawContours(miniTmp, contours, k, cv::Scalar(1), -1, 8, cv::noArray(), INT_MAX, cv::Point(-rect.x,-rect.y));
                     * is slower, especially if contours is very long.
                     */
                    std::vector<std::vector<cv::Point> > tpv(1);
                    std::copy(contours.begin()+k, contours.begin()+k+1, tpv.begin());
                    /* We make an ROI here */
                    cv::Rect rect = cv::boundingRect(tpv[0]);
                    cv::Mat miniTmp(rect.height, rect.width, CV_8U, cv::Scalar(0));
                    cv::drawContours(miniTmp, tpv, 0, cv::Scalar(1), -1, 8, cv::noArray(), INT_MAX, cv::Point(-rect.x,-rect.y));
                    /* Valid regions vote into the per-channel accumulator */
                    accu[i](rect) = miniTmp + accu[i](rect);
                }
            }
        }
    }
}
/* Make the global score map */
cv::merge(accu, 3, scoreMap);
/* Conditional noise removal */
if(m_minRad > 2)
    cv::medianBlur(scoreMap, scoreMap, 3);
cv::cvtColor(scoreMap, scoreMap, CV_BGR2GRAY);
I have two questions:
What is the name of such a foreground extraction approach, and do you see any reason why it could be improper to use in this case?
Since recursively finding and drawing contours is quite intensive, I would like to make my algorithm faster. Can you point me to any way to achieve this goal?
Thank you very much for your help.
Several years ago I wrote an application that detects cells in a microscope image. The code is written in Matlab, and I think now that it is more complicated than it should be (it was my first CV project), so I will only outline the tricks that will actually be helpful for you. By the way, it was deadly slow, but it was really good at separating large groups of twin cells.
I defined a metric by which to evaluate the chance that a given point is the center of a cell:
- Luminosity decreases in a circular pattern around it
- The variance of the texture luminosity follows a given pattern
- a cell will not cover more than % of a neighboring cell
With it, I started to iteratively find the best cell, mark it as found, then look for the next one. Because such a search is expensive, I employed genetic algorithms to search faster in my feature space.
Some results are given below:

Fast/Efficent Pixel Access in Magick++

As an educational exercise I'm writing an application that can average a bunch of images. This is often used in astrophotography to reduce noise.
The library I'm using is Magick++ and I've succeeded in actually writing the application. But, unfortunately, it's slow. This is the code I'm using:
for(row=0; row<rows; row++)
{
    for(column=0; column<columns; column++)
    {
        red.clear(); blue.clear(); green.clear();
        // Collect this pixel's channel values from all 10 images (indices 0..9).
        for(i=0; i<10; i++)
        {
            ColorRGB rgb(image[i].pixelColor(column, row));
            red.push_back(rgb.red());
            green.push_back(rgb.green());
            blue.push_back(rgb.blue());
        }
        redVal = avg(red);
        greenVal = avg(green);
        blueVal = avg(blue);
        redVal = redVal*MaxRGB; greenVal = greenVal*MaxRGB; blueVal = blueVal*MaxRGB;
        Color newRGB(redVal, greenVal, blueVal);
        stackedImage.pixelColor(column, row, newRGB);
    }
}
The code averages 10 images by going through each pixel and adding each channel's pixel intensity into a double vector. The function avg then takes a vector as a parameter and averages its contents. This average is then used at the corresponding pixel in stackedImage, which is the resultant image. It works just fine, but as I mentioned, I'm not happy with the speed: it takes 2 minutes and 30 seconds on a Core i5 machine. The images are 8-megapixel, 16-bit TIFFs. I understand that it's a lot of data, but I have seen it done faster in other applications.
Is it my loop that's slow, or is pixelColor(x,y) a slow way to access pixels in an image? Is there a faster way?
Why use vectors/arrays at all?
Why not
double red=0.0, blue=0.0, green=0.0;
// Sum all 10 images (indices 0..9), then divide by the count.
for(i=0; i<10; i++)
{
    ColorRGB rgb(image[i].pixelColor(column, row));
    red   += rgb.red();
    blue  += rgb.blue();
    green += rgb.green();
}
red   /= 10;
blue  /= 10;
green /= 10;
This avoids 36 function calls on vector objects per pixel.
And you may get even better performance by using a PixelCache of the whole image instead of the original Image objects. See the "Low-Level Image Pixel Access" section of the online Magick++ documentation for Image
Then the inner loop becomes
PixelPacket* pix = cache[i]+row*columns+column;
red+= pix->red;
blue+= pix->blue;
green+= pix->green;
Now you have also removed 10 calls to pixelColor, 10 ColorRGB constructors, and 30 accessor functions per pixel.
Note: this is all theory; I haven't tested any of it.
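For completeness, setting up those caches might look roughly like this (a sketch assuming the ImageMagick 6 Magick++ API, where Image::getConstPixels returns a read-only pixel cache for a region; with const caches, the pix pointer above would also be const):

    // Fetch one read-only pixel cache per source image, once, before the pixel loops.
    const Magick::PixelPacket* cache[10];
    for (int i = 0; i < 10; i++)
        cache[i] = image[i].getConstPixels(0, 0, columns, rows);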
Comments:
Why do you use vectors for red, blue and green? push_back can trigger reallocations, which can bottleneck the processing. You could instead allocate three arrays of 10 colors just once.
Couldn't you declare rgb outside of the loops, to spare the stack unnecessary constructions and destructions?
Doesn't Magick++ have a way to average images?
Just in case anyone else wants to average images to reduce noise, and doesn't feel like too much "educational exercise" ;-)
ImageMagick can do averaging of a sequence of images like this:
convert image1.tif image2.tif ... image32.tif -evaluate-sequence mean result.tif
You can also do median filtering and others by changing the word mean in the above command to whatever you want, e.g.:
convert image1.tif image2.tif ... image32.tif -evaluate-sequence median result.tif
You can get a list of the available operations with:
identify -list evaluate
Output
Abs
Add
AddModulus
And
Cos
Cosine
Divide
Exp
Exponential
GaussianNoise
ImpulseNoise
LaplacianNoise
LeftShift
Log
Max
Mean
Median
Min
MultiplicativeNoise
Multiply
Or
PoissonNoise
Pow
RightShift
RMS
RootMeanSquare
Set
Sin
Sine
Subtract
Sum
Threshold
ThresholdBlack
ThresholdWhite
UniformNoise
Xor