OpenCV AdaptiveThreshold versus Otsu Threshold, ROI - C++

I tried both methods, but adaptive thresholding seems to give a better result. I used
cvSmooth(temp, dst, CV_GAUSSIAN, 9, 9, 0);
on the original image before applying the threshold.
Is there anything I can tweak in the Otsu method to make the result as good as adaptive thresholding? And one more thing: there is some unwanted fingerprint residue on the sides; any idea how I can get rid of it?
I read in a journal that by comparing the percentage of white pixels inside a self-defined square, I can find the ROI. However, this method requires a threshold value, which can be found using the Otsu method, but I'm not sure how to obtain one from adaptive thresholding.
cvAdaptiveThreshold(temp, dst, 255, CV_ADAPTIVE_THRESH_MEAN_C, CV_THRESH_BINARY, 13, 1);
Result:
cvThreshold(temp, dst, 0, 255, CV_THRESH_BINARY | CV_THRESH_OTSU);

To get rid of the unwanted background, you can do a simple masking operation. The Otsu threshold function provides the threshold value that separates the foreground from the background. Use that threshold value to create a binary mask: iterate through the entire input image, check whether the current pixel value is greater than the threshold, and set the mask to 1 if it is, or 0 if it is not.
Then you can apply the binary mask to the original image with a simple element-wise multiplication, or a bitwise AND operation, to remove the background.
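A rough sketch of that idea in Python (the input file name is hypothetical; the C API shown above works the same way):
import cv2
import numpy as np

img = cv2.imread('fingerprint.png', cv2.IMREAD_GRAYSCALE)  # hypothetical input
# Otsu returns the threshold it computed along with the binarized image
t, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
# 1 where the pixel is above the Otsu threshold (foreground), 0 elsewhere
mask = (img > t).astype(np.uint8)
# element-wise multiplication keeps the foreground and zeroes the background
masked = img * mask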

Try dividing the image into ROIs, applying Otsu to each one individually, and then merging the results back together. The dividing strategy can be static or dynamic, depending on how much the illumination varies.
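For example, a minimal per-tile Otsu sketch in Python with a static division (the tile size is an assumption to tune):
import cv2
import numpy as np

img = cv2.imread('fingerprint.png', cv2.IMREAD_GRAYSCALE)  # hypothetical input
tile = 64  # assumed tile size
out = np.zeros_like(img)
for y in range(0, img.shape[0], tile):
    for x in range(0, img.shape[1], tile):
        roi = img[y:y + tile, x:x + tile]
        # Otsu picks a separate threshold per tile, adapting to local illumination
        _, out[y:y + tile, x:x + tile] = cv2.threshold(
            roi, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)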

Related

Thresholding an image in OpenCV using another image matrix of pixel standard deviations as threshold values

I have two videos, one of a background and one of that same background with a person sitting in the frame. I generated two images from the video of just the background: the mean image of the background video (by accumulating the frames and dividing by the number of frames) and an image of standard deviations from the mean per pixel, taken over the frames. In other words, I have two images representing the Gaussian distribution of the background video. Now, I want to threshold an image, not using one fixed threshold value for all pixels, but using the standard deviations from the image (a different threshold per pixel). However, as far as I understand, OpenCV's threshold() function only allows for one fixed threshold. Are there functions I'm missing, or is there a workaround?
The cv::Mat class provides the functionality to accomplish this.
The setTo() method takes an optional InputArray as a mask.
Assuming the following:
mean is the per-pixel background mean cv::Mat described in the question, std is your standard-deviations cv::Mat, in is the cv::Mat you want to threshold, and thresh is the factor for your standard deviations.
Using these values, the custom thresholding could be done like this:
// Per-pixel threshold: background mean plus thresh standard deviations
cv::Mat mask = mean + std * thresh;
// Set in.at<>(x,y) to 0 wherever its value is lower than mask.at<>(x,y)
in.setTo(0, in < mask);
The in < mask expression creates a new MatExpr object, a matrix which is 0 at every pixel where the predicate is false, 255 otherwise.
This is a handy way to implement custom thresholding.
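For reference, the same per-pixel test is easy to express in Python/NumPy using the mean and standard-deviation images described in the question (the file names and the factor are assumptions):
import cv2
import numpy as np

frame = cv2.imread('frame.png', cv2.IMREAD_GRAYSCALE).astype(np.float32)
mean = np.load('mean.npy')  # hypothetical per-pixel background mean
std = np.load('std.npy')    # hypothetical per-pixel background std dev
thresh = 2.5                # assumed factor for the standard deviations
# keep pixels that deviate from the background model; zero the rest
foreground = np.where(np.abs(frame - mean) > thresh * std, frame, 0)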

OpenCV - Removal of noise in image

I have an image here with a table. In the column on the right, the background is filled with noise.
How do I detect the areas with noise? I only want to apply some kind of filter to the parts with noise, because I need to do OCR on the image and any kind of filter will reduce the overall recognition accuracy.
And what kind of filter is best for removing the background noise in the image?
As said, I need to do OCR on the image.
I tried some filters/operations in OpenCV and they seem to work pretty well.
Step 1: Dilate the image -
import cv2
import numpy as np
kernel = np.ones((5, 5), np.uint8)
dilated = cv2.dilate(img, kernel, iterations=1)
As you can see, the noise is gone, but the characters are very light, so I eroded the dilated image.
Step 2: Erode the image -
kernel = np.ones((5, 5), np.uint8)
eroded = cv2.erode(dilated, kernel, iterations=1)
As you can see, the noise is gone; however, some characters in the other columns are broken. I would recommend running these operations on the noisy column only. You might want to use HoughLines to find the last column, extract that column, run dilation + erosion on it, and replace the corresponding column in the original image with the result.
Additionally, dilation followed by erosion is actually an operation called closing, which you can call directly using -
closed = cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel)
As @Ermlg suggested, medianBlur with a kernel of 3 also works wonderfully.
median = cv2.medianBlur(img, 3)
Alternative Step
As you can see, all these filters work, but it is better to apply them only to the part where the noise is. To do that, use the following:
edges = cv2.Canny(img, 50, 150, apertureSize=3)  # img is grayscale here
# the last two arguments are the minimum line length and the maximum gap between two lines
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, 100, minLineLength=1000, maxLineGap=50)
for line in lines:
    for x1, y1, x2, y2 in line:
        print(x1, y1)
This prints the start coordinates of all the lines. Take the x value that lies between 0.75 * w and w, where w is the width of the entire image. Here that gives (x1, y1) = (1896, 766).
Then you can extract this part only, like this:
extract = img[y1:h, x1:w]  # w, h are the width and height of the image
Then implement the filter (median or closing) on this extract. After removing the noise, put the filtered part back in place of the noisy part in the original image.
img[y1:h, x1:w] = median
This is straightforward in C++:
extract.copyTo(img(cv::Rect(x1, y1, w - x1, h - y1)));
Final Result with alternate method
Hope it helps!
My solution is based on thresholding to get the resulting image in four steps.
Read the image with OpenCV 3.2.0.
Apply GaussianBlur() to smooth the image, especially the region in gray.
Mask the image to change the text to white and the rest to black.
Invert the masked image to get black text on white.
The code is in Python 2.7. It can be ported to C++ easily.
import numpy as np
import cv2
import matplotlib.pyplot as plt
%matplotlib inline
# read Danish doc image
img = cv2.imread('./imagesStackoverflow/danish_invoice.png')
# apply GaussianBlur to smooth image
blur = cv2.GaussianBlur(img, (5, 3), 1)
# threshold the gray region to white (255,255,255) and set the rest to black (0,0,0)
mask = cv2.inRange(blur, (0, 0, 0), (150, 150, 150))
# invert the image to have text black-in-white
res = 255 - mask
plt.figure(1)
plt.subplot(121), plt.imshow(img[:,:,::-1]), plt.title('original')
plt.subplot(122), plt.imshow(blur[:, :, ::-1]), plt.title('blurred')
plt.figure(2)
plt.subplot(121), plt.imshow(mask, cmap='gray'), plt.title('masked')
plt.subplot(122), plt.imshow(res, cmap='gray'), plt.title('result')
plt.show()
The following are the images plotted by the code, for reference.
Here is the result image at 2197 x 3218 pixels.
As far as I know, the median filter is the best solution for reducing noise here. I would recommend using a median filter with a 3x3 window; see the function cv::medianBlur().
But be careful when using any noise filtering together with OCR: it can decrease recognition accuracy.
I would also recommend trying the pair of functions cv::erode() and cv::dilate(), but I'm not sure that they will work better than cv::medianBlur() with a 3x3 window.
I would go with a median blur (probably a 5x5 kernel).
If you are planning to apply OCR to the image, I would advise the following:
Filter the image using a median filter.
Find contours in the filtered image; you will get only text contours (call them F).
Find contours in the original image (call them O).
Isolate all contours in O that intersect any contour in F (a rough sketch follows).
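One way to test the intersection in Python is to rasterize the contours into masks; a hedged sketch (the file name and threshold values are assumptions):
import cv2
import numpy as np

img = cv2.imread('document.png', cv2.IMREAD_GRAYSCALE)  # hypothetical input
median = cv2.medianBlur(img, 3)
# invert-threshold both images so the text becomes white blobs
_, bw_f = cv2.threshold(median, 150, 255, cv2.THRESH_BINARY_INV)
_, bw_o = cv2.threshold(img, 150, 255, cv2.THRESH_BINARY_INV)
# [-2] picks the contour list in both OpenCV 3.x and 4.x
F = cv2.findContours(bw_f, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)[-2]
O = cv2.findContours(bw_o, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)[-2]
# rasterize F once, then keep every contour of O that touches it
mask_f = np.zeros_like(img)
cv2.drawContours(mask_f, F, -1, 255, -1)
keep = np.zeros_like(img)
for c in O:
    single = np.zeros_like(img)
    cv2.drawContours(single, [c], -1, 255, -1)
    if cv2.countNonZero(cv2.bitwise_and(single, mask_f)) > 0:
        cv2.drawContours(keep, [c], -1, 255, -1)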
Faster solution:
Find contours in the original image.
Filter them based on size.
Blur (3x3 box)
Threshold at 127
Result:
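A minimal sketch of the faster pipeline above (the size limit and file name are assumptions):
import cv2
import numpy as np

img = cv2.imread('document.png', cv2.IMREAD_GRAYSCALE)  # hypothetical input
# invert-threshold so text and noise become white blobs
_, bw = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY_INV)
# [-2] picks the contour list in both OpenCV 3.x and 4.x
contours = cv2.findContours(bw, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)[-2]
clean = img.copy()
for c in contours:
    if cv2.contourArea(c) < 4:  # assumed noise size in pixels
        x, y, w, h = cv2.boundingRect(c)
        clean[y:y + h, x:x + w] = 255  # paint the speck white
blurred = cv2.blur(clean, (3, 3))  # 3x3 box blur
_, result = cv2.threshold(blurred, 127, 255, cv2.THRESH_BINARY)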
If you are very worried about removing pixels that could hurt your OCR detection, i.e. you want to stay as close to the original as possible without adding artifacts, then you should create a blob filter and delete any blobs smaller than n pixels or so.
I'm not going to write code, but I know this works great as I use it myself, though I don't use OpenCV (I wrote my own multithreaded blob filter for speed reasons). Sorry, I cannot share my code here; I'm just describing how to do it.
If processing time is not an issue, a very effective method in this case would be to compute all black connected components and remove those smaller than a few pixels. It would remove all the noisy dots (apart from those touching a valid component) but preserve all characters and the document structure (lines and so on).
The function to use would be connectedComponentsWithStats (beforehand you probably need to produce the negative image; the threshold function with THRESH_BINARY_INV would work in this case), drawing white rectangles where small connected components were found.
In fact, this method could be used to find characters, defined as connected components of a given minimum and maximum size, with an aspect ratio in a given range.
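A minimal sketch of this approach in Python (the area limit and threshold are assumptions to tune):
import cv2
import numpy as np

img = cv2.imread('document.png', cv2.IMREAD_GRAYSCALE)  # hypothetical input
# negative image: black components become white for labeling
_, bw = cv2.threshold(img, 150, 255, cv2.THRESH_BINARY_INV)
n, labels, stats, _ = cv2.connectedComponentsWithStats(bw, connectivity=8)
clean = img.copy()
for i in range(1, n):  # label 0 is the background
    if stats[i, cv2.CC_STAT_AREA] < 4:  # "smaller than a few pixels"
        x = stats[i, cv2.CC_STAT_LEFT]
        y = stats[i, cv2.CC_STAT_TOP]
        w = stats[i, cv2.CC_STAT_WIDTH]
        h = stats[i, cv2.CC_STAT_HEIGHT]
        clean[y:y + h, x:x + w] = 255  # white rectangle over the small component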
I had already faced the same issue, and this gave me the best results.
Convert the source image to a grayscale image, apply the fastNlMeansDenoising function, and then apply a threshold.
Like this -
fastNlMeansDenoising(gray, dst, 3.0, 7, 21);
threshold(dst, finaldst, 150, 255, THRESH_BINARY);
You can also adjust the threshold according to the noise in your background image, e.g.:
threshold(dst, finaldst, 200, 255, THRESH_BINARY);
NOTE - If your column lines get removed, you can take a mask of the column lines from the source image and apply it to the denoised result using bitwise operations like AND, OR, and XOR.
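For instance, a hedged Python sketch of that note, recovering vertical column lines with a morphological opening (the kernel height and thresholds are assumptions):
import cv2
import numpy as np

src = cv2.imread('document.png', cv2.IMREAD_GRAYSCALE)  # hypothetical input
denoised = cv2.fastNlMeansDenoising(src, None, 3.0, 7, 21)
_, denbin = cv2.threshold(denoised, 150, 255, cv2.THRESH_BINARY_INV)
_, srcbin = cv2.threshold(src, 150, 255, cv2.THRESH_BINARY_INV)
# a tall, thin kernel keeps only long vertical strokes (the column lines)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, 40))
lines = cv2.morphologyEx(srcbin, cv2.MORPH_OPEN, kernel)
# OR the line mask back into the denoised text, then restore black-on-white
result = cv2.bitwise_not(cv2.bitwise_or(denbin, lines))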
Try thresholding the image like this. Make sure your src is grayscale. Note that when CV_THRESH_OTSU is set, the value 150 is ignored and Otsu's automatically computed threshold is used instead; drop that flag if you really want to retain only the pixels between 150 and 255.
threshold(src, output, 150, 255, CV_THRESH_BINARY | CV_THRESH_OTSU);
You might want to invert the image first, as you are trying to negate the gray pixels. After the operation, invert it again to get your desired result.
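A tiny Python sketch of that invert, threshold, invert-again sequence (using a plain binary threshold at 150 as an assumption):
import cv2

src = cv2.imread('document.png', cv2.IMREAD_GRAYSCALE)  # hypothetical input
inv = cv2.bitwise_not(src)  # negate so the gray pixels flip sides
_, th = cv2.threshold(inv, 150, 255, cv2.THRESH_BINARY)
result = cv2.bitwise_not(th)  # invert again for the desired polarity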

Applying adaptive thresholding to inrange function opencv c++

I want to take a video and create a binary image from it: if a pixel is within a certain range, it is included in the binary image. In other words, I want an upper and a lower bound, like in the inRange() function, as opposed to a single cutoff point like in the threshold() function.
I also want to use adaptive thresholding to account for differences in lighting in my video. Is there a way to do this? I know there is inRange() that does the former and adaptiveThreshold() that does the latter, but I don't know if there is a way to do both.
Apply adaptiveThreshold() to the whole original image, then apply inRange() to the original image and use the result of inRange() as a mask:
adaptiveThreshold(original_image, dst_image, ...);
inRange(original_image, minArray, maxArray, mask);
// both are 0/255 binary images, so the product saturates to 255 where both are set
Mat output = dst_image.mul(mask);
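A short Python sketch of the same combination (the block size, constant, and intensity band are assumptions):
import cv2

frame = cv2.imread('frame.png', cv2.IMREAD_GRAYSCALE)  # hypothetical input
# local (adaptive) threshold over the whole image
adaptive = cv2.adaptiveThreshold(frame, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                 cv2.THRESH_BINARY, 13, 2)
# band-pass on intensity, used as a mask over the adaptive result
band = cv2.inRange(frame, 100, 200)
output = cv2.bitwise_and(adaptive, band)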

Normalization in Harris Corner Detector in Opencv C++?

I was reading the documentation on finding corners in an image using the Harris corner detector. I couldn't understand why they normalized the image after applying the Harris corner function. I am a bit confused about the idea of normalization. Can someone explain to me why we normalize an image? Also, what does convertScaleAbs() do? I am still a beginner in OpenCV, so it's kind of hard to understand from the documentation.
cornerHarris(src_gray, dst, blockSize, apertureSize, k, BORDER_DEFAULT);
/// Normalizing
normalize(dst, dst_norm, 0, 255, NORM_MINMAX, CV_32FC1, Mat()); // (I could not understand this line)
convertScaleAbs(dst_norm, dst_norm_scaled); // ???
Thanks
I recommend you get familiar with image processing fundamentals before diving into OpenCV. Having a solid notion of basic concepts like contrast, binarization, clipping, threshold, slice, histogram, etc. (just to mention a few) will make your journey through OpenCV easier.
That said, Wikipedia defines it like this: "Normalization is a process that changes the range of pixel intensity values... Normalization is sometimes called contrast stretching or histogram stretching".
To illustrate this definition I will try to give you a simple example:
Imagine you have an 8-bit grayscale image; possible pixel intensity values go from 0 to 255. If the image has "low contrast", its histogram will show a concentration of pixels around a certain value, like you can see in the following picture:
Now what normalization does is to "take" this histogram and stretch it. It applies an intensity transformation map that correlates the original minimum and maximum intensity values with a new pair of minimum and maximum values within the full range of intensities (0 to 255).
As for convertScaleAbs() the documentation says that it scales, calculates absolute values, and converts the result to 8-bit. Not much more to say about it.
Full prototype is void convertScaleAbs(InputArray src, OutputArray dst, double alpha=1, double beta=0), so used like that it will just calculate the absolute value of each element in the matrix and convert it to a valid 8-bit unsigned value.
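To see both calls at work, here is a small Python sketch with a made-up response matrix; the "by hand" line shows the min-max mapping that normalize() applies:
import cv2
import numpy as np

resp = np.array([[-3.2, 0.5], [1.8, 9.1]], dtype=np.float32)  # made-up values
norm = cv2.normalize(resp, None, 0, 255, cv2.NORM_MINMAX)
# the same mapping by hand: (x - min) * (255 - 0) / (max - min) + 0
by_hand = (resp - resp.min()) * 255.0 / (resp.max() - resp.min())
# absolute value, rounded and saturated to 8-bit unsigned
scaled = cv2.convertScaleAbs(norm)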
You're probably looking at the code of the OpenCV tutorials.
By normalizing the corner responses (which lie in some unknown interval) into the interval [0, 255] with
normalize( dst, dst_norm, 0, 255, NORM_MINMAX, CV_32FC1, Mat() );
it's easier to select a threshold (since you now know that the threshold will also be in the interval [0, 255]) to keep only the strongest responses. The range [0, 255] is also useful when, as in this case, you get the threshold value through a slider, which can take only integer values.
convertScaleAbs(dst_norm, dst_norm_scaled);
is needed only to convert dst_norm to a Mat of type CV_8U, so it can be shown correctly by imshow.

Scanning and Detecting Object Color in Image

I'm developing software that detects a boxer's punching motion. At the moment I use color-based segmentation with the inRange function, set to detect a blue minimum value and a blue maximum value. The problem is that the range is quite wide, and my camera at times picks up noise and segments objects of no interest. To improve the software I thought of scanning an image of a boxing glove and establishing the exact blue color value before further processing.
It would make sense to me to store that value in a vector and use it in the inRange function.
// My current function which takes the Minimum and Maximum values of Blue Color
Mat range_out;
inRange(blur_out, Scalar(100, 100, 100), Scalar(120, 255, 255), range_out);
So I imagine the vector would go somewhere here:
Scan the above image and compute the blue value.
Store this value in an array.
Recall the array in the inRange function.
Could someone suggest a solution to this problem or direct me to a source of information where I can look for answers?
Since you are detecting boxing gloves in motion, first use motion to separate them from the other elements in the scene: use frame differencing or optical flow to separate the gloves and other moving areas from the non-moving areas, and then try color detection inside those moving areas.
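A rough frame-differencing sketch in Python (the video name and the threshold of 25 are assumptions):
import cv2

cap = cv2.VideoCapture('boxing.avi')  # hypothetical video
ok, prev = cap.read()
prev = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    diff = cv2.absdiff(gray, prev)  # moving areas show large differences
    _, moving = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
    prev = gray
    # ...then run the color detection only inside the 'moving' mask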
Separate luminosity and chromaticity - your fixed range will not work very well in different lighting conditions. Your range is wide probably because you are trying to see "blue" both in the dark and in the light at the same time. Convert your image to HSV (or La*b*) and discard V (or L), keeping H and S (or a* and b*).
Learn a color distribution instead of a simple range - take some samples and compute a 2D color histogram on H and S (or a* and b*) for pixels on the glove. This histogram will be a model for the color distribution of your object. Then use cv2.calcBackProject to detect the pixels of interest in your scene.
Clean the result using a morphological close operation.
Important: in step 2, play a little with different quantization values (i.e., different numbers of bins).
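A sketch of these steps in Python; the sample file names and bin counts are assumptions (the bin counts are exactly the quantization values to experiment with):
import cv2
import numpy as np

glove = cv2.imread('glove_sample.png')  # hypothetical glove-only sample
scene = cv2.imread('scene.png')         # hypothetical scene frame
glove_hsv = cv2.cvtColor(glove, cv2.COLOR_BGR2HSV)
scene_hsv = cv2.cvtColor(scene, cv2.COLOR_BGR2HSV)
# 2D color model on H and S only; V (luminosity) is discarded
hist = cv2.calcHist([glove_hsv], [0, 1], None, [30, 32], [0, 180, 0, 256])
cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)
back = cv2.calcBackProject([scene_hsv], [0, 1], hist, [0, 180, 0, 256], 1)
# clean the back projection with a morphological close
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
mask = cv2.morphologyEx(back, cv2.MORPH_CLOSE, kernel)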