I am sorry if this seems like a basic question, but I am wondering whether I can use rectangular dimensions for training an OpenCV Haar cascade. I tried with square samples and the object was detected fine, but when I tried a rectangular width and height (for a license plate the aspect ratio between width and height is 2:1, and I used the same aspect ratio while training), the resulting classifier does not detect anything in the image.
nStages = 14
nPositive = 1780
minHitrate = 0.996
maxFalseAlarm = 0.2
nNegatives = 14000
width = 48
height = 24
Haar classifier type = BASIC
Boost type = gentle adaboost
The above are the parameters I have set for training the classifier. Can anyone please confirm whether I can use rectangular dimensions for positive samples or not? Also, please suggest some modifications to get the training done properly.
The sizes of the negative images used for training are around 240x161 and 420x240.
Thank you.
EDIT 1:
I am using the call as follows.
f_cascade.detectMultiScale( image, detected_objects, pyramidScale, 2, 0|CV_HAAR_SCALE_IMAGE);
The key part of this is making sure your positive samples and your training dimensions are the same. There is no reason why you won't be able to detect a rectangular object.
The key thing to remember is that traincascade runs whatever dimensions you specify over your images.
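For illustration, here is a minimal Python sketch of detection with a rectangular cascade; the file names and parameter values are placeholders, not taken from the question:
import cv2

# "plate_cascade.xml" stands for the classifier produced by traincascade.
plate_cascade = cv2.CascadeClassifier("plate_cascade.xml")

image = cv2.imread("car.jpg")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# The detector slides a 48x24 window over an image pyramid, so a rectangular
# minSize with the same 2:1 aspect ratio is perfectly valid.
plates = plate_cascade.detectMultiScale(
    gray,
    scaleFactor=1.1,
    minNeighbors=2,
    flags=cv2.CASCADE_SCALE_IMAGE,
    minSize=(48, 24),
)

for (x, y, w, h) in plates:
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)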
See here for some proof that rectangular objects should be detected just fine:
http://coding-robin.de/2013/07/22/train-your-own-opencv-haar-classifier.html
Also I wrote a tutorial about object detection if anyone gets stuck on this stuff:
http://johnallen.github.io/opencv-object-detection-tutorial/
So this should be straightforward, but I am not very familiar with OpenCV.
Can someone suggest a method to measure the distance in pixels (red line) as shown in the image below? Preferably it would have some options, like the width of the measurement (as indicated at the beginning and end of the red line) or something of the sort. This kind of measurement is very common in software like ImageJ, so I imagine it should be fairly trivial to do in OpenCV.
I would also like to take several samples across the image width.
Greets
I am using OpenCV and learning about it.
Your task is quite simple.
optional smoothing (Gaussian filter) - you have to experiment with your data to see if it helps
edge detection (transforms the image into lines representing edges) - for example cv::Canny
Hough transform to detect lines - available in OpenCV
find the two maximum values (longest lines) in the Hough transform
you will then have the equations of two straight lines, and you can use this information to calculate the distance between them
Note that with this approach the image does not have to be straight. You will have line equations which you have to manipulate in a smart way. If the two lines are parallel, there is a simple formula for the distance between them. If they are not perfectly parallel, you have to take that into account and use information about the image area to get an average distance.
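As a rough Python sketch of these steps (the file name and parameter values are guesses that would need tuning for real data):
import cv2
import numpy as np

img = cv2.imread("channel.png", cv2.IMREAD_GRAYSCALE)   # placeholder file name
blurred = cv2.GaussianBlur(img, (5, 5), 0)               # optional smoothing
edges = cv2.Canny(blurred, 50, 150)                      # edge detection

# Standard Hough transform; each detected line is returned as (rho, theta).
lines = cv2.HoughLines(edges, 1, np.pi / 180, 100)
assert lines is not None and len(lines) >= 2, "need at least two detected lines"

# The two strongest accumulator peaks should correspond to the two long edges.
(rho1, theta1), (rho2, theta2) = lines[0][0], lines[1][0]

# For (nearly) parallel lines the distance is simply the difference of their
# distances from the origin.
if abs(theta1 - theta2) < np.deg2rad(2):
    print("distance in pixels:", abs(rho1 - rho2))
else:
    print("lines are not parallel - average the distance over the image area")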
A simple way to find the width of the channel would be the following:
# img is assumed to be the thresholded image that contains the two white lines.
distance = []
h = img.shape[0]
for j in range(img.shape[1]):
    line_top = 0
    line_bottom = img.shape[0]
    found_top = False
    found_bottom = False
    for i in range(h):
        # first white pixel when scanning the column from the top
        if img[i, j, 0] > 0 and not found_top:
            line_top = i
            found_top = True
        # first white pixel when scanning the column from the bottom
        if img[h - i - 1, j, 0] > 0 and not found_bottom:
            line_bottom = h - i
            found_bottom = True
        if found_top and found_bottom:
            distance.append(line_bottom - line_top)
            break
But this would cause the distance to take into account the very small white speckles as well.
To solve this there are several options:
Preprocess the image using OpenCV's morphological transformations.
Preprocess the image using OpenCV's Gaussian filter or similar.
Update the code to use a larger window.
Another solution would be to apply opencv's findContours.
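For example, a small sketch of the first and last ideas; the kernel size, the area threshold, and the OpenCV 4 findContours signature are all assumptions:
import cv2
import numpy as np

# mask is assumed to be the binary image used by the loop above.
mask = cv2.imread("mask.png", cv2.IMREAD_GRAYSCALE)   # placeholder file name

# Option 1: morphological opening removes the small white speckles.
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
clean = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)

# Option 2: keep only contours above a minimum area, which also drops speckles.
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
clean2 = np.zeros_like(mask)
for c in contours:
    if cv2.contourArea(c) > 50:            # area threshold chosen arbitrarily
        cv2.drawContours(clean2, [c], -1, 255, thickness=cv2.FILLED)

# Either result can then be fed to the column-scanning loop above.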
I have a dataset of a single class (rectangular object) with a size of 130 images. My goal is to detect the object & draw a circle/dot/mark in the centre of the object.
Because the objects are rectangular, my idea is to get the dimensions of the predicted bounding box and take the circle/dot/mark as (width/2, height/2).
However, if I were to do transfer learning, would YOLO be a good choice to detect a single class of objects in a small dataset?
YOLO should be fine. However, it is old now; try YOLOv4 for better results.
People have tried transfer learning from Faster R-CNN to detect single objects with 300 images and it worked fine (Link). However, 130 images is a bit small. Try augmenting the images - flipping, rotating, etc. - if you get inferior results.
Apply the same augmentation to the annotations as well when doing translation, rotation, or flip augmentations. For example, in PyTorch, for segmentation, I use:
import random
import torchvision.transforms as T

if random.random() < 0.5:    # horizontal flip
    image = T.functional.hflip(image)
    mask = T.functional.hflip(mask)
if random.random() < 0.25:   # rotation
    rotation_angle = random.randrange(-10, 11)
    image = T.functional.rotate(image, angle=rotation_angle)
    mask = T.functional.rotate(mask, angle=rotation_angle)
For bounding boxes you will have to recompute the coordinates yourself; for a horizontal flip, x becomes width - x.
Augmentations where the object position does not change do not require changing the annotations, e.g. gamma/intensity transformations.
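For illustration, a small sketch of the bounding-box update for a horizontal flip, assuming a (x_min, y_min, x_max, y_max) pixel-coordinate box format:
def hflip_box(box, image_width):
    # Mirror an axis-aligned box (x_min, y_min, x_max, y_max) horizontally:
    # every x coordinate becomes image_width - x, y coordinates are unchanged.
    x_min, y_min, x_max, y_max = box
    return (image_width - x_max, y_min, image_width - x_min, y_max)

# Example: a box near the left edge of a 640-pixel-wide image.
print(hflip_box((10, 50, 110, 90), 640))   # -> (530, 50, 630, 90)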
I'm trying to add noise to an image and then denoise it, to see the difference it makes to my object detection algorithm. I developed OpenCV code in C++ for detecting some objects in an image and would like to test the robustness of the code, so I tried to add some noise and check how the detection rate changes when noise is added to the image. So, first I added some random Gaussian noise like this:
cv::Mat noise(src.size(), src.type());
cv::Scalar mean(10, 12, 34);   // per-channel mean
cv::Scalar sigma(1, 5, 50);    // per-channel standard deviation
cv::randn(noise, mean, sigma);
src += noise;
I got these images:
The original:
The noisy one:
So, is there a better noise model? And how do I denoise the image afterwards - are there any denoising algorithms in OpenCV?
OpenCV comes with the photo module, in which you can find an implementation of the Non-Local Means denoising algorithm. The documentation can be found here:
http://docs.opencv.org/3.0-beta/modules/photo/doc/denoising.html
As far as I know, it is the only suitable denoising algorithm available in both OpenCV 2.4 and OpenCV 3.x.
I'm not aware of any other noise models in OpenCV than randn. It shouldn't be a problem, however, to add a custom function that does this. There are some nice examples in Python (you should have no problem rewriting them in C++, as the OpenCV API remains roughly identical): How to add noise (Gaussian/salt and pepper etc) to image in Python with OpenCV
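Something along these lines - a quick NumPy sketch of such a custom function (not part of OpenCV itself), in the spirit of the linked answer:
import numpy as np

def add_gaussian_noise(img, mean=0, sigma=25):
    # Additive Gaussian noise with the given mean/standard deviation,
    # clipped back to the valid 8-bit range.
    noise = np.random.normal(mean, sigma, img.shape)
    return np.clip(img.astype(np.float32) + noise, 0, 255).astype(np.uint8)

def add_salt_and_pepper(img, amount=0.02):
    # Set a random fraction of the pixels to pure white (salt) or pure black (pepper).
    noisy = img.copy()
    num = int(amount * img.shape[0] * img.shape[1])
    ys = np.random.randint(0, img.shape[0], num)
    xs = np.random.randint(0, img.shape[1], num)
    noisy[ys[: num // 2], xs[: num // 2]] = 255
    noisy[ys[num // 2:], xs[num // 2:]] = 0
    return noisy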
There's also one thing I don't understand: If you can generate noise, why would you denoise the image using some algorithm if you already have the original image without noise?
Check this tutorial; it might help you:
http://docs.opencv.org/trunk/d5/d69/tutorial_py_non_local_means.html
Especially this part:
OpenCV provides four variations of this technique.
cv2.fastNlMeansDenoising() - works with a single grayscale image
cv2.fastNlMeansDenoisingColored() - works with a color image.
cv2.fastNlMeansDenoisingMulti() - works with an image sequence captured in a short period of time (grayscale images)
cv2.fastNlMeansDenoisingColoredMulti() - same as above, but for color images.
Common arguments are:
h : parameter deciding filter strength. Higher h value removes noise better, but removes details of image also. (10 is ok)
hForColorComponents : same as h, but for color images only. (normally same as h)
templateWindowSize : should be odd. (recommended 7)
searchWindowSize : should be odd. (recommended 21)
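A minimal usage sketch of the colored variant (the file names and parameter values are only examples):
import cv2

noisy = cv2.imread("noisy.png")   # placeholder file name
# arguments: src, dst, h, hColor, templateWindowSize, searchWindowSize
denoised = cv2.fastNlMeansDenoisingColored(noisy, None, 10, 10, 7, 21)
cv2.imwrite("denoised.png", denoised)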
And to add Gaussian noise to an image, maybe this thread will be helpful:
How to add Noise to Color Image - Opencv
I am working on a project that detects hematomas on skin. I am having an issue with the colors after conversion from RGB to HSV. My algorithm detects the hematoma by its color.
With some images I have good results like here:
Original img: http://imgur.com/WHiOWdj
Result img: http://imgur.com/PujbnHa
But with some images I get bad results, like this:
Original img: http://imgur.com/OshB99r
Result img: http://imgur.com/CuNzAId
The same original image after conversion to HSV: http://imgur.com/lkVwtCs
Do you have any ideas how to fix it?
Thanks
Looking at your result image, I think you are only using the H channel of the original image in your algorithm. The false positive detections can come from the fact that some parts of the healthy skin have almost the same H value as the hematoma. You can see on the grey-scale image of the H channel that both parts have similar values:
The difference between the two parts is the saturation value. On the following image you can see the S channel of the original image, and it shows clearly that at the hematoma the saturation is much higher than at the other parts of the arm:
This is expected, because the hematoma has a much stronger color than the healthy skin.
So, I suggest you use both the H and S channels in your algorithm; that is, take into account only those parts of the H image where the S image has high saturation values. A possible and simple way to do that is to binarize both the H and S images and combine them with an AND operation:
H image after binarisation:
S image after binarisation:
Image after H&S operation:
You can see that on the result image only the hematoma part is white (except for some noise, which you can eliminate easily, for example by size filtering or morphological filtering).
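A possible Python sketch of this H&S filtering; the threshold values are only examples and have to be tuned to your images:
import cv2

img = cv2.imread("arm.jpg")                       # placeholder file name
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
h, s, v = cv2.split(hsv)

# Binarize the two channels separately: keep the hue range of the hematoma
# and only the pixels with high enough saturation (values are illustrative).
_, h_bin = cv2.threshold(h, 100, 255, cv2.THRESH_BINARY)
_, s_bin = cv2.threshold(s, 80, 255, cv2.THRESH_BINARY)

# AND the two masks: white only where both conditions hold.
mask = cv2.bitwise_and(h_bin, s_bin)

# Optional clean-up of the remaining small noise by morphological opening.
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)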
EDIT
It is important to note that binarization is one of the most important (and sometimes also most complicated) steps in object detection algorithms, since binarization is the first step that highlights the objects to detect.
If the external conditions (lighting, color of the objects, etc.) do not change significantly from image to image, you can use fixed binarization thresholds. If this constant environment cannot be ensured, you have to use more sophisticated methods. There are a lot of possibilities; here you can read about some of them:
Wikipedia - Thresholding
Wikipedia - Balanced histogram thresholding
Several solutions are based on histogram analysis: in histograms of images containing objects there are usually several local maxima whose positions vary with the environment, and if you find them you can adapt the binarization threshold easily.
For example the histogram of the H channel of the original image is the following:
The first maximum belongs to the background, the second to the skin, and the last to the hematoma. It can be assumed that these three peaks can be found in every image; only their positions vary, depending on the lighting or other conditions. Putting a threshold between the 2nd and the 3rd local maximum can be a good choice to highlight the hematoma.
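A rough sketch of that adaptive idea; the smoothing and the peak picking here are deliberately simplistic and only meant to show the approach:
import cv2
import numpy as np

hsv = cv2.cvtColor(cv2.imread("arm.jpg"), cv2.COLOR_BGR2HSV)   # placeholder file name
h = hsv[:, :, 0]

# Histogram of the H channel, lightly smoothed to suppress spurious peaks.
hist = cv2.calcHist([h], [0], None, [180], [0, 180]).ravel()
hist = np.convolve(hist, np.ones(5) / 5, mode="same")

# Naive local-maximum detection; a real implementation needs something sturdier.
peaks = [i for i in range(1, 179) if hist[i] > hist[i - 1] and hist[i] >= hist[i + 1]]

# Put the threshold at the valley between the last two peaks (skin vs. hematoma).
lo, hi = peaks[-2], peaks[-1]
threshold = lo + int(np.argmin(hist[lo:hi + 1]))
_, h_bin = cv2.threshold(h, threshold, 255, cv2.THRESH_BINARY)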
Finally, I suggest you read the following article about thresholding in OpenCV:
OpenCV - Thresholding
I have been using OpenCV's SVM and RF for a multi-class face recognition problem with 11 classes and only 5 images per class. I used two kinds of features - initially a toy intensity-image feature (just each image resized to 32x32 grayscale), and then a second, equally simple feature using Tan-Triggs preprocessing (link). Here is the feature code:
void Feature::makeFeature(cv::Mat &image, cv::Mat &result)
{
    cv::resize( image, image, cv::Size(32, 32), 0, 0, cv::INTER_CUBIC );
    cv::equalizeHist(image, image);
    // Images must be aligned - only pitch executed, yaw and roll assumed negligible
    algmt->getAlignedImage( image, image ); // image alignment
    // tan triggs
    {
        tan_triggs_preprocessing(image, result);
        result = result.reshape(0, 1); // make a single row vector, needed for the training samples matrix
    }
    // if plain intensity
    {
        // image.copyTo(result);
        // result.convertTo(result, CV_32F, 1.0f/255.0f);
        // result = result.reshape(0, 1); // make a single row vector, needed for the training samples matrix
    }
}
The tan_triggs_preprocessing function is the same as the Tan-Triggs preprocessing function given in the link, with one extra step: I normalized the result between 0 and 1.
The test results for both features were not very good, as expected, but then I made a silly mistake and discovered something strange: when I accidentally gave the training directory as input for both training and testing, I got 100% accuracy for the plain intensity feature, but the Tan-Triggs feature gave the following result:
SVM Training Complete
Total number of correct: 51 and accuracy: 92.7273
RF Training Complete
Total number of correct: 53 and accuracy: 96.3636
I know that however much you overfit, the result should be perfect when the training set itself is used as the test set. Everything else is standard; both the SVM and the RF are set up as in the OpenCV examples. Besides, I get 100% for the plain intensity feature, so I must be messing something up when using Tan-Triggs. Does anyone have any idea what mistake I am making?
I have used other, more complex features like LTPs and LQPs without issue, but this preprocessing method is something I want to use. I use the Jain-Learned Miller congealing algorithm for alignment, as I assume frontal faces for recognition, with no pose correction.