Face Detector Parameters for OpenCV CV_HAAR_SCALE_IMAGE - C++

What does CV_HAAR_SCALE_IMAGE do in OpenCV's function cvHaarDetectObjects?

It enables an additional optimization.
The face-detection implementation is better optimized for CV_HAAR_SCALE_IMAGE than for CV_HAAR_DO_CANNY_PRUNING, because the CV_HAAR_SCALE_IMAGE method is more DMA (direct memory access) friendly. The default method (CV_HAAR_DO_CANNY_PRUNING) requires wide random access to main memory.

The flag CV_HAAR_SCALE_IMAGE tells the algorithm to scale the image rather than the detector.
There is an example of its use here: Face detection: How to find faces with openCV

According to EMGU, a .NET wrapper for OpenCV that sometimes has much better documentation than OpenCV itself:
DO_CANNY_PRUNING
If it is set, the function uses a Canny edge detector to reject image regions that contain too few or too many edges and thus cannot contain the searched object. The particular threshold values are tuned for face detection, and in this case the pruning speeds up the processing.
SCALE_IMAGE
For each scale factor used, the function downscales the image rather than "zooms" the feature coordinates in the classifier cascade. Currently, the option can only be used alone, i.e. the flag cannot be set together with the others.
FIND_BIGGEST_OBJECT
If it is set, the function finds the largest object (if any) in the image, so the output sequence will contain one (or zero) elements.
DO_ROUGH_SEARCH
It should be used only when CV_HAAR_FIND_BIGGEST_OBJECT is set and min_neighbors > 0. If the flag is set, the function does not look for candidates of a smaller size as soon as it has found the object (with enough neighbor candidates) at the current scale. Typically, when min_neighbors is fixed, this mode yields a less accurate (slightly larger) object rectangle than the regular single-object mode (flags=CV_HAAR_FIND_BIGGEST_OBJECT), but it is much faster, up to an order of magnitude. A greater value of min_neighbors may be specified to improve accuracy.
Source

CV_HAAR_DO_CANNY_PRUNING causes flat regions that contain no edges to be skipped by the classifier.
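For reference, here is a minimal sketch (not taken from the answers above) of the modern C++ API, where cv::CASCADE_SCALE_IMAGE plays the role of CV_HAAR_SCALE_IMAGE in the old cvHaarDetectObjects; the cascade and image file names are placeholders.

```cpp
#include <opencv2/objdetect.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/imgcodecs.hpp>
#include <vector>

int main()
{
    // File names are placeholders; point them at your cascade and image.
    cv::CascadeClassifier cascade("haarcascade_frontalface_default.xml");
    cv::Mat img = cv::imread("scene.jpg", cv::IMREAD_GRAYSCALE);
    cv::equalizeHist(img, img);

    std::vector<cv::Rect> faces;
    cascade.detectMultiScale(img, faces,
                             1.1,                      // scale factor per pyramid step
                             3,                        // minNeighbors
                             cv::CASCADE_SCALE_IMAGE,  // scale the image, not the feature coordinates
                             cv::Size(30, 30));        // minimum object size
    return static_cast<int>(faces.size());
}
```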

Related

OpenCV edge based object detection C++

I have an application where I have to detect the presence of some items in a scene. The items can be rotated and slightly scaled (bigger or smaller). I've tried using keypoint detectors, but they are not fast and accurate enough. So I've decided to first detect edges in the template and in the search area, using Canny (or a faster edge-detection algorithm), and then match the edges to find the position, orientation, and size of the match.
All this needs to be done in less than a second.
I've tried using matchTemplate() and matchShapes(), but the former is NOT scale and rotation invariant, and the latter doesn't work well with the actual images. Rotating the template image in order to match is also time-consuming.
So far I have been able to detect the edges of the template but I don't know how to match them with the scene.
I've already gone through the following, but wasn't able to get them to work (they either use an old version of OpenCV or simply don't work with images other than those in the demo):
https://www.codeproject.com/Articles/99457/Edge-Based-Template-Matching
Angle and Scale Invariant template matching using OpenCV
https://answers.opencv.org/question/69738/object-detection-kinect-depth-images/
Can someone please suggest an approach for this, or a code snippet if possible?
This is my sample input image (the parts to detect are marked in red).
Here are some software packages that already do this, and how I would like the result to look:
This topic is something I have been working on for a year in a project, so I will try to explain my approach and how I do it. I assume you have already done the preprocessing steps (filters, brightness, exposure, calibration, etc.) and have cleaned the noise from the image.
Note: In my approach, I collect data from the contours of a reference image, which shows my desired object. Then I compare these data with the other contours found on the big image.
Use Canny edge detection and find the contours on the reference image. You need to make sure that no parts of the contours are missed; if parts are missing, the preprocessing probably has problems. The other important point is that you need to choose an appropriate retrieval mode for findContours, because each mode has different properties, so pick the one that suits your case. At the end, keep only the contours that are relevant for you.
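A minimal sketch of this step, assuming the preprocessing is already done (file name and Canny thresholds are illustrative):

```cpp
#include <opencv2/imgproc.hpp>
#include <opencv2/imgcodecs.hpp>
#include <vector>

std::vector<std::vector<cv::Point>> referenceContours()
{
    cv::Mat ref = cv::imread("reference.png", cv::IMREAD_GRAYSCALE); // placeholder file name
    cv::Mat edges;
    cv::Canny(ref, edges, 50, 150);                                  // illustrative thresholds

    std::vector<std::vector<cv::Point>> contours;
    // RETR_EXTERNAL keeps only outer contours; try RETR_LIST or RETR_TREE
    // if inner contours matter for your parts.
    cv::findContours(edges, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    return contours;
}
```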
After getting the contours from the reference, you can compute the length of every contour (for example with arcLength() on the output of findContours()). You can compare these values against the contours on your big image and eliminate the ones that differ too much.
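A sketch of this length-based pruning, where the 20% tolerance is an assumption you would tune for your parts:

```cpp
#include <opencv2/imgproc.hpp>
#include <vector>
#include <cmath>

std::vector<std::vector<cv::Point>> filterByLength(
    const std::vector<std::vector<cv::Point>>& sceneContours,
    const std::vector<cv::Point>& referenceContour)
{
    double refLen = cv::arcLength(referenceContour, true);
    std::vector<std::vector<cv::Point>> kept;
    for (const auto& c : sceneContours)
    {
        double len = cv::arcLength(c, true);
        if (std::abs(len - refLen) < 0.2 * refLen)   // keep contours within 20% of the reference length
            kept.push_back(c);
    }
    return kept;
}
```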
minAreaRect precisely fits an enclosing (rotated) rectangle to each contour. In my case this function works very well. I derive two measurements from it:
a) Calculate the short and long edges of the fitted rectangle and compare the values with those of the contours on the big image.
b) Calculate the percentage of blackness or whiteness inside it (if your image is grayscale, the percentage of pixels close to white or black) and compare at the end.
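A sketch of these two measurements (the brightness threshold of 128 is an assumption):

```cpp
#include <opencv2/imgproc.hpp>
#include <vector>
#include <algorithm>

void contourMeasurements(const cv::Mat& gray, const std::vector<cv::Point>& contour,
                         double& shortEdge, double& longEdge, double& whiteRatio)
{
    cv::RotatedRect box = cv::minAreaRect(contour);
    shortEdge = std::min(box.size.width, box.size.height);
    longEdge  = std::max(box.size.width, box.size.height);

    // Crude whiteness measure: fraction of pixels brighter than 128 inside the
    // upright bounding rectangle of the rotated box, clipped to the image.
    cv::Rect roi = box.boundingRect() & cv::Rect(0, 0, gray.cols, gray.rows);
    if (roi.area() == 0) { whiteRatio = 0.0; return; }
    cv::Mat patch = gray(roi);
    whiteRatio = static_cast<double>(cv::countNonZero(patch > 128)) / patch.total();
}
```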
matchShapes can be applied at the end to the remaining contours, or you can apply it to all contours (I suggest the first approach). Each contour is just an array, so you can hold the reference contours in an array and compare them with the others at the end. Doing the three steps above and then applying matchShapes works very well for me.
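A sketch of the matchShapes comparison on the contours that survived the earlier filters (the 0.1 score threshold is an assumption to tune on your data):

```cpp
#include <opencv2/imgproc.hpp>
#include <vector>

std::vector<std::vector<cv::Point>> filterByShape(
    const std::vector<std::vector<cv::Point>>& candidates,
    const std::vector<cv::Point>& referenceContour)
{
    std::vector<std::vector<cv::Point>> matches;
    for (const auto& c : candidates)
    {
        // Lower score = more similar (0 means identical Hu-moment signatures).
        double score = cv::matchShapes(referenceContour, c, cv::CONTOURS_MATCH_I1, 0.0);
        if (score < 0.1)
            matches.push_back(c);
    }
    return matches;
}
```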
I think matchTemplate is not good to use directly. I draw every contour onto its own zeroed Mat (a blank black surface) as a template image and then compare those. Using the reference template image directly doesn't give good results.
OpenCV has some good algorithms for finding circles, convexity, etc. If they apply to your situation, you can also use them as an extra step.
At the end, you collect all the data and values and can build a table from them; the rest is essentially statistical analysis.
Note: I think the most important part is the preprocessing, so make sure you have a clean, almost noiseless image and reference.
Note: Training can be a good solution if you only need to know whether the objects are present or not, but if you are building an industrial application it is the wrong way to go. I have tried YOLO and Haar-cascade training several times and trained several objects with them. My experience is that they find objects almost correctly, but the center coordinates, rotation results, etc. will not be fully accurate even if your calibration is correct. On top of that, training time and data collection are painful.
You have rather bad image quality and very bad lighting conditions, so you have only two options:
1. Use filters -> binary threshold -> findContours -> matchShapes. But this is a very unstable algorithm for your object type and image quality; you will get a lot of wrong contours and it is hard to filter them out.
2. Haar cascades -> cut out the bounding box -> check the shape inside.
All "special points / edge matching" algorithms will not work in such bad conditions.

Why in CNN for image recognition tasks, the filters are always chosen to be extremely localized?

In CNNs, the filters are usually 3x3 or 5x5 spatially. Could the sizes be made comparable to the image size? One reason for keeping them small is to reduce the number of parameters to be learned. Apart from this, are there any other key reasons? For example, that people want to detect edges first?
You have answered part of the question yourself. Another reason is that most useful features may appear in more than one place in an image, so it makes sense to slide a single kernel over the whole image in the hope of extracting that feature in different parts of the image with the same kernel. If you use a big kernel, the features could be interleaved and not detected distinctly.
In addition to your own answer, the reduction in computational cost is a key point. Since we use the same kernel on different sets of pixels in an image, the same weights are shared across these pixel sets as we convolve over them. And since the number of weights is far smaller than in a fully connected layer, there are fewer weights to back-propagate.
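A back-of-the-envelope sketch of that parameter-count argument (the 224x224x3 input and 64 output channels are illustrative, not taken from the question):

```cpp
#include <cstdio>

int main()
{
    const long long H = 224, W = 224, C = 3;   // input height, width, channels (illustrative)
    const long long K = 3, OUT = 64;           // kernel size and output channels

    // Convolution: one small kernel shared over every spatial position.
    long long conv_params  = K * K * C * OUT + OUT;         // weights + biases
    // Fully connected layer producing one value per output-channel position.
    long long dense_params = (H * W * C) * (H * W * OUT);   // one weight per input-output pair

    std::printf("3x3 conv parameters:        %lld\n", conv_params);   // 1,792
    std::printf("fully connected parameters: %lld\n", dense_params);  // ~4.8e11
    return 0;
}
```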

Opencv Optical flow tracking: stop condition

I'm currently trying to implement face tracking using optical flow with OpenCV.
To achieve this, I detect faces with the OpenCV face detector, determine features to track on the detected areas by calling goodFeaturesToTrack, and perform the tracking by calling calcOpticalFlowPyrLK.
It gives good results.
However, I'd like to know when the face I'm currently tracking is not visible anymore (the person leaves the room, is hidden behind an object or another person, ...) but calcOpticalFlowPyrLK tells me nothing about it.
The status parameter of the calcOpticalFlowPyrLK function rarely reports errors for a tracked feature (so if the person disappears, I will still have a good number of "valid" features to track).
I've tried calculating the directional vectors of each feature to determine the motion between the previous and the current frame for each feature of the face (for example, determining that some point of the face has moved to the left between the two frames), and computing the variance of these vectors (if the vectors are mostly different, the variance is high, otherwise it is not), but it did not give the expected results (good in some situations, but bad in other cases).
What could be a good condition to determine whether the optical flow tracking has to be stopped or not?
I've thought of some possible solutions like these ones:
Variance of the distances of the vectors of each tracked feature (if the motion is uniform, the distances should be nearly the same, but if something happened, the distances will differ).
Comparing the shape and size of the area containing the original position of the tracked features with the area containing the current one. At the beginning we have a square containing the features of the face. But if the person leaves the room, it can lead to a deformation of the shape.
You can try a bidirectional confidence measure for your tracked points: estimate the feature positions forward from img0 to img1, and then track the resulting positions backwards from img1 to img0. If the doubly tracked features end up near their original positions (the distance should be less than 1 or 0.5 pixels), they have been tracked successfully. This is a bit more reliable than the SSD measure used for the status flag of OpenCV's calcOpticalFlowPyrLK. If a certain number of features could not be tracked this way, raise the "tracking lost" event.
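A minimal sketch of this forward-backward check, assuming the grayscale frames and the initial feature points come from the existing goodFeaturesToTrack / face-detection pipeline; the 0.5 px threshold is the value suggested above:

```cpp
#include <opencv2/video/tracking.hpp>
#include <vector>
#include <cmath>

std::vector<cv::Point2f> trackWithBackwardCheck(const cv::Mat& prevGray,
                                                const cv::Mat& currGray,
                                                const std::vector<cv::Point2f>& pts)
{
    std::vector<cv::Point2f> fwd, back;
    std::vector<uchar> stF, stB;
    std::vector<float> err;

    // Forward pass: img0 -> img1.
    cv::calcOpticalFlowPyrLK(prevGray, currGray, pts, fwd, stF, err);
    // Backward pass: img1 -> img0, starting from the forward result.
    cv::calcOpticalFlowPyrLK(currGray, prevGray, fwd, back, stB, err);

    std::vector<cv::Point2f> good;
    for (size_t i = 0; i < pts.size(); ++i)
    {
        if (!stF[i] || !stB[i]) continue;
        double dx = pts[i].x - back[i].x, dy = pts[i].y - back[i].y;
        if (std::sqrt(dx * dx + dy * dy) < 0.5)   // forward-backward error below 0.5 px
            good.push_back(fwd[i]);
    }
    // If good.size() drops below some fraction of pts.size(), consider the face lost.
    return good;
}
```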

OpenCV, C++: How to use cv::Meanshift

I have a vector of 2-D points, I am trying to use the meanshift algorithm to detect multiple modes in the data but am a bit confused by the method signature.
1) Can I pass in my vector (if so, in what form), or must I convert it to a cv::Mat (if so, how, given that my points have negative values)?
2) How do I extract the multiple modes? From what I can see, the function only returns an int.
Thanks
OpenCV's implementation of mean shift is for tracking a single object (as part of the CamShift algorithm), and therefore I don't believe it has been extended to track multiple objects using multi-modal distributions. It will give you a bounding box centered on the mode of a probability image (returned via the cv::Rect window argument, which is passed by reference).
Is your data represented as a mixture of Gaussians (or some other symmetric distribution)? If so you might be able to use k-means clustering to find the means of your distribution (which will be the mode for a symmetric distribution), although choosing k will be problematic.
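A sketch of that k-means alternative on a vector of 2-D points (negative coordinates are fine here); the value of K is an assumption you have to choose:

```cpp
#include <opencv2/core.hpp>
#include <vector>

std::vector<cv::Point2f> findModesWithKMeans(const std::vector<cv::Point2f>& points, int K)
{
    // One row per sample, two 32-bit float columns (x, y); negative values are fine.
    cv::Mat data(static_cast<int>(points.size()), 2, CV_32F);
    for (int i = 0; i < data.rows; ++i)
    {
        data.at<float>(i, 0) = points[i].x;
        data.at<float>(i, 1) = points[i].y;
    }

    cv::Mat labels, centers;
    cv::TermCriteria crit(cv::TermCriteria::EPS | cv::TermCriteria::COUNT, 100, 0.1);
    cv::kmeans(data, K, labels, crit, 5, cv::KMEANS_PP_CENTERS, centers);

    std::vector<cv::Point2f> modes;
    for (int k = 0; k < centers.rows; ++k)
        modes.emplace_back(centers.at<float>(k, 0), centers.at<float>(k, 1));
    return modes;
}
```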
Alternatively, a hack that might enable tracking of multiple objects (or finding multiple modes) could involve repeatedly calling this function, retrieving the mode, and then zeroing out that region of the back-projected probability image.
As for your data's form, the function's input is a cv::Mat, so you will have to convert your data. However, you say you have negative values, and this OpenCV function expects a probability image (typically computed from a histogram and an image using cv::calcBackProject()), so I expect it will complain if you try to pass it a cv::Mat containing negative values.
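For completeness, a minimal sketch of what cv::meanShift itself expects: a single-channel probability image and an initial search window that is refined in place (the termination-criteria values are illustrative):

```cpp
#include <opencv2/video/tracking.hpp>
#include <opencv2/core.hpp>

cv::Rect findOneMode(const cv::Mat& probImage, cv::Rect window)
{
    // Stop after 10 iterations or when the window center moves less than 1 px.
    cv::TermCriteria crit(cv::TermCriteria::EPS | cv::TermCriteria::COUNT, 10, 1.0);
    cv::meanShift(probImage, window, crit);   // window is refined in place
    // To look for further modes (the hack above), zero out probImage(window) and call again.
    return window;                            // ends up centered on the located mode
}
```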

OpenCV: Fast way to compare frames for similarity

I'm looking for a fast way to compare a frame with a running average, and determine the difference between them (in terms of giving a high value if they're very similar, and a lower value if they're not that similar). I need to compare the entire frame, not just a smaller region.
I'm already using Otsu thresholding on the images to filter out the background (not interested in the background, nor the features of the foreground - just need shapes). Is there a nice, fast way to do what I want?
The classic method for this is Normalized Cross Correlation (try cv::matchTemplate()). You will need to set a threshold to decide whether two images match. You can also use the (thresholded) output value to rank several images.
In OpenCV, the matchTemplate method and the parameters you need to pass to it are explained here.
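A minimal sketch of a normalized-correlation comparison between two equally sized, already thresholded frames, as described in the question (the choice of TM_CCOEFF_NORMED is an assumption; other normalized modes also work):

```cpp
#include <opencv2/imgproc.hpp>
#include <opencv2/core.hpp>

// Both frames must have the same size and type (e.g. the thresholded 8-bit images).
double frameSimilarity(const cv::Mat& frame, const cv::Mat& runningAverage)
{
    cv::Mat result;
    cv::matchTemplate(frame, runningAverage, result, cv::TM_CCOEFF_NORMED);
    return result.at<float>(0, 0);   // 1x1 result when sizes match; near 1.0 = very similar
}
```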