Haar cascade for face detection: XML file explanation (OpenCV, C++)

I am performing face detection using an OpenCV Haar cascade.
I would like to understand the XML code of the Haar cascade that I have included in my program. Can someone help me understand the values presented in the XML file, for instance weakCount, maxCount, threshold, internal nodes, leaf values, etc.?
I am using the haarcascade_frontalface_alt2.xml file. I have already performed face detection; currently I am working on counting the number of faces detected.

I assume you already know the general haarcascade structure and OpenCV's implementation of it. If not, please first look into the OpenCV manual and read something about cascades of boosted trees, for example Lienhart's paper.
Now about the XML structure itself.
<maxWeakCount>3</maxWeakCount>
This parameter describes the number of simple classifiers (trees) in the stage.
<stageThreshold>3.5069230198860168e-01</stageThreshold>
This is the stage threshold, i.e. the threshold score for exiting the cascade at this stage. At each stage we accumulate a score from the trees, and when that score is less than the threshold, we exit the entire cascade and classify the result as a non-object.
<weakClassifiers>
This marks the start of the tree parameters in the stage.
<_>
<internalNodes>
0 1 0 4.3272329494357109e-03 -1 -2 1 1.3076160103082657e-02
</internalNodes>
<leafValues>
3.8381900638341904e-02 8.9652568101882935e-01 2.6293140649795532e-01
</leafValues>
</_>
This is a tree description. The internalNodes parameter contains the following:
1. 0 1 or 1 0: the leaf index in the current node that we should go to. In the first case we go to the left if the value is below the threshold and to the right if it is above; in the second case we go to the right leaf if the value is above the threshold.
2. the feature index
3. the threshold for choosing the leaf
4. one more parameter list, -1 -2 1 ...: as I see from the OpenCV sources, it is just another node with leaf indexes, but negative values are ignored according to the evaluation code (also from the OpenCV sources).
Consider the cascade evaluation code:
do
{
    CascadeClassifierImpl::Data::DTreeNode& node = cascadeNodes[root + idx];
    double val = featureEvaluator(node.featureIdx);
    idx = val < node.threshold ? node.left : node.right;
}
while( idx > 0 );
leafValues contains the scores of the tree's leaves. With the two internal nodes above there are three reachable leaves, hence three values; the score of the reached leaf is added to the stage sum.
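To make the stage logic concrete, here is a simplified, hypothetical C++ sketch (not the actual OpenCV implementation; the struct names simply mirror the XML fields discussed above) of how a stage walks each tree's internal nodes, adds the reached leaf score, and compares the total against stageThreshold:

#include <vector>

// Hypothetical simplified types mirroring the XML fields discussed above.
struct Node     { int left, right, featureIdx; double threshold; };
struct WeakTree { std::vector<Node> nodes; std::vector<double> leafValues; };
struct Stage    { std::vector<WeakTree> trees; double stageThreshold; };

// feature(i) would compute the value of HAAR feature i for the current window.
bool passStage(const Stage& stage, double (*feature)(int)) {
    double sum = 0.0;
    for (const WeakTree& tree : stage.trees) {          // maxWeakCount trees
        int idx = 0;
        do {                                            // walk the internal nodes
            const Node& n = tree.nodes[idx];
            idx = feature(n.featureIdx) < n.threshold ? n.left : n.right;
        } while (idx > 0);                              // idx <= 0 means a leaf
        sum += tree.leafValues[-idx];                   // add the reached leaf score
    }
    return sum >= stage.stageThreshold;                 // below => reject as non-object
}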
<_>
  <rects>
    <_>6 3 1 9 -1.</_>
    <_>6 6 1 3 3.</_>
  </rects>
</_>
This is the feature description itself, following the HAAR paradigm. The feature index from the previous section is the index of this rects pair.
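Since the original question also mentions counting the detected faces, a minimal detection-and-count sketch using the C++ API might look like the following (the image file name is a placeholder, and the detectMultiScale parameter values are illustrative, not tuned):

#include <opencv2/objdetect/objdetect.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <cstdio>
#include <vector>

int main() {
    cv::CascadeClassifier cascade;
    if (!cascade.load("haarcascade_frontalface_alt2.xml"))
        return 1;                                   // cascade file not found

    cv::Mat img = cv::imread("people.jpg");         // placeholder input image
    cv::Mat gray;
    cv::cvtColor(img, gray, CV_BGR2GRAY);
    cv::equalizeHist(gray, gray);                   // often improves detection

    std::vector<cv::Rect> faces;
    cascade.detectMultiScale(gray, faces, 1.1, 3, 0, cv::Size(30, 30));

    std::printf("Detected %d faces\n", (int)faces.size());
    return 0;
}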

Related

Finding shortest circuit in a graph that visits X nodes at least once

Even though I'm still a beginner, I love solving graph-related problems (shortest path, searches, etc.). Recently I faced a problem like this:
Given an undirected, weighted graph (no negative weights) with N nodes and E edges (at most 1 edge between two nodes; an edge can only be placed between two different nodes) and a list of X nodes that you must visit, find the shortest path that starts from node 0, visits all X nodes and returns to node 0. There is always at least one path connecting any two nodes.
The limits are 1 <= N <= 40 000, 1 <= X <= 15, 1 <= E <= 50 000.
Here's an example:
The red node (0) should be the start and finish of the path. You must visit all blue nodes (1, 2, 3, 4) and return. The shortest path here would be:
0 -> 3 -> 4 -> 3 -> 2 -> 1 -> 0 with a total cost of 30
I thought about using Dijkstra to find the shortest paths between all X (blue) nodes and then greedily picking the closest unvisited blue node, but it doesn't work (it comes up with 32 instead of 30 on paper). I also noticed later that just finding the shortest paths between all pairs of X nodes would take O(X*N^2) time, which is too much with so many nodes.
The only thing I could find for circuits was the Eulerian circuit, which only allows visiting each node once (and I don't need that). Is this solvable with Dijkstra, or is there another algorithm that could solve it?
Here is a solution that is likely to be fast enough:
1) Run a shortest-path search from every blue node (this can be done in O(X * E log N)) to compute pairwise distances.
2) Build a new graph with the zero vertex and the blue vertices only (X + 1 vertices). Add edges using the pairwise distances computed during the first step.
3) The new graph is small enough to use the dynamic programming solution for TSP (it has O(X^2 * 2^X) time complexity). A sketch of the whole approach follows.
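Here is a minimal C++ sketch of these steps, assuming an adjacency-list representation (all names are illustrative): one Dijkstra per terminal, then a bitmask DP over the X + 1 terminals.

#include <algorithm>
#include <climits>
#include <queue>
#include <vector>

using ll = long long;
const ll INF = LLONG_MAX / 4;

// Dijkstra from src over an adjacency list adj[u] = {(v, weight), ...}.
std::vector<ll> dijkstra(int src, const std::vector<std::vector<std::pair<int,ll>>>& adj) {
    std::vector<ll> dist(adj.size(), INF);
    std::priority_queue<std::pair<ll,int>, std::vector<std::pair<ll,int>>,
                        std::greater<std::pair<ll,int>>> pq;
    dist[src] = 0;
    pq.push({0, src});
    while (!pq.empty()) {
        auto [d, u] = pq.top(); pq.pop();
        if (d > dist[u]) continue;                  // stale queue entry
        for (auto [v, w] : adj[u])
            if (d + w < dist[v]) {
                dist[v] = d + w;
                pq.push({dist[v], v});
            }
    }
    return dist;
}

// terminals[0] must be node 0; the rest are the X required (blue) nodes.
ll shortestTour(const std::vector<int>& terminals,
                const std::vector<std::vector<std::pair<int,ll>>>& adj) {
    int k = (int)terminals.size();
    // Step 1: pairwise terminal distances, one Dijkstra per terminal.
    std::vector<std::vector<ll>> d(k, std::vector<ll>(k));
    for (int i = 0; i < k; ++i) {
        std::vector<ll> dist = dijkstra(terminals[i], adj);
        for (int j = 0; j < k; ++j) d[i][j] = dist[terminals[j]];
    }
    // Steps 2-3: bitmask DP over the small terminal graph.
    // dp[mask][i] = cheapest walk from terminal 0 visiting `mask`, ending at i.
    std::vector<std::vector<ll>> dp(1 << k, std::vector<ll>(k, INF));
    dp[1][0] = 0;
    for (int mask = 1; mask < (1 << k); ++mask)
        for (int i = 0; i < k; ++i) {
            if (!(mask & (1 << i)) || dp[mask][i] == INF) continue;
            for (int j = 0; j < k; ++j)
                if (!(mask & (1 << j)))
                    dp[mask | (1 << j)][j] = std::min(dp[mask | (1 << j)][j],
                                                      dp[mask][i] + d[i][j]);
        }
    ll best = INF;                                  // close the tour at node 0
    for (int i = 0; i < k; ++i)
        best = std::min(best, dp[(1 << k) - 1][i] + d[i][0]);
    return best;
}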

Searching jpeg/bmp/pdf image for straight lines, circles and text

I want to create an image parser that reads an image containing the following:
1. Straight Lines
2. Circles
3. Arcs
4. Text
I am open to solutions for any image format, whether JPEG, BMP, or PDF.
I have looked at the QImage documentation. It provides pixel data that I can store in the form of a 2D matrix. For the moment I shall assume that there are only two colours, black and white: white represents an empty pixel and black represents a drawn pixel.
So I will have a sparse matrix like:
0 1 1 1 0 0 0
0 0 0 0 0 0 1
0 1 1 0 0 0 1
1 0 0 1 0 0 1
1 0 0 1 0 0 0
0 1 1 0 0 0 0
Now I want to decode this matrix and search for the elements. Searching for horizontal and vertical lines is easy, because for each element I can just scan its neighbouring row and column elements.
How can I search for the other elements (angled lines, circles, arcs and possibly text)?
For text, I read that QImage has a text() function, but I don't know for what type of input file it works.
Is there any other library that I should consider?
Please note that I just want to be able to read the image; further processing does not need to be done.
Is there any other way I can accomplish this? Or am I being too ambitious?
Thanks
Take a look at the OpenCV library.
It provides most of the standard algorithms used in image detection and vision, and the code quality of its implementations is in general quite high.
Note, though, that this is a very difficult problem in general, so you will probably need to do a fair amount of research before getting satisfactory solutions.
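As a starting point for the lines and circles specifically, a speculative sketch using OpenCV's Hough transforms could look like this (the file name and all parameter values are illustrative, not tuned):

#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <cstdio>
#include <vector>

int main() {
    cv::Mat img = cv::imread("drawing.png", 0);      // load as 8-bit grayscale

    // Straight lines (and, with a suitable maxLineGap, pieces of arcs).
    std::vector<cv::Vec4i> lines;
    cv::HoughLinesP(img, lines, 1, CV_PI / 180, 50, 30, 5);

    // Circles; param1/param2 are the Canny and accumulator thresholds.
    std::vector<cv::Vec3f> circles;
    cv::HoughCircles(img, circles, CV_HOUGH_GRADIENT, 1, 20, 100, 30);

    std::printf("%d line segments, %d circles\n",
                (int)lines.size(), (int)circles.size());
    return 0;
}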
One interesting way of tackling this would be with machine learning systems, such as neural networks and genetic algorithms. Neural nets in particular are very good at pattern matching and are often used for tasks such as handwriting recognition.
There's a lot of information on this if you search for it. Here's one article that gives an introduction to NNs.
If your input images are always black and white, I don't think it would be too difficult to adapt a code example to get it working.
I suggest the Viola-Jones object detection algorithm.
Though the approach is usually applied to face detection, the original article discusses general object detection, covering things such as your text, circles and lines.

openCV filter image - replace kernel with local maximum

Some details about my problem:
I'm trying to implement a corner detector in OpenCV (as an alternative to the built-in ones: Canny, Harris, etc.).
I've got a matrix filled with response values: the bigger the response value, the higher the probability that a corner was detected.
The problem is that in the neighbourhood of a point several corners are detected, although there is really only one. I need to reduce the number of falsely detected corners.
Exact problem:
I need to walk through the matrix with a kernel, find the maximum value within each kernel, keep that maximum, and set the other values in the kernel to zero.
Are there built-in OpenCV functions to do this?
This is how I would do it:
Create a kernel; it defines a pixel's neighbourhood.
Create a new image by dilating your image using this kernel. This dilated image contains the maximum neighbourhood value for every point.
Do an equality comparison between these two arrays. Wherever they are equal is a valid neighbourhood maximum, which is set to 255 in the comparison array.
Multiply the comparison array and the original array together (scaling appropriately).
This is your final array, containing only the neighbourhood maxima.
This is illustrated by these zoomed in images:
9 pixel by 9 pixel original image:
After processing with a 5 by 5 pixel kernel, only the local neighbourhood maxima remain (i.e. maxima separated by more than 2 pixels from any pixel with a greater value):
There is one caveat. If two nearby maxima have the same value then they will both be present in the final image.
Here is some Python code that does it, it should be very easy to convert to c++:
import cv

im = cv.LoadImage('fish2.png', cv.CV_LOAD_IMAGE_GRAYSCALE)
maxed = cv.CreateImage((im.width, im.height), cv.IPL_DEPTH_8U, 1)
comp = cv.CreateImage((im.width, im.height), cv.IPL_DEPTH_8U, 1)

# Create a 5x5 rectangular kernel anchored at (2, 2)
kernel = cv.CreateStructuringElementEx(5, 5, 2, 2, cv.CV_SHAPE_RECT)

cv.Dilate(im, maxed, element=kernel, iterations=1)  # each pixel -> neighbourhood max
cv.Cmp(im, maxed, comp, cv.CV_CMP_EQ)               # 255 where the pixel is a local max
cv.Mul(im, comp, im, 1/255.0)                       # zero out everything else

cv.ShowImage("local max only", im)
cv.WaitKey(0)
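Since the question is tagged C++, here is a rough translation of the sketch above to the C++ API (cv::dilate / cv::compare; the file name is a placeholder):

#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>

int main() {
    cv::Mat im = cv::imread("fish2.png", 0);        // grayscale response image
    // 5x5 rectangular kernel, anchored at its centre by default.
    cv::Mat kernel = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(5, 5));
    cv::Mat maxed, mask;
    cv::dilate(im, maxed, kernel);                  // each pixel -> neighbourhood max
    cv::compare(im, maxed, mask, cv::CMP_EQ);       // 255 where the pixel is a local max
    cv::Mat result = cv::Mat::zeros(im.size(), im.type());
    im.copyTo(result, mask);                        // keep only the maxima
    cv::imshow("local max only", result);
    cv::waitKey(0);
    return 0;
}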
I didn't realise until now, but this is what @sansuiso suggested in his/her answer.
This is possibly better illustrated with this image. Before:
After processing with a 5 by 5 kernel:
The solid regions are due to shared local maxima values.
I would suggest an original 2-step procedure (more efficient approaches may exist) that uses OpenCV built-in functions:
Step 1: morphological dilation with a square kernel (corresponding to your neighbourhood). This step gives you another image, in which each pixel value is replaced by the maximum value inside the kernel.
Step 2: test whether the cornerness value of each pixel of the original response image is equal to the maximum value given by the dilation step. If it is not, then there obviously exists a better corner in the neighbourhood.
If you are looking for built-in functionality, FilterEngine will help you create a custom filter (kernel):
http://docs.opencv.org/modules/imgproc/doc/filtering.html#filterengine
Also, I would recommend some kind of noise reduction, usually a blur, before all the processing; that is, unless you really want the raw image.

How to find/detect optimal parameters of a Grid Search in Libsvm+Weka?

I'm trying to use SVM with the Weka framework, so I'm using LibSVM. I'm new to SVM, and reading the guide on the LibSVM site I learned that it is possible to discover the optimal parameters for SVM (cost and gamma) using grid search. So I chose Grid Search in Weka, and I obtained bad classification results (a TN rate of around 1%). How should I interpret these results? If I get bad results using the optimal parameters, is there no chance of getting a better classification? In other words: does Grid Search give me the best results that I can obtain using SVM?
My dataset consists of 1124 instances (89% negative class, 11% positive class) and there are 31 attributes (2 of them nominal, the others numeric). I'm using 10-fold cross-validation on the whole dataset to test the model.
I tried GridSearch (I normalized each attribute's values between 0 and 1, used no feature selection, and changed the class values from 0 and 1 to 1 and -1 according to SVM theory, though I don't know whether that is useful) with these parameters: cost from 1 to 18 with a step of 1.0, and gamma from -5 to 10 with a step of 1.0. The results are 93.6% sensitivity and 64.8% specificity, but it takes around 1 hour to complete the computation!
I'd like to get better results than with a decision tree. Using feature selection (Info Gain ranking) + SMOTE oversampling + cost-sensitive learning I obtained 91% sensitivity and 80% specificity. Is there a way to tune the SVM without trying every possible range of values for cost and gamma?

Extracting segments from a list of 8-connected pixels

Current situation: I'm trying to extract segments from an image. Thanks to OpenCV's findContours() method, I now have a list of 8-connected points for every contour. However, these lists are not directly usable, because they contain a lot of duplicates.
The problem: given a list of 8-connected points, which can contain duplicates, extract segments from it.
Possible solutions:
At first, I used OpenCV's approxPolyDP() method. However, the results are pretty bad... Here is the zoomed contour:
Here is the result of approxPolyDP() (9 segments! Some overlap):
but what I want is more like:
It's bad because approxPolyDP() can convert something that "looks like several segments" into "several segments". However, what I have is a list of points that tends to iterate several times over itself.
For example, if my points are:
0 1 2 3 4 5 6 7 8
9
Then the list of points will be 0 1 2 3 4 5 6 7 8 7 6 5 4 3 2 1 9... And if the number of points becomes large (> 100), the segments extracted by approxPolyDP() are unfortunately not duplicates (i.e. they overlap each other, but are not strictly equal, so I can't just say "remove duplicates", as opposed to pixels, for example).
Perhaps I've got a solution, but it's pretty long (though interesting). First of all, for every 8-connected list, I create a sparse matrix (for efficiency) and set the matrix values to 1 if the pixel belongs to the list. Then I create a graph, with nodes corresponding to pixels and edges between neighbouring pixels. This also means that I add all the missing edges between pixels (small complexity, made possible by the sparse matrix). Then I remove all possible "squares" (4 neighbouring nodes), which is possible because I am already working on fairly thin contours. Then I can run a minimum spanning tree algorithm. Finally, I can approximate every branch of the tree with OpenCV's approxPolyDP().
To sum up: I've got a tedious method that I've not yet implemented, as it seems error-prone. However, I ask you, people at Stack Overflow: are there other existing methods, ideally with good implementations?
Edit: to clarify, once I have a tree, I can extract "branches" (branches start at leaves or at nodes linked to 3 or more other nodes). The algorithm in OpenCV's approxPolyDP() is the Ramer-Douglas-Peucker algorithm; here is the Wikipedia picture of what it does:
With this picture, it is easy to understand why it fails when points may be duplicates of each other.
Another edit: in my method, there is something that may be interesting to note. When you consider points located on a grid (like pixels), the minimum spanning tree algorithm is generally not useful, because there are many possible minimal trees:
X-X-X-X
|
X-X-X-X
is fundamentally very different from
X-X-X-X
| | | |
X X X X
but both are minimal spanning trees
However, in my case the nodes rarely form clusters, because they are supposed to be contours, and a thinning algorithm already runs beforehand inside findContours().
Answer to Tomalak's comment:
If the DP algorithm returned 4 segments (with the segment from point 2 to the centre being there twice), I would be happy! Of course, with good parameters I can get to a state where, "by chance", I have identical segments and can remove the duplicates. However, the algorithm is clearly not designed for that.
Here is a real example with far too many segments:
Using Mathematica 8, I created a morphological graph from the list of white pixels in the image. It is working fine on your first image:
Create the morphological graph:
graph = MorphologicalGraph[binaryimage];
Then you can query the graph properties that are of interest to you.
This gives the names of the vertices in the graph:
vertex = VertexList[graph]
The list of the edges:
EdgeList[graph]
And this gives the positions of the vertices:
pos = PropertyValue[{graph, #}, VertexCoordinates] & /@ vertex
This is what the results look like for the first image:
In[21]:= vertex = VertexList[graph]
Out[21]= {1, 3, 2, 4, 5, 6, 7, 9, 8, 10}
In[22]:= EdgeList[graph]
Out[22]= {1 \[UndirectedEdge] 3, 2 \[UndirectedEdge] 4, 3 \[UndirectedEdge] 4,
3 \[UndirectedEdge] 5, 4 \[UndirectedEdge] 6, 6 \[UndirectedEdge] 7,
6 \[UndirectedEdge] 9, 8 \[UndirectedEdge] 9, 9 \[UndirectedEdge] 10}
In[26]:= pos = PropertyValue[{graph, #}, VertexCoordinates] & /@ vertex
Out[26]= {{54.5, 191.5}, {98.5, 149.5}, {42.5, 185.5},
{91.5, 138.5}, {132.5, 119.5}, {157.5, 72.5},
{168.5, 65.5}, {125.5, 52.5}, {114.5, 53.5},
{120.5, 29.5}}
According to the documentation (http://reference.wolfram.com/mathematica/ref/MorphologicalGraph.html), the MorphologicalGraph command first computes the skeleton by morphological thinning:
skeleton = Thinning[binaryimage, Method -> "Morphological"]
Then the vertices are detected; they are the branch points and the end points:
verteximage = ImageAdd[
MorphologicalTransform[skeleton, "SkeletonEndPoints"],
MorphologicalTransform[skeleton, "SkeletonBranchPoints"]]
And then the vertices are linked after an analysis of their connectivity.
For example, one could start by breaking the structure around the vertices and then look for the connected components, revealing the edges of the graph:
comp = MorphologicalComponents[
  ImageSubtract[
    skeleton,
    Dilation[verteximage, CrossMatrix[1]]]];
Colorize[comp]
The devil is in the details, but that sounds like a solid starting point if you wish to develop your own implementation.
Try mathematical morphology. First, you need to dilate or close your image to fill the holes:
cvDilate(pimg, pimg, NULL, 3);
cvErode(pimg, pimg, NULL);
I got this image
The next step would be to apply a thinning algorithm. Unfortunately, it's not implemented in OpenCV (MATLAB has bwmorph with the thin argument). For example, with MATLAB I refined the image to this one:
However, OpenCV has all the basic morphological operations needed to implement thinning (cvMorphologyEx, cvCreateStructuringElementEx, etc.).
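As an illustration of combining those basic operations, here is a hedged C++ sketch of the classic morphological skeleton, a simpler relative of true thinning (so the result will not be identical to bwmorph's):

#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>

// Morphological skeleton from basic operations (erode, open, subtract, OR).
cv::Mat morphologicalSkeleton(cv::Mat img) {        // img: 8-bit binary image
    img = img.clone();                              // do not modify the caller's data
    cv::Mat skel = cv::Mat::zeros(img.size(), CV_8UC1);
    cv::Mat element = cv::getStructuringElement(cv::MORPH_CROSS, cv::Size(3, 3));
    cv::Mat eroded, opened;
    while (cv::countNonZero(img) > 0) {
        cv::erode(img, eroded, element);
        cv::dilate(eroded, opened, element);        // opening of img
        cv::subtract(img, opened, opened);          // pixels removed by the opening
        cv::bitwise_or(skel, opened, skel);         // accumulate them
        eroded.copyTo(img);                         // continue with the eroded image
    }
    return skel;
}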
Another idea: they say that the distance transform is very useful in such tasks. Maybe so.
Consider the cvDistTransform function. It creates an image like this:
Then, use something like cvAdaptiveThreshold:
That's a skeleton. I guess you can iterate over all connected white pixels, find curves and filter out small segments.
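One possible reading of this suggestion, sketched with the C++ API (the parameter values are guesses and would need tuning for a real image):

#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>

// Distance transform, then an adaptive threshold to keep the ridge pixels,
// which roughly approximates the skeleton.
cv::Mat distanceSkeleton(const cv::Mat& binary) {   // 8-bit binary input
    cv::Mat dist, dist8u, skel;
    cv::distanceTransform(binary, dist, CV_DIST_L2, 3);
    cv::normalize(dist, dist, 0, 255, cv::NORM_MINMAX);
    dist.convertTo(dist8u, CV_8U);
    // A pixel survives if it is brighter than its local mean: ridge pixels.
    cv::adaptiveThreshold(dist8u, skel, 255, cv::ADAPTIVE_THRESH_MEAN_C,
                          cv::THRESH_BINARY, 7, -1.0);
    return skel;
}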
I've implemented a similar algorithm before, and I did it in a sort of incremental least-squares fashion. It worked fairly well. The pseudocode is somewhat like:
L = empty set of line segments
for each white pixel p
    line = new line containing only p
    C = empty set of points
    P = set of all neighbouring pixels of p
    while P is not empty
        n = first point in P
        add n to C
        remove n from P
        line' = line with n added to it
        perform a least squares fit of line'
        if MSE(line') < max_mse and d(line', n) < max_distance
            line = line'
            add all neighbours of n that are not in C to P
    if size(line) > min_num_points
        add line to L
where MSE(line) is the mean squared error of the line (the sum, over all points in the line, of the squared distance to the best-fitting line) and d(line, n) is the distance from point n to the line. Good values for max_distance seem to be a pixel or so, and max_mse should be much less; it will depend on the average size of the line segments in your image. Values of 0.1 or 0.2 pixels have worked for me in fairly large images.
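The incremental part can be made concrete with a small helper (a hypothetical sketch, not the closed-source implementation mentioned below): by keeping running sums, both the total-least-squares MSE and the point-to-line distance are available in O(1) after each added point.

#include <cmath>

// Incremental least-squares bookkeeping for a growing line segment.
struct LineFit {
    double n = 0, sx = 0, sy = 0, sxx = 0, syy = 0, sxy = 0;

    void add(double x, double y) {
        ++n; sx += x; sy += y; sxx += x * x; syy += y * y; sxy += x * y;
    }
    // MSE of the total-least-squares line = smaller eigenvalue of the
    // 2x2 covariance matrix of the accumulated points.
    double mse() const {
        double cxx = sxx / n - (sx / n) * (sx / n);
        double cyy = syy / n - (sy / n) * (sy / n);
        double cxy = sxy / n - (sx / n) * (sy / n);
        double tr = cxx + cyy;
        double rt = std::sqrt((cxx - cyy) * (cxx - cyy) + 4 * cxy * cxy);
        return 0.5 * (tr - rt);
    }
    // Perpendicular distance from (x, y) to the fitted line, which passes
    // through the centroid along the principal direction of the covariance.
    double distance(double x, double y) const {
        double cxx = sxx / n - (sx / n) * (sx / n);
        double cyy = syy / n - (sy / n) * (sy / n);
        double cxy = sxy / n - (sx / n) * (sy / n);
        double theta = 0.5 * std::atan2(2 * cxy, cxx - cyy);  // principal angle
        double nx = -std::sin(theta), ny = std::cos(theta);   // unit normal
        return std::fabs(nx * (x - sx / n) + ny * (y - sy / n));
    }
};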
I had been using this on actual images pre-processed with the Canny operator, so the only results I have are from those. Here's the result of the above algorithm on an image:
It's possible to make the algorithm fast, too. The C++ implementation I have (closed source, enforced by my job, sorry, else I would give it to you) processed the above image in about 20 milliseconds. That includes the application of the Canny operator for edge detection, so it should be even faster in your case.
You can start by extracting straight lines from your contour image using HoughLinesP, which is provided with OpenCV:
HoughLinesP(InputArray image, OutputArray lines, double rho, double theta, int threshold, double minLineLength = 0, double maxLineGap = 0)
If you choose threshold = 1 and a small minLineLength, you can even obtain all the single elements. Be careful, though, since it yields many results if you have many edge pixels.
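A minimal usage sketch (the input file name is a placeholder and the parameter values are illustrative):

#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <vector>

int main() {
    // contours.png stands in for your binary contour image.
    cv::Mat edges = cv::imread("contours.png", 0);
    std::vector<cv::Vec4i> lines;
    // threshold = 1 and a small minLineLength recover even tiny elements,
    // at the cost of producing many candidate segments.
    cv::HoughLinesP(edges, lines, 1, CV_PI / 180, 1, 3, 0);

    cv::Mat vis;
    cv::cvtColor(edges, vis, CV_GRAY2BGR);
    for (size_t i = 0; i < lines.size(); ++i)
        cv::line(vis, cv::Point(lines[i][0], lines[i][1]),
                 cv::Point(lines[i][2], lines[i][3]), cv::Scalar(0, 0, 255));
    cv::imshow("segments", vis);
    cv::waitKey(0);
    return 0;
}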