Why is Haar cascade very slow in OpenCV C++?

I am using Haar cascades to detect frontal faces. I have the code below:
int main()
{
    Mat image;
    cv::VideoCapture cap;
    cap.open(1);
    int frame_idx = 0;
    time_t fpsStartTime, fpsEndTime;
    time(&fpsStartTime);
    for (;;)
    {
        frame_idx = frame_idx + 1;
        cap.read(image);
        CascadeClassifier face_cascade;
        face_cascade.load("<PATH");
        std::vector<Rect> faces;
        face_cascade.detectMultiScale(image, faces, 1.1, 2, 0 | cv::CASCADE_SCALE_IMAGE, Size(30, 30));
        // Draw circles on the detected faces
        for (size_t i = 0; i < faces.size(); i++)
        {
            Point center(faces[i].x + faces[i].width * 0.5, faces[i].y + faces[i].height * 0.5);
            ellipse(image, center, Size(faces[i].width * 0.5, faces[i].height * 0.5), 0, 0, 360, Scalar(255, 0, 255), 4, 8, 0);
        }
        cv::imshow("Detected Face", image);
        char k = cv::waitKey(1);
        if (k == 27)
            break;
        time(&fpsEndTime);
        double seconds = difftime(fpsEndTime, fpsStartTime);
        double fps = frame_idx / seconds;
        std::string fps_txt = "FPS: " + std::to_string(fps);
        cout << "FPS : " << fps_txt << endl;
    }
    return 0;
}
This code works but gives very low FPS, around 1 FPS, which is very slow. I am running it on a Windows 10 laptop with an Intel i5 CPU, and I believe it should not be this slow.
In debug mode it gives ~1 FPS, and in release mode 4-5 FPS, which is still very slow. I have run some OpenVINO demos, such as pedestrian detection, which use two OpenVINO models on the same hardware and give ~17-20 FPS, which is very good.
I am using a USB 3.0 Logitech Brio 4K camera, so the camera cannot be the reason for the low FPS. My question is: why is Haar cascade detection performing so slowly? Is there any way to improve its speed and make it more usable? Please help. Thanks.

You should not (re)load the classifier on every frame. Load it once, before processing frames.
Move the following statements out of the for loop:
CascadeClassifier face_cascade;
face_cascade.load("<PATH");
See a demo in the OpenCV docs.

Can you confirm that you are using the right .lib and .dll files?
I have checked and seen that opencv_world440.lib & opencv_world440.dll give much better speed than opencv_world440d.lib & opencv_world440d.dll.
My guess is that opencv_world440d.lib & opencv_world440d.dll are the debug builds, hence the slow speed.
Note: your lib name may vary, e.g. opencv_world<SomeNumber>d.lib vs opencv_world<SomeNumber>.lib.

Related

Slow speed in stream webcam to do "detectMultiScale"

I'm new to OpenCV. I'm writing a sample face-detector console application, using a Haar cascade to detect the face from the webcam.
Here is my code:
int main(int, char**)
{
    CascadeClassifier face_cascade;
    face_cascade.load("haarcascade_frontalface_alt2.xml");
    vector<Rect> faces;
    Mat frame_gray;
    const double scale_factor = 1.1;
    const int min_neighbours = 2;
    const int flags = 0 | CV_HAAR_SCALE_IMAGE;
    VideoCapture cap(0); // open the default camera
    if (!cap.isOpened()) // check if we succeeded
        return -1;
    Mat frame;
    for (;;)
    {
        cap >> frame; // grab the next frame; without this, cvtColor receives an empty Mat
        cvtColor(frame, frame_gray, CV_BGR2GRAY);
        equalizeHist(frame_gray, frame_gray);
        face_cascade.detectMultiScale(frame_gray, faces, scale_factor, min_neighbours, flags, Size(30, 30));
        if (faces.size() == 0)
        {
            cout << "No face detected" << endl;
        }
        else
        {
            for (unsigned int i = 0; i < faces.size(); i++)
            {
                Point pt1(faces[i].x, faces[i].y);
                Point pt2(faces[i].x + faces[i].width, faces[i].y + faces[i].height);
                rectangle(frame, pt1, pt2, Scalar(0, 255, 0), 2, 8, 0); // thickness must be an int
            }
        }
        if (waitKey(30) >= 0) break;
    }
    return 0;
}
I tested it, and the speed from the webcam is slow. I imagine this could be due to the resolution of the image (640x480). I want to know if there is any way to keep the resolution while improving the per-frame detection speed.
Thanks!
You can:
Increase the minimal face size from Size(30, 30) to Size(50, 50) (improves performance 2-3 times).
Change scale_factor from 1.1 to 1.2 (improves performance about 2 times).
Use an LBP detector instead of the Haar detector (it is 2-3 times faster).
Check your compiler options (maybe you are building in Debug mode).

OpenCV set FPS to camera not working

I am currently doing real-time face evaluation and am trying to set the FPS of my computer's camera to 1 frame per second, then call the cascade functions only once per second (currently using a while(true) loop). This is due to the limitation of my GPU.
I have tried to set the FPS of the camera by using
VideoCapture cap(0);
cap.set(CV_CAP_PROP_FPS, 1);
namedWindow("webcam",CV_WINDOW_AUTOSIZE);
but it is not working. The camera still processes at a relatively high FPS.
I am calling the cascade function as below:
while (true) {
    cap >> frame;
    if (!frame.data) { // check the frame before using it
        cerr << "Cannot acquire frame from the webcam" << endl;
        break;
    }
    vector<Rect> faces;
    face_cascade.detectMultiScale(frame, faces, 1.1, 2, 0 | CV_HAAR_SCALE_IMAGE, Size(30, 30));
    // Draw circles on the detected faces
    for (int i = 0; i < faces.size(); i++)
    {
        Point center(faces[i].x + faces[i].width * 0.5, faces[i].y + faces[i].height * 0.5);
        cout << "Face location: " << faces[i].x << "," << faces[i].x + faces[i].width << "," << faces[i].y << "," << faces[i].y + faces[i].height;
        ellipse(frame, center, Size(faces[i].width * 0.5, faces[i].height * 0.5), 0, 0, 360, Scalar(255, 0, 255), 4, 8, 0);
    }
    imshow("webcam", frame);
    waitKey(30);
}
I need the camera to run at only 1 frame per second, and to call the cascade functions once per second.
Edit: I have tried to display the FPS of the camera using
int FPS = cap.get(CV_CAP_PROP_FPS);
It did show that the FPS is currently 1, but the camera still seems to run at a relatively high frame rate.
Setting the frame rate does not always work; sometimes the camera simply does not respond to the change. However, you can work around your problem in a slightly tricky way: measure the time it takes to process a frame, subtract it from 1000 ms, and wait for the remainder with cv::waitKey(1000 - elapsed_time). Admittedly this is not a very elegant way to do it; you should look for the actual problem with the camera and try to solve it.

Opencv - Haar cascade - Face tracking is very slow

I have developed a project that tracks a face through the camera using the OpenCV library.
I used a Haar cascade with haarcascade_frontalface_alt.xml to detect the face.
My problem is that when the image captured from the webcam doesn't contain any faces, the detection step is very slow, so the images from the camera, which are shown continuously to the user, are delayed.
My source code:
void camera()
{
    String face_cascade_name = "haarcascade_frontalface_alt.xml";
    String eye_cascade_name = "haarcascade_eye_tree_eyeglasses.xml";
    CascadeClassifier face_cascade;
    CascadeClassifier eyes_cascade;
    String window_name = "Capture - Face detection";
    VideoCapture cap(0);
    if (!face_cascade.load(face_cascade_name))
        printf("--(!)Error loading\n");
    if (!eyes_cascade.load(eye_cascade_name))
        printf("--(!)Error loading\n");
    if (!cap.isOpened())
    {
        cerr << "Capture Device ID " << 0 << " cannot be opened." << endl;
    }
    else
    {
        Mat frame;
        vector<Rect> faces;
        vector<Rect> eyes;
        Mat original;
        Mat frame_gray;
        Mat face;
        Mat processedFace;
        for (;;)
        {
            cap.read(frame);
            original = frame.clone();
            cvtColor(original, frame_gray, CV_BGR2GRAY);
            equalizeHist(frame_gray, frame_gray);
            face_cascade.detectMultiScale(frame_gray, faces, 2, 0,
                                          0 | CASCADE_SCALE_IMAGE, Size(200, 200));
            if (faces.size() > 0)
                rectangle(original, faces[0], Scalar(0, 0, 255), 2, 8, 0);
            namedWindow(window_name, CV_WINDOW_AUTOSIZE);
            imshow(window_name, original);
            if (waitKey(30) == 27) // must be inside the loop, or nothing is displayed
                break;
        }
    }
}
The Haar classifier is relatively slow by nature, and there is not much optimization you can do to the algorithm itself, because detectMultiScale is already parallelized in OpenCV.
The only note about your code: do you ever actually get faces detected with minSize equal to Size(200, 200)? Though, surely, the bigger the minSize, the better the performance.
Try scaling the image before detecting anything:
const int scale = 3;
cv::Mat resized_frame_gray( cvRound( frame_gray.rows / scale ), cvRound( frame_gray.cols / scale ), CV_8UC1 );
cv::resize( frame_gray, resized_frame_gray, resized_frame_gray.size() );
face_cascade.detectMultiScale(resized_frame_gray, faces, 1.1, 3, 0 | CASCADE_SCALE_IMAGE, Size(20, 20));
(don't forget to change minSize to a more reasonable value and to convert the detected face locations back to the real scale)
Reducing the image size by 2, 3, or 5 times is a great performance relief for any image-processing algorithm, especially for costly stuff like detection.
As mentioned before, if resizing doesn't do the trick, try finding other bottlenecks with a profiler.
You can also switch to the LBP classifier, which is comparably faster though less accurate.
Hope it will help.
Maybe this will be useful for you:
There is the Simd Library, which has an implementation of HAAR and LBP cascade classifiers. It can use standard HAAR and LBP cascades from OpenCV. The implementation has SIMD optimizations using SSE4.1, AVX2 and NEON (ARM), so it works 2-3 times faster than the original OpenCV implementation.
I use Haar cascade classifiers regularly and easily get 15 frames/second for face detection on 640x480 images, on an Intel PC/Mac (Windows/Ubuntu/OS X) with 4 GB RAM and a 2 GHz CPU. What is your configuration?
Here are a few things you can try.
You don't have to create the window (namedWindow(window_name, CV_WINDOW_AUTOSIZE);) on every frame. Create it once and just update the image.
Try how fast it runs without histogram equalization. It is not always required with a webcam.
As suggested by Micka above, check whether your program runs in Debug mode or Release mode.
Use a profiler to see where the bottleneck is.
If you haven't already, measure the frame rate you get when you comment out the face detection and rectangle drawing.
You can use an LBP cascade to detect faces. It is much more lightweight; you can find lbpcascade_frontalface.xml in the OpenCV source directory.

Object Counter in Opencv + BeagleBone Black Performance Issue

I am facing a performance issue with an object counter on a BeagleBone Black with OpenCV. I am using BackgroundSubtractorMOG2 for background subtraction, plus contour detection. Here is the code:
cv::Mat frame;
cv::Mat resizedFrame;
cv::Mat back;
cv::Mat fore;
bool objStart = false;
bool objEnd = false;
bool start = true;
cv::Point startLine(0, 50);  // start of the line where I take the decision
cv::Point endLine(1000, 50); // end of the line
cv::VideoCapture cap("/home/moonzai/Videos/test.avi");
cv::BackgroundSubtractorMOG2 bg;
bg.set("nmixtures", 3);
vector<vector<cv::Point> > contours;
for (;;)
{
    cap >> resizedFrame;
    // added when processing was only 1 frame per second: resize the frame to 320x240
    cv::resize(resizedFrame, frame, cv::Size(320, 240), 0, 0, cv::INTER_LINEAR);
    if (start)
    {
        bg.operator ()(frame, fore);
        bg.getBackgroundImage(back);
        cv::erode(fore, fore, cv::Mat());
        cv::dilate(fore, fore, cv::Mat());
        cv::findContours(fore, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_NONE);
        vector<cv::Rect> boundRect(contours.size());
        cv::Rect mainRect;
        for (unsigned int i = 0; i < contours.size(); i++)
        {
            boundRect[i] = boundingRect(cv::Mat(contours[i]));
            if (mainRect.area() < boundRect[i].area())
            {
                mainRect = boundRect[i];
            }
        }
        if (LineIntersectsRect(startLine, endLine, mainRect)) // returns whether the rectangle touches the line
        {
            objStart = true;
        }
        else
        {
            if (objStart)
            {
                objEnd = true;
            }
        }
        if (objEnd && objStart)
        {
            counter++;
            cout << "Object: " << counter << endl;
            objEnd = false;
            objStart = false;
        }
    }
    usleep(1000 * 33);
}
This code works perfectly on my desktop machine, but when I run it on a BeagleBone Black with Ubuntu 13.04 Linux installed (this distribution has no GUI at all; I work in the terminal), it gives me 80% CPU usage at 2 frames per second of processing. Memory usage is very low, about 8%. I am not getting the performance I want, so please guide me if I am doing something wrong.
The point of my question: is there a coding-related issue, or is BackgroundSubtractorMOG2 so resource-hungry that I have to use another approach? If there is another way, what is it?
Thanks in advance...
I think the best option is to use a profiler (Very Sleepy is quite easy to use yet powerful enough for me, though I'm not sure whether there is a Linux version) and check which part of your code is the problem - take a look at this discussion: How can I profile C++ code running in Linux? (the accepted answer may not be a good option in your situation, so look carefully at the other answers too).
Also, you may simply decrease the sleep time; it should increase the FPS and CPU usage.
C++ application performance is highly dependent on compiler options. Could you please provide the gcc options used to compile the OpenCV library and your application?
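For reference, an optimized build typically looks something like this (illustrative flags only; the actual options depend on your toolchain and how OpenCV itself was configured):

```shell
# Release-style build: optimization on, NEON enabled for the BeagleBone's ARM CPU
g++ -O3 -mfpu=neon -o counter counter.cpp $(pkg-config --cflags --libs opencv)

# A debug build (-O0 -g) of either the application or the OpenCV libraries
# can easily be several times slower.
```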

OpenCV 2.4.2 calcOpticalFlowPyrLK doesn't find any points

I am using OpenCV 2.4.2 on Linux, writing in C++. I want to track simple objects (e.g. a black rectangle on a white background). First I use goodFeaturesToTrack and then calcOpticalFlowPyrLK to find those points in another image. The problem is that calcOpticalFlowPyrLK doesn't find those points.
I found C code that does this, which did not work in my case: http://dasl.mem.drexel.edu/~noahKuntz/openCVTut9.html
I converted it into C++:
int main(int, char**) {
    Mat imgAgray = imread("ImageA.png", CV_LOAD_IMAGE_GRAYSCALE);
    Mat imgBgray = imread("ImageB.png", CV_LOAD_IMAGE_GRAYSCALE);
    Mat imgC = imread("ImageC.png", CV_LOAD_IMAGE_UNCHANGED);
    vector<Point2f> cornersA;
    goodFeaturesToTrack(imgAgray, cornersA, 30, 0.01, 30);
    for (unsigned int i = 0; i < cornersA.size(); i++) {
        drawPixel(cornersA[i], &imgC, 2, blue);
    }
    // I have no idea what this does
    // cornerSubPix(imgAgray, cornersA, Size(15, 15), Size(-1, -1),
    //              TermCriteria(TermCriteria::COUNT + TermCriteria::EPS, 20, 0.03));
    vector<Point2f> cornersB;
    vector<uchar> status;
    vector<float> error;
    // winsize has to be 11 or 13, otherwise nothing is found
    int winsize = 11;
    int maxlvl = 5;
    calcOpticalFlowPyrLK(imgAgray, imgBgray, cornersA, cornersB, status, error,
                         Size(winsize, winsize), maxlvl);
    for (unsigned int i = 0; i < cornersB.size(); i++) {
        if (status[i] == 0 || error[i] > 0) {
            drawPixel(cornersB[i], &imgC, 2, red);
            continue;
        }
        drawPixel(cornersB[i], &imgC, 2, green);
        line(imgC, cornersA[i], cornersB[i], Scalar(255, 0, 0));
    }
    namedWindow("window", 1);
    moveWindow("window", 50, 50);
    imshow("window", imgC);
    cvWaitKey(0);
    return 0;
}
ImageA: http://oi50.tinypic.com/14kv05v.jpg
ImageB: http://oi46.tinypic.com/4l3xom.jpg
ImageC: http://oi47.tinypic.com/35n3uox.jpg
I have found that it works only for winsize = 11. I tried using it on a moving rectangle to check how far the tracked points are from the origin. It hardly ever detects all four corners.
int main(int, char**) {
    std::cout << "Compiled at " << __TIME__ << std::endl;
    Scalar white = Scalar(255, 255, 255);
    Scalar black = Scalar(0, 0, 0);
    Scalar red = Scalar(0, 0, 255);
    Rect rect = Rect(50, 100, 100, 150);
    Mat org = Mat(Size(640, 480), CV_8UC1, white);
    rectangle(org, rect, black, -1, 0, 0);
    vector<Point2f> features;
    goodFeaturesToTrack(org, features, 30, 0.01, 30);
    std::cout << "POINTS FOUND:" << std::endl;
    for (unsigned int i = 0; i < features.size(); i++) {
        std::cout << "Point found: " << features[i].x;
        std::cout << " " << features[i].y << std::endl;
    }
    bool goRight = true;
    while (1) {
        if (goRight) {
            rect.x += 30;
            rect.y += 30;
            if (rect.x >= 250) {
                goRight = false;
            }
        } else {
            rect.x -= 30;
            rect.y -= 30;
            if (rect.x <= 50) {
                goRight = true;
            }
        }
        Mat frame = Mat(Size(640, 480), CV_8UC1, white);
        rectangle(frame, rect, black, -1, 0, 0);
        vector<Point2f> found;
        vector<uchar> status;
        vector<float> error;
        calcOpticalFlowPyrLK(org, frame, features, found, status, error,
                             Size(11, 11), 5);
        Mat display;
        cvtColor(frame, display, CV_GRAY2BGR);
        for (unsigned int i = 0; i < found.size(); i++) {
            if (status[i] == 0 || error[i] > 0) {
                continue;
            } else {
                line(display, features[i], found[i], red);
            }
        }
        namedWindow("window", 1);
        moveWindow("window", 50, 50);
        imshow("window", display);
        if (cvWaitKey(300) > 0) {
            break;
        }
    }
}
OpenCV implementation of Lucas-Kanade seems to be unable to track a rectangle on a binary image. Am I doing something wrong or does this function just not work?
The Lucas-Kanade method estimates the motion of a region by using the gradients in that region; it is in essence a gradient-descent method. So if you don't have gradients in both the x AND y directions, the method will fail. The second important note is that the Lucas-Kanade equation
E = sum_{winsize} (Ix * u + Iy * v + It)²
is a first-order Taylor approximation of the intensity-constancy constraint
I(x,y,t) = I(x+u, y+v, t+1)
So a restriction of the method without levels (image pyramids) is that the image needs to be approximately a linear function. In practice this means only small motions can be estimated, depending on the winsize you choose. That's why you use the levels, which linearize the images (It). A level of 5 is a little too high; 3 should be enough. At level 5, the top-level image in your case has a size of 640x480 / 2^5 = 20 x 15.
Finally, the problem in your code is the line:
if (status[i] == 0 || error[i] > 0) {
The error you get back from the Lucas-Kanade method is the resulting SSD, that is:
error = sum_{winSize} (I(x,y,0) - I(x+u, y+v, 1))² / (winSize * winSize)
It is very unlikely that this error is exactly 0, so in the end you skip all the features. I have had good experiences with simply ignoring the error, which is just a confidence measure; there are very good alternative confidence measures such as the forward/backward confidence. You could also experiment with ignoring the status flag if too many features are discarded.
KLT does point tracking by finding a transformation between two sets of points within a certain window. The window size is the area over which each point will be searched for in order to match it in the other frame.
It is another gradient-based algorithm that finds good features to track.
Normally KLT uses a pyramidal approach in order to maintain tracking even with big movements. It probably searches at up to maxLevel levels with the window size you specified.
I have never tried KLT on binary images. The problem might be that the KLT implementation begins the search in a wrong direction and then simply loses the points. When you change the window size, the search behavior changes too. In your picture you have at most 4 interest points, each only 1 pixel in size.
These are the parameters you're interested in:
winSize - size of the search window at each pyramid level
maxLevel - 0-based maximal pyramid level number; if 0, pyramids are not used (single level), if 1, two levels are used, etc.
criteria - termination criteria of the iterative search algorithm (after the specified maximum number of iterations criteria.maxCount, or when the search window moves by less than criteria.epsilon)
Suggestion:
Have you tried with natural pictures (two photos, for instance)? You'll have many more features to track; 4 or fewer is quite hard to keep. I would try that first.