OpenCV polylines() excessive CPU usage - C++

I am receiving a video stream (640x480) via UDP and use OpenCV's imdecode() to decode every frame in the same thread. If a frame is decoded correctly, it is passed to a newly started thread for image processing (findChessboardCorners() and polylines()), and that thread is detached.
The receiving and decoding part works perfectly, but I logged the execution time of polylines(): it starts at about 5 ms and gets worse the longer the program runs (up to 4000 ms and more). Visual Studio's performance profiler reported that polylines() uses ~98% of the CPU. The vector of points drawn by polylines() contains 40 points.
Even though I am detaching each thread, what could cause this performance loss? (I even tested it on an Intel Xeon.)
void decode(Mat videoFrame) {
    Mat rotationMat;
    Mat translationMat;
    Mat chessboard;
    resize(videoFrame, chessboard, Size(), resizeFactor, resizeFactor);
    Size patternSize(chessboardSize.front(), chessboardSize.back());
    vector<Point2f> corners;
    vector<Point2f> imagePoints;
    bool patternFound = findChessboardCorners(chessboard, patternSize, corners,
                                              CALIB_CB_ADAPTIVE_THRESH + CALIB_CB_FAST_CHECK);
    if (patternFound) {
        solvePnP(objectPoints, corners, cameraMatrix, distCoeffs, rotationMat, translationMat);
        vector<Point3d> path_3d = fahrspur.computePath(steeringAngle);
        vector<Point2d> path_2d;
        projectPoints(path_3d, rotationMat, translationMat, cameraMatrix, distCoeffs, path_2d);
        Mat curve(path_2d, true);
        curve.convertTo(curve, CV_32S);
        double t4 = getCurrentTime();
        polylines(chessboard, curve, false, Scalar(0, 255, 0), 10, CV_AA);
        double t5 = getCurrentTime();
        cout << "time to execute polylines: " << t5 - t4 << "ms" << endl;
        assignFrameVideo(chessboard);
    }
}
A new thread running this decode method is started from another thread, which receives the frames in a while loop:
Mat frameVideo;
while (1) {
    //code for receiving a single frame, decoding it and storing it in frameVideo.
    thread decodeThread = thread(decode, frameVideo);
    decodeThread.detach();
}
I also tried the second overload of polylines(), like this:
const Point *pts = (const Point*)Mat(path_2d).data;
int npts = Mat(path_2d).rows;
polylines(chessboard, &pts, &npts, 1, false, Scalar(0, 255, 0), 5);
But that does not work at all; the image is displayed without any lines.

I solved it by replacing CV_AA with LINE_4 as the line-type parameter of polylines().
Apparently the anti-aliasing of the drawn line was the heavy part; now it runs within 0-1 ms.
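For reference, the fixed call just swaps the line-type argument:

// 4-connected line instead of anti-aliased: drawing becomes cheap
polylines(chessboard, curve, false, Scalar(0, 255, 0), 10, LINE_4);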

Related

Why is Haar cascade very slow in OpenCV C++

I am using Haar cascades to detect frontal faces. I have the code below:
int main()
{
    Mat image;
    cv::VideoCapture cap;
    cap.open(1);
    int frame_idx = 0;
    time_t fpsStartTime, fpsEndTime;
    time(&fpsStartTime);
    for (;;)
    {
        frame_idx = frame_idx + 1;
        cap.read(image);
        CascadeClassifier face_cascade;
        face_cascade.load("<PATH");
        std::vector<Rect> faces;
        face_cascade.detectMultiScale(image, faces, 1.1, 2, 0 | cv::CASCADE_SCALE_IMAGE, Size(30, 30));
        // Draw ellipses on the detected faces
        for (int i = 0; i < faces.size(); i++)
        {
            Point center(faces[i].x + faces[i].width * 0.5, faces[i].y + faces[i].height * 0.5);
            ellipse(image, center, Size(faces[i].width * 0.5, faces[i].height * 0.5), 0, 0, 360,
                    Scalar(255, 0, 255), 4, 8, 0);
        }
        cv::imshow("Detected Face", image);
        char k = cv::waitKey(1);
        if (k == 27)
            break;
        time(&fpsEndTime);
        double seconds = difftime(fpsEndTime, fpsStartTime);
        double fps = frame_idx / seconds;
        std::string fps_txt = "FPS: " + std::to_string(fps); // fps_str.str();
        cout << "FPS : " << fps_txt << endl;
    }
    return 0;
}
This code works fine but gives very low FPS, ~1 fps, which is very slow. I am running this on a Windows 10 laptop with an Intel i5 CPU. I believe it should not be this slow.
In debug mode it gives ~1 fps, but in release mode it is 4-5 fps, which is still very slow. I have run some OpenVINO demos, like a pedestrian detection one that uses two OpenVINO models, on the same hardware, and it gives ~17-20 fps, which is very good.
I am using a USB 3.0 Logitech Brio 4K camera, so the camera cannot be the reason for the low FPS. My question is why Haar cascades perform so slowly. Is there any way to speed this up and make it more usable? Please help. Thanks
You should not (re)load the classifier on every frame; it should be loaded once, before processing frames.
Move the following statements out of the for loop, as in the sketch below:
CascadeClassifier face_cascade;
face_cascade.load("<PATH");
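A minimal sketch of the restructured loop (same logic as the question's code, with the classifier loaded once up front):

CascadeClassifier face_cascade;
face_cascade.load("<PATH"); // load the cascade once, before the frame loop
for (;;)
{
    cap.read(image);
    std::vector<Rect> faces;
    // every frame now reuses the already-loaded classifier
    face_cascade.detectMultiScale(image, faces, 1.1, 2, 0 | cv::CASCADE_SCALE_IMAGE, Size(30, 30));
    // ... drawing, display and FPS code unchanged ...
}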
See a demo on OpenCV Docs.
Can you confirm you are using the right .lib and .dll files?
I have checked and seen that opencv_world440.lib & opencv_world440.dll give much better speed compared to opencv_world440d.lib & opencv_world440d.dll.
My guess is that opencv_world440d.lib & opencv_world440d.dll are the debug builds, hence the slow speed.
Note: your lib name may vary, i.e. opencv_world<SomeNumber>d.lib & opencv_world<SomeNumber>.lib.

OpenCV goodFeaturesToTrack's status values are all zero

I am trying to implement optical flow on iOS with OpenCV 3.1.
I built the basic pipeline shown in the code below, and I do get feature points from goodFeaturesToTrack(), but no point is ever tracked: the status results are always zero (not successfully tracked).
cv::Mat gray;                        // current gray-level image
cv::Mat gray_prev;                   // previous gray-level image
std::vector<cv::Point2f> features;   // detected features
std::vector<cv::Point2f> newFeatures;
std::vector<uchar> status;           // status of tracked features
std::vector<float> err;              // error in tracking
cv::TermCriteria _termcrit = cv::TermCriteria(cv::TermCriteria::COUNT | cv::TermCriteria::EPS, 20, 0.03);

- (void)processImage:(cv::Mat&)image {
    //-------------------- Optical Flow ---------------------
    cv::cvtColor(image, gray, CV_BGR2GRAY);
    if (gray_prev.empty()) {
        gray.copyTo(gray_prev);
    }
    cv::goodFeaturesToTrack(gray, features, 20, 0.01, 10);
    cv::calcOpticalFlowPyrLK(gray_prev, gray, features, newFeatures, status, err,
                             cv::Size(10, 10), 3, _termcrit, 0, 0.001);
    // draw circles at the feature points
    for (int i = 0; i < features.size(); i++) {
        circle(image, features[i], 10, cv::Scalar(250, 250, 250));
    }
    for (int y = 0; y < status.size(); y++) {
        NSLog(@"Status: %d", status[y]); // always zero
    }
    std::swap(newFeatures, features);
    cv::swap(gray_prev, gray);
}
You should call
cv::goodFeaturesToTrack(gray, features, 20, 0.01, 10);
only for the first initialisation of the features list, not on every cycle. What you are doing is resetting the features list to match the current frame on every cycle, so there is no displacement in the features.
Also, if you want just the displacement between two frames, you should detect on the previous frame:
cv::goodFeaturesToTrack(gray_prev, features, ...);
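A minimal sketch of that flow, reusing the member variables from the question (the re-detection threshold is an assumption):

// detect features only when the list is empty (first frame) or has shrunk too far
if (features.size() < 10) {
    cv::goodFeaturesToTrack(gray_prev, features, 20, 0.01, 10);
}
// track the existing features from the previous frame into the current one
cv::calcOpticalFlowPyrLK(gray_prev, gray, features, newFeatures, status, err,
                         cv::Size(10, 10), 3, _termcrit, 0, 0.001);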
The status gets 0 in two cases:
- the feature is outside the image ROI
- the minimal eigenvalue of the gradient matrix of the Lucas-Kanade window is below minEigThreshold, i.e. there is not enough texture in your window
However, in my experience the status flag is rather a good guess than a definitive indicator that a feature has been tracked successfully.
Ignore the status and draw or print your features and newFeatures vectors. If they are the same, check whether the gray_prev and gray images are actually different from each other.
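For example, a quick debug print of the point pairs (a sketch using the question's vectors):

// identical input/output positions mean no displacement was found
for (size_t i = 0; i < newFeatures.size(); i++) {
    NSLog(@"feature %zu: (%.1f, %.1f) -> (%.1f, %.1f)",
          i, features[i].x, features[i].y, newFeatures[i].x, newFeatures[i].y);
}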

Performance Issues while capturing and processing a video

I'm currently working on a project where I need to display a processed live video capture. Therefore, I'm using something similar to this:
cv::VideoCapture cap(0);
if (!cap.isOpened())
    return -1;

cap.set(CV_CAP_PROP_FRAME_WIDTH, 1280);
cap.set(CV_CAP_PROP_FRAME_HEIGHT, 720);
cv::namedWindow("Current Capture");

for (;;)
{
    cv::Mat frame;
    cap >> frame;
    cv::Mat mirrored;
    cv::flip(frame, mirrored, 1);
    cv::imshow("Current Capture", process_image(mirrored));
    if (cv::waitKey(30) >= 0) break;
}
The problem I have is that process_image, which performs a circle detection on the image, needs some time to finish and causes the display to be more of a slideshow than a video.
My question is: how can I speed up the processing without modifying the process_image function?
I thought about performing the image processing in another thread, but I'm not really sure how to start. Do you have any idea other than this?
PS: I'm not expecting you to write code for me, I only need a point to start from ;)
EDIT:
OK, if there is nothing I can do about the performance while capturing, I will need to change the process_image function.
cv::Mat process_image(cv::Mat img)
{
    cv::Mat hsv;
    cv::medianBlur(img, img, 7);
    cv::cvtColor(img, hsv, cv::COLOR_BGR2HSV);

    cv::Mat lower_hue_range; // lower and upper hue range in case of red color
    cv::Mat upper_hue_range;
    cv::inRange(hsv, cv::Scalar(LOWER_HUE1, 100, 100), cv::Scalar(UPPER_HUE1, 255, 255), lower_hue_range);
    cv::inRange(hsv, cv::Scalar(LOWER_HUE2, 100, 100), cv::Scalar(UPPER_HUE2, 255, 255), upper_hue_range);

    /// Combine the above two images
    cv::Mat hue_image;
    cv::addWeighted(lower_hue_range, 1.0, upper_hue_range, 1.0, 0.0, hue_image);

    /// Reduce the noise so we avoid false circle detection
    cv::GaussianBlur(hue_image, hue_image, cv::Size(13, 13), 2, 2);

    /// store all found circles here
    std::vector<cv::Vec3f> circles;
    cv::HoughCircles(hue_image, circles, CV_HOUGH_GRADIENT, 1, hue_image.rows / 8, 100, 20, 0, 0);
    for (size_t i = 0; i < circles.size(); i++)
    {
        /// circle center
        cv::circle(hsv, cv::Point(circles[i][0], circles[i][1]), 3, cv::Scalar(0, 255, 0), -1, 8, 0);
        /// circle outline
        cv::circle(hsv, cv::Point(circles[i][0], circles[i][1]), circles[i][2], cv::Scalar(0, 0, 255), 3, 8, 0);
    }

    cv::Mat newI;
    cv::cvtColor(hsv, newI, cv::COLOR_HSV2BGR);
    return newI;
}
Is there a big performance issue here that I can do anything about?
If you are sure that the process_image function is what causes the bottleneck in your program, but you can't modify it, then there's not really a lot you can do. If that function takes longer to execute than the duration of a video frame, you will never get what you need.
How about reducing the quality of the video capture, or reducing its size? At the moment you have it set to 1280x720. If the process_image function has less data to work with, it should execute faster.
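For example (a sketch; which resolutions the camera actually supports varies by device):

// request a smaller frame so process_image has roughly a quarter of the pixels
cap.set(CV_CAP_PROP_FRAME_WIDTH, 640);
cap.set(CV_CAP_PROP_FRAME_HEIGHT, 360);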

Mutex and thread in VideoCapture

I have used a worker thread to get the latest frame in real time; the code is below. But there is a problem: the frame is the first frame all the time, it never updates. As a result, the first frame goes through remap(), and the remapped result is remapped again in the next loop iteration, and so on. I don't know why the frame doesn't update. If I remove the remap() line or replace it with dilate(frame, frame, ...), the frame updates all the time. Also, if I copy the frame into image and run remap() on the image, the frame updates. Why can't the frame update in this case? Can somebody help me? Thank you.
std::mutex mtxCam;

void task(VideoCapture cap, Mat& frame) {
    while (true) {
        mtxCam.lock();
        cap >> frame;
        mtxCam.unlock();
    }
}

int main() {
    Mat frame, image;
    VideoCapture cap;
    cap.open(0);
    cap.set(CV_CAP_PROP_FRAME_WIDTH, 1600);
    cap.set(CV_CAP_PROP_FRAME_HEIGHT, 1080);
    cap >> frame;
    thread t(task, cap, frame);
    while (true) {
        initUndistortRectifyMap(
            cameraMatrix,  // computed camera matrix
            distCoeffs,    // computed distortion matrix
            Mat(),         // optional rectification (none)
            Mat(),         // camera matrix to generate undistorted
            Size(1920 * 1.3, 1080 * 1.3),
            // image.size(), // size of undistorted
            CV_32FC1,      // type of output map
            map1, map2);   // the x and y mapping functions
        mtxCam.lock();
        remap(frame, frame, map1, map2, cv::INTER_LINEAR);
        frame.copyTo(image);
        mtxCam.unlock();
        ...//image processing loop
    }
}
There are two problems here:
1) You pass a single frame, and the video capture then writes into that same frame every time, without it being released once it has been processed.
2) You need a signalling mechanism (semaphore), not just a locking mechanism (mutex).
Something along these lines:

while (true) {
    frame.release();
    cap >> frame;
    semCam.Give();
}

semCam.Take();
remap(frame, frame, map1, map2, cv::INTER_LINEAR);
frame.copyTo(image);
You are dealing with a producer-consumer problem here: Thread 1 produces the frames and Thread 2 consumes them for image processing.
Thread 1 inserts frames into a queue, signals Thread 2 that frames are ready for processing, and waits for Thread 2 to signal that the frames have been processed.
Algorithm:

Thread 1:
    FrameProcessed.Wait()
    FrameQueue.insert()
    FrameQueueReadyForProcessing.Give()

Thread 2:
    FrameQueueReadyForProcessing.Wait()
    ConsumeFrames(FrameQueue.Pop())
    FrameProcessed.Give()
Unfortunately, C++11 has no out-of-the-box semaphore implementation, but you can roll your own:
https://gist.github.com/yohhoy/2156481
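For illustration, a minimal C++11 semaphore built from a mutex and a condition variable, along the lines of that gist (the Give/Take names just match the pseudocode above, they are not a standard API):

#include <condition_variable>
#include <mutex>

class Semaphore {
public:
    explicit Semaphore(int count = 0) : count_(count) {}
    void Give() {                      // signal: make one unit available
        std::lock_guard<std::mutex> lock(mtx_);
        ++count_;
        cv_.notify_one();
    }
    void Take() {                      // wait: block until a unit is available
        std::unique_lock<std::mutex> lock(mtx_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }
private:
    std::mutex mtx_;
    std::condition_variable cv_;
    int count_;
};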

Real Time Multiple object detection from multiple videos, while correcting image distortion

I'm working on a project where I'm trying to detect multiple objects in multiple videos at the same time, while also correcting the distortion in the videos so I can get an accurate reading of the bearing of each detection relative to the camera.
I'm working with OpenCV, in C++, using Visual Studio 2010.
The code works but it is very slow, not real time. I am hoping someone may have suggestions as to how it can be sped up, if it can. I'm not much of a coder at present, am still learning about image processing, and don't know many tricks.
What the code does, in a general sense, is:
- Opens a video file
- Applies distortion correction to each image frame (for bearings)
- Crops the image (improves detections; time stamps give false positives)
- Applies a Haar-type cascade to the cropped image to detect objects
- Draws a bounding box around the detections
- Displays the images
- Calculates the angle of each detected object and prints it to the terminal
- Also draws a radar-like image and displays the angle, relative to the camera, of each detection in each frame; the idea is to have a single source mapping out the detections from every camera in the surrounding area
I've included the main code that runs the video and detections for a single video; this is still pretty slow and takes approx. 18 seconds for each second of video. And when I have 4 videos running it takes about 3 times longer.
Video dimensions are 704x576.
Any help or advice would be much appreciated, even if it's just knowing that this can only be sped up with purpose-designed hardware.
Cheers,
Dave
int main(){
    /////////////////////////////
    //**Distortion Correction**//
    /////////////////////////////
    std::cout << endl << "Reading:" << endl;
    //stores a file
    FileStorage fs;
    //reads and stores the xml file with the camera calibration
    fs.open("cal2.xml", FileStorage::READ);
    if (fs.isOpened())
    {
        cout << "File is opened\n";
    }
    //Mat objects to store the camera matrix and distortion coefficients
    Mat CamMat, DistCoeff;
    FileNode n = fs.root();
    //takes the parameters from the xml file and stores them in the Mat objects
    fs["Camera_Matrix"] >> CamMat;
    fs["Distortion_Coefficients"] >> DistCoeff;

    /////////////////////
    //**Video Display**//
    /////////////////////
    //Mat objects to store the images
    Mat Original, Vid1, Vid1Crop;
    //Cropping Image to exclude time/camera stamps to improve detections
    Rect roi(0, 35, 704, 490);
    //for reading video or webcam
    VideoCapture cap;
    //for opening a video file, give the location and name of the file, separating folders with "\\" instead of a single "\"
    cap.open("C:\\Users\\Desktop\\Run_01_005 two containers\\Video\\ch04_20140219124355.mp4");
    //Windows for the images
    namedWindow("New", CV_WINDOW_NORMAL);
    namedWindow("Display", CV_WINDOW_NORMAL);

    ///////////////////////
    //**Detection Setup**//
    ///////////////////////
    // Cascade Classifier object
    CascadeClassifier boat_cascade;
    //loads the xml file for the classifier, put the address and name of the xml file in the brackets
    boat_cascade.load("boatcascadeAttp3.xml");

    /////////////////////////////////////
    //**Single Source Display Image**////
    /////////////////////////////////////
    Mat Output;
    //loop to continually capture/update images
    while (1){
        Output = Display();
        cap >> Original;
        //applies the distortion correction to the input image and outputs to Vid1
        undistort(Original, Vid1, CamMat, DistCoeff, noArray());
        //Image excluding the time/camera stamps in the video which caused a lot of false positives
        Vid1Crop = Vid1(roi);
        //Set.NewCrop(New, roi);
        // Detect boats
        std::vector<Rect> boats;
        //Parameters may need some further adjustment, currently seems to work well
        //Detection performed on Region of Interest that excludes the time stamp which caused a number of False Positives
        boat_cascade.detectMultiScale(Vid1Crop, boats, 1.1, 15, 0 | CV_HAAR_SCALE_IMAGE, Size(25, 25), Size(75, 75));
        // Draw boxes around the detected boats
        for (int i = 0; i < boats.size(); i++)
        {
            //Draws a box around the detected object
            rectangle(Vid1Crop, Point(boats[i].x, boats[i].y),
                      Point(boats[i].x + boats[i].width, boats[i].y + boats[i].height),
                      Scalar(0, 255, 0), 2, 8);
            //finds the position of the detection along the X axis
            int centreX = boats[i].x + boats[i].width * 0.5;
            int fromCent = Vid1Crop.cols - centreX;
            float angle;
            //calls Angle function
            angle = Angle(centreX, fromCent, Vid1Crop);
            //calls DisplayPoints function
            Point XYpoints = DisplayPoints(angle);
            //prints out the result, angle for cam ranges
            cout << angle;
            cout << " degrees" << endl;
            //Draws red circles on the single source display corresponding to the detections
            circle(Output, XYpoints, 5.0, Scalar(0, 0, 255), 4, 8);
        }
        //shows the New output image after correction
        imshow("New", Vid1);
        imshow("Display", Output);
        //delay for 1ms between frames - Note 25 fps in video
        waitKey(1);
    }
    fs.release();
    return (0);
}