OpenCV: Should the write from VideoWriter run in an independent thread? - c++

I'm writing a program that reads multiple webcams, stitches the pictures together and saves them into a video file.
Should I use threads to capture the images and to write the resulting large image into a file?
If yes, should I use C++11 or Boost?
Do you maybe have some simple example code that shows how to avoid race conditions?
I already found this thread, which uses threads for writing, but it doesn't seem to avoid race conditions.
Pseudocode of my program:
camVector               // vector with cameras
VideoWriter outputVideo // outputs the video
while(true){
    for(camera : camVector){
        camera.grab()   // send signal to grab picture
    }
    picVector = getFrames(camVector)  // collect frames of all cameras into a vector
    Mat bigImg = stitchPicTogether(picVector)
    outputVideo.write(bigImg)
}

That's what I'd do:
camVector               // vector with cameras
VideoWriter outputVideo // outputs the video
for(camera : camVector) camera.grab(); // request first frame
while(true){
    parallel for(camera : camVector){
        frame = getFrame(camera);
        pasteFrame(/* destination */ bigImg,
                   /* geometry    */ geometry[camera],
                   /* source      */ frame);
        camera.grab(); // request next frame, so the webcam driver can start working
                       // while you write to disk
    }
    outputVideo.write(bigImg)
}
This way stitching is done in parallel, and if your webcam drivers have different timing, you can begin stitching what you received from one webcam while you wait for the other webcam's frame.
As for the implementation, you could simply go with OpenMP, which is already used in OpenCV: something like #pragma omp parallel for. If you prefer something more C++-like, give Intel TBB a try; it's cool.
Or, as you said, you could go with native C++11 threads or Boost. Choosing between the two mainly depends on your work environment: if you intend to work with older compilers, Boost is safer.
In all cases, no locking is required, except a join at the end to wait for all threads to finish their work.
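Here is a minimal sketch of that loop with OpenMP. It assumes the stitched layout is just a fixed ROI per camera (real stitching would warp and blend), and that bigImg and the ROI table are set up beforehand; those names are placeholders, not OpenCV API:

#include <opencv2/opencv.hpp>
#include <vector>

// Sketch only: cams, rois and bigImg are assumed to be prepared so that
// rois[i].size() equals the frame size of camera i, and bigImg covers all ROIs.
void captureLoop(std::vector<cv::VideoCapture>& cams,
                 const std::vector<cv::Rect>& rois,
                 cv::Mat& bigImg,
                 cv::VideoWriter& outputVideo)
{
    for (auto& cam : cams) cam.grab();        // request first frames
    for (;;) {
        #pragma omp parallel for              // one camera per thread
        for (int i = 0; i < (int)cams.size(); ++i) {
            cv::Mat frame;
            cams[i].retrieve(frame);          // decode the grabbed frame
            frame.copyTo(bigImg(rois[i]));    // disjoint ROIs: no lock needed
            cams[i].grab();                   // request the next frame early
        }                                     // implicit barrier acts as the join
        outputVideo.write(bigImg);            // single writer: no race
    }
}

Compile with -fopenmp (GCC/Clang) or /openmp (MSVC); the implicit barrier at the end of the parallel for is the only synchronization point needed.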
If your bottleneck is writing the output, the limiting factor is either disk I/O or video compression speed. For disk, you can try changing the codec and/or compressing more. For compression speed, have a look at GPU codecs like this.

Related

Speed up reading frames from a video file

Is there any way with OpenCV to read frames from a video file in parallel, or to speed up reading in some other way?
I have tried using the cap.read(frame) function in multiple threads, but the application crashes.
I also tried an array of VideoCapture objects, caps, all referencing the same video file; then in each thread I can use caps[i].read(frame) and read in parallel, but I just end up reading the same frame multiple times.
I have not found any other way to speed up the reading other than changing the video format. I changed it to HapQ (the original format was Apple ProRes 422) and the performance was noticeably better, about 30% faster (20-25 ms for reading frames compared to 30-35 ms before).

how to use cv::VideoCapture::waitAny() in opencv

I've been trying to find a way to asynchronously check whether the next frame from the camera that I get using VideoCapture is ready.
I came across waitAny(), which is described as "Wait for ready frames from VideoCapture."
In the OpenCV documentation I didn't find any useful info on how to use this or what the use cases are.
I've been searching the net for two days now, and the only thing I found is how the parameters of this function are defined (I'm new to C++); I don't know how to fill them in or what their use cases are.
Here is the documentation: https://docs.opencv.org/master/d8/dfe/classcv_1_1VideoCapture.html#ade1c7b8d276fea4d000bc0af0f1017b3
An approach that IMO is fairly common is to have a pair of threads -- one that "produces" frames from VideoCapture (calls VideoCapture::read() in a loop), and another that actually uses these frames (a "consumer"). The producer pushes images onto a queue (that's shared across two threads), while the consumer pops them.
In this situation, checking whether a camera has produced an image amounts to checking whether the queue is empty.
By itself, VideoCapture does not provide such an async API.
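A minimal sketch of that producer/consumer pattern, assuming a single camera; the queue, mutex, and condition variable here are illustrative glue, not OpenCV API:

#include <opencv2/opencv.hpp>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

std::queue<cv::Mat> frames;     // shared between the two threads
std::mutex m;
std::condition_variable ready;
bool done = false;

void producer(cv::VideoCapture cap) {
    cv::Mat img;
    while (cap.read(img)) {
        std::lock_guard<std::mutex> lk(m);
        frames.push(img.clone());   // clone: read() may reuse the buffer
        ready.notify_one();
    }
    { std::lock_guard<std::mutex> lk(m); done = true; }
    ready.notify_one();
}

void consumer() {
    for (;;) {
        std::unique_lock<std::mutex> lk(m);
        ready.wait(lk, [] { return !frames.empty() || done; });
        if (frames.empty()) break;  // producer finished and queue drained
        cv::Mat img = frames.front();
        frames.pop();
        lk.unlock();
        // "Is a frame ready?" is simply !frames.empty(); process img here.
    }
}

int main() {
    cv::VideoCapture cap(0);
    std::thread t(producer, cap);
    consumer();
    t.join();
}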
That said, if you want to use waitAny, and if you have a single camera, you could do something like this:
cv::VideoCapture cap = /* get the VideoCapture object from somewhere */;
constexpr int64 kTimeoutNs = 1000;
std::vector<int> ready_index;
cv::Mat image;
if (cv::VideoCapture::waitAny({cap}, ready_index, kTimeoutNs)) {
    // Camera was ready; get image.
    cap.retrieve(image);
} else {
    // Camera was not ready; do something else.
}
Above, VideoCapture::waitAny will wait for the specified timeout (1 microsecond) for the camera to produce a frame, and will return after this period.
If the camera is ready, it will return true and populate ready_index with the index of the ready camera (since you only have a single camera, the vector will either be empty or contain that one index).
That said, waitAny seems to be supported only by VideoCapture sources that use V4L as the backend.
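For completeness, with several cameras (the case waitAny is really meant for), the call would look something like this; again a sketch, assuming a backend that supports it:

std::vector<cv::VideoCapture> cams{cv::VideoCapture(0), cv::VideoCapture(1)};
std::vector<int> ready_index;
cv::Mat image;
// Wait up to 1 ms; ready_index lists every camera with a frame available.
while (cv::VideoCapture::waitAny(cams, ready_index, 1000000)) {
    for (int i : ready_index) {
        cams[i].retrieve(image);
        // ... handle the frame from camera i ...
    }
}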

OpenCV is missing frames while face detection takes place

I am using the
haarcascade_frontalface_alt2.xml
file for face detection in OpenCV 2.4.3 under the Visual Studio 10 framework.
I am using
cv::Mat frame;
cv::VideoCapture capture("C:\\Users\\Xavier\\Desktop\\AVI\\Video 6_xvid.avi");
capture.set(CV_CAP_PROP_FPS, 30);
for(;;)
{
    capture >> frame;
    // face detection code
}
The problem I'm facing is that, since Haar face detection is computationally heavy, OpenCV is missing a few frames at the
capture >> frame;
instruction. To check this, I wrote a counter to a txt file and found only 728 frames out of 900 for a 30-second 30 fps video.
Please tell me how to fix it.
I am not an experienced OpenCV user, but you could try flushing the output stream of capture to disk. Unfortunately, the VideoCapture class does not seem to support such an operation. Note that flushing to disk will have an impact on your performance, since it will first flush everything and only then continue executing. Therefore it might not be the best solution, but it is the easiest one, if it is possible at all.
Another approach that requires more work, but that should fix it, is to make a separate low-priority thread that writes each frame to disk. Your current thread then only needs to hand each frame to this low-priority thread whenever it wants the data captured. Depending on whether the higher-priority thread might change the data while the low-priority thread still has to write it to disk, you might want to copy the data to a separate buffer first.
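A sketch of that second approach; the queue and worker are illustrative (setting the actual thread priority is OS-specific and omitted here), and the clone() is the buffer copy mentioned above:

#include <opencv2/opencv.hpp>
#include <mutex>
#include <queue>
#include <thread>

std::queue<cv::Mat> to_write;
std::mutex m;

void writerThread(cv::VideoWriter writer) {      // the low-priority worker
    for (;;) {
        cv::Mat img;
        {
            std::lock_guard<std::mutex> lk(m);
            if (to_write.empty()) continue;      // real code: use a condition_variable
            img = to_write.front();
            to_write.pop();
        }
        writer.write(img);                       // slow disk work off the main thread
    }
}

// In the capture/detection loop:
//   capture >> frame;
//   { std::lock_guard<std::mutex> lk(m); to_write.push(frame.clone()); }
//   ... run the Haar detector on frame ...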

An efficient way to buffer HD video real-time without maxing out memory

I am writing a program that involves real-time processing of video from a network camera using OpenCV. I want to be able to capture, at any time during processing, the previous images (say, ten seconds' worth) and save them to a video file.
I am currently doing this using a queue as a buffer (pushing cv::Mat data), but this is obviously not efficient, as a few seconds' worth of images soon uses up all the PC's memory. I tried compressing images using cv::imencode, but that doesn't make much difference with PNG. I need a solution that uses hard-drive space and is efficient for real-time operation.
Can anyone suggest a very simple and efficient solution?
EDIT:
Just so that everyone understands what I'm doing at the moment, here's the code for a 10-second buffer:
void run()
{
    cv::VideoCapture cap(0);
    double fps = cap.get(CV_CAP_PROP_FPS);
    int buffer_length = 10; // in seconds
    int wait = 1000.0/fps;
    QTime time;
    forever {
        time.restart();
        cv::Mat image;
        bool read = cap.read(image);
        if(!read)
            break;
        bool locked = _mutex.tryLock(10);
        if(locked){
            if(image.data){
                _buffer.push(image);
                if((int)_buffer.size() > (fps*buffer_length))
                    _buffer.pop();
            }
            _mutex.unlock();
        }
        int time_taken = time.elapsed();
        if(time_taken < wait)
            msleep(wait - time_taken);
    }
    cap.release();
}
queue<cv::Mat> _buffer and QMutex _mutex are global variables. If you're familiar with Qt (signals and slots, etc.): I've got a slot that grabs the buffer and saves it as a video using cv::VideoWriter.
EDIT:
I think the ideal solution would be for my queue<cv::Mat> _buffer to use hard-drive space rather than PC memory. Not sure on which planet that is possible? :/
I suggest looking into real-time compression with x264 or similar. x264 is regularly used for real-time encoding of video streams and, with the right settings, can encode multiple streams or a 1080p video stream on a moderately powered processor.
I suggest asking in doom9's forum or similar forums.
x264 is a free h.264 encoder which can achieve 100:1 or better (vs raw) compression. The output of x264 can be stored in your memory queue with much greater efficiency than uncompressed (or losslessly compressed) video.
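x264 itself is driven through its C API or through FFmpeg, which is more than a snippet can show; but the compressed-queue idea can be illustrated with cv::imencode and JPEG (which compresses far better than the PNG you tried, at the cost of being lossy):

#include <opencv2/opencv.hpp>
#include <queue>
#include <vector>

std::queue<std::vector<uchar> > compressed_buffer;  // encoded frames, not raw Mats

void pushCompressed(const cv::Mat& image, int max_frames) {
    std::vector<uchar> jpg;
    std::vector<int> params;
    params.push_back(cv::IMWRITE_JPEG_QUALITY);
    params.push_back(80);
    cv::imencode(".jpg", image, jpg, params);       // much smaller than a raw frame
    compressed_buffer.push(jpg);
    if ((int)compressed_buffer.size() > max_frames)
        compressed_buffer.pop();
}

// When saving, decode each element back with cv::imdecode(buf, cv::IMREAD_COLOR)
// and feed the result to the VideoWriter.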
UPDATED
One thing you can do is store the images to the hard disk using imwrite and push their filenames onto the queue. When the queue is full, delete the images as you pop their filenames.
In your video-writing slot, load the images as they are popped from the queue and write them to your VideoWriter instance.
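A sketch of that disk-backed queue; the file naming scheme and the buffer/ directory are arbitrary choices for illustration:

#include <opencv2/opencv.hpp>
#include <cstdio>
#include <queue>
#include <string>

std::queue<std::string> file_buffer;   // filenames stand in for the frames

void pushFrame(const cv::Mat& image, int max_frames) {
    static long counter = 0;
    char name[64];
    std::sprintf(name, "buffer/frame_%06ld.png", counter++);
    cv::imwrite(name, image);                      // the frame lives on disk, not in RAM
    file_buffer.push(name);
    if ((int)file_buffer.size() > max_frames) {
        std::remove(file_buffer.front().c_str());  // delete the oldest image
        file_buffer.pop();
    }
}

// In the saving slot: pop each filename, load it with cv::imread,
// and write the result to the cv::VideoWriter instance.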
You mentioned you needed to use hard-drive space.
In that case, consider using the OpenCV HighGUI VideoWriter. You can create an instance of VideoWriter as below:
VideoWriter record("RobotVideo.avi", CV_FOURCC('D','I','V','X'),
                   30, frame.size(), true);
And write image captures to it as below:
record.write(image);
Find the documentation and the sample program on the website.

Image Stitching from a live Video Stream in OpenCv

I am trying to stitch an image from a live video camera (more like a panorama) using OpenCV. The stitching is working fine. My problem is that I want the stitching to be done in real time, say around 30 fps, but the processing of the stitching is slow.
I want to use threads to improve the speed, but in order to use them, do I need to store my live video stream, or is there any way to directly use threads for the live stream?
Here is a sample code:
SapAcqDevice *pAcq = new SapAcqDevice("Genie_HM1400_1", false);
SapBuffer *pBuffer = new SapBuffer(20, pAcq);
SapView *pView = new SapView(pBuffer, (HWND)-1);
SapAcqDeviceToBuf *pTransfer = new SapAcqDeviceToBuf(pAcq, pBuffer, XferCallback, pView);
pAcq->Create();
pBuffer->Create();
pView->Create();
pTransfer->Create();
pTransfer->Grab();
printf("Press any key to stop grab\n");
getch();
pTransfer->Freeze();
pTransfer->Wait(5000);
printf("Press any key to terminate\n");
getch();
The code above is used to capture the live stream. The XferCallback function does the processing of the frames; in this function I call my stitch engine. Since the processing of the engine is slow, I want to use threads.
Here is a sample code of the callback function:
SapView *pView = (SapView *) pInfo->GetContext();
SapBuffer *pBuffer = pView->GetBuffer();
void *pData = NULL;
pBuffer->GetAddress(&pData);
int width = pBuffer->GetWidth();
int height = pBuffer->GetHeight();
int depth = pBuffer->GetPixelDepth();
IplImage *fram = cvCreateImage(cvSize(width, height), depth, 1);
cvSetImageData(fram, pData, width);
stitching(frame_num, fram);
cvWaitKey(1);
frame_num++;
I want many threads working on the stitch engine.
If you think you can get the stitching fast enough using threads, then go for it.
do I need to store my live video stream or is there any way to
directly use threads for the live stream.
You might benefit from setting up a ring buffer with preallocated frames. You know the image size isn't going to change. So your Sapera acquisition callback simply pushes a frame into the buffer.
You then have another thread that sits there stitching as fast as it can and maintaining state information to help optimize the next stitch. You have not given much information about the stitching process, but presumably you can make it parallel with OpenMP. If that is fast enough to keep up with frame acquisition then you'll be fine. If not, then you will start dropping frames because your ring buffer is full.
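A sketch of such a ring buffer with preallocated frames; the single-producer/single-consumer indexing below is one simple way to do it (sizes and types are assumptions):

#include <opencv2/opencv.hpp>
#include <atomic>
#include <vector>

// Single-producer/single-consumer ring of preallocated frames.
class FrameRing {
public:
    FrameRing(int n, cv::Size size, int type) : slots_(n) {
        for (size_t i = 0; i < slots_.size(); ++i)
            slots_[i].create(size, type);          // allocate every slot up front
    }
    // Producer side (the Sapera acquisition callback).
    bool push(const cv::Mat& frame) {
        int h = head_.load();
        int next = (h + 1) % (int)slots_.size();
        if (next == tail_.load()) return false;    // full: the frame is dropped
        frame.copyTo(slots_[h]);                   // copy into preallocated memory
        head_.store(next);
        return true;
    }
    // Consumer side (the stitching thread).
    bool pop(cv::Mat& out) {
        int t = tail_.load();
        if (t == head_.load()) return false;       // empty
        slots_[t].copyTo(out);
        tail_.store((t + 1) % (int)slots_.size());
        return true;
    }
private:
    std::vector<cv::Mat> slots_;
    std::atomic<int> head_{0};  // advanced only by the producer
    std::atomic<int> tail_{0};  // advanced only by the consumer
};

The callback then only does a copy, the stitching thread drains the ring as fast as it can, and a false return from push tells you that you are dropping frames.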
As hinted above, you can probably predict where the stitching for the next frame ought to begin. This is on the basis that movement between one frame and the next should be reasonably small and/or smooth. This way you narrow your search and greatly improve the speed.
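One concrete way to use that prediction, assuming a purely translational model and template matching (an illustration, not the asker's stitch engine): match a strip of the new frame only inside a small window around the predicted position, rather than over the whole panorama.

#include <opencv2/opencv.hpp>

// Refine the predicted placement of 'strip' inside 'pano'; the window is
// assumed to stay inside the panorama and larger than the strip.
cv::Point refineOffset(const cv::Mat& pano, const cv::Mat& strip,
                       cv::Point predicted, int margin) {
    cv::Rect window(predicted.x - margin, predicted.y - margin,
                    strip.cols + 2 * margin, strip.rows + 2 * margin);
    window &= cv::Rect(0, 0, pano.cols, pano.rows);   // clip to the panorama
    cv::Mat result;
    cv::matchTemplate(pano(window), strip, result, cv::TM_CCORR_NORMED);
    cv::Point best;
    cv::minMaxLoc(result, 0, 0, 0, &best);            // best match in the window
    return window.tl() + best;   // offset of the strip inside the panorama
}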