openh264 Repeating frame - C++

I'm using the openh264 lib, in a c++ program, to convert a set of images into a h264 encoded mp4 file. These images represent updates to the screen during a session recording.
Let's say a set contains 2 images: one initial screen grab of the desktop, and another one taken 30 seconds later, when the clock changes.
Is there a way for the stream to represent a 30-second-long video using only these 2 images?
Right now, I'm brute forcing this by encoding the first frame multiple times to fill the gap. Is there a more efficient and/or faster way of doing this?

Of course. Set a frame rate of 1/30 fps and you end up with 1 frame every 30 seconds. It doesn't even have to be in the H.264 stream - it can also be done when it gets muxed into an mp4 file afterwards, for example.
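If you keep the timing in the stream side, openh264 lets you attach a timestamp to every input picture, so the two grabs can simply be stamped 30 000 ms apart and the muxer can turn that gap into the sample duration. A rough sketch under those assumptions; writeSample() is a hypothetical hook into whatever mp4 muxing code you use, it is not part of openh264:
#include "codec_api.h"   // openh264 public API (header path may differ per install)

// Hypothetical muxer hook: writes one encoded access unit with its timestamp.
void writeSample(const SFrameBSInfo& info, long long ptsMs);

// yuv0/yuv1 point to I420 picture data for the two screen grabs (illustrative parameters).
void encodeTwoGrabs(unsigned char* yuv0, unsigned char* yuv1, int width, int height) {
    ISVCEncoder* encoder = nullptr;
    WelsCreateSVCEncoder(&encoder);

    SEncParamBase param = {};
    param.iUsageType     = SCREEN_CONTENT_REAL_TIME;  // screen-recording profile
    param.iPicWidth      = width;
    param.iPicHeight     = height;
    param.iTargetBitrate = 500000;
    param.fMaxFrameRate  = 1.0f;  // nominal; the real timing lives in the timestamps/muxer
    encoder->Initialize(&param);

    unsigned char* grabs[2] = { yuv0, yuv1 };
    for (int i = 0; i < 2; ++i) {
        SSourcePicture pic = {};
        pic.iColorFormat = videoFormatI420;
        pic.iPicWidth    = width;
        pic.iPicHeight   = height;
        pic.iStride[0]   = width;
        pic.iStride[1]   = pic.iStride[2] = width / 2;
        pic.pData[0]     = grabs[i];
        pic.pData[1]     = pic.pData[0] + width * height;
        pic.pData[2]     = pic.pData[1] + width * height / 4;
        pic.uiTimeStamp  = i * 30000LL;  // milliseconds: frame 0 at 0 ms, frame 1 at 30 s

        SFrameBSInfo info = {};
        if (encoder->EncodeFrame(&pic, &info) == cmResultSuccess)
            writeSample(info, pic.uiTimeStamp);  // muxer derives the 30 s duration from the gap
    }

    encoder->Uninitialize();
    WelsDestroySVCEncoder(encoder);
}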

Related

C++ OpenCV VideoWriter frame rate synchronization

I am about to capture frames from a video grabber card. Those frames are processed and written to the HDD.
The whole setup is in a multithreading environment: the grabber writes the images to a queue, in another thread the images are processed, and another thread writes them to the HDD. If an image is good by the definition of the processor, it gets written to the HDD. If 10 images in a row are "bad", the file is completed. If 9 or fewer images are "bad", they all get written together with the next good image, so the file writer gets informed.
Here is the problem: if I do not do it this way and instead write each image directly after it is processed, the video file is fine, but up to 9 "bad" images are written as well. If I do it the way described above, the speed/frame rate of the video is not right. This description is a bit convoluted, so here is a simplified example so you can see the problem:
void FrameWriter::writeFrameLoop() {
    string path = getPath();
    cv::Size2i size(1350, 1080);
    cv::VideoWriter videoWriter(path, fourcc, 30, size);
    while (this->isRunning) {
        while (!this->frames.empty()) {
            usleep(100000); // this affects the speed/frame rate
            videoWriter.write(this->pop());
        }
        std::this_thread::sleep_for(10ms);
    }
    videoWriter.release();
}
The example is pretty simple: here I "block" the writing process with a sleep; remember, this runs in a different thread. It means that after the capturing is stopped, the file writing takes a bit longer.
But I would expect that this does not affect the video itself, because the frame rate is 30 and the images are still in the same order. Yet it does seem to affect the video file when I do not call "videoWriter.write" in time: in that case the video plays much faster than expected.
I thought only the configured frame rate of 30 and the count of written images would affect the video speed, but apparently that is not the case. Can anyone help me understand what is going on here?
I am using OpenCV 4.4.0 with Ubuntu 18.04.
Thank you for your help.
BR Michael
I think I know the reason for the fast-playing resulting videos.
In the constructor cv::VideoWriter videoWriter(path, fourcc, 30, size); you set the frame rate (FPS) of the resulting video to 30. This means that OpenCV expects exactly 30 frames to be written through the write() function for each second of the resulting video stream.
Also, for OpenCV it makes no difference how fast you call write() with a new frame; you may call it 5 times per second, or 10, or even 1000. The only thing that matters is that you provide exactly 30 frames for each second of video, and how fast you provide those 30 frames doesn't matter.
In other words, all your sleep(...) calls don't matter to the cv::VideoWriter class, and the same is true for practically all video rendering/conversion libraries. So pausing the thread doesn't change anything at all.
But in your case you're saying that you grab 10 frames per second of real-time video data from your grabber card. That means your actual FPS is 10 frames per second. So in order to solve your task correctly, the following should be done:
Remove all pausing functionality, like the sleep() calls. It is not needed at all and doesn't change the behavior of VideoWriter.
The first way to solve the task is to change the value 30 to 10 in your constructor cv::VideoWriter videoWriter(path, fourcc, 30, size);. This alone will solve your task, but you have to be sure that you really grab 10 frames per second, no more, no less. Your video will then play at the correct speed with a frame rate of 10 frames per second. This is the simplest solution; a video doesn't need to be 30 FPS to play correctly, a 10 FPS video will be played correctly by any player.
The other solution, if you really want your resulting video to be 30 frames per second, no less, no more, is to duplicate each grabbed frame three times, so that you get 30 frames out of the 10 frames you grab each second. By duplicating I just mean that you should call videoWriter.write(...) three times (in a small loop) with the same single frame, without any pause (like sleep) in between, as in the sketch below. Then again your resulting video will have exactly 30 frames per second.
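A rough sketch of that second option, reusing the member names from the question's snippet (so treat it as illustrative rather than drop-in code):
void FrameWriter::writeFrameLoop() {
    cv::VideoWriter videoWriter(getPath(), fourcc, 30, cv::Size2i(1350, 1080));
    while (this->isRunning) {
        while (!this->frames.empty()) {
            cv::Mat frame = this->pop();
            for (int i = 0; i < 3; ++i)      // 10 grabbed frames/s * 3 = 30 written frames/s
                videoWriter.write(frame);    // no sleep: VideoWriter only counts frames
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
    videoWriter.release();
}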
I think you just misunderstood how cv::VideoWriter works. You thought that write() renders the resulting video in real time, meaning that if you feed it 10 frames within exactly one second, it should render the video at the correct speed. But this writer does not render in real time; it simply assumes that the 10 frames passed in constitute 1/3 of a second of the resulting video, hence it expects 30 frames to be written for each resulting second.
If you have a camera that's not able to provide a constant frame rate of, let's say, 30 frames per second, you can also consider limiting the frame rate yourself to e.g. 25 and measuring the time elapsed since the last frame was written. You can also change your frame rate to arbitrary values, as long as the camera is able to provide them. An example of an implementation:
m_fps = 25;
std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
std::chrono::steady_clock::time_point now;
while (1) {
    now = std::chrono::steady_clock::now();
    // count() of a nanosecond duration is a 64-bit value, so use long long to avoid overflow
    long long duration = std::chrono::duration_cast<std::chrono::nanoseconds>(now - start).count();
    if ((double)duration > (1e9 / m_fps)) {   // enough time has passed for the next frame
        start = std::chrono::steady_clock::now();
        m_duration += duration;
        // Capture frame-by-frame
        if (m_cap->grab()) { m_cap->retrieve(m_frame); }
        // write frame
        m_writer->write(m_frame);
        cv::waitKey(1);
    }
}
I figured out my problem and it was me, not OpenCV.
Because of the multithreaded environment, I was writing the images (cv::Mat) to a queue. Since there are a lot of transformations (YUV -> RGB -> BGR -> crop etc.), I was expecting the cv::Mat object I put in the queue to be a deep copy. Instead, the cv::Mat was always the same object, which means that even if there were already 20 cv::Mat entries in my queue, all of them were the same and all of them "changed" whenever there was a new image.
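In other words, cv::Mat is a reference-counted handle, so pushing the grabbed Mat enqueues many handles to one shared pixel buffer. Cloning on the producer side fixes it; a minimal sketch (cap and frames are illustrative names, locking omitted):
#include <opencv2/opencv.hpp>
#include <queue>

// Producer-side sketch: enqueue deep copies so consumers never see the reused capture buffer.
void captureLoop(cv::VideoCapture& cap, std::queue<cv::Mat>& frames) {
    cv::Mat grabbed;
    while (cap.read(grabbed)) {        // 'grabbed' may keep aliasing the capture's internal buffer
        frames.push(grabbed.clone());  // clone() deep-copies the pixels before queueing
    }
}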

Modifying CISCO openh264 to take image frames and output compressed frames

Has anyone tried to modify the CISCO openh264 library to take JPEG images as input and compress them into P and I frames (output as frames, NOT video), and similarly to modify the decoder to take compressed P and I frames and generate uncompressed frames?
I have a camera looking at a static scene and taking pictures (1280x720p) every 30 seconds. The scene is almost static. Currently I am using JPEG compression to compress each frame individually, which results in an image size of ~270KB. This compressed frame is transferred over the internet to a storage server. Since there is very little motion in the scene, the 'I' frame size will be very small (I think it should be ~20-50KB). So it would be much more cost effective to transmit I frames over the internet instead of JPEG images.
Can anyone guide me to some project or about how to proceed with this task ?
You are describing exactly what a codec does. It takes images and compresses them. Their relationship in time is irrelevant to the compression step. The decoder then decides how to display them or just write them to disk. You don't need to modify openh264; what you want to do is exactly what it is designed to do.
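For what it's worth, the stock encoder API already hands you the compressed picture frame by frame: each EncodeFrame() call fills an SFrameBSInfo whose NAL units you can dump into their own buffer or file (the JPEGs would just need to be decoded to I420 first). A hedged sketch, assuming an already-initialized encoder and no error handling:
#include "codec_api.h"   // openh264 public API
#include <vector>

// Encode one I420 picture and return the access unit (an I or P frame) as a standalone buffer.
std::vector<unsigned char> encodeOnePicture(ISVCEncoder* encoder, SSourcePicture& pic) {
    std::vector<unsigned char> accessUnit;
    SFrameBSInfo info = {};
    if (encoder->EncodeFrame(&pic, &info) != cmResultSuccess)
        return accessUnit;                               // empty on failure

    for (int layer = 0; layer < info.iLayerNum; ++layer) {
        const SLayerBSInfo& l = info.sLayerInfo[layer];
        int layerSize = 0;
        for (int nal = 0; nal < l.iNalCount; ++nal)
            layerSize += l.pNalLengthInByte[nal];        // NAL units sit back to back in pBsBuf
        accessUnit.insert(accessUnit.end(), l.pBsBuf, l.pBsBuf + layerSize);
    }
    // info.eFrameType tells you whether this came out as an IDR/I frame or a P frame.
    return accessUnit;
}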

Capturing an AVI video with DirectShow

I'm trying to capture an AVI video, using DirectShow AVIMux and FileWriter Filters.
When I connect a SampleGrabber filter instead of the AVIMux, I can clearly see that the stream is 30 fps. However, upon capturing the video, each frame is duplicated 4 times and I get 120 frames instead of 30. The movie is 4 times slower than it should be, and only the first frame in each set of 4 is a key frame.
I tried the same experiment with 8 fps, and for each image I received I had 15 frames in the video. And in the case of 15 fps, I got each frame 8 times.
I tried both writing the code in C++ and testing it with Graph Edit Plus.
Is there any way I can control it? Maybe some restrictions on the AVIMux filter?
You don't specify your capture format which could have some bearing on the problem, but generally it sounds like the graph when writing to file has some bottleneck which prevents the stream from continuing to flow at 30fps. The camera is attempting to produce frames at 30fps, and it will do so as long as buffers are recycled for it to fill.
But here the buffers aren't available because the file writer is busy getting them onto the disk. The capture filter is starved and in this situation it increments the "dropped frame" counter which travels with each captured frame. AVIMux uses this count to insert an indicator into the AVI file which says in effect "a frame should have been available here to write to file, but isn't; at playback time repeat the last frame". So the file should have placeholders for 30 frames per second - some filled with actual frames, and some "dropped frames".
Also, you don't mention whether you're muxing in audio, which would act as a reference clock for the graph to maintain audio-video sync. When capture completes, if an audio stream is also being used, AVIMux alters the frame rate of the video stream to make the durations of the two streams equal. You can check whether AVIMux has altered the frame rate of the video stream by dumping the AVI file header (or maybe right-click on the file in Explorer and look at the properties).
If I had to hazard a guess as to the root of the problem, I'd wager the capture driver has a bug in calculating the dropped frame count which is in turn messing up AVIMux. Does this happen with a different camera?
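One way to test that theory is to read the driver's own counters through IAMDroppedFrames after a capture run; something along these lines (error handling trimmed, and depending on the driver the interface may be exposed on the filter rather than the pin):
#include <dshow.h>
#include <cstdio>

// Query the capture pin's dropped-frame statistics after the graph has stopped.
void dumpDroppedFrames(IPin* pCapturePin) {
    IAMDroppedFrames* pDropped = nullptr;
    if (SUCCEEDED(pCapturePin->QueryInterface(IID_IAMDroppedFrames,
                                              reinterpret_cast<void**>(&pDropped)))) {
        long dropped = 0, delivered = 0;
        pDropped->GetNumDropped(&dropped);       // frames the driver claims were dropped
        pDropped->GetNumNotDropped(&delivered);  // frames actually delivered downstream
        std::printf("dropped=%ld delivered=%ld\n", dropped, delivered);
        pDropped->Release();
    }
}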

DirectShow filter graph using WMASFWriter creates video which is too short

I am attempting to create a DirectShow source filter based on the pushsource example from the DirectShow SDK. This essentially outputs a set of bitmaps to a video. I have set up a filter graph which uses Async_reader with a Wave Parser for audio and my new filter to push the video (the filter is a CSourceStream and I populate my frames in the FillBuffer function). These are both connected to a WMASFWriter to output a WMV.
Each bitmap can last for several seconds so in the FillBuffer function I'm calling SetTime on the passed IMediaSample with a start and end time several seconds apart. This works fine when rendering to the screen but writing to disk results in a file which is too short in duration. It seems like the last bitmap is being ignored when writing a WMV (it is shown as the video ends rather than lasting for the intended duration). This is the case both with my filter and a modified pushsource filter (in which the frame length has been increased).
I've seen additional odd behaviour: at one point while I was trying to make this work, it was not possible to produce a video whose length wasn't a multiple of 10 seconds. I'm not sure what that was, but I thought I'd mention it in case it's relevant.
I think the end time is simply ignored. Normally video samples only have a start time, because they are a point in time. If there is movement in the video the movement looks fluent, even though the video frames are just points in time.
I think the solution is simple: because the video stays the same until the next frame is received, you can just add a dummy frame at the end of your video. You can simply repeat the previous frame.
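A sketch of what that could look like in a pushsource-style FillBuffer(), with illustrative member names and a fixed per-bitmap duration; the last bitmap is emitted one extra time so the writer sees a sample starting at the intended end of the previous one:
// CPushPinBitmap, m_bitmapData, m_cbBitmap, m_rtBitmapDuration etc. are illustrative members.
HRESULT CPushPinBitmap::FillBuffer(IMediaSample* pSample) {
    if (m_iFrame > m_iLastBitmap + 1)          // the one extra (dummy) sample has been sent
        return S_FALSE;                         // S_FALSE ends the stream

    // The dummy sample simply repeats the last real bitmap.
    int source = (m_iFrame <= m_iLastBitmap) ? m_iFrame : m_iLastBitmap;

    BYTE* pData = nullptr;
    pSample->GetPointer(&pData);
    memcpy(pData, m_bitmapData[source], m_cbBitmap);

    // Timestamps in 100 ns units; each bitmap occupies m_rtBitmapDuration on the timeline.
    REFERENCE_TIME rtStart = m_iFrame * m_rtBitmapDuration;
    REFERENCE_TIME rtStop  = rtStart + m_rtBitmapDuration;
    pSample->SetTime(&rtStart, &rtStop);

    ++m_iFrame;
    return S_OK;
}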

C/C++ library for seekable movie format

I'm doing some processing on some very large video files (often up to 16MP), and I need a way to store these videos in a format that allows seeking to specific frames (rather than to times, like ffmpeg). I was planning on just rolling my own format that concatenates all of the individually zlib compressed frames together, and then appends an index on the end that links frame numbers to file byte indices. Before I go about this though, I just wanted to check to make sure I'm not duplicating the functionality of another format/library. Has anyone heard of a format/library that allows lossless compression and random access of videos?
The reason it is hard to seek to a specific frame in most video codecs is that most frames depend on another frame or frames, so frames must be decoded as a group. For this reason, most libraries will only let you seek to the closest I-frame (intra frame - an independently decodable frame). To actually produce an image from a non-I-frame, data from other frames is required, so you have to decode several frames' worth of data.
The only ways I have seen this problem solved involve creating an index of some kind on the file. In other words, make a pass through the file and create an index of which frame corresponds to a certain time or section of the file. Since the seeking functions of most libraries are only able to seek to an I-frame, you may have to seek to the closest I-frame and then decode from there to the exact frame you want.
If space is not of high importance, I would suggest doing it like you say, but using JPEG compression instead of zlib, as it will give you a much higher compression ratio since it exploits the fact that you are dealing with image data.
If space is an issue, P frames (depend on previous frame/frames) can greatly reduce the size of the file. I would not mess with B frames (depend on previous and future frame/frames) since they make it much harder to get things right.
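For the roll-your-own route the question describes, the layout can be as simple as length-prefixed compressed frames followed by an offset table and a small trailer recording where the table starts; a reader then fseek()s straight to offsets[n] for frame n. A rough sketch with zlib (no error handling, fixed little-endian/64-bit assumptions):
#include <zlib.h>
#include <cstdint>
#include <cstdio>
#include <vector>

// Write zlib-compressed frames back to back, then the offset index, then a trailer
// holding the index position and the frame count.
void writeSeekableFile(const char* path, const std::vector<std::vector<uint8_t>>& frames) {
    std::FILE* f = std::fopen(path, "wb");
    std::vector<uint64_t> offsets;

    for (const auto& frame : frames) {
        offsets.push_back(static_cast<uint64_t>(std::ftell(f)));      // where this frame starts
        uLongf destLen = compressBound(static_cast<uLong>(frame.size()));
        std::vector<Bytef> dest(destLen);
        compress2(dest.data(), &destLen, frame.data(),
                  static_cast<uLong>(frame.size()), Z_BEST_SPEED);
        uint64_t storedLen = destLen;
        std::fwrite(&storedLen, sizeof(storedLen), 1, f);              // compressed-size prefix
        std::fwrite(dest.data(), 1, destLen, f);
    }

    uint64_t indexPos = static_cast<uint64_t>(std::ftell(f));
    std::fwrite(offsets.data(), sizeof(uint64_t), offsets.size(), f);  // the frame index
    uint64_t frameCount = offsets.size();
    std::fwrite(&indexPos, sizeof(indexPos), 1, f);                    // trailer: index position
    std::fwrite(&frameCount, sizeof(frameCount), 1, f);                //          frame count
    std::fclose(f);
}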
I have solved the problem of seeking to a specific frame in the presence of B and P frames in the past by using ffmpeg (libavformat) to demux the video into packets (1 frame's worth of data per packet) and concatenating these into a single file. The important thing is to keep an index into that file so you can find the packet bounds for a given frame. If the frame is an I-frame, you can just feed that frame's data into an ffmpeg decoder and it can be decoded. If the frame is a B or P frame, you have to go back to the last I-frame and decode forward from there. This can be quite tricky to get right, especially for B-frames, since they are often sent in a different order than they are displayed in.
Some formats allow you to change the number of key frames per second.
For example, I've used ffmpeg to encode to FLV at 25 frames per second with 25 key frames per second, and then used a player that was fine with moving to key frames. Basically this allowed me to do frame-by-frame seeking.
Also, the last time I checked, QuickTime can do frame-by-frame seeking without every frame having to be a key frame.
This may not be applicable to you, but those are my thoughts.