Modifying CISCO openh264 to take image frames and output compressed frames


Has anyone tried to modify the CISCO openh264 library to take JPEG images as input and compress them into P and I frames (output as frames, NOT video), and similarly to modify the decoder to take compressed P and I frames and generate uncompressed frames?
I have a camera looking at a static scene and taking a picture (1280x720) every 30 seconds. The scene is almost static. Currently I am using JPEG compression to compress each frame individually, which results in an image size of ~270KB. Each compressed frame is transferred over the internet to a storage server. Since there is very little motion in the scene, the 'P' frame size will be very small (I think it should be ~20-50KB). So it would be very cost effective to transmit P frames over the internet instead of JPEG images.
Can anyone point me to a project, or explain how to proceed with this task?

You are describing exactly what a codec does. It takes images and compresses them. Their relationship in time is irrelevant to the compression step. The decoder then decides whether to display them or just write them to disk. You don't need to modify openh264; what you want to do is exactly what it is designed to do.
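To put the question's numbers in perspective, here is a rough back-of-the-envelope estimate. It assumes every transmitted frame has the stated size; the 20 to 50 KB figures are the asker's guess, not measurements, and an occasional I frame would add to the total.

```python
SECONDS_PER_DAY = 24 * 60 * 60
CAPTURE_INTERVAL_S = 30  # one picture every 30 seconds (from the question)


def daily_kilobytes(frame_kb):
    """Data sent per day, in KB, if every frame has the given size in KB."""
    frames_per_day = SECONDS_PER_DAY // CAPTURE_INTERVAL_S  # 2880 frames
    return frames_per_day * frame_kb


jpeg_per_day = daily_kilobytes(270)  # current per-frame JPEG size
guess_low = daily_kilobytes(20)      # asker's lower guess for a compressed frame
guess_high = daily_kilobytes(50)     # asker's upper guess
print(jpeg_per_day, guess_low, guess_high)
```

With 1 MB taken as 1000 KB, that is about 778 MB per day for JPEG versus roughly 58 to 144 MB per day if the guess holds.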

Related

Changing the video frame quality (compressing) to JPEG and rendering

I'm totally new to the OpenCV library and I'm implementing a simple client-server application using OpenCV and Python. The client captures video from the webcam and sends it to the server. I need to compress each video frame in order to reduce bandwidth usage. As far as I could find, we can save the frame as a JPEG, which is a lossy compression technique. But with the provided method I have to write the frame to a JPEG image file. What I need is to render the low-quality (compressed) frame without writing it to a file. What I'm currently doing is writing to a JPEG and reading it back again. Two I/O cycles per frame is not efficient at all. Can anyone suggest a better solution?
cv2.imwrite('imageName.jpg', frame, [int(cv2.IMWRITE_JPEG_QUALITY), 90])
newFrame=cv2.imread('imageName.jpg')
cv2.imshow('preview',newFrame);
(frame = the current image frame I captured,
newFrame = the saved image loaded back into the program)

How to insert a key frame (I-frame) into an H.264 video stream with the ffmpeg C++ API?

I have a real-time video stream and want to cut video clips from it at accurate timestamps (pts).
When I receive an AVPacket, I decode it, do some processing, and cache the packet. I don't want to re-encode all of the packets, because that costs CPU resources.
An H.264 stream is made up of many GOPs. Usually a clip should begin at a key frame and end at a key frame; otherwise the first few frames of the clip display with errors.
Currently I use av_write_frame to write the AVPackets to the video file. But sometimes the GOP is very long, for example 250 frames, which is about 8.3 s at 30 frames per second. That means two I-frames can be 250 frames apart. The clip is short, and I don't want to add too many unused frames.
What should I do? I think I should insert an I-frame at the start of the clip. Can I change a P-frame into an I-frame?
Thanks for reading!

This is not possible in the general case, but may be in specific cases. Even then, there are no open source or free tools to do this, and I am unaware of any commercial tools. The reason it is not possible in the general case is that each frame can reference up to 16 other frames. So you cannot just replace a single frame; you would need to replace all the referenced frames as well. Doing this will likely take almost as much CPU as encoding the whole GOP.

Most efficient way to store video data

In order to do some specific editing on some .avi files, I'd like to create an application (in C++) that is able to load, edit, and save those .avi files. But what is the most efficient way? At first thought, a simple 3D array containing a 2D array of pixels for every frame seems the simplest solution, but its size would be ENORMOUS. Let's assume that a pixel only needs a color. One color would mean 3 bytes (1 char r, 1 char g, 1 char b). With a 1920x1080 video format, that is about 2 million pixels, or roughly 6 MEGABYTES for only one frame! This data may or may not be smaller if I use pointers for the colors, so that already used colors don't take up more space. I don't really know, since I'm pretty new to C++ and the whole low-level stuff. (As a comparison: one of my AVI files recorded with the Xvid codec is 40 seconds long at 30fps and is only 2MB.)
So how would you actually store the video data (not even the audio, just the video) efficiently, while still being able to easily make per-frame changes to it?

As you have realised, uncompressed video is enormous and it is not practical to store an entire video in this way.
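To make the scale concrete, here is a quick calculation for the clip described in the question (1920x1080, 3 bytes per pixel, 30 fps, 40 seconds). The function name is just for illustration.

```python
def raw_video_bytes(width, height, bytes_per_pixel, fps, seconds):
    """Return (bytes per frame, bytes for the whole clip) for uncompressed video."""
    frame_bytes = width * height * bytes_per_pixel
    total_bytes = frame_bytes * fps * seconds
    return frame_bytes, total_bytes


frame_bytes, total_bytes = raw_video_bytes(1920, 1080, 3, 30, 40)
print(f"{frame_bytes / 1e6:.1f} MB per frame")     # 6.2 MB per frame
print(f"{total_bytes / 1e9:.2f} GB for the clip")  # 7.46 GB for the clip
```

So one frame is about 6.2 MB (about 2 million pixels at 3 bytes each), and the 40-second clip would be about 7.5 GB uncompressed, compared with the 2 MB Xvid file.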
Video compression is an extremely complex topic, but more-or-less, it works as follows: certain "key-frames" are compressed using fairly standard compression techniques similar or identical to still-photo compression such as JPEG. Frames following key-frames are compressed by comparing the frame with the previous one and looking for changes (such as moving blocks). Every now and again, a new key-frame is used.
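The key-frame plus difference idea can be shown with a toy sketch. This is not how H.264 or Xvid actually work (real codecs use motion compensation, transforms and quantization); it only stores a byte-wise difference from the previous frame and compresses it with zlib. Frames are assumed to be equal-length byte strings, and all names are made up for the example.

```python
import random
import zlib


def encode_key(frame):
    """Compress a frame on its own, like a key frame."""
    return zlib.compress(frame)


def encode_delta(prev, cur):
    """Compress only the byte-wise difference from the previous frame."""
    diff = bytes((c - p) % 256 for p, c in zip(prev, cur))
    return zlib.compress(diff)


def decode_delta(prev, payload):
    """Rebuild a frame from the previous frame plus the stored difference."""
    diff = zlib.decompress(payload)
    return bytes((p + d) % 256 for p, d in zip(prev, diff))


rng = random.Random(0)
frame0 = bytes(rng.randrange(256) for _ in range(64 * 64))  # noisy 64x64 grayscale frame
changed = bytearray(frame0)
for i in range(100, 110):  # only ten pixels change between the two frames
    changed[i] = (changed[i] + 1) % 256
frame1 = bytes(changed)

key = encode_key(frame0)
delta = encode_delta(frame0, frame1)
assert decode_delta(frame0, delta) == frame1
print(len(key), len(delta))  # the delta is far smaller than the key frame
```

Because almost nothing changed, the difference is mostly zeros and compresses to a tiny fraction of the key frame's size.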
You don't really have to worry much about that as you are not going to write your own video coder/decoder (codec). There are standard ones.
What will happen is that your program will decode the compressed video frame-by-frame and keep a certain number of frames in memory while you are working on them and then re-encode them when it is finished. In the uncompressed form, you will have access to the individual pixels and can work on them how you want.
You are probably not going to do that by yourself either, as it is very hard. You probably need to use a framework such as OpenCV. There are a huge number of standard filters and tools built into these frameworks, and it may be that what you want to do is already implemented somewhere.
The OpenCV framework can return individual frames in a Mat object, and you can then access the pixels. See the post "Get Pixels from Mat".
OpenCV tutorial page: OpenCV Tutorial

How to read .avi files in C++

I want to read in an .avi video file for a program that I am making. I have the file location saved as a string. Are there any good tutorials on using .avi files in C++, or does anyone know how to read one in? Is it the same as for normal files?
I have a previously asked SO question that goes into more detail, but here is what I want to do:
I am making a program that will detect faces (through OpenCV). As of now I have been given a video processor program that detects each face in a frame and returns the frame as an image along with the CvRect of each face. I want to take these faces and test them to validate that they are all actually faces.
After I have all the (tested) faces, I want to take the images and test them together. I test the faces in each frame for size and distance changes. If the faces pass this test over a span of two seconds' worth of frames, then I want to crop the face and make it the subject of each frame.
After each frame is cropped, I want to save the new video file for the user.
Hopefully that helps. If anyone needs a better explanation, please let me know.

First of all, a little background.
What is AVI?
AVI stands for Audio Video Interleave. It is a special case of RIFF (Resource Interchange File Format). AVI was defined by Microsoft and has long been one of the most common container formats for audio/video data.
I assume you want to read an AVI file and decode the compressed video frames. An AVI file is like any other file: you can use fread() (in C) or an iostream (in C++) to open it and read its contents. However, those contents are video frames in a compressed format. Compression lets large video content be packed into much less storage space. To make sense of the compressed data you have to decode it. Note that AVI itself is only the container; the frames inside are encoded by a codec, so you would have to study both the AVI container layout and the codec's format, and then extract and decode the frames. The resulting raw video data can then be fed to a display device.
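As a small illustration of "AVI is a special case of RIFF": a RIFF file starts with the four bytes "RIFF", a little-endian 32-bit size, and a four-byte form type, which is "AVI " (with a trailing space) for AVI files. The sketch below only checks that 12-byte header; it does not parse the rest of the container or decode any video. The function name is made up for the example.

```python
import struct


def parse_riff_header(header):
    """Parse the first 12 bytes of a RIFF file.

    Returns (size, form_type), where size is the byte count that follows
    the size field and form_type is b"AVI " for AVI files.
    """
    if len(header) < 12:
        raise ValueError("need at least 12 bytes")
    chunk_id, size, form_type = struct.unpack("<4sI4s", header[:12])
    if chunk_id != b"RIFF":
        raise ValueError("not a RIFF file")
    return size, form_type


# Synthetic header for demonstration; with a real file you would pass f.read(12).
sample = b"RIFF" + struct.pack("<I", 1024) + b"AVI "
size, form_type = parse_riff_header(sample)
print(size, form_type)
```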

It seems you are staying within OpenCV, so things are easy. If OpenCV is compiled properly, it can delegate I/O, encoding and decoding to other libraries. It can use QuickTime and others, for example, but ffmpeg is the best choice. You open, read and decode everything through the OpenCV API, which gives you the video frame by frame.
Make sure your OpenCV is compiled with ffmpeg support and then read the OpenCV tutorial on how to read and write AVI files. It's really easy.
Getting OpenCV built with ffmpeg support might be hard, though. You might want to switch to an older version of OpenCV if you can't get ffmpeg running with the current one.
Personally, I would not spend time trying to read the video yourself; delegate the task to OpenCV. That's how it is supposed to be used.

C/C++ library for seekable movie format

I'm doing some processing on some very large video files (often up to 16MP), and I need a way to store these videos in a format that allows seeking to specific frames (rather than to times, like ffmpeg). I was planning on just rolling my own format that concatenates all of the individually zlib compressed frames together, and then appends an index on the end that links frame numbers to file byte indices. Before I go about this though, I just wanted to check to make sure I'm not duplicating the functionality of another format/library. Has anyone heard of a format/library that allows lossless compression and random access of videos?

The reason it is hard to seek to a specific frame in most video codecs is that most frames depend on another frame or frames, so frames must be decoded as a group. For this reason, most libraries will only let you seek to the closest I-frame (Intra-frame - independently decodable frame). To actually produce an image from a non-I-frame, data from other frames is required, so you have to decode a number of frames worth of data.
The only ways I have seen this problem solved involve creating an index of some kind on the file. In other words, make a pass through the file and create an index of which frame corresponds to a certain time or section of the file. Since the seeking functions of most libraries are only able to seek to an I-frame, you may have to seek to the closest preceding I-frame and then decode from there to the exact frame you want.
If space is not of high importance, I would suggest doing it like you say, but using JPEG compression instead of zlib, as it will give you a much higher compression ratio since it exploits the fact that you are dealing with image data. Keep in mind that standard JPEG is lossy, so this only fits if you can give up strict losslessness.
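The layout proposed in the question (individually compressed frames followed by an index at the end) could look roughly like the sketch below. It keeps zlib so the result stays lossless. The on-disk layout here is an assumption made for illustration: each index entry is a pair of 64-bit little-endian integers (byte offset, compressed length), and a 16-byte footer stores the index offset and the frame count. The stream is assumed to start at position 0.

```python
import io
import struct
import zlib

ENTRY = struct.Struct("<QQ")   # per frame: byte offset, compressed length
FOOTER = struct.Struct("<QQ")  # index offset, frame count


def write_container(out, frames):
    """Write zlib-compressed frames, then the index, then the footer."""
    index = []
    for frame in frames:
        blob = zlib.compress(frame)
        index.append((out.tell(), len(blob)))
        out.write(blob)
    index_offset = out.tell()
    for offset, length in index:
        out.write(ENTRY.pack(offset, length))
    out.write(FOOTER.pack(index_offset, len(index)))


def read_frame(src, n):
    """Seek directly to frame n using the trailing index and decompress it."""
    src.seek(-FOOTER.size, io.SEEK_END)
    index_offset, count = FOOTER.unpack(src.read(FOOTER.size))
    if not 0 <= n < count:
        raise IndexError(n)
    src.seek(index_offset + n * ENTRY.size)
    offset, length = ENTRY.unpack(src.read(ENTRY.size))
    src.seek(offset)
    return zlib.decompress(src.read(length))


buf = io.BytesIO()
frames = [bytes([i]) * 1000 for i in range(5)]  # five dummy "frames"
write_container(buf, frames)
assert read_frame(buf, 3) == frames[3]
```

Reading frame n costs three seeks and one decompression, regardless of how many frames the file holds, because every frame is independently decodable.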
If space is an issue, P frames (depend on previous frame/frames) can greatly reduce the size of the file. I would not mess with B frames (depend on previous and future frame/frames) since they make it much harder to get things right.
I have solved the problem of seeking to a specific frame in the presence of B and P frames in the past by using ffmpeg (libavformat) to demux the video into packets (one frame's worth of data per packet) and concatenating these into a single file. The important thing is to keep an index into that file so you can find the packet bounds for a given frame. If the frame is an I-frame, you can just feed that frame's data into an ffmpeg decoder and it can be decoded. If the frame is a B or P frame, you have to go back to the last I-frame and decode forward from there. This can be quite tricky to get right, especially for B-frames, since they are often sent in a different order from the one in which they are displayed.
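The "go back to the last I-frame and decode forward" step can be sketched as a lookup over a sorted list of key-frame numbers. This is a simplified illustration only: it assumes frames are stored in display order (no B-frame reordering) and that every frame depends only on frames since the previous I-frame. The function name is hypothetical and no ffmpeg calls are involved.

```python
from bisect import bisect_right


def frames_to_decode(keyframes, target):
    """Return the frame numbers to decode in order to display `target`.

    keyframes is a sorted list of frame numbers that are I-frames.
    """
    pos = bisect_right(keyframes, target)
    if pos == 0:
        raise ValueError("no I-frame at or before the target frame")
    start = keyframes[pos - 1]  # last I-frame at or before the target
    return range(start, target + 1)


keyframes = [0, 250, 500]
print(list(frames_to_decode(keyframes, 252)))  # [250, 251, 252]
```

With a 250-frame GOP, the worst case is decoding 250 frames to reach the one you want, which is why shorter key-frame intervals make frame-accurate seeking cheaper.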

Some formats allow you to change the number of key frames per second.
For example, I've used ffmpeg to encode to FLV at 25 frames per second with 25 key frames per second (so every frame is a key frame), and then used a player that was fine with moving to key frames. Basically this allowed me to do frame-by-frame seeking.
Also, the last time I checked, QuickTime could do frame-by-frame seeking without every frame having to be a key frame.
This may not be applicable to you, but those are my thoughts.
