Is there a way to distinguish dual-mono and stereo in GStreamer?

I have an audio decoder library and I am writing a GStreamer plugin for it.
I am setting the plugin's source pad caps as
caps = gst_caps_new_simple ("audio/x-raw",
    "format",   G_TYPE_STRING, "S16LE",
    "layout",   G_TYPE_STRING, "interleaved",
    "rate",     G_TYPE_INT,    sample_freq,
    "channels", G_TYPE_INT,    channels,
    NULL);
My question is: how do I inform the GStreamer framework whether the audio is stereo or dual-mono, given that in both cases channels will be 2?

I have seen elements use
channel-mode=dual
as an extra caps field.
I have seen mono, stereo, dual, and joint as options. Of course, it depends on the downstream elements to understand this field and act accordingly.
This is a private, good-will agreement between elements. It is not an official definition in any way; the format itself doesn't declare this (as far as I know).
The correct way would be to have two mono tracks.
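For illustration, a minimal sketch of advertising that field on the source caps, following the informal convention above ("channel-mode" is not an official audio/x-raw field, and is_dual_mono is a hypothetical flag from your decoder):

caps = gst_caps_new_simple ("audio/x-raw",
    "format",       G_TYPE_STRING, "S16LE",
    "layout",       G_TYPE_STRING, "interleaved",
    "rate",         G_TYPE_INT,    sample_freq,
    "channels",     G_TYPE_INT,    2,
    /* private convention; downstream elements must know to read it */
    "channel-mode", G_TYPE_STRING, is_dual_mono ? "dual" : "stereo",
    NULL);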

Related

Do ffmpeg libs know how to correctly set number of samples per frame, according to the encoder used?

I am trying to build a simple transcoder that can take MP3 and WAV files and segment them using the segment formatting option, while also possibly changing the sample rate, bit rate, and channel layout.
For this, I followed the code in the transcoding.c example. The issue appears when transcoding from a 32 kHz MP3 to a 48 kHz MP3: the MP3 encoder expects a frame size of 1152 samples, but libavfilter provides me with frames that contain 1254 samples. So when I try to do the encoding, I get this message: more samples than frame size. This problem can also be reproduced with the example code; just set the sample rate of the encoder to 48 kHz.
One option is to use the asetnsamples filter and set it to 1152. That fixes upsampling to 48 kHz, but then downsampling to 24 kHz won't work, because there the encoder expects frame sizes of 576.
I wouldn't want to set this filter's value depending on the input information, as it may become messy later if I support more file types, such as AAC.
Is there any way of making the libavfilter libraries aware of this flow and trigger the proper filtering and transcoding, without having to use lower-level APIs such as libswresample, or doing frame buffering myself?
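One hedged sketch, assuming the filter graph layout of transcoding.c (enc_ctx and buffersink_ctx as in that example): libavfilter's buffer sink can be told the encoder's frame size directly, after which it emits frames of exactly that many samples, so asetnsamples need not be hard-coded per codec and rate:

#include <libavfilter/buffersink.h>

/* After the filter graph is configured: MP3 wants 1152 samples per
 * frame at 48 kHz and 576 at 24 kHz; AAC wants 1024. Reading the
 * value from the encoder context covers all of them. */
if (enc_ctx->frame_size > 0)
    av_buffersink_set_frame_size(buffersink_ctx, enc_ctx->frame_size);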

V4L2 MJPEG Chroma Subsampling

My cam gives me jpeg with chroma sub-sampling 4:2:2, but I need 4:2:0.
Can I change MJPEG default chroma sub-sampling with v4l2?
v4l2 itself provides a very thin layer around the actual video data that is transferred: it will simply give you the formats that the camera (the hardware!) delivers.
So if your hardware offers two distinct formats, there is no way that v4l2 will offer you anything else.
You might want to check out the libv4l2 library, which does some basic colorspace conversion: in general it can convert from the most exotic hardware formats to a handful of "standard" formats, so your application does not need to support every format any hardware manufacturer can come up with. However, it is not very likely that these standard formats include a very specific (compressed) format like the one you need.
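To see what the hardware actually offers, here is a minimal sketch (the device path is an assumption) that enumerates the pixel formats v4l2 exposes:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/videodev2.h>

int main(void)
{
    int fd = open("/dev/video0", O_RDWR);  /* adjust to your device */
    struct v4l2_fmtdesc fmt;
    memset(&fmt, 0, sizeof fmt);
    fmt.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;

    if (fd < 0)
        return 1;
    /* VIDIOC_ENUM_FMT walks the list of formats the driver offers;
     * if MJPEG 4:2:0 is not in it, v4l2 cannot provide it */
    while (ioctl(fd, VIDIOC_ENUM_FMT, &fmt) == 0) {
        printf("%c%c%c%c: %s\n",
               fmt.pixelformat & 0xff, (fmt.pixelformat >> 8) & 0xff,
               (fmt.pixelformat >> 16) & 0xff, (fmt.pixelformat >> 24) & 0xff,
               (const char *)fmt.description);
        fmt.index++;
    }
    return 0;
}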

C++ ffmpeg real-time video transmisson

I am a student currently working on my final project. Our project focuses on research into a new type of network coding. My current task is to do a real-time video transmission to test the network coding. I have learned some ffmpeg and OpenCV, and I have finished a C++ program which can divide the video into frames and send it frame by frame. However, this way the transmitted data (the frames) is much larger than the original video file. My professor advised me to find the keyframes and inter-frame diffs of the video (MJPEG format), so that I transmit only the keyframes and inter-frame diffs instead of all the frames with their large amount of redundancy, and thereby reduce the transmitted data. I have no idea how to do this in C++ with ffmpeg or OpenCV. Can anyone give any advice?
For my old program, please refer here: C++ Video streaming and transmission
I would recommend against using ffmpeg/libav* at all; I would recommend using libx264 directly. With x264 you can have greater control over NALU slice sizes, as well as lower encoder latency by utilizing callbacks.
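For illustration, a minimal sketch of the libx264 setup this answer alludes to (the dimensions and slice size are assumptions; error handling omitted):

#include <x264.h>

x264_t *open_low_latency_encoder(int width, int height)
{
    x264_param_t p;
    /* the zerolatency tune disables frame reordering and lookahead */
    x264_param_default_preset(&p, "ultrafast", "zerolatency");
    p.i_width  = width;
    p.i_height = height;
    p.i_csp    = X264_CSP_I420;
    /* keep each NAL unit under a typical MTU so slices map to packets;
     * for even lower latency, x264 can also hand out NALs one by one
     * via the p.nalu_process callback */
    p.i_slice_max_size = 1400;
    p.b_repeat_headers = 1;  /* resend SPS/PPS with every keyframe */
    x264_param_apply_profile(&p, "baseline");
    return x264_encoder_open(&p);
}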
Two questions whose answers may already help you:
How are you interfacing from C++ to ffmpeg? "ffmpeg" generally refers to the command-line tool; from C++ you generally use the individual libraries which are part of ffmpeg. You should use libavcodec to encode your frames, and possibly libavformat to packetize them into a container format.
Which codec do you use?
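As a rough sketch of that libavcodec path (modern send/receive API; encoder setup omitted, and send_over_network is a hypothetical function):

#include <libavcodec/avcodec.h>

if (avcodec_send_frame(enc_ctx, frame) >= 0) {
    AVPacket *pkt = av_packet_alloc();
    /* one input frame can yield zero or more packets */
    while (avcodec_receive_packet(enc_ctx, pkt) >= 0) {
        send_over_network(pkt->data, pkt->size);  /* hypothetical */
        av_packet_unref(pkt);
    }
    av_packet_free(&pkt);
}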

Video players questions

Given that FFmpeg is the leading multimedia framework and most video/audio players use it, I'm wondering some things about audio/video players that use FFmpeg as an intermediate layer.
I'm studying how audio/video players work, and I have some questions.
I was reading the ffplay source code and I saw that ffplay handles the subtitle stream. I tried to use an MKV file with a subtitle in it and it didn't work; I tried arguments such as -sst but nothing happened. I was also reading about subtitles and how video files (or should I say containers?) use them. I saw that there are two ways of carrying a subtitle: hardsubs and softsubs. Roughly speaking, hardsubs are burned in and become part of the video, while softsubs are carried as a separate stream of subtitles (I might be wrong; please correct me).
The question is: how do players handle this? When the subtitle is part of the video there's nothing to do, the video stream itself shows the subtitle, but what about softsubs? How are they handled? (I heard something about text subs as well.) How does the subtitle appear on the screen, and how can it be configured (fonts, size, colors) without encoding everything again?
I was studying some video player source codes, and some or most of them use OpenGL to render the frame, while others use a kind of canvas (such as Qt's QWidget). What is the most used, and which one is fastest and better? OpenGL with shaders and such? Handling YUV or RGB and so on? How does that work?
It might be a dumb question, but what is the format that AVFrame returns? For example, when we want to save frames as images, first we need the frame and then we convert it; from which format are we converting? Does it change according to the video codec, or is it always the same?
Most of the videos I've been trying to handle use YUV720P; I tried to save the frames as PNG and I needed to convert to RGB first. I did a test with the players: I paused at the same frame, took screenshots, and compared. The video players show the frames more colorfully. I tried the same with ffplay, which uses SDL (OpenGL), and the colors (quality) of the frames seem to be really low. What might that be? What do they do? Is it shaders (or a kind of magic? haha).
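(For context: the YUV-to-RGB conversion described here is typically done with libswscale; a minimal sketch, assuming a decoded YUV420P frame and a pre-allocated RGB buffer rgb_data/rgb_linesize:)

#include <libswscale/swscale.h>

struct SwsContext *sws = sws_getContext(
    frame->width, frame->height, AV_PIX_FMT_YUV420P,  /* source */
    frame->width, frame->height, AV_PIX_FMT_RGB24,    /* target */
    SWS_BILINEAR, NULL, NULL, NULL);
sws_scale(sws, (const uint8_t * const *)frame->data, frame->linesize,
          0, frame->height, rgb_data, rgb_linesize);
sws_freeContext(sws);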
Well, I think that is it for now. I hope you can help me with that.
If this isn't the correct place, please let me know where; I haven't found another place in the Stack Exchange communities.
There are a lot of questions in one post:
How are 'soft subtitles' handled?
The same way as any other stream:
read the packets of that stream from the container
give each packet to a decoder
use the decoded frame as you wish. With most containers that support subtitles, the presentation time will be present; all you need at that point is to get the text and burn it onto the image at the same presentation time. There are a lot of ways to print the text on the video, with ffmpeg or another library (a sketch follows this list).
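A minimal sketch of those steps with libavcodec (the stream index and decoder context names are illustrative; error handling mostly omitted):

#include <stdio.h>
#include <libavcodec/avcodec.h>

AVSubtitle sub;
int got_sub = 0;
/* pkt was read from the container; if it belongs to the subtitle
 * stream, hand it to the subtitle decoder */
if (pkt->stream_index == subtitle_stream_index &&
    avcodec_decode_subtitle2(sub_dec_ctx, &sub, &got_sub, pkt) >= 0 &&
    got_sub) {
    /* sub.start_display_time / sub.end_display_time give the window
     * during which the rects should be drawn onto the video */
    for (unsigned i = 0; i < sub.num_rects; i++)
        if (sub.rects[i]->type == SUBTITLE_ASS && sub.rects[i]->ass)
            printf("%s\n", sub.rects[i]->ass);
    avsubtitle_free(&sub);
}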
What is the most used renderer, and which one is fastest and better?
Most used depends on the underlying system. For instance, Qt only wraps native renderers, and even has an OpenGL version.
You can only be as fast as the underlying system allows. Does it support double-buffering? Can it render your decoded pixel format, or do you have to perform color conversion first? This topic is too broad.
Better depends only on the use case; this is also too broad.
What is the format that AVFrame returns?
It is a raw format (enum AVPixelFormat), and it depends on the codec. There is a list of YUV and RGB FOURCCs which covers most formats in ffmpeg. Programmatically, you can access the table AVCodec::pix_fmts to obtain the pixel formats a specific codec supports.
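For example, a minimal sketch (the codec ID is chosen arbitrarily) that walks that table:

#include <stdio.h>
#include <libavcodec/avcodec.h>
#include <libavutil/pixdesc.h>

int main(void)
{
    const AVCodec *codec = avcodec_find_encoder(AV_CODEC_ID_MPEG4);
    /* the pix_fmts table is terminated by AV_PIX_FMT_NONE */
    for (const enum AVPixelFormat *p = codec ? codec->pix_fmts : NULL;
         p && *p != AV_PIX_FMT_NONE; p++)
        printf("%s\n", av_get_pix_fmt_name(*p));
    return 0;
}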

Custom AVI/MP4 file writer

I am writing some video files under Windows from a camera.
I need the data unaltered - not MP4's 'uncompressed', i.e. no YUV, no color interpolation - just the raw camera sensor bytestream.
At the moment I am writing this directly to disk and re-reading it later to recode into a usable video. But with no header, I have to keep track of image size, frame rate, color balance, etc. separately.
I could add a custom header, but even if the actual video data is unreadable by anything other than my app, using an AVI file would at least give me a relatively standard header to store all the camera parameters, and it also means that resolution, length, etc. would show up in Explorer.
Is there an easy way of generating an AVI header/footer without sending all the data through DirectShow or VfW? The data is coming in at >250 MB/s and I can't lose any frames, so I don't have time to do much more than dump each frame to disk.
edit: Perhaps MP4 would be better; I have a lot of metadata about the camera config that isn't in the AVI standard.
Well, after figuring out what 'reasonable' AVI headers would be for your stream (e.g. if you use a custom codec fourcc, probably no application would be able to do anything useful with it -- so why bother with AVI?), you could just write a prebuilt RIFF-AVI header at the beginning of your file. It's not too hard to figure out the values.
Each frame then has to be enclosed in its own RIFF chunk (4-byte type "00db" + 4-byte length + your data).
After the fact, you have to fix up num_frames and some length fields in the header. And for files >2 GB, don't forget the OpenDML extension for the header.
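A minimal sketch of that per-frame framing ("00db" fourcc + 4-byte length + payload, padded to an even size; the raw length write assumes a little-endian machine, as RIFF sizes are little-endian):

#include <stdint.h>
#include <stdio.h>

void write_frame_chunk(FILE *f, const uint8_t *data, uint32_t len)
{
    fwrite("00db", 1, 4, f);   /* chunk fourcc for a video frame */
    fwrite(&len, 4, 1, f);     /* chunk size; little-endian assumed */
    fwrite(data, 1, len, f);   /* raw frame bytes */
    if (len & 1)               /* RIFF chunks are word-aligned */
        fputc(0, f);
}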
Martin, since you are proficient in OpenCV, couldn't you just use cvCreateVideoWriter() for creating an uncompressed .avi?
CvVideoWriter* cvCreateVideoWriter(const char* filename, int fourcc, double fps, CvSize frame_size, int is_color=1)
Regarding the fourcc param, the documentation states:
Under Win32 if 0 is passed while using an avi filename it will create a video writer that creates an uncompressed avi file.
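A minimal usage sketch under that documented behavior (the path, fps, and frame size are assumptions; frame is an IplImage* you captured):

#include <opencv/highgui.h>

CvVideoWriter *writer = cvCreateVideoWriter("raw.avi", 0, 25.0,
                                            cvSize(640, 480), 1);
cvWriteFrame(writer, frame);       /* repeat per captured frame */
cvReleaseVideoWriter(&writer);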
It sounds like you could really benefit from using OpenCV; it could probably handle a lot of this nicely for you. Take a look and see if it suits your needs: http://opencv.willowgarage.com/documentation/cpp/reading_and_writing_images_and_video.html#videocapture
You can use OpenCV to read and write AVI files.
See http://opencv.willowgarage.com/documentation/cpp/reading_and_writing_images_and_video.html
Note that OpenCV can also be used to grab images from a camera.