GStreamer subtitles while recording

I am new to GStreamer and am trying to encode a video stream (for now v4l2src) together with a subtitle stream and mux them into an MPEG-TS container. I am able to use 'textoverlay' to set the text, but I don't want to burn it into the image. Instead, I want to use the subtitle stream to carry 'metadata' that is generated while the video is being recorded.
Is there a way that I can add subtitles into the MPEG-TS as time passes? The content of the subtitle text is not known beforehand, for example the GPS coordinates of a moving camera.
There is the 'subtitleoverlay' plugin, but I do not fully understand it. Does it burn the text into the image like 'textoverlay', or does it add a separate stream?

I think that subtitleoverlay renders and burns the text into the video frames. Check the example pipeline; there is no magic: after subtitleoverlay there is a videoconvert, which works on raw video frames.
I guess you can just attach the subtitle stream to the mpegtsmux element. I hope this is possible by now; there is this bug/feature request that should make it possible.
I checked the capabilities of mpegtsmux and it supports:
subpicture/x-dvb
application/x-teletext
If you can somehow manage to feed the subtitles in as subpicture/x-dvb, then on the receiver you can use the dvbsuboverlay element to display them.
I didn't find a way to actually create such a stream from a text file (I found this question but no answer; maybe ask on IRC).
I have a feeling that teletext is capable of carrying subtitles, but this may not be what you want (I don't know).
In both cases, I think that if you had a pre-rendered stream containing only the subtitles (say, a subtitles.mpg), you could use that. I guess there are some tools out in the wild for that.
Hope you can use that somehow
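To make the "attach a subtitle stream to mpegtsmux" idea concrete, here is an untested sketch. It assumes mpegtsmux will accept a subpicture/x-dvb stream on a request pad and that you can produce valid DVB subpicture packets yourself (that is the hard part, as said above); an appsrc pushes timestamped subtitle buffers alongside the video branch as your metadata arrives:

    #include <gst/gst.h>
    #include <gst/app/gstappsrc.h>

    int main(int argc, char *argv[]) {
        gst_init(&argc, &argv);

        // Video branch plus a subtitle branch into the same muxer.
        // NOTE: assumes mpegtsmux links to subpicture/x-dvb; the DVB
        // subpicture payload below is a placeholder, not real data.
        GError *err = NULL;
        GstElement *pipeline = gst_parse_launch(
            "v4l2src ! videoconvert ! x264enc tune=zerolatency ! queue ! mux. "
            "appsrc name=subsrc format=time caps=subpicture/x-dvb ! queue ! mux. "
            "mpegtsmux name=mux ! filesink location=out.ts",
            &err);
        if (!pipeline) {
            g_printerr("parse error: %s\n", err->message);
            return 1;
        }

        GstElement *subsrc = gst_bin_get_by_name(GST_BIN(pipeline), "subsrc");
        gst_element_set_state(pipeline, GST_STATE_PLAYING);

        // While recording, push a subtitle packet whenever new metadata
        // (e.g. a GPS fix) arrives. 'data' must already be an encoded DVB
        // subpicture packet -- producing one is the unsolved part above.
        guint8 data[] = { 0x00 /* encoded DVB subpicture bytes */ };
        GstBuffer *buf = gst_buffer_new_allocate(NULL, sizeof(data), NULL);
        gst_buffer_fill(buf, 0, data, sizeof(data));
        GST_BUFFER_PTS(buf) = 2 * GST_SECOND;       // when the text appears
        GST_BUFFER_DURATION(buf) = 1 * GST_SECOND;  // how long it is valid
        gst_app_src_push_buffer(GST_APP_SRC(subsrc), buf);

        // ... run a main loop; on shutdown, signal EOS and clean up ...
        gst_app_src_end_of_stream(GST_APP_SRC(subsrc));
        gst_element_set_state(pipeline, GST_STATE_NULL);
        gst_object_unref(subsrc);
        gst_object_unref(pipeline);
        return 0;
    }

The point of the appsrc approach is that the buffer timestamps, not the wall clock, decide where in the TS the subtitle lands, so metadata generated during recording stays in sync with the video.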

Related

Calculating bitrate box bufferSizeDB value for AAC track in MP4 file

I am writing a custom fragmented MPEG-4 sink in Media Foundation. I have been able to figure out and get info on nearly everything, but I am stuck on the bufferSizeDB item for an audio AAC "esds" container. The BitRateBox used in the MP4 format includes the bufferSizeDB member to assist decoders. Does anyone know how this value is calculated, or what the guidelines are for setting it?
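For context: bufferSizeDB comes from the MPEG-4 Systems DecoderConfigDescriptor (the same bufferSizeDB/maxBitrate/avgBitrate triple also appears in the 'btrt' box) and is defined as the size of the decoding buffer in bytes. A common heuristic, and it is only a heuristic, is to use the size of the largest access unit in the track; ComputeBufferSizeDB below is a hypothetical helper sketching that:

    #include <algorithm>
    #include <cstdint>
    #include <vector>

    // Hypothetical helper: pick a bufferSizeDB value from the track's
    // encoded AAC access units. Using the largest access unit (the
    // smallest buffer that can hold any single sample) is a common
    // heuristic, not a normative rule -- some muxers round up or
    // write a fixed value here.
    uint32_t ComputeBufferSizeDB(const std::vector<std::vector<uint8_t>>& accessUnits)
    {
        uint32_t maxSize = 0;
        for (const auto& au : accessUnits)
            maxSize = std::max<uint32_t>(maxSize, static_cast<uint32_t>(au.size()));
        return maxSize;  // stored as a 24-bit field in the descriptor
    }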

Selectively 'turn off' more than one output pin stream in a DirectShow filter

I'm sure this question has been asked before, but I've searched and can't find anything specific to help with a solution.
I'll start by outlining the initial concerns; if more in-depth technical information is needed, I can give it. Hopefully there is enough information for the initial question(s).
I'm writing an app using C++ and DirectShow in Visual Studio 2010. The main project specification is for a live preview and, at any time of choosing, recording the video as MPEG-2 to the hard drive and then to DVD, to be played in a standard DVD player; the live preview must not be interrupted at any time.
The capturing seems a fairly standard, straightforward thing to do with DirectShow.
There are a couple of custom filters that I wrote. Nothing amazing, but we wanted our own custom on-screen overlay information (time and date, etc.), and this must appear in both the preview and the recorded file. I use the AVI Decompressor connected to the capture card's video out pin, and connect the AVI Decompressor to my filter to get an RGB image that I can manipulate. The output from this filter is then split via an InfTee filter: one branch goes to the screen, the other goes into the MS MPEG-2 encoder. The audio goes from the capture card's audio out pin into the same MPEG-2 encoder. The output from the MPEG-2 encoder then goes to a file. That file then gets authored for DVD and burnt to DVD.
So my questions are...
Where and how is the best place to allow starting and stopping of only the MPEG-2 file output, via user action?
I have tried using Smart Tee filters (one for video and one for audio) as the last filters BEFORE the MPEG-2 encoder, then using the IAMStreamControl interface to turn off the pins at the appropriate time. Should this cause any timing issues in the final MPEG-2? The output file plays in MPlayer, VLC, etc., but doesn't convert to DVD-compliant MPEG-2 (for testing, via any DVD authoring software; they complain of a broken file and sometimes give time references). Is it possible that the timestamps in the file are a problem and cause the error? If the file is captured from the first moment capture commences (as opposed to, say, after 5 minutes of streaming), then everything is OK.
I did think of going the Stream Buffer Engine route - http://msdn.microsoft.com/en-gb/library/windows/desktop/dd693041(v=vs.85).aspx - but I'm not sure of the best direction to take things; there seem to be a few possible choices.
Any help and tips would be greatly appreciated. Especially websites/books/information on DirectShow filters, pins, graphs, and how they all flow together.
EDIT: I was thinking of making my own copy of the Smart Tee filter, with two input pins (audio and video) and four output pins: two for video (one preview, one capture) and the same two for audio. But would I end up with the same issue? And what is the correct way to handle 'switching off' the capture pins of such a custom filter? Would I be wasting my time working on something like this? Is it a simple case of overriding the Active/Inactive methods of the output pin(s) and either sending or not sending the sample? I feel it's not that easy.
Many thanks!
Where and how is the best place to allow starting and stopping of only the MPEG-2 file output, via user action?
For this kind of action I would recommend GMFBridge. Creating your own filter is not easy. GMFBridge allows you to use two separate graphs with a dynamic connection: use the first graph for the preview and the second graph for the file output, and connect the graphs only after a user action.
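A rough sketch of the GMFBridge flow (error handling omitted; the interface and method names are from memory of the GMFBridge sample app, so verify them against the headers of the version you build against):

    #include <atlbase.h>
    #include <dshow.h>
    #include "GMFBridge.h"  // interfaces from the GMFBridge project

    void SetupBridge(IGraphBuilder* pPreviewGraph, IGraphBuilder* pFileGraph)
    {
        CComPtr<IGMFBridgeController> pController;
        pController.CoCreateInstance(__uuidof(GMFBridgeController));

        // Describe the streams that cross the bridge: one video, one audio.
        pController->AddStream(TRUE, eMuxInputs, FALSE);    // video
        pController->AddStream(FALSE, eMuxInputs, FALSE);   // audio

        // The sink end lives in the always-running preview graph...
        CComPtr<IUnknown> pSinkFilter;
        pController->InsertSinkFilter(pPreviewGraph, &pSinkFilter);

        // ...the source end lives in the graph with the MPEG-2 encoder,
        // multiplexer and file writer.
        CComPtr<IUnknown> pSourceFilter;
        pController->InsertSourceFilter(pSinkFilter, pFileGraph, &pSourceFilter);

        // On the user's "record" action, connect the two halves; the
        // preview graph keeps running regardless.
        pController->BridgeGraphs(pSinkFilter, pSourceFilter);

        // On "stop", disconnect without touching the preview:
        // pController->BridgeGraphs(NULL, NULL);
    }

The point of this design is that starting or stopping the file graph never pauses or flushes the preview graph, which is exactly the "preview must not be interrupted" requirement, and the file graph gets clean timestamps from its first sample.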

How to cut MP3 without re-encoding? Should I just copy the frames?

I need to implement a feature that transmits parts of a large MP3 file over TCP/IP in a way that allows the user to listen to each part without having the entire file (using libmpg123). I would like the transmitted parts to be as small as possible, without re-encoding the stream. I want to avoid re-encoding because I don't want the sound quality to degrade with each transmission. Each time I cut the MP3, I have the splitting coordinates in samples ("from what sample to what sample"), so each time I need to translate these into the IDs of MP3 frames. So my questions are:
Does each MP3 frame have enough information (bitrate/sample rate/bits per sample/channels) to play it without the entire MP3 file's header, just by feeding the frames to an MP3 decoder?
Is there any small BSD/MIT-licensed library that can work as an MP3 splitter using sample coordinates and that supports VBR?
You can just cut the binary file!
The only problem with this solution is the tags.
Or try this: http://www.codeproject.com/Articles/8295/MPEG-Audio-Frame-Header
Each MP3 frame is essentially stand-alone: it carries its own header with the bitrate, sample rate and channel mode, and a decoder can sync to it without any file-global header, so you don't have to worry about that. One caveat: because of the bit reservoir, a frame may borrow spare bytes from the preceding frames, so the first frame or two after a cut can decode with a small glitch.
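As a sketch of the frame-walking approach (MPEG-1 Layer III only; the tables come from the MPEG-1 audio header layout, and the sample-to-frame math relies on Layer III's fixed 1152 samples per frame):

    #include <cstdint>
    #include <cstdio>

    // MPEG-1 Layer III bitrate (kbps) and sample-rate tables, indexed by
    // the 4-bit bitrate index and 2-bit sample-rate index in the header.
    static const int kBitrateKbps[16] = {0, 32, 40, 48, 56, 64, 80, 96,
                                         112, 128, 160, 192, 224, 256, 320, 0};
    static const int kSampleRate[4] = {44100, 48000, 32000, 0};

    // Parse the 4-byte header at p; return the frame length in bytes,
    // or 0 if p does not point at a valid MPEG-1 Layer III header.
    static int FrameLength(const uint8_t* p)
    {
        if (p[0] != 0xFF || (p[1] & 0xFE) != 0xFA)  // sync + MPEG-1 + Layer III
            return 0;
        int bitrate = kBitrateKbps[p[2] >> 4] * 1000;
        int rate = kSampleRate[(p[2] >> 2) & 0x3];
        int padding = (p[2] >> 1) & 0x1;
        if (bitrate == 0 || rate == 0)
            return 0;  // free-format or invalid index: treat as not-a-frame
        return 144 * bitrate / rate + padding;  // MPEG-1 Layer III frame size
    }

    // Walk the frames (VBR-safe, since every header is read individually)
    // and report the byte range covering samples [fromSample, toSample).
    // Each MPEG-1 Layer III frame decodes to exactly 1152 samples.
    static void FindCutRange(const uint8_t* data, size_t size,
                             int64_t fromSample, int64_t toSample,
                             size_t* cutStart, size_t* cutEnd)
    {
        size_t pos = 0;
        int64_t sample = 0;
        *cutStart = *cutEnd = 0;
        while (pos + 4 <= size && sample < toSample) {
            int len = FrameLength(data + pos);
            if (len == 0) { ++pos; continue; }  // resync over tags/garbage
            if (sample + 1152 > fromSample) {   // frame overlaps the range
                if (*cutEnd == 0)
                    *cutStart = pos;            // first overlapping frame
                *cutEnd = pos + len;            // extend through this frame
            }
            sample += 1152;
            pos += len;
        }
    }

This deliberately ignores an ID3v2 block at the start of the file and the Xing/Info VBR header (which is itself a frame), and, per the bit-reservoir caveat above, the first cut frame may not decode perfectly; a decoder like libmpg123 will resync and carry on.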

MJPEG Video from IP Camera too fast

I'm just trying to read a video stream from an IP camera (Basler BIP-1280c).
The stream I want is saved in a buffer on the camera, has a length of 40 seconds, and is encoded as MJPEG.
Now, if I access the stream via my web browser, it shows me the 40 seconds without any problems.
But I actually need an application that is capable of downloading and saving the stream by itself.
The camera is accessed via HTTP, so I am using libcurl. This works fine and I can also download the stream without any trouble. I have chosen to save the stream data into an *.avi file (hoping that's correct…?).
But now to the problem: I can open the video (tried Totem Video Player and VLC) and view everything that has been recorded, BUT it plays way too fast. The whole video lasts about 5 seconds instead of 40. Is there anything in an MJPEG header where information like the total video length or the fps can be stored? There must be some information missing for the video players, so that they play it way too fast.
Update:
As suggested in the answers, I opened the file with a hex editor, and what I found was this:
--myboundary..Content-Type: image/jpeg..Content-Length: 39050.........*Exif..II*...............V...........................2...................0210................FrameNr=000398732
6.AOI=(0800x0720)#(0240,0060)/(1280x0720).Motion=00000 (no)
[00000 | 00000 | 00000 | 00000 | 00000].Alarm=0000 (no) .IO
=000.RtTrigger=0...Basler..BIP2-1280c..1970:01:05 23:08:10.8
98286......JFIF.................................. ....&"((
This header reoccurs throughout the file (each time followed by a lot of bytes of binary data). This is actually fine, since I read in the camera manual that every MJPEG picture gets this header.
More interesting is the JFIF in the last line. As suggested in the answers, this may be the indicator of the file format. But AFAIK, JFIF is a single-picture format, just like JPG. So does this mean that the whole video file is just a "brainless" chain of pictures? And my player just assumes that it should show these pictures one after another, without any knowledge of the frame rate?
There is not a single format to use with MJPEG. From Wikipedia:
[...] there is no document that defines a single exact format that is
universally recognized as a complete specification of “Motion JPEG”
for use in all contexts.
The formats differ by vendor. My advice would be to closely inspect the file you download. Check whether it is really an AVI container. (Some cameras send the frames wrapped in a MIME container.)
Once the container format is clear, you can check that container's documentation and compare against a file that has that format and the desired fps. Then you can start adjusting your downloaded file to get the desired effect.
You might also find this project useful: http://mjpeg.sourceforge.net/
Edit:
According to your sample data, your camera sends the frames packed into a MIME container. (The first line is the boundary, then the headers until you encounter an empty line, then the file data itself, followed by the boundary again, and so on.)
These are JPEG files, as the header suggests: image/jpeg. JFIF is the standard file format used to store JPEG data.
I recommend you to:
Extract the contents of the file into multiple JPEG files (with munpack, for instance), then
use ffmpeg or mplayer to create a movie file out of the series of JPEGs.
This way you can specify the desired frame rate too.
Things get more complicated if the camera dynamically changes the AOI (area of interest), meaning it can send only a smaller part of the image where change occurred. But you should first check whether the simple approach works.
On Un*x systems (Linux, OS X, ...), you can use the 'file' command-line tool to make a (usually good) guess about the file format.
--myboundary is an indication that the stream is regular M-JPEG streamed as multipart content over HTTP. There is no well-known file format which can hold this stream as-is and be playable (that is, if you rename it to .avi it is not supposed to play back).
The format itself is a sequence of (boundary, subheader, JPEG image), (boundary, subheader, JPEG image), etc. The stream carries no timestamps, so playback speed depends entirely on the player.
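For the extraction step, here is a minimal sketch. It assumes the boundary is literally --myboundary, that each part carries a "Content-Length:" header, and that headers end with a CRLF CRLF pair, all of which match the hex dump above but should be verified against your camera's output:

    #include <cstdio>
    #include <fstream>
    #include <iostream>
    #include <string>

    // Split a MIME/multipart MJPEG dump into numbered .jpg files.
    // Robust code should also handle parts without Content-Length by
    // scanning forward for the next boundary instead.
    int main(int argc, char** argv) {
        std::ifstream in(argv[1], std::ios::binary);
        std::string data((std::istreambuf_iterator<char>(in)),
                         std::istreambuf_iterator<char>());

        const std::string lengthKey = "Content-Length: ";
        const std::string headerEnd = "\r\n\r\n";
        size_t pos = 0;
        int frame = 0;
        while ((pos = data.find(lengthKey, pos)) != std::string::npos) {
            size_t numStart = pos + lengthKey.size();
            size_t jpegLen = std::stoul(data.substr(numStart, 16));
            size_t body = data.find(headerEnd, numStart);
            if (body == std::string::npos) break;
            body += headerEnd.size();           // first byte of JPEG data

            char name[32];
            std::snprintf(name, sizeof(name), "frame%05d.jpg", frame++);
            std::ofstream out(name, std::ios::binary);
            out.write(data.data() + body, jpegLen);
            pos = body + jpegLen;               // continue after this part
        }
        std::cout << "wrote " << frame << " frames\n";
        return 0;
    }

A series-of-images encoder can then rebuild a correctly timed movie; with ffmpeg, something along the lines of "ffmpeg -framerate 25 -i frame%05d.jpg out.mp4", where 25 stands in for whatever the camera's real capture rate was.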

How to hack ffmpeg to consider I-Frames as key frames?

I'm trying to get ffmpeg to seek in H.264 interlaced videos, and I found that I can seek to any frame if I just force it.
I already hacked the decoder to consider I-frames as keyframes, and it works nicely with the videos I need it to work with. And there will NEVER be any videos encoded with different encoders.
However, I'd like the seek to find an I-frame, not just any frame.
What I'd need to do is hack the AVIndexEntry creation so that it marks any frame that is an I-frame as a keyframe.
Or, alternatively, hack the search routine so that it returns I-frames.
The code gets a tad difficult to follow at this point.
Can someone please point me at the correct place in the ffmpeg code that handles this?
This isn't possible as far as I can tell.
But if you do know where the I-frames are, either by decoding the entire video or by just knowing, you can insert entries into the AVIndexEntry information stored in the stream.
AVIndexEntry has a flag that tells whether the entry is a keyframe; just set it to true for your I-frames.
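To make that concrete, a small sketch against libavformat (av_add_index_entry() and AVINDEX_KEYFRAME are real API; where the I-frame positions come from is up to you, as said above, and in older builds you could also flip the flag directly on the public st->index_entries array):

    extern "C" {
    #include <libavformat/avformat.h>
    }

    // Mark a frame known to be an I-frame as a keyframe in the seek index.
    // av_add_index_entry() inserts or updates the AVIndexEntry for this
    // position; AVINDEX_KEYFRAME is the flag the seek code checks when it
    // looks for a safe point to land on.
    void MarkIFrameAsKeyframe(AVStream* st, int64_t bytePos, int64_t pts,
                              int packetSize)
    {
        av_add_index_entry(st, bytePos, pts, packetSize,
                           0 /* distance */, AVINDEX_KEYFRAME);
    }

After the index is patched, av_seek_frame() with default flags should land on your flagged I-frames instead of skipping to the next true keyframe.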
Luckily, I happen to know where they are in my videos :)
-mika