Extract subtitles from DVD - gstreamer

So I have a DVD, part of a boxset, and I would like to extract the subtitles.
Information about the disk:
$ gst-discoverer-1.0 /the/mounted/disk/VIDEO_TS/VIDEO_TS.VOB
Analyzing file:///the/mounted/disk/VIDEO_TS/VIDEO_TS.VOB
Done discovering file:///the/mounted/disk/VIDEO_TS/VIDEO_TS.VOB
Topology:
container: MPEG-2 System Stream
audio: DVD AC-3 (ATSC A/52)
audio: AC-3 (ATSC A/52)
video: MPEG-2 Video (Main Profile)
Properties:
Duration: 0:06:59.480000000
Seekable: yes
Live: no
Tags:
audio codec: DVD AC-3 (ATSC A/52)
bitrate: 384000
video codec: MPEG-2 Video
$ gst-discoverer-1.0 /the/mounted/disk/VIDEO_TS/VTS_01_0.VOB
Analyzing file:///the/mounted/disk/VIDEO_TS/VTS_01_0.VOB
Done discovering file:///the/mounted/disk/VIDEO_TS/VTS_01_0.VOB
Topology:
container: MPEG-2 System Stream
subtitles: DVD subpicture
subtitles: DVD subpicture
video: MPEG-2 Video (Main Profile)
Properties:
Duration: 0:00:00.049444444
Seekable: yes
Live: no
Tags:
video codec: DVD subpicture
Ideally I'd end up with something like a VTT file at the end. I do not want the audio or video.
I've played around with gst-launch and have watched the disk with playbin as well as doing various filesrc experiments, but looking at docs and old mailing list posts hasn't got me very far.
I see webvttenc exists, but I'm really not sure how to get to the point where I can use it (how do I get from subpicture/x-dvd to text/x-raw?).
Really, I've got no idea what I'm doing.

Related

Change the default audio and video codec loaded by avformat_alloc_output_context2

I'm using the ffmpeg library for live streaming via RTMP. I want to know how to specify my choice of audio and video codecs for a particular format in avformat_alloc_output_context2.
In detail:
The following command works perfectly for me.
ffmpeg -re -stream_loop -1 -i ~/Downloads/Microsoft_Surface.mp4 -vcodec copy -c:a aac -b:a 160k -ar 44100 -strict -2 -f flv -flvflags no_duration_filesize rtmp://192.168.1.7/live/surface
In the output, I have set my audio codec to be aac and copied the video codec from input, which is H264.
I want to emulate this in the library, but don't know how to.
avformat_alloc_output_context2(&_ctx, NULL, "flv", NULL);
The code above sets the oformat audio codec to ADPCM_SWF and the video codec to FLV1. How do I change that to AAC and H264?
So far, I have used av_guess_format to construct an AVOutputFormat. It accepts only the format as input, and I don't know where to specify the audio and video codecs.
AVOutputFormat* output_format = av_guess_format("flv", NULL, NULL);
I also tried giving a filename to avformat_alloc_output_context2 with the rest of the parameters NULL.
AVOutputFormat* output_format = av_guess_format(NULL, "flv_acc_sample.flv", NULL);
This file has AAC audio and H264 video, but ffmpeg still loads oformat with the ADPCM_SWF audio and FLV1 video codecs.
I searched Stack Overflow for similar questions, but could not find the solution I was looking for.
Any hint/guidance is hugely appreciated. Thank you.

Google Cloud Speech to text returning empty result or error

I have been working for 4 days now to get the Google Cloud Speech-to-Text API to work, but still see no light at the end of the tunnel. I have searched the net and read the documentation extensively, with no result.
Our site is bbsradio.com; we are trying to automatically extract transcripts from our MP3 files using the Google Speech-to-Text API. The code is written in PHP and is an almost exact copy of this: https://github.com/GoogleCloudPlatform/php-docs-samples/blob/master/speech/src/transcribe_async.php
I can see that the process completes and gets past "$operation->pollUntilComplete();", but it does not report success at "if ($operation->operationSucceeded()) {", and $operation->getError() does not return any error either.
I am converting the MP3 to a raw file like this: ffmpeg -y -loglevel panic -i /public_html/sites/default/files/show-archives/audio-clips-9-23-2020/911freefall2020-05-24.mp3 -f s16le -acodec pcm_s16le -vn -ac 1 -ar 16000 -map_metadata -1 /home/mp3_to_raw/911freefall2020-05-24.raw
I tried the FLAC format as well, which did not work either. I tested the converted FLAC file in Windows Media Player and can hear the conversation clearly. I checked the files: the sample rate is 16000 Hz, there is 1 channel, and the samples are 16-bit. I can see that the file is uploaded to Cloud Storage. I have checked these:
https://cloud.google.com/speech-to-text/docs/troubleshooting and
https://cloud.google.com/speech-to-text/docs/best-practices
There is a lot of discussion and documentation, but nothing seems helpful at this moment. If someone can help me find the issue, it would be really great!
TL;DR: convert from MP3 to a 1-channel FLAC file with the same sample rate as your MP3 file.
Long explanation:
Since you're using MP3 files as your input, MP3 compression artifacts are probably hurting you when you resample to 16 kHz (you cannot hear this, but the algorithm will).
To confirm this theory:
Execute ffprobe -hide_banner filename.mp3; it will output something like this:
Metadata:
...
Duration: 00:02:12.21, start: 0.025057, bitrate: 320 kb/s
Stream #0:0: Audio: mp3, 44100 Hz, stereo, s16p, 320 kb/s
Metadata:
encoder : LAME3.99r
In this case, the sample rate is OK for the Google Speech API. Just transcode the file without changing the sample rate (remove the -ar 16000 from your ffmpeg command).
You might get into trouble if the original MP3 bitrate is low. 320 kb/s seems safe (unless the recording has a lot of noise).
Take into account that voice recorded under 64 kb/s (ISDN line quality) can be understood only by humans if there is some noise.
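A minimal sketch of that conversion, assuming Python is used to drive ffmpeg. It builds the command from the question with -ar 16000 removed and writes FLAC instead of raw PCM. The function names and file paths are hypothetical; the flags are the ones already used in the question, plus -c:a flac to select the FLAC encoder.

```python
import subprocess


def build_flac_command(mp3_path, flac_path):
    """Return an ffmpeg argv that transcodes an MP3 to mono FLAC.

    No -ar option is passed, so ffmpeg keeps the source sample rate.
    """
    return [
        "ffmpeg", "-y", "-loglevel", "error",
        "-i", mp3_path,
        "-vn",                  # ignore any embedded cover art
        "-ac", "1",             # downmix to a single channel
        "-map_metadata", "-1",  # strip tags, as in the question's command
        "-c:a", "flac",
        flac_path,
    ]


def convert(mp3_path, flac_path):
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_flac_command(mp3_path, flac_path), check=True)


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(" ".join(build_flac_command("show.mp3", "show.flac")))
```

Because the sample rate is no longer forced to 16000, any sample rate stated in the recognition request has to match the rate ffprobe reports for the file.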
At last I found the solution and the reason for the issue. Getting empty results is actually a bug in the PHP API code. What you need to do:
Replace this:
$operation->pollUntilComplete();
by this:
// Keep polling until the operation actually reports that it is done.
while (!$operation->isDone()) {
    $operation->pollUntilComplete();
}

Missing element: MPEG4-GENERIC audio RTP depayloader Gstreamer

When I try to record an RTSP stream with audio and video using GStreamer, I get the above error. Recording only video works, but when the audio pipeline is added the file size becomes zero and the above error is displayed. The following is also displayed:
Missing element: MPEG4-GENERIC audio RTP depayloader
WARNING: from element /GstPlayBin:playbin0/GstURIDecodeBin:uridecodebin0: No decoder available for type 'application/x-rtp, media=(string)audio, payload=(int)96, clock-rate=(int)48000, encoding-name=(string)MPEG4-GENERIC, streamtype=(string)5, profile-level-id=(string)1, mode=(string)aac-hbr, sizelength=(string)13, indexlength=(string)3, indexdeltalength=(string)3, config=(string)1188, a-tool=(string)"LIVE555\ Streaming\ Media\ v2016.01.29", a-type=(string)broadcast, x-qt-text-nam=(string)"KMStreaming\ Server", x-qt-text-inf=(string)ch01, clock-base=(uint)3130203504, seqnum-base=(uint)34845, npt-start=(guint64)0, play-speed=(double)1, play-scale=(double)1, ssrc=(uint)3216157947'.
Additional debug info:
gsturidecodebin.c(921): unknown_type_cb (): /GstPlayBin:playbin0/GstURIDecodeBin:uridecodebin0
There are two different MPEG-4 audio RTP formats in the wild: MP4A-LATM and MPEG4-GENERIC. See RFC 3016 and RFC 3640 respectively.
Looks like GStreamer only supports MP4A-LATM. So basically, yes, the format you are trying to receive is not supported.
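To tell which of the two formats a given stream uses, the encoding-name field in the caps string from the warning is enough. The following is a small sketch, assuming the caps are available as text as in the log above; the helper names are made up for illustration.

```python
import re

# RTP encoding names for MPEG-4 audio and the RFC defining each payload format.
MPEG4_AUDIO_RTP_FORMATS = {
    "MP4A-LATM": "RFC 3016",
    "MPEG4-GENERIC": "RFC 3640",
}


def rtp_encoding_name(caps):
    """Extract the encoding-name field from a GStreamer caps string, or None."""
    match = re.search(r"encoding-name=\(string\)([^,\s]+)", caps)
    return match.group(1) if match else None


def describe_mpeg4_audio(caps):
    """Return e.g. 'MPEG4-GENERIC (RFC 3640)', or None for other payloads."""
    name = rtp_encoding_name(caps)
    rfc = MPEG4_AUDIO_RTP_FORMATS.get(name.upper()) if name else None
    if rfc is None:
        return None
    return f"{name} ({rfc})"
```

Passing the caps from the warning in the question to describe_mpeg4_audio identifies the stream as MPEG4-GENERIC, the RFC 3640 format.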

Remux mp4 file containing data stream

I’m developing an app that needs to clone an MP4 video file with all the streams using FFmpeg C++ API and have successfully made it work based on the FFmpeg remuxing example.
This works great for video and audio streams, but when the video includes a data stream (actually a QuickTime Time Code according to MediaInfo) I get this error.
Output #0, mp4, to 'C:\Users\user\Desktop\shortOut.mp4':
Stream #0:0: Video: hevc (Main 10) (hev1 / 0x31766568), yuv420p10le(tv,progressive), 3840x2160 [SAR 1:1 DAR 16:9], q=2-31, 1208 kb/s
Stream #0:1: Audio: mp3 (mp4a / 0x6134706D), 48000 Hz, stereo, s16p, 32s
Stream #0:2: Data: none (tmcd / 0x64636D74), 0 kb/s
[mp4 # 0000000071edf600] Could not find tag for codec none in stream #2, codec not currently supported in container
I’ve found this happens in the call to avformat_write_header().
It makes sense that if FFmpeg doesn’t know the codec it can’t write anything about it to the header, but I found out that with the ffmpeg command line I can make it work perfectly using the copy option for the streams, something like:
ffmpeg -i input.mp4 -c:v copy -c:a copy -c:d copy output.mp4
I have been analyzing ffmpeg.c implementation to try to understand how they do a stream copy, but it’s been very painful following along the huge pipeline.
What would be a proper way to remux a data stream of this type with FFmpeg C++ API? Any tip or pointers?

Decodebin: skip a stream

I want to encode my TV recordings with GStreamer on a Raspberry Pi. Inspired by this post, the following code works for a downloaded mkv:
/usr/bin/gst-launch-1.0 -e filesrc location=/media/Seagate/complete/TV/Better\ Call\ Saul/Season\ 01/Better\ Call\ Saul\ -\ S01E10\ -\ Marco.mkv ! decodebin name=demux ! queue ! audioconvert ! audio/x-raw ! audiorate ! avenc_ac3 bitrate=320000 ! mux. mpegtsmux name=mux ! filesink location=/media/Seagate/pvr/Buitenhof_compressed.mkv demux. ! queue ! videoconvert ! deinterlace ! omxh264enc target-bitrate=2000000 control-rate=1 inline-header=true periodicty-idr=250 interval-intraframes=250 ! "video/x-h264,profile=high" ! h264parse ! mux.
The used file has this structure (gst-discoverer output):
Topology:
container: Matroska
audio: AC-3 (ATSC A/52)
video: H.264
Properties:
Duration: 0:49:18.048000000
Seekable: yes
Tags:
container format: Matroska
audio codec: AC-3 audio
language code: und
video codec: H264
minimum bitrate: 7288
bitrate: 24263
maximum bitrate: 9206
My recording software (TVHeadend) outputs this format however:
Topology:
container: Matroska
subtitles: application/x-subtitle-unknown
subtitles: application/x-subtitle-unknown
audio: MPEG-1 Layer 2 (MP2)
audio: AC-3 (ATSC A/52)
audio: MPEG-1 Layer 2 (MP2)
video: H.264
Properties:
Duration: 0:00:06.440000000
Seekable: yes
Tags:
title: Buitenhof
extended comment: DATE_BROADCASTED=2015-05-24 10:05:00
container format: Matroska
audio codec: MPEG 1 Audio, Layer 2
language code: nl
nominal bitrate: 256000
has crc: true
channel mode: stereo
video codec: H264
minimum bitrate: 8972400
bitrate: 16546750
maximum bitrate: 22841600
How can I tell the pipeline to skip the subtitle streams and use the AC-3 and H264 streams? I've tried decodebin name=demux demux.audio_01 to no avail. The output is:
Setting pipeline to PAUSED ...
Pipeline is PREROLLING ...
Missing element: application/x-subtitle-unknown decoder
Missing element: application/x-subtitle-unknown decoder
It would be nice to have the subtitles included as well, but I can certainly live without them.
The order of the streams had nothing to do with the problem. There was insufficient video memory available. For a full HD mkv I needed to set the video memory split to 128 MB.
One way of doing this is running sudo raspi-config. In my Raspbian version the video memory split was under the Advanced menu. A reboot is required.
After this, the command I posted works.
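To confirm the new split after the reboot, one option is to ask the firmware how much memory the GPU has. The sketch below assumes the vcgencmd tool shipped with Raspbian is on the PATH and that vcgencmd get_mem gpu prints a line such as gpu=128M; the function names are made up for illustration.

```python
import re
import subprocess


def parse_gpu_mem(output):
    """Parse the output of `vcgencmd get_mem gpu` (e.g. 'gpu=128M') into MB."""
    match = re.search(r"gpu=(\d+)M", output)
    if match is None:
        raise ValueError(f"unexpected vcgencmd output: {output!r}")
    return int(match.group(1))


def current_gpu_mem():
    """Query the running system; this only works on a Raspberry Pi."""
    result = subprocess.run(
        ["vcgencmd", "get_mem", "gpu"],
        check=True, capture_output=True, text=True,
    )
    return parse_gpu_mem(result.stdout)


def enough_for_full_hd(gpu_mem_mb, required_mb=128):
    """128 MB is the value that worked for the full HD file in this answer."""
    return gpu_mem_mb >= required_mb
```

On the Pi, enough_for_full_hd(current_gpu_mem()) should return True once the 128 MB split is active.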