I am using the GStreamer plugin from Alumae with the following pipeline:
appsrc source='appsrc' ! wavparse ! audioconvert ! audioresample ! queue ! kaldinnet2onlinedecoder <parameters snipped> ! filesink location=/tmp/test
I always hit the following assertion, which I don't understand:
KALDI_ASSERT(current_log_post_.NumRows() == info_.frames_per_chunk /
info_.opts.frame_subsampling_factor &&
current_log_post_.NumCols() == info_.output_dim);
What is this assertion about, and how can I fix it?
FYI, the data pushed into the pipeline comes from a streamed WAV file. Replacing kaldinnet2onlinedecoder with wavenc correctly produces a WAV file instead of a text file at the end.
EDIT
Here are the parameters used:
use-threaded-decoder=0
model=/opt/en/final.mdl
word-syms=<word-file>
fst=<fst_file>
mfcc-config=<mfcc-file>
ivector-extraction-config=/opt/en/ivector-extraction/ivector_extractor.conf
max-active=10000
beam=10.0
lattice-beam=6.0
do-endpointing=1
endpoint-silence-phones=\"1:2:3:4:5:6:7:8:9:10\"
traceback-period-in-secs=0.25
num-nbest=10
For your information, using the textual representation of the pipeline in Python works, but building it in code (i.e. using Gst.ElementFactory.make and so on) always throws the exception.
SECOND UPDATE
Here is the full stack trace generated by the assert
ASSERTION_FAILED ([5.2]:AdvanceChunk():decodable-online-looped.cc:223) : 'current_log_post_.NumRows() == info_.frames_per_chunk / info_.opts.frame_subsampling_factor && current_log_post_.NumCols() == info_.output_dim'
[ Stack-Trace: ]
kaldi::MessageLogger::HandleMessage(kaldi::LogMessageEnvelope const&, char const*)
kaldi::MessageLogger::~MessageLogger()
kaldi::KaldiAssertFailure_(char const*, char const*, int, char const*)
kaldi::nnet3::DecodableNnetLoopedOnlineBase::AdvanceChunk()
kaldi::nnet3::DecodableNnetLoopedOnlineBase::EnsureFrameIsComputed(int)
kaldi::nnet3::DecodableAmNnetLoopedOnline::LogLikelihood(int, int)
kaldi::LatticeFasterOnlineDecoder::ProcessEmitting(kaldi::DecodableInterface*)
kaldi::LatticeFasterOnlineDecoder::AdvanceDecoding(kaldi::DecodableInterface*, int)
kaldi::SingleUtteranceNnet3Decoder::AdvanceDecoding()
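Read literally, the assertion requires the matrix of log-posteriors for the current chunk to have exactly frames_per_chunk / frame_subsampling_factor rows and output_dim columns. A minimal Python sketch of that condition follows. It assumes both quantities are integers (so the division truncates), and the numbers in the usage note are hypothetical rather than taken from any real model.

```python
def chunk_shape_ok(num_rows, num_cols, frames_per_chunk,
                   frame_subsampling_factor, output_dim):
    """Mirror the asserted condition, using integer division for the row count."""
    expected_rows = frames_per_chunk // frame_subsampling_factor
    return num_rows == expected_rows and num_cols == output_dim


if __name__ == "__main__":
    # Hypothetical values: a chunk of 21 frames and a network that emits 7 rows.
    print(chunk_shape_ok(7, 100, 21, 3, 100))
    print(chunk_shape_ok(7, 100, 21, 1, 100))
```

With these made-up values the check passes when the decoder is configured with a subsampling factor of 3 and fails with a factor of 1, which is one way (an assumption, not a confirmed diagnosis) that a configuration mismatch could trigger the assertion.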
I finally got it working, even with the frame-subsampling-factor parameter.
The problem lies in the order of the parameters: fst and model have to be the last ones set.
Thus the following textual pipeline works:
gst-launch-1.0 pulsesrc device=alsa_input.pci-0000_00_05.0.analog-stereo ! queue ! \
audioconvert ! \
audioresample ! tee name=t ! queue ! \
kaldinnet2onlinedecoder \
use-threaded-decoder=0 \
nnet-mode=3 \
word-syms=/opt/models/fr/words.txt \
mfcc-config=/opt/models/fr/mfcc_hires.conf \
ivector-extraction-config=/opt/models/fr/ivector-extraction/ivector_extractor.conf \
phone-syms=/opt/models/fr/phones.txt \
frame-subsampling-factor=3 \
max-active=7000 \
beam=13.0 \
lattice-beam=8.0 \
acoustic-scale=1 \
do-endpointing=1 \
endpoint-silence-phones=1:2:3:4:5:16:17:18:19:20 \
traceback-period-in-secs=0.25 \
num-nbest=2 \
chunk-length-in-secs=0.25 \
fst=/opt/models/fr/HCLG.fst \
model=/opt/models/fr/final.mdl \
! filesink async=0 location=/dev/stdout t. ! queue ! autoaudiosink async=0
I opened an issue on GitHub about this, because it can be really difficult to track down and should at least be documented.
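When the element is configured from code rather than from a textual description, the same ordering constraint applies. The sketch below is one way to enforce it: it moves fst and model to the end of the property list before they are applied. The helper names order_properties and launch_fragment are hypothetical and not part of the plugin or of GStreamer.

```python
# Properties that, per the finding above, must be applied after all others.
MUST_BE_LAST = ("fst", "model")


def order_properties(props):
    """Return (name, value) pairs with the MUST_BE_LAST names moved to the end."""
    first = [(k, v) for k, v in props.items() if k not in MUST_BE_LAST]
    last = [(k, props[k]) for k in MUST_BE_LAST if k in props]
    return first + last


def launch_fragment(element, props):
    """Render an element and its ordered properties as gst-launch text."""
    parts = [element] + [f"{k}={v}" for k, v in order_properties(props)]
    return " ".join(parts)


if __name__ == "__main__":
    props = {
        "model": "/opt/models/fr/final.mdl",
        "fst": "/opt/models/fr/HCLG.fst",
        "nnet-mode": 3,
        "word-syms": "/opt/models/fr/words.txt",
    }
    print(launch_fragment("kaldinnet2onlinedecoder", props))
```

In an application that creates the element with Gst.ElementFactory.make, the same ordered list could be walked and each pair applied with element.set_property(name, value).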
Related
While trying to implement a simple player (with gst-launch) for a CDN that requires the initial headers on every request across all streams (probably to keep bots out), I found that hlsdemux and adaptivedemux do not reuse the headers from the initial source for their subsequent requests.
Is it actually possible to have a pre-configured curlhttpsrc reused by hlsdemux and its superclasses?
This is the pipeline I am using:
gst-launch-1.0 -v \
curlhttpsrc \
name=curl user-agent=my-user-agent \
location=http://localhost:8000/playlist.m3u8 curl. \
! hlsdemux \
! fakesink sync=false
The playlist was generated with:
gst-launch-1.0 -v \
videotestsrc is-live=true \
! x264enc \
! h264parse \
! hlssink2 max-files=5 playlist-root=http://localhost:8090
Its output:
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-MEDIA-SEQUENCE:43
#EXT-X-TARGETDURATION:15
#EXTINF:15.000000953674316,
http://localhost:8090/segment00043.ts
#EXTINF:15.000000953674316,
http://localhost:8090/segment00044.ts
#EXTINF:15.000000953674316,
http://localhost:8090/segment00045.ts
#EXTINF:15.000000953674316,
http://localhost:8090/segment00046.ts
#EXTINF:15.000000953674316,
http://localhost:8090/segment00047.ts
#EXT-X-ENDLIST
To mimic the CDN, which uses different hosts, I used this snippet to serve the playlist from port 8000 and the streams from port 8090. I added a user-agent check to it so I can see when my pipeline breaks.
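from http.server import SimpleHTTPRequestHandler, test
import sys


class Handler(SimpleHTTPRequestHandler):
    def parse_request(self) -> bool:
        # Let the base class parse the request line and headers first.
        rv = super().parse_request()
        # Reject any successfully parsed request that lacks the expected User-Agent.
        if rv and self.headers['User-Agent'] != "my-user-agent":
            self.send_error(404, "Wrong user-agent")
            return False
        return rv


# The port to listen on is taken from the first command-line argument.
test(Handler, port=int(sys.argv[1]))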
My pipeline description works for video only:
"rtspsrc protocols=tcp location=" + urlStream_ + " latency=300 ! decodebin3 ! autovideosink ! autoaudiosink";
But I would like to receive both video and audio. With the following, I only get the first frame and no audio:
"rtspsrc protocols=tcp location=" + urlStream_ + " latency=300 ! decodebin3 ! autovideosink ! autoaudiosink";
You need to connect the autoaudiosink to the decodebin3. Currently you are linking the audio sink to the video sink, which cannot work.
It is also advisable to put a queue after each demuxer pad. Try:
"rtspsrc protocols=tcp location=" + urlStream_ + " latency=300 ! decodebin3 name=decodebin ! queue ! autovideosink decodebin. ! queue ! autoaudiosink";
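The key points are that the decoder gets a name and each output branch starts with its own queue. A small sketch of assembling that description in Python is shown below. The function name rtsp_av_description and its latency_ms parameter are hypothetical; the resulting string is meant to be passed to a parser such as Gst.parse_launch.

```python
def rtsp_av_description(url, latency_ms=300):
    """Build a gst-launch style description with one branch per decodebin3 output."""
    source = f"rtspsrc protocols=tcp location={url} latency={latency_ms}"
    # Name the decoder so the second branch can refer back to it as "decodebin."
    video = "decodebin3 name=decodebin ! queue ! autovideosink"
    audio = "decodebin. ! queue ! autoaudiosink"
    return f"{source} ! {video} {audio}"


if __name__ == "__main__":
    # The URL is a placeholder for illustration only.
    print(rtsp_av_description("rtsp://example.invalid/stream"))
```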
I wish the matroskamux documentation included an example showing how to mux subtitles. After a couple of days of trying, I doubt it is doable. Maybe it is a bug that matroskamux cannot mux subtitles unless the text stream is in subtitle/x-kate format. Below is the pipeline description that failed. Can someone please tell me where it went wrong, or confirm that it is indeed a bug? Thanks.
gst-launch-1.0 \
videotestsrc num-buffers=300 \
! videoconvert \
! theoraenc \
! MUXER.video_%u \
filesrc location=src.srt \
! subparse \
! text/x-raw,format=utf8 \
! MUXER.subtitle_0 \
matroskamux name=MUXER \
! filesink location=dst.mkv
Below is a .srt file that can be used to try the above gst-launch-1.0 command.
1
00:00:01,000 --> 00:00:02,000
one

2
00:00:02,000 --> 00:00:03,000
two

3
00:00:03,000 --> 00:00:04,000
three

4
00:00:04,000 --> 00:00:05,000
four

5
00:00:05,000 --> 00:00:06,000
five

6
00:00:06,000 --> 00:00:07,000
six

7
00:00:07,000 --> 00:00:08,000
seven

8
00:00:08,000 --> 00:00:09,000
eight

9
00:00:09,000 --> 00:00:10,000
nine

10
00:00:10,000 --> 00:00:11,000
ten
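To reproduce the test input without copying it by hand, the file can also be generated. The sketch below emits one one-second cue per word, starting at second 1, with the blank line that SRT expects between cues. The helper names srt_timestamp and make_srt are hypothetical.

```python
def srt_timestamp(seconds):
    """Format whole seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},000"


def make_srt(words, start=1):
    """Return SRT text with a one-second cue per word, cues separated by blank lines."""
    cues = []
    for index, word in enumerate(words, start=1):
        begin = start + index - 1
        cues.append(
            f"{index}\n{srt_timestamp(begin)} --> {srt_timestamp(begin + 1)}\n{word}\n"
        )
    return "\n".join(cues)


if __name__ == "__main__":
    words = ["one", "two", "three", "four", "five",
             "six", "seven", "eight", "nine", "ten"]
    print(make_srt(words), end="")
```

Writing the returned text to src.srt gives the file that the gst-launch-1.0 command above reads.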
I'd like to use the pipeline below to play content both with and without sound. The problem is that with content that has no sound, the pipeline prerolls but never starts playing.
gst-launch-1.0.exe uridecodebin uri=file:///home/mymediafile.ogv name=d1 ! tee name=t1 ! queue max-size-buffers=2 ! jpegenc ! appsink name=myappsink t1. ! queue ! autovideosink d1. ! queue ! audioconvert ! audioresample ! autoaudiosink
How can I solve this issue?
I found no way to get your pipeline going on the command line. If I include the audio portion of the pipeline, files with no audio hang.
In your application, however, you can connect to the pad-added signal and only add the audio portion of the pipeline when it is needed. Some pseudo code:
#include <gst/gst.h>

/* Called for every pad uridecodebin exposes; adds the audio branch only for audio pads. */
static void decodebin_pad_added(GstElement *decodebin, GstPad *new_pad, gpointer user_data) {
    GstElement *pipeline = GST_ELEMENT(user_data);
    GstCaps *audio_caps = gst_caps_from_string("audio/x-raw");
    GstCaps *pad_caps = gst_pad_get_current_caps(new_pad);
    if (pad_caps == NULL)
        pad_caps = gst_pad_query_caps(new_pad, NULL); /* caps may not be negotiated yet */
    gboolean is_audio = gst_caps_can_intersect(pad_caps, audio_caps);
    gst_caps_unref(pad_caps);
    gst_caps_unref(audio_caps);
    if (!is_audio)
        return;
    /* Build the audio branch as a bin with a ghost sink pad so it links like one element. */
    GstElement *audio_bin = gst_parse_bin_from_description(
        "queue ! audioconvert ! audioresample ! autoaudiosink", TRUE, NULL);
    gst_bin_add(GST_BIN(pipeline), audio_bin);
    gst_element_link(decodebin, audio_bin);
    /* The new branch starts in NULL state; bring it up to the pipeline's state. */
    gst_element_sync_state_with_parent(audio_bin);
}

/* Called once uridecodebin has exposed all of its pads; playback can start now. */
static void decodebin_no_more_pads(GstElement *decodebin, gpointer user_data) {
    GstElement *pipeline = GST_ELEMENT(user_data);
    gst_element_set_state(pipeline, GST_STATE_PLAYING);
}

/* In the application's setup code (after gst_init): */
GstElement *pipeline = gst_parse_launch("uridecodebin uri=file:///home/mymediafile.ogv name=d1 ! tee name=t1 ! queue max-size-buffers=2 ! jpegenc ! appsink name=myappsink t1. ! queue ! autovideosink", NULL);
GstElement *decodebin = gst_bin_get_by_name(GST_BIN(pipeline), "d1");
g_signal_connect(decodebin, "pad-added", G_CALLBACK(decodebin_pad_added), pipeline);
g_signal_connect(decodebin, "no-more-pads", G_CALLBACK(decodebin_no_more_pads), pipeline);
gst_object_unref(decodebin);
/* Pause so the demuxer and decoders get set up and find out what is in the file. */
gst_element_set_state(pipeline, GST_STATE_PAUSED);
Add async-handling=true to the autoaudiosink.
gst-launch-1.0.exe uridecodebin uri=file:///home/mymediafile.ogv name=d1 ! tee name=t1 ! queue max-size-buffers=2 ! jpegenc ! appsink name=myappsink t1. ! queue ! autovideosink d1. ! queue ! audioconvert ! audioresample ! autoaudiosink async-handling=true
I wish to build a single gstreamer pipeline that does both rtp audio send and receive.
Based on the examples (few as they are) that I've found, here is my almost working code.
(The program is written in Rexx, but I think it's pretty obvious what is happening. Here, it looks a lot like bash! The line-continuation character is the comma. The "", bits just insert blank lines for readability.)
rtp_recv_port = 8554
rtp_send_port = 8555
pipeline = "gst-launch -e",
"",
"gstrtpbin",
" name=rtpbin",
"",
"udpsrc port="rtp_recv_port, -- do-timestamp=true
' ! "application/x-rtp,media=audio,payload=8,clock-rate=8000,encoding-name=PCMA,channels=1" ',
" ! rtpbin.recv_rtp_sink_0",
"",
"rtpbin. ",
" ! rtppcmadepay",
" ! decodebin ",
' ! "audio/x-raw-int, width=16, depth=16, rate=8000, channels=1" ',
" ! volume volume=5.0 ",
" ! autoaudiosink sync=false",
"",
"autoaudiosrc ",
" ! audioconvert ",
' ! "audio/x-raw-int,width=16,depth=16,rate=8000,channels=1" ',
" ! alawenc ",
" ! rtppcmapay perfect-rtptime=true mtu=2000",
" ! rtpbin.send_rtp_sink_1",
"",
"rtpbin.send_rtp_src_1 ",
" ! audioconvert",
" ! audioresample",
" ! udpsink port="rtp_send_port "host="ipaddr
pipeline "> pipe.out"
If I comment out the lines after
" ! autoaudiosink sync=false",
the receive-only portion works just fine. However, if I leave those lines in place, I get this error:
ERROR: from element /GstPipeline:pipeline0/GstUDPSrc:udpsrc0: Internal data flow error.
Additional debug info:
gstbasesrc.c(2582): gst_base_src_loop (): /GstPipeline:pipeline0/GstUDPSrc:udpsrc0:
streaming task paused, reason not-linked (-1)
So what has suddenly become unlinked? I'd understand if the error were in the autoaudiosrc portion, but why is it suddenly showing up in the udpsrc section?
Any suggestions, anyone?
(FWIW) After I get this part working, I will go back in and add the RTCP parts of the pipeline.
Here is a pipeline that sends and receives audio (full duplex). I set the sources manually so that it is expandable (you can add video to it as well, and I have a sample pipeline for you if you want to do both). I set the jitter buffer mode to BUFFER because mine runs on a network with a lot of jitter. Within this sample pipeline, you can make all your own changes (volume, audio source, encoding and decoding, etc.).
sudo gst-launch gstrtpbin \
name=rtpbin audiotestsrc ! queue ! audioconvert ! alawenc ! \
rtppcmapay pt=8 ! rtpbin.send_rtp_sink_0 rtpbin.send_rtp_src_0 ! \
multiudpsink clients="127.0.0.1:5002" sync=false async=false \
udpsrc port=5004 caps="application/x-rtp, media=audio, payload=8, clock-rate=8000, \
encoding-name=PCMA" ! queue ! rtpbin.recv_rtp_sink_0 \
rtpbin. buffer-mode=RTP_JITTER_BUFFER_MODE_BUFFER ! rtppcmadepay ! alawdec ! alsasink
I have had issues with the control (RTCP) packets. I have found that a loopback test is not sufficient if you are using RTCP; you will have to test on two computers talking to each other.
Let me know if this works for you. I have tested it on 4 different machines and it worked on all of them.