I'm using rtp_forward from the videoroom plugin in Janus-Gateway to stream WebRTC.
My target pipeline looks like this:
WebRTC --> Janus-Gateway --> (RTP_Forward) MediaLive RTP_Push Input
I've achieved this:
WebRTC --> Janus-Gateway --> (RTP-Forward) Janus-Gateway [Streaming Plugin]
I've tried multiple rtp_forward requests, like:
register = {"request": "rtp_forward", "publisher_id": 8097546391494614, "room": 1234, "video_port": 5000, "video_ptype": 100, "host": "medialive_rtp_input", "secret": "adminpwd"}
But MediaLive just doesn't receive any stream. Is there anything I'm missing?
I'm not familiar with AWS MediaLive: initially I thought that, since most media servers like this expect RTMP and not RTP, that was the cause of the issue, but it looks like it does indeed support a plain RTP input mode. At this point this is very likely a codec issue: MediaLive probably doesn't support the codecs your browser is sending (Opus and VP8?). Looking at the supported codecs, this seems to be the issue: https://docs.aws.amazon.com/medialive/latest/ug/inputs-supported-containers-and-codecs.html
You can probably get video working if you use H.264 in the browser, but audio is always Opus and definitely not AAC, so you'll need an intermediate node to do transcoding.
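For the video side, one way to do that is to make the publisher's browser negotiate H.264 instead of VP8 before the offer ever reaches Janus (you can also enforce it server-side by creating the videoroom with videocodec set to h264). A minimal, untested sketch using the standard setCodecPreferences API, where pc is assumed to be your publisher's RTCPeerConnection:

// Prefer H.264 on the publisher's video transceiver before calling createOffer().
// `pc` is assumed to be the RTCPeerConnection used to publish into the videoroom.
const videoTransceiver = pc.getTransceivers()
  .find(t => t.sender && t.sender.track && t.sender.track.kind === 'video');
if (videoTransceiver && 'setCodecPreferences' in videoTransceiver) {
  const codecs = RTCRtpSender.getCapabilities('video').codecs;
  const h264First = codecs
    .filter(c => c.mimeType.toLowerCase() === 'video/h264')
    .concat(codecs.filter(c => c.mimeType.toLowerCase() !== 'video/h264'));
  videoTransceiver.setCodecPreferences(h264First);
}

That only fixes video, though: the audio leaving Janus will still be Opus, so you'll need the intermediate transcoding step mentioned above if MediaLive expects AAC.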
Since you're using RTP push, are you pushing the stream to the correct RTP endpoint provided by AWS? If so, you can check the alerts on the health-check page: if MediaLive received the stream but failed to read it, or it was corrupted, you'll see an error on whichever pipeline you're pushing to. If you don't see anything at all, it's probably a network problem; try RTMP, since it runs over TCP and you should at least see something in a packet capture.
https://docs.aws.amazon.com/medialive/latest/ug/monitoring-console.html
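If you want to double-check programmatically which endpoint MediaLive expects, here is a quick sketch with the AWS SDK for JavaScript v3 (region and input ID are placeholders for your own values):

// List the push destinations of a MediaLive RTP input; each pipeline has its own
// URL, e.g. rtp://x.x.x.x:5000, and that is where Janus must rtp_forward to.
import { MediaLiveClient, DescribeInputCommand } from '@aws-sdk/client-medialive';

const medialive = new MediaLiveClient({ region: 'us-east-1' });
const input = await medialive.send(new DescribeInputCommand({ InputId: '1234567' }));
console.log(input.Destinations?.map(d => d.Url));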
We are trying to receive customer calls through Amazon Connect and leave messages in Amazon Kinesis.
When we call Amazon Connect from our cell phones, the voice plays the expected message and the Beep tone sounds as expected. But then the call ends and we cannot leave a message. We tried removing Wait and Stop media streaming but the problem persisted. What are we doing wrong?
Set Voice: OK
Play prompt(Message): OK
Play prompt(Beep): OK
Start media streaming: NG
If you have a simple, easy to understand sample for this application, let me know!
Looks like the problem is your Wait block. Wait isn't supported for voice calls, so it errors immediately.
Replace the Wait block with a Get Customer Input block. Use Text-to-speech for the prompt, set the prompt value manually to <speak></speak>, and set "Interpret as" to SSML. Set it to detect DTMF and set the timeout to however long the message is allowed to be; from your flow above that is 10 seconds.
This should get the customers voice sent to the Kinesis stream and you can process the stream from there.
There is a really thorough implementation guide for voicemail here. I've used it and then altered it to suit my exact needs in the past.
I want to stream the microphone audio from the web browser to AWS S3.
Got it working
this.recorder = new window.MediaRecorder(...);
this.recorder.addEventListener('dataavailable', (e) => {
this.chunks.push(e.data);
});
and then, when the user clicks stop, upload the chunks (new Blob(this.chunks, { type: 'audio/wav' })) as a multipart upload to AWS S3.
But the problem is that if the recording is 2-3 hours long, the upload might take exceptionally long, and the user might close the browser before the upload completes.
Is there a way we can stream the web audio directly to S3 while it's going on?
Things I tried but couldn't get a working example of:
Kinesis Video Streams: it looks like it's only for real-time streaming between multiple clients, and I would have to write my own consumer which then saves the stream to S3.
I thought about using Kinesis Data Firehose, but couldn't find any client-side data producer for the browser.
I even tried AWS Lex and AWS IVS, but I think they are just over-engineering for my use case.
Any help will be appreciated.
You can set the timeslice parameter when calling start() on the MediaRecorder. The MediaRecorder will then emit chunks which roughly match the length of the timeslice parameter.
You could upload those chunks using S3's multipart upload feature as you already mentioned.
Please note that you need a library like extendable-media-recorder if you want to record a WAV file since no browser supports that out of the box.
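To make that concrete, here is a rough, untested sketch that combines both ideas: MediaRecorder with a timeslice plus S3 multipart upload from the browser, using the AWS SDK for JavaScript v3. Bucket and key names are placeholders, credentials are assumed to come from e.g. Cognito, and it records audio/webm rather than WAV. Chunks are buffered until they reach 5 MiB because every multipart part except the last must be at least that large:

import {
  S3Client,
  CreateMultipartUploadCommand,
  UploadPartCommand,
  CompleteMultipartUploadCommand,
} from '@aws-sdk/client-s3';

const s3 = new S3Client({ region: 'us-east-1' });   // credentials e.g. from Cognito
const Bucket = 'my-audio-bucket';                   // placeholder
const Key = 'recordings/session-1.webm';            // placeholder
const MIN_PART_SIZE = 5 * 1024 * 1024;              // S3 minimum part size (except last part)

async function recordToS3(stream) {
  const { UploadId } = await s3.send(
    new CreateMultipartUploadCommand({ Bucket, Key, ContentType: 'audio/webm' })
  );

  const parts = [];
  let buffer = [];
  let buffered = 0;
  let partNumber = 1;

  const flush = async () => {
    if (buffered === 0) return;
    const Body = new Blob(buffer, { type: 'audio/webm' });
    const PartNumber = partNumber;
    partNumber += 1;
    buffer = [];
    buffered = 0;
    const { ETag } = await s3.send(
      new UploadPartCommand({ Bucket, Key, UploadId, PartNumber, Body })
    );
    parts.push({ ETag, PartNumber });
  };

  const recorder = new MediaRecorder(stream, { mimeType: 'audio/webm' });

  // Each timeslice produces a chunk; upload a part whenever 5 MiB is buffered.
  recorder.addEventListener('dataavailable', async (e) => {
    buffer.push(e.data);
    buffered += e.data.size;
    if (buffered >= MIN_PART_SIZE) await flush();
  });

  // On stop, upload whatever is left (the last part may be smaller) and finish.
  recorder.addEventListener('stop', async () => {
    await flush();
    await s3.send(new CompleteMultipartUploadCommand({
      Bucket, Key, UploadId,
      MultipartUpload: { Parts: parts.sort((a, b) => a.PartNumber - b.PartNumber) },
    }));
  });

  recorder.start(10000); // emit a chunk roughly every 10 seconds
  return recorder;
}

Usage would be something like navigator.mediaDevices.getUserMedia({ audio: true }).then(recordToS3), then calling recorder.stop() when the user is done; the bulk of the audio is already in S3 by that point.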
I cannot find any information on how to handle a situation like this:
Stream starts: about 3 o'clock.
1. Before the person who is streaming (let's call him the streamer) starts to stream, I would like to have a static image saying something like: 'The event will start soon'.
2. The streamer starts pushing his stream to the RTMP endpoint, but he's late and starts at 3:02. Up until 3:02 the same picture should be visible (as in point 1).
3. The streamer should finish at 4 o'clock, but he finishes 5 minutes before 4 (pressing stop on his device).
4. Now the ending screen should be visible from five minutes to four onwards.
I know that inputs have to be switched in order to change the view, and that this can be scheduled for a fixed time, but I would like the switch to happen dynamically, i.e. when the streamer starts or stops pushing to the RTMP URL (from e.g. Larix software). How do I handle that in AWS MediaLive?
Thank you for asking this question on Stack Overflow. The easiest way to achieve what you are looking to do is to use an Input Prepare scheduled action. The channel will then monitor the input and raise an alert if the RTMP source is not there; when the RTMP source begins, the alert clears. You can send these alerts to a Lambda function that watches for them and performs the switch from the slate MP4 to the RTMP source when it sees that the "RTMP input missing" alert was cleared. The same approach works for switching back when the RTMP input goes away.
Information on Prepare Inputs:
https://docs.aws.amazon.com/medialive/latest/ug/feature-prepare-input.html
Global configuration - Input loss behavior:
https://docs.aws.amazon.com/medialive/latest/ug/creating-a-channel-step3.html
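If it helps, here is a rough, untested sketch of what that Lambda could look like (channel ID, input attachment names, and the exact alert strings are placeholders you would adapt). It receives MediaLive channel alert events from an EventBridge rule and pushes an immediate input-switch action onto the channel's schedule:

// React to MediaLive channel alert events (routed to this Lambda via EventBridge)
// and switch between the slate MP4 input and the RTMP input.
import { MediaLiveClient, BatchUpdateScheduleCommand } from '@aws-sdk/client-medialive';

const medialive = new MediaLiveClient({ region: 'us-east-1' });
const CHANNEL_ID = '1234567';          // placeholder
const RTMP_ATTACHMENT = 'rtmp-live';   // placeholder input attachment name
const SLATE_ATTACHMENT = 'slate-mp4';  // placeholder input attachment name

async function switchTo(inputAttachmentName) {
  await medialive.send(new BatchUpdateScheduleCommand({
    ChannelId: CHANNEL_ID,
    Creates: {
      ScheduleActions: [{
        ActionName: `switch-${Date.now()}`,
        ScheduleActionStartSettings: { ImmediateModeScheduleActionStartSettings: {} },
        ScheduleActionSettings: {
          InputSwitchSettings: { InputAttachmentNameReference: inputAttachmentName },
        },
      }],
    },
  }));
}

export const handler = async (event) => {
  const detail = event.detail || {};
  // The alert detail carries fields such as alert_type and alarm_state
  // (SET = input missing, CLEARED = input is back).
  if ((detail.alert_type || '').includes('RTMP')) {
    if (detail.alarm_state === 'CLEARED') await switchTo(RTMP_ATTACHMENT);
    if (detail.alarm_state === 'SET') await switchTo(SLATE_ATTACHMENT);
  }
};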
Zach
I am trying to implement the producer mentioned here (https://github.com/awslabs/amazon-kinesis-video-streams-producer-sdk-java/blob/master/src/main/demo/com/amazonaws/kinesisvideo/demoapp/PutMediaDemo.java).
I have an MKV file which I want to upload in a loop to act as a producer for a Kinesis video stream. But the program hangs on line 122 (latch.await()). It gets stuck at this line without giving any error, and I am not able to see anything on the Amazon video preview tab.
What am I doing wrong?
Line 122 (latch.await()) is waiting for an acknowledgement or a connection-close event. A firewall or network condition could be causing it to wait forever. Before you try your own MKV file, were you able to get the demo running with the sample MKV files and see playback in the console? Let us know if that succeeds in your environment.
I am using a microphone which records sound through the browser, converts it into a file, and sends the file to a Java server. Then my Java server sends the file to the Cloud Speech API and gives me back the transcription. The problem is that the transcription takes very long (around 3.7 s for 2 s of dialogue).
So I would like to speed up the transcription. The first thing to do is to stream the data (so the transcription can start at the beginning of the recording). The problem is that I don't really understand the API. For instance, if I want to transcribe my audio stream from the source (browser/microphone), I need some kind of JS API, but I can't find anything I can use in a browser (we can't use Node like this, can we?).
Otherwise I need to stream my data from my JS to my Java server (not sure how to do that without breaking the data...) and then push it through streamingRecognizeFile from there: https://github.com/GoogleCloudPlatform/java-docs-samples/blob/master/speech/cloud-client/src/main/java/com/example/speech/Recognize.java
But it takes a file as input, so how am I supposed to use it? I cannot really tell the system whether the recording has finished or not... How will it know it has reached the end of the audio?
I would like to create something in my web browser just like the Google demo here:
https://cloud.google.com/speech/
I think there is some fundamental stuff I do not understand about the way to use the streaming API. If someone can explain a bit how I should proceed with this, it would be awesome.
Thank you.
Google "Speech-to-Text typically processes audio faster than real-time, processing 30 seconds of audio in 15 seconds on average" [1]. You can use Google APIs Explorer to test exactly how long your each request would take [2].
To speed up the transcribing you may try to add recognition metadata to your request [3]. You can provide phrase hints if you are aware of the context of the speech [4]. Or use enhanced models to use special set of machine learning models [5]. All these suggestions would improve the accuracy and might have effects on transcribing speed.
When using streaming recognition, you can set the singleUtterance option to true in the config. This will detect when the user pauses speaking and end the recognition. Otherwise, the streaming request will continue until the content limit is reached, which is 1 minute of audio for a streaming request [6].
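As a rough illustration of those options, here is an untested Node sketch using the @google-cloud/speech client (the browser would forward raw audio chunks to this backend, e.g. over a WebSocket; encoding, sample rate, phrases, and model are placeholders for your own setup):

// Server-side streaming recognition with singleUtterance, phrase hints and an
// enhanced model. The browser is assumed to forward raw LINEAR16 audio chunks
// to this process (e.g. over a WebSocket).
import speech from '@google-cloud/speech';

const client = new speech.SpeechClient();

const recognizeStream = client
  .streamingRecognize({
    config: {
      encoding: 'LINEAR16',
      sampleRateHertz: 16000,
      languageCode: 'en-US',
      speechContexts: [{ phrases: ['example phrase', 'product name'] }], // phrase hints
      useEnhanced: true,
      model: 'phone_call', // enhanced model, if available for your language
    },
    interimResults: true,  // partial results while the user is still speaking
    singleUtterance: true, // stop automatically when the user pauses
  })
  .on('data', (data) => {
    const result = data.results && data.results[0];
    if (result && result.alternatives[0]) {
      console.log(`Transcript: ${result.alternatives[0].transcript}`);
    }
  })
  .on('error', console.error);

// Write each incoming audio chunk to the stream as it arrives from the browser:
//   recognizeStream.write(audioChunk);
// and call recognizeStream.end() when the recording stops.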