I want to build a live streaming app.
My thought process:
1. Get the video/audio data from navigator.mediaDevices.getUserMedia(constraints); [client-streamer]
2. Create rooms using sockets (Socket.IO or WebSockets from Flask) [backend]
3. Send the data from step 1 to the room members using sockets.
4. Display the media on the client side.
Is that correct? How should I do it?
How do I broadcast data to specific room members and not to everyone? (Flask)
How do I consistently send data from the streamer -> server -> room members? The stream returned in step 1 is an object; where is the actual data?
Any other/better ideas would be great! Thanks.
I need to implement the server-side by myself without help from libraries that will do the work for me.
Implementing a streaming platform is not trivial. Unfortunately, it is not as simple as emitting the chunks received from the MediaRecorder's ondataavailable handler and forwarding them to users through a WebSocket server: that approach is neither scalable, efficient, nor reliable.
Below are some strategies you can try for different types of scenarios:
P2P: If you want to have simple peer-to-peer streaming, you can use WebRTC to achieve that with a simple socket.io server for signaling purposes.
Conference: Here things start to get more complicated. You will need a media server if you want to be somewhat scalable. One approach is to route your stream to the users using an SFU or MCU. This will take care of forwarding/processing media to different peers efficiently.
Broadcast: Here things are also non-trivial. Common WebRTC-based architectures include ingesting the WebRTC stream and forwarding it to an HLS server, which makes your stream chunks available to clients through a CDN; or performing RTP forwarding of the WebRTC stream, converting it to RTMP with something like FFmpeg, and delivering it through YouTube Live or Twitch to leverage their infrastructure.
Be aware that the last 2 items are resource-intensive and will certainly not be cheap to maintain.
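For the P2P case, the signalling server really only needs to relay messages between the peers in a room. Below is a minimal sketch using a Node.js socket.io server; the event names ("join", "signal") and the payload shape are assumptions, not part of the answer above, and the same join/forward pattern exists in Flask-SocketIO (join_room plus emit(..., room=...)) if you prefer Python on the backend.

```typescript
// Minimal signalling relay for the P2P scenario. The server never touches
// the media; it only forwards SDP offers/answers and ICE candidates between
// the peers that joined a room.
import { Server } from "socket.io";

const io = new Server(3000, { cors: { origin: "*" } });

io.on("connection", (socket) => {
  socket.on("join", (room: string) => {
    socket.join(room);
    // Let existing members know a new peer arrived so they can negotiate.
    socket.to(room).emit("peer-joined", socket.id);
  });

  // Forward signalling payloads to everyone else in the room, untouched.
  socket.on("signal", ({ room, data }: { room: string; data: unknown }) => {
    socket.to(room).emit("signal", { from: socket.id, data });
  });
});
```

Incidentally, socket.to(room).emit(...) is also the answer to "broadcast to specific room members and not to everyone": it emits only to sockets that have joined that room, excluding the sender.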
Below are some open source projects that could help you along the way:
Janus
MediaSoup
AntMedia
Jitsi
Good luck!
Explaining all this is far beyond the scope of a Stack Overflow answer.
Here are a few hints:
You need to use the MediaRecorder API to capture compressed data from your gUM (getUserMedia) stream. MediaRecorder support is inconsistent between makes and models of browser, though.
It kicks a Blob into its ondataavailable handler every so often.
They're compressed as a webm data stream.
You can push those Blobs to a server with socket.io, and the server can turn around and push them to whatever clients you want.
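A minimal sketch of that streamer side follows; the server URL and the "chunk" event name are assumptions made for illustration.

```typescript
// Streamer side: capture compressed webm chunks with MediaRecorder and push
// them to the server over socket.io.
import { io } from "socket.io-client";

const socket = io("https://stream.example.com"); // hypothetical server

async function startStreaming(): Promise<void> {
  const stream = await navigator.mediaDevices.getUserMedia({ video: true, audio: true });

  // VP8/Opus in webm is the most widely supported combination, but support
  // still varies between browsers, as noted above.
  const recorder = new MediaRecorder(stream, { mimeType: 'video/webm; codecs="vp8, opus"' });

  recorder.ondataavailable = async (event: BlobEvent) => {
    if (event.data.size > 0) {
      socket.emit("chunk", await event.data.arrayBuffer()); // socket.io handles binary
    }
  };

  recorder.start(250); // hand us a Blob roughly every 250 ms
}
```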
Playing the webm on the clients is tricky. You may, on some makes and models of browsers, be able to feed the webm stream to the Media Source API using appendBuffer(). But some browsers cannot consume the webm streams.
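On browsers where the Media Source API does accept the webm stream, the viewer side looks roughly like the sketch below. The event name and MIME string mirror the streamer sketch above and are assumptions; keep in mind the caveat that follows, since the client needs the stream from its very first bytes.

```typescript
// Viewer side: feed relayed webm chunks into a <video> element via the
// Media Source API. appendBuffer() must not be called while a previous
// append is still pending, hence the small queue.
import { io } from "socket.io-client";

const socket = io("https://stream.example.com"); // hypothetical server
const video = document.querySelector("video")!;
const mediaSource = new MediaSource();
video.src = URL.createObjectURL(mediaSource);

mediaSource.addEventListener("sourceopen", () => {
  const sourceBuffer = mediaSource.addSourceBuffer('video/webm; codecs="vp8, opus"');
  const queue: ArrayBuffer[] = [];

  const drain = () => {
    if (queue.length > 0 && !sourceBuffer.updating) {
      sourceBuffer.appendBuffer(queue.shift()!);
    }
  };

  socket.on("chunk", (chunk: ArrayBuffer) => {
    queue.push(chunk);
    drain();
  });

  sourceBuffer.addEventListener("updateend", drain);
});
```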
These webm streams are useless to a player without all their Blob data in order. You can't just start sending a new client the Blobs of the stream when they sign in; you have to restart the MediaRecorder.
(You may be able to make it work without a MediaRecorder restart if you send the first few k bytes of the stream to each new client before sending the current Blob. Extracting those bytes is an intricate programming job involving the ebml package to parse the webm stream and extract the prologue. I have not proven this concept.)
Because getting all this to work -- originator -- server -- viewer is such a pain in the xxx neck, you may want to investigate using something like mediasoup instead. It uses WebRTC transport rather than socket.io, and works cross-platform.
I am trying to loop audio from my Icecast server 24/7.
I have seen examples where people talk about storing their audio files on the EC2 instance or in an S3 bucket.
Do I also need a source client running on my EC2 Instance to be able to stream audio to the server? Or is there a way to play static files from Icecast?
Icecast and SHOUTcast servers work by passing a live audio stream from a source on to the users. You need something to produce a single audio stream in realtime from those source files.
The flow looks something like this:
source files (S3/EC2) -> playout/decoder (raw PCM audio) -> encoder (MP3/AAC/etc.) -> Icecast/SHOUTcast -> listeners
Basically, you'll need to do everything you would in a normal radio studio, but automated. You'll stream the files from your bucket, decode them to a raw audio stream, send that stream to your encoder to be compressed with the codec, and then send the compressed stream to your streaming servers for distribution.
You can't simply push your audio files as-is to the Icecast server, for a few reasons:
Stream must be realtime. The server doesn't really know or care about the timing of the stream. It takes the data it's given and sends it off to the client. Therefore, if you push data faster than realtime, the server will attempt to deliver it to the client at that faster rate. Some clients will attempt to buffer the fast stream, but most will put backpressure on it, causing the TCP window to close, and the client will eventually fall far enough behind that the server drops the connection.
Consistent format is required. Chances are, your source files have varying sample rates, channel counts, and even codecs. Most clients cannot handle a change in sample rate or channel count mid-stream. I don't know of any client that supports a codec change mid-stream. (Theoretically possible with Ogg and Matroska/WebM, but yeah... not worth messing with.)
Stream should be free of ID3 tags and other file format cruft. If you simply PUT your files directly to your Icecast server, the output stream will contain more than just the audio data. At a minimum, you'd want to remove all that. Depending on your container format, you'll need to deal with timestamps as well.
Solutions
There are a handful of ways to solve this:
Radio automation software. Many folks simply run something like RadioDJ on cloud-based servers. If you already have a radio station that uses automation, this might be a good solution. It can be expensive, though, and not as flexible. You could even go as low as VLC or something for playout, but then you wouldn't have music transitions and whatnot.
Custom playout script (recommended). I use a browser engine, such as Chromium, and script my channels with normal JavaScript. From there, I take the output stream and pass it off to FFmpeg to encode and send to the streaming servers. This works really well, as I can do all my work in a language everybody knows, and I have easy access to data on cloud-hosted services. I can use the Web Audio API to mix and blend audio based on what's happening in realtime. As an alternative, there is Liquidsoap, but I don't recommend it these days, as its language is difficult to deal with and it is not as flexible as a browser engine.
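To make the Web Audio part of that concrete, here is a rough sketch of loading two tracks and crossfading between them with gain nodes. The track URLs and the 3-second overlap are placeholders, not part of the original answer; in the real setup the mixed output would then be captured and handed to FFmpeg for encoding.

```typescript
// Playout sketch: decode two tracks with the Web Audio API and crossfade
// from the first into the second near the end of the first track.
const ctx = new AudioContext();

async function loadTrack(url: string): Promise<AudioBuffer> {
  const response = await fetch(url);
  return ctx.decodeAudioData(await response.arrayBuffer());
}

async function playWithCrossfade(): Promise<void> {
  const [a, b] = await Promise.all([loadTrack("/tracks/a.mp3"), loadTrack("/tracks/b.mp3")]);
  const fade = 3;                             // seconds of overlap
  const t0 = ctx.currentTime;
  const crossfadeAt = t0 + a.duration - fade; // when track B starts

  const sourceA = ctx.createBufferSource();
  const gainA = ctx.createGain();
  sourceA.buffer = a;
  sourceA.connect(gainA).connect(ctx.destination);
  gainA.gain.setValueAtTime(1, crossfadeAt);
  gainA.gain.linearRampToValueAtTime(0, crossfadeAt + fade); // fade A out
  sourceA.start(t0);

  const sourceB = ctx.createBufferSource();
  const gainB = ctx.createGain();
  sourceB.buffer = b;
  sourceB.connect(gainB).connect(ctx.destination);
  gainB.gain.setValueAtTime(0, crossfadeAt);
  gainB.gain.linearRampToValueAtTime(1, crossfadeAt + fade); // fade B in
  sourceB.start(crossfadeAt);
}
```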
I managed to run the WebRTC peerconnection example, but it is not running in the browser.
I'm trying to find a way to stream both video and audio from browser to my native program.
Is there any way?
It can be done. WebRTC is designed to work in a peer-to-peer manner between two WebRTC agents (typically a Web Browser). Your native program needs to become the second peer.
If you need to rely on open source components a good starting point is:
OpenSSL for the DTLS key exchange.
libsrtp to encrypt the RTP packets.
ffmpeg to decode the audio coming from the browser to PCM (libvpx if you need to do video).
You'll also need to handle the ICE negotiation, which requires processing STUN messages, and extract the media payloads from the RTP packets. All of these steps come after you've determined a signalling method to exchange the SDP offer and answer between your app and the browser.
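For reference, the browser half of that exchange is the easy part. A minimal sketch, with a hypothetical WebSocket signalling URL and a made-up message format, looks like this:

```typescript
// Browser side: create a PeerConnection, add the camera/mic tracks, and send
// the SDP offer plus ICE candidates to the native program over a signalling
// channel (a plain WebSocket here).
const signalling = new WebSocket("wss://example.com/signalling"); // hypothetical

async function connectToNativePeer(): Promise<void> {
  const pc = new RTCPeerConnection({ iceServers: [{ urls: "stun:stun.l.google.com:19302" }] });

  const stream = await navigator.mediaDevices.getUserMedia({ audio: true, video: true });
  stream.getTracks().forEach((track) => pc.addTrack(track, stream));

  pc.onicecandidate = (event) => {
    if (event.candidate) {
      signalling.send(JSON.stringify({ type: "candidate", candidate: event.candidate }));
    }
  };

  signalling.onmessage = async (msg) => {
    const { type, sdp } = JSON.parse(msg.data);
    if (type === "answer") {
      await pc.setRemoteDescription({ type: "answer", sdp });
    }
  };

  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  signalling.send(JSON.stringify({ type: "offer", sdp: offer.sdp }));
}
```

Everything on the native side (DTLS, SRTP, RTP depacketisation, decoding) still has to match what this offer negotiates.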
As you've probably realised, starting from scratch is a major task. There are probably some commercial libraries that will do the job and save you a lot of pain.
If that doesn't scare you and you do still want to make an attempt using open source components this example "may" help. The sample is doing the reverse of what you've asked and is sending a video stream to Chrome rather than receiving an audio stream. The useful aspect is the connection negotiation. The sample program is able to get RTP packets flowing which is often the main problem.
The example is also using Windows Media Foundation which is Windows specific. It also has lots of shortcuts particularly with the RTP and STUN packet processing.
I am working on a live-streaming prototype, I have been reading a lot about how live-streaming works and many different approaches but I still can't find a live-streaming stack that suits my needs...
These are the requirements for my prototype:
1) The video/audio recording must come from a web browser using the webcam. The idea is that the client preferably shouldn't need to install plugins or do anything complicated (maybe installing the Flash Player plugin is acceptable, but only for recording the video; the viewers should be able to view the stream without plugins).
2) It can't be peer-to-peer, since I also need to store the entire video on my server (or on Amazon S3, for example) for viewing later.
3) The viewers should also be able to watch the stream without installing anything, from their web browsers, say Chrome and Firefox for example. We want to use the HTML5 video tag if possible.
4) The prototype should preferably be built without spending money. I have seen that AWS CloudFront and Wowza offer free trials, so we are thinking about using these two services.
5) The prototype only needs to handle one live stream at a time and two viewers, just that, so there are no real scaling requirements.
Any suggestions?
I am especially stuck/confused with the uploading/encoding part of the architecture (I am new to streaming, and all the formats/codecs/protocols/technologies are making it really hard to digest).
As of right now, I have come across WebRTC, which apparently allows me to do what I want: record and encode video from the browser using the webcam. But this API only works on HTTPS sites. Are there any alternatives that work with HTTP sites?
The other part that I am not completely sure about is the need for an encoding server, for example Wowza Streaming Engine. Why do I need it? Isn't it enough if I use, for example, WebRTC to encode the video and then just send it to the distribution service (AWS CloudFront, for example)? I do understand that an encoding server would allow me to support many different devices, since it would create lots of different encodings and serve many different HTTP protocols, but do I need it for this prototype? I just want to make a single-format (MP4, for example) live stream that can be viewed in two web browsers, that's all; I don't need a variety of formats nor support for different bandwidths or devices.
Based on your requirements, WebRTC is a good way to go.
"This API only works with HTTPS sites. Are there any alternatives that work with HTTP sites?"
No. Currently Firefox is the only browser that allows WebRTC on HTTP, but eventually it will require HTTPS as well.
For this prototype you need to go with Wowza WebRTC.
When going with Wowza, all the streams are delivered from Wowza only, so it becomes routed WebRTC.
Install Wowza - https://www.wowza.com/docs/how-to-install-and-configure-wowza-streaming-engine
Enable the WebRTC - https://www.wowza.com/docs/how-to-use-webrtc-with-wowza-streaming-engine
Download and configure StreamLock, or a self-signed JKS file - https://www.wowza.com/docs/how-to-request-an-ssl-certificate-from-a-certificate-authority
Download the sample WebRTC - https://www.wowza.com/_private/webrtc/
Publish the stream using the publish HTML page and play it through the play HTML page (supported in Chrome, Firefox & Opera browsers)
For MP4 files with WebRTC: you need to enable the transcoder with H.264 & AAC. You also need to enable the option to record all incoming streams in the properties of the application you are creating for WebRTC (not the DVR). Using the file writer module, save all the recorded files to a custom location. Then, using a custom script (Bash, Python), move all the transcoded files to the S3 bucket and deliver them through CloudFront.
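That "custom script" step could be as small as the sketch below. It uses the AWS SDK for JavaScript (v3) instead of Bash/Python just to keep the examples in one language; the recordings directory, bucket name and key prefix are placeholders.

```typescript
// Upload recorded MP4 files to S3 so CloudFront can deliver them.
import { readdir } from "fs/promises";
import { createReadStream } from "fs";
import { join } from "path";
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";

const s3 = new S3Client({ region: "us-east-1" });

async function uploadRecordings(dir: string, bucket: string): Promise<void> {
  for (const file of await readdir(dir)) {
    if (!file.endsWith(".mp4")) continue; // only the transcoded recordings
    await s3.send(
      new PutObjectCommand({
        Bucket: bucket,
        Key: `recordings/${file}`,
        Body: createReadStream(join(dir, file)),
      })
    );
    console.log(`uploaded ${file}`);
  }
}

uploadRecordings("/path/to/wowza/content", "my-stream-archive"); // placeholders
```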
I'm working on a project that uses Jabber as its communication platform.
The thing is that I need clients (a lot of clients) to communicate with each other not only for signalling, but also to exchange data between them.
Imagine that client A has 3 services available. Client B could ask A to start sending it info from each service (like a streaming service) until client B tells A to stop the services.
These services could send just one character every 100 ms, or 1000 characters every 100 ms, or even send data only when it's needed.
When the info sent to B arrives, B has to know which service it corresponds to, what the action is, and the values (for example), so I'm using JSON over Jabber.
My problem is that I'm wasting a lot of bandwidth with the Jabber/XMPP protocol just to send a message with a body like:
{"s":"x", "x":5} //each 100ms (5 represents any number)
I really don't want to have parallel communication (like direct sockets), because Jabber already has all of that implemented: it's easily scalable, avoids firewall problems, and sometimes I use HTTP communication (I'm using BOSH in this case).
I know that there is some compression I can do, but I'm wondering if you can recommend something else that wouldn't carry such an amount of XML around my message while still using Jabber.
Thanks a lot for your help.
Best Regards,
Eduardo
It sounds like, except for your significant data transfer, XMPP suits your application well.
As you probably know, XMPP was never designed or intended to be used as a big pipe for data transfer. Most applications that involve significant data transfer, such as file transfers and voice/video, use XMPP just for negotiation of a separate "out of band" stream. You say this might cause problems for you because of firewalls and web clients.
If your application is mostly transferring text, then you really should try out compression... it offers significant savings on bandwidth, if that's your most constrained resource. The downside is that it will take more client and server memory (around 300KB by default, but that can be reduced with marginal compression loss).
Alternatively you can look at tunneling your data base64-encoded using In-Band Bytestreams. I don't have your sample data, or know how you are wrapping them for transport, and this could come off worse or better. I would say it would come off better if you stripped out your JSON and made it into a more efficient binary format instead. Base64 data will not compress so well, and is roughly 33% larger than the raw data. The savings would be in being able to strip out JSON and any other extraneous wrappings, while keeping the data within the XMPP stream.
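As a rough illustration of "strip out the JSON and use a more efficient binary format": the sample body above fits in 3 bytes if the service can be identified by a small numeric id. The layout here is made up for the example; the point is the size versus the ~16-byte JSON body (before any XMPP framing).

```typescript
// Pack one sample ({"s": <service>, "x": <value>}) into 3 bytes:
// 1 byte for the service id, 2 bytes for a signed 16-bit value.
function packSample(service: number, value: number): Uint8Array {
  const buf = new Uint8Array(3);
  const view = new DataView(buf.buffer);
  view.setUint8(0, service); // which service the sample belongs to
  view.setInt16(1, value);   // the sample value (big-endian)
  return buf;
}

function unpackSample(buf: Uint8Array): { service: number; value: number } {
  const view = new DataView(buf.buffer, buf.byteOffset, buf.byteLength);
  return { service: view.getUint8(0), value: view.getInt16(1) };
}
```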
In the end scaling most applications is hard, whichever technologies you use. It requires primarily insight - you shouldn't change anything without testing it first, and you should be testing beforehand to find out what you ought to change. You should be analyzing your system for the primary bottlenecks (is it really the client's bandwidth??). Rarely in my experience has XML itself been the direct bottleneck. However ultimately all these things are unique to your application, it's not easy to give generic advice at scale.
No, XML is not trash. It's human-readable, very extensible, and can be compressed extremely well.
XMPP supports stream compression, and this stream compression (mostly zlib) works extremely well according to all my tests. So if it's important for you to optimize the number of bytes you send over the wire, or you are on low bandwidth, use stream compression when you are on sockets. When you are on BOSH, you have to use either a server which supports HTTP compression or a proxy in between to enable compression. But keep in mind that BOSH also has a lot of overhead from all the HTTP headers.
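If you want to see what that buys you before changing anything, you can deflate a batch of representative stanzas and compare sizes. A small sketch with a made-up stanza follows; Node's zlib uses the same algorithm most servers apply for XEP-0138 stream compression.

```typescript
// Compare the raw and deflated size of a batch of repetitive stanzas.
import { deflateSync } from "zlib";

const stanza = '<message to="b@example.com"><body>{"s":"x","x":5}</body></message>';
const batch = Buffer.from(stanza.repeat(100));

const compressed = deflateSync(batch);
console.log(`raw: ${batch.length} bytes, deflated: ${compressed.length} bytes`);
// Highly repetitive XML like this typically shrinks by well over 90%.
```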
So we have a server with some address, port and IP, and we are developing that server, so we can implement whatever we need on it. What are the standard/best practices for managing data transfer speed between a C++ Windows client app and a C++ server?
My main question is how to find out how much data can be uploaded/downloaded from/to the client, over their low-speed network, to my relatively super-fast server. (I need it to set the bit rate of their live audio/video stream.)
My attempt at explaining number 3:
We do not care how fast our server is; it is always faster than needed. We care about the client trying to stream their media to our server. The client streams encoded (via FFmpeg) live video data to our server, but has, say, ADSL with 500 kb/s of outgoing traffic. They also use ICQ or whatever, so they have less than 500 kb/s available. And they want to stream live video! So we need to set up our FFmpeg to encode the video with respect to the bit rate the user can actually provide. We develop both the server side and the client side, and we need a way of finding out how much the user can currently upload per second (the value can change dynamically over time).
Check this CodeProject article; it's .NET, but you can try to figure out the technique from there.
I found what I wanted: "thrulay, network capacity tester", a C++ library for tracking available bandwidth in real time on clients. There is also "Spruce", which is also open source. It is built on some Linux code, but I use the Boost library, so it will be easy to rewrite.
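For a very rough ballpark (before reaching for a proper available-bandwidth tool like the above), you can simply time the upload of a fixed-size payload and derive a rate from it. Sketched in TypeScript to keep the examples in one language; the /probe endpoint, payload size and 70% safety margin are all made up, and the same logic ports directly to the C++ client.

```typescript
// Crude upload-throughput probe: POST a fixed payload and time it.
async function probeUploadKbps(url: string, sizeBytes = 256 * 1024): Promise<number> {
  const payload = new Uint8Array(sizeBytes);           // contents don't matter
  const start = performance.now();
  await fetch(url, { method: "POST", body: payload }); // server reads and discards the body
  const seconds = (performance.now() - start) / 1000;
  return (sizeBytes * 8) / 1000 / seconds;             // kilobits per second
}

// Pick an encoder bitrate somewhat below the measured capacity, leaving
// headroom for the user's other traffic (the ICQ example above), and repeat
// the probe periodically since the available rate changes over time.
async function chooseVideoBitrateKbps(): Promise<number> {
  const kbps = await probeUploadKbps("https://stream.example.com/probe");
  return Math.max(100, Math.floor(kbps * 0.7));
}
```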
Off-topic: I want to report that there is a group of people on SO downvoting all questions on this topic. I do not know why they are so angry, but they definitely exist.