Converting PCM-ALAW data to an audio file using ffmpeg - c++

In my project, I process the received RTP packets and extract the payload into a separate buffer. This payload is PCM ALAW (type 8). How do I implement a class that takes a file name and a buffer of raw data as arguments and creates an audio file? Exactly what steps do I have to go through to encode the raw data into an audio file? As a starting point, I used this example.

That sounds way too complex. "PCM ALAW" is a bit misleading, but it's pretty clear that G.711 aLaw encoding is meant. That's a trivial "compression" which maps each 16-bit PCM sample to an 8-bit value, so a trivial lookup fixes that.
There's even a free implementation of the aLaw encoding available. Just convert each sample to 16-bit PCM, stuff a standard Microsoft WAVE header in front of it, and call the result .WAV.
You'll need to fill in a few WAV header fields based on RTP type 8. Chiefly, that's "mono, 8000 Hz, 16 bits per sample". One small problem with the header is that you can only write the full header once you know how many samples you have. You could update the header whenever you receive an RTP packet, but that's a bit I/O intensive. It might be nicer to do that once per 10 packets or so.
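For illustration, here is a minimal C++ sketch of both steps: the standard ITU-T G.711 aLaw expansion, and a canonical 44-byte WAV header written in front of the decoded samples. It assumes a little-endian host and that the whole payload is already gathered in one buffer; the function names are my own, not from any library.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Decode one G.711 aLaw byte to a 16-bit linear PCM sample
// (the standard ITU-T G.711 expansion).
static int16_t alawToPcm(uint8_t a) {
    a ^= 0x55;                          // undo the even-bit inversion
    int t = (a & 0x0F) << 4;
    int seg = (a & 0x70) >> 4;
    switch (seg) {
        case 0:  t += 8;     break;
        case 1:  t += 0x108; break;
        default: t += 0x108; t <<= seg - 1;
    }
    return static_cast<int16_t>((a & 0x80) ? t : -t);  // sign bit set means positive
}

// Write the decoded samples as a canonical 44-byte-header WAV file:
// PCM, mono, 8000 Hz, 16 bits per sample.
static void writeWav(const char* path, const std::vector<int16_t>& samples) {
    const uint32_t dataBytes = static_cast<uint32_t>(samples.size() * 2);
    FILE* f = std::fopen(path, "wb");
    if (!f) return;

    auto u16 = [&](uint16_t v) { std::fwrite(&v, 2, 1, f); };
    auto u32 = [&](uint32_t v) { std::fwrite(&v, 4, 1, f); };

    std::fwrite("RIFF", 1, 4, f); u32(36 + dataBytes);
    std::fwrite("WAVE", 1, 4, f);
    std::fwrite("fmt ", 1, 4, f); u32(16);  // PCM fmt chunk is 16 bytes
    u16(1);             // audio format: PCM
    u16(1);             // channels: mono
    u32(8000);          // sample rate
    u32(8000 * 2);      // byte rate = rate * block align
    u16(2);             // block align: mono * 16 bits
    u16(16);            // bits per sample
    std::fwrite("data", 1, 4, f); u32(dataBytes);
    std::fwrite(samples.data(), 2, samples.size(), f);
    std::fclose(f);
}

// Tie the two together: decode an aLaw buffer and write it out.
void alawBufferToWav(const char* path, const uint8_t* data, size_t len) {
    std::vector<int16_t> pcm(len);
    for (size_t i = 0; i < len; ++i) pcm[i] = alawToPcm(data[i]);
    writeWav(path, pcm);
}
```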

Related

[rtp/rtcp server] How to prepare a stored media file for streaming?

Now I'm trying to understand the RTP/RTCP protocol (RFC 3550). I know that in the common case, audio and video are streamed separately. But if I want to stream a stored media file (such as *.mp4) from the server, how does the server get those tracks from that media file?
RTP is all about carrying real-time data. How you break it up and put it into an RTP packet payload (called "packetizing") is up to the implementer, but let's look at a common use case of how you'd actually do this.
If you wanted to send your existing recorded MP4 file through an RTP stream you'd first break it into smaller chunks to be sent down the wire at regular intervals packed inside RTP packets.
Let's say you've got a 10 second MP4 file and you decide your packetization timer is 1 second: we'd split it into 10 one-second chunks of data we can put into our RTP payloads. (In practice you could use FFmpeg or something similar to split the MP4 into 1 second chunks.)
Then we form our RTP header. We set the payload type to something custom, as there's no payload type for MP4 data assigned by IANA. We'd assign a starting sequence number, a synchronization source identifier (SSRC) and a timestamp, and then we'd fill the payload with the first 1 second of data.
1 second after that we'd increment the sequence number by 1, add 1 second to the timestamp, put the next 1 second of data in the payload and send the next RTP packet.
We'd then repeat this 8 more times until we've sent 10 RTP packets containing our 10x 1 second MP4 payloads.
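To make those header fields concrete, here is a rough C++ sketch of serializing the 12-byte fixed RTP header from RFC 3550, section 5.1, in front of a payload chunk. The values you'd pass in (a dynamic payload type like 96, a random SSRC) are illustrative, not mandated:

```cpp
#include <cstdint>
#include <vector>

// Build one RTP packet: the 12-byte fixed header (RFC 3550, sec. 5.1)
// followed by the payload chunk. All multi-byte fields are big-endian.
std::vector<uint8_t> makeRtpPacket(uint16_t seq, uint32_t timestamp,
                                   uint32_t ssrc, uint8_t payloadType,
                                   bool marker,
                                   const std::vector<uint8_t>& payload) {
    std::vector<uint8_t> pkt;
    pkt.reserve(12 + payload.size());
    pkt.push_back(0x80);  // V=2, no padding, no extension, no CSRCs
    pkt.push_back(static_cast<uint8_t>((marker ? 0x80 : 0x00) | (payloadType & 0x7F)));
    pkt.push_back(static_cast<uint8_t>(seq >> 8));
    pkt.push_back(static_cast<uint8_t>(seq & 0xFF));
    for (int i = 24; i >= 0; i -= 8) pkt.push_back(static_cast<uint8_t>(timestamp >> i));
    for (int i = 24; i >= 0; i -= 8) pkt.push_back(static_cast<uint8_t>(ssrc >> i));
    pkt.insert(pkt.end(), payload.begin(), payload.end());
    return pkt;
}
```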
If you actually wanted to go about implementing this, I wrote a simple Python library for creating RTP packets.
To learn more about RTP there's obviously RFC 3550; for a really in-depth look there's a great book by Colin Perkins called "RTP: Audio and Video for the Internet", and I've written a bit about all the RTP headers and their meaning.
In practice, if you want to get a pre-recorded MP4 file from point A to point B, there are better protocols for it than RTP. RTP is focused on the real-time transfer of media, as in live streaming, not on transferring existing pre-recorded media files; FTP, HTTP or even some of the peer-to-peer protocols would be better suited to transferring this.

Avcodec: generate OPUS header for a stream

I'm using OPUS with avcodec to encode sounds and stream them using my own protocol.
It works with the MP2 codec so far, but when I switch to OPUS, I have this issue:
[opus @ 1b06d040] Error parsing the packet header.
I suppose that, unlike MP2, I need to generate a header for my OPUS-encoded data stream, but I don't know how.
Can someone explain to me how to do that? Thanks.
This error comes from ff_opus_parse_packet() failing. That function handles the raw Opus packet header, what the specification calls the 'TOC' (table-of-contents) byte, plus the optional subframe lengths; the error means libavcodec couldn't find the packet duration where it expected it.
So probably your custom protocol is corrupting the data, returning the wrong data length, or you're otherwise not splitting the Opus packet out of your framing layer correctly.
You don't need to invent your own protocol if you don't want to. There are two established designs: Opus over RTP, for interactive use (like live chat where latency matters), is documented in RFC 7587; for HTTP streaming, file storage for recording, playback and other applications like that, use the Ogg container, documented here. There are implementations of both of these in libavformat. See rtpenc.c, oggenc.c and oggparseopus.c if you're curious about the details.
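For reference, the TOC byte that ff_opus_parse_packet() reads first is laid out in RFC 6716, section 3.1; a tiny sketch of pulling it apart (illustrative only - real code must also validate any frame lengths that follow):

```cpp
#include <cstdint>
#include <cstdio>

// Split the Opus TOC (table-of-contents) byte, the first byte of every
// raw Opus packet (RFC 6716, section 3.1).
void parseToc(uint8_t toc) {
    int config  = toc >> 3;         // 0..31: mode, bandwidth, frame duration
    bool stereo = (toc >> 2) & 1;   // channel flag
    int code    = toc & 0x03;       // frame count code:
    // 0: one frame; 1: two equal-size frames; 2: two frames with an
    // explicit length; 3: an arbitrary count, given in the next byte.
    std::printf("config=%d stereo=%d frame-count-code=%d\n", config, stereo, code);
}
```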

Resample PCM network stream to 8000Hz 8-bit mono via libsndfile sf_open_virtual function

My goal is to take a PCM stream in Node.js that is, for example, 44100 Hz 16-bit stereo, and resample it to 8000 Hz 8-bit mono, to then be encoded into Opus and streamed.
My thought was to try making bindings for libsndfile in C++ and using the sf_open_virtual function for resampling on the stream. However:
How can I reply to its callback function requesting a certain amount of data (found here: http://www.mega-nerd.com/libsndfile/api.html#open_virtual) if my program is still receiving data from the network? Do I just let it hang in a loop until the loop detects that the buffer is a certain percent full?
Since the PCM data is going to be headerless, how can I specify the format type for libsndfile to expect?
Or am I over-complicating things totally?
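On the headerless-data point specifically: raw PCM carries no header, so libsndfile needs the format described up front in SF_INFO via SF_FORMAT_RAW. A hypothetical sketch, with the my_* callbacks and the stream state left as stubs for whatever the network code fills in:

```cpp
#include <sndfile.h>

// Stub virtual-I/O callbacks; real implementations would serve bytes
// from the network-fed buffer passed as user_data.
static sf_count_t my_get_filelen(void* user)                             { return 0; }
static sf_count_t my_seek(sf_count_t offset, int whence, void* user)     { return 0; }
static sf_count_t my_read(void* dst, sf_count_t bytes, void* user)       { return 0; }
static sf_count_t my_write(const void* src, sf_count_t bytes, void* user){ return 0; }
static sf_count_t my_tell(void* user)                                    { return 0; }

SNDFILE* openRawStream(void* stream_state) {
    // With SF_FORMAT_RAW there is no header to parse, so SF_INFO must be
    // filled in completely before the open call.
    SF_INFO info = {};
    info.samplerate = 44100;                              // incoming stream rate
    info.channels   = 2;                                  // stereo
    info.format     = SF_FORMAT_RAW | SF_FORMAT_PCM_16;   // headerless 16-bit PCM

    SF_VIRTUAL_IO vio = { my_get_filelen, my_seek, my_read, my_write, my_tell };
    return sf_open_virtual(&vio, SFM_READ, &info, stream_state);
}
```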

Sending a file via qextserialport

I'm using the qextserialport classes in Qt to implement serial transmissions between devices. Now I'm required to send a file between devices connected via USB using the serial port. I have used the serial port for various functions in the past weeks, but I have no idea where to start implementing this. I thought about reading data event-driven until there is no more data to read, and determining the size (number of bytes) of the file beforehand and sending it together with the data, so that it would be clear if data went missing. I also have a working function that calculates the CCITT 16-bit checksum, so I can use that for checking as well. My question therefore is:
Can someone please send me a link to a site that could help solve my problem, and explain to me what would be the most simple and effective way to send and receive a file via the qextserialport class in Qt? ANY help would be awesome!
You need a protocol. A simple one could be:
send the length of the file name as a raw binary number, for example 2 bytes in network byte order (max name length 65535 bytes)
send that many bytes of file name, encoded in UTF-8
send the file size as a raw binary number, for example 4 bytes in network byte order (max file size 4 GiB)
send that many bytes of file contents
You might want to add info like the file date and a checksum. More advanced would be to split the file into chunks, so that if there is a transmission error you don't have to re-send everything, etc.
Also, study protocols like Kermit, XMODEM and ZMODEM to see how it was done in the modem and BBS era. Maybe use an existing protocol like that instead of creating your own.
Note: while you could use QDataStream, it requires a reliable channel, so it's not as easy as it may seem at first (it requires extra buffering).
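A sketch of that framing with Qt types, assuming `port` is an already-opened qextserialport instance (any QIODevice works the same way); error handling and the optional checksum are left out:

```cpp
#include <QByteArray>
#include <QFile>
#include <QFileInfo>
#include <QIODevice>
#include <QString>
#include <QtEndian>

// Frame a file per the simple protocol above: 2-byte name length,
// name bytes (UTF-8), 4-byte file size, file contents. Multi-byte
// numbers go out in network (big-endian) byte order.
bool sendFile(QIODevice* port, const QString& path) {
    QFile file(path);
    if (!file.open(QIODevice::ReadOnly))
        return false;

    const QByteArray name = QFileInfo(path).fileName().toUtf8();
    const QByteArray contents = file.readAll();

    const quint16 nameLen = qToBigEndian(static_cast<quint16>(name.size()));
    const quint32 fileLen = qToBigEndian(static_cast<quint32>(contents.size()));

    port->write(reinterpret_cast<const char*>(&nameLen), 2);
    port->write(name);
    port->write(reinterpret_cast<const char*>(&fileLen), 4);
    port->write(contents);
    return true;
}
```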

MPEG4 out of Raw RTP Payload

Okay I got the following problem:
I have an IP Camera which is able to stream MPEG4 data over RTP
I am able to connect to this camera via RTSP
I can receive the raw RTP data.
So what problems do I have now?
1. Extract Data
What is the data I actually want? I know that I have to truncate the RTP header - but is there anything else I need to cut from the RTP packets?
2. Packetization Mode
I read that I should expect a Packetization Mode field in my SDP. Well, it's not there. Does that mean I have to assume some kind of standard packetization mode?
3. Depacketization
If I got it right, I need to buffer all incoming frames with the marker bit = false until I get a frame with the marker bit = true, to obtain a complete MPEG4 frame. What exactly do I have to understand by "MPEG4 frame"? Keyframe + data until the next keyframe?
4. Decode
Do I have to decode the data any further then? In other threads I saw that people used another decoder - but what is there left to decode? I mean, the camera should send the data already MPEG4-coded?
5. Libraries
If I really need to decode the data, are there any open libraries I could use for that? Or maybe there is even a library with some functions where I can just dump my RTP data in, and then magic happens and I get my MP4. (But I assume there will be nothing like that...)
Note: everything I want to do should be part of my own application, meaning, for example, that I can't use external software to parse the data.
Well, long story short: I'd really need some kind of step-by-step explanation for this. I know this is a broad question, but I don't know where else to turn. I also looked into the RFCs, but I couldn't extract much information out of them.
Also I already looked up these two Questions:
How to process raw UDP packets so that they can be decoded by a decoder filter in a directshow source filter
MPEG4 extract from RTP payload
But also the long answer from the first question could not make everything clear to me.
UPDATE: Well, I looked into it a bit further, and now I don't know where to look anymore. It seems that all the packetization stuff etc. is actually not needed for my purpose. I also recorded a stream with openRTSP. When I open those files in a hex editor, I see 16 bytes which I can't identify, followed by the config part of the SDP. Then the frame starts with the usual 00 00 01 B6. openRTSP also adds some kind of tail to the MP4 - well, I actually don't know what I need and what's just some "extra" stuff that isn't mandatory.
I know that I have to truncate the RTP header - but is there anything else I need to cut from the RTP packets?
An RTP packet might carry data stuffed from a file format (such as MP4), or it could carry the encoded stream directly, packetized based on RFC 3640 or something similar. You need to find that out.
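Either way, the fixed header is only the minimum to skip. A rough sketch of locating the payload inside a raw RTP packet (RFC 3550, section 5.1), also accounting for CSRC entries, a header extension and padding:

```cpp
#include <cstddef>
#include <cstdint>

// Return a pointer to the RTP payload inside a raw packet, or nullptr
// if the packet is malformed. Skips the 12-byte fixed header, any CSRC
// entries and header extension, and trims padding from the end.
const uint8_t* rtpPayload(const uint8_t* pkt, size_t len, size_t* payloadLen) {
    if (len < 12) return nullptr;
    const size_t cc  = pkt[0] & 0x0F;        // CSRC count
    const bool ext   = pkt[0] & 0x10;        // extension bit
    const bool pad   = pkt[0] & 0x20;        // padding bit
    size_t offset = 12 + 4 * cc;
    if (ext) {
        if (len < offset + 4) return nullptr;
        const size_t extWords = (pkt[offset + 2] << 8) | pkt[offset + 3];
        offset += 4 + 4 * extWords;          // 4-byte extension header + body
    }
    const size_t padBytes = pad ? pkt[len - 1] : 0;  // last byte = padding count
    if (offset + padBytes > len) return nullptr;
    *payloadLen = len - offset - padBytes;
    return pkt + offset;
}
```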
What exactly do I have to understand by MPEG4 frame? Keyframe + data until next keyframe? Do I have to decode the data any further then? In other threads I saw that people used another decoder - but what is there left to decode? I mean the camera should send the data already MPEG4 coded?
You should explore the basics of MPEG compression to appreciate this fully. The depacketization only gives you a string of bits. This is compressed data; you need to uncompress it (decode it) to see it on the screen.
are there any open libraries I could use for that?
Try FFmpeg or MPEG4IP.
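With FFmpeg, for instance, once you've assembled one complete frame from the RTP payloads (everything up to a marker-bit packet), decoding is a send/receive pair on a codec context opened for AV_CODEC_ID_MPEG4. A hedged sketch against the current libavcodec API:

```cpp
#include <cstring>
extern "C" {
#include <libavcodec/avcodec.h>
}

// One-time setup (error checks omitted):
//   const AVCodec* codec = avcodec_find_decoder(AV_CODEC_ID_MPEG4);
//   AVCodecContext* ctx  = avcodec_alloc_context3(codec);
//   avcodec_open2(ctx, codec, nullptr);

// Decode one assembled MPEG-4 frame; returns a raw picture (YUV) the
// caller must free with av_frame_free(), or nullptr on failure.
AVFrame* decodeFrame(AVCodecContext* ctx, const uint8_t* data, int size) {
    AVPacket* pkt = av_packet_alloc();
    if (av_new_packet(pkt, size) < 0) { av_packet_free(&pkt); return nullptr; }
    std::memcpy(pkt->data, data, size);

    AVFrame* frame = av_frame_alloc();
    int ret = avcodec_send_packet(ctx, pkt);
    if (ret >= 0)
        ret = avcodec_receive_frame(ctx, frame);
    av_packet_free(&pkt);
    if (ret < 0) { av_frame_free(&frame); return nullptr; }
    return frame;
}
```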