Search for i-frame in RTP Packet - rtp

I am implementing RTSP in C# using an Axis IP camera. Everything works, but when I try to display the video, the first few frames show lots of green patches. I suspect the cause is that I am not sending an I-frame to the client first.
Hence, I want to know the algorithm required to detect an I-frame in an RTP packet.

When initiating an RTSP session, the server normally starts the RTP stream with config data followed by the first I-frame.
It is possible that your Axis camera is set to "always multicast". In that case the RTSP communication yields an SDP description which tells the client all the network and streaming details it needs to receive the multicast stream.
Since the multicast stream is always present, you will most probably receive some P- or B-frames first (depending on GOP size).
You can detect these P/B-frames in your RTP client the same way you detect I-frames, as suggested by Ralf: identify them via the NAL unit type. Simply skip all frames in the RTP client until you receive the first I-frame.
Now you can forward all following frames to the decoder.
Or you have to change your camera settings!
jens.
PS: don't forget that you have fragmentation in your RTP stream. That means that besides the RTP header there is some fragmentation information. Before identifying a frame you have to reassemble it.
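The reassembly step mentioned in the PS can be sketched as follows. This is only an illustration in Python, assuming the camera sends H.264 with FU-A fragmentation as described in RFC 6184; the class name FuAReassembler is made up for this example, and aggregation packets (types 24 to 27) and packet loss are deliberately not handled.

```python
class FuAReassembler:
    """Rebuild H.264 NAL units from RTP payloads (RTP header already removed).

    Hypothetical helper: handles single NAL unit packets and FU-A only.
    """

    FU_A = 28

    def __init__(self):
        self._buf = None  # bytearray while a fragmented NAL unit is in progress

    def push(self, payload):
        """Feed one RTP payload; return a complete NAL unit or None."""
        if not payload:
            return None
        packet_type = payload[0] & 0x1F
        if 1 <= packet_type <= 23:
            # Single NAL unit packet: the payload already is the NAL unit.
            self._buf = None
            return bytes(payload)
        if packet_type != self.FU_A or len(payload) < 2:
            return None  # aggregation packets etc. are out of scope here
        indicator, fu_header = payload[0], payload[1]
        if fu_header & 0x80:  # S bit: first fragment
            # Rebuild the original NAL header from F/NRI bits and the real type.
            self._buf = bytearray([(indicator & 0xE0) | (fu_header & 0x1F)])
        elif self._buf is None:
            return None  # start fragment was missed, drop until the next start
        self._buf += payload[2:]
        if fu_header & 0x40:  # E bit: last fragment
            nal, self._buf = bytes(self._buf), None
            return nal
        return None
```

Each returned NAL unit starts with its one-byte NAL header, so the type check described in the other answer can be applied to it directly.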

It depends on the video media type. If you take H.264, for instance, you would look at the NAL unit header to check the NAL unit type.
The green patches can indeed be caused by not having received an I-frame first.
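As a rough illustration of the NAL unit type check for H.264 (RFC 6184 payload format assumed), the Python sketch below strips the RTP header, reads the type from the low five bits of the first payload byte, and drops everything until the first IDR slice (type 5). The function names are invented for this example; SPS/PPS (types 7 and 8) are passed through because the decoder needs them, and STAP-A aggregation packets are not handled.

```python
NAL_IDR, NAL_SPS, NAL_PPS, NAL_FU_A = 5, 7, 8, 28


def rtp_payload(packet):
    """Return the payload of an RTP packet (header, CSRCs, extension, padding removed)."""
    if len(packet) < 12 or packet[0] >> 6 != 2:
        raise ValueError("not an RTP version 2 packet")
    offset = 12 + 4 * (packet[0] & 0x0F)  # fixed header + CSRC list
    if packet[0] & 0x10:  # X bit: header extension present
        ext_words = int.from_bytes(packet[offset + 2:offset + 4], "big")
        offset += 4 + 4 * ext_words
    end = len(packet)
    if packet[0] & 0x20:  # P bit: last byte holds the padding length
        end -= packet[-1]
    return packet[offset:end]


def nal_type(payload):
    """NAL unit type of the payload; for FU-A, the type of the fragmented unit."""
    t = payload[0] & 0x1F
    if t == NAL_FU_A and len(payload) > 1:
        return payload[1] & 0x1F
    return t


def skip_until_idr(packets):
    """Yield payloads, dropping slices that arrive before the first IDR frame."""
    started = False
    for packet in packets:
        payload = rtp_payload(packet)
        if not payload:
            continue
        t = nal_type(payload)
        if not started and t == NAL_IDR:
            is_fragment = (payload[0] & 0x1F) == NAL_FU_A
            if is_fragment and not payload[1] & 0x80:
                continue  # middle of an IDR whose start we missed
            started = True
        if started or t in (NAL_SPS, NAL_PPS):
            yield payload
```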

Related

[rtp/rtcp server] How to prepare a stored media file for streaming?

I'm trying to understand the RTP/RTCP protocol (RFC 3550).
I know that in the common case the audio and video are streamed separately.
But if I want to stream a stored media file (such as *.mp4) from the server,
how does the server get those tracks from that media file?
RTP is all about carrying real-time data. How you break the data up and put it into an RTP packet payload (called "packetizing") is up to the implementer, but let's look at a common use case of how you'd actually do this.
If you wanted to send your existing recorded MP4 file through an RTP stream you'd first break it into smaller chunks to be sent down the wire at regular intervals packed inside RTP packets.
Let's say you've got a 10 second MP4 file and you decide your packetization timer is 1 second. We'd split it into 10x 1 second long chunks of data we can put into our RTP payloads. (In practice you could use FFmpeg or something similar to split the MP4 into 1 second chunks.)
Then we form our RTP header, we set the Payload Type to something custom, as there's no payload type for MP4 data assigned by IANA. We'd assign a starting sequence number, a Synchronization Source Identifier and a timestamp, and then we'd fill the payload with the first 1 second of data.
1 second after that we'd increment the sequence number by 1, add 1 second to the timestamp, put the next 1 second of data in the payload and send the next RTP packet.
We'd then repeat this 8 more times until we've sent 10 RTP packets containing our 10x 1 second MP4 payloads.
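The header fields described above can be sketched with Python's struct module. This is only a minimal illustration of the 12-byte fixed header from RFC 3550, not the library mentioned below; the function names, the payload type 96 (taken from the dynamic range) and the 90000 Hz clock rate are assumptions made for the example.

```python
import struct


def build_rtp_packet(payload, seq, timestamp, ssrc, payload_type=96, marker=False):
    """Prepend a 12-byte RTP fixed header (version 2, no padding/extension/CSRC)."""
    first = 2 << 6  # V=2, P=0, X=0, CC=0
    second = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack("!BBHII", first, second,
                         seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc)
    return header + payload


def packetize(chunks, ssrc, clock_rate=90000, chunk_seconds=1,
              first_seq=0, first_timestamp=0):
    """Yield one RTP packet per chunk, advancing sequence number and timestamp.

    RFC 3550 recommends random initial values for both; zeros keep the
    example easy to follow.
    """
    step = clock_rate * chunk_seconds  # timestamp units per chunk
    for i, chunk in enumerate(chunks):
        yield build_rtp_packet(chunk, first_seq + i, first_timestamp + i * step, ssrc)
```

With ten 1 second chunks this produces ten packets whose sequence numbers run from 0 to 9 and whose timestamps advance by 90000 each time.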
If you actually wanted to go about implementing this, I wrote this simple Python library for creating RTP packets.
To learn more about RTP there's obviously RFC 3550, for a really in depth look at RTP there's a great book by Colin Perkins called "RTP: Audio and Video for the Internet" and I've written a bit about all the RTP headers and their meaning.
In practice, if you want to get a pre-recorded MP4 file from point A to point B, there are better protocols for it than RTP. RTP is focused on the real-time transfer of media, as in live streaming, not on transferring existing pre-recorded media files. FTP, HTTP or even some of the peer-to-peer protocols would be better suited to transferring this.

How to obtain mp3 audio packets for streaming in C/C++

I want to be able to break a song into packets and have access to these individual packets.
The reason for that is that I want to send each individual packet over the network using an experimental network protocol called Named Data Network.
As the packets arrive at the destination I want to play them. So I want to implement a streaming functionality. The only difference is the network layer that I will use. This network layer is not based on IP.
Does anyone know any C/C++ implementation of breaking a song file into pieces and then playing these packets individually? I looked over Gstreamer, but it seems complicated to get individual packets from its pipeline structure.
I found this reference which was the closest to what I wanted, however it was not so clear for me: how can I parse audio raw data recorder with gstreamer?
Summarizing the points I need:
Break a song into packets
Play the audio content of a single packet (or a small set of packets).
Thank you very much for the help!
An MP3 file is just a succession of MP3 frames. Each frame is made of a header and a data block.
Splitting the MP3 file as MP3 frames will involve parsing the MP3 file. You can refer to this documentation for a good description of the format.
Note that in the case of the MPEG Layer 3 codec, frames are not independent. In the worst case, 9 input frames may be needed before being able to decode one single frame.
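To give an idea of what parsing into frames involves, here is a small Python sketch that reads the 4-byte frame header and computes the frame length. It assumes MPEG-1 Layer III only (the usual case for .mp3 files) and ignores ID3 tags, free-format bitrates and false sync words inside the data, so treat it as a starting point rather than a robust parser. The function names are made up for the example.

```python
# Bitrate (kbit/s) and sample rate tables for MPEG-1 Layer III.
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112,
                 128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES = [44100, 48000, 32000, None]


def mp3_frame_length(header):
    """Return the frame length in bytes, or None if this is not a usable header."""
    if len(header) < 4:
        return None
    b0, b1, b2 = header[0], header[1], header[2]
    if b0 != 0xFF or (b1 & 0xE0) != 0xE0:  # 11-bit frame sync
        return None
    if (b1 >> 3) & 0x03 != 0b11:  # version bits: MPEG-1
        return None
    if (b1 >> 1) & 0x03 != 0b01:  # layer bits: Layer III
        return None
    bitrate = BITRATES_KBPS[b2 >> 4]
    sample_rate = SAMPLE_RATES[(b2 >> 2) & 0x03]
    if bitrate is None or sample_rate is None:
        return None
    padding = (b2 >> 1) & 0x01
    return 144 * bitrate * 1000 // sample_rate + padding


def iter_frames(data):
    """Yield each MP3 frame (header + data block) found in data."""
    pos = 0
    while pos + 4 <= len(data):
        length = mp3_frame_length(data[pos:pos + 4])
        if length is None:
            pos += 1  # not a frame header, resynchronise byte by byte
            continue
        yield data[pos:pos + length]
        pos += length
```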
What I would do instead of this
I guess you could probably ignore most of these details and focus on the streaming problem itself. Here is what I would try to build first:
On the sender side, split a file into packets and send them one by one using your system. Command example: send_stream test.mp3
On the receiver side, receive the packets and rebuild the original file. Command example: receive_stream test.mp3
Once you have this working fine, modify the receiver program so that it writes the packets in order to standard output. This will allow you to redirect stdout to a file:
# sender side did not change
send_stream test.mp3
# receiver side
receive_stream > test.mp3
Then, you can use madplay to play the mp3 while it is received simply by redirecting receive_stream output to madplay:
# madplay - tells madplay to read its input from standard input.
receive_stream | madplay -
For a good mp3 decoder, take a look at MAD.
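The send_stream / receive_stream idea above could look roughly like this in Python. The network layer is left out on purpose: split_into_packets and InOrderWriter are hypothetical helpers, and the 1024-byte packet size is an arbitrary choice for the example. The receiver buffers packets that arrive early and writes them out strictly in sequence order, which is what piping into a player requires.

```python
import io


def split_into_packets(data, size=1024):
    """Split data into (sequence_number, chunk) pairs of at most size bytes."""
    return [(seq, data[off:off + size])
            for seq, off in enumerate(range(0, len(data), size))]


class InOrderWriter:
    """Write received chunks to a stream in sequence order."""

    def __init__(self, out):
        self.out = out
        self.next_seq = 0
        self.pending = {}  # chunks that arrived ahead of their turn

    def receive(self, seq, chunk):
        self.pending[seq] = chunk
        # Flush every chunk that is now contiguous with what was written.
        while self.next_seq in self.pending:
            self.out.write(self.pending.pop(self.next_seq))
            self.next_seq += 1


def demo():
    """Deliver packets out of order and return what the receiver wrote."""
    packets = split_into_packets(b"abcdefghij", size=3)
    out = io.BytesIO()
    writer = InOrderWriter(out)
    for seq, chunk in reversed(packets):  # worst case: fully reversed arrival
        writer.receive(seq, chunk)
    return out.getvalue()
```

In a real receive_stream program, out would be sys.stdout.buffer so that the output can be redirected to a file or piped into madplay.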

h.264 I-frame loss handling in rtsp streaming

I am developing a player which opens an RTSP stream using Live555 and uses FFmpeg to decode the video stream. I am stuck at a point where an IDR frame gets lost over the network, so that decoding the B/P frames that follow it produces a jittering effect in the video. The resulting video quality is very bad.
So my question is: how can I handle I-frame packet loss? I would like to know if there is any strategy/algorithm to handle packet loss, so that the video stays smooth and clear.
Any help will be appreciated.
Thank You.
If it's a first approach, I guess you decode the frame synchronously, meaning the Live555 afterGetting callback directly calls FFmpeg's avcodec_decode_video2.
In that case the receiving socket is not read during decoding, so packets are buffered until the buffer overflows.
You can try different workarounds, like increasing the socket buffer or using RTP over TCP, but a real solution needs to be more asynchronous. For instance, afterGetting can push data to a FIFO and the decoding thread can read from it.
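The FIFO arrangement can be sketched as below. It is written in Python purely to show the pattern; in the real player the producer would be the Live555 afterGetting callback and the consumer a thread calling the FFmpeg decoder. The class name DecoupledDecoder, the decode callable and the queue size are all assumptions made for this example.

```python
import queue
import threading


class DecoupledDecoder:
    """Hand frames from the receive callback to a separate decoding thread."""

    def __init__(self, decode, maxsize=256):
        self._decode = decode  # hypothetical callable doing the slow decode
        self._fifo = queue.Queue(maxsize=maxsize)
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def on_frame(self, data):
        """Called from the receive callback; returns immediately.

        Returns False when the FIFO is full and the frame was dropped.
        """
        try:
            self._fifo.put_nowait(data)
            return True
        except queue.Full:
            return False

    def _run(self):
        while True:
            data = self._fifo.get()
            if data is None:  # sentinel pushed by close()
                break
            self._decode(data)

    def close(self):
        """Let the decoder drain the FIFO, then stop the thread."""
        self._fifo.put(None)
        self._thread.join()
```

Because on_frame never blocks, the socket keeps being read while a frame is being decoded, which is the point of the suggestion above.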
Well, once an I-frame is lost, it's lost. You can't really do anything on the client side. The only way we could attack this problem was to configure the server (i.e. the streamer) to send I-frames either more frequently (more I-frames in the stream) or less frequently (fewer I-frames in the stream). If you use ffmpeg/libx264, the point at which I-frames are sent can be fine-tuned to an incredible level of precision.

MPEG4 out of Raw RTP Payload

Okay I got the following problem:
I have an IP Camera which is able to stream MPEG4 data over RTP
I am able to connect to this camera via RTSP
I can receive the raw RTP data.
So what problems do I have now?
1. Extract Data
What is the data I actually want? I know that I have to truncate the RTP header, but is there anything else I need to cut from the RTP packets?
2. Packetization Mode
I read that I should expect a field Packetization Mode in my SDP- well it's not there. Does that mean I have to assume some kind of standard packetization mode?
3. Depacketization
If I got it right I need to buffer all incoming frames with the Marker Bit = false until I get a frame with Marker Bit = true to get a complete MPEG4 Frame. What exactly do I have to understand by MPEG4 Frame? Keyframe + data until next keyframe?
4. Decode
Do I have to decode the data any further then? In other threads I saw that people used another decoder, but what is there left to decode? I mean the camera should send the data already MPEG4 coded?
5. Libraries
If I really need to decode the data, are there any open libraries I could use for that? Or maybe there is even a library which has some functions where I can just dump my RTP data and then magic happens and I get my mp4. ( But I assume there will be nothing like that .. )
Note: Everything I want to do should be part of my own application, meaning for example, I can't use an external software to parse the data.
Well, long story short: I'd really need some kind of step-by-step explanation of how to do this. I know this is a broad question, but I don't know how to proceed. I also looked into the RFCs, but I couldn't extract much information out of them.
Also I already looked up these two Questions:
How to process raw UDP packets so that they can be decoded by a decoder filter in a directshow source filter
MPEG4 extract from RTP payload
But also the long answer from the first question could not make everything clear to me.
UPDATE: Well, I read up a bit further and now I don't know where to look anymore. It seems that all the packetization stuff etc. is actually not needed for my purpose. I also recorded a stream with openRTSP. When I open those files in a hex editor I see that there are 16 bytes which I can't identify, followed by the config part of the SDP. Then the frame starts with the usual 00 00 01 B6. Also, openRTSP adds some kind of tail to the MP4. I actually don't know what I need and what is just "extra" stuff which isn't mandatory.
I know that I have to truncate the RTP header - but is there anything
else I need to cut from the RTP packets?
The RTP payload might carry data taken from a file format (such as MP4), or it could be packetized directly according to RFC 3640 or something similar. You need to find that out.
What exactly do I have to understand by MPEG4 Frame? Keyframe + data
until next keyframe? Do I have to decode the data any further then?
In other threads I saw that people used another decoder - but what is
there left to decode? I mean the camera should send the data already
MPEG4 coded?
You should explore the basics of MPEG compression to appreciate this fully. The depacketization only gives you a string of bits. This is compressed data. You need to decompress it (decode it) to see it on the screen.
are there any open libraries I could use for that?
Try FFmpeg or MPEG4IP.
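For the depacketization step asked about in the question, a rough Python sketch might look like this. It assumes the camera follows RFC 3016 style packetization for MPEG-4 Visual, where the payload is plain elementary stream data and the marker bit flags the last packet of a frame (VOP); whether that holds for a given camera has to be checked against its SDP. FrameAssembler and vop_coding_type are names invented for the example, and lost or reordered packets are not handled.

```python
VOP_START_CODE = b"\x00\x00\x01\xb6"
VOP_TYPES = {0: "I", 1: "P", 2: "B", 3: "S"}


class FrameAssembler:
    """Collect RTP payloads until a packet with the marker bit set arrives."""

    def __init__(self):
        self._parts = []

    def push(self, payload, marker):
        """Feed one payload; return the complete frame when marker is True."""
        self._parts.append(payload)
        if not marker:
            return None
        frame = b"".join(self._parts)
        self._parts = []
        return frame


def vop_coding_type(frame):
    """Return "I", "P", "B" or "S" for the first VOP in frame, or None.

    The two bits following the VOP start code 00 00 01 B6 hold the
    coding type; an "I" VOP is a keyframe.
    """
    i = frame.find(VOP_START_CODE)
    if i == -1 or i + 4 >= len(frame):
        return None
    return VOP_TYPES[frame[i + 4] >> 6]
```

The assembled frames are still compressed; they are what you would hand to a decoder such as the one in FFmpeg.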

Service a live OpenCV H.264 stream through Live555 on Windows

Totally new to this! As the title says, I'm trying to serve a stream from OpenCV through Live555 using H.264 that is captured from a webcam.
I've tried something like:
#define LOCALADDRESS "rtsp://localhost:8081" // Address media is served
#define FOURCCCODEC CV_FOURCC('H','2','6','4') // H.264 codec
#define FPS 25 // Frame rate things run at
m_writer = cvCreateVideoWriter(LOCALADDRESS, FOURCCCODEC, FPS, cvSize(VIDEOWIDTH, VIDEOHEIGHT));
since reading an RTSP stream is done similarly:
CvCapture *capture = cvCreateFileCapture(LOCALADDRESS);
which doesn't work, so I'm turning to Live555. How do I feed a CvCapture encoded in H.264 to be served by Live555? There doesn't seem to be a straightforward way to pass a bytestream from one to the other, or perhaps I'm missing something.
There really isn't a straight-forward way I know of; certainly nothing that will happen in anything less than a few hundred lines of code.
I'm assuming you want to use an on-demand RTSP server (this is where the server's just sitting there, waiting for a client to connect, and then it starts streaming when the client establishes a connection and makes a request)? If so, this item in the Live555 FAQ applies.
However, Live555 is a weird (possibly misguided?) library, so it's unfortunately a bit more complicated than that. Live555 uses a single thread of operation with an event loop, so what you'll have to do is shove your raw bytestream into a buffer or queue, and then in your subsession class for streaming H.264, you'll check and see if there's available data in the queue and if so, pass it along. If not, schedule another check in a few milliseconds. You'll also need to strip off any NALU identifiers before you pass them along to live555.
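As an illustration of the last step, the sketch below splits an H.264 bytestream into individual NAL units with the start codes removed. It is Python rather than the C++ a Live555 subsession would use, and it assumes that "NALU identifiers" refers to the Annex B start codes (00 00 01 or 00 00 00 01) that encoders put in front of each NAL unit; split_annexb is a name made up for the example.

```python
START_CODE = b"\x00\x00\x01"


def split_annexb(stream):
    """Return the NAL units in an Annex B bytestream, start codes removed."""
    nal_units = []
    i = stream.find(START_CODE)
    while i != -1:
        start = i + len(START_CODE)
        nxt = stream.find(START_CODE, start)
        end = len(stream) if nxt == -1 else nxt
        # A 4-byte start code (00 00 00 01) leaves its leading zero at the end
        # of the previous unit. A NAL unit never ends in a zero byte, so
        # trailing zeros can be stripped safely.
        nal = stream[start:end].rstrip(b"\x00")
        if nal:
            nal_units.append(nal)
        i = nxt
    return nal_units
```

Each returned unit could then be pushed onto the queue that the subsession polls from the event loop.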