H.264 over RTP - Identify SPS and PPS Frames - c++

I have a raw H.264 Stream from an IP Camera packed in RTP frames. I want to get raw H.264 data into a file so I can convert it with ffmpeg.
From what I found out, the data in my raw H.264 file has to look like this:
00 00 01 [SPS]
00 00 01 [PPS]
00 00 01 [NALByte]
[PAYLOAD RTP Frame 1] // Payload always without the first 2 Bytes -> NAL
[PAYLOAD RTP Frame 2]
[... until PAYLOAD Frame with Mark Bit received] // From here it's a new Video Frame
00 00 01 [NAL BYTE]
[PAYLOAD RTP Frame 1]
....
So I get the SPS and the PPS from the Session Description Protocol in my preceding RTSP communication. Additionally, the camera sends the SPS and the PPS in two single messages before starting the video stream itself.
So I capture the messages in this order:
1. Preceding RTSP Communication here ( including SDP with SPS and PPS )
2. RTP Frame with Payload: 67 42 80 28 DA 01 40 16 C4 // This is the SPS
3. RTP Frame with Payload: 68 CE 3C 80 // This is the PPS
4. RTP Frame with Payload: ... // Video Data
Then some frames with payload follow, and at some point an RTP frame with the marker bit = 1 arrives. This means (if I got it right) that I have a complete video frame. After this I write the prefix sequence (00 00 01) and the NAL byte from the payload again and go on with the same procedure.
Now my camera sends me the SPS and the PPS again after every 8 complete video frames (again in two RTP frames, as in the example above). I know that the PPS in particular can change during streaming, but that's not the problem.
My questions are now:
1. Do I need to write the SPS/PPS every 8th Video Frame?
If my SPS and my PPS don't change it should be enough to have them written at the very beginning of my file and nothing more?
2. How to distinguish between SPS/PPS and normal RTP Frames?
In my C++ code, which parses the transmitted data, I need to differentiate between the RTP frames with normal payload and the ones carrying the SPS/PPS. How can I distinguish them? Okay, the SPS/PPS frames are usually way smaller, but that's not a safe thing to rely on. If I ignore them, I need to know which data I can throw away; if I need to write them, I need to put the 00 00 01 prefix in front of them. Or is it a fixed rule that they occur every 8th video frame?

If the SPS and PPS do not change, you can omit all of them except the first ones.
You need to parse the nal_unit_type field of each NAL unit: for SPS, nal_unit_type==7; for PPS, nal_unit_type==8.
As I remember, nal_unit_type is the lower 5 bits of the first byte of the NAL unit.
nal_unit_type = frame[0] & 0x1f;
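As a minimal sketch of that check: the enum and function names below are made up for illustration, and `payload` is assumed to point at the first byte after the RTP header. Types 24 and 28 are the STAP-A and FU-A packet types from RFC 3984, which do not carry a plain NAL unit and need further unpacking.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical classification of an RTP payload by its first byte.
enum class NalKind { Sps, Pps, StapA, FuA, Other };

// Looks at the low 5 bits (nal_unit_type) of the first payload byte.
// `payload` must point just past the RTP header; `size` is the payload length.
inline NalKind classifyPayload(const std::uint8_t* payload, std::size_t size) {
    if (payload == nullptr || size == 0) return NalKind::Other;
    switch (payload[0] & 0x1F) {
        case 7:  return NalKind::Sps;    // sequence parameter set
        case 8:  return NalKind::Pps;    // picture parameter set
        case 24: return NalKind::StapA;  // aggregation packet (RFC 3984)
        case 28: return NalKind::FuA;    // fragmentation unit (RFC 3984)
        default: return NalKind::Other;  // slices and everything else
    }
}
```

With the payloads from the question, 0x67 & 0x1F gives 7 (SPS) and 0x68 & 0x1F gives 8 (PPS), so those two packets can be recognized without looking at their size.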

You should write the SPS and PPS at the start of the stream, and again only when they change mid-stream.
SPS and PPS may also arrive packed in a STAP NAL unit (generally STAP-A), with NAL type 24 (STAP-A) or 25 (STAP-B). The STAP format is described in RFC 3984, section 5.7.1.
Don't rely on the marker bit; use the start bit and end bit in the FU header.
For fragmented video frames you should regenerate the NAL unit header from the first 3 bits of the FU indicator (F, NRI) combined with the 5 type bits of the FU header, which is the second byte of the payload (only for packets with the start bit set to 1). See RFC 3984, section 5.8:
The NAL unit type octet of the fragmented
NAL unit is not included as such in the fragmentation unit payload,
but rather the information of the NAL unit type octet of the
fragmented NAL unit is conveyed in F and NRI fields of the FU
indicator octet of the fragmentation unit and in the type field of
the FU header.
EDIT: more explanation about NAL unit construction for fragmentation units.
These are the first two bytes of a FU-A payload (right after the RTP header):
|  FU indicator |   FU header   |
+---------------+---------------+
|0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|NRI|  Type   |S|E|R|  Type   |
+---------------+---------------+
To construct the NAL unit header, take "Type" from the "FU header" and "F" and "NRI" from the "FU indicator".
here is a simple implementation
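A minimal sketch of that reconstruction follows. It is not the answer's original code: the function name is hypothetical, packets are assumed to arrive in order with none lost, and `payload` is assumed to start right after the RTP header.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Appends one FU-A payload to `out` in Annex B form (00 00 01 + NAL unit).
// Returns true when the fragment had the end (E) bit set, i.e. the NAL unit
// is complete. Assumes in-order delivery without loss.
inline bool appendFuA(const std::uint8_t* payload, std::size_t size,
                      std::vector<std::uint8_t>& out) {
    if (payload == nullptr || size < 2) return false;
    const std::uint8_t indicator = payload[0];  // F | NRI | Type (28)
    const std::uint8_t header = payload[1];     // S | E | R | Type
    const bool start = (header & 0x80) != 0;
    const bool end = (header & 0x40) != 0;
    if (start) {
        static const std::uint8_t kStartCode[] = {0x00, 0x00, 0x01};
        out.insert(out.end(), kStartCode, kStartCode + 3);
        // Rebuild the NAL header: F and NRI from the indicator, type from the FU header.
        out.push_back(static_cast<std::uint8_t>((indicator & 0xE0) | (header & 0x1F)));
    }
    // The fragment data follows the two FU bytes.
    out.insert(out.end(), payload + 2, payload + size);
    return end;
}
```

For example, a first fragment starting with 7C 85 has F=0, NRI=3 and FU type 5, so the rebuilt NAL header byte is 0x65, written once after the 00 00 01 prefix; later fragments only contribute their data bytes.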

Related

receiving and merging the file fragments over the CAN message

I wrote some simple code to stream a file over the CAN bus in 8-byte CAN messages.
My C code is below. My question is: how do I merge the fragmented file without any sequence controller?
How do I check the CRC of the received file?
Since the CAN standard has its own acknowledgment, would that be sufficient for streaming such a large file?
typedef struct {
union {
struct {
uint32_t extd: 1;
uint32_t rtr: 1;
uint32_t ss: 1;
uint32_t self: 1;
uint32_t dlc_non_comp: 1;
uint32_t reserved: 27;
};
uint32_t flags;
};
uint32_t identifier;
uint8_t data_length_code;
uint8_t data[TWAI_FRAME_MAX_DLC];
} CAN_message_t;
#define destinationNode 20000x
CAN_message_t msg;
msg.identifier=destinationNode;
msg.data_length_code=8
File date = file.open("/data.bin");
uint8_t *p=date;
while(p){
char buffer[8];
memcpy(buffer, (p+8), 8);
CAN_transmit(&msg);
p+=8;
}
=========================================================
edit code
open the file and send the size and start point to the following function and then close the file
#define SEQUENCE_START 1000000
bool stream(size_t filesize,uint8_t *p){
uint32_t identifer=SEQUENCE_START;
twai_message_t message;
while(filesize<8) {
memcpy(message.data, (p+8), 8);
message.identifier=identifer;
message.data_length_code=8;
if( twai_transmit(&messageOutRPM, pdMS_TO_TICKS(1)) == ESP_OK){
p+=8;
identifer++;
filesize-=8;
}
}
if(filesize>0) {
memcpy(message.data, (p+filesize), filesize);
message.identifier=identifer;
message.data_length_code=filesize;
if( twai_transmit(&messageOutRPM, pdMS_TO_TICKS(1)) == ESP_OK) return true;
}
return true;
}
how to merge the fragmented file without any sequence controller?
CAN gives absolutely no guarantee that the frames you send will be received. Errors on the bus may prevent some frames from being sent out.
Automotive engineers need to send files over the CAN network to implement software updates, which means sending data far larger than 8 bytes. For that they defined a small transport protocol on top of CAN: ISO-15765, usually called ISO-TP.
In this protocol, the frames are sent in groups. The number of frames per group is negotiated during the exchange and can change during the transfer.
To give you an example of the communication flow:
SENDER -> RECEIVER: request to send a 800 bytes frame
SENDER <- RECEIVER: accepted, please group the frames by 4
SENDER -> RECEIVER: send part 1
SENDER -> RECEIVER: send part 2
SENDER -> RECEIVER: send part 3
SENDER -> RECEIVER: send part 4
SENDER <- RECEIVER: well-received, continue
SENDER -> RECEIVER: send part 5
SENDER -> RECEIVER: send part 6
SENDER -> RECEIVER: send part 7
SENDER -> RECEIVER: send part 8
SENDER <- RECEIVER: well-received, continue but please group by 8
SENDER -> RECEIVER: send part 9
SENDER -> RECEIVER: send part 10
SENDER -> RECEIVER: send part 11
SENDER -> RECEIVER: send part 12
SENDER -> RECEIVER: send part 13
SENDER -> RECEIVER: send part 14
SENDER -> RECEIVER: send part 15
SENDER -> RECEIVER: send part 16
SENDER <- RECEIVER: well-received, continue
To identify which part of the frame is being transmitted, one byte is used as a frame counter. It's a rolling counter whose purpose is to verify the completeness of the data. If the frames are not received in the correct order, it does not matter much, as the software can still determine whether any frame has been lost.
[...] long exchange
SENDER -> RECEIVER: FD 00 00 00 00 00 00 00 part N+0
SENDER -> RECEIVER: FE 00 00 00 00 00 00 00 part N+1
SENDER -> RECEIVER: FF 00 00 00 00 00 00 00 part N+2
SENDER -> RECEIVER: 00 00 00 00 00 00 00 00 part N+3
                    ^^
                    Rolling counter, just 1 byte
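On the receiving side, checking such a counter could look roughly like this. It is only a sketch: the class name is invented, and it assumes the counter sits in the first data byte as in the trace above and that frames are expected strictly in order.

```cpp
#include <cstdint>

// Tracks a 1-byte rolling counter and reports when a frame is missing.
class RollingCounterCheck {
public:
    // Returns true if `counter` is the first value seen or the expected successor.
    bool accept(std::uint8_t counter) {
        const bool ok = !hasExpected_ || counter == expected_;
        expected_ = static_cast<std::uint8_t>(counter + 1);  // wraps FF -> 00
        hasExpected_ = true;
        return ok;
    }

private:
    bool hasExpected_ = false;
    std::uint8_t expected_ = 0;
};
```

Feeding it FD, FE, FF, 00 returns true each time; a following 02 returns false because 01 was skipped, which is the point where the receiver would abort or request a retransmission.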
This transport layer is usually quite generic, it's frequent to see it available as a library provided by the CAN tool provider. You can also find some Open Source implementations.
since the CAN standard has its own acknowledgment, would that be sufficient for such huge streaming of a file
Actually, the CAN bus has its own CRC built into every frame, which should be enough for most cases. But if one wants to add a custom checksum, one just needs to define its length and prepend or append it to the data. The receiver can then recalculate the CRC right after the transfer completes.
This code is questionable for several reasons.
First of all, bit-fields are poorly standardized and the bit order may not be the one you expect.
Second, the struct as you posted it will very likely contain padding after data_length_code so writing/reading it to some binary file will be problematic and non-portable.
At any rate I doubt p+8 will ever be correct, because even if there is no padding sizeof(uint32_t)+sizeof(uint32_t) puts us at the data_length_code member, not the data. Why would you want to copy the DLC and 7 bytes into some buffer? This is a bug.
since the CAN standard has its own acknowledgment, would that be sufficient for such huge streaming of a file?
You may want something like CRC32 to ensure the file has not been corrupted. For the CAN transfer itself you don't need a CRC, since CAN comes with CRC-15 built in.
But note that a CRC in the CAN data may be necessary with ugly hardware solutions that use external CAN controllers. Such legacy solutions involve an exposed SPI bus, which has no built-in error control whatsoever. Modern electronics only use external CAN controllers when one is stuck with some exotic MCU that must be used for other reasons but doesn't come with CAN on-chip.
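For the file-level check, a bitwise CRC-32 (the common reflected variant with polynomial 0xEDB88320) is enough as a sketch. The function name is made up, and the code favors brevity over speed; a table-driven version would be faster on a microcontroller. It can be fed one received chunk at a time.

```cpp
#include <cstddef>
#include <cstdint>

// Updates a running CRC-32 with `size` more bytes.
// Start with crc = 0 and pass the previous return value for each further chunk.
inline std::uint32_t crc32Update(std::uint32_t crc, const std::uint8_t* data,
                                 std::size_t size) {
    crc = ~crc;
    for (std::size_t i = 0; i < size; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit) {
            // Shift right; XOR in the polynomial when the dropped bit was 1.
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}
```

The sender would compute the value over the whole file and transmit it in a final frame; the receiver updates its own value with each 8-byte payload and compares the two once the transfer is complete.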

How to decode modbus frame to float value using Arduino code?

Hello, I need your help. I am currently working on an emodbus project with Arduino. I want to read data from an energy meter and show it on the Arduino serial monitor.
For example, I send the following frame from the Arduino to the meter to read the voltage:
01 03 00 12 00 02 64 0E
In response, the Arduino receives the following frame from the meter:
01 03 04 43 54 19 9A 25 9C
which should represent the value 212.1.
My problem is that I could not display it on the serial monitor.
How can I decode this frame with Arduino code to get the true value?
Read here about the modbus library
Frame formats (This answers my question from the comment - you should have known that)
A Modbus "frame" consists of an Application Data Unit (ADU), which
encapsulates a Protocol Data Unit (PDU):[10]
ADU = Address + PDU + Error check,
PDU = Function code + Data.
The byte order for values in Modbus data frames is most significant
byte of a multi-byte value is sent before the others. All Modbus
variants use one of the following frame formats.[1]
Modbus RTU frame format (primarily used on asynchronous serial data lines like RS-485/EIA-485):
Name      Length (bits)  Function
Start     28             At least 3½ character times of silence (mark condition)
Address   8              Station address
Function  8              Indicates the function code; e.g., read coils/holding registers
Data      n × 8          Data + length will be filled depending on the message type
CRC       16             Cyclic redundancy check
End       28             At least 3½ character times of silence between frames
Before using the library or building blocks from it read the issues first.
For the application to emodbus go here: Look into the files emodbus.h and emodbus.cpp and etools.h and etools.cpp
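For the specific frame in the question, the four data bytes 43 54 19 9A read as a big-endian IEEE 754 float give about 212.1. A minimal sketch of that decoding is below. The function name is hypothetical, it assumes the meter sends the high register first (which matches the expected value here, but other meters swap the words), and it does not verify the CRC.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Decodes a Modbus RTU response to function 0x03 that carries one 32-bit
// float in two registers, high word first. The CRC is NOT checked here.
// Layout: address, function, byte count, 4 data bytes, 2 CRC bytes.
inline bool decodeFloatResponse(const std::uint8_t* frame, std::size_t size,
                                float& value) {
    if (frame == nullptr || size < 9) return false;
    if (frame[1] != 0x03 || frame[2] != 4) return false;
    const std::uint32_t bits = (static_cast<std::uint32_t>(frame[3]) << 24) |
                               (static_cast<std::uint32_t>(frame[4]) << 16) |
                               (static_cast<std::uint32_t>(frame[5]) << 8) |
                               static_cast<std::uint32_t>(frame[6]);
    static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");
    std::memcpy(&value, &bits, sizeof value);  // reinterpret the bit pattern
    return true;
}
```

On the Arduino, the decoded value can then be printed with `Serial.println(value, 1);` to get one decimal place.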

boost asio async_read() seems to be skipping some nulls

I'm going a bit crazy with a simple boost asio TCP conversation.
I have a server and a client. I use length-prefixed messages. The client sends "one" and the server responds with "two". So this is what I see happen:
The client sends, and the server receives, 00 00 00 03 6F 6E 65 (== 0x0003 one).
The server responds by sending 00 00 00 03 74 77 6F (== 0x0003 two).
Now here is where it is very strange (code below). If the client reads four bytes, I expect it to get 00 00 00 03. If it reads seven, I expect to see 00 00 00 03 74 77 6F. (In fact, it will read four (the length header), then three (the body).)
But what I actually see is that, while if I read seven at once I do see 00 00 00 03 74 77 6F, if I only ask for four, I see 74 77 6F 03. This doesn't make any sense to me.
Here is the code I'm using to receive it (minus some print statements and such):
const int kTcpHeaderSize = 4;
const int kTcpMessageSize = 2048;
std::array<char, kTcpMessageSize + kTcpHeaderSize> receive_buffer_;

void TcpConnection::ReceiveHeader() {
    boost::asio::async_read(
        socket_, boost::asio::buffer(receive_buffer_, kTcpHeaderSize),
        [this](boost::system::error_code error_code,
               std::size_t received_length) {
            if (error_code) {
                LOG_WARNING << "Header read error: " << error_code;
                socket_.close();  // TODO: Recover better.
                return;
            }
            if (received_length != kTcpHeaderSize) {
                LOG_ERROR << "Header length " << received_length
                          << " != " << kTcpHeaderSize;
                socket_.close();  // TODO: Recover better.
                return;
            }
            uint32_t read_length_network;
            memcpy(&read_length_network, receive_buffer_.data(),
                   kTcpHeaderSize);
            uint32_t read_length = ntohl(read_length_network);
            // Error: read_length is in the billions.
            ReceiveBody(read_length);
        });
}
Note that kTcpHeaderSize is 4. If I change it to 7 (which makes no sense, but just for the experiment) I see the stream of 7 bytes I expect. When it is 4, I see a stream that is not the first four bytes of what I expect.
Any pointers what I am doing wrong?
From what I can see in your code it should work according to the async_read documentation:
The asynchronous operation will continue until one of the following conditions is true:
The supplied buffers are full. That is, the bytes transferred is equal to the sum of the buffer sizes.
An error occurred.
However see the remark at the bottom:
This overload is equivalent to calling:
boost::asio::async_read(
s, buffers,
boost::asio::transfer_all(),
handler);
It looks like the transfer_all condition might be the only thing checked.
Try using the transfer_exactly condition and if it does work report an issue on https://github.com/boostorg/asio/issues.
The suggestion by @sergiopm to use transfer_all was good, and I'm pretty sure it helped. The other issue involved buffer lifetimes in the asynchronous send/receive functions. I got a bit confused, apparently, about how long certain things would live and how long I needed them to live, and so I was overwriting things from time to time. That may have been more important than transfer_all, but I'm still happy to give @sergiopm credit for helping get me on my way.
The intent has just been to have a simple tcp client or server that I can declare, hand it a callback, and then go on my way knowing that I can only pay attention to those callbacks.
I'm pretty sure something like this must exist (thousands of times over). Do feel free to comment below, both for me and for those who come after, if you think there are better libraries than asio for this task (i.e., that would involve substantially less code on my part). The principal constraint is that, due to multiple languages and services, we need to own the wire protocol. Otherwise we get into things like "does library X have a module for language Y?".
As an aside, it's interesting to me that essentially every example I've found does length-prefix encoding rather than beginning/end of packet encoding. Length prefix is really easy to implement but, unless I'm quite mistaken, suffers from re-sync hell: if a stream is interrupted ("I'm going to send you 100 bytes, here are the first 50 but then I died") it's not clear to me that there aren't scenarios where I'm unable to resync properly.
Anyway, I learned a lot along the way, I recommend the exercise.

How to find frame end when MPEG2 stream coming in MPEG-TS Container over RTP?

I am receiving an MPEG2-TS stream over RTP, but I am unable to find the end of a particular frame.
When a plain MPEG2 stream comes over RTP, the marker bit in the RTP header is set to 1 at the end of each frame, but in this case the marker bit is always 0.
Can anyone help me find the frame end in the MPEG2-TS case?
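According to RFC 2250, the M bit marks the end of a frame only for MPEG ES encapsulation (section 3.3, RTP Fixed Header for MPEG ES encapsulation). With MPEG-TS encapsulation (section 2) it instead flags a timestamp discontinuity, so it normally stays 0 and cannot be used to find frame ends.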
The only other way to find the start of a frame is to decode the header of each 188-byte MPEG-TS packet, which contains the "payload unit start indicator".
So your algorithm will look like this:
1. The RTP payload contains an integer number of MPEG-TS packets.
2. Each packet starts with 0x47.
3. Check the "payload unit start indicator" field of each packet.
4. If "payload unit start indicator == 1", check whether the payload is PES or PSI.
5. If it is PSI, ignore the packet and continue with step 1; otherwise go to the next step.
6. For a PES packet, check the "stream id"; if it is video, you have hit a new frame.
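The per-packet check in these steps can be sketched as follows. The function name is hypothetical, and the PES/PSI distinction is made by looking for the PES start code prefix 00 00 01 rather than by tracking PIDs from the PAT/PMT, which is a simplification; video stream ids are taken to be 0xE0 to 0xEF.

```cpp
#include <cstddef>
#include <cstdint>

// Returns true if this 188-byte TS packet begins a video PES packet.
// `pkt` must point to a full 188-byte transport stream packet.
inline bool startsVideoPes(const std::uint8_t* pkt) {
    const std::size_t kPacketSize = 188;
    if (pkt == nullptr || pkt[0] != 0x47) return false;  // sync byte
    if ((pkt[1] & 0x40) == 0) return false;       // payload_unit_start_indicator
    const unsigned afc = (pkt[3] >> 4) & 0x3;     // adaptation_field_control
    if ((afc & 0x1) == 0) return false;           // packet carries no payload
    std::size_t offset = 4;
    if (afc & 0x2) offset += 1u + pkt[4];         // skip the adaptation field
    if (offset + 4 > kPacketSize) return false;
    // A PES packet starts with 00 00 01 followed by the stream id.
    if (pkt[offset] != 0x00 || pkt[offset + 1] != 0x00 || pkt[offset + 2] != 0x01) {
        return false;                             // PSI or something else
    }
    const std::uint8_t streamId = pkt[offset + 3];
    return streamId >= 0xE0 && streamId <= 0xEF;  // video stream ids
}
```

Calling this on every 188-byte slice of the RTP payload tells you where a new video PES packet begins; everything received for that stream before that point belongs to the previous one.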

Binary through http

I'm using C++ to send post-request with binary information. The code looks like:
int binary[4] = { 1, 2, 3, 4 };
std::stringstream out;
out << "POST /address HTTP/1.1\r\n";
out << "Host: localhost\r\n";
out << "Connection: Keep-Alive\r\n";
out << "Content-Type: application/octet-stream\r\n";
out << "Content-Transfer-Encoding: binary\r\n";
out << "Content-Length: " << 4*sizeof(int) << "\r\n\r\n"; // 4 elements of integer type
And sending data into opened connection in socket:
std::string headers = out.str();
socket.send(headers.c_str(), headers.size()); // Send headers first
socket.send(reinterpret_cast<char*>(&binary[0]), bufferLength*sizeof(int)); // And array of numbers
But I was told, that sending pure bytes through http-protocol is wrong. Is that right? For example, I can't send 0 (zero), it's used by protocol.
If that's right (because I can't handle that post-request and get the data I've sent) what could I use instead? Maybe, convert array into hex or base64url?
Thanks.
The problem that people who call it wrong are pointing at is endianness. You can of course transfer binary data over HTTP, but the receiving end must be able to interpret it correctly. Suppose your machine is little endian; your integers (32-bit int) will be stored in memory as
01 00 00 00
02 00 00 00
03 00 00 00
04 00 00 00
and you send these 16 bytes as they "are". Now suppose the receiving machine takes the data naively, disregarding who sent them and how, and suppose that machine is big endian. On such a machine, the memory layout for the integers 1, 2, 3, 4 would be
00 00 00 01
00 00 00 02
00 00 00 03
00 00 00 04
This means that for the receiving machine the first integer is 0x01000000 which is not 0x00000001 as the sender wanted.
If you decide that your integers must always be sent as big-endian integers, then a little-endian sender needs to "re-arrange" the integers properly before sending. There are functions like hton* (host to net) that "transform" host 32/16-bit integers to "net byte order", which is big endian (and vice versa, with ntoh*, net to host).
Note that the data are not scrambled; they are sent as they "are", so to speak. What changes is the way you store them in memory and the way you interpret them when reading. Usually this is not an issue, since data are sent according to a format that, if needed, specifies the endianness of multi-byte data (e.g. see the PNG format spec, sec. 2.1, integers and byte order: PNG uses net byte order, i.e. big endian).
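One way to avoid depending on the host's byte order at all is to build the body byte by byte with shifts instead of sending the in-memory array. This is a sketch with an invented function name, assuming the receiver expects 32-bit big-endian integers.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Serializes 32-bit integers in network byte order (big endian),
// independent of the endianness of the machine running this code.
inline std::string toBigEndianBytes(const std::vector<std::uint32_t>& values) {
    std::string body;
    body.reserve(values.size() * 4);
    for (std::uint32_t v : values) {
        body.push_back(static_cast<char>((v >> 24) & 0xFF));  // most significant byte first
        body.push_back(static_cast<char>((v >> 16) & 0xFF));
        body.push_back(static_cast<char>((v >> 8) & 0xFF));
        body.push_back(static_cast<char>(v & 0xFF));
    }
    return body;
}
```

The resulting string can contain zero bytes, so its `size()` should be used both for the Content-Length header and for the send call, never `strlen`.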
But I was told, that sending pure bytes through http-protocol is
wrong. Is that right?
No, it is fine in the body, depending on the Content-Type of course. "Octet-stream" should be fine in this regard, and yes it can contain zero bytes.
There is nothing wrong with sending binary data via HTTP.
This happens all the time with images and file uploads.