AVPF scheduling algorithm

According to RFC4585:
If R's FB message(s) was not suppressed by other receiver FB
messages as per 5, when te is reached, R MUST transmit the
(minimal) compound RTCP packet containing its FB message(s). R
then MUST set allow_early = FALSE, MUST recalculate tn = tp +
2*T_rr, and MUST set tp to the previous tn. As soon as the newly
calculated tn is reached, regardless whether R sends its next
Regular RTCP packet or suppresses it because of T_rr_interval, it
MUST set allow_early = TRUE again.
Suppose that an early RTCP packet is transmitted at time t1':
t1---t1'------t2------------t3
where t1, t2, t3 are the times at which regular RTCP packets would be sent. The RFC suggests that if a packet is sent at time t1', the RTCP packet at time t2 is dropped and no feedback messages are allowed until t3.
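To make the quoted rule concrete, here is a minimal C++ sketch (the struct and the use of doubles for time are my own illustrative assumptions, not from the RFC) of the state update described above when an early packet goes out at t1':
// Hedged sketch of the RFC 4585 early-feedback timer update.
// tp, tn, T_rr and allow_early follow the RFC's variable names.
struct AvpfTimerState {
    double tp;          // time of the last regular RTCP transmission
    double tn;          // next scheduled RTCP transmission time
    double t_rr;        // current regular RTCP interval T_rr
    bool   allow_early; // whether an early packet may be sent now
};

// Called when an early RTCP packet is actually transmitted (at t1').
void onEarlyPacketSent(AvpfTimerState& s) {
    double previous_tn = s.tn;     // this would have been t2
    s.allow_early = false;         // no further early packets ...
    s.tn = s.tp + 2.0 * s.t_rr;    // ... until roughly t3
    s.tp = previous_tn;            // tp is set to the previous tn
}

// Called when the newly calculated tn is reached, whether the Regular
// RTCP packet is sent or suppressed because of T_rr_interval.
void onTnReached(AvpfTimerState& s) {
    s.allow_early = true;          // early packets are allowed again
    // tp and tn are then updated per the regular RTCP timing rules (not shown).
}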
However, later in the same RFC it says:
Full compound RTCP packets MUST be sent in regular intervals. These
packets MAY also contain one or more FB messages. Transmission of
Regular RTCP packets is scheduled as follows:
If T_rr_interval == 0, then the transmission MUST follow the rules
as specified in Sections 3.2 and 3.4 of this document and MUST
adhere to the adjustments of tn specified in Section 3.5.2 (i.e.,
skip one regular transmission if an Early RTCP packet transmission
has occurred). Timer reconsideration takes place when tn is
reached as per [1]. The Regular RTCP packet is transmitted after
timer reconsideration. Whenever a Regular RTCP packet is sent or
suppressed, allow_early MUST be set to TRUE and tp, tn MUST be
updated as per [1]. After the first transmission of a Regular RTCP
packet, Tmin MUST be set to 0.
This seems contradictory: that paragraph says that whenever a Regular RTCP packet is sent or suppressed, allow_early MUST be set to TRUE, so it appears that allow_early should already be TRUE at time t2.

Related

ffmpeg/ffplay/libav: how to play out a muxed RTP/RTCP stream using an SDP file?

I'm trying to play out an incoming RTP audio stream using ffplay (or, alternatively, by using my own code which uses libav). The incoming stream is muxing RTP and RTCP packets. The playout works, but two local UDP ports are used:
The port I'm requesting
The port I'm requesting + 1 (which I guess is the RTCP port)
This is the ffplay command:
ffplay -loglevel verbose -protocol_whitelist file,udp,rtp test.sdp
And the content of the SDP file:
v=0
o=- 0 0 IN IP4 192.168.51.51
s=RTP-INPUT-1
c=IN IP4 192.168.51.61
t=0 0
m=audio 8006 RTP/AVP 97
b=AS:96
a=rtpmap:97 opus/48000/1
a=rtcp-mux
Note the line a=rtcp-mux. Even with this line present, two local UDP ports are used. I would expect this to be only 1 port.
I'm looking for a way to use only one UDP port.
Here's the relevant libav C++ code (I've left out error handling etc.):
auto formatContext = avformat_alloc_context();
const AVInputFormat* format = av_find_input_format("sdp");
AVDictionary *formatOpts = nullptr;
av_dict_set(&formatOpts, "protocol_whitelist", "file,udp,rtp", 0);
int result = avformat_open_input(&formatContext, sdpFilepath, format, &formatOpts);
result = avformat_find_stream_info(formatContext, nullptr);
By default, RTP uses two ports: the RTP flow is on an even-numbered port and the RTCP control flow is on the next odd-numbered port.
Edit: https://www.rfc-editor.org/rfc/rfc8035
RFC8035 clarifies how to multiplex RTP and RTCP on a single IP address and port, referred to as RTP/RTCP multiplexing.
You are on the right track. After some digging in those RFCs, I think the best course of action is to check whether ffmpeg's RTP stack implements RFC 5761, especially Section 4:
Distinguishable RTP and RTCP Packets
When RTP and RTCP packets are multiplexed onto a single port, the
RTCP packet type field occupies the same position in the packet as
the combination of the RTP marker (M) bit and the RTP payload type
(PT). This field can be used to distinguish RTP and RTCP packets
when two restrictions are observed: 1) the RTP payload type values
used are distinct from the RTCP packet types used; and 2) for each
RTP payload type (PT), PT+128 is distinct from the RTCP packet types
used. The first constraint precludes a direct conflict between RTP
payload type and RTCP packet type; the second constraint precludes a
conflict between an RTP data packet with the marker bit set and an
RTCP packet.
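As a rough illustration of the quoted rule (a hedged sketch, not ffmpeg's actual demuxing code), a receiver on a single muxed port can look at the second byte of each packet; values in the RTCP packet-type range indicate RTCP, everything else is treated as RTP:
#include <cstdint>
#include <cstddef>

// Hedged sketch of the RFC 5761 demultiplexing heuristic. It assumes the
// constraints quoted above hold (RTP payload types and PT+128 do not collide
// with the RTCP packet types in use).
enum class PacketKind { Rtp, Rtcp, TooShort };

PacketKind classify(const uint8_t* buf, size_t len) {
    if (len < 2)
        return PacketKind::TooShort;
    // Byte 1 holds M+PT for RTP, or the packet type for RTCP.
    const uint8_t second = buf[1];
    // Assigned RTCP packet types fall in 192..223 (SR=200, RR=201, ...).
    if (second >= 192 && second <= 223)
        return PacketKind::Rtcp;
    return PacketKind::Rtp;
}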

rte_eth_tx_burst() descriptor/mbuf management guarantees vs. free thresholds

The rte_eth_tx_burst() function is documented as:
* It is the responsibility of the rte_eth_tx_burst() function to
* transparently free the memory buffers of packets previously sent.
* This feature is driven by the *tx_free_thresh* value supplied to the
* rte_eth_dev_configure() function at device configuration time.
* When the number of free TX descriptors drops below this threshold, the
* rte_eth_tx_burst() function must [attempt to] free the *rte_mbuf* buffers
* of those packets whose transmission was effectively completed.
I have a small test program where this doesn't seem to hold true (when using the ixgbe driver on a vfio X553 1GbE NIC).
So my program sets up one transmit queue like this:
uint16_t tx_ring_size = 1024-32;
rte_eth_dev_configure(port_id, 0, 1, &port_conf);
r = rte_eth_dev_adjust_nb_rx_tx_desc(port_id, &rx_ring_size, &tx_ring_size);
struct rte_eth_txconf txconf = dev_info.default_txconf;
r = rte_eth_tx_queue_setup(port_id, 0, tx_ring_size,
                           rte_eth_dev_socket_id(port_id), &txconf);
The transmit mbuf packet pool is created like this:
struct rte_mempool *pkt_pool = rte_pktmbuf_pool_create("pkt_pool", 1023, 341, 0,
        RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
That way, when sending packets, I run out of TX descriptors before I run out of packet buffers (the program generates packets with just one segment).
My expectation is that when I call rte_eth_tx_burst() in a loop (to send one packet after another), it never fails, since it transparently frees the mbufs of already-sent packets.
However, this doesn't happen.
I basically have a transmit loop like this:
for (unsigned i = 0; i < 2048; ++i) {
    struct rte_mbuf *pkt = rte_pktmbuf_alloc(args.pkt_pool);
    // error check, prepare packet etc.
    uint16_t l = rte_eth_tx_burst(args.port_id, 0, &pkt, 1);
    // error check etc.
}
After 1086 transmitted packets (of ~ 300 bytes each), rte_eth_tx_burst() returns 0.
I use the default threshold values, i.e. the queried values are (from dev_info.default_txconf):
tx thresh : 32
tx rs thresh: 32
wthresh : 0
So the main question now is: How hard is rte_eth_tx_burst() supposed to try to free mbuf buffers (and thus descriptors)?
I mean, it could busy loop until the transmission of previously supplied mbufs is completed.
Or it could just quickly check if some descriptors are free again. But if not, just give up.
Related question: Are the default threshold values appropriate for this use case?
So I work around this as follows:
for (;;) {
    uint16_t l = rte_eth_tx_burst(args.port_id, 0, &pkt, 1);
    if (l == 1) {
        break;
    } else {
        RTE_LOG(ERR, USER1, "cannot send packet\n");
        int r = rte_eth_tx_done_cleanup(args.port_id, 0, 256);
        if (r < 0) {
            rte_panic("%u. cannot cleanup tx descs: %s\n", i, rte_strerror(-r));
        }
        RTE_LOG(WARNING, USER1, "%u. cleaned up %d descriptors ...\n", i, r);
    }
}
With that I get output like this:
USER1: cannot send packet
USER1: 1086. cleaned up 32 descriptors ...
USER1: cannot send packet
USER1: 1118. cleaned up 32 descriptors ...
USER1: cannot send packet
USER1: 1150. cleaned up 0 descriptors ...
USER1: cannot send packet
USER1: 1182. cleaned up 0 descriptors ...
[..]
USER1: cannot send packet
USER1: 1950. cleaned up 32 descriptors ...
USER1: cannot send packet
USER1: 1982. cleaned up 0 descriptors ...
USER1: cannot send packet
USER1: 2014. cleaned up 0 descriptors ...
USER1: cannot send packet
USER1: 2014. cleaned up 32 descriptors ...
USER1: cannot send packet
USER1: 2046. cleaned up 32 descriptors ...
Meaning that it frees at most 32 descriptors at a time, and that it doesn't always succeed; but then the next rte_eth_tx_burst() call manages to free some.
Side question: Is there a better, more DPDK-idiomatic way to handle the recycling of mbufs?
When I change the code such that I run out of mbufs before I run out of transmit descriptors (i.e. tx ring created with 1024 descriptors, mbuf pool still has 1023 elements), I have to change the alloc part like this:
struct rte_mbuf *pkt;
do {
    pkt = rte_pktmbuf_alloc(args.pkt_pool);
    if (!pkt) {
        r = rte_eth_tx_done_cleanup(args.port_id, 0, 256);
        if (r < 0) {
            rte_panic("%u. cannot cleanup tx descs: %s\n", i, rte_strerror(-r));
        }
        RTE_LOG(WARNING, USER1, "%u. cleaned up %d descriptors ...\n", i, r);
    }
} while (!pkt);
The output is similar, e.g.:
USER1: 1023. cleaned up 95 descriptors ...
USER1: 1118. cleaned up 32 descriptors ...
USER1: 1150. cleaned up 32 descriptors ...
USER1: 1182. cleaned up 32 descriptors ...
USER1: 1214. cleaned up 0 descriptors ...
USER1: 1214. cleaned up 0 descriptors ...
USER1: 1214. cleaned up 32 descriptors ...
[..]
That means the freeing of descriptors/mbufs is so 'slow' that it has to busy loop up to 3 times.
Again, is this a valid approach, or are there better dpdk ways to solve this?
Since rte_eth_tx_done_cleanup() might return -ENOTSUP, this may be a hint that relying on it is not the best solution.
Incidentally, even with the ixgbe driver it fails for me when I disable checksum offloads!
Apparently, ixgbe_dev_tx_done_cleanup() then invokes ixgbe_tx_done_cleanup_vec() instead of ixgbe_tx_done_cleanup_full(), and the former unconditionally returns -ENOTSUP:
static int
ixgbe_tx_done_cleanup_vec(struct ixgbe_tx_queue *txq __rte_unused,
                          uint32_t free_cnt __rte_unused)
{
    return -ENOTSUP;
}
Does this make sense?
So perhaps the better strategy is to make sure that there are fewer descriptors than pool elements (e.g. 1024-32 < 1023) and just re-call rte_eth_tx_burst() until it returns one?
That means a loop like this:
for (;;) {
    uint16_t l = rte_eth_tx_burst(args.port_id, 0, &pkt, 1);
    if (l == 1) {
        break;
    } else {
        RTE_LOG(ERR, USER1, "%u. cannot send packet - retry\n", i);
    }
}
This works, and the output shows again that the descriptors are freed 32 at a time, e.g.:
USER1: 1951. cannot send packet - retry
USER1: 1951. cannot send packet - retry
USER1: 1983. cannot send packet - retry
USER1: 1983. cannot send packet - retry
USER1: 2015. cannot send packet - retry
USER1: 2015. cannot send packet - retry
USER1: 2047. cannot send packet - retry
USER1: 2047. cannot send packet - retry
I know that I can also use rte_eth_tx_burst() to submit bigger bursts. But I want to get the simple/edge cases right and understand the DPDK semantics first.
I'm on Fedora 33 and DPDK 20.11.2.
Recommendation/Solution: after confirming with either rte_mempool_list_dump() or dpdk-procinfo that the cause of the issue is indeed the TX descriptors, use rte_eth_tx_buffer_flush() or change the settings for the TX thresholds.
Explanation:
The mbuf-free behaviour varies across PMDs, and even within the same NIC it differs between PF and VF. The following points help to understand this properly:
An rte_mempool can be created with or without cached elements.
When created with cached elements, the configured mbufs are added to the per-core cache, depending on the available lcores (EAL options) and the per-core cache-size parameter.
When the HW offload DEV_TX_OFFLOAD_MBUF_FAST_FREE is available and enabled, the agreement is that every mbuf has a ref_cnt of 1.
So whenever tx_burst is invoked (whether it succeeds or fails), the threshold levels are checked to see whether free mbufs/mbuf segments can be pushed back to the pool.
With DEV_TX_OFFLOAD_MBUF_FAST_FREE enabled, the driver blindly puts the elements into the lcore cache (a hedged sketch of enabling this offload follows these points).
Without DEV_TX_OFFLOAD_MBUF_FAST_FREE, the generic approach is used: each mbuf is validated (nb_segs and ref_cnt are checked) and then pushed to the mempool.
But in either case, either a fixed number of mbufs (32, I believe, is the default for all PMDs) or whatever free mbufs are available is pushed to the cache or pool.
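As a hedged illustration (not taken from the original post), this is roughly how the DEV_TX_OFFLOAD_MBUF_FAST_FREE offload can be enabled on DPDK 20.11 when the port reports support for it; port_id and port_conf are assumed from the earlier snippets:
#include <rte_ethdev.h>

/* Hedged sketch: enable MBUF_FAST_FREE only if the port supports it. */
struct rte_eth_dev_info dev_info;
rte_eth_dev_info_get(port_id, &dev_info);

if (dev_info.tx_offload_capa & DEV_TX_OFFLOAD_MBUF_FAST_FREE) {
    /* Only valid if all TX mbufs come from one mempool and have ref_cnt == 1. */
    port_conf.txmode.offloads |= DEV_TX_OFFLOAD_MBUF_FAST_FREE;
}
rte_eth_dev_configure(port_id, 0, 1, &port_conf);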
Facts:
In the case of the ixgbe VF driver, the option DEV_TX_OFFLOAD_MBUF_FAST_FREE is not available. That means each time the thresholds are met, every individual mbuf is checked and pushed to the mempool.
As per the code snippet, rte_eth_dev_configure() is set up for TX only, and rte_pktmbuf_pool_create() creates the pool with 341 cache elements.
The assumption has to be made that there is only one lcore (which runs the alloc-and-tx loop).
Code Snippet-1:
for (unsigned i = 0; i < 2048; ++i) {
    struct rte_mbuf *pkt = rte_pktmbuf_alloc(args.pkt_pool);
    // error check, prepare packet etc.
    uint16_t l = rte_eth_tx_burst(args.port_id, 0, &pkt, 1);
    // error check etc.
}
After 1086 transmitted packets (of ~ 300 bytes each), rte_eth_tx_burst() returns 0.
[Observation] If the mbufs were indeed running out, rte_pktmbuf_alloc() should fail before rte_eth_tx_burst(). But failing at 1086 is an interesting phenomenon, because only 1023 mbufs were created and the failure happens after roughly two iterations of releasing 32 mbufs back to the mempool. Analyzing the ixgbe driver code, the only place in tx_xmit_pkts that returns 0 is:
/* Only use descriptors that are available */
nb_pkts = (uint16_t)RTE_MIN(txq->nb_tx_free, nb_pkts);
if (unlikely(nb_pkts == 0))
    return 0;
Even though tx_ring_size is set to 992 in the config, internally rte_eth_dev_adjust_nb_rx_tx_desc() raises it to the maximum of *nb_desc and desc_lim->nb_min. Based on the code, the failure is not because there are no free mbufs, but because TX descriptors are low or not available.
In all other cases, rte_eth_tx_done_cleanup() or rte_eth_tx_buffer_flush() actually pushes any pending descriptors out of the SW PMD to be DMAed immediately. This internally frees up more descriptors, which makes tx_burst much smoother.
To identify the root cause, whenever the DPDK tx_burst API returns 0, either invoke rte_mempool_list_dump() or make use of the mempool dump via dpdk-procinfo.
Note: most PMDs amortize the cost of descriptor (PCIe payload) writes by batching at least 4 packets at a time (in the SSE case). Hence, even if tx_burst returns 1 for a single packet, that packet will not necessarily be pushed out of the NIC yet. To make sure it is, use rte_eth_tx_buffer_flush() (a hedged sketch of that API follows).
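A hedged sketch of the buffered-TX helper mentioned above (rte_eth_tx_buffer_init / rte_eth_tx_buffer / rte_eth_tx_buffer_flush); the buffer size and the lack of a drop callback are illustrative assumptions, not taken from the original program:
#include <rte_ethdev.h>
#include <rte_malloc.h>

#define BURST_SIZE 32

/* Allocate and initialize a TX buffer that batches up to BURST_SIZE packets. */
struct rte_eth_dev_tx_buffer *tx_buffer =
    rte_zmalloc_socket("tx_buffer", RTE_ETH_TX_BUFFER_SIZE(BURST_SIZE), 0,
                       rte_eth_dev_socket_id(port_id));
rte_eth_tx_buffer_init(tx_buffer, BURST_SIZE);

/* Queue packets; the helper calls rte_eth_tx_burst() once the buffer fills. */
uint16_t queued = rte_eth_tx_buffer(port_id, 0, tx_buffer, pkt);

/* Periodically (or at the end) push out whatever is still buffered. */
uint16_t flushed = rte_eth_tx_buffer_flush(port_id, 0, tx_buffer);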
Say, you invoke rte_eth_tx_burst() to send one small packet (single mbuf, no offloads). Suppose, the driver indeed pushes the packet to the HW. Doing so eats up one descriptor in the ring: the driver "remembers" that this packet mbuf is associated with that descriptor. But the packet is not sent instantly. The HW typically has some means to notify the driver of completions. Just imagine: if the driver checked for completions on every rte_eth_tx_burst() invocation (thus ignoring any thresholds), then calling rte_eth_tx_burst() one more time in a tight loop manner for another packet would likely consume one more descriptor rather than recycle the first one. So, given this fact, I'd not use tight loop when investigating tx_free_thresh semantics. And it shouldn't matter whether you invoke rte_eth_tx_burst() once per a packet or once per a batch of them.
Now. Say, you have a Tx ring of size N. Suppose, tx_free_thresh is M. And you have a mempool of size Z. What you do is allocate a burst of N - M - 1 small packets and invoke rte_eth_tx_burst() to send this burst (no offloads; each packet is assumed to eat up one Tx descriptor). Then you wait for some wittingly sufficient (for completions) amount of time and check the number of free objects in the mempool. This figure should read Z - (N - M - 1). Then you allocate and send one extra packet. Then wait again. This time, the number of spare objects in the mempool should read Z - (N - M). Finally, you allocate and send one more packet (again!) thus crossing the threshold (the number of spare Tx descriptors becomes less than M). During this invocation of rte_eth_tx_burst(), the driver should detect crossing the threshold and start checking for completions. This should make the driver free (N - M) descriptors (consumed by two previous rte_eth_tx_burst() invocations) thus clearing up the whole ring. Then the driver proceeds to push the new packet in question to the HW thus spending one descriptor. You then check the mempool: this should report Z - 1 free objects.
So, the short of it: no loop, just three rte_eth_tx_burst() invocations with sufficient waiting time between them. And you check the spare object count in the mempool after each send operation. Theoretically, this way, you'll be able to understand the corner case semantics. That's the gist of it. However, please keep in mind that the actual behaviour may vary across different vendors / PMDs.
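A hedged sketch of the experiment described above; N, M, Z, the wait time and the send_burst() helper (which would allocate and transmit that many one-segment packets) are illustrative placeholders, while rte_delay_ms() and rte_mempool_avail_count() are the real DPDK calls used to wait and to read the spare-object count:
#include <stdio.h>
#include <rte_cycles.h>
#include <rte_mempool.h>

/* Hedged sketch of the three-step tx_free_thresh experiment described above.
 * N (ring size), M (tx_free_thresh), Z (mempool size) and send_burst() are
 * assumptions, not code from the original post. */
const unsigned N = 1024, M = 32, Z = 2047;

send_burst(port_id, N - M - 1);              /* step 1: fill most of the ring */
rte_delay_ms(100);                           /* wait for HW completions */
printf("free objs: %u (expect %u)\n",
       rte_mempool_avail_count(pkt_pool), Z - (N - M - 1));

send_burst(port_id, 1);                      /* step 2: one more packet */
rte_delay_ms(100);
printf("free objs: %u (expect %u)\n",
       rte_mempool_avail_count(pkt_pool), Z - (N - M));

send_burst(port_id, 1);                      /* step 3: crosses the threshold */
rte_delay_ms(100);                           /* driver should reap completions */
printf("free objs: %u (expect %u)\n",
       rte_mempool_avail_count(pkt_pool), Z - 1);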
Relying on rte_eth_tx_done_cleanup() really isn't an option since many PMDs don't implement it. Mostly the Intel PMDs provide it, but e.g. the SFC, MLX* and af_packet ones don't.
However, it's still unclear why the ixgbe PMD doesn't support cleanup when no offloads are enabled.
The requirements on rte_eth_tx_burst() with respect to freeing are really light - from the API docs:
* It is the responsibility of the rte_eth_tx_burst() function to
* transparently free the memory buffers of packets previously sent.
* This feature is driven by the *tx_free_thresh* value supplied to the
* rte_eth_dev_configure() function at device configuration time.
* When the number of free TX descriptors drops below this threshold, the
* rte_eth_tx_burst() function must [attempt to] free the *rte_mbuf* buffers
* of those packets whose transmission was effectively completed.
[..]
* @return
* The number of output packets actually stored in transmit descriptors of
* the transmit ring. The return value can be less than the value of the
* *tx_pkts* parameter when the transmit ring is full or has been filled up.
So just attempting to free (but not waiting on the results of that attempt) and returning 0 (since 0 is less than tx_pkts) is covered by that 'contract'.
FWIW, no example distributed with DPDK loops around rte_eth_tx_burst() to re-submit not-yet-sent packets. There are some examples that use rte_eth_tx_burst() and discard unsent packets, though.
AFAICS, besides rte_eth_tx_done_cleanup() and rte_eth_tx_burst() there is no other function for requesting the release of mbufs previously submitted for transmission.
Thus, it's advisable to size the mbuf packet pool larger than the configured ring size in order to survive situations where all mbufs are in flight and can't be recovered because there is no mbuf left for calling rte_eth_tx_burst() again.
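For example, a hedged configuration sketch (the numbers are illustrative, not a recommendation from the post): keep the pool strictly larger than the TX ring so an allocation can still succeed while the whole ring is in flight:
/* Hedged sketch: mempool strictly larger than the TX ring (values illustrative). */
uint16_t tx_ring_size = 1024;
/* 2047 pool elements vs. 1024 descriptors: even with the entire ring in flight,
 * rte_pktmbuf_alloc() can still succeed and a later rte_eth_tx_burst() call can
 * trigger the threshold-based cleanup. */
struct rte_mempool *pkt_pool = rte_pktmbuf_pool_create("pkt_pool", 2047, 341, 0,
        RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());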

How to tell if SSL_read has received and processed all the records from single message

Here is the dilemma:
SSL_read, on success, returns the number of bytes read. SSL_pending is used to tell whether the processed record has more data still to be read, which would mean the buffer provided was not large enough to contain the record.
SSL_read may return n > 0, but what if this happens after only the first record has been processed and the message effectively spans multiple records?
Question: I am using epoll to send/receive messages, which means I have to queue up an event in case I expect more data. What check will ensure that all the records have been read from a single message and that it's time to remove this event and queue up a response event that will write the response back to the client?
PS: This code hasn't been tested, so it may be incorrect. The purpose of the code is to share the idea I am trying to implement.
Here is the code snippet for the read:
//read whatever is available.
while (1)
{
    auto n = SSL_read(ssl_, ptr_ + tail_, sz_ - tail_);
    if (n <= 0)
    {
        int ssle = SSL_get_error(ssl_, n);
        auto old_ev = evt_.events;
        if (ssle == SSL_ERROR_WANT_READ)
        {
            //need more data to process, wait for epoll notification again
            evt_.events = EPOLLIN | EPOLLERR;
        }
        else if (ssle == SSL_ERROR_WANT_WRITE)
        {
            evt_.events = EPOLLOUT | EPOLLERR;
        }
        else
        {
            /* connection closed by peer, or
               some irrecoverable error */
            done_ = true;
            tail_ = 0; //invalidate the data
            break;
        }
        if (old_ev != evt_.events)
        {
            if (epoll_ctl(epoll_fd_, EPOLL_CTL_MOD, socket_fd_, &evt_) < 0)
            {
                perror("handshake failed at EPOLL_CTL_MOD");
                SSL_free(ssl_);
                ssl_ = nullptr;
                return false;
            }
        }
        break; //wait for the next epoll notification before reading again
    }
    else //some data has been read
    {
        tail_ += n; //accumulate what has been read so far
        if (SSL_pending(ssl_) > 0)
            //buffer wasn't enough to hold the content. resize and reread
            resize();
        else
            break;
    }
}
SSL_read() returns the number of decrypted bytes returned in the caller's buffer, not the number of bytes received on the connection. This mimics the return value of recv() and read().
SSL_pending() returns the number of decrypted bytes that are still in the SSL's buffer and haven't been read by the caller yet. This would be equivalent to calling ioctl(FIONREAD) on a socket.
There is no way to know how many SSL/TLS records constitute an "application message", that is for the decrypted protocol data to dictate. The protocol needs to specify where a message ends and a new message begins. For instance, by including the message length in the message data. Or delimiting messages with terminators.
Either way, the SSL/TLS layer has no concept of "messages", only an arbitrary stream of bytes that it encrypts and decrypts as needed, and transmits in "records" of its choosing. Similar to how TCP breaks up a stream of arbitrary bytes into IP frames, etc.
So, while your loop is reading arbitrary bytes from OpenSSL, it needs to process those bytes to detect separations between protocol messages, so it can then act accordingly per message.
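For example, here is a hedged sketch (not the poster's protocol) of length-prefixed framing on top of SSL_read(): the application accumulates decrypted bytes and only treats a message as complete once the declared length has arrived; the 4-byte prefix and the on_message callback are assumptions of mine:
#include <openssl/ssl.h>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hedged sketch: 4-byte big-endian length-prefix framing over SSL_read().
// Returns true if the caller should wait for the next epoll event,
// false on a fatal SSL error.
bool read_messages(SSL* ssl, std::vector<uint8_t>& acc,
                   void (*on_message)(const uint8_t*, size_t)) {
    uint8_t buf[4096];
    for (;;) {
        int n = SSL_read(ssl, buf, sizeof(buf));
        if (n <= 0) {
            int err = SSL_get_error(ssl, n);
            // WANT_READ/WANT_WRITE: return and wait for the next epoll event.
            return err == SSL_ERROR_WANT_READ || err == SSL_ERROR_WANT_WRITE;
        }
        acc.insert(acc.end(), buf, buf + n);
        // Deliver every complete message currently sitting in the accumulator.
        while (acc.size() >= 4) {
            const uint32_t len = (uint32_t(acc[0]) << 24) | (uint32_t(acc[1]) << 16) |
                                 (uint32_t(acc[2]) << 8)  |  uint32_t(acc[3]);
            if (acc.size() < 4 + size_t(len))
                break;                            // message not complete yet
            on_message(acc.data() + 4, len);      // hand one whole message to the app
            acc.erase(acc.begin(), acc.begin() + 4 + len);
        }
    }
}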
What check will ensure that all the records have been read from single message and it's time to remove this event and queue up an response event that will write the response back to client?
I'd have hoped that your message has a header with the number of records in it. Otherwise the protocol you've got is probably unparseable.
What you'd need is to have a stateful parser that consumes all the available bytes and outputs records once they are complete. Such a parser needs to suspend its state once it reaches the last byte of decrypted input, and then must be called again when more data is available to be read. But in all cases if you can't predict ahead of time how much data is expected, you won't be able to tell when the message is finished - that is unless you're using a self-synchronizing protocol. Something like ATM headers would be a starting point. But such complication is unnecessary when all you need is just to properly delimit your data so that the packet parser can know exactly whether it's got all it needs or not.
That's the problem with sending messages: it's very easy to send stuff that can't be decoded by the receiver, since the sender is perfectly fine with losing data - it just doesn't care. But the receiver will certainly need to know how many bytes or records are expected - somehow. It can be told this a-priori by sending headers that include byte counts or fixed-size record counts (it's the same size information just in different units), or a posteriori by using unique record delimiters. For example, when sending printable text split into lines, such delimiters can be Unicode paragraph separators (U+2029).
It's very important to ensure that the record delimiters can't occur within the record data itself. Thus you need some sort of a "stuffing" mechanism, where if a delimiter sequence appears in the payload, you can alter it so that it's not a valid delimiter anymore. You also need an "unstuffing" mechanism so that such altered delimiter sequences can be detected and converted back to their original form, of course without being interpreted as a delimiter. A very simple example of such delimiting process is the octet-stuffed framing in the PPP protocol. It is a form of HDLC framing. The record separator is 0x7E. Whenever this byte is detected in the payload, it is escaped - replaced by a 0x7D 0x5E sequence. On the receiving end, the 0x7D is interpreted to mean "the following character has been XOR'd with 0x20". Thus, the receiver converts 0x7D 0x5E to 0x5E first (it removes the escape byte), and then XORs it with 0x20, yielding the original 0x7E. Such framing is easy to implement but potentially has more overhead than framing with a longer delimiter sequence, or even a dynamic delimiter sequence whose form differs for each position within the stream. This could be used to prevent denial-of-service attacks, when the attacker may maliciously provide a payload that will incur a large escaping overhead. The dynamic delimiter sequence - especially if unpredictable, e.g. by negotiating a new sequence for every connection - prevents such service degradation.
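As a concrete illustration of the PPP-style octet stuffing just described (a hedged sketch, not production framing code): 0x7E delimits frames, and any 0x7E or 0x7D inside the payload is escaped as 0x7D followed by the byte XORed with 0x20:
#include <cstdint>
#include <vector>

// Hedged sketch of PPP/HDLC-style octet stuffing (delimiter 0x7E, escape 0x7D).
constexpr uint8_t kFlag = 0x7E, kEsc = 0x7D;

std::vector<uint8_t> stuff(const std::vector<uint8_t>& payload) {
    std::vector<uint8_t> out;
    for (uint8_t b : payload) {
        if (b == kFlag || b == kEsc) {
            out.push_back(kEsc);
            out.push_back(b ^ 0x20);   // 0x7E -> 0x7D 0x5E, 0x7D -> 0x7D 0x5D
        } else {
            out.push_back(b);
        }
    }
    out.push_back(kFlag);              // frame delimiter
    return out;
}

std::vector<uint8_t> unstuff(const std::vector<uint8_t>& frame) {
    std::vector<uint8_t> out;
    bool escaped = false;
    for (uint8_t b : frame) {
        if (b == kFlag) break;         // end of frame
        if (escaped)        { out.push_back(b ^ 0x20); escaped = false; }
        else if (b == kEsc) { escaped = true; }
        else                { out.push_back(b); }
    }
    return out;
}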

DPDK rte_eth_tx_burst() reliability

According to the DPDK documentation, the rte_eth_tx_burst() function takes a batch of packets, and returns the number of packets that have been actually stored in transmit descriptors of the transmit ring.
Assuming that the packets are sent exactly in the same order as they are inserted in the tx_pkts array parameter, it is possible to call the function iteratively until all the packets are sent. Here is sample code taken from one of the examples:
sent = 0;
do {
    n_pkts = rte_eth_tx_burst(portid, 0, &tx_pkts_burst[sent], n_mbufs - sent);
    sent += n_pkts;
} while (sent < n_mbufs);
However, using the above code, I sometimes see that the number of packets the function reports as sent is higher than the number of packets actually sent.
I am accumulating the return value of rte_eth_tx_burst() in a variable and, at the end of the job, the value of the accumulator is greater than the value of opackets in the device eth_stats.
I see the same number of transmitted packets in eth_stats, eth_xstats and on the other side of the cable, and this number is less than the sum of the values returned by rte_eth_tx_burst().
So, my question is: in what case does rte_eth_tx_burst() return a value that does not correspond to the real number of transmitted packets?
According to the documentation, the function returns only the number of packets that have been successfully inserted in the ring, so I assumed the return value was reliable.
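For reference, a hedged sketch (variable names are mine) of how the accumulated return values can be compared against the device counters at the end of the run:
#include <inttypes.h>
#include <stdio.h>
#include <rte_ethdev.h>

/* Hedged sketch: compare the sum of rte_eth_tx_burst() return values with
 * the opackets/oerrors counters reported by the device. */
uint64_t burst_acc = 0;   /* incremented by each rte_eth_tx_burst() return value */

/* ... send loop runs here ... */

struct rte_eth_stats stats;
rte_eth_stats_get(portid, &stats);
printf("accumulated tx_burst returns: %" PRIu64 ", opackets: %" PRIu64
       ", oerrors: %" PRIu64 "\n", burst_acc, stats.opackets, stats.oerrors);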
My testbed:
NIC: Intel 82599ES
DPDK driver: igb_uio
DPDK version: 18.05
Traffic: UDP packets, sized 174B, with IP and UDP checksum offload
Edit 1
My test is the following:
the sender sends 32 messages with different IDs, then for each ACK received a new message with the same ID as the ACKed packet is sent again. The test ends when every ID has been sent and ACKed N times (N=36864).
As described above, at some point one packet is not sent, so all the IDs complete the cycle, except one. This is what I see as output:
ID - #sent
0 - 36864
1 - 36864
2 - 36864
3 - 36864
4 - 8151
5 - 36864
6 - 36864
7 - 36864
....
At the end of the test, the accumulator variable with the number of packets sent is greater than the stats and the difference is 1. So, it looks like the rte_eth_tx_burst function failed to send that one packet that is not acknowledged.
Edit 2
It can be relevant that the value "n_mbufs" is not necessarily constant, since the packets are read as a burst from a ring.

RTP timestamp not linear?

I was trying to reconstruct an audio conversation (an A-B call using G.711 audio) using the RTP timestamp. I used to fill silence using the difference between two RTP timestamps and the sampling rate. The conversation went out of sync, and then I saw that the RTP timestamp is not linear. I was not able to get the exact clock time from the RTP timestamp, which resulted in sync issues. How do I calculate the exact time?
I have the same problem with a stream provided by GStreamer, which doesn't provide monotonic timestamps.
For example: the difference between the timestamps should be exactly 1920, but it ranges between ~120 and ~3500, averaging 1920.
The problem here is that there is no way to find missing samples, because you never know whether a large difference comes from encoder delay or from a missing sample.
If you have only audio to decode, I would try to assign "valid" PTS values to each sample (in my case basetime+1920, basetime+3840, and so on).
The big problem comes when video AND audio are combined. Here this trick doesn't work well when samples are missing, and there is no way to find out when that is the case :(
When you want to send RTP, you should take note of two things:
the timestamp is incremented according to the amount of data sent.
E.g., for PT=10 you may have this pattern:
1160 bytes, timestamp increment: 1154, and wait 26 ms.
Let's see how this calculation happens:
number of packets to be sent in one second: 1/(26 ms) ≈ 38
timestamp increment: clock rate / 38 ≈ 1154
According to RFC 3550 (https://www.ietf.org/rfc/rfc3550.txt):
The sampling instant MUST be derived from a clock that increments
monotonically
It's not a choice, nor an option. By the way, please read the full description of the timestamp field of the RTP packet; there I also found this:
As an example, for fixed-rate audio
the timestamp clock would likely increment by one for each
sampling period. If an audio application reads blocks covering
160 sampling periods from the input device, the timestamp would be
increased by 160 for each such block, regardless of whether the
block is transmitted in a packet or dropped as silent.
If you want to check linearity, use the RTP and NTP timestamp fields of the RTCP SR. In the SR report, the RTP timestamp corresponds to the NTP timestamp.
So take the differences of consecutive RTP timestamps (let's call them dRTP_1, dRTP_2, ...) and the differences of the corresponding consecutive NTP timestamps (dNTP_1, dNTP_2, ...), then divide each dRTP_i by the clock rate and check whether you get dNTP_i (a hedged sketch of this check follows below).
But first please read the RFC.
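A hedged sketch of that check (extracting the SR fields is assumed to happen elsewhere; the tolerance and the G.711 clock rate are illustrative):
#include <cstdint>
#include <cmath>

// Hedged sketch: verify RTP-timestamp linearity against the NTP timestamps
// carried in two consecutive RTCP Sender Reports. Parsing the SR packets is
// assumed to be done elsewhere; the arguments are the extracted fields.
bool rtp_clock_is_linear(uint32_t rtp_ts1, uint32_t rtp_ts2,
                         double ntp_sec1, double ntp_sec2,
                         double clock_rate /* e.g. 8000.0 for G.711 */) {
    // Unsigned subtraction handles RTP timestamp wrap-around.
    double d_rtp = static_cast<double>(uint32_t(rtp_ts2 - rtp_ts1));
    double d_ntp = ntp_sec2 - ntp_sec1;
    // dNTP (seconds) should equal dRTP / clock_rate, within a small tolerance.
    return std::fabs(d_rtp / clock_rate - d_ntp) < 0.01;
}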