Why can DPDK only send and receive packets of up to 60 bytes? - dpdk

I have written a simple DPDK send-and-receive application. When the packet length is <= 60 bytes, the send and receive applications work, but when the packet length is > 60 bytes, the send application reports that it has sent the packet, yet the receive application receives nothing.
In send application:
mbuf_pool = rte_pktmbuf_pool_create("MBUF_POOL", NUM_MBUFS,
    MBUF_CACHE_SIZE, 0, RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
pkt = rte_pktmbuf_alloc(mbuf_pool);
pkt->data_len = packlen; // if packlen <= 60 it works, but when packlen > 60 the receiver gets nothing
I tried both l2fwd and basicfwd as the receive application, with the same result.

The issue is here:
pchar[12] = 0;
pchar[13] = 0;
This means the EtherType is 0. From the list of assigned EtherTypes:
https://www.iana.org/assignments/ieee-802-numbers/ieee-802-numbers.xhtml
we see that values up to 1500 in this field are interpreted as an IEEE 802.3 frame length rather than an EtherType, so 0 declares a payload length of zero. Since the minimum Ethernet frame length is 64 bytes (60 + 4 bytes of FCS), that is why you have trouble sending packets longer than 60 bytes.
To fix the issue, simply put a reasonable EtherType from the list above into those two bytes.

Related

Sometimes the Disconnect Req is inside the Publish Message

On the client side I use:
mosquitto_pub -t tpc -m msg
On the server side I use nonblocking socket and socket() API:
https://www.ibm.com/support/knowledgecenter/ssw_ibm_i_72/rzab6/xnonblock.htm
After the first received packet I send a connect acknowledgement packet.
For each received packet I print how many bytes were received, and the whole buffer in hex.
I compare the received data against a Wireshark capture.
Sometimes it works well:
37 bytes received - Connect Command
10 bytes received - Publish Message [tpc]
2 bytes received - Disconnect Req
Sometimes I get the Disconnect Req inside the Publish Message [tpc]:
37 bytes received - Connect Command
12 bytes received - Publish Message [tpc] + Disconnect Req
The last two values printed are the Disconnect Req:
30 8 0 3 74 70 63 6d 73 67 ffffffe0 0
(the final ffffffe0 and 0; ffffffe0 is the byte 0xe0 printed through a sign-extended signed char)
How can I avoid these situations and always get 3 separate packets?
Short answer: you can't. You have to actually parse the messages to determine the length.
The constant used to create a TCP socket is called SOCK_STREAM for a reason: the socket has to be treated as exactly that, a stream of bytes. Nobody guarantees that one send() on one side results in one recv() on the other side. The only guarantee is that the byte sequence is preserved: abcd may become (ab, cd), but it will not become acbd.
The data may be split anywhere along the way. So the client may send 2048 bytes, but on the server side you may first receive ~1400 bytes and then the rest. N sends do not result in N recvs.
Another thing is that the client also treats the socket as a stream. It may send byte by byte, or batch several messages into one send(). N messages are not N sends either.

DPDK rte_eth_tx_burst() reliability

According to the DPDK documentation, the rte_eth_tx_burst() function takes a batch of packets and returns the number of packets that have actually been stored in the transmit descriptors of the transmit ring.
Assuming that the packets are sent in exactly the same order as they appear in the tx_pkts array parameter, it is possible to call the function iteratively until all the packets are sent. Here is sample code taken from one of the examples:
sent = 0;
do {
    n_pkts = rte_eth_tx_burst(portid, 0, &tx_pkts_burst[sent], n_mbufs - sent);
    sent += n_pkts;
} while (sent < n_mbufs);
However, using the above code, I see that sometimes the number of packets the function reports as sent is not the number really sent.
I am accumulating the return value of rte_eth_tx_burst() in a variable and, at the end of the job, the value of the accumulator is greater than the value of opackets in the device's eth_stats.
I see the same number of transmitted packets in eth_stats, in eth_xstats, and on the other side of the cable, and this number is less than the sum of the values returned by rte_eth_tx_burst().
So my question is: in what case does rte_eth_tx_burst() return a value that does not correspond to the real number of transmitted packets?
According to the documentation, the function returns only the number of packets that have been successfully inserted into the ring, so I assumed the return value was reliable.
My testbed:
NIC: Intel 82599ES
DPDK driver: igb_uio
DPDK version: 18.05
Traffic: UDP packets of 174 bytes, with IP and UDP checksum offload
Edit 1
My test is the following:
the sender sends 32 messages with different IDs; then, for each ACK received, a new message with the same ID as the acked packet is sent again. The test ends when every ID has been sent and acked N times (N = 36864).
As described above, at some point one packet is not sent, so all the IDs complete the cycle except one. This is what I see as output:
ID - #sent
0 - 36864
1 - 36864
2 - 36864
3 - 36864
4 - 8151
5 - 36864
6 - 36864
7 - 36864
....
At the end of the test, the accumulator variable with the number of packets sent is greater than the stats, and the difference is 1. So it looks like rte_eth_tx_burst() failed to send the one packet that was never acknowledged.
Edit 2
It may be relevant that the value of n_mbufs is not necessarily constant, since the packets are read in bursts from a ring.

libusb_interrupt_transfer LIBUSB_ERROR_TIMEOUT

I have a general design question; the final software will eventually run on Linux and Windows.
I am trying to read 8 bytes on an endpoint with libusb_interrupt_transfer, and LIBUSB_ERROR_TIMEOUT occurs in the middle of the data being received. Will the data be broken up? The docs warn against specifying anything other than the endpoint's actual data size for the 'length' parameter, since it can lead to a buffer overrun. The docs also say that if a timeout occurs you should check the 'transferred' variable, because not all of the data may have been received.
Those two things being true, how am I supposed to deal with partially received data? If LIBUSB_ERROR_TIMEOUT occurs and my packet is only 8 bytes, will all 8 bytes always be received? Am I supposed to always supply an 8-byte buffer, even if I am only requesting the next 2 bytes to complete a previously timed-out read? And if I do supply that 8-byte buffer but only request 2 bytes, is it possible that I end up with 6 bytes of the next incoming data packet, even though I only requested 2 bytes? Any info is greatly appreciated.
http://libusb.sourceforge.net/api-1.0/group__syncio.html#gac412bda21b7ecf57e4c76877d78e6486
The libusb docs state: "Also check transferred when dealing with a timeout error code. libusb may have to split your transfer into a number of chunks to satisfy underlying O/S requirements"
http://libusb.sourceforge.net/api-1.0/packetoverflow.html
The docs state: "When requesting data on a bulk endpoint, libusb requires you to supply a buffer and the maximum number of bytes of data that libusb can put in that buffer. However, the size of the buffer is not communicated to the device - the device is just asked to send any amount of data."
Then it also states: "Overflows can only happen if the final packet in an incoming data transfer is smaller than the actual packet that the device wants to transfer. Therefore, you will never see an overflow if your transfer buffer size is a multiple of the endpoint's packet size: the final packet will either fill up completely or will be only partially filled."
unsigned char data[8];
int timeout = 250; //timeout in milliseconds
int xmtcnt = 0;
int rcvcnt = 0;
//EP OUT (Send data to USB Device)
//0x02 = direction OUT (0x00) + endpoint number 2
r = libusb_interrupt_transfer(devh,0x02, data, sizeof(data), &xmtcnt, timeout);
if(r != 0 || xmtcnt != 8){printf("XMT libusb_interrupt_transfer error %d\n",r); goto out_release;}
//EP IN (Recv data from USB device)
//0x81 = direction IN (0x80) + endpoint number 1
//-----IS IT POSSIBLE TO RECEIVE LESS THAN 8 BYTES IF WE TIMEOUT?----
r = libusb_interrupt_transfer(devh,0x81, data, sizeof(data), &rcvcnt, timeout);
if(r != 0 || rcvcnt != 8){printf("RCV libusb_interrupt_transfer error %d\n",r); goto out_release;}
//show data received
CONSOLE("data: %d %d %d %d %d %d %d %d xmt:%d rcv:%d\n",data[0],data[1],data[2],data[3],data[4],data[5],data[6],data[7],xmtcnt,rcvcnt);

Why does ioctlsocket or recv take so much time to execute in Windows socket programming?

In socket programming, some data is sent to the server, and as soon as the server receives it, it sends an acknowledgement response message. The response is more than 1 byte, so I check that more than one byte is available before receiving, and here I lose around 120-200 ms. This is a very big issue, as the client needs to send an ack back for this acknowledgement. I have sniffed the traffic and can see the data arriving at my IP at the same time the server sent it, but recv, or ioctlsocket (used to check that more than 1 byte is ready to be read), takes time to read more than one byte. How can I resolve this? The code is as follows.
DWORD RecvCount = 0;
char szBuff1[2048];
bool stop = false;
while (!stop)
{
    ioctlsocket(*socket, FIONREAD, &RecvCount);
    if (RecvCount > 1)
        stop = true;
}
int Res = recv(*socket, szBuff1, RecvCount, 0);
You should disable the Nagle algorithm on Windows, as otherwise the stack can sit on your small writes until previously sent data has been acknowledged (or at least wait a couple of hundred milliseconds before sending them anyway).
You do this by setting the TCP_NODELAY socket option:
int flag = 1;
int result = setsockopt(m_Socket, IPPROTO_TCP, TCP_NODELAY, (char *)&flag, sizeof(int));

TCP memcpy buffer returns rubbish data using C++

I'm doing something similar to Stack Overflow question Handling partial return from recv() TCP in C.
The data received is bigger than the buffer initialised (for example, 1000 bytes), therefore a temporary buffer of a bigger size (for example, 10000 bytes) is used. The problem is that the data received across multiple recv calls is rubbish. I've already checked the offset used to memcpy into the temporary buffer, but I keep getting rubbish data.
This sample shows what I do:
First message received:
memcpy(tmpBuff, dataRecv, 1000);
offSet = offSet + 1000;
From the second message onwards:
memcpy(tmpBuffer + offSet, dataRecv, 1000);
Is there something I should check?
I've checked the TCP hex that was sent out. Apparently, the sender is sending an incomplete message. The way my program works is that when the sender sends a message, it packs (message header + actual message). The message header contains some metadata, one item of which is the message length.
When the receiver receives a packet, it gets the message header using the message-header offset and length, extracts the message length, checks whether the current packet size is greater than or equal to that length, and returns the correct message size to the users. If part of a message is left over in the packet, it stores it in a temporary buffer and waits for the next packet. When the next packet arrives, it checks its message header for the message length and does the same thing.
If the sender packs three messages into one packet, each message has its own message header indicating its length. Assume all three messages are 300 bytes long. Also assume that the second message sent is incomplete and turns out to be only 100 bytes.
When the receiver receives the three messages in a packet, it returns the first message correctly. Since the second message is incomplete, my program doesn't know that, so it returns 100 bytes from the second message plus 200 bytes from the third message, because the message header says the total size is 300 bytes. Thus the second message returned contains some rubbish data.
As for the third message, my program will try to get the message length from its message header. Since its first 200 bytes have already been returned, the message header is invalid, and the message length returned to my program will be rubbish as well. Is there a way to check for a complete message?
Suppose you are expecting 7000 bytes over the TCP connection. In this case it is very likely that your data will be split into TCP packets with an actual payload size of, say, 1400 bytes (so 5 packets).
In this case it is perfectly possible that consecutive recv calls with a target buffer of 1000 bytes will behave as follows:
recv -> reads 1000 bytes (packet 1)
recv -> reads 400 bytes (packet 1)
recv -> reads 1000 bytes (packet 2)
recv -> reads 400 bytes (packet 2)
...
Now, in this case, when reading a 400-byte chunk you still copy the full 1000 bytes into your larger buffer, actually pasting 600 bytes of rubbish in between. You should memcpy only the number of bytes received, which is the return value of recv itself. Of course, you should also check whether this value is 0 (socket closed) or less than zero (socket error).