Not able to initialize mlx5 VF (SR-IOV) - DPDK

I am using DPDK 19.11 and getting an error when trying to use an SR-IOV VF of an mlx5 100G card.
Here is the error message I get:
net_mlx5: Unexpected error in DR drop action support detection
The same code works for the PF.
The same code (more or less) compiled with DPDK 17 works for the VF.
I am not trying to do any filtering on the NIC, just to read from the port: 1 RX queue, 1 TX queue.
Network devices using kernel driver
===================================
0000:0b:00.0 'MT27800 Family [ConnectX-5 Virtual Function] 1018' if=ens192 drv=mlx5_core unused=igb_uio
0000:13:00.0 'VMXNET3 Ethernet Controller 07b0' if=ens224 drv=vmxnet3 unused=igb_uio *Active*
0000:1b:00.0 'VMXNET3 Ethernet Controller 07b0' if=ens256 drv=vmxnet3 unused=igb_uio *Active*
ethtool -i ens192
driver: mlx5_core
version: 5.0-0
firmware-version: 16.29.1016 (HPE0000000009)
expansion-rom-version:
bus-info: 0000:0b:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes
A code example can be found here:
https://coliru.stacked-crooked.com/a/9d38f87799c3365c
It happens even with 1 RX queue and no rte_flow rules defined.
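For reference, this is roughly the kind of setup being used (a minimal sketch, not the exact code from the link above; the pool, descriptor counts and port id are illustrative):
#include <string.h>
#include <rte_ethdev.h>
#include <rte_mbuf.h>

/* Minimal 1 RX / 1 TX queue setup with a default rte_eth_conf and no rte_flow
 * rules; values here are illustrative, not taken from the linked code. */
static int port_init(uint16_t port_id, struct rte_mempool *pool)
{
    struct rte_eth_conf conf;
    int ret;

    memset(&conf, 0, sizeof(conf));

    ret = rte_eth_dev_configure(port_id, 1, 1, &conf);
    if (ret < 0)
        return ret;

    ret = rte_eth_rx_queue_setup(port_id, 0, 512,
                                 rte_eth_dev_socket_id(port_id), NULL, pool);
    if (ret < 0)
        return ret;

    ret = rte_eth_tx_queue_setup(port_id, 0, 512,
                                 rte_eth_dev_socket_id(port_id), NULL);
    if (ret < 0)
        return ret;

    return rte_eth_dev_start(port_id);
}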
According to the DPDK site, the firmware version looks OK:
https://doc.dpdk.org/guides/rel_notes/release_19_11.html#tested-platforms

Related

DPDK-Pktgen from DPDK NIC to directly connected non-DPDK NIC (different servers) not working

I've been playing around with DPDK and trying to create the following scenario.
I have 2 identical servers with 2 different NICs each. The goal is to send packets from server1 at different rates (up to the link maximum, using DPDK) and capture them on the other side, where an app will be running.
On server1 one NIC (Netronome) is bound to DPDK, but on server2 it's not. The NICs are directly connected with fiber.
On server1 I run
./dpdk-devbind.py --bind=vfio-pci 0000:05:00.0
and then pktgen. It appears to be working (packets are reported as sent by pktgen). However, on the other side (server2), the interface goes down the moment I devbind. From:
Settings for enp6s0np1:
Supported ports: [ FIBRE ]
Supported link modes: Not reported
Supported pause frame use: Symmetric
Supports auto-negotiation: No
Supported FEC modes: None
Advertised link modes: Not reported
Advertised pause frame use: Symmetric
Advertised auto-negotiation: No
Advertised FEC modes: None
Speed: 40000Mb/s
Duplex: Full
Auto-negotiation: off
Port: Direct Attach Copper
PHYAD: 0
Transceiver: internal
Link detected: yes
it goes to:
ethtool enp6s0np1
Settings for enp6s0np1:
Supported ports: [ FIBRE ]
Supported link modes: Not reported
Supported pause frame use: Symmetric
Supports auto-negotiation: No
Supported FEC modes: None
Advertised link modes: Not reported
Advertised pause frame use: Symmetric
Advertised auto-negotiation: No
Advertised FEC modes: None
Speed: Unknown!
Duplex: Unknown! (255)
Auto-negotiation: off
Port: Other
PHYAD: 0
Transceiver: internal
Link detected: no
It thinks there is no physical connection between the two NICs, which is obviously not the case:
enp6s0np1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
link/ether 00:15:4d:13:30:5c brd ff:ff:ff:ff:ff:ff
The pktgen output:
\ Ports 0-1 of 2 <Main Page> Copyright(c) <2010-2021>, Intel Corporation
Flags:Port : P------Sngl :0
Link State : <--Down--> ---Total Rate---
Pkts/s Rx : 0 0
Tx : 14,976 14,976
MBits/s Rx/Tx : 0/10 0/10
Pkts/s Rx Max : 0 0
Tx Max : 15,104 15,104
Broadcast : 0
Multicast : 0
Sizes 64 : 0
65-127 : 0
128-255 : 0
256-511 : 0
512-1023 : 0
1024-1518 : 0
Runts/Jumbos : 0/0
ARP/ICMP Pkts : 0/0
Errors Rx/Tx : 0/0
Total Rx Pkts : 0
Tx Pkts : 2,579,072
Rx/Tx MBs : 0/1,733
TCP Flags : .A....
TCP Seq/Ack : 305419896/305419920
Pattern Type : abcd...
Tx Count/% Rate : Forever /0.1%
Pkt Size/Tx Burst : 64 / 128
TTL/Port Src/Dest : 10/ 1234/ 8000
Pkt Type:VLAN ID : IPv4 / UDP:0001
802.1p CoS/DSCP/IPP : 0/ 0/ 0
VxLAN Flg/Grp/vid : 0000/ 0/ 0
IP Destination : 192.168.0.2
Source : 192.168.0.1
MAC Destination : 00:15:4d:13:30:5c
Source : 00:15:4d:13:30:81
PCI Vendor/Addr : 19ee:4000/05:00.0
When I try to capture with tcpdump -i enp6s0np1, it doesn't record anything. Are these issues related, and if so, is there any workaround? Shouldn't some packets be captured by tcpdump on server2?
Never mind, this got resolved. Apparently the NIC has two ports and only one was connected, hence the Link State shown as down below:
Ports 0-1 of 2 <Main Page> Copyright(c) <2010-2021>, Intel Corporation
Flags:Port : P------Sngl :0
Link State : <--Down-->
To resolve it I had to use port 1 instead of port 0.

DEV_TX_OFFLOAD_VXLAN_TNL_TSO Offload Testing - DPDK

I am working on Mellanox ConnectX-5 cards and using DPDK 20.11 with CentOS 8 (4.18.0-147.5.1.el8_1.x86_64).
I want to test the DEV_TX_OFFLOAD_VXLAN_TNL_TSO offload. What should the packet structure be (I am using scapy) that I send to the DPDK application, such that this offload comes into action and performs segmentation (since it is VXLAN tunnel TSO)?
I am modifying the dpdk-ip_fragmentation example and have added DEV_TX_OFFLOAD_VXLAN_TNL_TSO inside port_conf:
static struct rte_eth_conf port_conf = {
    .rxmode = {
        .max_rx_pkt_len = JUMBO_FRAME_MAX_SIZE,
        .split_hdr_size = 0,
        .offloads = (DEV_RX_OFFLOAD_CHECKSUM |
                     DEV_RX_OFFLOAD_SCATTER |
                     DEV_RX_OFFLOAD_JUMBO_FRAME),
    },
    .txmode = {
        .mq_mode = ETH_MQ_TX_NONE,
        .offloads = (DEV_TX_OFFLOAD_IPV4_CKSUM |
                     DEV_TX_OFFLOAD_VXLAN_TNL_TSO),
    },
};
And for ol_flags:
ol_flags |= (PKT_TX_IPV4 | PKT_TX_IP_CKSUM | PKT_TX_TUNNEL_VXLAN);
In short, to test this offload it would be great if someone could help me with 2 things:
What should the packet structure I send be (using scapy), such that the offload comes into action?
What settings are required in the DPDK example application? (It does not have to be the ip_fragmentation example; any other example would be fine too.)
Note: based on the 3-hour debug session, it has been clarified that the title and question as shared are incorrect. Hence the question will be re-edited to reflect the actual requirement: how to enable a DPDK port with TCP TSO offload for tunnelled VXLAN packets.
Answer to the first question (what the scapy-side settings should be for sending a packet to the DPDK DUT for TSO and receiving segmented traffic):
Disable all TSO-related offloads on the scapy interface using ethtool -K [scapy interface] rx off tx off tso off gso off gro off lro off.
Set the MTU so that larger frames, such as 9000 bytes, can be sent.
Send large frames as payload, but smaller than the interface MTU.
Run tcpdump for ingress traffic with the directional flag: tcpdump -eni [scapy interface] -Q in.
Answer to the second question (required settings in the DPDK example application):
The DPDK testpmd application can enable HW and SW TSO offloads based on NIC support.
The next best application is tep_termination, but it requires a vhost interface (VM) or DPDK vhost to achieve the same.
Since the requirement targets a generic application like skeleton or l2fwd, one can enable it as follows (a minimal sketch follows the notes below):
Ensure you use DPDK 20.11 LTS (to get the latest and best support for tunnel TSO).
In the application, check the TX offload capability with the rte_eth_dev_info_get API.
Cross-check for HW TSO support for tunnelled (VXLAN) packets.
If TSO needs to be done on a UDP payload, check for UDP TSO support in HW.
Configure the NIC with no multi-segment, jumbo frames, and a max frame length > 9000 bytes.
Receive the packet via rx_burst and ensure it is IPv4, UDP, VXLAN (tunnel) with nb_segs equal to 1.
Set the mbuf l2_len, l3_len and l4_len fields accordingly.
Mark the packet's ol_flags with PKT_TX_IPV4 | PKT_TX_TUNNEL_VXLAN | PKT_TX_TUNNEL_IP; for a UDP inner payload use PKT_TX_TUNNEL_UDP.
Then set the segment size to the DPDK MTU (1500 by default) - l3_len - l4_len.
This will let PMDs that support HW TSO offload update the appropriate descriptor fields so the given payload is transmitted as multiple packets. For our test case, the 9000-byte packet sent from scapy will be converted into 7 * 1500-byte packets; this can be observed with tcpdump.
Note:
The reference code is present in tep_termination and testpmd.
If there is no HW offload, the SW rte_gso library is available.
For HW offload, all PMDs as of today require the mbuf to be a single contiguous, non-external buffer. So make sure to create the mbuf pool/mempool with elements large enough to receive big packets.
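A minimal per-packet sketch of the steps above for a generic application (DPDK 20.11 names; the header lengths and the exact ol_flags combination are illustrative assumptions and may need adjusting for your PMD):
#include <rte_ether.h>
#include <rte_ip.h>
#include <rte_udp.h>
#include <rte_tcp.h>
#include <rte_vxlan.h>
#include <rte_mbuf.h>

/* Assumes the port was configured with DEV_TX_OFFLOAD_VXLAN_TNL_TSO after
 * checking dev_info.tx_offload_capa, and that pkt is a received IPv4/UDP/VXLAN
 * packet with an inner TCP payload and nb_segs == 1. */
static void prepare_vxlan_tso(struct rte_mbuf *pkt, uint16_t mtu)
{
    pkt->outer_l2_len = sizeof(struct rte_ether_hdr);       /* outer Ethernet */
    pkt->outer_l3_len = sizeof(struct rte_ipv4_hdr);        /* outer IPv4 */
    /* For tunnel offloads, l2_len covers outer UDP + VXLAN + inner Ethernet. */
    pkt->l2_len = sizeof(struct rte_udp_hdr) + sizeof(struct rte_vxlan_hdr) +
                  sizeof(struct rte_ether_hdr);
    pkt->l3_len = sizeof(struct rte_ipv4_hdr);               /* inner IPv4 */
    pkt->l4_len = sizeof(struct rte_tcp_hdr);                /* inner TCP */

    /* Segment size: DPDK MTU minus the inner L3 and L4 header lengths. */
    pkt->tso_segsz = mtu - pkt->l3_len - pkt->l4_len;

    pkt->ol_flags |= PKT_TX_IPV4 | PKT_TX_IP_CKSUM |
                     PKT_TX_TUNNEL_VXLAN |
                     PKT_TX_TCP_SEG;   /* TSO on the inner TCP payload */
    /* Some PMDs also expect PKT_TX_OUTER_IPV4 / PKT_TX_OUTER_IP_CKSUM
     * for the outer header; check the PMD documentation. */
}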

DPDK 18.11 HW checksum support for X722 NIC?

I am running dpdk-stable-18.11.8 on CentOS 7, targeting an Intel X722 NIC.
I want IPv4 and UDP header checksums to be calculated by hardware, so I set the device configuration to:
struct rte_eth_conf local_port_conf;
memset(&local_port_conf, 0, sizeof(struct rte_eth_conf));
local_port_conf.rxmode.split_hdr_size = 0;
local_port_conf.txmode.mq_mode = ETH_MQ_TX_NONE;
local_port_conf.txmode.offloads = DEV_TX_OFFLOAD_OUTER_UDP_CKSUM | DEV_TX_OFFLOAD_OUTER_IPV4_CKSUM;
rte_eth_dev_configure(0,1,1,&local_port_conf);
rte_eth_dev_configure returns:
0xffffffea (-22)
Does this mean that DPDK 18.11 doesn't support checksum offload to the X722 NIC?
DEV_TX_OFFLOAD_OUTER_IPV4_CKSUM is used for outer tunnelling packet, for which X710 has to be loaded with DDP. If the intent is for normal packet DEV_TX_OFFLOAD_IPV4_CKSUM is to be used.
Note: right way of configuring any DPDK port is to first fetch capability by rte_eth_dev_info_get. Then check dev_info.tx_offload_capa & DEV_TX_OFFLOAD_IPV4_CKSUM, if present configure.
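A sketch of that approach against the question's snippet (DPDK 18.11 API; DEV_TX_OFFLOAD_UDP_CKSUM is added here because the question also wants UDP checksums, it is not X722-specific advice):
#include <string.h>
#include <stdlib.h>
#include <rte_ethdev.h>
#include <rte_debug.h>

static void configure_port0_with_cksum_offloads(void)
{
    struct rte_eth_dev_info dev_info;
    struct rte_eth_conf conf;
    int ret;

    memset(&conf, 0, sizeof(conf));
    rte_eth_dev_info_get(0, &dev_info);   /* returns void in DPDK 18.11 */

    conf.txmode.mq_mode = ETH_MQ_TX_NONE;
    /* Request an offload only if the PMD reports it as supported. */
    if (dev_info.tx_offload_capa & DEV_TX_OFFLOAD_IPV4_CKSUM)
        conf.txmode.offloads |= DEV_TX_OFFLOAD_IPV4_CKSUM;
    if (dev_info.tx_offload_capa & DEV_TX_OFFLOAD_UDP_CKSUM)
        conf.txmode.offloads |= DEV_TX_OFFLOAD_UDP_CKSUM;

    ret = rte_eth_dev_configure(0, 1, 1, &conf);
    if (ret < 0)
        rte_exit(EXIT_FAILURE, "rte_eth_dev_configure failed: %d\n", ret);
}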

No DPDK packet fragmentation supported in Mellanox ConnectX-3?

Hello Stackoverflow Experts,
I am using DPDK on a Mellanox NIC, but I am struggling to get packet fragmentation working in a DPDK application.
sungho#c3n24:~$ lspci | grep Mellanox
81:00.0 Ethernet controller: Mellanox Technologies MT27500 Family
[ConnectX-3]
The DPDK applications (l3fwd, ip_fragmentation, ip_reassembly) did not recognize the received packet as having an IPv4 header.
At first I crafted my own packets with IPv4 headers, so I assumed I was crafting the packets the wrong way.
So I used DPDK-pktgen instead, but the DPDK applications (l3fwd, ip_fragmentation, ip_reassembly) still did not recognize the IPv4 header.
As a last resort I tested dpdk-testpmd, and found this in the status info:
********************* Infos for port 1 *********************
MAC address: E4:1D:2D:D9:CB:81
Driver name: net_mlx4
Connect to socket: 1
memory allocation on the socket: 1
Link status: up
Link speed: 10000 Mbps
Link duplex: full-duplex
MTU: 1500
Promiscuous mode: enabled
Allmulticast mode: disabled
Maximum number of MAC addresses: 127
Maximum number of MAC addresses of hash filtering: 0
VLAN offload:
strip on
filter on
qinq(extend) off
No flow type is supported.
Max possible RX queues: 65408
Max possible number of RXDs per queue: 65535
Min possible number of RXDs per queue: 0
RXDs number alignment: 1
Max possible TX queues: 65408
Max possible number of TXDs per queue: 65535
Min possible number of TXDs per queue: 0
TXDs number alignment: 1
testpmd> show port
According to the DPDK documentation, the info status of port 1 should show the supported flow types, but mine says that no flow type is supported.
The example below shows what should be displayed under flow types:
Supported flow types:
ipv4-frag
ipv4-tcp
ipv4-udp
ipv4-sctp
ipv4-other
ipv6-frag
ipv6-tcp
ipv6-udp
ipv6-sctp
ipv6-other
l2_payload
port
vxlan
geneve
nvgre
So does my NIC, the Mellanox ConnectX-3, not support DPDK IP fragmentation? Or is there additional configuration that needs to be done before trying out packet fragmentation?
-- [EDIT]
So I have compared the packets sent from DPDK-pktgen with the packets received by the DPDK application.
The packets I receive are exactly the ones I sent from the application (I get the correct data).
The problem begins at this code:
struct rte_mbuf *pkt;
RTE_ETH_IS_IPV4_HDR(pkt->packet_type)
This determines whether the packet is IPv4 or not. The value of pkt->packet_type is zero both from DPDK-pktgen and from the DPDK application, and if pkt->packet_type is zero the DPDK application treats the packet as NOT having an IPv4 header.
So this basic type check fails from the start.
What I believe is that either the DPDK sample is wrong or the NIC cannot support IPv4 for some reason.
The data I receive has a pattern: at the beginning I receive the correct message, but after that the sequence of packets has different data between the MAC address and the data offset.
So what I assume is that the two sides interpret the data differently and get the wrong result.
I am pretty sure any NIC, including the Mellanox ConnectX-3, MUST support IP fragments.
The flow types you are referring to are for the Flow Director, i.e. mapping specific flows to specific RX queues. Even if your NIC does not support Flow Director, that does not matter for IP fragmentation.
I guess there is an error in the setup or in the app. You wrote:
the DPDK application did not recognize the received packet as having an IPv4 header.
I would look into this more closely. Try to dump those packets with dpdk-pdump, or even by simply dumping the received packets on the console with rte_pktmbuf_dump(), for example:
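Something along these lines (port id, queue and the 64-byte dump length are arbitrary here):
#include <stdio.h>
#include <rte_ethdev.h>
#include <rte_mbuf.h>

/* Receive one burst and dump each mbuf's metadata plus its first 64 bytes. */
static void dump_rx_burst(uint16_t port_id)
{
    struct rte_mbuf *pkts[32];
    uint16_t nb_rx = rte_eth_rx_burst(port_id, 0, pkts, 32);

    for (uint16_t i = 0; i < nb_rx; i++) {
        rte_pktmbuf_dump(stdout, pkts[i], 64);
        rte_pktmbuf_free(pkts[i]);
    }
}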
If you still suspect the NIC, the best option would be to temporarily substitute it with another brand or with a virtual device, just to confirm that it is indeed the NIC.
EDIT:
Have a look at mlx4_ptype_table: for fragmented IPv4 packets it should return packet_type set to RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN | RTE_PTYPE_L4_FRAG.
Please note this functionality was added in DPDK 17.11.
I suggest you dump pkt->packet_type on the console to make sure it is indeed zero. Also make sure you have the latest libmlx4 installed.
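A quick way to do that, assuming pkt is one of the mbufs returned by rte_eth_rx_burst() (rte_get_ptype_name() just renders the packet_type bits as text):
#include <stdio.h>
#include <rte_mbuf.h>
#include <rte_mbuf_ptype.h>

static void print_ptype(const struct rte_mbuf *pkt)
{
    char ptype[128];

    rte_get_ptype_name(pkt->packet_type, ptype, sizeof(ptype));
    printf("packet_type=0x%08x (%s)\n", (unsigned)pkt->packet_type, ptype);
    /* For a fragmented IPv4 frame on mlx4 (DPDK >= 17.11) this should correspond
     * to RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN | RTE_PTYPE_L4_FRAG;
     * if it prints 0x00000000, the PMD did not classify the packet at all. */
}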

How to send and receive data using DPDK

I have a quad-port Intel 1G network card. I am using DPDK to send data on one physical port and receive it on another.
I saw a few examples in the DPDK code but could not make them work. If anybody knows how to do this, please send me simple instructions I can follow and understand. I set up my PC properly for huge pages, loading the driver, assigning a network port to the DPDK driver, etc. I can run helloworld from DPDK, so the system setup looks OK to me.
Thanks in advance.
After building DPDK:
cd to the DPDK directory.
Run sudo build/app/testpmd -- --interactive
You should see output like this:
$ sudo build/app/testpmd -- --interactive
EAL: Detected 8 lcore(s)
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Multi-process socket /var/run/.rte_unix
EAL: Probing VFIO support...
EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !
EAL: PCI device 0002:00:02.0 on NUMA socket 0
EAL: probe driver: 15b3:1004 net_mlx4
PMD: net_mlx4: PCI information matches, using device "mlx4_0" (VF: true)
PMD: net_mlx4: 1 port(s) detected
PMD: net_mlx4: port 1 MAC address is 00:0d:3a:f4:6e:17
Interactive-mode selected
testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=203456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Warning! port-topology=paired and odd forward ports number, the last port
will pair with itself.
Configuring Port 0 (socket 0)
Port 0: 00:0D:3A:F4:6E:17
Checking link statuses...
Done
testpmd>
Don't worry about the "No free hugepages" message. It means it couldn't find any 1 GB hugepages, but since it continued OK, it must have found some 2 MB hugepages. It'd be nice if it said "EAL: Using 2 MB hugepages" instead.
At the prompt, type start tx_first, then quit. You should see something like:
testpmd> start tx_first
io packet forwarding - ports=1 - cores=1 - streams=1 - NUMA support enabled, MP over anonymous pages disabled
Logical Core 1 (socket 0) forwards packets on 1 streams:
RX P=0/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=02:00:00:00:00:00
io packet forwarding packets/burst=32
nb forwarding cores=1 - nb forwarding ports=1
port 0:
CRC stripping enabled
RX queues=1 - RX desc=1024 - RX free threshold=0
RX threshold registers: pthresh=0 hthresh=0 wthresh=0
TX queues=1 - TX desc=1024 - TX free threshold=0
TX threshold registers: pthresh=0 hthresh=0 wthresh=0
TX RS bit threshold=0 - TXQ offloads=0x0
testpmd> quit
Telling cores to stop...
Waiting for lcores to finish...
---------------------- Forward statistics for port 0 ----------------------
RX-packets: 0 RX-dropped: 0 RX-total: 0
TX-packets: 32 TX-dropped: 0 TX-total: 32
----------------------------------------------------------------------------
+++++++++++++++ Accumulated forward statistics for all ports+++++++++++++++
RX-packets: 0 RX-dropped: 0 RX-total: 0
TX-packets: 32 TX-dropped: 0 TX-total: 32
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
In my system there is only one DPDK port, so I sent 32 packets but did not receive any. If I had a multi-port card with a cable directly between the ports, then I'd see the RX count also increase.
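If you'd rather do this from your own application than from testpmd, the core of it is a small burst receive/transmit loop. A minimal sketch (error handling trimmed; port numbers, pool size and descriptor counts are illustrative, and ports 0 and 1 are assumed to be cabled together):
#include <string.h>
#include <stdlib.h>
#include <rte_eal.h>
#include <rte_ethdev.h>
#include <rte_mbuf.h>
#include <rte_lcore.h>
#include <rte_debug.h>

int main(int argc, char **argv)
{
    struct rte_mempool *pool;

    if (rte_eal_init(argc, argv) < 0)
        rte_exit(EXIT_FAILURE, "EAL init failed\n");

    /* One pool for both ports; 8191 mbufs / 250 cache are arbitrary choices. */
    pool = rte_pktmbuf_pool_create("mbufs", 8191, 250, 0,
                                   RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
    if (pool == NULL)
        rte_exit(EXIT_FAILURE, "mbuf pool creation failed\n");

    /* Set up 1 RX and 1 TX queue on each of the two ports. */
    for (uint16_t port = 0; port < 2; port++) {
        struct rte_eth_conf conf;

        memset(&conf, 0, sizeof(conf));
        if (rte_eth_dev_configure(port, 1, 1, &conf) < 0 ||
            rte_eth_rx_queue_setup(port, 0, 512,
                                   rte_eth_dev_socket_id(port), NULL, pool) < 0 ||
            rte_eth_tx_queue_setup(port, 0, 512,
                                   rte_eth_dev_socket_id(port), NULL) < 0 ||
            rte_eth_dev_start(port) < 0)
            rte_exit(EXIT_FAILURE, "port %u setup failed\n", port);
        rte_eth_promiscuous_enable(port);
    }

    /* Forward everything received on port 0 out of port 1 (like testpmd io mode). */
    for (;;) {
        struct rte_mbuf *pkts[32];
        uint16_t nb_rx = rte_eth_rx_burst(0, 0, pkts, 32);
        uint16_t nb_tx = rte_eth_tx_burst(1, 0, pkts, nb_rx);

        /* Free whatever the TX queue could not take. */
        for (uint16_t i = nb_tx; i < nb_rx; i++)
            rte_pktmbuf_free(pkts[i]);
    }

    return 0;
}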
You can use testpmd to test DPDK.
testpmd can work as a packet generator (tx_only mode), a receiver (rx_only mode), or a forwarder (io mode).
You will need generator nodes connected to your box if you want to use testpmd as a forwarder only.
I propose that you start with the following setup:
generator (pktgen) ------> testpmd (io mode) ----------> receiver (testpmd rx_only mode)
On the pktgen generator, set the destination MAC address to the MAC address of the receiver's receiving port.
pktgen and how it works is explained in more detail at this link:
http://pktgen.readthedocs.io/en/latest/getting_started.html
testpmd and how it works is explained here:
http://www.intel.com/content/dam/www/public/us/en/documents/guides/dpdk-testpmd-application-user-guide.pdf
I hope this helps.