Issue when using OVS-DPDK with Intel SR-IOV Virtual Function

I'm using OVS-DPDK with an Intel X710 NIC. These are the steps I follow (a sketch of the corresponding OVS commands is given after the list):
(1) Bind a UIO driver to the NIC port (e.g. igb_uio for the physical function, vfio-pci for the virtual function)
(2) Add this DPDK port to OVS, called dpdkport
(3) Add a vhost-user port to OVS
(4) Add a flow that forwards packets from the vhost-user port to dpdkport
(5) Run a DPDK packet generator (e.g. pktgen or testpmd) against the vhost-user port. These packets are forwarded to dpdkport and then sent out through the NIC.
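For reference, the OVS commands for steps (2) through (4) look roughly like this (the bridge and port names match my setup below; the PCI address 0000:01:02.0 and the vhost-user port name are just placeholders):
ovs-vsctl add-port ovs-br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk options:dpdk-devargs=0000:01:02.0
ovs-vsctl add-port ovs-br0 vhost-user-0 -- set Interface vhost-user-0 type=dpdkvhostuser
ovs-ofctl add-flow ovs-br0 in_port=vhost-user-0,actions=output:dpdk-p0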
Everything is OK with a Physical Function NIC port, but the issue appears when using a Virtual Function (VF).
When using a VF, I dumped the port statistics with ovs-ofctl dump-ports. It shows that nearly all packets error out and cannot be sent to the NIC.
Error log:
ovs-ofctl dump-ports ovs-br0 dpdk-p0
OFPST_PORT reply (xid=0x4): 1 ports
  port "dpdk-p0": rx pkts=0, bytes=0, drop=0, errs=0, frame=?, over=?, crc=?
           tx pkts=0, bytes=0, drop=0, errs=317971312, coll=?
Do you have any idea about what happened?
P.S.: there was a duplicate item in step (3), so I edited it out. I also added the report from OVS showing the packet errors.
Regards
Michael Tang

Related

dpdk-testpmd command executed and then hangs

I set up a DPDK-compatible environment and then tried to send packets using dpdk-testpmd, expecting to see them received on another server.
I am using vfio-pci driver in no-IOMMU (unsafe) mode.
I ran
$./dpdk-testpmd -l 11-15 -- -i
which had output like
EAL: Detected NUMA nodes: 2
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: VFIO support initialized
EAL: Using IOMMU type 8 (No-IOMMU)
EAL: Probe PCI driver: net_i40e (8086:1572) device: 0000:01:00.1 (socket 0)
TELEMETRY: No legacy callbacks, legacy socket not created
Interactive-mode selected
testpmd: create a new mbuf pool <mb_pool_1>: n=179456, size=2176, socket=1
testpmd: preferred mempool ops selected: ring_mp_mc
testpmd: create a new mbuf pool <mb_pool_0>: n=179456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Warning! port-topology=paired and odd forward ports number, the last port will pair with itself.
Configuring Port 0 (socket 0)
Port 0: E4:43:4B:4E:82:00
Checking link statuses...
Done
then
testpmd> set nbcore 4
Number of forwarding cores set to 4
testpmd> show config fwd
txonly packet forwarding - ports=1 - cores=1 - streams=1 - NUMA support enabled, MP allocation mode: native
Logical Core 12 (socket 0) forwards packets on 1 streams:
RX P=0/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=BE:A6:27:C7:09:B4
My nbcore is not being set correctly, and even 'txonly' mode was not applied until I set the eth-peer address, although some parameters do work. Moreover, if I don't change the burst delay, my server crashes as soon as I start transmitting, even though it has a 10G Ethernet port (80 MBps available bandwidth by my calculation). Hence, I am not seeing packets at the receiving server when tailing tcpdump on the corresponding receiving interface. What is happening here and what am I doing wrong?
Based on the question and the answers in the comments, the real intention is to send packets from DPDK testpmd using an Intel Fortville NIC (net_i40e) to the remote server.
The real reason traffic is not being generated is that neither the application command line nor the interactive options are set to make dpdk-testpmd create packets.
In order to generate packets there are two options in testpmd:
start tx_first: this sends out a default burst of 32 packets as soon as the port is started.
forward mode txonly: this puts the ports under dpdk-testpmd into transmit-only mode; once the port is started, it transmits packets with the default packet size.
Neither of these options is used, hence my suggestions are:
please walk through the DPDK documentation on testpmd and its configuration
make use of either --tx-first or --forward-mode=txonly, as per DPDK Testpmd Command-line Options
make use of either start tx_first, set fwd txonly or set fwd flowgen in interactive mode, as per Testpmd Runtime Functions (a minimal interactive sequence is sketched after this list)
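For instance, assuming the port is already configured and its link is up, the interactive sequence would be just:
testpmd> set nbcore 4
testpmd> set fwd txonly
testpmd> start
testpmd> show port stats all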
With this, traffic will be generated from testpmd and sent to the device (remote server). A quick example of the same is:
dpdk-testpmd --file-prefix=test1 -a 81:00.0 -l 7,8 --socket-mem=1024 -- --burst=128 --txd=8192 --rxd=8192 --mbcache=512 --rxq=1 --txq=1 --nb-cores=2 -a --forward-mode=io --rss-udp --enable-rx-cksum --no-mlockall --no-lsc-interrupt --enable-drop-en --no-rmv-interrupt -i
From the above example, the config parameters are:
the number of packets per RX/TX burst is set by --burst=128
the number of RX/TX queues is configured by --rxq=1 --txq=1
the number of cores to use for RX/TX is set by --nb-cores=2
the forwarding mode (flowgen, txonly, rxonly or io) is selected with --forward-mode=io
Hence, as mentioned in the comments, neither set nbcore 4 nor any of the configurations in the testpmd arguments or interactive commands actually puts the application into TX-only mode.
The second part of the query is really confusing, because as it states
Moreover, if I don't change the burst delay, my server crashes as soon as I start
transmitting, even though it has a 10G Ethernet port (80 MBps available bandwidth
by my calculation). Hence, I am not seeing packets at the receiving server when
tailing tcpdump on the corresponding receiving interface. What is happening here
and what am I doing wrong?
Assuming "my server" is the remote server to which packets are being sent by dpdk-testpmd, since tcpdump is mentioned (the Intel Fortville X710, once bound to a UIO driver, no longer exposes a kernel network interface, so tcpdump could only be run on the remote side).
The mentioned 80 MBps (about 0.64 Gbps, or 0.08 Gbps if megabits per second were meant) is really strange for a 10G port. If the remote interface is set to promiscuous mode and an AF_XDP or raw-socket application is configured to receive traffic, line rate (10 Gbps) can be handled. Since there are no logs or crash dumps from the remote server, and it is highly unlikely that any actual traffic was generated from testpmd, this looks more like a configuration or setup issue on the remote server.
[EDIT-1] Based on the live debug, it is confirmed that:
DPDK was not installed - fixed by running ninja install
the DPDK NIC port eno2 is not connected to the remote server directly
the DPDK NIC port eno2 is connected through a switch
the DPDK application testpmd is not crashing - confirmed with pgrep testpmd
instead, when used with set fwd txonly, packets flood the switch and SSH packets from the other port are dropped
Solution: use a separate switch for data-path testing, or a direct connection to the remote server.

DPDK: Zero Tx or Rx packets while running TestPMD

I have set up DPDK 20.11. While running the basic testpmd application, the numbers of transmitted and received packets are zero. I need help, and I am new to this.
I have attached a terminal screenshot of running testpmd. I would like to know where I am making a mistake.
OS: Ubuntu 16.04.6LTS (Xenial Xerus)
Testpmd was run with no arguments (just the 'sudo ./dpdk-testpmd' command)
Physical NIC
Firmware Details:
The driver details and NIC firmware details have been provided in the link
[Edit 1] Port info of the first port
Port info of the second port
Had a live debug on the setup: the ports were not physically connected to another NIC or a switch, and in the Linux kernel ethtool shows the links as down. Hence the DPDK application reports the same thing: link down.
Solution: connect the interfaces to either a NIC or a switch to bring the port state up.
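The link state can be cross-checked from both sides, for example (interface and port numbers are illustrative):
ethtool eno1 | grep "Link detected"     (kernel driver side)
testpmd> show port info 0               (DPDK side; check the "Link status" field)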

DPDK getting too many rx_crc_errors on one port

What may cause rx_crc_errors on DPDK ports?
Is it a software thing, or a hardware thing related to the port or to the traffic coming from the other end?
DPDK Version: 19.02
PMD: I40E
This port is running on a customer network. It is worth mentioning that this is the only port (out of 4) showing this behaviour, so it may be a router/traffic thing, but I couldn't verify that.
I used dpdk-proc-info to get this data.
I could not do any additional investigation as this is running on a customer site.
The DPDK I40E PMD only has an option to enable or disable CRC stripping on the port. Hence the assumption that the DPDK I40E PMD is causing CRC errors on 1 port out of 4 can be ruled out entirely.
RX packets are validated for CRC by the ASIC on each port and then DMAed into the mbuf packet buffer. The PMD copies the descriptor status into the mbuf struct (CRC status being one of the fields); that is, the packet descriptor reports the CRC result of the packet buffer to the driver (kernel or DPDK PMD). So CRC errors on a given port can arise for the following reasons:
the port's connection to the ASIC is faulty (very rare)
the SFP+ is not properly seated (possible)
the SFP+ is not a recommended/compatible one (possible)
the traffic coming from the other end arrives with faulty CRCs
One needs to isolate the issue by (a sketch of the commands for the first step is given after the list):
binding the port back to the Linux i40e driver and checking the statistics via ethtool -S [port]
checking SFP+ compatibility on the faulty port by swapping it with a working one
re-seating the SFP+
swapping the data cables between a working port and the faulty port, then checking whether the error follows
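For the first step, assuming the faulty port is PCI device 0000:3b:00.3 and shows up as eth3 under the kernel driver (both names are placeholders):
dpdk-devbind.py -u 0000:3b:00.3
dpdk-devbind.py -b i40e 0000:3b:00.3
ethtool -S eth3 | grep -i -e crc -e err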
If in all four cases above the error appears only on the faulty port, then the NIC indeed has only 3 working ports out of 4; the card needs replacement, or the faulty port should be avoided altogether. Either way, this is not a DPDK PMD or library issue.

DPDK IPv4 Flow Filtering on Mellanox

I have a DPDK application that uses Boost.Asio to join a multicast group and receives multicast IPv4 UDP packets over a VLAN on a particular UDP port (other UDP ports are also used for other traffic). I am trying to receive only those multicast UDP packets on that port in the DPDK application and place them into an RX queue, while all other ingress traffic behaves as if the DPDK application were not running (it goes to the kernel). As such, I am using flow isolated mode (rte_flow_isolate()). The flow filtering part of my application is based on the flow_filtering example provided by DPDK, with the additions of the call to rte_flow_isolate() and a VLAN filter. The filters I'm using are below (a sketch of how the flow is created follows the pattern):
action[0].type = RTE_FLOW_ACTION_TYPE_QUEUE;
action[0].conf = &queue;
action[1].type = RTE_FLOW_ACTION_TYPE_END;
pattern[0].type = RTE_FLOW_ITEM_TYPE_ETH;
pattern[1].type = RTE_FLOW_ITEM_TYPE_VLAN;
//vlan id here
pattern[2].type = RTE_FLOW_ITEM_TYPE_IPV4;
//no specific ip address given
pattern[3].type = RTE_FLOW_ITEM_TYPE_UDP;
//udp port here
pattern[4].type = RTE_FLOW_ITEM_TYPE_END;
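For context, the isolate call and the flow creation look roughly like this (error handling trimmed; attr, pattern and action are set up as in the flow_filtering example):
struct rte_flow_error error;
struct rte_flow_attr attr = { .ingress = 1 };
struct rte_flow *flow = NULL;
/* route only explicitly matched traffic to DPDK queues */
if (rte_flow_isolate(port_id, 1, &error) != 0)
    rte_exit(EXIT_FAILURE, "rte_flow_isolate failed: %s\n", error.message);
/* validate and install the ETH / VLAN / IPV4 / UDP -> queue rule */
if (rte_flow_validate(port_id, &attr, pattern, action, &error) == 0)
    flow = rte_flow_create(port_id, &attr, pattern, action, &error);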
Using these filters, I am unable to receive any packets and the same is true if I only remove the UDP filter. However, if I remove both the IPV4 and UDP filters (keeping the ETH and VLAN filters), I can receive all the packets I need, along with other ones I don't want (and would like to be sent to the kernel).
Here's an entry for the packet I need to receive from a Wireshark capture. Currently my theory is that because the reserved bit (evil bit) is being set in the IPv4 header, the packet is not being recognized as IPv4. This is probably a stretch:
Frame 100: 546 bytes on wire (4368 bits), 546 bytes captured (4368 bits) on interface 0
Ethernet II, Src: (src MAC), Dst: IPv4mcast_...
802.1Q Virtual LAN, PRI: 0, FRI: 0, ID: 112
Internet Protocol Version 4, Src: (src IP), Dst: (Dst mcast IP)
Version: 4
Header length: 20 bytes
Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00: Not-ECT (Not ECN-Capable Transport))
Total length: 1108
Identification: 0x000 (0)
Flags: 0x04 (RESERVED BIT HAS BEEN SET)
Fragment offset: 0
Time to live: 64
Protocol: UDP (17)
Header checksum: 0xd8c4 [validation disabled]
Source: srcip
Destination: dstip
User Datagram Protocol, Src Port: (src port), Dst Port: (dst port)
Data (N bytes)
The hardware I'm running on has a Mellanox ConnectX-5 card and as such, DPDK is using the MLX5 driver, which does not support RTE_FLOW_ITEM_TYPE_RAW along with many other items in the RTE Flow API. I am on DPDK 19.11 and the OFED version I'm using is 4.6 for RHEL 7.6 (x86_64)
What am I doing wrong here and why does adding the RTE_FLOW_ITEM_TYPE_IPV4 filter (without ip address, spec and mask both memset to 0) cause my application to not receive any packets, even though they are IPv4 packets? Is there a way around this with the MLX5 driver for DPDK?
The answer is quite simple: the packets are fragmented. There are two reasons they can't be matched:
the UDP header is present only in the first IP fragment,
so from the NIC's perspective, a fragmented UDP datagram is just a series of IP packets.
Try to match non-fragmented packets to confirm (see the sketch below).
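If you want to express "non-fragmented only" explicitly in the rule, the IPV4 item can carry a spec/mask on the fragment_offset field; whether the PMD accepts this match depends on the driver and DPDK version. A rough sketch:
struct rte_flow_item_ipv4 ip_spec;
struct rte_flow_item_ipv4 ip_mask;
memset(&ip_spec, 0, sizeof(ip_spec));
memset(&ip_mask, 0, sizeof(ip_mask));
/* fragment offset 0 and More Fragments flag clear => not a fragment */
ip_spec.hdr.fragment_offset = 0;
ip_mask.hdr.fragment_offset = rte_cpu_to_be_16(RTE_IPV4_HDR_MF_FLAG | RTE_IPV4_HDR_OFFSET_MASK);
pattern[2].type = RTE_FLOW_ITEM_TYPE_IPV4;
pattern[2].spec = &ip_spec;
pattern[2].mask = &ip_mask;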
Is it possible to use flow isolation mode while receiving multicast with DPDK? I thought that flow isolation and promiscuous or allmulticast modes were not compatible.

Not able to generate .pcap file with testpmd

I'm trying to use testpmd as a traffic sniffer, and I want to save that traffic into a .pcap file.
I've installed and configured DPDK and bound the interface from which I want to capture traffic.
Network devices using DPDK-compatible driver
0000:01:00.0 'I210 Gigabit Network Connection 157b' drv=igb_uio unused=igb
Network devices using kernel driver
0000:02:00.0 'I210 Gigabit Network Connection 157b' if=enp2s0 drv=igb unused=igb_uio Active
0000:03:00.0 'I210 Gigabit Network Connection 157b' if=enp3s0 drv=igb unused=igb_uio Active
0000:04:00.0 'QCA986x/988x 802.11ac Wireless Network Adapter 003c' if=wlp4s0 drv=ath10k_pci unused=igb_uio
The problem I find is the following:
my#server:~/dpdk-stable-17.11.1$ sudo build/app/testpmd -c '0xf' -n 4 --vdev 'eth_pcap0,rx_iface=enp1s0,tx_pcap=/home/output.pcap' -- --port-topology=chained --total-num-mbufs=2048 --nb-cores=3
EAL: Detected 4 lcore(s)
EAL: Probing VFIO support...
EAL: PCI device 0000:01:00.0 on NUMA socket -1
EAL: Invalid NUMA socket, default to 0
EAL: probe driver: 8086:157b net_e1000_igb
EAL: PCI device 0000:02:00.0 on NUMA socket -1
EAL: Invalid NUMA socket, default to 0
EAL: probe driver: 8086:157b net_e1000_igb
EAL: PCI device 0000:03:00.0 on NUMA socket -1
EAL: Invalid NUMA socket, default to 0
EAL: probe driver: 8086:157b net_e1000_igb
PMD: Initializing pmd_pcap for eth_pcap0
PMD: Couldn't open enp1s0: enp1s0: SIOCETHTOOL(ETHTOOL_GET_TS_INFO) ioctl failed: No such device
PMD: Couldn't open interface enp1s0
vdev_probe(): failed to initialize eth_pcap0 device
EAL: Bus (vdev) probe failed.
USER1: create a new mbuf pool <mbuf_pool_socket_0>: n=2048, size=2176, socket=0
Configuring Port 0 (socket 0)
Port 0: 00:0D:B9:48:87:54
Checking link statuses...
Done
No commandline core given, start packet forwarding
io packet forwarding - ports=1 - cores=1 - streams=1 - NUMA support enabled, MP over anonymous pages disabled
Logical Core 1 (socket 0) forwards packets on 1 streams:
RX P=0/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=02:00:00:00:00:00
io packet forwarding packets/burst=32
nb forwarding cores=3 - nb forwarding ports=1
port 0:
CRC stripping enabled
RX queues=1 - RX desc=128 - RX free threshold=32
RX threshold registers: pthresh=8 hthresh=8 wthresh=4
TX queues=1 - TX desc=512 - TX free threshold=0
TX threshold registers: pthresh=8 hthresh=1 wthresh=16
TX RS bit threshold=0 - TXQ flags=0x0
Press enter to exit
PMD: eth_igb_interrupt_action(): Port 0: Link Up - speed 1000 Mbps - full-duplex
Port 0: LSC event
Telling cores to stop...
Waiting for lcores to finish...
---------------------- Forward statistics for port 0 ----------------------
RX-packets: 4498370 RX-dropped: 1630 RX-total: 4500000
TX-packets: 4498370 TX-dropped: 0 TX-total: 4498370
----------------------------------------------------------------------------
+++++++++++++++ Accumulated forward statistics for all ports+++++++++++++++
RX-packets: 4498370 RX-dropped: 1630 RX-total: 4500000
TX-packets: 4498370 TX-dropped: 0 TX-total: 4498370
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Done.
Shutting down port 0...
Stopping ports...
Done
Closing ports...
Done
Bye...
The PMD is not able to open enp1s0 because it is bound to DPDK, so the kernel no longer has access to it.
What can I do?
Thanks in advance!!
The PMD is not able to open enp1s0 because it is bound to DPDK, so the kernel no longer has access to it.
You are right. The idea behind the pcap PMD is to use a kernel interface and/or .pcap files in DPDK via the pcap library. But once you have bound an interface to UIO, you can no longer open it with the pcap library:
PMD: Initializing pmd_pcap for eth_pcap0
PMD: Couldn't open enp1s0: enp1s0: SIOCETHTOOL(ETHTOOL_GET_TS_INFO) ioctl failed: No such device
PMD: Couldn't open interface enp1s0
vdev_probe(): failed to initialize eth_pcap0 device
Instead we can use another interface (say, enp2s0) as a source:
--vdev 'eth_pcap0,rx_iface=enp2s0,tx_pcap=/home/output.pcap'
Or we can use another .pcap file as a source:
--vdev 'eth_pcap0,rx_pcap=/home/input.pcap,tx_pcap=/home/output.pcap'
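Either way, plugging one of these vdev strings into the original command line gives, for example:
sudo build/app/testpmd -c '0xf' -n 4 --vdev 'eth_pcap0,rx_iface=enp2s0,tx_pcap=/home/output.pcap' -- --port-topology=chained --total-num-mbufs=2048 --nb-cores=3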
Also please note that writing to .pcap might slow down the testpmd performance, i.e. the performance will be bound by pcap_dump() calls.