I usually run XDP applications on servers with Intel Xeon Gold CPUs, and performance has never been a problem: up to 125 Mpps with a 100 GbE MCX515A-CCAT network card and 2 CPUs inside a 1U server.
Today I tried to make it work on an AMD EPYC 7371 and for some reason performance was very low — the maximum I got was 13 Mpps with the same network card and a single AMD EPYC 7371. All cores are pushed to their max (100%).
Testing was done on Ubuntu 18.04 (and afterwards on 20.04 as well). I installed the same Mellanox drivers (OFED) as I do for Intel and made the other configuration changes as usual.
I run XDP in driver mode with JIT enabled. Other tuning is as follows:
sudo mlnx_tune -p HIGH_THROUGHPUT
sudo ethtool -G enp65s0 rx 512 tx 512
sudo ethtool -L enp65s0 combined 8
sudo ethtool --show-priv-flags enp65s0
sudo ethtool --set-priv-flags enp65s0 rx_cqe_compress on
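To confirm the setup, the JIT status and the driver-mode attach can be checked with generic commands like these (interface name as above):
sysctl net.core.bpf_jit_enable    # should print 1 when the BPF JIT is on
ip link show enp65s0              # "xdpdrv" in the output confirms driver-mode attach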
Is there anything else I should do to run it on an AMD CPU? Performance surely can't be this low, so I suspect I misconfigured something along the way.
A bit more info about our config: kernel 5.4 on 20.04 and 4.5 on 18.04. We generate traffic from our 2nd machine (TRex), which also has an MCX515A-CCAT network card and is connected to the 1st machine (the AMD one) with a Mellanox 100 GbE cable.
I have a board with one ethernet interface (eth0) running Linux.
I'm trying to forward all incoming traffic from eth0 to my PMD driver, using the dpdk-l2fwd example application.
Here is what I've tried:
./dpdk-l2fwd -c 0x3 --vdev={my_pmd}0 -- -p 0x3 -T 0
I can see that my rx_pkt_burst callback is polled by the application, but that's it.
How can I forward all incoming eth0 packets to my PMD?
I tried to use net_tap, using the following command:
./dpdk-l2fwd -c 0xff --vdev=net_tap0 --vdev={my_pmd}0 -- -p 0x7 -T 0 --portmap="(1,2)"
And my tx_pkt_burst callback is called occasionally, but not when I think it should be called.
For example, if I ping this board from another one, the ping is successful, but the tx_pkt_burst callback is not being called.
I tried to use the devbind tool, but no devices are detected:
./usertools/dpdk-devbind.py --status
No 'Network' devices detected
=============================
No 'Baseband' devices detected
==============================
No 'Crypto' devices detected
============================
No 'Eventdev' devices detected
==============================
No 'Mempool' devices detected
=============================
No 'Compress' devices detected
==============================
No 'Misc (rawdev)' devices detected
===================================
No 'Regex' devices detected
===========================
Update
DPDK version - 20.11.
My HW is an embedded device based on NXP's Layerscape.
$ lshw -class network
*-network
description: Ethernet interface
physical id: 3
logical name: eth0
serial: 00:11:22:44:11:44
size: 1Gbit/s
capacity: 1Gbit/s
capabilities: ethernet physical tp mii 10bt-fd 100bt-fd 1000bt-fd autonegotiation
configuration: autonegotiation=on broadcast=yes driver=fsl_dpaa2_eth driverversion=5.10.35-00002-g3434eea0e1e7-dir duplex=full firmware=7.17 ip=192.168.15.157 link=yes multicast=yes port=twisted pair speed=1Gbit/s
I'm trying to bypass all traffic to the PMD I'm currently developing.
Thanks.
[EDIT-1] Clarification on using the same interface for DPDK and kernel routing
Answer> As discussed in the comments, please refer to DPDK + kernel on the same interface.
Based on the information shared, there are multiple questions behind the single query "I'm trying to bypass all traffic to the PMD I'm currently developing". Addressing each one separately below.
question 1: using dpdk-l2fwd example application
Answer> The DPDK l2fwd example application makes use of basic APIs with almost no HW offloads. Based on your environment (a board with one Ethernet interface, eth0), the right set of parameters should be -p 0x1 --no-mac-updating -T 1. This will configure the application to receive and transmit packets using a single DPDK interface (that is, eth0 on your board); see the invocation sketch after the note below.
Note: DPDK applications can work with both physical and virtual DPDK PMDs.
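For example, combined with the vdev and coremask from your command above, the invocation would look roughly like this (a sketch; the vdev name is whatever your PMD registers):
./dpdk-l2fwd -c 0x3 --vdev={my_pmd}0 -- -p 0x1 --no-mac-updating -T 1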
question 2: I tried to use net_tap, using the following command:
Answer> If the intent is to intercept the traffic from the physical port and then forward it to a TAP interface, then one needs to modify the EAL arguments as ./build/l2fwd --vdev=net_tap0,iface="my_eth0" -- -p 0x3 -T 1 --no-mac-updating. This will allow the application to probe the physical NXP interface (eth0) and use a Linux TAP interface as the secondary interface. Thus any traffic between NXP and TAP will be cross-connected as NXP (eth0) <==> TAP (my_eth0).
question 3: ./usertools/dpdk-devbind.py --status returns empty
Answer> From the DPDK supported-NIC list, the NXP PMDs are dpaa, dpaa2, enetc, enetfec and pfe. Cross-checking the kernel driver fsl_dpaa2_eth, I think it is safe to assume the dpaa2 PMD is supported. Since, as you have mentioned, the NIC is not enumerated, it looks like there are certain caveats around model revision, supported board, BSP package, vendor/sub-vendor ID checks, etc. More details can be found in the Board Support Package and the DPAA2 NIC guide.
Debug & Alternative solutions:
To start with, use the kernel driver to bring in packets.
Use extra logging and debugging to identify why the NIC is not shown in the application.
Approach 1:
Make sure the NIC is bound to the kernel driver fsl_dpaa2_eth.
Ensure the NIC is connected and the link is up with ethtool eth0.
Set promiscuous mode with ifconfig eth0 promisc up.
Start the DPDK application with the PCAP PMD: ./build/l2fwd --vdev=net_pcap0,iface=eth0 -- -p 1 --no-mac-updating -T 1
Check that packets are received and redirected to the PCAP eth0 PMD by checking the statistics.
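To confirm that packets are actually reaching eth0 on the kernel side before suspecting the PMD, the interface counters can be inspected with a generic check (not NXP specific):
ip -s link show eth0    # RX packet/byte counters should increase while traffic is sent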
Approach 2:
Ideally the NIC should be categorized as a network device in order to be probed by devbind.py.
Check the device details using lshw -c net -businfo.
Try checking with lspci -Dvmmnnk [PCIe Bus:Slot:Function id] for the network device details.
If the above does not show the device as a network device, this might be the reason it is not getting listed.
Suggestions or workaround: you can try to forcibly bind with igb_uio or vfio-pci (I am not very familiar with the NXP SoC) via dpdk-devbind -b vfio-pci [PCIe S:B:F]. Then cross-check with lspci -ks [PCIe S:B:F]. Once that succeeds, one can start dpdk l2fwd in PMD debug mode with ./build/l2fwd -a [PCIe S:B:F] --log-level=pmd,8 -- -p 1 --no-mac-updating | more. Thus, by intercepting and interpreting the logs, one can identify what is going wrong.
Note:
It is assumed the application is built with static libraries and not dynamic ones. To build with static libraries, use make static for l2fwd.
For the described use case, the recommended application is basicfwd/skeleton rather than l2fwd.
Found the problem.
I had to unbind eth0 from the Linux kernel.
Now I can simply run:
./dpdk-l2fwd -c 0x3 --vdev={MY_PMD}0 -- -p 0x3 -T 1
And all traffic in the physical port is forwarded to my PMD.
I am trying, unsuccessfully, to run "minikube start" on my Ubuntu 18.04.
The characteristics of the system are:
Windows 10 PC with VirtualBox 6.1.16
In VirtualBox I have installed Ubuntu 18.04 (11 GB of memory and 2 processors)
I've also enabled nested virtualization
In Ubuntu I also installed (after searching with Google): virtualbox-dkms and linux-headers-generic
Every time I run
minikube start
I get the following messages
minikube v1.15.1 on Ubuntu 18.04
Automatically selected the virtualbox driver
Downloading VM boot image ...
minikube-v1.15.0.iso.sha256: 65 B / 65 B [-------] 100.00%? p / s 0s
minikube-v1.15.0.iso: 181.00 MiB / 181.00 MiB [] 100.00% 6.45 MiB p / s 28s
Starting control plane node minikube in cluster minikube
Downloading Kubernetes v1.19.4 preload ...
preloaded-images-k8s-v6-v1.19.4-docker-overlay2-amd64.tar.lz4: 486.35 MiB
Creating virtualbox VM (CPU = 2, Memory = 2600MB, Disk = 20000MB) ...
and here it stops.
I also thought that the problem was related to the lz4 format and that the machine was unable to decompress the image.
I then installed
liblz4-tool
but without success.
Could you help me?
Thanks
I have a C application that is using DPDK 19.11. Currently, the application runs with root permissions (using the sudo command). In addition, my application uses huge pages (1 GB).
Network devices using DPDK-compatible driver:
0000:02:00.0 'Ethernet Controller X710 for 10GbE backplane 1581'
drv=igb_uio unused=
I would like to run my application without root permissions, i.e. get rid of the "sudo" command.
I changed the permissions for these files/folders (rough commands shown after the list):
/sys/class/uio/uio*/device/resource*
/sys/class/uio/uio*/device/config
/dev/uio*
/dev/hugepages/*
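For illustration, the permission changes amount to commands along these lines (the user name and the exact modes are placeholders, not necessarily what I used):
sudo chown myuser /sys/class/uio/uio*/device/resource* /sys/class/uio/uio*/device/config
sudo chmod 666 /dev/uio*
sudo chmod -R 777 /dev/hugepages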
When I run my application without "sudo", I run into a problem in the rte_eal_init function. I get this error:
EAL: FATAL: Cannot use IOVA as 'PA' since physical addresses are not available
EAL: Cannot use IOVA as 'PA' since physical addresses are not available
My OS is Ubuntu 18.04, kernel 4.15.0-128-generic. I noticed that in the DPDK docs there is a remark about Running DPDK Applications Without Root Privileges: "since version 4.0, the kernel does not allow unprivileged processes to read the physical address information from the pagemaps file, making it impossible for those processes to be used by non-privileged users. In such cases, using the VFIO driver is recommended."
After reading the comments I tried to use vfio-pci.
I load the module using:
sudo modprobe vfio-pci enable_unsafe_noiommu_mode=1
I also changed the permissions for /dev/hugepages/* and /dev/vfio/*.
Running with vfio-pci and sudo was successful.
When running without sudo I got the same error:
EAL: FATAL: Cannot use IOVA as 'PA' since physical addresses are not available
EAL: Cannot use IOVA as 'PA' since physical addresses are not available
See also: https://doc.dpdk.org/guides/linux_gsg/enable_func.html#running-dpdk-applications-without-root-privileges
I wonder if someone has experience running a DPDK application without root on kernel 4.0 and above?
Also, an alternative solution would be to launch a simple DPDK application with root privileges that inits DPDK, and in parallel run another application without root privileges that consumes the packets and performs the business logic. Is that possible?
Thanks.
First, it makes sense to check whether you really need to use the unsafe mode with vfio-pci. Perhaps you just need to add intel_iommu=on iommu=pt to the kernel parameters to make the device work safely, i.e.:
modprobe vfio-pci
I haven't used the unsafe mode so far; perhaps the kernel even unconditionally disallows mappings for the vfio device when unsafe mode is enabled, for (obvious?) security reasons.
For running a dpdk application without root privileges you need to adjust the permissions of the right vfio device. For example, when the permissions look like this
# ls -l /dev/vfio/
total 0
crw-------. 1 root root 235, 0 2021-08-21 15:13 17
crw-rw-rw-. 1 root root 10, 196 2021-08-21 15:13 vfio
then /dev/vfio/17 is the device you've bound for dpdk, thus adjust its permission like this:
chown juser /dev/vfio/17
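If it is not obvious which entry under /dev/vfio corresponds to your NIC, the IOMMU group number can be looked up in sysfs; for example, using the PCI address from the question:
readlink /sys/bus/pci/devices/0000:02:00.0/iommu_group   # the last path component is the group number, e.g. .../iommu_groups/17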
A user process doesn't need extra permissions for mapping huge pages. You don't even have to mount hugetlbfs if you supply the --in-memory option to your dpdk program.
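For example, a minimal invocation sketch (the application name and core list are placeholders):
./my-dpdk-app --in-memory -l 0-1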
However, some hugepages must be reserved by root, e.g. during system boot. Example:
echo 4096 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
echo 8 > /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages
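Alternatively, the reservation can be made persistent via sysctl; note this only covers the default hugepage size and is not per-NUMA-node (drop-in file name is just an example):
echo "vm.nr_hugepages = 4096" | sudo tee /etc/sysctl.d/99-hugepages.conf
sudo sysctl --system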
Besides permissions, the default resource limits are likely too low, especially the memlock ones. If the limit is too low, the kernel logs something like this when starting the dpdk application:
kernel: vfio_pin_pages_remote: RLIMIT_MEMLOCK (65536) exceeded
And the dpdk application prints:
EAL: cannot set up DMA remapping, error 12 (Cannot allocate memory)
EAL: 0000:05:00.1 DMA remapping failed, error 12 (Cannot allocate memory)
Increasing the limits fixes this issue, e.g.:
cat /etc/security/limits.d/24-memlock.conf
# memlock unit: KiB
juser hard memlock 16777216
juser soft memlock 1048576
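After logging in again, the effective limit can be verified with ulimit (the value is reported in KiB):
ulimit -l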
DPDK should detect whether it should use IOVA as VA or PA. Using the switch enable_unsafe_noiommu_mode=1 tells DPDK that you have no IOMMU and that you will use IOVA as PA.
The problem is that running in PA mode requires root privileges, as you need access to the physical addresses.
That dpdk.org document you cited should do the trick. I was able to get DPDK running without root privileges in 20.02 in a docker container. However, there was another problem with the software we were running on top of DPDK and its interaction with the hugepage backing.
In the end, we decided to still run DPDK as root; however, we limited the capabilities of the container to the bare minimum set needed to run DPDK.
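For illustration only, a sketch of what such a locked-down container invocation can look like; the capability set, device paths and image name here are assumptions, not the exact ones we used:
docker run --rm -it \
  --cap-drop ALL \
  --cap-add IPC_LOCK --cap-add NET_ADMIN \
  --device /dev/vfio/vfio --device /dev/vfio/17 \
  -v /dev/hugepages:/dev/hugepages \
  my-dpdk-image    # hypothetical image name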
Steps to reproduce:
Run geth with parameters
--mine --minerthreads "1"
or
--mine --minerthreads 1
Expected behaviour:
Only 1 thread is used.
Actual behaviour:
All CPU threads (8) are used with a 100% load.
System information:
Geth version: 1.9.6
OS & Version: Linux (Ubuntu 18.04)
The flag only limits the threads used for mining, but your geth node is also syncing with other nodes, and that can occupy a huge amount of CPU resources.
We're trying to run DPDK example apps in a guest machine running Centos 7.5. The host is ESXi version 6.5.
I'm building dpdk on the guest machine where I'm trying to run it. I've tried both DPDK versions 18.05 and 18.08.
We have created five interfaces on ESXi for connection to our guest: one management port and four data ports. We're binding these four data ports to DPDK. The ports are all VMXNET3 interfaces. They are basically set up like the VMXNET3 interfaces in https://doc.dpdk.org/guides/nics/vmxnet3.html, using a vSwitch to connect to a physical interface. However, note that we do not have any VF interfaces as shown in that document, only VMXNET3 interfaces. Unfortunately the document does not give any details on how to do the setup.
A document from VMware also shows a very similar setup, but again with no details on how to set it up.
Fundamentally, the roadblock we are hitting is that the VMXNET3 interfaces are failing initialization when starting the DPDK example app. Here is what we see:
[root@rg-vm ~]# ./dpdk-18.08/examples/packet_ordering/build/packet_ordering -c 0x0e0 -- -p 0xf
EAL: Detected 24 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Probing VFIO support...
EAL: PCI device 0000:04:00.0 on NUMA socket -1
EAL: Invalid NUMA socket, default to 0
EAL: probe driver: 15ad:7b0 net_vmxnet3
eth_vmxnet3_dev_init(): Incompatible hardware version: 0
EAL: Requested device 0000:04:00.0 cannot be used
We see this for all four interfaces that we are trying to bind to DPDK. However, strangely, sometimes after a reboot, the first two interfaces initialize correctly. But after that first attempt, all four interfaces then fail the same way.
Here are the commands we're using to setup DPDK.
modprobe uio
insmod ./dpdk-18.08/build/build/kernel/linux/igb_uio/igb_uio.ko
./dpdk-18.08/usertools/dpdk-devbind.py --bind=igb_uio 04:00.0
./dpdk-18.08/usertools/dpdk-devbind.py --bind=igb_uio 0c:00.0
./dpdk-18.08/usertools/dpdk-devbind.py --bind=igb_uio 13:00.0
./dpdk-18.08/usertools/dpdk-devbind.py --bind=igb_uio 1b:00.0
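For completeness, the resulting bindings can be confirmed afterwards with the status command:
./dpdk-18.08/usertools/dpdk-devbind.py --status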
Note that we have also tried using uio_pci_generic, with the same results. We have not been able to get the vfio-pci driver to bind to the VMXNET3 interfaces.
I'm not sure it matters, but the physical interfaces on the other side of the vswitch that we're connecting to are:
17:00.0 Ethernet controller: Intel Corporation I350 Gigabit Fiber Network Connection (rev 01)
We have also tried using an Ethernet card based on the Intel 82576 chipset (the chipset DPDK shows being used in their documentation), and one based on the Intel X710. We see the same error with either of these cards as we did with the I350. So I think that eliminates the Ethernet hardware, which makes sense, as having the vSwitch between us and the Ethernet controller should make us agnostic to what it actually is.
We are running on a Dell R540. Also note that when we run CentOS 7.5 with DPDK on this hardware without VMware, everything works fine. Likewise, if we run in VMware but pass through the I350 interfaces to the VM (instead of using a vSwitch and VMXNET3), everything also works fine.
I've tried updating the kernel (3.10) to the latest (4.18) but still get the same error.
If I read the version register (VRRS, the one that causes this error) in the VMXNET3 PCI BAR registers using ethtool (before binding to DPDK), it looks fine (0xf). I've googled around a lot but can't seem to find much help on this. It is very possible the issue is with how I'm setting things up, but I can't find any information that details how else to do it.
Any help would be greatly appreciated. Thanks!
Try these steps:
cd /etc/default
vi grub
Edit GRUB_CMDLINE_LINUX and add "nopku":
GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=centos/root rd.lvm.lv=centos/swap rhgb quiet nopku transparent_hugepage=never log_buf_len=8M"
Regenerate the grub config: sudo grub2-mkconfig -o /boot/grub2/grub.cfg
Reboot the VM and try DPDK.
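After the reboot, you can confirm the flag took effect before retrying DPDK:
cat /proc/cmdline    # the output should now include "nopku"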