DPDK - interrupts rather than polling

Is it possible to configure DPDK so that the NIC sends an interrupt whenever a packet is received (rather than turning off interrupts and having the core poll on the RX queue)? I know this seems counterintuitive but there is a use case I have in mind that could benefit from this.
DPDK claims to allow you to use interrupts for RX queues (you can call rte_eth_dev_rx_intr_enable and pass a port/queue pair as arguments), but after digging through the code, this seems misleading. There is a polling thread that calls epoll_wait and, upon receipt of a packet, calls eal_intr_process_interrupts. This function goes through a list of callback functions (which are supposed to be the interrupt handlers) and executes each one, then calls epoll_wait again (i.e. it runs in an infinite loop).
Is my understanding of how DPDK handles "interrupts" correct? In other words, even if you turn "interrupts" on, DPDK is really just polling in the background and then executing callback functions (so there are no interrupts)?

Is my understanding of how DPDK handles "interrupts" correct?
DPDK is a user space application. Unfortunately, there is no magic way to deliver an interrupt callback directly to a user space application.
So NIC interrupts are serviced in the kernel anyway; the kernel then notifies user space via an eventfd. The user space thread waits for the eventfd notification using epoll_wait.
In other words, even if you turn "interrupts" on, DPDK is really just polling in the background and then executing callback functions (so there are no interrupts)?
If there is no data to receive, the DPDK thread should simply block on epoll_wait rather than busy-poll.
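For reference, here is a minimal sketch of what an interrupt-driven RX loop looks like with this mechanism, modeled on the pattern in DPDK's l3fwd-power sample: poll while traffic is flowing, then arm the RX interrupt and block in rte_epoll_wait once the queue runs dry. Port and queue setup (rte_eal_init, rte_eth_dev_configure with intr_conf.rxq = 1, etc.) is assumed to have happened elsewhere.

    #include <rte_ethdev.h>
    #include <rte_interrupts.h>
    #include <rte_mbuf.h>

    // Interrupt-mode RX loop: poll while packets arrive, sleep on the
    // eventfd (via rte_epoll_wait) when the queue is empty.
    static void rx_loop(uint16_t port, uint16_t queue) {
        // Register the queue's interrupt eventfd with this thread's epoll
        // instance.
        rte_eth_dev_rx_intr_ctl_q(port, queue, RTE_EPOLL_PER_THREAD,
                                  RTE_INTR_EVENT_ADD, NULL);
        struct rte_mbuf *bufs[32];
        for (;;) {
            uint16_t n = rte_eth_rx_burst(port, queue, bufs, 32);
            if (n > 0) {
                for (uint16_t i = 0; i < n; i++)
                    rte_pktmbuf_free(bufs[i]);   // ...or process the packet
                continue;                        // keep polling under load
            }
            // Queue ran dry: arm the interrupt, then block until the kernel
            // signals the eventfd (i.e. the NIC raised an interrupt).
            rte_eth_dev_rx_intr_enable(port, queue);
            struct rte_epoll_event ev;
            rte_epoll_wait(RTE_EPOLL_PER_THREAD, &ev, 1, -1 /* no timeout */);
            rte_eth_dev_rx_intr_disable(port, queue);   // back to polling
        }
    }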

Is QTcpSocket really full duplex?

BSD stream sockets are full duplex, meaning two connected parties can both send/receive at the same time.
A QTcpSocket (Qt's socket implementation) supports asynchronous, non-blocking operation, but may only be used from a single thread; see the Qt docs:
Event driven objects may only be used in a single thread. Specifically, this applies to the timer mechanism and the network module.
Let's say I want a transmit/tx thread and a separate receive/rx thread to use the same socket and send/receive data at the same time.
In my understanding this can be 'done' via Qt signals/slots, but the socket thread will never really perform the send() and the receive() simultaneously. It just runs the event loop, which does this in a serial fashion and emits the signals when a send/receive is done.
Yes, my rx and tx threads can work concurrently and handle the notifications via qt slots, but the socket itself is never really used in full duplex mode.
Is it correct to say that, considering one endpoint only, the send() and receive() calls in the socket thread are always serial, never simultaneous (because the event loop runs in a single thread)?
In my understanding this can be 'done' via qt signals/slots, but the socket thread will never really perform the send() and the receive() simultaneously. It just runs the event loop which will do this in a serial fashion and emit the signals when send/receive is done.
True, but keep in mind that the kernel buffers incoming and outgoing data, and QTcpSocket sets the socket to non-blocking, so that the send() and recv() calls always return immediately and never block the event loop. That means that the actual processes of sending and receiving data will happen simultaneously (inside the kernel), even if the (more-or-less instantaneous) send() and recv() calls technically do not. (*)
Yes, my rx and tx threads can work concurrently and handle the notifications via qt slots, but the socket itself is never really used in full duplex mode. Is this correct?
That is not correct -- the socket's data streams can (and do) flow both ways across the network simultaneously, so the socket really is full-duplex. The full-duplex capability is present whether you are using a single thread or multiple threads.
(*) You can test this with a single-threaded Qt program that uses a QTcpSocket to send or receive data, by simply disconnecting your computer's Ethernet cable during a large data transfer. If the QTcpSocket's send() or recv() calls blocked until completion, that would block the GUI thread and cause your GUI to become unresponsive until you reconnect the cable (or until the TCP connection times out after several minutes).
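To make this concrete, here is a minimal single-threaded sketch (the host and port are placeholders) where one QTcpSocket both sends and receives: the event loop dispatches readyRead serially, yet the kernel moves both data streams concurrently.

    #include <QCoreApplication>
    #include <QObject>
    #include <QTcpSocket>

    int main(int argc, char *argv[]) {
        QCoreApplication app(argc, argv);
        QTcpSocket socket;
        // write() only queues data and returns immediately; the kernel
        // transmits it in the background while we stay in the event loop.
        QObject::connect(&socket, &QTcpSocket::connected, [&]() {
            socket.write("hello\n");
        });
        // readyRead fires from the event loop; readAll() drains the kernel's
        // receive buffer, which fills independently of our writes.
        QObject::connect(&socket, &QTcpSocket::readyRead, [&]() {
            QByteArray data = socket.readAll();
            socket.write(data);   // echo back; still non-blocking
        });
        socket.connectToHost("example.com", 7);   // placeholder echo server
        return app.exec();
    }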

Making the application passive, triggered by events?

I'm studying some code for RS232 communication with Borland C++. The implementation reads data from the port by polling its status on a timer: events check whether the status of the port has changed, and if it has, they trigger the data-reading subroutine.
However, polling seems wasteful: resources are spent on checks that usually find nothing. Could the program monitor the port passively, without any aggressive polling? In other words, the program would sleep unless events triggered by incoming data on the port activate it.
Is this idea possible?
I think the Reactor design pattern is appropriate for your requirements. Reactor is based on the select system call (which is available in both Unix and Windows environments). From the referenced document:
Blocks awaiting events to occur on a set of Handles. It returns when it is possible to initiate an operation on a Handle without blocking. A common demultiplexer for I/O events is select [1], which is an event demultiplexing system call provided by the UNIX and Win32 OS platforms. The select call indicates which Handles can have operations invoked on them synchronously without blocking the application process.
You can find this pattern encoded as a library in several frameworks, such as ACE and Boost.
If you are working with the Win32 API for reading the serial port, you can call ReadFile. It will suspend until it has the number of bytes you requested, or until a timeout that you can set expires. If your program has a GUI, the serial read should run in a secondary thread so the GUI thread can react to any received Windows messages.
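For illustration, a minimal sketch of such a blocking read loop for a worker thread ("COM3" is a placeholder port name). The COMMTIMEOUTS combination below is the documented way to make ReadFile return as soon as any data is buffered, or after the timeout if none arrives, so the thread sleeps instead of polling:

    #include <windows.h>
    #include <cstdio>

    int main() {
        HANDLE port = CreateFileA("COM3", GENERIC_READ | GENERIC_WRITE,
                                  0, NULL, OPEN_EXISTING, 0, NULL);
        if (port == INVALID_HANDLE_VALUE)
            return 1;

        // Return immediately if data is already buffered; otherwise block
        // up to 1 s waiting for the first byte (documented MAXDWORD combo).
        COMMTIMEOUTS t = {};
        t.ReadIntervalTimeout = MAXDWORD;
        t.ReadTotalTimeoutMultiplier = MAXDWORD;
        t.ReadTotalTimeoutConstant = 1000;
        SetCommTimeouts(port, &t);

        char buf[256];
        for (;;) {
            DWORD got = 0;
            if (!ReadFile(port, buf, sizeof(buf), &got, NULL))
                break;                 // port error (e.g. device removed)
            if (got > 0) {
                // hand the bytes to the data-reading subroutine
                printf("received %lu bytes\n", got);
            }
            // got == 0 means the timeout elapsed with no data; wait again
        }
        CloseHandle(port);
        return 0;
    }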

Can I WaitForMultipleObjects on an event and an IOCompletionPort having input?

I am adding support for a FTDI driver to an existing code base which communicates with serial ports and pipes using Overlapped IO and an IOCompletionPort. I would like to interface directly with the FTD2xx.dll rather than use the virtual com port function (http://www.ftdichip.com/Support/Documents/ProgramGuides/D2XX_Programmer%27s_Guide%28FT_000071%29.pdf).
The problem is that, as far as I understand, the FTD2xx.dll emulates Overlapped IO but is not compatible with an IOCompletionPort. It is, however, possible to pass in an event which is set whenever anything has changed in the driver's internal state. The program I'm updating has very low throughput but requires insanely low latency (real-time communication with an embedded system).
So my question is how can I wait for either an event to be signaled or an IOCompletionPort to not be empty? Preferably not using any other threads.
Or, alternatively, could I use RegisterWaitForSingleObject with a callback which posts a custom message to the IOCompletionPort? I understand this uses the thread pool; could this increase latency in cases where the system is busy? (I can set my own threads to high priority, but I don't know anything about the priorities of the thread pool.)
Edit: If I use the WT_EXECUTEINWAITTHREAD flag in RegisterWaitForSingleObject what thread is this "waiter thread" and what priority does it have?
An IOCP is not a waitable object, so you cannot use it directly with any of the wait functions. What you can do is create a separate waitable event via CreateEvent() and then have a separate thread call GetQueuedCompletionStatus/Ex() and signal the event when an IOCP packet arrives.
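As a sketch of the alternative raised in the question (RegisterWaitForSingleObject forwarding the FTDI event into the IOCP as a custom packet, so the main loop only ever blocks in GetQueuedCompletionStatus): the completion key FTDI_EVENT_KEY is an arbitrary placeholder, and wiring the event to the driver via FT_SetEventNotification is assumed from the linked D2XX guide.

    #include <windows.h>

    static const ULONG_PTR FTDI_EVENT_KEY = 1;   // arbitrary placeholder key

    // Runs on a thread-pool wait thread: just forward a packet to the IOCP.
    static VOID CALLBACK OnFtdiEvent(PVOID iocp, BOOLEAN /*timedOut*/) {
        PostQueuedCompletionStatus((HANDLE)iocp, 0, FTDI_EVENT_KEY, NULL);
    }

    int main() {
        HANDLE iocp = CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, 1);
        HANDLE ftdiEvent = CreateEvent(NULL, FALSE, FALSE, NULL); // auto-reset

        HANDLE wait = NULL;
        RegisterWaitForSingleObject(&wait, ftdiEvent, OnFtdiEvent, iocp,
                                    INFINITE, WT_EXECUTEINWAITTHREAD);
        // ...pass ftdiEvent to the driver, e.g. via FT_SetEventNotification()

        DWORD bytes; ULONG_PTR key; LPOVERLAPPED ov;
        for (;;) {
            if (!GetQueuedCompletionStatus(iocp, &bytes, &key, &ov, INFINITE))
                continue;              // a pending overlapped op failed
            if (key == FTDI_EVENT_KEY) {
                // FTDI state changed: query the driver (FT_GetStatus/FT_Read)
            } else {
                // normal serial/pipe completion packet
            }
        }
        UnregisterWait(wait);
        return 0;
    }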

Priority of kernel modules and SCHED_RR threads

I have an embedded Linux platform (the Beagleboard, running Angstrom Linux) with two devices connected:
a Laser range finder (Hokuyo UTM 30) connected via USB
a custom external board connected via SPI
We have written a Linux kernel module which is responsible for the SPI data transfer. It has an IRQ handler in which spi_async is called, which in turn causes an asynchronous callback method to be invoked.
My C++ application consists of three threads:
a main thread for data processing
a laser polling thread
an SPI polling thread
I am experiencing problems which seem to be caused by how the modules described above interact.
When I switch off the USB device (laser range finder), I receive all SPI messages correctly (1 message every 3 ms; message length divided by data rate is <1 ms), independent of thread scheduling.
When I switch on the USB device and run my program with normal thread scheduling (SCHED_OTHER, priority 0, no nice level set), about 1% of the messages are "lost" because the callback method of spi_async is still running when the next IRQ occurs. (I could handle this case differently in order not to lose the messages, so this is not a big issue.)
With the USB device turned on and the program running with SCHED_RR and
priority = 10 for main thread
priority = 10 for SPI reading thread
priority = 4 for USB/Laser polling thread
then I lose 40% of the messages because the IRQ is triggered again before the SPI callback method is called! (I could still maybe find a workaround, but the problem is that I need fast response times, which can no longer be achieved in this case.) I need to use the thread scheduling and the laser device, so I am looking for a way to solve this case.
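(For reference, a minimal sketch of how these SCHED_RR priorities might be applied from user space with pthread_setschedparam; the priority values are the ones listed above, and running with SCHED_RR normally requires root or CAP_SYS_NICE.)

    #include <pthread.h>
    #include <sched.h>
    #include <cstdio>
    #include <cstring>

    // Put the given thread under SCHED_RR at the given priority (1..99).
    // Returns true on success.
    static bool set_rr_priority(pthread_t thread, int priority) {
        sched_param sp{};
        sp.sched_priority = priority;
        int err = pthread_setschedparam(thread, SCHED_RR, &sp);
        if (err != 0)
            fprintf(stderr, "pthread_setschedparam: %s\n", strerror(err));
        return err == 0;
    }

    // e.g. at the start of each thread:
    //   set_rr_priority(pthread_self(), 10);  // main + SPI reading threads
    //   set_rr_priority(pthread_self(), 4);   // USB/laser polling thread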
Question 1:
My assumption was that IRQ handlers and the callbacks triggered by spi_async in kernel space have a higher priority than any thread running in user space (no matter if SCHED_RR or SCHED_OTHER). This would mean that switching to SCHED_RR in my application shouldn't slow down the SPI transfer, but this assumption seems to be very wrong. Is it?
Question 2:
How can I determine what happens here? Which debugging aids exist? (Or maybe you don't need any further information?) The main question for me is: why do I experience the problems only when the laser device is turned on? Could the USB driver consume that much time?
----- EDIT:
I have made the following observation:
The spi_async callback calls wake_up_interruptible(&mydata->readq); (with wait_queue_head_t readq;). From user space (my app) I call a function which results in poll_wait(file, &mydata->readq, wait); when the poll returns, user space calls read().
When my application runs with SCHED_OTHER, I can see that the callback method finishes before the read() method in my kernel module is entered.
When my application runs with SCHED_RR, read() is entered before the callback exits.
This seems to prove that the priority of the user space threads is higher than the priority of the callback method's context. Is there any way to change this behaviour while still using SCHED_RR for my application's threads?
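(For context, the user-space side of that sequence is roughly the following sketch; fd is the module's device file descriptor.)

    #include <poll.h>
    #include <unistd.h>

    // Sleep in poll() until the driver's wake_up_interruptible() marks the
    // queue readable, then read() the data.
    ssize_t wait_and_read(int fd, char *buf, size_t len) {
        struct pollfd pfd = { fd, POLLIN, 0 };
        if (poll(&pfd, 1, -1) <= 0 || !(pfd.revents & POLLIN))
            return -1;
        return read(fd, buf, len);
    }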
Not all kernel threads have an RT priority. Imagine a thread that wakes up periodically to do some background work. You don't want this thread to preempt your RT thread. So I guess your first assumption is wrong.
Based on your other questions:
your main processing loop receives SPI data through a queue
the spi processing thread feeds the main processing queue
It seems your main processing thread gets in the way of the SPI driver thread responsible for the SPI data transfer.
Here is what happens:
an IRQ is fired
spi_async is called, which means a data transfer is queued, to be picked up by a thread created by the SPI master driver.
the SPI master thread competes with your main processing thread and the laser thread, but this kernel thread has no RT priority, so it loses every time one of the RR threads is running.
What you can do is go back to normal scheduling while playing with the various CONFIG_PREEMPT_ options, or mess with the SPI master driver to ensure that any delayed work is queued with enough priority, or even not queued at all.

Programmatically Interrupting Serial I/O when USB Device is Removed - C++

I have an application wherein serial I/O is conducted with an attached USB device via a virtual COM port. When surprise removal of the device is detected, what would be the best way to stop the serial I/O? Should I simply close the port? Or should there be a global variable, maintained to indicate the presence of the device, that is checked in each serial I/O function prior to attempting to transmit/receive data? Or should it be a combination of the two, or something else? Thanks.
I'm assuming you are running Windows.
This depends on how you have designed your communication flow.
I have a BasePort class from which I have derived a COMPort class (and many other communication classes). The COMPort object creates one TXThread and one RXThread object. These threads wait with WaitForMultipleObjects() for the OVERLAPPED structure's event to signal that the read or write operation has finished.
The TX thread goes to sleep if there is nothing to do and is woken up by the TXWrite function (the data between the main process and the thread goes through a thread-safe FIFO buffer).
In this case they also need to wait for an event signalling that the port has closed, so they can cancel any pending operations and exit (the threads exit and get deleted).
To detect whether the USB port is connected or disconnected, I listen for the Windows message WM_DEVICECHANGE. If the port is disconnected, I set the event and wait for the threads to exit before the Port class closes and deletes the port.
I have found this approach very reliable and safe. It's the core of a communication platform I designed over 8 years ago, and it's still kicking.
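To illustrate the core of that RX thread (names and buffer sizes are illustrative; the port handle must have been opened with FILE_FLAG_OVERLAPPED):

    #include <windows.h>

    // RX thread body: wait on either the overlapped read completing or the
    // "port closed" event, so a pending read can be cancelled cleanly when
    // the USB device is removed.
    void rx_thread(HANDLE port, HANDLE closeEvent) {
        OVERLAPPED ov = {};
        ov.hEvent = CreateEvent(NULL, TRUE, FALSE, NULL);  // manual-reset
        char buf[256];
        for (;;) {
            DWORD got = 0;
            if (!ReadFile(port, buf, sizeof(buf), &got, &ov) &&
                GetLastError() != ERROR_IO_PENDING)
                break;                         // hard I/O failure

            HANDLE handles[2] = { ov.hEvent, closeEvent };
            DWORD r = WaitForMultipleObjects(2, handles, FALSE, INFINITE);
            if (r != WAIT_OBJECT_0) {          // close event (or wait error)
                CancelIo(port);                // abandon the pending read
                break;
            }
            if (GetOverlappedResult(port, &ov, &got, FALSE) && got > 0) {
                // push the received bytes into the thread-safe FIFO
            }
            ResetEvent(ov.hEvent);
        }
        CloseHandle(ov.hEvent);
    }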