Linux poll on serial transmission end - c++

I'm implementing RS-485 on an ARM development board using a serial port and a GPIO for data enable.
I set data enable high before sending, and I want it set low once transmission is complete.
In blocking mode this is simple:
//fd = open("/dev/ttyO2", ...);
write(fd, data, datalen);
tcdrain(fd); //Wait until all data is sent
I want to change from blocking mode to non-blocking mode and use poll with fd, but I don't see any poll event corresponding to 'transmission complete'.
How can I get notified when all data has been sent?
System: linux
Language: c++
Board: BeagleBone Black

I don't think it's possible. You'll either have to run tcdrain in another thread and have it notify the main thread, or use a timeout on poll and check after each timeout whether the output has been drained.
You can use the TIOCOUTQ ioctl to get the number of bytes in the output buffer and tune the timeout according to baud rate. That should reduce the amount of polling you need to do to just once or twice. Something like:
enum { writing, draining, idle } write_state;
while (1) {
    int write_event = -1, timeout = -1;
    if (write_state == writing) {
        poll_fds[poll_len].fd = write_fd;
        poll_fds[poll_len].events = POLLOUT;
        write_event = poll_len++;
    } else if (write_state == draining) {
        int outq;
        ioctl(write_fd, TIOCOUTQ, &outq);
        if (outq == 0) {
            // Output queue is empty: this is the point to lower data enable.
            write_state = idle;
        } else {
            // 10 bits per byte, 1000 milliseconds in a second
            timeout = outq * 10 * 1000 / baud_rate;
            if (timeout < 1) {
                timeout = 1;
            }
        }
    }
    int r = poll(poll_fds, poll_len, timeout);
    if (write_state == writing && r > 0 && (poll_fds[write_event].revents & POLLOUT)) {
        DataEnable.Set(true); // Gets set even if already set.
        int n = write(write_fd, write_data, write_datalen);
        write_data += n;
        write_datalen -= n;
        if (write_datalen == 0) {
            write_state = draining;
        }
    }
}

Stale thread, but I have been working on RS-485 with a 16550-compatible UART under Linux and find the following:
tcdrain works, but it adds a delay of 10 to 20 ms. It seems to be polled.
The value returned by TIOCOUTQ seems to count bytes in the OS buffer but NOT bytes in the UART FIFO, so it may underestimate the delay required if transmission has already started.
I am currently using CLOCK_MONOTONIC to timestamp each send, calculating when the send should be complete, then checking that time against the next send and delaying if necessary. Sucks, but seems to work.
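The timestamp approach described above can be sketched roughly as follows. This is a minimal sketch, not the poster's code: the helper names (tx_duration_ns, tx_deadline_ns, sleep_until_ns) are hypothetical, and it assumes 10 bits on the wire per byte (8N1). It does not model any driver or FIFO latency before the first bit goes out, so some extra margin may be needed in practice.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <time.h>

// Nanoseconds needed to shift `bytes` characters out at `baud` bits/s,
// assuming `bits_per_char` bits on the wire per byte (10 for 8N1).
int64_t tx_duration_ns(int64_t bytes, int64_t baud, int64_t bits_per_char = 10) {
    assert(bits_per_char > 0);
    if (bytes <= 0 || baud <= 0) return 0;
    const int64_t bits = bytes * bits_per_char;
    // Round up so the estimate never ends before the last stop bit.
    return (bits * 1000000000LL + baud - 1) / baud;
}

// Current CLOCK_MONOTONIC time in nanoseconds.
int64_t monotonic_now_ns() {
    timespec ts{};
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return static_cast<int64_t>(ts.tv_sec) * 1000000000LL + ts.tv_nsec;
}

// Estimated time at which a write issued at `start_ns` has left the wire.
int64_t tx_deadline_ns(int64_t start_ns, int64_t bytes, int64_t baud) {
    return start_ns + tx_duration_ns(bytes, baud);
}

// Sleep until the absolute CLOCK_MONOTONIC time `deadline_ns`,
// retrying if the sleep is interrupted by a signal.
void sleep_until_ns(int64_t deadline_ns) {
    timespec ts{};
    ts.tv_sec = deadline_ns / 1000000000LL;
    ts.tv_nsec = deadline_ns % 1000000000LL;
    while (clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &ts, nullptr) == EINTR) {
    }
}
```

A caller would record monotonic_now_ns() just before write(), compute the deadline from the number of bytes written, and call sleep_until_ns() on it before lowering data enable or starting the next send.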


Using interrupts with DPDK

I'm trying to enable interrupts in DPDK so that my network receive thread can sleep on an epoll until packets arrive. I am using the igb_uio and ixgbe drivers with an Intel 82599ES 10Gbps NIC.
I'm doing roughly the following to enable the interrupts, but the epoll never indicates that packets have arrived. The thread only handles packets when the epoll times out. I don't even see interrupts arrive from the device when monitoring /proc/interrupts.
port_conf.intr_conf.rxq = 1;
CHECK_EQ(rte_eth_dev_rx_intr_ctl_q(kPort, kQueue, RTE_EPOLL_PER_THREAD,
CHECK_EQ(rte_eth_dev_rx_intr_enable(kPort, kQueue), 0);
rte_epoll_event event;
while (true) {
int n = rte_epoll_wait(RTE_EPOLL_PER_THREAD, &event, /*maxevents=*/1,
if (n == 0) {
// Timeout expired.
} else {
// Received RX interrupt.
Given that I don't see anything coming through in /proc/interrupts, I am going to start digging through the ixgbe driver. However, I wanted to ask here first to see if my setup is missing anything obvious since this is supposed to be easy to do. I based my code closely on the l3fwd-power example.
I talked with Dmitry Kozlyuk who pointed out that the ixgbe driver requires interrupts to be re-armed before each call to rte_epoll_wait. So rte_eth_dev_rx_intr_enable should be called before each call to rte_epoll_wait in the while loop:
rte_epoll_event event;
while (true) {
CHECK_EQ(rte_eth_dev_rx_intr_enable(kPort, kQueue), 0);
int n = rte_epoll_wait(RTE_EPOLL_PER_THREAD, &event, /*maxevents=*/1,
if (n == 0) {
// Timeout expired.
} else {
// Received RX interrupt.
You need to do the re-arm before each call to rte_epoll_wait for some other drivers, too, such as vmxnet3.
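As a loose analogy only (this is plain Linux epoll, not DPDK, and the helper names are hypothetical), the same "one notification per arm" behaviour can be reproduced with EPOLLONESHOT: after one event is delivered, the descriptor stays silent until it is re-armed, much as the RX interrupt has to be re-enabled before each wait. A minimal sketch, assuming an eventfd as the event source:

```cpp
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>
#include <cstdint>

// Register `fd` so that it reports readability once and is then disabled.
bool add_oneshot(int epfd, int fd) {
    epoll_event ev{};
    ev.events = EPOLLIN | EPOLLONESHOT;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == 0;
}

// Re-arm a descriptor that was disabled after its one-shot event fired.
bool rearm(int epfd, int fd) {
    epoll_event ev{};
    ev.events = EPOLLIN | EPOLLONESHOT;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev) == 0;
}

// Make the eventfd readable by adding 1 to its counter.
bool signal_event(int efd) {
    uint64_t one = 1;
    return write(efd, &one, sizeof(one)) == static_cast<ssize_t>(sizeof(one));
}

// Wait up to `timeout_ms`; returns the number of ready events (0 on timeout).
int wait_once(int epfd, int timeout_ms) {
    epoll_event ev{};
    return epoll_wait(epfd, &ev, 1, timeout_ms);
}
```

Without the rearm() call, a second wait_once() times out even though the eventfd is still readable, which mirrors the symptom in the question where the wait only ever returned on timeout.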

Why does setting SO_SNDBUF and SO_RCVBUF destroy performance?

Running in Docker on macOS, I have a simple server and client set up to measure how fast I can allocate data on the client and send it to the server. The tests are done over loopback (in the same Docker container). The message size for my tests was 1000000 bytes.
When I set SO_RCVBUF and SO_SNDBUF to their respective defaults, the performance halves.
SO_RCVBUF defaults to 65536 and SO_SNDBUF defaults to 1313280 (retrieved by calling getsockopt and dividing by 2).
When I test setting neither buffer size, I get about 7 Gb/s throughput.
When I set one buffer or the other to the default (or higher) I get 3.5 Gb/s.
When I set both buffer sizes to the default I get 2.5 Gb/s.
Server code: (cs is an accepted stream socket)
void tcp_rr(int cs, uint64_t& processed) {
    /* I remove this entire thing and performance improves */
    if (setsockopt(cs, SOL_SOCKET, SO_RCVBUF, &ENV.recv_buf, sizeof(ENV.recv_buf)) == -1) {
        perror("RCVBUF failure");
        return;
    }
    char *buf = (char *)malloc(ENV.msg_size);
    while (true) {
        int recved = 0;
        while (recved < ENV.msg_size) {
            int recvret = recv(cs, buf + recved, ENV.msg_size - recved, 0);
            if (recvret <= 0) {
                if (recvret < 0) {
                    perror("Recv error");
                }
                free(buf);
                return;
            }
            processed += recvret;
            recved += recvret;
        }
    }
}
Client code: (s is a connected stream socket)
void tcp_rr(int s, uint64_t& processed, BenchStats& stats) {
    /* I remove this entire thing and performance improves */
    if (setsockopt(s, SOL_SOCKET, SO_SNDBUF, &ENV.send_buf, sizeof(ENV.send_buf)) == -1) {
        perror("SNDBUF failure");
        return;
    }
    char *buf = (char *)malloc(ENV.msg_size);
    while (stats.elapsed_millis() < TEST_TIME_MILLIS) {
        int sent = 0;
        while (sent < ENV.msg_size) {
            int sendret = send(s, buf + sent, ENV.msg_size - sent, 0);
            if (sendret <= 0) {
                if (sendret < 0) {
                    perror("Send error");
                }
                free(buf);
                return;
            }
            processed += sendret;
            sent += sendret;
        }
    }
    free(buf);
}
Zeroing in on SO_SNDBUF:
The default appears to be: net.ipv4.tcp_wmem = 4096 16384 4194304
If I setsockopt to 4194304 and getsockopt (to see what's currently set) it returns 425984 (10x less than I requested).
Additionally, it appears a setsockopt sets a lock on buffer expansion (for send, the lock's name is SOCK_SNDBUF_LOCK which prohibits adaptive expansion of the buffer). The question then is - why can't I request the full size buffer?
Clues for what is going on come from the kernel source handler for SO_SNDBUF (and SO_RCVBUF, but we'll focus on SO_SNDBUF below).
net/core/sock.c contains implementations for the generic socket operations, including the SOL_SOCKET getsockopt and setsockopt handlers.
Examining what happens when we call setsockopt(s, SOL_SOCKET, SO_SNDBUF, ...):
case SO_SNDBUF:
    /* Don't error on this BSD doesn't and if you think
     * about it this is right. Otherwise apps have to
     * play 'guess the biggest size' games. RCVBUF/SNDBUF
     * are treated in BSD as hints
     */
    val = min_t(u32, val, sysctl_wmem_max);
set_sndbuf:
    sk->sk_userlocks |= SOCK_SNDBUF_LOCK;
    sk->sk_sndbuf = max_t(int, val * 2, SOCK_MIN_SNDBUF);
    /* Wake up sending tasks if we upped the value. */
    sk->sk_write_space(sk);
    break;

case SO_SNDBUFFORCE:
    if (!capable(CAP_NET_ADMIN)) {
        ret = -EPERM;
        break;
    }
    goto set_sndbuf;
Some interesting things pop out.
First of all, we see that the maximum possible value is sysctl_wmem_max, a setting which is difficult to pin down within a Docker container. We know from the context above that it is likely 212992 (half of the 425984 you got back after trying to set 4194304, since the kernel stores twice the clamped value).
Secondly, we see SOCK_SNDBUF_LOCK being set. This setting is, in my opinion, not well documented in the man pages, but it appears to lock dynamic adjustment of the buffer size.
For example, in the function tcp_should_expand_sndbuf we get:
static bool tcp_should_expand_sndbuf(const struct sock *sk)
{
    const struct tcp_sock *tp = tcp_sk(sk);

    /* If the user specified a specific send buffer setting, do
     * not modify it.
     */
    if (sk->sk_userlocks & SOCK_SNDBUF_LOCK)
        return false;
    /* remainder of the function omitted */
}
So what is happening in your code? You attempt to set the max value as you understand it, but it is being truncated to something 10x smaller by sysctl_wmem_max (the net.core.wmem_max sysctl). This is then made far worse by the fact that setting this option locks the buffer to that smaller size. The strange part is that dynamic resizing of the buffer apparently doesn't have this same restriction, and can go all the way to the max.
If you look at the first code snippet above, you see the SO_SNDBUFFORCE option. This disregards sysctl_wmem_max and allows you to set essentially any buffer size, provided you have the right permissions.
It turns out that processes launched in generic Docker containers don't have CAP_NET_ADMIN, so in order to use this socket option you must run in --privileged mode. However, if you do, and if you force the max size, you will see your benchmark return the same throughput as not setting the option at all and allowing the buffer to grow dynamically to the same size.
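To see the clamping and doubling described above on a given machine, a small check along these lines may help. It is a minimal sketch for Linux: get_sndbuf and request_sndbuf are hypothetical helper names, and the fallback order (SO_SNDBUFFORCE first, then SO_SNDBUF) is just one reasonable choice. Note that either call sets SOCK_SNDBUF_LOCK, so autotuning is disabled for that socket afterwards.

```cpp
#include <sys/socket.h>
#include <cassert>

// Returns the value reported by getsockopt(SO_SNDBUF), or -1 on error.
// On Linux this is the doubled value the kernel stored in sk_sndbuf.
int get_sndbuf(int fd) {
    int val = 0;
    socklen_t len = sizeof(val);
    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &val, &len) == -1) return -1;
    return val;
}

// Requests a send buffer of `bytes`. Tries SO_SNDBUFFORCE first (needs
// CAP_NET_ADMIN, ignores wmem_max) and falls back to SO_SNDBUF, which is
// clamped to wmem_max. Returns the size the kernel reports, or -1 on error.
int request_sndbuf(int fd, int bytes) {
    assert(bytes > 0);
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUFFORCE, &bytes, sizeof(bytes)) == -1) {
        if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes)) == -1) {
            return -1;
        }
    }
    return get_sndbuf(fd);
}
```

Comparing the return value of request_sndbuf(fd, 4194304) against 2 * 4194304 shows whether the request was honoured in full or clamped by wmem_max.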

ESP8266-01 does not react to AT Commands over UART with TM4C123GH6PM

I am trying to connect my TM4C123GH6PM microcontroller from Texas Instruments to my smartphone and use it to control an alarm clock and LED lights. (The LEDs are switched by a transistor, which is driven from a GPIO pin.)
I have some experience with coding in C++ and with the TM4C123GH6PM, but I am still learning a lot, so please excuse any foolish mistakes I might have made.
I want to connect the ESP8266 to the microcontroller using UART and the TivaWare framework.
I have written some code and my UART works correctly (I tested it by sending chars from UART 4 to UART 3).
According to the ESP8266 AT command documentation, it should respond to "AT" with "OK". But whenever I send something to the ESP, it responds with exactly what I sent to it. I checked the wiring, and I don't think that is the issue. Please correct me if the wiring is wrong.
ESP -> TM4C123GH6PM:
VCC -> 3.3V
Tx -> Rx (UART3 / PC6)
Rx -> Tx (UART4 / PC5)
CH_PD -> 3.3V
I also checked the power consumption of the ESP. Everything is powered from the USB port of my laptop, since that helps keep the cable mess down. The ESP draws about 150 mA from the computer, but the port can provide a lot more: I checked with some LEDs, and 400 mA is not a problem.
Can anyone help me? I have been working on this for over two days now and can't find a solution. Why is the ESP not responding correctly to the AT command? The blue light is on when the code is running.
PS: The attached code contains also code for the alarm clock control and LEDs. I attached it, since it could be part of the problem, but some of it is commented out and most of it is not used.
#include "driverlib/rom.h"
// stores the time since system start in ms
uint32_t systemTime_ms;
// bools for controlling the alarm clock and LEDs
bool an_aus = false;
bool alarm_clock = false;
void InterruptHandlerTimer0A (void)
// Clear the timer interrupt flag to avoid calling it up again directly
// increase the ms counter by 1 ms
void clockSetup(void)
uint32_t timerPeriod;
//configure clock
//activate peripherals for the timer
// configure timers as 32 bit timers in periodic mode
// set the variable timerPeriod to the number of periods to generate a timeout every ms
timerPeriod = (SysCtlClockGet()/1000);
// pass the variable timerPeriod to the TIMER-0-A
TimerLoadSet(TIMER0_BASE, TIMER_A, timerPeriod-1);
// register the InterruptHandlerTimer0A function as an interrupt service routine
TimerIntRegister(TIMER0_BASE, TIMER_A, &(InterruptHandlerTimer0A));
// activate the interrupt on TIMER-0-A
// generate an interrupt when TIMER-0-A generates a timeout
// all interrupts are activated
// start the timer
TimerEnable(TIMER0_BASE, TIMER_A);
void UART (void)
//configure UART 4:
//GPIO pins for transmitting and receiving
//configure UART 8Bit, no parity, baudrat 38400
//configure UART 3:
void delay_ms(uint32_t waitTime)
// Saves the current system time in ms
uint32_t aktuell = systemTime_ms;
// Wait until the current system time corresponds to the sum of the time at the start of the delay and the waiting time
while(aktuell + waitTime > systemTime_ms);
void ex_int_handler(void)
// press the button to start timer for alarm clock
alarm_clock = true;
int main(void)
//Peripherals for LED and GPIO
// button
//Interrupt Timer
//Transistor Gate
//debugging only: save all the received data from the ESP in an array to look at while debugging
int32_t data[20] = {0};
int32_t j = 0;
//Code for debugging the UART and ESP8266
//Checks for Data in the FIFO
//send AT-command to ESP8266
//Read data from the FIFO in UART3 -> received from ESP8266
data[j] = UARTCharGet(UART3_BASE);
//clear array when its full
if (j >= 20)
j = 0;
for(int32_t a = 0; a < 20; a++) // data has 20 elements, so stop at index 19
data[a] = 0;
//code to run the alarm clock and leds
if (alarm_clock)
alarm_clock = false;
//Start Red LED blinking when it is finished
According to the AT commands of ESP8266 It should respond to "AT" with
"OK". But whenever I send something to the ESP it responds with
exactly what I sent to it
Modems with AT commands commonly ship with echo mode turned on, so that when you interact with them manually through a serial port, they echo the characters you sent first and then send the reply.
So, when you automate the process, you first send the characters, then wait for the reply until you reach a '\r'. Well, you are reaching a '\r', but it's the one from the echo. There may be more characters after it. If you send AT, you should receive AT first, and then the OK.
To solve this problem, you should turn echo mode off.
The command to turn off echo is ATE0.
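If echo cannot be turned off, or until the ATE0 reply has been processed, the receive side can be written to tolerate the echo. The following is a minimal sketch with hypothetical helper names (strip_echo, has_ok_line); it assumes the reply has already been collected into a string and that commands are terminated with CR LF, which is what ESP8266 AT firmware generally expects.

```cpp
#include <string>

// Command that disables echo, terminated with CR LF.
const std::string kEchoOff = "ATE0\r\n";

// Drops a leading echo of `sent` (the command text without line ending)
// from `received`, together with the CR/LF characters that follow it.
std::string strip_echo(const std::string& sent, const std::string& received) {
    if (received.compare(0, sent.size(), sent) != 0) return received;
    std::size_t pos = sent.size();
    while (pos < received.size() && (received[pos] == '\r' || received[pos] == '\n')) {
        ++pos;
    }
    return received.substr(pos);
}

// True if any line of `reply` consists of exactly "OK".
bool has_ok_line(const std::string& reply) {
    std::size_t pos = 0;
    while (pos <= reply.size()) {
        const std::size_t end = reply.find('\n', pos);
        std::string line = reply.substr(
            pos, end == std::string::npos ? std::string::npos : end - pos);
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (line == "OK") return true;
        if (end == std::string::npos) break;
        pos = end + 1;
    }
    return false;
}
```

Checking for a complete "OK" line, rather than stopping at the first '\r', avoids mistaking the echoed command for the reply.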

Using FTDI D2xx and Thorlabs APT communication protocol results in delays on Linux

I am trying to communicate with the Thorlabs TDC001 controllers (apt - dc servo controller) by using the FTDI D2xx driver on Linux. However, when I send writing commands, large delays occur (1-2 seconds) until the command is actually executed on TDC001.
In particular, this can be observed when the connected linear stage is moving and a new position command is sent. It takes 1-2 seconds until the stage actually changes its direction. Also, if I request DCSTATUSUPDATE (which gives position and velocity) and then read out the queue of FTDI, I do not get the right data. Only if I wait 1 second between requesting and reading, I get the (correct) data, but for the past. I added the C++ code for this case.
I need live-data and faster execution of writing commands for closed-loop control.
I'm not sure if the problem is on the side of Thorlabs or FTDI. Everything works, except for the large delays. There are other commands, e.g. MOVE_STOP, which respond immediately. Also, if I send a new position command right after finishing homing, it is executed immediately.
Whenever I ask for FT_GetStatus, there is nothing else in the Tx or Rx queue except the 20 bytes in Rx for DCSTATUSUPDATE.
The references for D2XX and APT communication protocol can be found here:
FTDI Programmer's Guide
Thorlabs APT Communication Protocol
The initialization function:
bool CommunicationFunctions::initializeKeyHandle(string serialnumber){
// This function initializes the TDC motor controller and finds its corresponding keyhandle.
keyHandle = NULL;
// To open the device, the vendor and product ID must be set correctly
ftStatus = FT_SetVIDPID(0x403,0xfaf0);
//Open device:
const char* tmp = serialnumber.c_str();
int numAttempts=0;
while (keyHandle ==0){
ftStatus = FT_OpenEx(const_cast<char*>(tmp),FT_OPEN_BY_SERIAL_NUMBER, &keyHandle);
if (numAttempts++>20){
cerr << "Device Could Not Be Opened";
return false;
// Set baud rate to 115200
ftStatus = FT_SetBaudRate(keyHandle,115200);
// 8 data bits, 1 stop bit, no parity
ftStatus = FT_SetDataCharacteristics(keyHandle, FT_BITS_8, FT_STOP_BITS_1, FT_PARITY_NONE);
// Pre purge dwell 50ms.
// Purge the device.
ftStatus = FT_Purge(keyHandle, FT_PURGE_RX | FT_PURGE_TX);
// Post purge dwell 50ms.
// Reset device.
ftStatus = FT_ResetDevice(keyHandle);
// Set flow control to RTS/CTS.
ftStatus = FT_SetFlowControl(keyHandle, FT_FLOW_RTS_CTS, 0, 0);
// Set RTS.
ftStatus = FT_SetRts(keyHandle);
return true;
How I read out my data:
bool CommunicationFunctions::read_tdc(int32_t* position, uint16_t* velocity){
uint8_t *RxBuffer = new uint8_t[256]();
DWORD RxBytes;
DWORD BytesReceived = 0;
// Manually request status update:
uint8_t req_statusupdate[6] = {0x90,0x04,0x01,0x00,0x50,0x01};
ftStatus = FT_Write(keyHandle, req_statusupdate, (DWORD)6, &written);
if(ftStatus != FT_OK){
cerr << "Command could not be transmitted: Request status update" << endl;
return false;
// sleep(1); //**this sleep() would lead to right result, but I don't want this delay**
// Get number of bytes in queue of TDC001
// Check if there are bytes in queue before reading them, otherwise do
// not read anything in
if(ftStatus != FT_OK){
cerr << "Read device failed!" << endl;
return false;
// Check if enough bytes are received, i.e. if signal is right
if(!(BytesReceived >= 6)){
cerr << "Error in bytes received" << endl;
return false;
// Look for correct message in RxBuffer and read out velocity and position
// Delete receive buffer
delete[] RxBuffer;
RxBuffer = NULL;
return true;
If I use the read_tdc function after homing and during movement to an absolute position, I just get the "Homing completed" message on the first attempt. When I try read_tdc again, I get an old value (probably the one from before). I don't understand what happens here. Why does this old data even remain in the queue (the latency is 10 ms)? Can anybody help me get faster responses and reactions?
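Since the receive queue can hold other messages (such as the homing notification) ahead of the status reply, the "look for correct message in RxBuffer" step could scan for the status header and keep the newest match. The sketch below is an assumption-laden illustration, not code from the post: it assumes the reply to the 0x0490 request has message ID 0x0491, a 6-byte header with a 14-byte data length, and a data packet laid out as channel ident (2 bytes), position (4 bytes, signed), velocity (2 bytes), all little-endian. That layout should be checked against the APT protocol document; DcStatus and find_dc_status are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct DcStatus {
    int32_t position;
    uint16_t velocity;
};

// Assumed total length of a status message: 6-byte header + 14 data bytes.
constexpr std::size_t kStatusLen = 20;

// Scans `buf` for status messages and decodes the last complete one found.
// Returns false if no complete status message is present.
bool find_dc_status(const uint8_t* buf, std::size_t len, DcStatus* out) {
    assert(buf != nullptr && out != nullptr);
    bool found = false;
    std::size_t i = 0;
    while (i + kStatusLen <= len) {
        // Header: message ID 0x0491 and data length 0x000E, little-endian.
        if (buf[i] == 0x91 && buf[i + 1] == 0x04 &&
            buf[i + 2] == 0x0E && buf[i + 3] == 0x00) {
            const uint8_t* d = buf + i + 6 + 2;  // skip header and channel ident
            const uint32_t pos = static_cast<uint32_t>(d[0]) |
                                 (static_cast<uint32_t>(d[1]) << 8) |
                                 (static_cast<uint32_t>(d[2]) << 16) |
                                 (static_cast<uint32_t>(d[3]) << 24);
            out->position = static_cast<int32_t>(pos);
            out->velocity = static_cast<uint16_t>(d[4] | (d[5] << 8));
            found = true;
            i += kStatusLen;  // continue after this message
        } else {
            ++i;
        }
    }
    return found;
}
```

Keeping the last match rather than the first means that stale status messages still sitting in the queue do not mask the most recent one.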

Is poll() an edge triggered function?

I am responsible for a server that exports data over a TCP connection. With each data record that the server transmits, it requires the client to send a short "\n" acknowledgement message back. I have a customer who claims that the acknowledgement he sends is not being read by the server. The following is the code that I am using for I/O on the socket:
bool can_send = true;
char tx_buff[1024];
char rx_buff[1024];
struct pollfd poll_descriptor;
int rcd;
poll_descriptor.fd = socket_handle;
poll_descriptor.events = POLLIN | POLLOUT;
poll_descriptor.revents = 0;
while(!should_quit && is_connected)
// if we know that data can be written, we need to do this before we poll the OS for
// events. This will prevent the 100 msec latency that would otherwise occur
while(can_send && !should_quit && !write_buffer.empty())
uint4 tx_len = write_buffer.copy(tx_buff, sizeof(tx_buff));
rcd = ::send(
if(rcd == -1 && errno != EINTR)
throw SocketException("socket write failure");
if(rcd > 0)
on_low_level_write(tx_buff, rcd);
if(rcd < tx_len)
can_send = false;
// we will use poll for up to 100 msec to determine whether the socket can be read or
// written
if(!can_send) poll_descriptor.events = POLLIN | POLLOUT;
else poll_descriptor.events = POLLIN;
poll(&poll_descriptor, 1, 100);
// check to see if an error has occurred
if((poll_descriptor.revents & POLLERR) != 0 ||
(poll_descriptor.revents & POLLHUP) != 0 ||
(poll_descriptor.revents & POLLNVAL) != 0)
throw SocketException("socket hung up or socket error");
// check to see if anything can be written
if((poll_descriptor.revents & POLLOUT) != 0)
can_send = true;
// check to see if anything can be read
if((poll_descriptor.revents & POLLIN) != 0)
ssize_t bytes_read;
ssize_t total_bytes_read = 0;
int bytes_remaining = 0;
bytes_read = ::recv(
if(bytes_read > 0)
total_bytes_read += bytes_read;
else if(bytes_read == -1)
throw SocketException("read failure");
while(bytes_remaining != 0);
// recv() will return 0 if the socket has been closed
if(total_bytes_read > 0)
is_connected = false;
I have written this code based upon the assumption that poll() is a level triggered function and will unblock immediately as long as there is data to be read from the socket. Everything that I have read seems to back up this assumption. Is there a reason that I may have missed that would cause the above code to miss a read event?
It is not edge triggered. It is always level triggered. I will have to read your code to answer your actual question though. But that answers the question in the title. :-)
I can see no clear reason in your code why you might be seeing the behavior you are seeing. But the scope of your question is a lot larger than the code you're presenting, and I cannot pretend that this is a complete problem diagnosis.
It is level triggered. POLLIN fires if there is data in the socket receive buffer when you poll, and POLLOUT fires if there is room in the socket send buffer (which there almost always is).
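The level-triggered behaviour is easy to confirm with a pipe: as long as unread data remains, every poll call reports POLLIN again. A minimal sketch (the helper names readable_now and consume_byte are made up for this illustration):

```cpp
#include <poll.h>
#include <unistd.h>

// Polls `fd` for readability without blocking; true if POLLIN is reported.
bool readable_now(int fd) {
    pollfd p{};
    p.fd = fd;
    p.events = POLLIN;
    return poll(&p, 1, 0) == 1 && (p.revents & POLLIN) != 0;
}

// Reads and discards a single byte; true if one byte was read.
bool consume_byte(int fd) {
    char c;
    return read(fd, &c, 1) == 1;
}
```

After one byte is written to the pipe, readable_now() on the read end returns true on every call until consume_byte() removes the byte. An edge-triggered interface would report the event only once.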
Based on your own assessment of the problem (that is, you are blocked on poll when you expect to be able to read the acknowledgement), then you will eventually get a timeout.
If the customer's machine is more than 50 ms away from your server, then you will always time out on the connection before receiving the acknowledgement, since you only wait 100 ms. This is because it will take a minimum of 50 ms for the data to reach the customer, and a minimum of 50 ms for the acknowledgement to return.