ESP32: Attaching an interrupt directly to system time - c++

Currently I periodically set a separate hardware timer to the system time and use it to trigger timed interrupts. It works fine, but for elegance's sake I wondered whether it is possible to attach an interrupt directly to the system time.
The events are pretty fast: one every 260 microseconds.
The ESP32 has a few clocks used for system time. The default full-power clock is an 80 MHz clock called APB_CLK. But even the slow RTC clock has 6.6667 μs resolution. (Documentation here: https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-reference/system/system_time.html)
I have a GPS module that I use to update the system time periodically using adjtime(3). The advantage is that it adjusts the system time gradually and monotonically. System time calls are also thread safe.
I'm using the Arduino IDE, so my knowledge of accessing registers and interrupts directly is poor. Here's a semi boiled down version of what I'm doing. Bit banging a synchronized digital signal. Rotating in 160 bit pages that are prepped from the other core. It's not all of my code, so something not important might be missing:
#define CPU_SPEED 40
hw_timer_t* timer = NULL;
PageData pages[2];
PageData* timerCurrentPage = &pages[0];
PageData* loopCurrentPage = &pages[1];
TaskHandle_t prepTaskHandle;
volatile int bitCount = 0;
void IRAM_ATTR onTimer() {
int level = timerCurrentPage->data[bitCount];
dac_output_voltage(DAC_CHANNEL_1, level?high:low);
bitCount++;
if(bitCount<160) {
timerAlarmWrite(timer, (timerCurrentPage->startTick+timerCurrentPage->ticksPerPage*bitCount), false);
} else {
if(timerCurrentPage == &pages[0]) timerCurrentPage = &pages[1];
else timerCurrentPage = &pages[0];
bitCount = 0;
timerAlarmWrite(timer, (timerCurrentPage->startTick), false);
vTaskResume(prepTaskHandle);
}
}
uint64_t nowTick() {
timeval timeStruct;
gettimeofday(&timeStruct, NULL);
uint64_t result = (uint64_t)timeStruct.tv_sec*1000000UL + (uint64_t)timeStruct.tv_usec;
return result;
}
void gpsUpdate(uint64_t micros) {
int64_t now = nowTick();
int64_t offset = micros - now;
timeval adjustStruct = {0,offset};
adjtime(&adjustStruct,NULL);
}
void setup() {
setCpuFrequencyMhz(CPU_SPEED);
timer = timerBegin(0, CPU_SPEED, true);
timerWrite(timer, nowTick());
timerAttachInterrupt(timer, &onTimer, true);
setPage(&pages[0]);
xTaskCreatePinnedToCore(
prepLoop, /* Task function. */
"Prep Task", /* name of task. */
10000, /* Stack size of task */
NULL, /* parameter of the task */
1, /* priority of the task */
&prepTaskHandle, /* Task handle to keep track of created task */
1); /* pin task to core 1 */
timerAlarmWrite(timer, (timerCurrentPage->startTick), false);
}
//On Core 1
void prepLoop() {
while(1) {
vTaskSuspend(NULL); //prepTaskHandle
timerWrite(timer, nowTick());
if(loopCurrentPage == &pages[0]) loopCurrentPage = &pages[1];
else loopCurrentPage = &pages[0];
setPage(loopCurrentPage);
}
}
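One detail in gpsUpdate() above: the whole offset is put into tv_usec, which only stays in range while the GPS correction is small. A minimal sketch of splitting a signed microsecond offset into seconds and a sub-second remainder follows. SecUsec and splitMicros are hypothetical names that only mirror the tv_sec / tv_usec fields of timeval, and whether a particular adjtime() implementation accepts a negative remainder is an assumption worth checking.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: mirrors the tv_sec / tv_usec fields of timeval.
struct SecUsec {
    std::int64_t sec;
    std::int64_t usec;
};

// Split a signed microsecond offset into whole seconds plus a remainder
// whose magnitude is below 1,000,000. Integer division truncates toward
// zero, so both fields carry the sign of the offset.
SecUsec splitMicros(std::int64_t offsetUs) {
    const std::int64_t kUsPerSec = 1000000;
    SecUsec out{offsetUs / kUsPerSec, offsetUs % kUsPerSec};
    // The two parts must always add back up to the original offset.
    assert(out.sec * kUsPerSec + out.usec == offsetUs);
    return out;
}
```

The two fields would then be copied into a timeval before calling adjtime().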

Related

How to read motor encoder values with interrupts in FreeRTOS?

I am working on a project where I need to obtain precise angular velocity from four motor encoders. I am using ESP32 DEVKIT-V1 module, and would like to use four interrupts, which will fire when each motor encoder switches state. This produces a square signal of around 700 Hz (period of 1,42 ms). This needs to be done on one core due to timing restrictions, as the processor must not miss any ticks. This is why I decided to use FreeRTOS. As the tick rate of the ESP32 is 1 ms, it cannot read higher frequencies than 500 Hz (period of 2 ms).
I would like to call getEncoderTickNumber() function every time one of the four interrupts fires, however, I only get the ESP32 to continually reset. I also wish to pass the number of ticks (encoderValueA1 - A4) from function getEncoderTickNumber() to getEncoderRPM() by queues.
I am still a beginner in C/C++, so I would be very grateful if you could point out some beginner mistakes that I am making. Thank you for your time.
#include <Arduino.h>
// Motor encoder output pulse per rotation (AndyMark Neverest 60)
int ENC_COUNT_REV = 420;
// Pulse count from encoder
long encoderValueA1 = 0;
long encoderValueA2 = 0;
long encoderValueA3 = 0;
long encoderValueA4 = 0;
int currentStateMotorEncoderA1;
int currentStateMotorEncoderA2;
int currentStateMotorEncoderA3;
int currentStateMotorEncoderA4;
int previousStateMotorEncoderA1;
int previousStateMotorEncoderA2;
int previousStateMotorEncoderA3;
int previousStateMotorEncoderA4;
// Variables for RPM measurement
int rpm1 = 0;
int rpm2 = 0;
int rpm3 = 0;
int rpm4 = 0;
#define INT_PIN1 17
#define INT_PIN2 18
#define INT_PIN3 19
#define INT_PIN4 16
#define PRIORITY_LOW 0
#define PRIORITY_HIGH 1
QueueHandle_t encoderQueueHandle;
#define QUEUE_LENGTH 4 //four rpm readings
long* pdata = &encoderValueA1;
void io_expander_interrupt()
{
xQueueSendToBackFromISR(&encoderQueueHandle, &pdata, NULL);
}
///////////
// TASKS //
///////////
void getEncoderTickNumber(void *parameter)
{
while (1)
{
if (xQueueReceiveFromISR(&encoderQueueHandle, &pdata, NULL) == pdTRUE)
{
currentStateMotorEncoderA1 = digitalRead(INT_PIN1);
currentStateMotorEncoderA2 = digitalRead(INT_PIN2);
currentStateMotorEncoderA3 = digitalRead(INT_PIN3);
currentStateMotorEncoderA4 = digitalRead(INT_PIN4);
if (currentStateMotorEncoderA1 != previousStateMotorEncoderA1)
{
encoderValueA1++;
}
if (currentStateMotorEncoderA2 != previousStateMotorEncoderA2)
{
encoderValueA2++;
}
if (currentStateMotorEncoderA3 != previousStateMotorEncoderA3)
{
encoderValueA3++;
}
if (currentStateMotorEncoderA4 != previousStateMotorEncoderA4)
{
encoderValueA4++;
}
previousStateMotorEncoderA1 = currentStateMotorEncoderA1;
previousStateMotorEncoderA2 = currentStateMotorEncoderA2;
previousStateMotorEncoderA3 = currentStateMotorEncoderA3;
previousStateMotorEncoderA4 = currentStateMotorEncoderA4;
}
}
}
void getEncoderRPM(void *parameter)
{
while (1)
{
rpm1 = (encoderValueA1 * 60) / ENC_COUNT_REV;
rpm2 = (encoderValueA2 * 60) / ENC_COUNT_REV;
rpm3 = (encoderValueA3 * 60) / ENC_COUNT_REV;
rpm4 = (encoderValueA4 * 60) / ENC_COUNT_REV;
encoderValueA1 = 0;
encoderValueA2 = 0;
encoderValueA3 = 0;
encoderValueA4 = 0;
vTaskDelay(1000 / portTICK_RATE_MS);
}
}
void printData(void *parameter)
{
while (1)
{
Serial.print("1:");
Serial.print(rpm1);
Serial.print(" 2:");
Serial.print(rpm2);
Serial.print(" 3:");
Serial.print(rpm3);
Serial.print(" 4:");
Serial.println(rpm4);
vTaskDelay(500 / portTICK_RATE_MS);
}
}
void setup()
{
Serial.begin(115200);
pinMode(INT_PIN1, INPUT);
attachInterrupt(INT_PIN1, getEncoderTickNumber, RISING);
pinMode(INT_PIN2, INPUT);
attachInterrupt(INT_PIN2, getEncoderTickNumber, RISING);
pinMode(INT_PIN3, INPUT);
attachInterrupt(INT_PIN3, getEncoderTickNumber, RISING);
pinMode(INT_PIN4, INPUT);
attachInterrupt(INT_PIN4, getEncoderTickNumber, RISING);
// Create the queue
encoderQueueHandle = xQueueCreate(QUEUE_LENGTH, sizeof(uint32_t));
xTaskCreatePinnedToCore( // Use xTaskCreate() in vanilla FreeRTOS
getEncoderTickNumber, // Function to be called
"getEncoderTickNumber", // Name of task
1024, // Stack size (bytes in ESP32, words in FreeRTOS) inside the heap
NULL, // Parameter to pass to function
PRIORITY_LOW, // Task priority (0 to configMAX_PRIORITIES - 1)
NULL, // Task handle
1); // Run on one core for demo purposes (ESP32 only)
xTaskCreatePinnedToCore( // Use xTaskCreate() in vanilla FreeRTOS
printData, // Function to be called
"printData", // Name of task
1024, // Stack size (bytes in ESP32, words in FreeRTOS) inside the heap
NULL, // Parameter to pass to function
PRIORITY_LOW, // Task priority (0 to configMAX_PRIORITIES - 1)
NULL, // Task handle
0); // Run on one core for demo purposes (ESP32 only)
xTaskCreatePinnedToCore( // Use xTaskCreate() in vanilla FreeRTOS
getEncoderRPM, // Function to be called
"getEncoderRPM", // Name of task
1024, // Stack size (bytes in ESP32, words in FreeRTOS)
NULL, // Parameter to pass to function
PRIORITY_HIGH, // Task priority (0 to configMAX_PRIORITIES - 1)
NULL, // Task handle
0); // Run on one core for demo purposes (ESP32 only)
vTaskDelete(NULL); // Deletes the setup/loop task now that we are finished setting up (optional)
}
void loop()
{
}
There are quite a few problems in your code. Let's go over them one by one, see if it clears things up.
Firstly, don't delete the task in setup():
vTaskDelete(NULL); // Deletes the setup/loop task now that we are finished setting up (optional)
Arduino will manage the FreeRTOS tasks on its own, don't interfere with it. You may be causing your crash with that line alone.
Secondly, you're creating your tasks with a stack size of 1024 bytes which is too small. The task will likely corrupt the stack and crash. Start with a stack size of 4096 bytes for simple tasks, see if you can optimize later. Incidentally, you don't need any tasks at all for a simple implementation.
Thirdly, you don't seem to understand what an interrupt is and how to handle it. By calling this you're attaching the function getEncoderTickNumber() as an interrupt handler to all 4 GPIO inputs:
attachInterrupt(INT_PIN1, getEncoderTickNumber, RISING);
attachInterrupt(INT_PIN2, getEncoderTickNumber, RISING);
attachInterrupt(INT_PIN3, getEncoderTickNumber, RISING);
attachInterrupt(INT_PIN4, getEncoderTickNumber, RISING);
The function getEncoderTickNumber() cannot be the interrupt handler because it blocks with a while(1) loop - it will quickly trigger the watchdog and reboot. Additionally, you've already used this function as a task which runs in the background (and seems to expect input from the interrupt handlers).
Finally, you seem to have a more suitable candidate for the position of an interrupt handler - the function io_expander_interrupt() - which currently doesn't do anything useful. Let's fix that.
You would need 4 interrupt handlers, one per GPIO you're monitoring. Each handler is attached to its respective GPIO pin, triggers when the IO rises, and does its own encoder counting. A simple implementation without extra tasks would look like this:
#include <Arduino.h>
// Motor encoder output pulse per rotation (AndyMark Neverest 60)
int ENC_COUNT_REV = 420;
// Pulse count from encoder. Must be volatile as it's shared between ISR and main task
volatile int encoderValueA1 = 0;
volatile int encoderValueA2 = 0;
volatile int encoderValueA3 = 0;
volatile int encoderValueA4 = 0;
#define INT_PIN1 17
#define INT_PIN2 18
#define INT_PIN3 19
#define INT_PIN4 16
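// ISRs are kept in IRAM and only increment their own counter
void IRAM_ATTR isr_rising_gpio1() {
encoderValueA1++;
}
void IRAM_ATTR isr_rising_gpio2() {
encoderValueA2++;
}
void IRAM_ATTR isr_rising_gpio3() {
encoderValueA3++;
}
void IRAM_ATTR isr_rising_gpio4() {
encoderValueA4++;
}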
void setup()
{
Serial.begin(115200);
pinMode(INT_PIN1, INPUT);
attachInterrupt(INT_PIN1, isr_rising_gpio1, RISING);
pinMode(INT_PIN2, INPUT);
attachInterrupt(INT_PIN2, isr_rising_gpio2, RISING);
pinMode(INT_PIN3, INPUT);
attachInterrupt(INT_PIN3, isr_rising_gpio3, RISING);
pinMode(INT_PIN4, INPUT);
attachInterrupt(INT_PIN4, isr_rising_gpio4, RISING);
}
void loop()
{
int rpm1 = (encoderValueA1 * 60) / ENC_COUNT_REV;
encoderValueA1 = 0;
int rpm2 = (encoderValueA2 * 60) / ENC_COUNT_REV;
encoderValueA2 = 0;
int rpm3 = (encoderValueA3 * 60) / ENC_COUNT_REV;
encoderValueA3 = 0;
int rpm4 = (encoderValueA4 * 60) / ENC_COUNT_REV;
encoderValueA4 = 0;
Serial.print("1:");
Serial.print(rpm1);
Serial.print(" 2:");
Serial.print(rpm2);
Serial.print(" 3:");
Serial.print(rpm3);
Serial.print(" 4:");
Serial.println(rpm4);
vTaskDelay(pdMS_TO_TICKS(1000));
}
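The loop above reads each counter and then clears it in two separate steps, so a pulse that arrives in between is dropped. A minimal sketch of one way around that follows, using std::atomic so the read and the reset happen as a single step. The names pulseCount, onPulse, takePulses and pulsesToRpm are hypothetical, and it is an assumption that std::atomic on a 32-bit integer is usable from an ISR on your toolchain.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Pulse counter shared between an ISR (writer) and the main loop (reader).
std::atomic<std::int32_t> pulseCount{0};

// What an ISR body would do on each rising edge.
void onPulse() {
    pulseCount.fetch_add(1, std::memory_order_relaxed);
}

// Read the count and reset it in one atomic step, so a pulse arriving
// between "read" and "reset" cannot be lost.
std::int32_t takePulses() {
    return pulseCount.exchange(0, std::memory_order_relaxed);
}

// Convert pulses counted over windowMs milliseconds into RPM.
std::int32_t pulsesToRpm(std::int32_t pulses, std::int32_t windowMs,
                         std::int32_t countsPerRev) {
    assert(windowMs > 0 && countsPerRev > 0);
    // pulses / countsPerRev is the number of revolutions in the window;
    // 60000 / windowMs scales that window up to one minute.
    const std::int64_t num = static_cast<std::int64_t>(pulses) * 60000;
    const std::int64_t den = static_cast<std::int64_t>(countsPerRev) * windowMs;
    return static_cast<std::int32_t>(num / den);
}
```

With a 1000 ms window this reduces to the same (count * 60) / ENC_COUNT_REV formula used above, but the window length is now explicit.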

C++: pros and cons of different sleep methods for a specific time duration [duplicate]

I am trying to write a sleep function that is reasonably accurate. I measured how long each sleep function actually slept and put the numbers side by side. The format of the samples below is: "expected ms" followed by "outcome ms".
I have tried many options and I still can't find a solution. Here are the routes I tried:
Route 1
Sleep(<time>)
/* milliseconds */
38.4344 46.4354
41.728 47.7818
0.556 0.0012
43.6532 46.8087
0.4523 0.0009
62.8664 76.995
1.5363 15.4592
75.9435 78.1663
91.5194 92.0786
0.6533 0.001
39.7423 45.6729
0.5022 0.0008
54.7837 60.597
0.4248 0.0011
39.2165 45.6977
0.4854 0.0008
10.6741 15.054
Had little to no noticeable CPU usage which is good but still inaccurate results.
Route 2
/* Windows sleep in 100ns units */
BOOLEAN nanosleep(LONGLONG ns){
/* Declarations */
HANDLE timer; /* Timer handle */
LARGE_INTEGER li; /* Time definition */
/* Create timer */
if(!(timer = CreateWaitableTimer(NULL, TRUE, NULL)))
return FALSE;
/* Set timer properties */
li.QuadPart = -ns;
if(!SetWaitableTimer(timer, &li, 0, NULL, NULL, FALSE)){
CloseHandle(timer);
return FALSE;
}
/* Start & wait for timer */
WaitForSingleObject(timer, INFINITE);
/* Clean resources */
CloseHandle(timer);
/* Slept without problems */
return TRUE;
}
/* milliseconds */
1.057 14.7561
66.5977 79.4437
0.409 14.7597
152.053 156.757
1.26725 15.747
19.025 30.6343
67.3235 78.678
0.4203 14.4713
65.3507 74.4703
0.4525 14.8102
28.6145 29.7099
72.0035 74.7315
0.5971 14.8625
55.7059 59.3889
0.4791 14.5419
50.9913 61.6719
0.5929 15.5558
Had low CPU usage which was good but was still inaccurate.
I had read somewhere that using MultiMedia Timers would provide accurate sleep.
Code Source
Route 3
void super_sleep(double ms)
{
auto a = std::chrono::steady_clock::now();
while ((std::chrono::steady_clock::now() - a) < std::chrono::milliseconds(static_cast<int>(ms))) {
continue;
}
}
/* milliseconds */
55.7059 55.0006
0.5669 0.0008
66.5977 66.0009
0.4213 0.0009
0.7228 0.0007
7.5374 7.0006
0.8825 0.0007
0.4143 0.0009
59.8062 59.0005
51.7157 51.0006
54.0807 54.0006
11.8834 11.0006
65.3507 65.0004
14.429 14.0006
0.4452 0.0012
1.6797 1.0004
96.0012 96.0006
Worked a lot better than the other attempts but uses up to 7% of my CPU.
I also tried using std::this_thread::sleep_for() and got results similar to Route 2.
I am on Windows 10 20H2, C++17 and i9 9900k.
One way to get pretty good accuracy (but not perfect since Windows isn't a Real Time OS), is to use one of the standard sleep functions, but sleep short - and then busy-wait the remaining time. That usually keeps the CPU usage low.
template<class T, class U>
void Sleep(std::chrono::duration<T,U> ss) {
auto target = std::chrono::steady_clock::now() + ss; // the target end time
// Sleep short. 5 ms is just an example. You need to trim that parameter.
std::this_thread::sleep_until(target - std::chrono::milliseconds(5));
// busy-wait the remaining time
while(std::chrono::steady_clock::now() < target) {}
}
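A variant of the same idea, sketched under a few assumptions: the function is named hybridSleep here (a hypothetical name, chosen to avoid clashing with the Win32 Sleep()), the slack is a parameter instead of a fixed 5 ms, and requests shorter than the slack skip the coarse sleep entirely. It also returns the elapsed time so the accuracy can be measured.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

using Clock = std::chrono::steady_clock;

// Sleep coarsely until `slack` before the deadline, then spin for the rest.
// Returns the time that actually elapsed.
Clock::duration hybridSleep(Clock::duration total, Clock::duration slack) {
    assert(total >= Clock::duration::zero());
    const auto start = Clock::now();
    const auto target = start + total;
    if (total > slack) {
        // Cheap on CPU, but may wake up late by up to a scheduler tick.
        std::this_thread::sleep_until(target - slack);
    }
    while (Clock::now() < target) {
        // Busy-wait the remainder for precision.
    }
    return Clock::now() - start;
}
```

A call such as hybridSleep(std::chrono::milliseconds(20), std::chrono::milliseconds(5)) should never return early. How close it gets to the target depends on whether the slack is larger than the worst-case oversleep on your machine, so that parameter needs tuning.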

Create an array at different memory locations for each loop

Good morning everyone,
I am currently working on a data acquisition project, where I have to read sensors (at around 10 kHz) and transmit the data via Wi-Fi and the MQTT-protocol. I am using an ESP32 for both of these tasks.
One core is doing the sensor reading and the other core does the transmitting stuff. I also use the FreeRTOS for this.
Now, I want to pass the data as efficiently as possible between the tasks. Currently I'm using the xQueue functions built into FreeRTOS. I pass pointers through the queue; each one points to an array in which one data package is stored.
Task one:
*sensor reading*
for(xx)
{
data_array[x] = sensor_data;
}
if (packageSize == 120)
{
xQueueSend(Queue1, &data_pointer, 0);
}
________________________
Task two:
if( uxQueueMessagesWaiting(Queue1) >= 1)
{
xQueueReceive(Queue1, &received_pointer, 0);
memcpy(data_send, received_pointer, packageSize);
* MQTT-Client sending data_send *
}
You see, my problem isn't the creation of the array with different pointers. The sensor reading task needs to create an array for every package, without overwriting the previous one.
My initial idea was to use the new and delete combination but it gave me strange results.
Is there any way I can change the location of the array on the memory at every loop of task one?
EDIT:
/* general variables*/
const int len = 150;
uint8_t data_received[len];
uint8_t data_send[len];
uint8_t *queue_pointer = 0;
uint8_t *received_pointer = 0;
uint8_t *to_delete_pointer = 0;
uint8_t dummy_data = 0;
int v = 0;
/* multithreading variables */
TaskHandle_t SPI_COM;
TaskHandle_t WIFI;
QueueHandle_t buffer_daten;
/* --------------------- Fake SPI communication on core 1 -------------------- */
void SPI_COM_code(void *pvParameters)
{
for (;;)
{
while (v <= 10000)
{
//queue_pointer = new int[len]; // creates a new array
queue_pointer = data_received;
queue_pointer[dummy_data] = dummy_data;
dummy_data++;
delayMicroseconds(100); // Dummy-Interrupt
if (dummy_data == len - 1)
{
dummy_data = 0;
xQueueSend(buffer_daten, &queue_pointer, 0);
v++;
}
}
}
}
/* --------------------- WiFi transmission on core 0 --------------------- */
void WIFI_code(void *pvParameters)
{
for (;;)
{
//MQTT_connect();
if (uxQueueMessagesWaiting(buffer_daten) > 0)
{
xQueueReceive(buffer_daten, &received_pointer, 0);
to_delete_pointer = received_pointer;
memcpy(data_send, received_pointer, len);
// Data gets published by MQTT-Client
delayMicroseconds(12);
//delete[] to_delete_pointer; // deletes array, which was sent
}
}
}
/* ----------------------------------- Setup ---------------------------------- */
void setup()
{
disableCore0WDT(); // <----- POSSIBLE SOURCE OF PROBLEMS
Serial.begin(115200);
buffer_daten = xQueueCreate(1000, sizeof(int));
xTaskCreatePinnedToCore(
SPI_COM_code, /* Task function. */
"SPI_COM", /* name of task. */
10000, /* Stack size of task */
NULL, /* parameter of the task */
1, /* priority of the task */
&SPI_COM, /* Task handle to keep track of created task */
1); /* pin task to core 1 */
delay(500);
xTaskCreatePinnedToCore(
WIFI_code, /* Task function. */
"WIFI", /* name of task. */
10000, /* Stack size of task */
NULL, /* parameter of the task */
2, /* priority of the task */
&WIFI, /* Task handle to keep track of created task */
0); /* pin task to core 0 */
delay(500);
}
void loop()
{
}
I would suggest you use RTOS Message Buffers for this task.
With these functions you can copy your array into the buffer, and the second task can fetch it when the data is available.
In both cases the consumer task should use the timeout '0' to request the data.
If the MQTT task is faster than the data acquisition (and it should be or your buffers will overflow sooner or later) this will lead to invalid pointers:
xQueueReceive(buffer_daten, &received_pointer, 0);
If there is no data available, the function returns immediately and leaves you with an invalid received_pointer.
You should either check the return value of xQueueReceive or set the timeout to portMAX_DELAY.
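To illustrate both points (copying the package instead of passing a pointer, and checking the result of the receive call), here is a minimal desktop sketch. CopyingMessageBuffer is a hypothetical stand-in written in plain C++, not the FreeRTOS API. On the ESP32 the corresponding calls would be xMessageBufferSend() and xMessageBufferReceive(), and the heap allocation used here is only for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <vector>

// Hypothetical stand-in for an RTOS message buffer: send() copies the bytes
// in, receive() copies one whole message out.
class CopyingMessageBuffer {
public:
    explicit CopyingMessageBuffer(std::size_t maxMessages)
        : maxMessages_(maxMessages) {}

    // Returns false (and stores nothing) when the buffer is full or len is 0.
    bool send(const std::uint8_t* data, std::size_t len) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (len == 0 || messages_.size() >= maxMessages_) return false;
        messages_.emplace_back(data, data + len);  // private copy of the package
        return true;
    }

    // Returns the number of bytes copied into out, or 0 when no message is
    // waiting or the next message does not fit into outCapacity.
    std::size_t receive(std::uint8_t* out, std::size_t outCapacity) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (messages_.empty() || messages_.front().size() > outCapacity) return 0;
        const std::size_t n = messages_.front().size();
        std::copy(messages_.front().begin(), messages_.front().end(), out);
        messages_.pop_front();
        return n;
    }

private:
    std::size_t maxMessages_;
    std::deque<std::vector<std::uint8_t>> messages_;
    std::mutex mutex_;
};
```

Because every message is copied in, the producer can reuse data_received immediately, and the consumer only touches its output array when receive() returned a non-zero length.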

Have a timer restart every 100ms in C / C++

I am working on an application where the requirement is to execute a function every 100 ms.
Below is my code
void checkOCIDs()
{
// Do something that might take more than 100ms of time
}
void TimeOut_CallBack(int w)
{
struct itimerval tout_val;
int ret = 0;
signal(SIGALRM,TimeOut_CallBack);
/* Configure the timer to expire after 100000 ... */
tout_val.it_value.tv_sec = 0;
tout_val.it_value.tv_usec = 100000; /* 100000 timer */
/* ... and every 100 msec after that. */
tout_val.it_interval.tv_sec = 0 ;
tout_val.it_interval.tv_usec = 100000;
checkOCIDs();
setitimer(ITIMER_REAL, &tout_val,0);
return ;
}
The function TimeOut_CallBack() is called only once, and from then on checkOCIDs() must be executed every 100 ms, continuously.
Currently the application blocks, because checkOCIDs() takes more than 100 ms to complete and the timer fires again before it has finished.
I do not wish to use while(1) with sleep() / usleep(), as it eats up my CPU enormously.
Please suggest an alternative to achieve my requirement.
It is not clear whether the "check" function should be started again if the timer expires while it is still in progress. Maybe it would be OK for you to introduce a variable that indicates that the timer expired, so that your function is executed again after it completes. Pseudo-code:
static volatile bool check_in_progress = false;
static volatile bool timer_expired = false;
void TimeOut_CallBack(int w)
{
// ...
if (check_in_progress) {
timer_expired = true;
return;
}
// spawn/resume check function thread
// ...
}
void checkThreadProc()
{
check_in_progress = true;
do {
timer_expired = false;
checkOCIDs();
} while(timer_expired);
check_in_progress = false;
// end thread or wait for a signal to resume
}
Note that additional synchronization may be required to avoid race conditions (for instance, when one thread exits the do-while loop while check_in_progress is still set and the other thread sets timer_expired, the check function will not be executed), but that depends on the details of your requirements.
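One possible way to close that particular race, sketched under assumptions: fold the two flags into a single atomic state, so that "is a check running?" and "did the timer fire meanwhile?" are always updated together. OverrunGate, onTimer and finishPass are hypothetical names. The sketch assumes one timer source and one worker, and that the atomic is lock-free on your platform if onTimer() is called from a signal handler.

```cpp
#include <atomic>
#include <cassert>

enum class State { Idle, Running, RunningPending };

class OverrunGate {
public:
    // Timer side. Returns true when the caller should start (or resume)
    // the worker; false when a check is already running and the expiry
    // has merely been recorded.
    bool onTimer() {
        State s = state_.load();
        for (;;) {
            const State next =
                (s == State::Idle) ? State::Running : State::RunningPending;
            // On failure, s is reloaded with the current state and we retry.
            if (state_.compare_exchange_weak(s, next)) return s == State::Idle;
        }
    }

    // Worker side, called after each checkOCIDs() pass. Returns true when
    // the timer fired during the pass and another pass is due.
    bool finishPass() {
        State s = state_.load();
        for (;;) {
            assert(s != State::Idle);  // only valid while a pass is running
            const State next =
                (s == State::RunningPending) ? State::Running : State::Idle;
            if (state_.compare_exchange_weak(s, next))
                return s == State::RunningPending;
        }
    }

private:
    std::atomic<State> state_{State::Idle};
};
```

The worker would then loop with do { checkOCIDs(); } while (gate.finishPass()); and the timer callback would only wake the worker when onTimer() returns true. Expiries that arrive during a long pass are coalesced into a single extra pass.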

Linux poll on serial transmission end

I'm implementing RS485 on an ARM development board using a serial port and a GPIO for data enable.
I'm setting data enable to high before sending and I want it to be set low after transmission is complete.
It can be simply done by writing:
//fd = open("/dev/ttyO2", ...);
DataEnable.Set(true);
write(fd, data, datalen);
tcdrain(fd); //Wait until all data is sent
DataEnable.Set(false);
I wanted to change from blocking mode to non-blocking and use poll with fd. But I don't see any poll event corresponding to 'transmission complete'.
How can I get notified when all data has been sent?
System: linux
Language: c++
Board: BeagleBone Black
I don't think it's possible. You'll either have to run tcdrain in another thread and have it notify the main thread, or use a timeout on poll and check after each wakeup whether the output has been drained.
You can use the TIOCOUTQ ioctl to get the number of bytes in the output buffer and tune the timeout according to baud rate. That should reduce the amount of polling you need to do to just once or twice. Something like:
enum { writing, draining, idle } write_state;
while(1) {
int write_event, timeout = -1;
...
if (write_state == writing) {
poll_fds[poll_len].fd = write_fd;
poll_fds[poll_len].events = POLLOUT;
write_event = poll_len++;
} else if (write_state == draining) {
int outq;
ioctl(write_fd, TIOCOUTQ, &outq);
if (outq == 0) {
DataEnable.Set(false);
write_state = idle;
} else {
// 10 bits per byte, 1000 millisecond in a second
timeout = outq * 10 * 1000 / baud_rate;
if (timeout < 1) {
timeout = 1;
}
}
}
int r = poll(poll_fds, poll_len, timeout);
...
if (write_state == writing && r > 0 && (poll_fds[write_event].revents & POLLOUT)) {
DataEnable.Set(true); // Gets set even if already set.
int n = write(write_fd, write_data, write_datalen);
write_data += n;
write_datalen -= n;
if (write_datalen == 0) {
write_state = draining;
}
}
}
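The timeout arithmetic from the loop above can be pulled out into a small function and checked on its own. This is a sketch with a hypothetical name (drainTimeoutMs). It assumes 10 bits on the wire per byte, as in the comment above, and it rounds up instead of truncating so that poll is less likely to wake before the queue is empty.

```cpp
#include <cassert>
#include <cstdint>

// Milliseconds needed to shift out queuedBytes at baudRate, assuming
// 10 bits on the wire per byte (start bit, 8 data bits, stop bit).
// Rounds up and never returns less than 1 ms.
int drainTimeoutMs(int queuedBytes, int baudRate) {
    assert(queuedBytes >= 0 && baudRate > 0);
    const std::int64_t bits = static_cast<std::int64_t>(queuedBytes) * 10;
    const std::int64_t ms = (bits * 1000 + baudRate - 1) / baudRate;  // ceiling division
    return ms < 1 ? 1 : static_cast<int>(ms);
}
```

For example, 960 bytes queued at 9600 baud take exactly 1000 ms, and a single byte at 115200 baud is clamped to the 1 ms minimum.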
Stale thread, but I have been working on RS-485 with a 16550-compatible UART under Linux and find that:
tcdrain works, but it adds a delay of 10 to 20 msec. It seems to be polled.
The value returned by TIOCOUTQ seems to count bytes in the OS buffer, but NOT bytes in the UART FIFO, so it may underestimate the delay required if transmission has already started.
I am currently using CLOCK_MONOTONIC to timestamp each send, calculating when the send should be complete, then checking that time against the next send and delaying if necessary. Sucks, but seems to work.
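A rough sketch of that timestamp bookkeeping, not the poster's actual code: TxDeadline is a hypothetical class, std::chrono::steady_clock stands in for CLOCK_MONOTONIC, and 10 bits per byte on the wire is assumed. The current time is passed in explicitly so the arithmetic can be checked without real hardware.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

using Clock = std::chrono::steady_clock;

// Tracks when the last queued byte should have left the wire.
class TxDeadline {
public:
    explicit TxDeadline(long baudRate) : baudRate_(baudRate) {
        assert(baudRate > 0);
    }

    // Record a write of `bytes` made at time `now` and return the estimated
    // completion time. If the previous send is still in flight, the new
    // bytes are assumed to queue up behind it.
    Clock::time_point onSend(std::size_t bytes, Clock::time_point now) {
        const auto start = (done_ > now) ? done_ : now;
        const auto wire = std::chrono::microseconds(
            static_cast<long long>(bytes) * 10 * 1000000 / baudRate_);
        done_ = start + wire;
        return done_;
    }

    // How long the caller still has to wait before dropping data enable.
    Clock::duration remaining(Clock::time_point now) const {
        return done_ > now ? done_ - now : Clock::duration::zero();
    }

private:
    long baudRate_;
    Clock::time_point done_{};
};
```

Before the next send, or before lowering data enable, the caller would wait for remaining(Clock::now()). Any extra margin for the UART FIFO would have to be added on top and is not modelled here.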