How best to interrupt a ZeroMQ poll for cleanup and termination - C++

Writing in C++, I have a thread that uses zmq::poll to discover when there are new events to process, which works fine. What I want, though, is for this thread to exit cleanly when no more events are expected.
Rather than an infinite while loop I could put a condition in there, but the thread would take up to REQUEST_TIMEOUT_MS to notice it. So my question is: what is the best way to interrupt the poll for program exit?
void * Requester::recieve_thread(void *arg) {
    zmq::socket_t *soc = (zmq::socket_t *) arg;
    zmq::pollitem_t items[] = { { *soc, 0, ZMQ_POLLIN, 0 } };

    while (1) {
        zmq::poll(&items[0], 1, REQUEST_TIMEOUT_MS);
        if (items[0].revents & ZMQ_POLLIN) {
            // process the event
        }
    }
    // clean up
}

It is often suggested that you can simply destroy the zmq context and anything sharing that context will exit. However, this creates a nightmare: destroying the context deletes the socket objects, and your exit code then has to pick its way through a minefield of dangling pointers.
Attempting to close the socket from another thread doesn't work either, because sockets are not thread-safe and you'll end up with crashes.
ANSWER: The best way is to do as the ZeroMQ guide suggests for any communication between threads: use zmq sockets, not thread mutexes/locks/etc.
Requester::Requester(zmq::context_t* context)
{
    m_context = context;

    // Create a socket that you'll use as the interrupt-event receiver.
    // I'm using a random address and an inproc socket (inprocs need to share a context).
    snprintf(m_signalStopAddr, sizeof(m_signalStopAddr) / sizeof(*m_signalStopAddr),
             "inproc://%lx%x", (unsigned long)this, rand());

    m_signalStop = new zmq::socket_t(*m_context, ZMQ_PAIR);
    m_signalStop->bind(m_signalStopAddr);
}

// Your thread-safe poll interrupter
void Requester::interrupt()
{
    char dummy;
    zmq::socket_t doSignal(*m_context, ZMQ_PAIR);
    doSignal.connect(m_signalStopAddr);
    doSignal.send(&dummy, sizeof(dummy));
}
void * Requester::recieve_thread(void *arg)
{
    zmq::socket_t *soc = (zmq::socket_t *) arg;

    zmq::pollitem_t items[] =
    {
        { *soc,          0, ZMQ_POLLIN, 0 },
        { *m_signalStop, 0, ZMQ_POLLIN, 0 }
    };

    while (1)
    {
        zmq::poll(items, 2, REQUEST_TIMEOUT_MS);

        if (items[1].revents & ZMQ_POLLIN)
        {
            break; // interrupted, exit the loop
        }
        if (items[0].revents & ZMQ_POLLIN)
        {
            // process the event
        }
    }
    // Cleanup
}
zmq::context_t* m_context;
zmq::socket_t* m_signalStop; // Don't forget to delete this!
char m_signalStopAddr[100];

Don't interrupt the poll - send the thread a message instructing it to clean up and exit.
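For completeness, here's a rough usage sketch (untested, and not from the original answer: it assumes a static Requester::thread_entry trampoline exists, since pthread_create needs a plain function pointer):

zmq::context_t context(1);
Requester requester(&context);

pthread_t thread;
pthread_create(&thread, NULL, &Requester::thread_entry, &requester); // assumed trampoline

// ... later, from any other thread, when it's time to shut down:
requester.interrupt();        // connects a PAIR socket and sends one byte,
                              // making items[1] readable inside the poll loop
pthread_join(thread, NULL);   // recieve_thread breaks out and cleans up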

Related

Creating a dispatch queue / thread handler in C++ with pipes: FIFOs overfilling

Threads are resource-heavy to create and use, so often a pool of threads will be reused for asynchronous tasks. A task is packaged up and then "posted" to a broker that will enqueue the task on the next available thread.
This is the idea behind dispatch queues (i.e. Apple's Grand Central Dispatch) and thread handlers (Android's Looper mechanism).
Right now, I'm trying to roll my own. In fact, I'm plugging a gap in Android, which has an API for posting tasks in Java but not in the native NDK. However, I'm keeping this question platform-independent where I can.
Pipes are the ideal choice for my scenario: I can easily poll the file descriptor of the read end of a pipe(2) on my worker thread, and enqueue tasks from any other thread by writing to the write end. Here's what that looks like:
int taskRead, taskWrite;

void setup() {
    // Create the pipe
    int taskPipe[2];
    ::pipe(taskPipe);
    taskRead = taskPipe[0];
    taskWrite = taskPipe[1];

    // Set up a routine that is called when taskRead reports new data
    function_that_polls_file_descriptor(taskRead, []() {
        // Read the callback data
        std::function<void(void)>* taskPtr;
        ::read(taskRead, &taskPtr, sizeof(taskPtr));

        // Run the task - this is unsafe! See below.
        (*taskPtr)();

        // Clean up
        delete taskPtr;
    });
}

void post(const std::function<void(void)>& task) {
    // Copy the function onto the heap
    auto* taskPtr = new std::function<void(void)>(task);

    // Write the pointer to the pipe - this may block if the FIFO is full!
    ::write(taskWrite, &taskPtr, sizeof(taskPtr));
}
This code puts a std::function on the heap, and passes the pointer to the pipe. The function_that_polls_file_descriptor then calls the provided expression to read the pipe and execute the function. Note that there are no safety checks in this example.
This works great 99% of the time, but there is one major drawback. Pipes have a limited size, and if the pipe is filled, then calls to post() will hang. This in itself is not unsafe, until a call to post() is made within a task.
auto evil = []() {
    // Post a new task back onto the queue
    post({});

    // Not enough new tasks, let's make more!
    for (int i = 0; i < 3; i++) {
        post({});
    }
    // Now for each time this task is posted, 4 more tasks will be added to the queue.
};

post(evil);
post(evil);
...
If this happens, then the worker thread will be blocked, waiting to write to the pipe. But the pipe's FIFO is full, and the worker thread is not reading anything from it, so the entire system is in deadlock.
What can be done to ensure that calls to post() emanating from the worker thread always succeed, allowing the worker to continue processing the queue in the event it is full?
Thanks to all the comments and other answers in this post, I now have a working solution to this problem.
The trick I've employed is to prioritise worker threads by checking which thread is calling post(). Here is the rough algorithm:
pipe ← NON-BLOCKING-PIPE()
overflow ← Ø

POST(task)
    success ← WRITE(task, pipe)
    IF NOT success THEN
        IF THREAD-IS-WORKER() THEN
            overflow ← overflow ∪ {task}
        ELSE
            WAIT(pipe)
            POST(task)
Then on the worker thread:
LOOP FOREVER
    task ← READ(pipe)
    RUN(task)
    FOR EACH overtask ∈ overflow
        RUN(overtask)
    overflow ← Ø
The wait is performed with pselect(2), adapted from the answer by #Sigismondo.
Here's the algorithm implemented in my original code example that will work for a single worker thread (although I haven't tested it after copy-paste). It can be extended to work for a thread pool by having a separate overflow queue for each thread.
int taskRead, taskWrite;

// These variables are only allowed to be modified by the worker thread
std::thread::id workerId;
std::queue<std::function<void(void)>*> overflow;
bool overflowInUse;

void setup() {
    int taskPipe[2];
    ::pipe(taskPipe);
    taskRead = taskPipe[0];
    taskWrite = taskPipe[1];

    // Make the pipe non-blocking so pipe overflows can be detected manually
    ::fcntl(taskWrite, F_SETFL, ::fcntl(taskWrite, F_GETFL, 0) | O_NONBLOCK);

    // Save the ID of this worker thread to compare later
    workerId = std::this_thread::get_id();
    overflowInUse = false;

    function_that_polls_file_descriptor(taskRead, []() {
        // Read the callback data
        std::function<void(void)>* taskPtr;
        ::read(taskRead, &taskPtr, sizeof(taskPtr));

        // Run the task
        (*taskPtr)();
        delete taskPtr;

        // Run any tasks that were posted to the overflow
        while (!overflow.empty()) {
            taskPtr = overflow.front();
            overflow.pop();

            (*taskPtr)();
            delete taskPtr;
        }

        // Release the overflow mechanism if applicable
        overflowInUse = false;
    });
}

bool write(std::function<void(void)>* taskPtr, bool blocking = true) {
    ssize_t rc = ::write(taskWrite, &taskPtr, sizeof(taskPtr));

    // Failure handling
    if (rc < 0) {
        // If blocking is allowed, wait for the pipe to become writable
        int err = errno;
        if ((err == EAGAIN || err == EWOULDBLOCK) && blocking) {
            fd_set fds;
            FD_ZERO(&fds);
            FD_SET(taskWrite, &fds);
            ::pselect(taskWrite + 1, nullptr, &fds, nullptr, nullptr, nullptr);

            // Try again
            return write(taskPtr);
        }
        // Otherwise return false
        return false;
    }
    return true;
}

void post(const std::function<void(void)>& task) {
    auto* taskPtr = new std::function<void(void)>(task);

    if (std::this_thread::get_id() == workerId) {
        // The worker thread gets 1st-class treatment.
        // It won't be blocked if the pipe is full; instead it
        // uses an overflow queue until the overflow has been cleared.
        if (!overflowInUse) {
            bool success = write(taskPtr, false);
            if (!success) {
                overflow.push(taskPtr);
                overflowInUse = true;
            }
        } else {
            overflow.push(taskPtr);
        }
    } else {
        write(taskPtr);
    }
}
Make the pipe's write file descriptor non-blocking, so that write() fails with EAGAIN when the pipe is full.
One improvement is to increase the pipe buffer size (see the sketch after this list).
Another is to use a UNIX socket/socketpair and increase the socket buffer size.
Yet another solution is to use a UNIX datagram socket which many worker threads can read from, but only one will get each datagram; in other words, you can use a datagram socket as a thread dispatcher.
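For the first two suggestions, a sketch (Linux-specific, untested): F_SETPIPE_SZ has been available since kernel 2.6.35 and may require _GNU_SOURCE; the 1 MiB figure is an arbitrary illustration, not a recommendation.

#include <fcntl.h>   // may need _GNU_SOURCE defined for F_SETPIPE_SZ
#include <unistd.h>

void setup_pipe(int taskPipe[2]) {
    ::pipe(taskPipe);

    // Grow the pipe's capacity so post() fails less often...
    ::fcntl(taskPipe[1], F_SETPIPE_SZ, 1048576);

    // ...and make the write end non-blocking so a full pipe reports
    // EAGAIN instead of blocking the posting thread.
    ::fcntl(taskPipe[1], F_SETFL, ::fcntl(taskPipe[1], F_GETFL, 0) | O_NONBLOCK);
}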
You can use good old select to determine whether the file descriptors are ready to be used for writing:
The file descriptors in writefds will be watched to see if
space is available for write (though a large write may still block).
Since you are writing a pointer, your write() cannot be classified as large at all.
Clearly you must be ready to handle the fact that a post may fail, and be prepared to retry it later... otherwise you will be facing indefinitely growing pipes until your system breaks again.
More or less (not tested):
bool post(const std::function<void(void)>& task) {
    bool post_res = false;

    // Copy the function onto the heap
    auto* taskPtr = new std::function<void(void)>(task);

    fd_set wfds;
    struct timeval tv;
    int retval;

    FD_ZERO(&wfds);
    FD_SET(taskWrite, &wfds);

    // Don't wait at all
    tv.tv_sec = 0;
    tv.tv_usec = 0;

    retval = select(taskWrite + 1, NULL, &wfds, NULL, &tv);

    // select() returns 0 when no FDs are ready
    if (retval == -1) {
        // handle error condition
    } else if (retval > 0) {
        // Write the pointer to the pipe. This write will succeed
        ::write(taskWrite, &taskPtr, sizeof(taskPtr));
        post_res = true;
    }
    return post_res;
}
If you are only looking at Android/Linux, using a pipe is not state of the art; using an event file descriptor (eventfd) together with epoll is the way to go.
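A minimal sketch of the eventfd + epoll combination (my own illustration, not from the answer above; note the eventfd only counts pending wakeups, so the task pointers themselves would have to live in a separate mutex-protected queue):

#include <sys/eventfd.h>
#include <sys/epoll.h>
#include <unistd.h>
#include <cstdint>
#include <cstdio>

int main() {
    int efd = ::eventfd(0, EFD_NONBLOCK); // a 64-bit kernel counter, not a byte stream
    int epfd = ::epoll_create1(0);

    struct epoll_event ev = {};
    ev.events = EPOLLIN;
    ev.data.fd = efd;
    ::epoll_ctl(epfd, EPOLL_CTL_ADD, efd, &ev);

    // Producer side: "one more task". Unlike a pipe this cannot fill up;
    // the counter merely saturates near UINT64_MAX.
    uint64_t one = 1;
    ::write(efd, &one, sizeof(one));

    // Worker side: wait until the counter is non-zero, then drain it.
    struct epoll_event out;
    if (::epoll_wait(epfd, &out, 1, -1) == 1) {
        uint64_t pending = 0;
        ::read(efd, &pending, sizeof(pending)); // atomically resets the counter
        std::printf("%llu task(s) pending\n", (unsigned long long)pending);
    }

    ::close(efd);
    ::close(epfd);
    return 0;
}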

create threads but don't run them immediately in Linux

I am trying to execute my program in threads. I use pthread_create(), but it runs the threads immediately. I would like to allow the user to change thread priorities before they run. How can this be resolved?
for(int i = 0; i < threads; i++)
{
    pthread_create(data->threads+i, NULL, SelectionSort, data);
    sleep(1);
    print(data->array);
}
Set the priority as you create the thread.
Replace
errno = pthread_create(..., NULL, ...);
if (errno) { ... }
with
pthread_attr_t attr;
errno = pthread_attr_init(&attr);
if (errno) { ... }

{
    struct sched_param sp;
    errno = pthread_attr_getschedparam(&attr, &sp);
    if (errno) { ... }
    sp.sched_priority = ...;
    errno = pthread_attr_setschedparam(&attr, &sp);
    if (errno) { ... }
}

/* So our scheduling priority gets used. */
errno = pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
if (errno) { ... }

errno = pthread_create(..., &attr, ...);
if (errno) { ... }

errno = pthread_attr_destroy(&attr);
if (errno) { ... }
For pthreads, the priority isn't set after thread creation but rather by passing suitable attributes upon creation: the thread attributes go where you have specified NULL in your pthread_create() call. If you want to delay thread creation until the user has given you a priority, you can create a function object expecting the priority; when that function object is called, it kicks off the thread. Of course, you'll still need to keep track of the object created this way (possibly using a std::future<...>-like object) to later join that thread (a sketch follows the note below).
Note that providing an answer shouldn't be construed as endorsing thread priorities: as far as I can tell, playing with thread priorities is ill-advised.
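A rough sketch of that deferred-creation idea (my own illustration: the StartableThread name is made up, and SCHED_FIFO is just one possible policy; real-time policies typically need elevated privileges):

#include <pthread.h>
#include <sched.h>

// Construction captures the entry point; the thread is only created
// once the user has supplied a priority.
struct StartableThread {
    void *(*entry)(void *);
    void *arg;
    pthread_t tid;

    int start(int priority) {
        pthread_attr_t attr;
        struct sched_param sp;

        pthread_attr_init(&attr);
        pthread_attr_setschedpolicy(&attr, SCHED_FIFO); // assumption: RT policy
        sp.sched_priority = priority;
        pthread_attr_setschedparam(&attr, &sp);

        // Without this the new thread ignores the attributes above and
        // inherits the creator's scheduling instead.
        pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);

        int rc = pthread_create(&tid, &attr, entry, arg);
        pthread_attr_destroy(&attr);
        return rc; // 0 on success, otherwise an errno-style code
    }
};

// Usage: StartableThread t{SelectionSort, data}; ... t.start(user_priority);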

Event loop handling for sd-bus in libuv

We have an event loop from libuv to handle UNIX sockets and TCP sockets. The program now also has to handle D-Bus, and we decided to use sd-bus for that.
Lennart wrote on his blog:
Note that our APIs, including sd-bus, integrate nicely into sd-event
event loops, but do not require it, and may be integrated into other
event loops too, as long as they support watching for time and I/O events.
So I assume it must be possible.
I can get the D-Bus socket fd via sd_bus_get_fd(sd_bus *bus).
But I can't find any obvious way to stop sd-bus from using its internal bus_poll method to wait for events.
For example, calling a method with sd_bus_call(...) will block in ppoll.
So: How do I handle the dbus events in libuv?
I figured it out. Here's an example of how to unite C++, libuv and sd-bus:
I recommend that you read http://0pointer.de/blog/the-new-sd-bus-api-of-systemd.html to understand sd-bus in general.
These are code snippets from my implementation at https://github.com/TheJJ/horst
Method calls can then be done with sd_bus_call_async, which does not block (as opposed to sd_bus_call).
Don't forget to call update_events() after sd_bus_call_async so the call is actually sent out over the socket!
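For instance, something along these lines (untested; the destination, path, interface and method names are placeholders I made up):

static int on_reply(sd_bus_message *reply, void *userdata, sd_bus_error *ret_error) {
    // handle the reply here
    return 0;
}

void call_async_example(DBusConnection *conn) {
    sd_bus_message *msg = nullptr;
    sd_bus_message_new_method_call(conn->get_bus(), &msg,
                                   "org.example.Dest",   // placeholder
                                   "/org/example/Obj",   // placeholder
                                   "org.example.Iface",  // placeholder
                                   "SomeMethod");        // placeholder
    sd_bus_call_async(conn->get_bus(), nullptr, msg, on_reply, nullptr, 0);
    sd_bus_message_unref(msg);

    // Without this, the request may sit in sd-bus's output buffer unsent.
    conn->update_events();
}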
/**
 * Callback function that is invoked from libuv
 * once dbus events flowed in.
 */
static void on_dbus_ready(uv_poll_t *handle, int /*status*/, int /*events*/) {
    DBusConnection *connection = (DBusConnection *)handle->data;
    sd_bus *bus = connection->get_bus();

    // let dbus handle the available events
    while (true) {
        // this will trigger the dbus vtable-registered functions
        int r = sd_bus_process(bus, nullptr);

        if (r < 0) {
            printf("[dbus] Failed to process bus: %s", strerror(-r));
            break;
        }
        else if (r > 0) {
            // try to process another request!
            continue;
        }
        else {
            // no more progress, wait for the next callback.
            break;
        }
    }

    // update the events we watch for on the socket.
    connection->update_events();
}

/**
 * Convert the sdbus-returned poll flags to
 * corresponding libuv flags.
 */
int poll_to_libuv_events(int pollflags) {
    int ret = 0;

    if (pollflags & (POLLIN | POLLPRI)) {
        ret |= UV_READABLE;
    }
    if (pollflags & POLLOUT) {
        ret |= UV_WRITABLE;
    }

    // we also have the non-corresponding UV_DISCONNECT
    return ret;
}

class DBusConnection {
public:
    DBusConnection(Satellite *sat);
    virtual ~DBusConnection() = default;

    /** connect to dbus */
    int connect() {
        int r = sd_bus_open_system(&this->bus);
        if (r < 0) {
            printf("[dbus] Failed to connect to bus: %s", strerror(-r));
            goto clean_return;
        }

        r = sd_bus_add_object_vtable(
            this->bus,
            &this->bus_slot,
            "/rofl/lol",  // object path
            "rofl.lol",   // interface name
            your_vtable,
            this          // this is the userdata that'll be passed
                          // to the dbus methods
        );
        if (r < 0) {
            printf("[dbus] Failed to install the horst sdbus object: %s", strerror(-r));
            goto clean_return;
        }

        // register our service name
        r = sd_bus_request_name(this->bus, "moveii.horst", 0);
        if (r < 0) {
            printf("[dbus] Failed to acquire service name: %s", strerror(-r));
            goto clean_return;
        }

        // register the filedescriptor from
        // sd_bus_get_fd(bus) to libuv
        uv_poll_init(this->loop, &this->connection, sd_bus_get_fd(this->bus));

        // make `this` reachable in callbacks.
        this->connection.data = this;

        // init the dbus-event-timer
        uv_timer_init(this->loop, &this->timer);
        this->timer.data = this;

        // process initial events and set up the
        // events and timers for subsequent calls
        on_dbus_ready(&this->connection, 0, 0);

        printf("[dbus] Listener initialized");
        return 0;

    clean_return:
        sd_bus_slot_unref(this->bus_slot);
        sd_bus_unref(this->bus);
        this->bus = nullptr;
        return 1;
    }

    /** update the events watched for on the filedescriptor */
    void update_events() {
        sd_bus *bus = this->get_bus();

        // prepare the callback for calling us the next time.
        int new_events = poll_to_libuv_events(
            sd_bus_get_events(bus)
        );

        uint64_t usec;
        int r = sd_bus_get_timeout(bus, &usec);

        if (not r) {
            // if the timer is running already, it is stopped automatically
            // inside uv_timer_start.
            uv_timer_start(
                &this->timer,
                [] (uv_timer_t *handle) {
                    // yes, handle is not a poll_t, but
                    // we just care for its ->data member anyway.
                    on_dbus_ready((uv_poll_t *)handle, 0, 0);
                },
                usec / 1000, // time in milliseconds, sd_bus provides µseconds
                0            // don't repeat
            );
        }

        // always watch for disconnects:
        new_events |= UV_DISCONNECT;

        // activate the socket watching,
        // and if active, invoke the callback function
        uv_poll_start(&this->connection, new_events, &on_dbus_ready);
    }

    /** close the connections */
    int close() {
        // TODO: this may produce memory errors when the loop actually
        // does the cleanup; we have to wait for the callback.
        uv_close((uv_handle_t *) &this->timer, nullptr);
        uv_poll_stop(&this->connection);

        sd_bus_close(this->bus);
        sd_bus_slot_unref(this->bus_slot);
        sd_bus_unref(this->bus);
        return 0;
    }

    /**
     * Return the bus handle.
     */
    sd_bus *get_bus() const {
        return this->bus;
    }

protected:
    /**
     * loop handle
     */
    uv_loop_t *loop;

    /**
     * polling object for dbus events
     */
    uv_poll_t connection;

    /**
     * dbus also wants to be called periodically
     */
    uv_timer_t timer;

    /**
     * dbus bus handle
     */
    sd_bus *bus;

    /**
     * dbus slot handle
     */
    sd_bus_slot *bus_slot;
};
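To tie it together, a driver might look roughly like this (untested sketch: it assumes the constructor stores a uv_loop_t* in this->loop, and that your_vtable is defined elsewhere; neither is shown in the snippets above):

int main() {
    uv_loop_t *loop = uv_default_loop();

    DBusConnection conn(nullptr); // assumption: the ctor wires up `loop`
    if (conn.connect() != 0) {
        return 1;
    }

    // libuv now dispatches D-Bus traffic to on_dbus_ready(), which
    // re-arms the poll handle and timer via update_events().
    uv_run(loop, UV_RUN_DEFAULT);

    conn.close();
    return 0;
}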

ZMQ recv() is blocking even after the context was terminated

I did my best to follow the instructions in the ZMQ termination whitepaper, but so far I'm failing miserably.
I have a parent class, which spawns a listener thread (using win32-pthreads).
According to the whitepaper, when terminating I should set the _stopped flag and delete the context, which in turn would call zmq_term() and release the blocking recv(). Instead, what I get is either:
calling delete _zmqContext crashes the application (probably with a segmentation fault)
replacing the delete with zmq_term(_zmqContext) does not release the blocking recv()
I'm adding a partial code sample, which is long because I'm not sure which part may be important.
AsyncZmqListener.hpp:
class AsyncZmqListener
{
public:
    AsyncZmqListener(const std::string uri);
    ~AsyncZmqListener();

    bool Start();
    void Stop();

private:
    static void* _threadEntryFunc(void* _this);
    void _messageLoop();

private:
    bool _stopped;
    pthread_t _thread;
    zmq::context_t* _zmqContext;
};
AsyncZmqListener.cpp:
AsyncZmqListener::AsyncZmqListener(const std::string uri) : _uri(uri)
{
    _zmqContext = new zmq::context_t(1);
    _stopped = false;
}

void AsyncZmqListener::Start()
{
    int status = pthread_create(&_thread, NULL, _threadEntryFunc, this);
}

void AsyncZmqListener::Stop()
{
    _stopped = true;
    delete _zmqContext;          // <-- Crashes the application. Changing to
                                 //     'zmq_term(_zmqContext)' does not terminate recv()
    pthread_join(_thread, NULL); // <-- This waits forever
}

void AsyncZmqListener::_messageLoop()
{
    zmq::socket_t listener(*_zmqContext, ZMQ_PULL);
    listener.bind(_uri.c_str());

    zmq::message_t message;
    while(!_stopped)
    {
        listener.recv(&message); // <-- blocks forever
        process(message);
    }
}
P.S.
I'm aware of this related question, but none of the answers quite match the clean exit flow described in the whitepaper. I will resort to polling if I have to...
ZMQ recv() did unblock after its related context was terminated.
I was simply not aware that recv() throws an ETERM exception when this happens.
Revised code that works:
void AsyncZmqListener::_messageLoop()
{
    zmq::socket_t listener(*_zmqContext, ZMQ_PULL);
    listener.bind(_uri.c_str());

    zmq::message_t message;
    while(!_stopped)
    {
        try
        {
            listener.recv(&message);
            process(message);
        }
        catch(const zmq::error_t& ex)
        {
            // recv() throws ETERM when the zmq context is destroyed,
            // as when AsyncZmqListener::Stop() is called
            if(ex.num() != ETERM)
                throw;
        }
    }
}
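With the ETERM handling in place, the Stop() flow from the question works as written (restated here for clarity; the key point is that `listener` lives on the listener thread's stack, so it is closed by that thread when the loop exits, which lets the context termination complete):

void AsyncZmqListener::Stop()
{
    _stopped = true;
    delete _zmqContext;          // zmq_term() makes the blocked recv() throw ETERM
    pthread_join(_thread, NULL); // now returns instead of waiting forever
}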

cancelling a thread inside a signal handler

I have started a timer with an interval of 5 seconds and registered a signal handler for it.
When SIGALRM is raised, I try to terminate the thread inside the signal handler, but I'm not able to: the thread is not terminated; instead the whole process is killed.
The following is the code:
void signalHandler()
{
    printf("Caught signal ...\n");
    printf("Now going to terminate thread..\n");
    pthread_kill(tid, SIGKILL);
}

void * thread_function()
{
    int oldstate;
    char result[256] = {0};
    time_t startTime = time(NULL);
    time_t timerDuration = 5;
    time_t endTime = startTime + timerDuration;

    while(1) {
        printf("Timer is running as daemon..\n");
        if(!strcmp(result, "CONNECTED")) {
            resp = 1;
            pthread_exit(&resp);
        }
    }
}

int main()
{
    int *ptr[2];
    signal(SIGALRM, signalHandler);

    timer.it_interval.tv_sec = 0;
    timer.it_interval.tv_usec = 0;
    timer.it_value.tv_sec = INTERVAL;
    timer.it_value.tv_usec = 0;
    setitimer(ITIMER_REAL, &timer, 0);

    pthread_create(&tid, NULL, thread_function, NULL);
    pthread_join(tid, (void**)&(ptr[0]));
    printf("test %d\n\n", *ptr[0]);

    while(1)
        printf("1");
}
Platform: Linux, gcc compiler
As far as I'm aware, you pretty much can't call anything inside a signal handler, because you don't know what state your code is in.
Your best option is to set up a dedicated thread to handle your signals. All your other threads should call pthread_sigmask to block all signals; then you create one more thread which uses pthread_sigmask to unblock SIGALRM and calls sigwait, at which point it can cancel the other thread.
The way of handling signals is quite different in a multi-threaded environment compared to a single-threaded one. In multi-threaded code, you should block all the signals in the threads that run your business logic and then create a separate thread for handling signals. This is because, in a multi-threaded environment, you cannot be sure which thread the signal will be delivered to.
Please refer to this link for more details:
http://devcry.heiho.net/2009/05/pthreads-and-unix-signals.html
Apart from this, to kill a thread use pthread_cancel which should work fine for you.
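A rough sketch of the dedicated signal-handling thread described above (untested; it reuses thread_function and tid from the question, and the worker must reach a cancellation point for the cancel to take effect):

#include <pthread.h>
#include <signal.h>
#include <stdio.h>

static void *signal_thread(void *arg) {
    sigset_t set;
    int sig;

    sigemptyset(&set);
    sigaddset(&set, SIGALRM);

    // Receive the signal synchronously; no async-signal-safety worries here.
    sigwait(&set, &sig);
    printf("Caught signal %d, cancelling worker\n", sig);
    pthread_cancel(tid); // tid is the worker thread from the question
    return NULL;
}

int main(void) {
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGALRM);

    // Block SIGALRM before creating any threads; they inherit the mask,
    // so only signal_thread (via sigwait) will ever see the signal.
    pthread_sigmask(SIG_BLOCK, &set, NULL);

    pthread_create(&tid, NULL, thread_function, NULL);

    pthread_t sig_tid;
    pthread_create(&sig_tid, NULL, signal_thread, NULL);

    /* ... arm the itimer and join as in the question ... */
    pthread_join(tid, NULL);
    return 0;
}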
You can try using a flag:

// Ideally these would be volatile sig_atomic_t (or atomics), since they
// are written from a signal handler. Note the initializer below only
// sets the first element to 1.
int go_on[number_of_threads] = { 1 };

void signalHandler()
{
    printf("Caught signal ...\n");
    printf("Now going to terminate thread..\n");
    go_on[tid] = 0; // tid must be a small integer thread index here, not a pthread_t
}

void * thread_function()
{   /* ... */
    while(go_on[this_thread_id]) {
        printf("Timer is running as daemon..\n");
        if(!strcmp(result, "CONNECTED")) {
            resp = 1;
            pthread_exit(&resp);
        }
    }
}