I normally use single thread process with signal handlers and to achieve concurrency, by dividing parallel tasks into multiple process.
Now, i am trying to check if multi-threading can be faster. To implement alarms/timers, i typically register alarmHandlers and let OS send a signal. But in multi-threading environment, i cannot take this approach, UNLESS, there is a way such that signal can be delivered to a specific thread.
Hence my question, how to implement timers in multithreading environment? I can start a thread and let it sleep for desired amount and then set a shared variable. What other options do i have?
I assume you want to start threads at different times.
You can use the sleep_until function.
This is a C++11 function
The thread will sleep until a certain moment
http://en.cppreference.com/w/cpp/thread/sleep_until
So if you have several tasks to do your code would look like that:
int PerformTask(const std::chrono::time_point<Clock,Duration>& sleep_time, TaskParameters)
{
sleep_until(sleep_time);
PerformTask(TaskParameters);
}
Hope that helps,
You don't specify which environment (OS, API, etc) you are using so any answers you get are going to have to be fairly general.
From your example about starting a thread and having it sleep for a while and then set a shared variable, it sounds like what you're trying to do is have multiple threads all do something special at a particular time, correct?
If so, one easy way to do it would be to choose the alarm-time before spawning the threads, so that each thread can know in advance when to do the special action. Then its just a matter of coding each thread to "watch the clock" and do the action at the appointed time.
But let's say that you don't know in advance when the alarm is supposed to go off. In that case, what I think you need is a mechanism of inter-thread communication. Once you have a way for one thread to send a signal/message to another thread, you can use that to tell the target thread(s) when it's time for them to do the alarm-action.
There are various APIs to do that, but the way I like to use (because it's cross-platform portable and uses the standard BSD sockets API) is to create an entirely-local socket connection before spawning each thread. Under Unix/Posix, you can do this quite easily by calling socketpair(). Under Windows there isn't a socketpair() function to call but you can roll your own socketpair() via the usual networking calls (socket(),bind(),listen(),accept() for one socket, then socket() and connect() to create the other socket and connect it to the first end).
Once you have the pair of connected sockets, you have your parent thread keep only the first socket, and the newly-spawned thread keeps only the second socket. Et voila, now your parent thread and child thread can communicate with each other over the socket. E.g. if your parent thread wants the child thread to do something, it can send() a byte on its socket and the child thread will recv() that byte on its socket, or vice versa if the child thread wants to tell the parent to do something.
In that way, the parent thread could spawn a bunch of threads and then send a byte on each thread's socket when the alarm time arrived. The child threads in the meantime could be doing work and polling their socket via non-blocking recv() calls, or if they prefer to sleep while waiting for the alarm, they could block inside select() or recv() or whatever.
Note that you don't have to send all of your cross-thread data over the socketpair if you don't want to; usually I just lock a mutex, add a command object to a FIFO queue, unlock the mutex, and then send a single byte. When the child thread receives that byte, it responds by locking the same mutex, popping the command object off of the FIFO queue, unlocking the mutex, and then executing the command. That way you can used shared memory to "send" arbitrarily large amounts of data to the child thread without having to send lots of bytes across the socket. The one byte that is sent acts as only a "signal" to wake up the child thread.
Implement a timer with boost::asio
Here is a timer class witch we used in our project, witch project deal with 4Gbit/s internet flow(about 3.0-4.0 million timers). The timer is suit for most generaly work.
timer.h
/*
* Timer
* Licensed under Apache
*
* Author: KaiWen <wenkai1987#gmail.com>
* Date: Apr-16-2013
*
*/
#ifndef TIMER_H
#define TIMER_H
#include <boost/asio.hpp>
#include <boost/thread.hpp>
#include <boost/shared_ptr.hpp>
#include <boost/function.hpp>
#include <boost/unordered_map.hpp>
typedef boost::asio::deadline_timer* timer_ptr;
namespace bs = boost::system;
class timer;
class timer_node;
class tm_callback {
public:
explicit tm_callback(boost::function<void(timer_node&)>& f) : m_f(f)
{
}
void operator()(timer_node& node, const bs::error_code& e) {
if (!e)
m_f(node);
}
private:
boost::function<void(timer_node&)> m_f;
};
class timer_node {
friend class timer;
public:
timer_node() {}
timer_node(timer_ptr p, int ms, boost::function<void(timer_node&)> f) :
m_tptr(p), m_ms(ms), m_callback(f)
{
}
void reset(unsigned int ms = 0, boost::function<void(timer_node&)> f = 0) {
if (ms)
m_tptr->expires_from_now(boost::posix_time::milliseconds(ms));
else
m_tptr->expires_from_now(boost::posix_time::milliseconds(m_ms));
if (f)
m_tptr->async_wait(boost::bind<void>(tm_callback(f), *this, _1));
else
m_tptr->async_wait(boost::bind<void>(tm_callback(m_callback), *this, _1));
}
private:
timer_ptr m_tptr;
int m_ms;
boost::function<void(timer_node&)> m_callback;
};
timer.cpp
/*
* Timer
*
* Licensed under Apache
*
* Author: KaiWen <wenkai1987#gmail.com>
* Date: Apr-16-2013
*
*/
#include "timer.h"
#include <boost/bind.hpp>
#include <boost/date_time/posix_time/ptime.hpp>
namespace ba = boost::asio;
timer::timer(int thread_num) : m_next_ios(0), m_size(0) {
for (int i = 0; i < thread_num; i++) {
io_service_ptr p(new ba::io_service);
work_ptr pw(new ba::io_service::work(*p));
m_ios_list.push_back(p);
m_works.push_back(pw);
}
pthread_spin_init(&m_lock, 0);
}
timer::~timer() {
pthread_spin_destroy(&m_lock);
}
void timer::run() {
for (size_t i = 0; i < m_ios_list.size(); i++)
m_threads.create_thread(boost::bind(&ba::io_service::run, &*m_ios_list[i]))->detach();
}
If you like, combine the timer.cpp to the timer.h, then there is just a header file. A simple usage:
#include <stdio.h>
#include "timer.h"
timer t(3);
void callback(timer_node& nd) {
std::cout << "time out" << std::endl;
t.del_timer(nd);
}
int main(void) {
t.run();
t.add_timer(5000, callback); // set timeout 5 seconds
sleep(6);
return 0;
}
Implement a thread special timer
There is a lock in the timer above, witch cause the program not very fast. You can implement your owen thread special timer, witch not use lock, and not block, fater than the timer above , but this need a 'driver' and implement hardly. Here is a way we implement it:
pkt = get_pkt();
if (pkt) {
now = pkt->sec;
timer.execut_timer(now);
}
Now, here is no lock, and non-block and boost your performance, we use it to deal with 10GBit/s internet flow(about 8.0-9.0 million timers). But this is implement dependence. Hope help you.
Related
std::queue<double> some_q;
std::mutex mu_q;
/* an update function may be an event observer */
void UpdateFunc()
{
/* some other processing */
std::lock_guard lock{ mu_q };
while (!some_q.empty())
{
const auto& val = some_q.front();
/* update different states according to val */
some_q.pop();
}
/* some other processing */
}
/* some other thread might add some values after processing some other inputs */
void AddVal(...)
{
std::lock_guard lock{ mu_q };
some_q.push(...);
}
For this case is it okay to handle the queue this way?
Or would it be better if I try to use a lock-free queue like the boost one?
How bad it is to lock a mutex in an infinite loop or an update function
It's pretty bad. Infinite loops actually make your program have undefined behavior unless it does one of the following:
terminate
make a call to a library I/O function
perform an access through a volatile glvalue
perform a synchronization operation or an atomic operation
Acquiring the mutex lock before entering the loop and just holding it does not count as performing a synchronization operation (in the loop). Also, when holding the mutex, noone can add information to the queue, so while processing the information you extract, all threads wanting to add to the queue will have to wait - and no other worker threads wanting to share the load can extract from the queue either. It's usually better to extract one task from the queue, release the lock and then work with what you got.
The common way is to use a condition_variable that lets other threads acquire the lock and then notify other threads waiting with the same condition_variable. The CPU will be pretty close to idle while waiting and wake up to do the work when needed.
Using your program as a base, it could look like this:
#include <chrono>
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <thread>
std::queue<double> some_q;
std::mutex mu_q;
std::condition_variable cv_q; // the condition variable
bool stop_q = false; // something to signal the worker thread to quit
/* an update function may be an event observer */
void UpdateFunc() {
while(true) {
double val;
{
std::unique_lock lock{mu_q};
// cv_q.wait lets others acquire the lock to work with the queue
// while it waits to be notified.
while (not stop_q && some_q.empty()) cv_q.wait(lock);
if(stop_q) break; // time to quit
val = std::move(some_q.front());
some_q.pop();
} // lock released so others can use the queue
// do time consuming work with "val" here
std::cout << "got " << val << '\n';
}
}
/* some other thread might add some values after processing some other inputs */
void AddVal(double val) {
std::lock_guard lock{mu_q};
some_q.push(val);
cv_q.notify_one(); // notify someone that there's a new value to work with
}
void StopQ() { // a function to set the queue in shutdown mode
std::lock_guard lock{mu_q};
stop_q = true;
cv_q.notify_all(); // notify all that it's time to stop
}
int main() {
auto th = std::thread(UpdateFunc);
// simulate some events coming with some time apart
std::this_thread::sleep_for(std::chrono::seconds(1));
AddVal(1.2);
std::this_thread::sleep_for(std::chrono::seconds(1));
AddVal(3.4);
std::this_thread::sleep_for(std::chrono::seconds(1));
AddVal(5.6);
std::this_thread::sleep_for(std::chrono::seconds(1));
StopQ();
th.join();
}
If you really want to process everything that is currently in the queue, then extract everything first and then release the lock, then work with what you extracted. Extracting everything from the queue is done quickly by just swapping in another std::queue. Example:
#include <atomic>
std::atomic<bool> stop_q{}; // needs to be atomic in this version
void UpdateFunc() {
while(not stop_q) {
std::queue<double> work; // this will be used to swap with some_q
{
std::unique_lock lock{mu_q};
// cv_q.wait lets others acquire the lock to work with the queue
// while it waits to be notified.
while (not stop_q && some_q.empty()) cv_q.wait(lock);
std::swap(work, some_q); // extract everything from the queue at once
} // lock released so others can use the queue
// do time consuming work here
while(not stop_q && not work.empty()) {
auto val = std::move(work.front());
work.pop();
std::cout << "got " << val << '\n';
}
}
}
You can use it like you currently are assuming proper use of the lock across all threads. However, you may run into some frustrations about how you want to call updateFunc().
Are you going to be using a callback?
Are you going to be using an ISR?
Are you going to be polling?
If you use a 3rd party lib it often trivializes thread synchronization and queues
For example, if you are using a CMSIS RTOS(v2). It is a fairly straight forward process to get multiple threads to pass information between each other. You could have multiple producers, and a single consumer.
The single consumer can wait in a forever loop where it waits to receive a message before performing its work
when timeout is set to osWaitForever the function will wait for an
infinite time until the message is retrieved (i.e. wait semantics).
// Two producers
osMessageQueuePut(X,Y,Z,timeout=0)
osMessageQueuePut(X,Y,Z,timeout=0)
// One consumer which will run only once something enters the queue
osMessageQueueGet(X,Y,Z,osWaitForever)
tldr; You are safe to proceed, but using a library will likely make your synchronization problems easier.
Im new to Concurrency and parallelism programming in C/C++ so I need some help woth my project.
I want to run multiple process using POSIX and Semaphores in C++. So the structure of the program should be the following one.
First I Open Serial port (Serial communication of the Raspberry PI 4). While the Serial is Open Two processes are running
First one, the main one run automatically and do the following:
The thread ask for ODOM Updates(Pressure and IMU from the microcontroller) and Publish them. Also every 0.3 seconds check the modem inbox and if something new it publish.
The other one only on DEMAND from ROS Services detect that there is new message in the modem inbox do HALT( on the first main Process) and execute (publish) on the serial Port. Then the first process resume with normal work
So I trying to do first some pseudo C++ code that looks like these But I need help as Im new to Concurrency and parallelism. Here it is
#include <stdio.h>
#include <pthread.h>
#include <semaphore.h>
#include <unistd.h>
sem_t mutex;
void* thread(void* arg) { //function which act like thread
//Main Thread
// Here code for ASK ODOM UPDATE..
// Here code for CHECK MODEM INBOX...
sem_wait(&mutex); //wait state
// ENTER in the second Process
// Here code for the second process which run on DEMAND..
// ROS SERVICES
// Here code for CHECK The MODEM INBOX and HALT the First Process
// Here code for EXECUTE on SERIAL PORT(PUBLISH)
sleep(0.1); //critical section
printf("\nCompleted...\n"); //comming out from Critical section
sem_post(&mutex);
}
main() {
sem_init(&mutex, 0, 1);
pthread_t th1,th2;
pthread_create(&th1,NULL,thread,NULL);
sleep(1);
pthread_create(&th2,NULL,thread,NULL);
//Join threads with the main thread
pthread_join(th1,NULL);
pthread_join(th2,NULL);
sem_destroy(&mutex);
}
So Im not sure this is the correct way of implementation on C++ . Any help in the implementation and maybe help on the actual C++ code?
Thanks
Fortunately, ROS lets you decide what threading model you want to use (http://wiki.ros.org/roscpp/Overview/Callbacks%20and%20Spinning).
You could use the ros::AsyncSpinner for that:
Your main thread starts the AsyncSpinner that runs in the background, listens to ROS messages, and calls your ROS callback functions in its own threads.
Then, your main thread cares about your serial port connection and forwards/publishes messages. In pseudo-code, it could look as follows:
#include <ros/ros.h>
#include <mutex>
#include <std_msgs/Float32.h>
std::mutex mutex_;
void callback(std_msgs::Float32ConstPtr msg) {
// reentrant preprocessing
{
std::lock_guard<std::mutex> guard( mutex_ );
// work with serial port
}
// reentrant posprocessing
}
int main(int argc, char* argv[]) {
ros::init(argc, argv, "name");
ros::NodeHandle node("~");
ros::Subscriber sub = node.subscribe("/test", 1, callback);
ros::AsyncSpinner spinner(1);
spinner.start();
while(ros::ok()) {
// reentrant preprocessing
{
std::lock_guard<std::mutex> guard(mutex_);
// work with serial port
}
// reentrant postprocessing
}
}
You can see the code block that is critical. Here, both threads are synchronized, i.e., only one thread is in its critical path at a time.
I used the C++ mutex, as it is the C++ std way, but you could change that of course.
Also, feel free to wait for serial port messages in the main thread to reduce heat production on your chip.
I need to work with several objects, where each operation may take a lot of time.
The processing could not be placed in a GUI (main) thread, where I start it.
I need to make all the communications with some objects on asynchronous operations, something similar to std::async with std::future or QtConcurrent::run() in my main framework (Qt 5), with QFuture, etc., but it doesn't provide thread selection. I need to work with a selected object (objects == devices) in only one additional thread always,
because:
I need to make a universal solution and don't want to make each class thread-safe
For example, even if make a thread-safe container for QSerialPort, Serial port in Qt cannot be accessed in more than one thread:
Note: The serial port is always opened with exclusive access (that is, no other process or thread can access an already opened serial port).
Usually a communication with a device consists of transmit a command and receive an answer. I want to process each Answer exactly in the place where Request was sent and don't want to use event-driven-only logic.
So, my question.
How can the function be implemented?
MyFuture<T> fut = myAsyncStart(func, &specificLiveThread);
It is necessary that one live thread can be passed many times.
Let me answer without referencing to Qt library since I don't know its threading API.
In C++11 standard library there is no straightforward way to reuse created thread. Thread executes single function and can be only joined or detachted. However, you can implement it with producer-consumer pattern. The consumer thread needs to execute tasks (represented as std::function objects for instance) which are placed in queue by producer thread. So if I am correct you need a single threaded thread pool.
I can recommend my C++14 implementation of thread pools as tasks queues. It isn't commonly used (yet!) but it is covered with unit tests and checked with thread sanitizer multiple times. The documentation is sparse but feel free to ask anything in github issues!
Library repository: https://github.com/Ravirael/concurrentpp
And your use case:
#include <task_queues.hpp>
int main() {
// The single threaded task queue object - creates one additional thread.
concurrent::n_threaded_fifo_task_queue queue(1);
// Add tasks to queue, task is executed in created thread.
std::future<int> future_result = queue.push_with_result([] { return 4; });
// Blocks until task is completed.
int result = future_result.get();
// Executes task on the same thread as before.
std::future<int> second_future_result = queue.push_with_result([] { return 4; });
}
If you want to follow the Active Object approach here is an example using templates:
The WorkPackage and it's interface are just for storing functions of different return type in a vector (see later in the ActiveObject::async member function):
class IWorkPackage {
public:
virtual void execute() = 0;
virtual ~IWorkPackage() {
}
};
template <typename R>
class WorkPackage : public IWorkPackage{
private:
std::packaged_task<R()> task;
public:
WorkPackage(std::packaged_task<R()> t) : task(std::move(t)) {
}
void execute() final {
task();
}
std::future<R> get_future() {
return task.get_future();
}
};
Here's the ActiveObject class which expects your devices as a template. Furthermore it has a vector to store the method requests of the device and a thread to execute those methods one after another. Finally the async function is used to request a method call from the device:
template <typename Device>
class ActiveObject {
private:
Device servant;
std::thread worker;
std::vector<std::unique_ptr<IWorkPackage>> work_queue;
std::atomic<bool> done;
std::mutex queue_mutex;
std::condition_variable cv;
void worker_thread() {
while(done.load() == false) {
std::unique_ptr<IWorkPackage> wp;
{
std::unique_lock<std::mutex> lck {queue_mutex};
cv.wait(lck, [this] {return !work_queue.empty() || done.load() == true;});
if(done.load() == true) continue;
wp = std::move(work_queue.back());
work_queue.pop_back();
}
if(wp) wp->execute();
}
}
public:
ActiveObject(): done(false) {
worker = std::thread {&ActiveObject::worker_thread, this};
}
~ActiveObject() {
{
std::unique_lock<std::mutex> lck{queue_mutex};
done.store(true);
}
cv.notify_one();
worker.join();
}
template<typename R, typename ...Args, typename ...Params>
std::future<R> async(R (Device::*function)(Params...), Args... args) {
std::unique_ptr<WorkPackage<R>> wp {new WorkPackage<R> {std::packaged_task<R()> { std::bind(function, &servant, args...) }}};
std::future<R> fut = wp->get_future();
{
std::unique_lock<std::mutex> lck{queue_mutex};
work_queue.push_back(std::move(wp));
}
cv.notify_one();
return fut;
}
// In case you want to call some functions directly on the device
Device* operator->() {
return &servant;
}
};
You can use it as follows:
ActiveObject<QSerialPort> ao_serial_port;
// direct call:
ao_serial_port->setReadBufferSize(size);
//async call:
std::future<void> buf_future = ao_serial_port.async(&QSerialPort::setReadBufferSize, size);
std::future<Parity> parity_future = ao_serial_port.async(&QSerialPort::parity);
// Maybe do some other work here
buf_future.get(); // wait until calculations are ready
Parity p = parity_future.get(); // blocks if result not ready yet, i.e. if method has not finished execution yet
EDIT to answer the question in the comments: The AO is mainly a concurrency pattern for multiple reader/writer. As always, its use depends on the situation. And so this pattern is commonly used in distributed systems/network applications, for example when multiple clients request a service from a server. The clients benefit from the AO pattern as they are not blocked, when waiting for the server to answer.
One reason why this pattern is not used so often in fields other then network apps might be the thread overhead. When creating a thread for every active object results in a lot of threads and thus thread contention if the number of CPUs is low and many active objects are used at once.
I can only guess why people think it is a strange issue: As you already found out it does require some additional programming. Maybe that's the reason but I'm not sure.
But I think the pattern is also very useful for other reasons and uses. As for your example, where the main thread (and also other background threads) require a service from singletons, for example some devices or hardware interfaces, which are only availabale in a low number, slow in their computations and require concurrent access, without being blocked waiting for a result.
It's Qt. It's signal-slot mechanism is thread-aware. On your secondary (non-GUI) thread, create a QObject-derived class with an execute slot. Signals connected to this slot will marshal the event to that thread.
Note that this QObject can't be a child of a GUI object, since children need to live in their parents thread, and this object explicitly does not live in the GUI thread.
You can handle the result using existing std::promise logic, just like std::future does.
I'm trying to get my hands on multi threading and it's not working so far. I'm creating a program which allows serial communication with a device and it's working quite well without multi threading. Now I want to introduce threads, one thread to continuously send packets, one thread to receive and process packets and another thread for a GUI.
The first two threads need access to four classes in total, but using pthread_create() I can only pass one argument. I then stumled upon a post here on stack overflow (pthread function from a class) where Jeremy Friesner presents a very elegant way. I then figured that it's easiest to create a Core class which contains all the objects my threads need access to as well as all functions for the threads.So here's a sample from my class Core:
/** CORE.CPP **/
#include "SerialConnection.h" // Clas for creating a serial connection using termios
#include "PacketGenerator.h" // Allows to create packets to be transfered
#include <pthread.h>
#define NUM_THREADS 4
class Core{
private:
SerialConnection serial; // One of the objects my threads need access to
pthread_t threads[NUM_THREADS];
pthread_t = _thread;
public:
Core();
~Core();
void launch_threads(); // Supposed to launch all threads
static void *thread_send(void *arg); // See the linked post above
void thread_send_function(); // See the linked post above
};
Core::Core(){
// Open serial connection
serial.open_connection();
}
Core::~Core(){
// Close serial connection
serial.close_connection();
}
void Core::launch_threads(){
pthread_create(&threads[0], NULL, thread_send, this);
cout << "CORE: Killing threads" << endl;
pthread_exit(NULL);
}
void *Core::thread_send(void *arg){
cout << "THREAD_SEND launched" << endl;
((Core *)arg)->thread_send_function();
return NULL;
}
void Core::thread_send_function(){
generator.create_hello_packet();
generator.send_packet(serial);
pthread_exit(NULL);
}
Problem is now that my serial object crashes with segmentation fault (that pointer stuff going on in Core::thread_send(void *arg) makes me suspicious. Even when it does not crash, no data is transmitted over the serial connection even though the program executed without any errors. Execution form main:
/** MAIN.CPP (extract) VARIANT 1 **/
int main(){
Core core;
core.launch_threads(); // No data is transferred
}
However, if I call the thread_send_function directly (the one the thread is supposed to execute), the data is transmitted over the serial connection flawlessly:
/** MAIN.CPP (extract) VARIANT 1 **/
int main(){
Core core;
core.thread_send_function(); // Data transfer works
}
Now I'm wondering what the proper way of dealing with this situation is. Instead of that trickery in Core.cpp, should I just create a struct holding pointers to the different classes I need and then pass that struct to the pthread_create() function? What is the best solution for this problem in general?
The problem you have is that your main thread exits the moment it created the other thread, at which point the Core object is destroyed and the program then exits completely. This happens while your newly created thread tries to use the Core object and send data; you either see absolutely nothing happening (if the program exits before the thread ever gets to do anything) or a crash (if Core is destroyed while the thread tries to use it). In theory you could also see it working correctly, but because the thread probably takes a bit to create the packet and send it, that's unlikely.
You need to use pthread_join to block the main thread just before quitting, until the thread is done and has exited.
And anyway, you should be using C++11's thread support or at least Boost's. That would let you get rid of the low-level mess you have with the pointers.
recently I set out to port ucos-ii to Ubuntu PC.
As we know, it's not possible to simulate the "process" in the ucos-ii by simply adding a flag in "while" loop in the pthread's call-back function to perform pause and resume(like the solution below). Because the "process" in ucos-ii can be paused or resumed at any time!
How to sleep or pause a PThread in c on Linux
I have found one solution on the web-site below, but it can't be built because it's out of date. It uses the process in Linux to simulate the task(acts like the process in our Linux) in ucos-ii.
http://www2.hs-esslingen.de/~zimmerma/software/index_uk.html
If pthread can act like the process which can be paused and resumed at any time, please tell me some related functions, I can figure it out myself. If it can't, I think I should focus on the older solution. Thanks a lot.
The Modula-3 garbage collector needs to suspend pthreads at an arbitrary time, not just when they are waiting on a condition variable or mutex. It does it by registering a (Unix) signal handler that suspends the thread and then using pthread_kill to send a signal to the target thread. I think it works (it has been reliable for others but I'm debugging an issue with it right now...) It's a bit kludgy, though....
Google for ThreadPThread.m3 and look at the routines "StopWorld" and "StartWorld". Handler itself is in ThreadPThreadC.c.
If stopping at specific points with a condition variable is insufficient, then you can't do this with pthreads. The pthread interface does not include suspend/resume functionality.
See, for example, answer E.4 here:
The POSIX standard provides no mechanism by which a thread A can suspend the execution of another thread B, without cooperation from B. The only way to implement a suspend/restart mechanism is to have B check periodically some global variable for a suspend request and then suspend itself on a condition variable, which another thread can signal later to restart B.
That FAQ answer goes on to describe a couple of non-standard ways of doing it, one in Solaris and one in LinuxThreads (which is now obsolete; do not confuse it with current threading on Linux); neither of those apply to your situation.
On Linux you can probably setup custom signal handler (eg. using signal()) that will contain wait for another signal (eg. using sigsuspend()). You then send the signals using pthread_kill() or tgkill(). It is important to use so-called "realtime signals" for this, because normal signals like SIGUSR1 and SIGUSR2 don't get queued, which means that they can get lost under high load conditions. You send a signal several times, but it gets received only once, because before while signal handler is running, new signals of the same kind are ignored. So if you have concurent threads doing PAUSE/RESUME , you can loose RESUME event and cause deadlock. On the other hand, the pending realtime signals (like SIGRTMIN+1 and SIGRTMIN+2) are not deduplicated, so there can be several same rt signals in queue at the same time.
DISCLAIMER: I had not tried this yet. But in theory it should work.
Also see man 7 signal-safety. There is a list of calls that you can safely call in signal handlers. Fortunately sigsuspend() seems to be one of them.
UPDATE: I have working code right here:
//Filename: pthread_pause.c
//Author: Tomas 'Harvie' Mudrunka 2021
//Build: CFLAGS=-lpthread make pthread_pause; ./pthread_pause
//Test: valgrind --tool=helgrind ./pthread_pause
//I've wrote this code as excercise to solve following stack overflow question:
// https://stackoverflow.com/questions/9397068/how-to-pause-a-pthread-any-time-i-want/68119116#68119116
#define _GNU_SOURCE //pthread_yield() needs this
#include <signal.h>
#include <pthread.h>
//#include <pthread_extra.h>
#include <semaphore.h>
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <unistd.h>
#include <errno.h>
#include <sys/resource.h>
#include <time.h>
#define PTHREAD_XSIG_STOP (SIGRTMIN+0)
#define PTHREAD_XSIG_CONT (SIGRTMIN+1)
#define PTHREAD_XSIGRTMIN (SIGRTMIN+2) //First unused RT signal
pthread_t main_thread;
sem_t pthread_pause_sem;
pthread_once_t pthread_pause_once_ctrl = PTHREAD_ONCE_INIT;
void pthread_pause_once(void) {
sem_init(&pthread_pause_sem, 0, 1);
}
#define pthread_pause_init() (pthread_once(&pthread_pause_once_ctrl, &pthread_pause_once))
#define NSEC_PER_SEC (1000*1000*1000)
// timespec_normalise() from https://github.com/solemnwarning/timespec/
struct timespec timespec_normalise(struct timespec ts)
{
while(ts.tv_nsec >= NSEC_PER_SEC) {
++(ts.tv_sec); ts.tv_nsec -= NSEC_PER_SEC;
}
while(ts.tv_nsec <= -NSEC_PER_SEC) {
--(ts.tv_sec); ts.tv_nsec += NSEC_PER_SEC;
}
if(ts.tv_nsec < 0) { // Negative nanoseconds isn't valid according to POSIX.
--(ts.tv_sec); ts.tv_nsec = (NSEC_PER_SEC + ts.tv_nsec);
}
return ts;
}
void pthread_nanosleep(struct timespec t) {
//Sleep calls on Linux get interrupted by signals, causing premature wake
//Pthread (un)pause is built using signals
//Therefore we need self-restarting sleep implementation
//IO timeouts are restarted by SA_RESTART, but sleeps do need explicit restart
//We also need to sleep using absolute time, because relative time is paused
//You should use this in any thread that gets (un)paused
struct timespec wake;
clock_gettime(CLOCK_MONOTONIC, &wake);
t = timespec_normalise(t);
wake.tv_sec += t.tv_sec;
wake.tv_nsec += t.tv_nsec;
wake = timespec_normalise(wake);
while(clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &wake, NULL)) if(errno!=EINTR) break;
return;
}
void pthread_nsleep(time_t s, long ns) {
struct timespec t;
t.tv_sec = s;
t.tv_nsec = ns;
pthread_nanosleep(t);
}
void pthread_sleep(time_t s) {
pthread_nsleep(s, 0);
}
void pthread_pause_yield() {
//Call this to give other threads chance to run
//Wait until last (un)pause action gets finished
sem_wait(&pthread_pause_sem);
sem_post(&pthread_pause_sem);
//usleep(0);
//nanosleep(&((const struct timespec){.tv_sec=0,.tv_nsec=1}), NULL);
//pthread_nsleep(0,1); //pthread_yield() is not enough, so we use sleep
pthread_yield();
}
void pthread_pause_handler(int signal) {
//Do nothing when there are more signals pending (to cleanup the queue)
//This is no longer needed, since we use semaphore to limit pending signals
/*
sigset_t pending;
sigpending(&pending);
if(sigismember(&pending, PTHREAD_XSIG_STOP)) return;
if(sigismember(&pending, PTHREAD_XSIG_CONT)) return;
*/
//Post semaphore to confirm that signal is handled
sem_post(&pthread_pause_sem);
//Suspend if needed
if(signal == PTHREAD_XSIG_STOP) {
sigset_t sigset;
sigfillset(&sigset);
sigdelset(&sigset, PTHREAD_XSIG_STOP);
sigdelset(&sigset, PTHREAD_XSIG_CONT);
sigsuspend(&sigset); //Wait for next signal
} else return;
}
void pthread_pause_enable() {
//Having signal queue too deep might not be necessary
//It can be limited using RLIMIT_SIGPENDING
//You can get runtime SigQ stats using following command:
//grep -i sig /proc/$(pgrep binary)/status
//This is no longer needed, since we use semaphores
//struct rlimit sigq = {.rlim_cur = 32, .rlim_max=32};
//setrlimit(RLIMIT_SIGPENDING, &sigq);
pthread_pause_init();
//Prepare sigset
sigset_t sigset;
sigemptyset(&sigset);
sigaddset(&sigset, PTHREAD_XSIG_STOP);
sigaddset(&sigset, PTHREAD_XSIG_CONT);
//Register signal handlers
//signal(PTHREAD_XSIG_STOP, pthread_pause_handler);
//signal(PTHREAD_XSIG_CONT, pthread_pause_handler);
//We now use sigaction() instead of signal(), because it supports SA_RESTART
const struct sigaction pause_sa = {
.sa_handler = pthread_pause_handler,
.sa_mask = sigset,
.sa_flags = SA_RESTART,
.sa_restorer = NULL
};
sigaction(PTHREAD_XSIG_STOP, &pause_sa, NULL);
sigaction(PTHREAD_XSIG_CONT, &pause_sa, NULL);
//UnBlock signals
pthread_sigmask(SIG_UNBLOCK, &sigset, NULL);
}
void pthread_pause_disable() {
//This is important for when you want to do some signal unsafe stuff
//Eg.: locking mutex, calling printf() which has internal mutex, etc...
//After unlocking mutex, you can enable pause again.
pthread_pause_init();
//Make sure all signals are dispatched before we block them
sem_wait(&pthread_pause_sem);
//Block signals
sigset_t sigset;
sigemptyset(&sigset);
sigaddset(&sigset, PTHREAD_XSIG_STOP);
sigaddset(&sigset, PTHREAD_XSIG_CONT);
pthread_sigmask(SIG_BLOCK, &sigset, NULL);
sem_post(&pthread_pause_sem);
}
int pthread_pause(pthread_t thread) {
sem_wait(&pthread_pause_sem);
//If signal queue is full, we keep retrying
while(pthread_kill(thread, PTHREAD_XSIG_STOP) == EAGAIN) usleep(1000);
pthread_pause_yield();
return 0;
}
int pthread_unpause(pthread_t thread) {
sem_wait(&pthread_pause_sem);
//If signal queue is full, we keep retrying
while(pthread_kill(thread, PTHREAD_XSIG_CONT) == EAGAIN) usleep(1000);
pthread_pause_yield();
return 0;
}
void *thread_test() {
//Whole process dies if you kill thread immediately before it is pausable
//pthread_pause_enable();
while(1) {
//Printf() is not async signal safe (because it holds internal mutex),
//you should call it only with pause disabled!
//Will throw helgrind warnings anyway, not sure why...
//See: man 7 signal-safety
pthread_pause_disable();
printf("Running!\n");
pthread_pause_enable();
//Pausing main thread should not cause deadlock
//We pause main thread here just to test it is OK
pthread_pause(main_thread);
//pthread_nsleep(0, 1000*1000);
pthread_unpause(main_thread);
//Wait for a while
//pthread_nsleep(0, 1000*1000*100);
pthread_unpause(main_thread);
}
}
int main() {
pthread_t t;
main_thread = pthread_self();
pthread_pause_enable(); //Will get inherited by all threads from now on
//you need to call pthread_pause_enable (or disable) before creating threads,
//otherwise first (un)pause signal will kill whole process
pthread_create(&t, NULL, thread_test, NULL);
while(1) {
pthread_pause(t);
printf("PAUSED\n");
pthread_sleep(3);
printf("UNPAUSED\n");
pthread_unpause(t);
pthread_sleep(1);
/*
pthread_pause_disable();
printf("RUNNING!\n");
pthread_pause_enable();
*/
pthread_pause(t);
pthread_unpause(t);
}
pthread_join(t, NULL);
printf("DIEDED!\n");
}
I am also working on library called "pthread_extra", which will have stuff like this and much more. Will publish soon.
UPDATE2: This is still causing deadlocks when calling pause/unpause rapidly (removed sleep() calls). Printf() implementation in glibc has mutex, so if you suspend thread which is in middle of printf() and then want to printf() from your thread which plans to unpause that thread later, it will never happen, because printf() is locked. Unfortunately i've removed the printf() and only run empty while loop in the thread, but i still get deadlocks under high pause/unpause rates. and i don't know why. Maybe (even realtime) Linux signals are not 100% safe. There is realtime signal queue, maybe it just overflows or something...
UPDATE3: i think i've managed to fix the deadlock, but had to completely rewrite most of the code. Now i have one (sig_atomic_t) variable per each thread which holds state whether that thread should be running or not. Works kinda like condition variable. pthread_(un)pause() transparently remembers this for each thread. I don't have two signals. now i only have one signal. handler of that signal looks at that variable and only blocks on sigsuspend() when that variable says the thread should NOT run. otherwise it returns from signal handler. in order to suspend/resume the thread i now set the sig_atomic_t variable to desired state and call that signal (which is common for both suspend and resume). It is important to use realtime signals to be sure handler will actualy run after you've modified the state variable. Code is bit complex because of the thread status database. I will share the code in separate solution as soon as i manage to simplify it enough. But i want to preserve the two signal version in here, because it kinda works, i like the simplicity and maybe people will give us more insight on how to optimize it.
UPDATE4: I've fixed the deadlock in original code (no need for helper variable holding the status) by using single handler for two signals and optimizing signal queue a bit. There is still some problem with printf() shown by helgrind, but it is not caused by my signals, it happens even when i do not call pause/unpause at all. Overall this was only tested on LINUX, not sure how portable the code is, because there seem to be some undocumented behaviour of signal handlers which was originaly causing the deadlock.
Please note that pause/unpause cannot be nested. if you pause 3 times, and unpause 1 time, the thread WILL RUN. If you need such behaviour, you should create some kind of wrapper which will count the nesting levels and signal the thread accordingly.
UPDATE5: I've improved robustness of the code by following changes: I ensure proper serialization of pause/unpause calls by use of semaphores. This hopefuly fixes last remaining deadlocks. Now you can be sure that when pause call returns, the target thread is actualy already paused. This also solves issues with signal queue overflowing. Also i've added SA_RESTART flag, which prevents internal signals from causing interuption of IO waits. Sleeps/delays still have to be restarted manualy, but i provide convenient wrapper called pthread_nanosleep() which does just that.
UPDATE6: i realized that simply restarting nanosleep() is not enough, because that way timeout does not run when thread is paused. Therefore i've modified pthread_nanosleep() to convert timeout interval to absolute time point in the future and sleep until that. Also i've hidden semaphore initialization, so user does not need to do that.
Here is example of thread function within a class with pause/resume functionality...
class SomeClass
{
public:
// ... construction/destruction
void Resume();
void Pause();
void Stop();
private:
static void* ThreadFunc(void* pParam);
pthread_t thread;
pthread_mutex_t mutex;
pthread_cond_t cond_var;
int command;
};
SomeClass::SomeClass()
{
pthread_mutex_init(&mutex, NULL);
pthread_cond_init(&cond_var, NULL);
// create thread in suspended state..
command = 0;
pthread_create(&thread, NULL, ThreadFunc, this);
}
SomeClass::~SomeClass()
{
// we should stop the thread and exit ThreadFunc before calling of blocking pthread_join function
// also it prevents the mutex staying locked..
Stop();
pthread_join(thread, NULL);
pthread_cond_destroy(&cond_var);
pthread_mutex_destroy(&mutex);
}
void* SomeClass::ThreadFunc(void* pParam)
{
SomeClass* pThis = (SomeClass*)pParam;
timespec time_ns = {0, 50*1000*1000}; // 50 milliseconds
while(1)
{
pthread_mutex_lock(&pThis->mutex);
if (pThis->command == 2) // command to stop thread..
{
// be sure to unlock mutex before exit..
pthread_mutex_unlock(&pThis->mutex);
return NULL;
}
else if (pThis->command == 0) // command to pause thread..
{
pthread_cond_wait(&pThis->cond_var, &pThis->mutex);
// dont forget to unlock the mutex..
pthread_mutex_unlock(&pThis->mutex);
continue;
}
if (pThis->command == 1) // command to run..
{
// normal runing process..
fprintf(stderr, "*");
}
pthread_mutex_unlock(&pThis->mutex);
// it's important to give main thread few time after unlock 'this'
pthread_yield();
// ... or...
//nanosleep(&time_ns, NULL);
}
pthread_exit(NULL);
}
void SomeClass::Stop()
{
pthread_mutex_lock(&mutex);
command = 2;
pthread_cond_signal(&cond_var);
pthread_mutex_unlock(&mutex);
}
void SomeClass::Pause()
{
pthread_mutex_lock(&mutex);
command = 0;
// in pause command we dont need to signal cond_var because we not in wait state now..
pthread_mutex_unlock(&mutex);
}
void SomeClass::Resume()
{
pthread_mutex_lock(&mutex);
command = 1;
pthread_cond_signal(&cond_var);
pthread_mutex_unlock(&mutex);
}