Thread safety of curl_multi_remove_handle - libcurl

It seems like some sources recommend using curl_multi_remove_handle to "invalidate" a curl handle and cause curl_multi_wait to return early. This seems not to be covered under the thread safety guarantee (if done from another thread), or am I wrong (the threads safety guarantees are basically just reentrancy guarantees)?
What is the recommended way signal curl_multi_wait to return early? Is it really required to do it via timeouts? (Under Linux, I would use an eventfd in the epoll set to effectively have the case "wait on these sockets OR this event fd OR the given timeout".) It seems I could use custom curl_waitfd structures, but this would require platform specific setup for dummy sockets.

You must not call curl_multi_remove_handle from thread B if curl_multi_wait for that handle is running in thread A. That will just cause tears and misery.
You can opt to, for example:
user sufficiently short timeouts for curl_multi_wait() so that you don't need to abort it
add a private socket/file descriptor to send data on to abort when you want to
return error from the progress callback (or another callback) for the transfer(s) you need to stop - by setting a flag that they all check (global, or global like)
rework your app logic so that you can consider the transfer to "dead" without it having stopped yet, and have libcurl have its cause and close it later and you don't have to care much about it being done a bit after you decided you can ignore it.
curl_multi_poll()
After I first wrote this answer, we introduced curl_multi_poll in libcurl. This function is very similar to curl_multi_wait but also allows it to pre-emptively return with the use of curl_multi_wakeup, thus offering applications a few more alternative approaches.

Unfortunately, curl_multi is not, what people these days would deem as "thread safe". Yes, you can use a CURLM handle in two different threads, as long, as they don't access it at the same time. But hey, this is true for almost any data structure in C or C++.
So, if you have one thread running an event loop with curl_multi_wait(), you cannot use a second thread to add new jobs via curl_multi_add_handle() or remove jobs via curl_multi_remove_handle(). Well, it will work most of the times, but especially during high load, you will start getting data corruptions and segfaults due to the concurrent access to libcurl's internal data structures.
There are two ways around this problem, but both require a bit of coding:
Use the newer curl_multi_poll() interface, which (unlike curl_multi_wait()) is externally interruptible via curl_multi_wakeup(). Yes, curl_multi_wakeup() is the ONLY function on CURLM handles, that is safe to call concurrently from another thread (or even multiple threads). To add new requests to the event loop or remove requests from it, you would need some request queue and a mutex, which secures access to that queue. Then, to add a new job, you would do:
(thread 1 is running curl_multi_poll() in an endless loop)
thread 2 acquires said mutex
thread 2 posts an "add easy handle request" into the request queue
thread 2 releases said mutex again
thread 2 calls curl_multi_wakeup()
thread 1 acquires the mutex after curl_multi_poll() returns
thread 1 then processes the "add easy handle request" in the job list and performs curl_multi_add_handle()
thread 1 then releases the mutex again
thread 1 does all other necessary work (in particular call curl_multi_perform() and pass finished transfers to the application etc.)
thread 1 calls curl_multi_poll() again
To remove a job, you would use the same procedure, just let thread 2 post an "remove easy handle request" instead of an "add easy handle request" to the request queue and then let thread 1 call curl_multi_remove_handle() instead of curl_multi_add_handle().
In this solution, ALL calls to the CURLM handle are performed from thread 1, with the sole exception of curl_multi_wakeup(), which is used by other threads to signal thread 1 of new work waiting in the request queue.
Or use the curl_action() interface, where you have to provide two callbacks to libcurl, with which it reports file descriptors to watch and a timeout to your application. You then have to call epoll() or a similiar OS function yourself to wait for activity (or timeout) in the event loop thread. Then add a mutex again to serialize access to the CURLM handle: Your event loop thread should lock that mutex just before it calls curl_action() (or any other function on the CURLM handle) and unlock it immediately after. As curl_action() (unlike curl_multi_poll()) does not sleep, that mutex will be locked only for brief intervals. So other threads can then easily directly lock that mutex for themselves, too, and call curl_multi_add_handle() or curl_multi_remove_handle() as needed. Be aware, though, that those intervening additions or removals of handles can modify the active FD set, and that you may need some synchronisation with the event loop thread to notify it of the modified epoll() set.
The first solution is likely easier to implement. You should be able to find libcurl wrappers for both variants on Github, but be sure to test them intensively before using them in any critical application.

Related

Interrupting threads if not joined

I am looking for a way(preferably with boost threads), to interrupt a thread if it has not joined. I start multiple threads, and would like to end any of them that have not finished by 200 milliseconds. I tried something like this
boost::thread_group tgroup;
tgroup.create_thread(boost::bind(&print_f));
tgroup.create_thread(boost::bind(&print_g));
boost::this_thread::sleep(boost::posix_time::milliseconds(200));
tgroup.interrupt_all();
Now this works, and all threads are ended after 200 milliseconds; however I would like to try and join these threads if they finish before 200 milliseconds, is there a way to join and interrupt if not finished by a certain amount of time?
Edit: reason why I need join to happen before timeout:
I am creating a server where speed is very important. Unfortunately I have to make requests to other servers for some information. So I would like to make these calls in parallel, and finish as soon as possible. If a server is taking too long, I have to just ignore the information coming from that server, and continue on without it. So my timeout time is my maximum amount of time I can wait. It will be extremely beneficial to me to be able to continue on with contemplation when all responses are received, instead of waiting for time timeout timer. So what my program will:
-Get a request from a client.
-Parse information.
Create threads
-Send information to multiple other servers.
-Get information back from servers.
-Put information from servers on a shared queue.
End Threads
-Parse information from shared queue.
-Return information back to client
What you want to use is probably a set of scoped threads, and call terminate on all the remaining threads after timeout. thread groups and scoped threads are not useable together unfortunately.
The thread group class is actually a very simple container: you cannot remove a thread of it if you don't have a pointer to it already, and you cannot get a pointer to a thread which has been created by the group. The class API doesn't provide much either. This is a bit hindering for management in your situation.
The remaining solutions rely on creating the threads outside the goup, and have each of them do a specific task just before finishing. It could:
remove itself from the group,
then add itself to another group
The managing thread will have to call join_all on the later group, and act as before with the former.
using namespace boost;
void thread_end(auto &thmap, thread_group& t1, thread_group& t2, auto &task){
task();
thread *self = thmap[this_thread::get_id()];
t1.remove_thread(&self);
t2.add_thread(&self);
}
std::map<thread::id, thread *> thmap;
thread_group trunninggroup;
thread_group tfinishedgroup;
thread *th;
th = new thread(
bind(&thread_end, thmap, trunninggroup, tfinishedgroup, bind(&print_f)));
thmap[th->get_id()] = th;
trunninggroup.add_thread(th);
th = new thread(
bind(&thread_end, thmap, trunning_group, tfinishedgroup, bind(&print_g)));
thmap[th->get_id()] = th;
trunninggroup.add_thread(th);
boost::this_thread::sleep(boost::posix_time::milliseconds(200));
tfinishedgroup.join_all();
trunninggroup.interrupt_all();
But this is not ideal if you actually want the managing thread to be notified of a thread end when it actually happens (and I'm not really certain it does anything useful anyway). A solution for getting notified is perhaps to:
do the group migration as above
then trigger a condition variable on which the management thread is doing a timed_wait
but you will have to do some time computation to keep track of the remaining time after being notified, and resume sleep with that time left. That would be entirely dependent on the Duration class used for that task.
Update
Seeing the big picture, I would try a completely different approach: I don't think that terminating a thread which is already finished is a problem, so I would leave them all in the group, and use the group to create them, as your code demonstrate it.
However, I would try to wake up the managing thread as soon as all the threads are done, or after timeout. This is not doable with what the thread_group class offers alone, but it can be done with a custom made semaphore, or a patched version of boost::barrier to allow a timed wait.
Basically, you set a barrier to the number of threads in the group plus one (the main thread), and have the main thread time wait on it. Each worker thread does its work, and when finished, post its result in the queue, and wait on the barrier. If all the worker threads finish their task, everyone will wait and the barrier gets triggered.
Then main thread (as well as all others, but it doesn't matter), wakes up and can proceed by terminating the group and process the result. Otherwise, it will be awaken at timeout and do the same anyway.
The patching of boost::barrier should not be too difficult, you should only need to duplicate the wait method and replace the condition variable wait inside by a timed_wait (I didn't look at the code, this assumption might be totally of the mark though). Otherwise I provided a sample semaphore implementation for this question, which shouldn't be difficult to patch either.
Some last consideration: terminating a thread is usually not the best approach. You should instead try to signal the threads they have to abort, and wait for them, or somehow havecthem pass their unfinished task to an auxiliary thread which should clean things up serially. Then your thread group would be ready to tackle on the next task, and you wouldn't have to destroy and create threads all the time, which is a somewhzt costly operation. It will require to formalize the idea of a task in the context of your application, and make the threads run on a loop for taking new tasks and process them.
If you're using a very recent Boost and C++11, use try_join_for() (http://www.boost.org/doc/libs/1_53_0/doc/html/thread/thread_management.html#thread.thread_management.thread.try_join_for). Otherwise, use timed_join() (http://www.boost.org/doc/libs/1_53_0/doc/html/thread/thread_management.html#thread.thread_management.thread.timed_join).

how to pass data to running thread

When using pthread, I can pass data at thread creation time.
What is the proper way of passing new data to an already running thread?
I'm considering making a global variable and make my thread read from that.
Thanks
That will certainly work. Basically, threads are just lightweight processes that share the same memory space. Global variables, being in that memory space, are available to every thread.
The trick is not with the readers so much as the writers. If you have a simple chunk of global memory, like an int, then assigning to that int will probably be safe. Bt consider something a little more complicated, like a struct. Just to be definite, let's say we have
struct S { int a; float b; } s1, s2;
Now s1,s2 are variables of type struct S. We can initialize them
s1 = { 42, 3.14f };
and we can assign them
s2 = s1;
But when we assign them the processor isn't guaranteed to complete the assignment to the whole struct in one step -- we say it's not atomic. So let's now imagine two threads:
thread 1:
while (true){
printf("{%d,%f}\n", s2.a, s2.b );
sleep(1);
}
thread 2:
while(true){
sleep(1);
s2 = s1;
s1.a += 1;
s1.b += 3.14f ;
}
We can see that we'd expect s2 to have the values {42, 3.14}, {43, 6.28}, {44, 9.42} ....
But what we see printed might be anything like
{42,3.14}
{43,3.14}
{43,6.28}
or
{43,3.14}
{44,6.28}
and so on. The problem is that thread 1 may get control and "look at" s2 at any time during that assignment.
The moral is that while global memory is a perfectly workable way to do it, you need to take into account the possibility that your threads will cross over one another. There are several solutions to this, with the basic one being to use semaphores. A semaphore has two operations, confusingly named from Dutch as P and V.
P simply waits until a variable is 0 and the goes on, adding 1 to the variable; V subtracts 1 from the variable. The only thing special is that they do this atomically -- they can't be interrupted.
Now, do you code as
thread 1:
while (true){
P();
printf("{%d,%f}\n", s2.a, s2.b );
V();
sleep(1);
}
thread 2:
while(true){
sleep(1);
P();
s2 = s1;
V();
s1.a += 1;
s1.b += 3.14f ;
}
and you're guaranteed that you'll never have thread 2 half-completing an assignment while thread 1 is trying to print.
(Pthreads has semaphores, by the way.)
I have been using the message-passing, producer-consumer queue-based, comms mechanism, as suggested by asveikau, for decades without any problems specifically related to multiThreading. There are some advantages:
1) The 'threadCommsClass' instances passed on the queue can often contain everything required for the thread to do its work - member/s for input data, member/s for output data, methods for the thread to call to do the work, somewhere to put any error/exception messages and a 'returnToSender(this)' event to call so returning everything to the requester by some thread-safe means that the worker thread does not need to know about. The worker thread then runs asynchronously on one set of fully encapsulated data that requires no locking. 'returnToSender(this)' might queue the object onto a another P-C queue, it might PostMessage it to a GUI thread, it might release the object back to a pool or just dispose() it. Whatever it does, the worker thread does not need to know about it.
2) There is no need for the requesting thread to know anything about which thread did the work - all the requestor needs is a queue to push on. In an extreme case, the worker thread on the other end of the queue might serialize the data and communicate it to another machine over a network, only calling returnToSender(this) when a network reply is received - the requestor does not need to know this detail - only that the work has been done.
3) It is usually possible to arrange for the 'threadCommsClass' instances and the queues to outlive both the requester thread and the worker thread. This greatly eases those problems when the requester or worker are terminated and dispose()'d before the other - since they share no data directly, there can be no AV/whatever. This also blows away all those 'I can't stop my work thread because it's stuck on a blocking API' issues - why bother stopping it if it can be just orphaned and left to die with no possibility of writing to something that is freed?
4) A threadpool reduces to a one-line for loop that creates several work threads and passes them the same input queue.
5) Locking is restricted to the queues. The more mutexes, condVars, critical-sections and other synchro locks there are in an app, the more difficult it is to control it all and the greater the chance of of an intermittent deadlock that is a nightmare to debug. With queued messages, (ideally), only the queue class has locks. The queue class must work 100% with mutiple producers/consumers, but that's one class, not an app full of uncooordinated locking, (yech!).
6) A threadCommsClass can be raised anytime, anywhere, in any thread and pushed onto a queue. It's not even necessary for the requester code to do it directly, eg. a call to a logger class method, 'myLogger.logString("Operation completed successfully");' could copy the string into a comms object, queue it up to the thread that performs the log write and return 'immediately'. It is then up to the logger class thread to handle the log data when it dequeues it - it may write it to a log file, it may find after a minute that the log file is unreachable because of a network problem. It may decide that the log file is too big, archive it and start another one. It may write the string to disk and then PostMessage the threadCommsClass instance on to a GUI thread for display in a terminal window, whatever. It doesn't matter to the log requesting thread, which just carries on, as do any other threads that have called for logging, without significant impact on performance.
7) If you do need to kill of a thread waiting on a queue, rather than waiing for the OS to kill it on app close, just queue it a message telling it to teminate.
There are surely disadvantages:
1) Shoving data directly into thread members, signaling it to run and waiting for it to finish is easier to understand and will be faster, assuming that the thread does not have to be created each time.
2) Truly asynchronous operation, where the thread is queued some work and, sometime later, returns it by calling some event handler that has to communicate the results back, is more difficult to handle for developers used to single-threaded code and often requires state-machine type design where context data must be sent in the threadCommsClass so that the correct actions can be taken when the results come back. If there is the occasional case where the requestor just has to wait, it can send an event in the threadCommsClass that gets signaled by the returnToSender method, but this is obviously more complex than simply waiting on some thread handle for completion.
Whatever design is used, forget the simple global variables as other posters have said. There is a case for some global types in thread comms - one I use very often is a thread-safe pool of threadCommsClass instances, (this is just a queue that gets pre-filled with objects). Any thread that wishes to communicate has to get a threadCommsClass instance from the pool, load it up and queue it off. When the comms is done, the last thread to use it releases it back to the pool. This approach prevents runaway new(), and allows me to easily monitor the pool level during testing without any complex memory-managers, (I usually dump the pool level to a status bar every second with a timer). Leaking objects, (level goes down), and double-released objects, (level goes up), are easily detected and so get fixed.
MultiThreading can be safe and deliver scaleable, high-performance apps that are almost a pleasure to maintain/enhance, (almost:), but you have to lay off the simple globals - treat them like Tequila - quick and easy high for now but you just know they'll blow your head off tomorrow.
Good luck!
Martin
Global variables are bad to begin with, and even worse with multi-threaded programming. Instead, the creator of the thread should allocate some sort of context object that's passed to pthread_create, which contains whatever buffers, locks, condition variables, queues, etc. are needed for passing information to and from the thread.
You will need to build this yourself. The most typical approach requires some cooperation from the other thread as it would be a bit of a weird interface to "interrupt" a running thread with some data and code to execute on it... That would also have some of the same trickiness as something like POSIX signals or IRQs, both of which it's easy to shoot yourself in the foot while processing, if you haven't carefully thought it through... (Simple example: You can't call malloc inside a signal handler because you might be interrupted in the middle of malloc, so you might crash while accessing malloc's internal data structures which are only partially updated.)
The typical approach is to have your thread creation routine basically be an event loop. You can build a queue structure and pass that as the argument to the thread creation routine. Then other threads can enqueue things and the thread's event loop will dequeue it and process the data. Note this is cleaner than a global variable (or global queue) because it can scale to have multiple of these queues.
You will need some synchronization on that queue data structure. Entire books could be written about how to implement your queue structure's synchronization, but the most simple thing would have a lock and a semaphore. When modifying the queue, threads take a lock. When waiting for something to be dequeued, consumer threads would wait on a semaphore which is incremented by enqueuers. It's also a good idea to implement some mechanism to shut down the consumer thread.

Can't unblock/"wake up" thread with pthread_kill & sigwait

I'm working on a C/C++ networking project and am having difficulties synchronizing/signaling my threads. Here is what I am trying to accomplish:
Poll a bunch of sockets using the poll function
If any sockets are ready from the POLLIN event then send a signal to a reader thread and a writer thread to "wake up"
I have a class called MessageHandler that sets the signals mask and spawns the reader and writer threads. Inside them I then wait on the signal(s) that ought to wake them up.
The problem is that I am testing all this functionality by sending a signal to a thread yet it never wakes up.
Here is the problem code with further explanation. Note I just have highlighted how it works with the reader thread as the writer thread is essentially the same.
// Called once if allowedSignalsMask == 0 in constructor
// STATIC
void MessageHandler::setAllowedSignalsMask() {
allowedSignalsMask = (sigset_t*)std::malloc(sizeof(sigset_t));
sigemptyset(allowedSignalsMask);
sigaddset(allowedSignalsMask, SIGCONT);
}
// STATIC
sigset_t *MessageHandler::allowedSignalsMask = 0;
// STATIC
void* MessageHandler::run(void *arg) {
// Apply the signals mask to any new threads created after this point
pthread_sigmask(SIG_BLOCK, allowedSignalsMask, 0);
MessageHandler *mh = (MessageHandler*)arg;
pthread_create(&(mh->readerThread), 0, &runReaderThread, arg);
sleep(1); // Just sleep for testing purposes let reader thread execute first
pthread_kill(mh->readerThread, SIGCONT);
sleep(1); // Just sleep for testing to let reader thread print without the process terminating
return 0;
}
// STATIC
void* MessageHandler::runReaderThread(void *arg) {
int signo;
for (;;) {
sigwait(allowedSignalsMask, &signo);
fprintf(stdout, "Reader thread signaled\n");
}
return 0;
}
I took out all the error handling I had in the code to condense it but do know for a fact that the thread starts properly and gets to the sigwait call.
The error may be obvious (its not a syntax error - the above code is condensed from compilable code and I might of screwed it up while editing it) but I just can't seem to find/see it since I have spent far to much time on this problem and confused myself.
Let me explain what I think I am doing and if it makes sense.
Upon creating an object of type MessageHandler it will set allowedSignalsMask to the set of the one signal (for the time being) that I am interested in using to wake up my threads.
I add the signal to the blocked signals of the current thread with pthread_sigmask. All further threads created after this point ought to have the same signal mask now.
I then create the reader thread with pthread_create where arg is a pointer to an object of type MessageHandler.
I call sleep as a cheap way to ensure that my readerThread executes all the way to sigwait()
I send the signal SIGCONT to the readerThread as I am interested in sigwait to wake up/unblock once receiving it.
Again I call sleep as a cheap way to ensure that my readerThread can execute all the way after it woke up/unblocked from sigwait()
Other helpful notes that may be useful but I don't think affect the problem:
MessageHandler is constructed and then a different thread is created given the function pointer that points to run. This thread will be responsible for creating the reader and writer threads, polling the sockets with the poll function, and then possibly sending signals to both the reader and writer threads.
I know its a long post but do appreciate you reading it and any help you can offer. If I wasn't clear enough or you feel like I didn't provide enough information please let me know and I will correct the post.
Thanks again.
POSIX threads have condition variables for a reason; use them. You're not supposed to need signal hackery to accomplish basic synchronization tasks when programming with threads.
Here is a good pthread tutorial with information on using condition variables:
https://computing.llnl.gov/tutorials/pthreads/
Or, if you're more comfortable with semaphores, you could use POSIX semaphores (sem_init, sem_post, and sem_wait) instead. But once you figure out why the condition variable and mutex pairing makes sense, I think you'll find condition variables are a much more convenient primitive.
Also, note that your current approach incurs several syscalls (user-space/kernel-space transitions) per synchronization. With a good pthreads implementation, using condition variables should drop that to at most one syscall, and possibly none at all if your threads keep up with each other well enough that the waited-for event occurs while they're still spinning in user-space.
This pattern seems a bit odd, and most likely error prone. The pthread library is rich in synchronization methods, the one most likely to serve your need being in the pthread_cond_* family. These methods handle condition variables, which implement the Wait and Signal approach.
Use SIGUSR1 instead of SIGCONT. SIGCONT doesn't work. Maybe a signal expert knows why.
By the way, we use this pattern because condition variables and mutexes are too slow for our particular application. We need to sleep and wake individual threads very rapidly.
R. points out there is extra overhead due to additional kernel space calls. Perhaps if you sleep > N threads, then a single condition variable would beat out multiple sigwaits and pthread_kills. In our application, we only want to wake one thread when work arrives. You have to have a condition variable and mutex for each thread to do this otherwise you get the stampede. In a test where we slept and woke N threads M times, signals beat mutexes and condition variables by a factor of 5 (it could have been a factor of 40 but I cant remember anymore....argh). We didn't test Futexes which can wake 1 thread at a time and specifically are coded to limit trips to kernel space. I suspect futexes would be faster than mutexes.

Waiting win32 threads

I have a totally thread-safe FIFO structure( TaskList ) to store task classes, multiple number of threads, some of which creates and stores task and the others processes the tasks. TaskList class has a pop_front() method which returns the first task if there is at least one. Otherwise it returns NULL.
Here is an example of processing function:
TaskList tlist;
unsigned _stdcall ThreadFunction(void * qwe)
{
Task * task;
while(!WorkIsOver) // a global bool to end all threads.
{
while(task = tlist.pop_front())
{
// process Task
}
}
return 0;
}
My problem is, sometimes, there is no new task in the task list, so the processing threads enters in an endless loop (while(!WorkIsOver)) and CPU load increases. Somehow I have to make the threads wait until a new task is stored in the list. I think about Suspending and Resuming but then I need extra info about which threads are suspending or running which brings a greater complexity to coding.
Any ideas?
PS. I am using winapi, not Boost or TBB for threading. Because sometimes I have to terminate threads that process for too long, and create new ones immediately. This is critical for me. Please do not suggest any of these two.
Thanks
Assuming you are developing this in DevStudio, you can get the control you want using [IO Completion Ports]. Scary name, for a simple tool.
First, create an IOCompletion Port: CreateIOCompletionPort
Create your pool of worker threads using _beginthreadex / CreateThread
In each worker thread, implement a loop that calls GetQueuedCompletionStatus - The returned lpCompletionKey will be pointing to a work item to process.
Now, whenever you get a work item to process: call PostQueuedCompletionStatus from any thread - passing in the pointer to your work item as the completion key parameter.
Thats it. 3 API calls and you have implemented a thread pooling mechanism based on a kernel implemented queue object. Each call to PostQueuedCompletionStatus will automatically be deserialized onto a thread pool thread thats blocking on GetQueuedCompletionStatus. The pool of worker threads is created, and maintained - by you - so you can call TerminateThread on any worker threads that are taking too long. Even better - depending on how it is set up the kernel will only wake up as many threads as needed to ensure that each CPU core is running at ~100% load.
NB. TerminateThread is really not an appropriate API to use. Unless you really know what you are doing the threads are going to leak their stacks, none of the memory allocated by code on the thread will be deallocated and so on. TerminateThread is really only useful during process shutdown. There are some articles on the net detailing how to release the known OS resources that are leaked each time TerminateThread is called - if you persist in this approach you really need to find and read them if you haven't already.
Use a semaphore in your queue to indicate whether there are elements ready to be processed.
Every time you add an item, call ::ReleaseSemaphore to increment the count associated with the semaphore
In the loop in your thread process, call ::WaitForSingleObject() on the handle of your semaphore object -- you can give that wait a timeout so that you have an opportunity to know that your thread should exit. Otherwise, your thread will be woken up whenever there's one or more items for it to process, and also has the nice side effect of decrementing the semaphore count for you.
If you haven't read it, you should devour Herb Sutter's Effective Concurrency series which covers this topic and many many more.
Use condition variables to implement a producer/consumer queue - example code here.
If you need to support earlier versions of Windows you can use the condition variable in Boost. Or you could build your own by copying the Windows-specific code out of the Boost headers, they use the same Win32 APIs under the covers as you would if you build your own.
Why not just use the existing thread pool? Let Windows manage all of this.
You can use windows threadpool!
Or you can use api call
WaitForSingleObject or
WaitForMultipleObjects.
Use at least SwitchToThread api call
when thread is workless.
If TaskList has some kind of wait_until_not_empty method then use it. If it does not then one Sleep(1000) (or some other value) may just do the trick. Proper solution would be to create a wrapper around TaskList that uses an auto-reset event handle to indicate if list is not empty. You would need to reinvent current methods for pop/push, with new task list being the member of new class:
WaitableTaskList::WaitableTaskList()
{
// task list is empty upon creation
non_empty_event = CreateEvent(NULL, FALSE, FALSE, NULL);
}
Task* WaitableTaskList::wait_and_pop_front(DWORD timeout)
{
WaitForSingleObject(non_empty_event, timeout);
// .. handle error, return NULL on timeout
Task* result = task_list.pop_front();
if (!task_list.empty())
SetEvent(non_empty_event);
return result;
}
void WaitableTaskList::push_back(Task* item)
{
task_list.push_back(item);
SetEvent(non_empty_event);
}
You must pop items in task list only through methods such as this wait_and_pop_front().
EDIT: actually this is not a good solution. There is a way to have non_empty_event raised even if the list is empty. The situation requires 2 threads trying to pop and list having 2 items. If list becomes empty between if and SetEvent we will have the wrong state. Obviously we need to implement syncronization as well. At this point I would reconsider simple Sleep again :-)

How to tell the parent that the thread is done in C++ using pthreads?

I have a TCP Server application that serves each client in a new thread using POSIX Threads and C++.
The server calls "listen" on its socket and when a client connects, it makes a new object of class Client. The new object runs in its own thread and processes the client's requests.
When a client disconnects, i want some way to tell my main() thread that this thread is done, and main() can delete this object and log something like "Client disconnected".
My question is, how do i tell to the main thread, that a thread is done ?
The most straightforward way that I can see, is to join the threads. See here. The idea is that on a join call, a command thread will then wait until worker threads exit, and then resume.
Alternatively, you could roll something up with some shared variables and mutexes.
If the child thread is really exiting when it is done (rather than waiting for more work), the parent thread can call pthread_join on it which will block until the child thread exits.
Obviously, if the parent thread is doing other things, it can't constantly be blocking on pthread_join, so you need a way to send a message to the main thread to tell it to call pthread_join. There are a number of IPC mechanisms that you could use for this, but in your particular case (a TCP server), I suspect the main thread is probably a select loop, right? If that's the case, I would recommend using pipe to create a logical pipe, and have the read descriptor for the pipe be one of the descriptors that the main thread selects from.
When a child thread is done, it would then write some sort of message to the pipe saying "I'm Done!" and then the server would know to call pthread_join on that thread and then do whatever else it needs to do when a connection finishes.
Note that you don't have to call pthread_join on a finished child thread, unless you need its return value. However, it is generally a good idea to do so if the child thread has any access to shared resources, since when pthread_join returns without error, it assures you that the child thread is really gone and not in some intermediate state between having sent the "I'm Done!" message and actually having exited.
pthreads return 0 if everything went okay or they return errno if something didn't work.
int ret, joined;
ret = pthread_create(&thread, NULL, connect, (void*) args);
joined = pthread_join(&thread, NULL);
If joined is zero, the thread is done. Clean up that thread's object.
While it is possible to implement IPC mechanisms to notify a main thread when other threads are about to terminate, if you want to do something when a thread terminates you should try to let the terminating thread do it itself.
You might look into using pthread_cleanup_push() to establish a routine to be called when the thread is cancelled or exits. Another option might be to use pthread_key_create() to create a thread-specific data key and associated destructor function.
If you don't want to call pthread_join() from the main thread due to blocking, you should detach the client threads by either setting it as option when creating the thread or calling pthread_detach().
You could use a queue of "thread objects to be deleted", protect access to the queue with a mutex, and then signal a pthread condition variable to indicate that something was available on the queue.
But do you really want to do that? A better model is for each thread to just clean up after itself, and not worry about synchronizing with the main thread in the first place.
Calling pthread_join will block execution of the main thread. Given the description of the problem I don't think it will provide the desired solution.
My preferred solution, in most cases, would be to have the thread perform its own cleanup. If that isn't possible you'll either have to use some kind of polling scheme with shared variables (just remember to make them thread safe, hint:volatile), or perhaps some sort of OS dependant callback mechanism. Remember, you want to be blocked on the call to listen, so really consider having the thread clean itself up.
As others have mentioned, it's easy to handle termination of a given thread with pthread_join. But a weak spot of pthreads is funneling information from several sources into a synchronous stream. (Alternately, you could say its strong spot is performance.)
By far the easiest solution for you would be to handle cleanup in the worker thread. Log the disconnection (add a mutex to the log), delete resources as appropriate, and exit the worker thread without signaling the parent.
Adding mutexes to allow manipulation of shared resources is a tough problem, so be flexible and creative. Always err on caution when synchronizing, and profile before optimizing.
I had exactly the same problem as you described. After ~300 opened client connections my Linux application was not able to create new thread because pthread_join was never called. For me, usage of pthread_tryjoin_np helped.
Briefly:
have a map that holds all opened thread descriptors
from the main thread before new client thread is opened I iterate through map and call pthread_tryjoin_np for each thread recorded in map. If thread is done the result of call is zero meaning that I can clean up resources from that thread. At the same time pthread_tryjoin_np takes care about releasing thread resources. If pthread_tryjoin_np call returns number different from 0 this means that thread is still running and I simply do nothing.
Potential problem with this is that I do not see pthread_tryjoin_np as part official POSIX standard so this solution might not be portable.