For a solution to an earlier problem, I was kindly pointed to multi-threading (via pthreads).
The original problem is thus:
I have two functions: one is the main body, which must be real-time; the other is a continually running function that blocks. When the real-time code attempts to run the blocking function, it obviously blocks as well, making it unresponsive to the user, which is unacceptable for a real-time process.
The original aim was to make the blocking function independent of the real-time solution (or at least, pseudo-independent), which I attempted with pthreads.
Here's a simplified version of the code:
#include <pthread.h>
#include <stdio.h>

#define PTHREAD_NUMBER 1

void *RenderImages(void *Data)
{
    while(1); // Simulating a permanently blocking process
    return NULL;
}

int main(int ArgC, char *ArgVar[])
{
    pthread_t threads[PTHREAD_NUMBER];
    void *Ptr = NULL;
    int I = 0;

    I = pthread_create(&threads[0], NULL, RenderImages, Ptr);
    if(I != 0)
    {
        printf("pthread_create Error!\n");
        return -1;
    }

    I = pthread_join(threads[0], NULL);

    // Doesn't reach here as pthread_join is blocking
    printf("Testing!\n");
    return 0;
}
The code above, however, blocks on the call to pthread_join (which makes the pthread nothing more than an unnecessarily complicated way of calling the function directly, defeating the point).
My question is thus:
What functions would I have to use to run a pthread for a few milliseconds, suspend it, then run another function, then go back and run the thread for a few more milliseconds, and so on?
OR
If the above isn't possible, what solution is there to the original problem?
Assuming that the "main" thread only cares when the "blocking" thread has completed its work, I think you want condition variables. Look into pthread_cond_wait and pthread_cond_signal.
pthread_join is the function you use to wait for a thread to end.
http://www.yolinux.com/TUTORIALS/LinuxTutorialPosixThreads.html
Use pthread_sigmask to manage suspend states:
http://man.yolinux.com/cgi-bin/man2html?cgi_command=pthread_sigmask
You can always use 3 threads, one for each function plus the main thread.
What you need is a queuing mechanism. Your main thread creates 'jobs'. You then place these jobs onto your backlog queue, where your worker thread picks them up and processes them. When a job is done, the worker thread places the now completed job onto the completed queue. Your main thread can intermittently check the completed queue and, if there is a completed job, pick it up and do whatever it needs to with it. The worker thread then goes into a wait state until the next job comes along.
There are numerous ways to implement the queues. A queue can be a Unix pipe, a Windows I/O completion port, or one you roll yourself with a linked list or array, condition variables and mutexes.
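As an illustration only, here is a rough sketch of such a hand-rolled queue built from a linked list plus a pthread mutex and condition variable; the Job type and function names are placeholders, not part of the original code:

#include <pthread.h>
#include <stdlib.h>

typedef struct Job { struct Job *next; /* ...payload... */ } Job;

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  not_empty;
    Job *head, *tail;
} JobQueue;

void QueueInit(JobQueue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    q->head = q->tail = NULL;
}

void QueuePush(JobQueue *q, Job *job)      /* called by the producer (main thread) */
{
    job->next = NULL;
    pthread_mutex_lock(&q->lock);
    if (q->tail) q->tail->next = job; else q->head = job;
    q->tail = job;
    pthread_cond_signal(&q->not_empty);    /* wake one waiting worker */
    pthread_mutex_unlock(&q->lock);
}

Job *QueuePop(JobQueue *q)                 /* blocks the worker until a job arrives */
{
    pthread_mutex_lock(&q->lock);
    while (q->head == NULL)
        pthread_cond_wait(&q->not_empty, &q->lock);
    Job *job = q->head;
    q->head = job->next;
    if (q->head == NULL) q->tail = NULL;
    pthread_mutex_unlock(&q->lock);
    return job;
}

With two such queues (backlog and completed), the worker sleeps inside QueuePop instead of spinning or blocking the main thread, and the main thread can check the completed queue with a non-blocking variant of QueuePop (not shown).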
I have recently been working with threads in C++11, and I am now thinking about how to force-stop a thread. I couldn't find an answer on Stack Overflow, and I have also tried these:
One variable per thread: not reliable enough
return in the main thread: I need to force-quit only one thread, not all of them
and I have no more ideas. I have heard about WinAPI, but I want a portable solution (which also means I won't use fork()).
Can you please give me a solution of this? I really want to do it.
One of the biggest problems with force closing a thread in C++ is the RAII violation.
When a function (and subsequently, a thread) finishes gracefully, everything it held is gracefully cleaned up by the destructors of the objects the function/thread created:
Memory gets freed,
OS resources (handles, file descriptors etc.) are closed and returned to the OS,
Locks are unlocked so other threads can use the shared resources they protect,
other important tasks are performed (such as updating counters, logging, etc.).
If you brutally kill a thread (e.g. with TerminateThread on Windows), none of this actually happens, and the program is left in a very dangerous state.
A (not-so-)common pattern that can be used is to register a "cancellation token" which you monitor, shutting the thread down gracefully if another thread asks you to (a la TPL/PPL). Something like:
auto cancellationToken = std::make_shared<std::atomic_bool>();
cancellationToken->store(false);

class ThreadTerminator : public std::exception{/*...*/};

std::thread thread([cancellationToken]{
    try{
        //... do things
        if (cancellationToken->load()){
            //someone asked the thread to close
            throw ThreadTerminator();
        }
        //do other things...
        if (cancellationToken->load()){
            //someone asked the thread to close
            throw ThreadTerminator();
        }
        //...
    }catch(const ThreadTerminator&){
        return;
    }
});
Usually one doesn't even open a new thread for a small task; it's better to think of a multi-threaded application as a collection of concurrent tasks and parallel algorithms. One opens a new thread for some long, ongoing background task which is usually performed in some sort of a loop (such as accepting incoming connections).
So the cases where a small task needs to be cancelled are rare anyway.
tldr:
Is there a reliable way to force a thread to stop in C++?
No.
Here is my approach for most of my designs:
Think of 2 kinds of Threads:
1) primary - I call main.
2) subsequent - any thread launched by main or any subsequent thread
When I launch std::thread's in C++ (or posix threads in C++):
a) I provide all subsequent threads access to a boolean "done", initialized to false. This bool can be directly passed from main (or indirectly through other mechanisms).
b) All my threads have a regular 'heartbeat', typically with a posix semaphore or std::mutex, sometimes with just a timer, and sometimes simply during normal thread operation.
Note that a 'heartbeat' is not polling.
Also note that checking a boolean is really cheap.
Thus, whenever main wants to shut down, it merely sets done to true and 'join's with the subsequent threads.
On occasion main will also signal any semaphore (prior to join) that a subsequent thread might be waiting on.
And sometimes, a subsequent thread has to let its own subsequent thread know it is time to end.
Here is an example -
main launching a subsequent thread:
std::thread* thrd =
new std::thread(&MyClass_t::threadStart, this, id);
assert(nullptr != thrd);
Note that I pass the this pointer to this launch ... within this class instance is a boolean m_done.
Main Commanding shutdown:
In main thread, of course, all I do is
m_done = true;
In a subsequent thread (and in this design, all are using the same critical section):
void threadStart(uint id) {
    std::cout << id << " " << std::flush; // thread announce
    do {
        doOnce(id);    // the critical section is in this method
    } while(!m_done);  // exit when done
}
And finally, at an outer scope, main invokes the join.
Perhaps the take away is - when designing a threaded system, you should also design the system shut down, not just add it on.
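For what it's worth, here is a minimal self-contained sketch of that shutdown design; the class and method names are illustrative, and I've used std::atomic<bool> for the done flag, which is a slightly stricter choice than the plain bool described above:

#include <atomic>
#include <chrono>
#include <thread>
#include <vector>

class Engine {
public:
    void start(unsigned count) {
        for (unsigned id = 0; id < count; ++id)
            m_threads.emplace_back(&Engine::threadStart, this, id);
    }
    void shutdown() {
        m_done = true;                  // main commands shutdown...
        for (auto &t : m_threads)
            t.join();                   // ...then joins the subsequent threads
    }
private:
    void threadStart(unsigned id) {
        do {
            doOnce(id);                 // one unit of work per heartbeat
        } while (!m_done);
    }
    void doOnce(unsigned) {
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
    std::atomic<bool> m_done{false};
    std::vector<std::thread> m_threads;
};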
I have multiple threads processing multiple files in the background, while the program is idle.
To improve disk throughput, I use critical sections to ensure that no two threads ever use the same disk simultaneously.
The (pseudo-)code looks something like this:
void RunThread(HANDLE fileHandle)
{
    // Acquire CRITICAL_SECTION for disk
    CritSecLock diskLock(GetDiskLock(fileHandle));

    for (...)
    {
        // Do some processing on file
    }
}
Once the user requests a file to be processed, I need to stop all threads -- except the one which is processing the requested file. Once the file is processed, then I'd like to resume all the threads again.
Given the fact that SuspendThread is a bad idea, how do I go about stopping all threads except the one that is processing the relevant input?
What kind of threading objects/features would I need -- mutexes, semaphores, events, or something else? And how would I use them? (I'm hoping for compatibility with Windows XP.)
I recommend you go about it in a completely different fashion. If you really want only one thread for every disk (I'm not convinced this is a good idea) then you should create one thread per disk, and distribute files as you queue them for processing.
To implement priority requests for specific files I would then have a thread check a "priority slot" at several points during its normal processing (and of course in its main queue wait loop).
The difficulty here isn't priority as such, it's the fact that you want a thread to back out of a lock that it's holding, to let another thread take it. "Priority" relates to which of a set of runnable threads should be scheduled to run -- you want to make a thread runnable that isn't (because it's waiting on a lock held by another thread).
So, you want to implement (as you put it):
if (ThisThreadNeedsToSuspend()) { ReleaseDiskLock(); WaitForResume(); ReacquireDiskLock(); }
Since you're (wisely) using a scoped lock I would want to invert the logic:
while (file_is_not_finished) {
    WaitUntilThisThreadCanContinue();
    CritSecLock diskLock(blah);
    process_part_of_the_file();
}
ReleasePriority();
...

void WaitUntilThisThreadCanContinue() {
    MutexLock lock(thread_priority_mutex);
    while (thread_with_priority != NOTHREAD and thread_with_priority != thisthread) {
        condition_variable_wait(thread_priority_condvar);
    }
}

void GiveAThreadThePriority(threadid) {
    MutexLock lock(thread_priority_mutex);
    thread_with_priority = threadid;
    condition_variable_broadcast(thread_priority_condvar);
}

void ReleasePriority() {
    MutexLock lock(thread_priority_mutex);
    if (thread_with_priority == thisthread) {
        thread_with_priority = NOTHREAD;
        condition_variable_broadcast(thread_priority_condvar);
    }
}
Read up on condition variables -- all recent OSes have them, with similar basic operations. They're also in Boost and in C++11.
If it's not possible for you to write a function process_part_of_the_file then you can't structure it this way. Instead you need a scoped lock that can release and regain the disklock. The easiest way to do that is to make it a mutex, then you can wait on a condvar using that same mutex. You can still use the mutex/condvar pair and the thread_with_priority object in much the same way.
You choose the size of "part of the file" according to how responsive you need the system to be to a change in priority. If you need it to be extremely responsive then the scheme doesn't really work -- this is co-operative multitasking.
I'm not entirely happy with this answer, the thread with priority can be starved for a long time if there are a lot of other threads that are already waiting on the same disk lock. I'd put in more thought to avoid that. Possibly there should not be a per-disk lock, rather the whole thing should be handled under the condition variable and its associated mutex. I hope this gets you started, though.
You may ask the threads to stop gracefully. Just check some variable in a loop inside the threads and continue or terminate work depending on its value.
Some thoughts about it:
The setting and checking of this value should be done inside a critical section.
Because the critical section slows down the thread, the checking should be done often enough to stop the thread quickly when needed, but rarely enough that the thread isn't stalled by repeatedly acquiring and releasing the critical section.
After each worker thread processes a file, check a condition variable associated with that thread. The condition variable could be implemented simply as a bool plus a critical section, or with the InterlockedExchange* functions. And to be honest, I usually just use an unprotected bool between threads to signal "need to exit" - sometimes with an event handle if the worker thread could be sleeping.
After setting the condition variable for each thread, the main thread waits for each thread to exit via WaitForSingleObject.
DWORD __stdcall WorkerThread(void* pThreadData)
{
    ThreadData* pData = (ThreadData*) pThreadData;
    while (pData->GetNeedToExit() == false)
    {
        ProcessNextFile();
    }
    return 0;
}

void StopWorkerThread(HANDLE hThread, ThreadData* pData)
{
    pData->SetNeedToExit();
    WaitForSingleObject(hThread, INFINITE);
    CloseHandle(hThread);
}

struct ThreadData
{
    CRITICAL_SECTION _cs;
    bool _NeedToExit;

    ThreadData() : _NeedToExit(false)
    {
        InitializeCriticalSection(&_cs);
    }
    ~ThreadData()
    {
        DeleteCriticalSection(&_cs);
    }
    void SetNeedToExit()
    {
        EnterCriticalSection(&_cs);
        _NeedToExit = true;
        LeaveCriticalSection(&_cs);
    }
    bool GetNeedToExit()
    {
        bool returnvalue;
        EnterCriticalSection(&_cs);
        returnvalue = _NeedToExit;
        LeaveCriticalSection(&_cs);
        return returnvalue;
    }
};
You can also use a pool of threads and regulate their work using an I/O completion port.
Normally the threads in the pool sleep awaiting an I/O completion port event/activity.
When you have a request, the I/O completion port releases a thread and it starts to do the job.
OK, how about this:
Two threads per disk, for high and low priority requests, each with its own input queue.
A high-priority disk task, when initially submitted, will then issue its disk requests in parallel with any low-priority task that is running. It can reset a ManualResetEvent that the low-priority thread waits on when it can (WaitForSingleObject), so the low-priority thread gets blocked while the high-priority thread is performing disk ops. The high-priority thread should set the event after finishing a task.
This should limit the disk thrashing to the interval (if any) between the submission of the high-priority task and whenever the low-priority thread can wait on the MRE. Raising the CPU priority of the thread servicing the high-priority queue may help improve performance of the high-priority work in this interval.
Edit: by 'queue', I mean a thread-safe, blocking, producer-consumer queue (just to be clear).
More edit - if the issuing thread needs notification of job completion, the tasks issued to the queues could contain an 'OnCompletion' event to call with the task object as a parameter. The event handler could, for example, signal an AutoResetEvent that the originating thread is waiting on, thus providing synchronous notification.
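A rough sketch of just the gating part of this scheme (the queue plumbing is omitted and the names are placeholders):

#include <windows.h>

// Manual-reset event, initially signalled: low-priority work may proceed.
HANDLE g_lowPriorityMayRun = CreateEvent(NULL, TRUE, TRUE, NULL);

// Low-priority disk thread: before each disk op, wait until allowed.
void LowPriorityDiskOp()
{
    WaitForSingleObject(g_lowPriorityMayRun, INFINITE);
    // ... issue the low-priority disk request ...
}

// High-priority disk thread: block low-priority ops for the duration of a task.
void HighPriorityTask()
{
    ResetEvent(g_lowPriorityMayRun);   // low-priority thread stalls at its next wait
    // ... issue the high-priority disk requests ...
    SetEvent(g_lowPriorityMayRun);     // let low-priority work resume
}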
I'm working on a C/C++ networking project and am having difficulties synchronizing/signaling my threads. Here is what I am trying to accomplish:
Poll a bunch of sockets using the poll function
If any sockets are ready from the POLLIN event then send a signal to a reader thread and a writer thread to "wake up"
I have a class called MessageHandler that sets the signals mask and spawns the reader and writer threads. Inside them I then wait on the signal(s) that ought to wake them up.
The problem is that I am testing all this functionality by sending a signal to a thread yet it never wakes up.
Here is the problem code with further explanation. Note I just have highlighted how it works with the reader thread as the writer thread is essentially the same.
// Called once if allowedSignalsMask == 0 in constructor
// STATIC
void MessageHandler::setAllowedSignalsMask() {
    allowedSignalsMask = (sigset_t*)std::malloc(sizeof(sigset_t));
    sigemptyset(allowedSignalsMask);
    sigaddset(allowedSignalsMask, SIGCONT);
}

// STATIC
sigset_t *MessageHandler::allowedSignalsMask = 0;

// STATIC
void* MessageHandler::run(void *arg) {
    // Apply the signals mask to any new threads created after this point
    pthread_sigmask(SIG_BLOCK, allowedSignalsMask, 0);

    MessageHandler *mh = (MessageHandler*)arg;
    pthread_create(&(mh->readerThread), 0, &runReaderThread, arg);

    sleep(1); // Just sleep for testing purposes; let reader thread execute first
    pthread_kill(mh->readerThread, SIGCONT);
    sleep(1); // Just sleep for testing to let reader thread print without the process terminating
    return 0;
}

// STATIC
void* MessageHandler::runReaderThread(void *arg) {
    int signo;
    for (;;) {
        sigwait(allowedSignalsMask, &signo);
        fprintf(stdout, "Reader thread signaled\n");
    }
    return 0;
}
I took out all the error handling I had in the code to condense it, but I do know for a fact that the thread starts properly and gets to the sigwait call.
The error may be obvious (it's not a syntax error - the above code is condensed from compilable code and I might have screwed it up while editing), but I just can't seem to find/see it, since I have spent far too much time on this problem and confused myself.
Let me explain what I think I am doing and if it makes sense.
Upon creating an object of type MessageHandler it will set allowedSignalsMask to the set of the one signal (for the time being) that I am interested in using to wake up my threads.
I add the signal to the blocked signals of the current thread with pthread_sigmask. All further threads created after this point ought to have the same signal mask now.
I then create the reader thread with pthread_create where arg is a pointer to an object of type MessageHandler.
I call sleep as a cheap way to ensure that my readerThread executes all the way to sigwait()
I send the signal SIGCONT to the readerThread as I am interested in sigwait to wake up/unblock once receiving it.
Again I call sleep as a cheap way to ensure that my readerThread can execute all the way after it woke up/unblocked from sigwait()
Other notes that may be useful, but which I don't think affect the problem:
MessageHandler is constructed and then a different thread is created given the function pointer that points to run. This thread will be responsible for creating the reader and writer threads, polling the sockets with the poll function, and then possibly sending signals to both the reader and writer threads.
I know its a long post but do appreciate you reading it and any help you can offer. If I wasn't clear enough or you feel like I didn't provide enough information please let me know and I will correct the post.
Thanks again.
POSIX threads have condition variables for a reason; use them. You're not supposed to need signal hackery to accomplish basic synchronization tasks when programming with threads.
Here is a good pthread tutorial with information on using condition variables:
https://computing.llnl.gov/tutorials/pthreads/
Or, if you're more comfortable with semaphores, you could use POSIX semaphores (sem_init, sem_post, and sem_wait) instead. But once you figure out why the condition variable and mutex pairing makes sense, I think you'll find condition variables are a much more convenient primitive.
Also, note that your current approach incurs several syscalls (user-space/kernel-space transitions) per synchronization. With a good pthreads implementation, using condition variables should drop that to at most one syscall, and possibly none at all if your threads keep up with each other well enough that the waited-for event occurs while they're still spinning in user-space.
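If you go the semaphore route, a minimal sketch might look like this (names are illustrative; error handling omitted):

#include <pthread.h>
#include <semaphore.h>

static sem_t wakeup;                 /* sem_init(&wakeup, 0, 0) before starting the threads */

static void *readerLoop(void *arg)
{
    for (;;) {
        sem_wait(&wakeup);           /* sleeps here until the polling thread posts */
        /* ... read from whichever sockets are ready ... */
    }
    return NULL;
}

/* called from the polling thread when poll() reports POLLIN */
static void wakeReader(void)
{
    sem_post(&wakeup);
}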
This pattern seems a bit odd, and most likely error prone. The pthread library is rich in synchronization methods, the one most likely to serve your need being in the pthread_cond_* family. These methods handle condition variables, which implement the Wait and Signal approach.
Use SIGUSR1 instead of SIGCONT. SIGCONT doesn't work. Maybe a signal expert knows why.
By the way, we use this pattern because condition variables and mutexes are too slow for our particular application. We need to sleep and wake individual threads very rapidly.
R. points out there is extra overhead due to additional kernel-space calls. Perhaps if you sleep > N threads, then a single condition variable would beat out multiple sigwaits and pthread_kills. In our application, we only want to wake one thread when work arrives. You have to have a condition variable and mutex per thread to do this, otherwise you get the stampede. In a test where we slept and woke N threads M times, signals beat mutexes and condition variables by a factor of 5 (it could have been a factor of 40, but I can't remember anymore....argh). We didn't test futexes, which can wake one thread at a time and are specifically coded to limit trips to kernel space. I suspect futexes would be faster than mutexes.
I have a totally thread-safe FIFO structure (TaskList) to store task classes and multiple threads, some of which create and store tasks while the others process them. The TaskList class has a pop_front() method which returns the first task if there is at least one; otherwise it returns NULL.
Here is an example of processing function:
TaskList tlist;

unsigned _stdcall ThreadFunction(void * qwe)
{
    Task * task;
    while(!WorkIsOver) // a global bool to end all threads.
    {
        while(task = tlist.pop_front())
        {
            // process Task
        }
    }
    return 0;
}
My problem is that sometimes there is no new task in the task list, so the processing threads enter an endless loop (while(!WorkIsOver)) and CPU load increases. Somehow I have to make the threads wait until a new task is stored in the list. I have thought about suspending and resuming, but then I need extra information about which threads are suspended or running, which adds complexity to the coding.
Any ideas?
PS. I am using WinAPI, not Boost or TBB, for threading, because sometimes I have to terminate threads that process for too long and create new ones immediately. This is critical for me, so please do not suggest either of those two.
Thanks
Assuming you are developing this in DevStudio, you can get the control you want using I/O completion ports. Scary name, for a simple tool.
First, create an I/O completion port: CreateIoCompletionPort
Create your pool of worker threads using _beginthreadex / CreateThread
In each worker thread, implement a loop that calls GetQueuedCompletionStatus - The returned lpCompletionKey will be pointing to a work item to process.
Now, whenever you get a work item to process: call PostQueuedCompletionStatus from any thread - passing in the pointer to your work item as the completion key parameter.
That's it. Three API calls and you have implemented a thread-pooling mechanism based on a kernel-implemented queue object. Each call to PostQueuedCompletionStatus will automatically be deserialized onto a thread-pool thread that's blocking on GetQueuedCompletionStatus. The pool of worker threads is created, and maintained, by you, so you can call TerminateThread on any worker threads that are taking too long. Even better - depending on how it is set up, the kernel will only wake up as many threads as needed to ensure that each CPU core is running at ~100% load.
NB. TerminateThread is really not an appropriate API to use. Unless you really know what you are doing, the threads are going to leak their stacks, none of the memory allocated by code on the thread will be deallocated, and so on. TerminateThread is really only useful during process shutdown. There are some articles on the net detailing how to release the known OS resources that are leaked each time TerminateThread is called - if you persist in this approach, you really need to find and read them if you haven't already.
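A rough sketch of those calls wired together (the WorkItem type, the zero-key "quit" convention and the names are placeholders, not a complete implementation):

#include <windows.h>

struct WorkItem { /* task data */ };

// Created once at startup:
//     g_port = CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, 0);
HANDLE g_port;

// Worker thread body (start a few of these with _beginthreadex).
unsigned __stdcall WorkerThread(void *)
{
    DWORD       bytes;
    ULONG_PTR   key;
    OVERLAPPED *ov;
    for (;;)
    {
        // Blocks until something is posted to the port.
        if (!GetQueuedCompletionStatus(g_port, &bytes, &key, &ov, INFINITE))
            break;
        if (key == 0)                     // convention: a zero key means "shut down"
            break;
        WorkItem *item = (WorkItem *)key;
        // ... process *item ...
        delete item;
    }
    return 0;
}

// Any thread can queue work:
void SubmitWork(WorkItem *item)
{
    PostQueuedCompletionStatus(g_port, 0, (ULONG_PTR)item, NULL);
}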
Use a semaphore in your queue to indicate whether there are elements ready to be processed.
Every time you add an item, call ::ReleaseSemaphore to increment the count associated with the semaphore
In the loop in your thread process, call ::WaitForSingleObject() on the handle of your semaphore object. You can give that wait a timeout so that you have an opportunity to know that your thread should exit; otherwise, your thread will be woken up whenever there are one or more items for it to process, which also has the nice side effect of decrementing the semaphore count for you.
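A rough sketch of that worker loop, reusing tlist, Task and WorkIsOver from the question (the semaphore handle and timeout value are illustrative):

#include <windows.h>

// From the question above:
extern TaskList tlist;
extern bool WorkIsOver;

// Created once, e.g. CreateSemaphore(NULL, 0, LONG_MAX, NULL).
// The producer calls ReleaseSemaphore(g_itemsAvailable, 1, NULL) after each tlist.push_back().
HANDLE g_itemsAvailable;

unsigned _stdcall ThreadFunction(void * qwe)
{
    while (!WorkIsOver)
    {
        // Sleep until an item is queued, or time out so WorkIsOver is re-checked.
        if (WaitForSingleObject(g_itemsAvailable, 500) != WAIT_OBJECT_0)
            continue;
        if (Task * task = tlist.pop_front())
        {
            // process Task
        }
    }
    return 0;
}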
If you haven't read it, you should devour Herb Sutter's Effective Concurrency series which covers this topic and many many more.
Use condition variables to implement a producer/consumer queue - example code here.
If you need to support earlier versions of Windows you can use the condition variable in Boost. Or you could build your own by copying the Windows-specific code out of the Boost headers, they use the same Win32 APIs under the covers as you would if you build your own.
Why not just use the existing thread pool? Let Windows manage all of this.
You can use the Windows thread pool!
Or you can use the API calls WaitForSingleObject or WaitForMultipleObjects.
At the very least, use the SwitchToThread API call when a thread has no work.
If TaskList has some kind of wait_until_not_empty method, then use it. If it does not, then a Sleep(1000) (or some other value) may just do the trick. A proper solution would be to create a wrapper around TaskList that uses an auto-reset event handle to indicate whether the list is not empty. You would need to reinvent the current methods for pop/push, with the original task list becoming a member of the new class:
WaitableTaskList::WaitableTaskList()
{
    // task list is empty upon creation
    non_empty_event = CreateEvent(NULL, FALSE, FALSE, NULL);
}

Task* WaitableTaskList::wait_and_pop_front(DWORD timeout)
{
    WaitForSingleObject(non_empty_event, timeout);
    // .. handle error, return NULL on timeout
    Task* result = task_list.pop_front();
    if (!task_list.empty())
        SetEvent(non_empty_event);
    return result;
}

void WaitableTaskList::push_back(Task* item)
{
    task_list.push_back(item);
    SetEvent(non_empty_event);
}
You must pop items from the task list only through methods such as this wait_and_pop_front().
EDIT: actually this is not a good solution. There is a way to have non_empty_event raised even though the list is empty: the situation requires two threads trying to pop and a list holding two items. If the list becomes empty between the if and the SetEvent, we end up with the wrong state. Obviously we need to implement synchronization as well. At this point I would reconsider the simple Sleep again :-)
I have a TCP Server application that serves each client in a new thread using POSIX Threads and C++.
The server calls "listen" on its socket and when a client connects, it makes a new object of class Client. The new object runs in its own thread and processes the client's requests.
When a client disconnects, I want some way to tell my main() thread that this thread is done, so that main() can delete this object and log something like "Client disconnected".
My question is: how do I tell the main thread that a thread is done?
The most straightforward way that I can see is to join the threads. See here. The idea is that on a join call, a command thread will wait until worker threads exit, and then resume.
Alternatively, you could roll something up with some shared variables and mutexes.
If the child thread is really exiting when it is done (rather than waiting for more work), the parent thread can call pthread_join on it which will block until the child thread exits.
Obviously, if the parent thread is doing other things, it can't constantly be blocking on pthread_join, so you need a way to send a message to the main thread to tell it to call pthread_join. There are a number of IPC mechanisms that you could use for this, but in your particular case (a TCP server), I suspect the main thread is probably a select loop, right? If that's the case, I would recommend using pipe to create a logical pipe, and have the read descriptor for the pipe be one of the descriptors that the main thread selects from.
When a child thread is done, it would then write some sort of message to the pipe saying "I'm Done!" and then the server would know to call pthread_join on that thread and then do whatever else it needs to do when a connection finishes.
Note that you don't have to call pthread_join on a finished child thread, unless you need its return value. However, it is generally a good idea to do so if the child thread has any access to shared resources, since when pthread_join returns without error, it assures you that the child thread is really gone and not in some intermediate state between having sent the "I'm Done!" message and actually having exited.
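Here is a rough sketch of that pipe-based notification, with hypothetical names and no error handling; the worker writes its own pthread_t into the pipe, and the main select() loop reads it back and joins:

#include <pthread.h>
#include <unistd.h>
#include <sys/select.h>

static int done_pipe[2];                       /* pipe(done_pipe) once at startup */

static void *ClientThread(void *arg)
{
    /* ... serve the client ... */
    pthread_t self = pthread_self();
    write(done_pipe[1], &self, sizeof self);   /* "I'm done!" */
    return NULL;
}

/* in main(): add done_pipe[0] to the fd_set passed to select() */
static void HandleFinishedThreads(fd_set *readable)
{
    if (FD_ISSET(done_pipe[0], readable)) {
        pthread_t finished;
        read(done_pipe[0], &finished, sizeof finished);
        pthread_join(finished, NULL);          /* the thread has really exited now */
        /* log "Client disconnected", delete the Client object, etc. */
    }
}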
pthread functions return 0 if everything went okay, or an error number if something didn't work.
int ret, joined;
ret = pthread_create(&thread, NULL, connect, (void*) args);
joined = pthread_join(thread, NULL);
If joined is zero, the thread is done. Clean up that thread's object.
While it is possible to implement IPC mechanisms to notify a main thread when other threads are about to terminate, if you want to do something when a thread terminates you should try to let the terminating thread do it itself.
You might look into using pthread_cleanup_push() to establish a routine to be called when the thread is cancelled or exits. Another option might be to use pthread_key_create() to create a thread-specific data key and associated destructor function.
If you don't want to call pthread_join() from the main thread due to blocking, you should detach the client threads by either setting it as option when creating the thread or calling pthread_detach().
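For example, a minimal sketch of a detached client thread that does its own cleanup via an exit/cancellation handler (the names are hypothetical):

#include <pthread.h>
#include <stdio.h>

static void onThreadExit(void *arg)
{
    /* runs when the thread exits or is cancelled */
    printf("Client disconnected: %s\n", (const char *)arg);
}

static void *clientThread(void *arg)
{
    pthread_detach(pthread_self());          /* nobody will join this thread */
    pthread_cleanup_push(onThreadExit, arg);
    /* ... serve the client ... */
    pthread_cleanup_pop(1);                  /* 1 = run the handler now */
    return NULL;
}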
You could use a queue of "thread objects to be deleted", protect access to the queue with a mutex, and then signal a pthread condition variable to indicate that something was available on the queue.
But do you really want to do that? A better model is for each thread to just clean up after itself, and not worry about synchronizing with the main thread in the first place.
Calling pthread_join will block execution of the main thread. Given the description of the problem I don't think it will provide the desired solution.
My preferred solution, in most cases, would be to have the thread perform its own cleanup. If that isn't possible you'll either have to use some kind of polling scheme with shared variables (just remember to make them thread-safe, hint: volatile), or perhaps some sort of OS-dependent callback mechanism. Remember, you want to be blocked on the call to listen, so really consider having the thread clean itself up.
As others have mentioned, it's easy to handle termination of a given thread with pthread_join. But a weak spot of pthreads is funneling information from several sources into a synchronous stream. (Alternately, you could say its strong spot is performance.)
By far the easiest solution for you would be to handle cleanup in the worker thread. Log the disconnection (add a mutex to the log), delete resources as appropriate, and exit the worker thread without signaling the parent.
Adding mutexes to allow manipulation of shared resources is a tough problem, so be flexible and creative. Always err on the side of caution when synchronizing, and profile before optimizing.
I had exactly the same problem as you described. After ~300 open client connections my Linux application was not able to create new threads because pthread_join was never called. For me, using pthread_tryjoin_np helped.
Briefly:
have a map that holds all opened thread descriptors
from the main thread, before a new client thread is opened, I iterate through the map and call pthread_tryjoin_np for each thread recorded in it. If a thread is done, the result of the call is zero, meaning that I can clean up the resources from that thread; at the same time, pthread_tryjoin_np takes care of releasing the thread's resources. If the pthread_tryjoin_np call returns a number different from 0, the thread is still running and I simply do nothing.
A potential problem with this is that I do not see pthread_tryjoin_np as part of the official POSIX standard, so this solution might not be portable.
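For illustration, a rough sketch of that reaping pass (GNU/Linux-specific, since pthread_tryjoin_np requires _GNU_SOURCE; the map layout is hypothetical):

#define _GNU_SOURCE
#include <pthread.h>
#include <map>

std::map<int, pthread_t> g_clientThreads;    // connection id -> thread

void ReapFinishedThreads()
{
    for (auto it = g_clientThreads.begin(); it != g_clientThreads.end(); ) {
        if (pthread_tryjoin_np(it->second, NULL) == 0) {
            // Thread has exited; the join released its resources.
            it = g_clientThreads.erase(it);
        } else {
            ++it;                            // still running (EBUSY); leave it alone
        }
    }
}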