I have 3 threads, resumed at the same time, calling the same function with different arguments. How can I force the thread to leave Critical Section and pass it to another thread?
When I run the code below, the while loop is called many times until another thread enters the Critical Section (and it also loops many times).
DWORD WINAPI ClientThread(LPVOID lpParam)
{
    // thread logic
    while (true)
    {
        EnterCriticalSection(&critical);
        // thread logic
        LeaveCriticalSection(&critical);
        Sleep(0);
    }
    // thread logic
    return 0;
}
In other words, how can I prevent a thread from instantly reentering a section again?
You can't directly ask a thread to leave the critical section. The thread leaves it when it has finished executing the protected code.
So the only options are to prevent it from entering the critical section, or to "ask" it to finish early, e.g. by repeatedly checking an atomic_flag inside the section and aborting the thread's operation once the flag has been set.
If you want to prevent a thread from immediately re-entering a section after it has left it, you can yield; this reschedules the execution of threads.
If you want an exact ordering of threads (A->B->C->D->A->B ...) you need to write a custom scheduler or a custom "fair mutex" that detects other waiting threads.
Edit:
Such a function would be BOOL SwitchToThread(); doc
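For illustration, a minimal sketch of the yielding idea applied to the question's loop (it reuses the question's critical section named critical; note that this still does not guarantee FIFO ordering):
DWORD WINAPI ClientThread(LPVOID lpParam)
{
    while (true)
    {
        EnterCriticalSection(&critical);
        // thread logic
        LeaveCriticalSection(&critical);
        // Give up the remainder of the time slice so another waiting thread
        // can run before this one loops around and re-enters the section.
        if (!SwitchToThread())
            Sleep(0); // no other ready thread on this processor
    }
    return 0;
}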
As mentioned in another answer, you need a fair mutex, and a ticket lock is one of the ways to implement it.
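For illustration, a minimal ticket-lock sketch (my own example, not from that answer): each thread takes a ticket and spins until it is being served, so the lock is granted in FIFO order.
#include <atomic>
#include <thread>

class ticket_mutex
{
public:
    void lock()
    {
        // take the next ticket and wait until it is being served
        const std::size_t my = next.fetch_add(1, std::memory_order_relaxed);
        while (serving.load(std::memory_order_acquire) != my)
            std::this_thread::yield();
    }
    void unlock()
    {
        // serve the next ticket in line
        serving.fetch_add(1, std::memory_order_release);
    }
private:
    std::atomic<std::size_t> next{0};
    std::atomic<std::size_t> serving{0};
};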
There's another way, based on a binary semaphore, and it is actually close to what a critical section used to be. Like this:
class old_cs
{
public:
    old_cs()
    {
        event = CreateEvent(NULL, /* bManualReset = */ FALSE, /* bInitialState = */ TRUE, NULL);
        if (event == NULL) throw std::runtime_error("out of resources");
    }
    ~old_cs()
    {
        CloseHandle(event);
    }
    void lock()
    {
        if (count.fetch_add(1, std::memory_order_acquire) > 0)
            WaitForSingleObject(event, INFINITE);
    }
    void unlock()
    {
        if (count.fetch_sub(1, std::memory_order_release) > 1)
            SetEvent(event);
    }
    old_cs(const old_cs&) = delete;
    old_cs(old_cs&&) = delete;
    old_cs& operator=(const old_cs&) = delete;
    old_cs& operator=(old_cs&&) = delete;
private:
    HANDLE event;
    std::atomic<std::size_t> count = 0;
};
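A usage sketch (assuming the class above): since old_cs exposes lock()/unlock(), it satisfies the BasicLockable requirements and works with std::lock_guard.
old_cs section;

void worker()
{
    std::lock_guard<old_cs> guard(section); // lock() on entry, unlock() on scope exit
    // ... touch the shared state ...
}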
You may find the following in Critical Section Objects documentation:
Starting with Windows Server 2003 with Service Pack 1 (SP1), threads
waiting on a critical section do not acquire the critical section on a
first-come, first-serve basis. This change increases performance
significantly for most code. However, some applications depend on
first-in, first-out (FIFO) ordering and may perform poorly or not at
all on current versions of Windows (for example, applications that
have been using critical sections as a rate-limiter). To ensure that
your code continues to work correctly, you may need to add an
additional level of synchronization. For example, suppose you have a
producer thread and a consumer thread that are using a critical
section object to synchronize their work. Create two event objects,
one for each thread to use to signal that it is ready for the other
thread to proceed. The consumer thread will wait for the producer to
signal its event before entering the critical section, and the
producer thread will wait for the consumer thread to signal its event
before entering the critical section. After each thread leaves the
critical section, it signals its event to release the other thread.
So the algorithm in this post is a simplified version of what the critical section used to be in Windows XP and earlier.
The above algorithm is not a complete critical section: it lacks recursion support, spinning, and handling of low-resource situations.
Also it relies on Windows Event fairness.
I am in the process of implementing message passing from one thread to another.
Thread 1: Callback functions are registered with libraries; on callback, the functions are invoked and the data needs to be sent to another thread for processing, as it takes time.
Thread 2: Thread to check if any messages are available (preferably in a queue) and process them.
Is condition_variable usage with mutex a correct approach to start considering thread 2 processing takes time in which multiple other messages can be added by thread 1?
Is condition_variable usage with mutex a correct approach to start considering thread 2 processing takes time in which multiple other messages can be added by thread 1?
The question is a bit vague about how a condition variable and mutex would be used, but yes, there would definitely be a role for such objects. The high-level view would be something like this:
The mutex would protect access to the message queue. Any read or modification of the queue, by any thread, would be done only while holding the mutex locked.
The message-processing thread would block on the CV in the event that it became ready to process a new message but the queue was empty.
The message-generating thread would signal the CV each time it enqueued a new message.
This is exactly a producer / consumer problem, and you can find a lot of information about such problems using that terminology.
But note also that there are multiple message queue implementations already available to serve exactly your purpose ("message queue" is in fact a standard term for these), so you should consider whether you really want to reinvent this wheel.
In general, mutexes are intended to control access between threads, but they are not great for notifying between threads.
If you design Thread 2 to wait on the condition, you can simply process messages as they are received from Thread 1.
Here is a rough implementation:
void pushFunction(const Message& data) // Message is whatever type your queue holds
{
    // Obtain the mutex (a unique_lock, so it can be unlocked before notifying)
    std::unique_lock<std::mutex> lock(myMutex);
    const bool empty = myQueue.empty();
    myQueue.push(data);
    lock.unlock();
    if (empty)
    {
        conditionVar.notify_one();
    }
}
In Thread 2
void waitForMessage()
{
    std::unique_lock<std::mutex> lock(myMutex);
    while (myQueue.empty())
    {
        conditionVar.wait(lock);
    }
    rxMessage = myQueue.front();
    myQueue.pop();
}
Note that the condition variable can wake up spuriously, so it's important to keep the wait inside the 'while empty' loop.
See https://en.cppreference.com/w/cpp/thread/condition_variable
We're programming on a proprietary embedded platform sitting atop of VxWorks 5.5. In our toolbox, we have a condition variable, that is implemented using a VxWorks binary semaphore.
Now, POSIX provides a wait function that also takes a mutex. This will unlock the mutex (so that some other task might write to the data) and waits for the other task to signal (it is done writing the data). I believe this implements what's called a Monitor, ICBWT.
We need such a wait function, but implementing it is tricky. A simple approach would do this:
bool condition::wait_for(mutex& mutex) const {
    unlocker ul(mutex); // relinquish mutex
    return wait(event);
}                       // ul's dtor grabs mutex again
However, this sports a race condition because it allows another task to preempt this one after the unlocking and before the waiting. The other task can write to the data after it was unlocked and signal the condition before this task starts to wait on the semaphore. (We have tested this, and this indeed happens and blocks the waiting task forever.)
Given that VxWorks 5.5 doesn't seem to provide an API to temporarily relinquish a semaphore while waiting for a signal, is there a way to implement this on top of the provided synchronization routines?
Note: This is a very old VxWorks version that has been compiled without POSIX support (by the vendor of the proprietary hardware, from what I understood).
This should be quite easy with native VxWorks: a message queue is what is required here. Your wait_for method can be used as is.
bool condition::wait_for(mutex& mutex) const
{
    unlocker ul(mutex); // relinquish mutex
    return wait(event);
}                       // ul's dtor grabs mutex again
but the wait(event) code would look like this:
wait(event)
{
    if (msgQRecv(event->q, sigMsgBuf, sigMsgSize, timeoutTime) == OK)
    {
        // got it...
    }
    else
    {
        // timeout, report error or something like that....
    }
}
and your signal code would look something like this:
signal(event)
{
    msgQSend(event->q, sigMsg, sigMsgSize, NO_WAIT, MSG_PRI_NORMAL);
}
So if the signal gets triggered before you start waiting, then msgQRecv will return immediately with the signal when it eventually gets invoked and you can then take the mutex again in the ul dtor as stated above.
The event->q is a MSG_Q_ID that is created at event creation time with a call to msgQCreate, and the data in sigMsg is defined by you... but can be just a random byte of data, or you can come up with a more intelligent structure with information regarding who signaled or something else that may be nice to know.
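For completeness, a hedged sketch of that event creation in the same pseudo-code style as the answer (the queue depth of 1 and the sigMsgSize constant are my assumptions):
createEvent()
{
    // one pending message is enough for a single waiter/single signal;
    // MSG_Q_FIFO delivers signals in the order they were sent
    event->q = msgQCreate(1, sigMsgSize, MSG_Q_FIFO);
    return event;
}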
Update for multiple waiters - this is a little tricky. There are a couple of assumptions I will make to simplify things:
The number of tasks that will be pending is known at event creation time and is constant.
There will be one task that is always responsible for indicating when it is ok to unlock the mutex, all other tasks just want notification when the event is signaled/complete.
This approach uses a counting semaphore, similar to the above with just a little extra logic:
wait(event)
{
    if (semTake(event->csm, timeoutTime) == OK)
    {
        // got it...
    }
    else
    {
        // timeout, report error or something like that....
    }
}
and your signal code would look something like this:
signal(event)
{
    for (int x = 0; x < event->numberOfWaiters; x++)
    {
        semGive(event->csm);
    }
}
The creation of the event is something like this; remember, in this example the number of waiters is constant and known at event creation time. You could make it dynamic, but the key is that every time the event is going to happen, numberOfWaiters must be correct before the unlocker unlocks the mutex.
createEvent(numberOfWaiters)
{
    event->numberOfWaiters = numberOfWaiters;
    event->csm = semCCreate(SEM_Q_FIFO, 0);
    return event;
}
You cannot be wishy-washy about the numberOfWaiters :D I will say it again: The numberOfWaiters must be correct before the unlocker unlocks the mutex. To make it dynamic (if that is a requirement) you could add a setNumWaiters(numOfWaiters) function, and call that in the wait_for function before the unlocker unlocks the mutex, so long as it always sets the number correctly.
Now for the last trick, as stated above the assumption is that one task is responsible for unlocking the mutex, the rest just wait for the signal, which means that one and only one task will call the wait_for() function above, and the rest of the tasks just call the wait(event) function.
With this in mind the numberOfWaiters is computed as follows:
The number of tasks who will call wait()
plus 1 for the task that calls wait_for()
Of course you can also make this more complex if you really need to, but chances are this will work because normally 1 task triggers an event, but many tasks want to know it is complete, and that is what this provides.
But your basic flow is as follows:
init()
{
    event->createEvent(3);
}

eventHandler()
{
    locker l(mutex);
    doEventProcessing();
    signal(event);
}

taskA()
{
    doOperationThatTriggersAnEvent();
    wait_for(mutex);
    eventComplete();
}

taskB()
{
    doWhateverIWant();
    // now I need to know if the event has occurred...
    wait(event);
    coolNowIKnowThatIsDone();
}

taskC()
{
    taskCIsFun();
    wait(event);
    printf("event done!\n");
}
When I write the above I feel like all OO concepts are dead, but hopefully you get the idea. In reality, wait and wait_for should take the same parameter, or no parameter at all and instead be members of the same class that also holds all the data they need to know... but nonetheless that is the overview of how it works.
Race conditions can be avoided if each waiting task waits on a separate binary semaphore.
These semaphores must be registered in a container which the signaling task uses to unblock all waiting tasks. The container must be protected by a mutex.
The wait_for() method obtains a binary semaphore, waits on it and finally deletes it.
void condition::wait_for(mutex& mutex) {
    SEM_ID sem = semBCreate(SEM_Q_PRIORITY, SEM_EMPTY);
    {
        lock l(listeners_mutex); // assure exclusive access to listeners container
        listeners.push_back(sem);
    }                            // l's dtor unlocks listeners_mutex again
    unlocker ul(mutex);          // relinquish mutex
    semTake(sem, WAIT_FOREVER);
    {
        lock l(listeners_mutex);
        // remove sem from listeners
        // ...
        semDelete(sem);
    }
}                                // ul's dtor grabs mutex again
The signal() method iterates over all registered semaphores and unlocks them.
void condition::signal() {
    lock l(listeners_mutex);
    for_each(listeners.begin(), listeners.end(), /* call semGive()... */);
}
This approach assures that wait_for() will never miss a signal. A disadvantage is the need of additional system resources.
To avoid creating and destroying semaphores for every wait_for() call, a pool could be used.
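A hypothetical sketch of such a pool, in the same pseudo-code style as this answer (obtain_sem, recycle_sem, pool, and pool_mutex are illustrative names, not part of the original code):
SEM_ID condition::obtain_sem() {
    lock l(pool_mutex);
    if (pool.empty())
        return semBCreate(SEM_Q_PRIORITY, SEM_EMPTY); // grow the pool on demand
    SEM_ID sem = pool.back();
    pool.pop_back();
    return sem;
}

void condition::recycle_sem(SEM_ID sem) {
    lock l(pool_mutex);
    pool.push_back(sem); // keep it around for the next wait_for() call
}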
From the description, it looks like you may want to implement (or use) a semaphore - it's a standard CS algorithm with semantics similar to condvars, and there are tons of textbooks on how to implement them (https://www.google.com/search?q=semaphore+algorithm).
A random Google result which explains semaphores is at: http://www.cs.cornell.edu/courses/cs414/2007sp/lectures/08-bakery.ppt (see slide 32).
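For reference, a minimal counting-semaphore sketch built on a mutex and a condition variable (illustrative only; C++20 has std::counting_semaphore built in):
#include <condition_variable>
#include <mutex>

class semaphore
{
public:
    explicit semaphore(int initial = 0) : count(initial) {}

    void acquire()
    {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [this] { return count > 0; }); // wait until a permit is available
        --count;
    }

    void release()
    {
        {
            std::lock_guard<std::mutex> lk(m);
            ++count;
        }
        cv.notify_one();
    }

private:
    std::mutex m;
    std::condition_variable cv;
    int count;
};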
I have multiple threads processing multiple files in the background, while the program is idle.
To improve disk throughput, I use critical sections to ensure that no two threads ever use the same disk simultaneously.
The (pseudo-)code looks something like this:
void RunThread(HANDLE fileHandle)
{
    // Acquire CRITICAL_SECTION for disk
    CritSecLock diskLock(GetDiskLock(fileHandle));

    for (...)
    {
        // Do some processing on file
    }
}
Once the user requests a file to be processed, I need to stop all threads -- except the one which is processing the requested file. Once the file is processed, then I'd like to resume all the threads again.
Given the fact that SuspendThread is a bad idea, how do I go about stopping all threads except the one that is processing the relevant input?
What kind of threading objects/features would I need -- mutexes, semaphores, events, or something else? And how would I use them? (I'm hoping for compatibility with Windows XP.)
I recommend you go about it in a completely different fashion. If you really want only one thread for every disk (I'm not convinced this is a good idea) then you should create one thread per disk, and distribute files as you queue them for processing.
To implement priority requests for specific files I would then have a thread check a "priority slot" at several points during its normal processing (and of course in its main queue wait loop).
The difficulty here isn't priority as such, it's the fact that you want a thread to back out of a lock that it's holding, to let another thread take it. "Priority" relates to which of a set of runnable threads should be scheduled to run -- you want to make a thread runnable that isn't (because it's waiting on a lock held by another thread).
So, you want to implement (as you put it):
if (ThisThreadNeedsToSuspend()) { ReleaseDiskLock(); WaitForResume(); ReacquireDiskLock(); }
Since you're (wisely) using a scoped lock I would want to invert the logic:
while (file_is_not_finished) {
    WaitUntilThisThreadCanContinue();
    CritSecLock diskLock(blah);
    process_part_of_the_file();
}
ReleasePriority();

...

void WaitUntilThisThreadCanContinue() {
    MutexLock lock(thread_priority_mutex);
    while (thread_with_priority != NOTHREAD and thread_with_priority != thisthread) {
        condition_variable_wait(thread_priority_condvar);
    }
}

void GiveAThreadThePriority(threadid) {
    MutexLock lock(thread_priority_mutex);
    thread_with_priority = threadid;
    condition_variable_broadcast(thread_priority_condvar);
}

void ReleasePriority() {
    MutexLock lock(thread_priority_mutex);
    if (thread_with_priority == thisthread) {
        thread_with_priority = NOTHREAD;
        condition_variable_broadcast(thread_priority_condvar);
    }
}
Read up on condition variables -- all recent OSes have them, with similar basic operations. They're also in Boost and in C++11.
If it's not possible for you to write a function process_part_of_the_file then you can't structure it this way. Instead you need a scoped lock that can release and regain the disk lock. The easiest way to do that is to make it a mutex; then you can wait on a condvar using that same mutex. You can still use the mutex/condvar pair and the thread_with_priority object in much the same way.
You choose the size of "part of the file" according to how responsive you need the system to be to a change in priority. If you need it to be extremely responsive then the scheme doesn't really work -- this is co-operative multitasking.
I'm not entirely happy with this answer, the thread with priority can be starved for a long time if there are a lot of other threads that are already waiting on the same disk lock. I'd put in more thought to avoid that. Possibly there should not be a per-disk lock, rather the whole thing should be handled under the condition variable and its associated mutex. I hope this gets you started, though.
You may ask the threads to stop gracefully. Just check some variable in the loop inside each thread and continue or terminate the work depending on its value.
Some thoughts about it:
The setting and checking of this value should be done inside a critical section.
Because the critical section slows the thread down, the check should happen often enough that the thread stops quickly when needed, but rarely enough that the thread isn't stalled by acquiring and releasing the critical section.
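A minimal sketch of that idea (names are illustrative; the flag is read and written under the same critical section, as suggested above):
CRITICAL_SECTION stateCs;    // protects stopRequested; InitializeCriticalSection(&stateCs) before starting threads
bool stopRequested = false;

bool ShouldStop()
{
    EnterCriticalSection(&stateCs);
    const bool stop = stopRequested;
    LeaveCriticalSection(&stateCs);
    return stop;
}

DWORD WINAPI Worker(LPVOID)
{
    while (!ShouldStop())
    {
        // process the next chunk of work; keep chunks small enough that a
        // stop request is noticed quickly, but large enough that the lock
        // traffic stays negligible
    }
    return 0;
}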
After each worker thread processes a file, check a condition variable associated with that thread. The "condition variable" could be implemented simply as a bool plus a critical section, or with the InterlockedExchange* functions. And to be honest, I usually just use an unprotected bool between threads to signal "need to exit" - sometimes with an event handle if the worker thread could be sleeping.
After setting the condition variable for each thread, the main thread waits for each thread to exit via WaitForSingleObject.
DWORD __stdcall WorkerThread(void* pThreadData)
{
    ThreadData* pData = (ThreadData*) pThreadData;
    while (pData->GetNeedToExit() == false)
    {
        ProcessNextFile();
    }
    return 0;
}

void StopWorkerThread(HANDLE hThread, ThreadData* pData)
{
    pData->SetNeedToExit();
    WaitForSingleObject(hThread, INFINITE);
    CloseHandle(hThread);
}

struct ThreadData
{
    CRITICAL_SECTION _cs;
    bool _NeedToExit;

    ThreadData() : _NeedToExit(false)
    {
        InitializeCriticalSection(&_cs);
    }
    ~ThreadData()
    {
        DeleteCriticalSection(&_cs);
    }
    void SetNeedToExit()
    {
        EnterCriticalSection(&_cs);
        _NeedToExit = true;
        LeaveCriticalSection(&_cs);
    }
    bool GetNeedToExit()
    {
        bool returnvalue;
        EnterCriticalSection(&_cs);
        returnvalue = _NeedToExit;
        LeaveCriticalSection(&_cs);
        return returnvalue;
    }
};
You can also use a pool of threads and regulate their work by using an I/O completion port.
Normally, threads from the pool sleep awaiting I/O completion port activity.
When you have a request, the I/O completion port releases a thread and it starts to do the job.
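A minimal sketch of that pattern (names are illustrative; using a zero completion key as the shutdown signal is my own convention here):
#include <windows.h>

HANDLE port = CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, 0);

DWORD WINAPI PoolThread(LPVOID)
{
    DWORD bytes;
    ULONG_PTR key;
    OVERLAPPED* overlapped;
    // each pool thread sleeps here until a work item is posted
    while (GetQueuedCompletionStatus(port, &bytes, &key, &overlapped, INFINITE))
    {
        if (key == 0)
            break;                      // shutdown requested
        // ... process the job identified by 'key' (and/or 'overlapped') ...
    }
    return 0;
}

// submitting a job to the pool:
//   PostQueuedCompletionStatus(port, 0, (ULONG_PTR)pJob, NULL);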
OK, how about this:
Two threads per disk, for high and low priority requests, each with its own input queue.
A high-priority disk task, when initially submitted, will at first issue its disk requests in parallel with any low-priority task that is running. It can reset a ManualResetEvent that the low-priority thread waits on (WaitForSingleObject) whenever it can, so the low-priority thread will be blocked while the high-priority thread is performing disk ops. The high-priority thread should set the event after finishing a task.
This should limit the disk thrashing to the interval (if any) between the submission of the high-priority task and whenever the low-priority thread can next wait on the MRE. Raising the CPU priority of the thread servicing the high-priority queue may help the high-priority work along in this interval.
Edit: by 'queue', I mean a thread-safe, blocking, producer-consumer queue, (just to be clear:).
More edit - if the issuing thread needs notification of job completion, the tasks issued to the queues could contain an 'OnCompletion' event to call with the task object as a parameter. The event handler could, for example, signal an AutoResetEvent that the originating thread is waiting on, providing synchronous notification.
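A hedged sketch of the gating arrangement described above (names are illustrative):
// Manual-reset event, initially set, so low-priority disk work may run.
HANDLE lowPriorityMayRun = CreateEvent(NULL, TRUE, TRUE, NULL);

void HighPriorityTask()
{
    ResetEvent(lowPriorityMayRun);   // block further low-priority disk work
    // ... issue the high-priority disk requests ...
    SetEvent(lowPriorityMayRun);     // let low-priority work resume
}

void LowPriorityLoop()
{
    for (;;)
    {
        WaitForSingleObject(lowPriorityMayRun, INFINITE); // gate on the event when we can
        // ... do one chunk of low-priority disk work ...
    }
}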
Let's say I have a class with the function
class foo
{
    ...
    void bar() {
        OutputDebugString(........);
        // more code
    }
};
Is it possible to print the ID of the current thread (or if it's the main application) that is executing the function using OutputDebugString?
I have a large application I'm debugging, have found a deadlock situation, and would like to check which threads are involved in the deadlock, since it could possibly be the same thread that is locking its own critical section.
Have a look at the GetCurrentThread function.
Use GetCurrentThreadId().
Note that a thread cannot deadlock itself on a critical section. Once a thread has obtained the lock to the critical section, it can freely re-enter that same lock as much as it wants (same thing with a mutex). Just make sure to unlock the critical section once for each successful lock (re)entry so that OTHER threads do not become deadlocked.
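A minimal sketch of logging the ID from the question's bar() (the buffer size and formatting are just one option):
void foo::bar()
{
    wchar_t buf[64];
    swprintf_s(buf, 64, L"foo::bar() running on thread %lu\n", GetCurrentThreadId());
    OutputDebugStringW(buf);
    // more code
}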
I am learning multi-threading, and for the sake of understanding I have written a small function using multithreading... it works fine. But I just want to know if it is thread-safe, and whether I followed the correct rules.
void CThreadingEx4Dlg::OnBnClickedOk()
{
    // in thread1 100 elements are copied to myShiftArray (which is a CStringArray)
    thread1 = AfxBeginThread((AFX_THREADPROC)MyThreadFunction1, this);
    WaitForSingleObject(thread1->m_hThread, INFINITE);
    // thread2 waits for thread1 to finish because thread2 is going to make use of
    // myShiftArray (in which thread1 processes it first)
    thread2 = AfxBeginThread((AFX_THREADPROC)MyThreadFunction2, this);
    thread3 = AfxBeginThread((AFX_THREADPROC)MyThreadFunction3, this);
}

UINT MyThreadFunction1(LPARAM lparam)
{
    CThreadingEx4Dlg* pthis = (CThreadingEx4Dlg*)lparam;
    pthis->MyFunction(0, 100);
    return 0;
}

UINT MyThreadFunction2(LPARAM lparam)
{
    CThreadingEx4Dlg* pthis = (CThreadingEx4Dlg*)lparam;
    pthis->MyCommonFunction(0, 20);
    return 0;
}

UINT MyThreadFunction3(LPARAM lparam)
{
    CThreadingEx4Dlg* pthis = (CThreadingEx4Dlg*)lparam;
    WaitForSingleObject(pthis->thread3->m_hThread, INFINITE);
    // here thread3 waits for thread 2 to finish so that thread can continue
    pthis->MyCommonFunction(21, 40);
    return 0;
}

void CThreadingEx4Dlg::MyFunction(int minCount, int maxCount)
{
    for (int i = minCount; i < maxCount; i++)
    {
        // assume myArray is a CStringArray and it has 100 elements added to it.
        // myShiftArray is a CStringArray - public to the class
        CString temp;
        temp = myArray.GetAt(i);
        myShiftArray.Add(temp);
    }
}

void CThreadingEx4Dlg::MyCommonFunction(int min, int max)
{
    for (int i = min; i < max; i++)
    {
        CSingleLock myLock(&myCS, TRUE);
        CString temp;
        temp = myShiftArray.GetAt(i);
        // threadArray is a CStringArray - public to the class
        threadArray.Add(temp);
    }
    myEvent.PulseEvent();
}
Which function do you intend to be "thread-safe"?
I think that the term should be applied to your CommonFunction. This is a function that you intend to be called by several threads (two in this first case).
I think your code has a rule along the lines of:
Thread 2: do some work;
meanwhile Thread 3: wait until Thread 2 finishes, then do some work.
In fact your code has
WaitForSingleObject(pthis->thread3->m_hThread,INFINITE);
maybe waits for the wrong thread?
But back to thread safety. Where is the policing of the safety? It's in the control logic of your threads. Suppose you had lots of threads, how would you extend what you've written? You have lots of logic of the kind:
if thread a has finished and thread b has finished ...
Really hard to get right and maintain. Instead you need to make CommonFunction truly thread safe, that is it needs to tolerate being called by several threads at the same time.
In this case you might do that by putting some kind of mutex around the critical part of the code, which perhaps in this case is the whole function - it's not clear whether you intend to keep the items you copy together or whether you mind if the values are interleaved.
In the latter case the only question is whether myArray and myShiftArray are thread-safe collections:
temp = myArray.GetAt(i);
myShiftArray.Add(temp);
All your other variables are local (on the stack, so owned by the current thread), so you just need to consult the documentation for those collections to determine whether they can safely be called by separate threads.
As I've pointed out before, what you are doing is entirely pointless: you may as well not use threads, since you fire a thread off and then wait for it to complete before doing anything further.
You give precious little information about your CEvent, but your WaitForSingleObject calls are waiting for the threads to enter a signalled state (i.e. for them to exit).
As MyCommonFunction is where the potentially thread-unsafe work occurs, you have correctly protected that area with a critical section; however, threads 2 and 3 don't run concurrently. Remove the WaitForSingleObject from MyThreadFunction3 and you will have both running concurrently in a thread-safe manner, thanks to the critical section.
That said, it's still a tad pointless, as both threads are going to spend most of their time waiting for the critical section to come free. In general you want to structure threads so that there is precious little they need to hit critical sections for, and then, when they do hit a critical section, hit it only for a very short time (i.e. not the vast majority of the function's processing time).
Edit:
A critical section works by saying "I'm holding this critical section; anything else that wants it has to wait." This means that Thread 1 enters the critical section and begins to do what it needs to do. Thread 2 then comes along and says "I want to use the critical section". The kernel tells it "Thread 1 is using the critical section; you have to wait your turn". Thread 3 comes along and gets told the same thing. Threads 2 and 3 are now in a wait state, waiting for that critical section to come free. When Thread 1 finishes with the critical section, Threads 2 and 3 race to see who gets to hold it first, and whichever obtains it, the other has to continue waiting.
Now, in your example above there would be so much waiting for critical sections that it is possible for Thread 1 to be in the critical section with Thread 2 waiting, and before Thread 2 has been given the chance to enter the critical section, Thread 1 has looped back round and re-entered it. This means that Thread 1 could end up doing all its work before Thread 2 ever gets a chance to enter the critical section. Therefore, keeping the amount of work done in the critical section as low as possible compared to the rest of the loop/function will help the threads run simultaneously. In your example one thread will ALWAYS be waiting for the other thread, and hence just doing it serially may actually be faster, as you have no kernel threading overheads.
I.e. the more you avoid critical sections, the less time is lost with threads waiting for each other. They are necessary, however, as you NEED to make sure that 2 threads don't try to operate on the same object at the same time. Certain built-in objects are "atomic", which can help you here, but for non-atomic operations a critical section is a must.
An event is a different sort of synchronisation object. Basically an event is an object that can be in one of two states: signalled or not-signalled. If you WaitForSingleObject on a "not-signalled" event then the thread will be put to sleep until the event enters a signalled state.
This can be useful when you have a thread that MUST wait for another thread to complete something. In general though you want to avoid using such synchronisation objects as much as possible as it destroys the parallel-ness of your code.
Personally I use them when I have a worker thread waiting until it needs to do something. The thread sits in a wait state most of the time, and then when some background processing is required I signal the event. The thread then jumps into life, does what it needs to do, and loops back round to re-enter the wait state. You can also have a variable indicating that the thread needs to exit: set the exit variable to true and then signal the waiting thread. The waiting thread wakes up, says "I should exit", and then exits. Be warned, though, that you "may" need a memory barrier to make sure the exit variable is set before the event is signalled, otherwise the compiler might re-order the operations. That could end up with your thread waking up, finding the exit variable isn't set, doing its thing and then going back to sleep, while the thread that originally sent the signal assumes the worker has exited when it actually hasn't.
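A hedged sketch of that worker pattern (names are illustrative; an atomic flag stands in for the memory barrier mentioned above):
#include <windows.h>
#include <atomic>

HANDLE wakeEvent = CreateEvent(NULL, FALSE, FALSE, NULL); // auto-reset, initially not signalled
std::atomic<bool> shouldExit{false};

DWORD WINAPI Worker(LPVOID)
{
    for (;;)
    {
        WaitForSingleObject(wakeEvent, INFINITE); // sleep until there is something to do
        if (shouldExit.load())
            break;
        // ... do the background processing ...
    }
    return 0;
}

void StopWorker()
{
    shouldExit.store(true); // set the flag first...
    SetEvent(wakeEvent);    // ...then wake the worker so it sees the flag and exits
}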
Whoever said multi-threading was easy eh? ;)
It looks like this is going to work because the threads aren't going to do any work concurrently. If you change the code to make the threads perform work at the same time, you will need to put mutexes (in MFC you can use CCriticalSection for that) around the code that accesses data members which are shared between the threads.