I'm working through an example of protecting a global double using mutexes, but I get this error:
Unhandled exception at 0x77b6308e in
Lab7.exe: 0xC0000005: Access violation
writing location 0x00000068.
I assume this is related to accessing score? (The global double)
#include <windows.h>
#include <iostream>
#include <process.h>
double score = 0.0;
HANDLE threads[10];
CRITICAL_SECTION score_mutex;
unsigned int __stdcall MyThread(void *data)
{
EnterCriticalSection(&score_mutex);
score = score + 1.0;
LeaveCriticalSection(&score_mutex);
return 0;
}
int main()
{
InitializeCriticalSection(&score_mutex);
for (int loop = 0; loop < 10; loop++)
{
threads[loop] = (HANDLE) _beginthreadex(NULL, 0, MyThread, NULL, 0, NULL);
}
WaitForMultipleObjects(10, threads, 0, INFINITE);
DeleteCriticalSection(&score_mutex);
std::cout << score;
while(true);
}
Update:
After fixing the problem with the loop being set to 1000 instead of 10, the error still occurred. However, when I commented out the pieces of code referring to the mutex, the error did not occur.
CRITICAL_SECTION score_mutex;
EnterCriticalSection(&score_mutex);
LeaveCriticalSection(&score_mutex);
InitializeCriticalSection(&score_mutex);
DeleteCriticalSection(&score_mutex);
Update 2
The threads return 0 as per convention (It's been a long week!)
I tried adding back in the mutex-related code, and the program will compile and run fine (other than the race condition issues with the double of course) with CRITICAL_SECTION, InitializeCriticalSection and DeleteCriticalSection all added back in. The problem appears to be with EnterCriticalSection or LeaveCriticalSection, as the error reoccurs when I add them.
The remaining bug in your code is in the call to WaitForMultipleObjects(). You set the 3rd parameter to 0 (FALSE) such that the main thread unblocks as soon as any of the 10 threads finishes.
This causes the call to DeleteCriticalSection() to execute before all threads are finished, creating an access violation when one of the (possibly) 9 other threads starts and calls EnterCriticalSection().
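To illustrate the ordering that matters here, below is a minimal sketch in portable standard C++ rather than the Win32 API: `std::mutex` stands in for the `CRITICAL_SECTION`, and joining every `std::thread` plays the role of waiting with the third parameter set to TRUE. The helper name `run_workers` is hypothetical. In the original Win32 code, the corresponding change would be passing TRUE as the third argument of WaitForMultipleObjects().

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: starts `thread_count` workers that each add 1.0 to a
// shared score, waits for ALL of them, and only then lets the lock go away.
double run_workers(std::size_t thread_count) {
    double score = 0.0;
    std::mutex score_mutex;  // stands in for the CRITICAL_SECTION

    std::vector<std::thread> threads;
    threads.reserve(thread_count);
    for (std::size_t i = 0; i < thread_count; ++i) {
        threads.emplace_back([&score, &score_mutex] {
            std::lock_guard<std::mutex> lock(score_mutex);
            score += 1.0;
        });
    }
    assert(threads.size() == thread_count);

    // Join every worker (the analogue of bWaitAll = TRUE) before the mutex
    // and the score go out of scope.
    for (std::thread& worker : threads) {
        worker.join();
    }
    return score;
}
```

Calling `run_workers(10)` should return 10.0, because no worker can touch the lock after it has been destroyed.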
You're writing beyond the end of your threads[10] array:
for (int loop = 0; loop < 1000; loop++){
threads[loop];
}
threads only has size 10!
Your problem is that WaitForMultipleObjects is not waiting for all the threads to complete, causing the critical section to be prematurely deleted. According to MSDN, the third argument is
bWaitAll [in]
If this parameter is TRUE, the function returns when the state of all objects in the lpHandles array is signaled. If FALSE, the function returns when the state of any one of the objects is set to signaled. In the latter case, the return value indicates the object whose state caused the function to return.
You set this to 0, which makes the call return when ANY ONE of your threads completes. This causes the following DeleteCriticalSection to run while there are still threads waiting to enter the critical section.
You should also declare score as volatile so you don't have a cached-value problem.
Related
I am new here and I hope I am doing everything right.
I was wondering how to find out which thread finishes after waiting for one to finish using the WaitForMultipleObjects command. Currently I have something along the lines of:
int checknum;
int loop = 0;
const int NumThreads = 3;
HANDLE threads[NumThreads];
WaitForMultipleObjects(NumThreads, threads, false, INFINITE);
threads[loop] = CreateThread(0, 0, ThreadFunction, &checknum, 0, 0);
It is only supposed to have a max of three threads running at the same time. So I have a loop to begin all three threads (hence the loop value). The problem is when I go through it again, I would like to change the value of loop to the value of whichever thread just finished its task so that it can be used again. Is there any way to find out which thread in that array had finished?
I would paste the rest of my code, but I'm pretty sure no one needs all 147 lines of it. I figured this snippet would be enough.
When the third parameter is false, WaitForMultipleObjects will return as soon as ANY of the objects is signaled (it doesn't need to wait for all of them).
And the return value indicates which object caused it to return. It will be WAIT_OBJECT_0 for the first object, WAIT_OBJECT_0 + 1 for the second, etc.
I am away from my compiler and I don't know of an online IDE that works with Windows, but here is the rough idea of what you need to do.
const int NumThreads = 3;
HANDLE threads[NumThreads];
//create threads here
DWORD result = WaitForMultipleObjects(NumThreads, threads, false, INFINITE);
if(result >= WAIT_OBJECT_0 && result - WAIT_OBJECT_0 < NumThreads){
int index = result - WAIT_OBJECT_0;
if(!CloseHandle(threads[index])){ //need to close to give handle back to system even though the thread has finished
DWORD error = GetLastError();
//TODO handle error
}
threads[index] = CreateThread(0, 0, ThreadFunction, &checknum, 0, 0);
}
else {
DWORD error = GetLastError();
//TODO handle error
break;
}
At work we do this a bit differently. We have made a library which wraps all needed Windows handle types and performs static type checking (through conversion operators) to make sure you can't wait for an IOCompletionPort with WaitForMultipleObjects (which is not allowed). The wait function is variadic rather than taking an array of handles and its size, and is specialized using SFINAE to use WaitForSingleObject when there is only one. It also takes lambdas as arguments and executes the corresponding one depending on the signaled event.
This is what it looks like:
Win::Event ev;
Win::Thread th([]{/*...*/ return 0;});
//...
Win::WaitFor(ev,[]{std::cout << "event" << std::endl;},
th,[]{std::cout << "thread" << std::endl;},
std::chrono::milliseconds(100),[]{std::cout << "timeout" << std::endl;});
I would highly recommend this type of wrapping because at the end of the day the compiler optimizes it to the same code but you can't make nearly as many mistakes.
I have a loop generating threads via AfxBeginThread, which stores the CWinThread pointers in an array. In each iteration, I check the thread is not null and store the thread's handle in another array.
const unsigned int maxThreads = 2;
CWinThread* threads[maxThreads];
HANDLE* handles[maxThreads];
for(unsigned int threadId=0; threadId < maxThreads; ++threadId)
{
threads[threadId] = AfxBeginThread(endToEndProc, &threadId,
0,0,CREATE_SUSPENDED);
if(threads[threadId] == NULL)
{
// die carefully
}
threads[threadId]->m_bAutoDelete = FALSE;
handles[threadId] = &threads[threadId]->m_hThread;
::ResumeThread(handles[threadId]);
}
DWORD result = ::WaitForMultipleObjects(maxThreads, handles[0],
TRUE, 20000*maxThreads);
But WaitForMultipleObjects always returns with WAIT_FAILED, and GetLastError yields 6 for invalid handle. Either the test for the AfxBeginThread return is insufficient to guarantee the thread was created successfully and the handle will be valid, or the handle is becoming invalid before the WaitForMultipleObjects call, which I thought would be prevented by setting m_bAutoDelete to FALSE.
Is there a better way to wait on multiple threads when they are created by AfxBeginThread?
Note that it is fine when maxThreads=1.
handles[0] is a pointer to a single HANDLE (the m_hThread member of the first CWinThread), possibly followed by unrelated data. Passing maxThreads tells WaitForMultipleObjects to expect that many handles laid out one after another at that address. Hence the error.
This is what you want instead:
HANDLE handles[maxThreads];
//...
handles[threadId] = threads[threadId]->m_hThread;
//...
WaitForMultipleObjects(maxThreads, handles, ...
I'm writing a program which splits off into threads, each thread is timed, and then the times are added together in total_time.
I protect total_time using a mutex.
The program was working fine, until I added 'OutputDebugStringW', which is when I started getting these Unhandled exception / Access violation errors.
for (int loop = 0; loop < THREADS; loop++)
{
threads[loop] = (HANDLE) _beginthreadex(NULL, 0, MandelbrotThread, &m_args[loop], 0, NULL);
}
WaitForMultipleObjects(THREADS, threads, TRUE, INFINITE);
OutputDebugStringW(LPCWSTR(total_time));
Each of these threads does some calculation, which it times, calls EnterCriticalSection, adds the time taken to total_time, calls LeaveCriticalSection, then ends.
I tried adding EnterCriticalSection and LeaveCriticalSection around OutputDebugStringW() but it didn't help to fix the error.
Any thoughts?
Update 1:
Here is the MandelbrotThread function -
unsigned int __stdcall MandelbrotThread(void *data)
{
long long int time = get_time();
MandelbrotArgs *m_args = (MandelbrotArgs *) data;
compute_mandelbrot(m_args->left, m_args->right, m_args->top, m_args->bottom, m_args->y_start, m_args->lines_to_render);
time = time - get_time();
EnterCriticalSection(&time_mutex);
total_time = total_time + time;
LeaveCriticalSection(&time_mutex);
return 0;
}
m_args are the sides of the set to be rendered (so the same for every thread), and the line to start on (y_start) and the number of lines to render.
reinterpret_cast'ing a number to a string will certainly shut up the compiler, but will not make your program magically work. You need to convert it, using sprintf, or preferably boost::lexical_cast (although I'm guessing the latter's not an option for you).
WCHAR buf[32];
wsprintfW(buf, L"%I64d\n", total_time); // explicit W variant, since buf is WCHAR even when UNICODE is not defined
OutputDebugStringW(buf);
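If you would rather avoid Windows-specific format specifiers, the same conversion can be sketched with the standard library. This is only an illustration under assumptions: `format_total_time` is a hypothetical helper name, and `total_time` is assumed to be a `long long`, matching the `long long int` used in MandelbrotThread.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>

// Hypothetical helper: converts the numeric total into wide text instead of
// casting the number itself to a string pointer.
std::wstring format_total_time(long long total_time) {
    wchar_t buffer[32];  // ample room for a 64-bit value, newline, terminator
    const int written = std::swprintf(
        buffer, sizeof(buffer) / sizeof(buffer[0]), L"%lld\n", total_time);
    assert(written > 0);  // swprintf reports failure with a negative value
    return std::wstring(buffer, static_cast<std::size_t>(written));
}
```

The result could then be passed along as `OutputDebugStringW(format_total_time(total_time).c_str());`.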
I have a program that spawns 3 worker threads that do some number crunching, and waits for them to finish like so:
#define THREAD_COUNT 3
volatile LONG waitCount;
HANDLE pSemaphore;
int main(int argc, char **argv)
{
// ...
HANDLE threads[THREAD_COUNT];
pSemaphore = CreateSemaphore(NULL, THREAD_COUNT, THREAD_COUNT, NULL);
waitCount = 0;
for (int j=0; j<THREAD_COUNT; ++j)
{
threads[j] = CreateThread(NULL, 0, Iteration, p+j, 0, NULL);
}
WaitForMultipleObjects(THREAD_COUNT, threads, TRUE, INFINITE);
// ...
}
The worker threads use a custom Barrier function at certain points in the code to wait until all other threads reach the Barrier:
void Barrier(volatile LONG* counter, HANDLE semaphore, int thread_count = THREAD_COUNT)
{
LONG wait_count = InterlockedIncrement(counter);
if ( wait_count == thread_count )
{
*counter = 0;
ReleaseSemaphore(semaphore, thread_count - 1, NULL);
}
else
{
WaitForSingleObject(semaphore, INFINITE);
}
}
(Implementation based on this answer)
The program occasionally deadlocks. If at that point I use VS2008 to break execution and dig around in the internals, there is only 1 worker thread waiting on the Wait... line in Barrier(). The value of waitCount is always 2.
To make things even more awkward, the faster the threads work, the more likely they are to deadlock. If I run in Release mode, the deadlock comes about 8 out of 10 times. If I run in Debug mode and put some prints in the thread function to see where they hang, they almost never hang.
So it seems that some of my worker threads are killed early, leaving the rest stuck on the Barrier. However, the threads do literally nothing except read and write memory (and call Barrier()), and I'm quite positive that no segfaults occur. It is also possible that I'm jumping to the wrong conclusions, since (as mentioned in the question linked above) I'm new to Win32 threads.
What could be going on here, and how can I debug this sort of weird behavior with VS?
How do I debug weird thread behaviour?
Not quite what you said, but the answer is almost always: understand the code really well, understand all the possible outcomes and work out which one is happening. A debugger becomes less useful here, because you can either follow one thread and miss out on what is causing other threads to fail, or follow from the parent, in which case execution is no longer sequential and you end up all over the place.
Now, onto the problem.
pSemaphore = CreateSemaphore(NULL, THREAD_COUNT, THREAD_COUNT, NULL);
From the MSDN documentation:
lInitialCount [in]: The initial count for the semaphore object. This value must be greater than or equal to zero and less than or equal to lMaximumCount. The state of a semaphore is signaled when its count is greater than zero and nonsignaled when it is zero. The count is decreased by one whenever a wait function releases a thread that was waiting for the semaphore. The count is increased by a specified amount by calling the ReleaseSemaphore function.
And here:
Before a thread attempts to perform the task, it uses the WaitForSingleObject function to determine whether the semaphore's current count permits it to do so. The wait function's time-out parameter is set to zero, so the function returns immediately if the semaphore is in the nonsignaled state. WaitForSingleObject decrements the semaphore's count by one.
So what we're saying here, is that a semaphore's count parameter tells you how many threads are allowed to perform a given task at once. When you set your count initially to THREAD_COUNT you are allowing all your threads access to the "resource" which in this case is to continue onwards.
The answer you link uses this creation method for the semaphore:
CreateSemaphore(0, 0, 1024, 0)
Which basically says none of the threads are permitted to use the resource. In your implementation, the semaphore is signaled (>0), so everything carries on merrily until one of the threads manages to decrease the count to zero, at which point some other thread waits for the semaphore to become signaled again, which probably isn't happening in sync with your counters. Remember when WaitForSingleObject returns it decreases the counter on the semaphore.
In the example you've posted, setting:
::ReleaseSemaphore(sync.Semaphore, sync.ThreadsCount - 1, 0);
works because each WaitForSingleObject call that returns decreases the semaphore's count by 1, and there are ThreadsCount - 1 such calls. Once they have all returned, the count is back to 0 and the semaphore is nonsignaled again, so on the next pass every thread waits again because no thread is allowed through.
So in short, set your initial value to zero and see if that fixes it.
Edit: A little explanation. To think of it a different way, a semaphore is like an n-atomic gate. What you usually do is this:
// Set the number of tickets:
HANDLE Semaphore = CreateSemaphore(0, 20, 200, 0);
// Later on in a thread somewhere...
// Get a ticket in the queue
WaitForSingleObject(Semaphore, INFINITE);
// Only 20 threads can access this area
// at once. When one thread has entered
// this area the available tickets decrease
// by one. When there are 20 threads here
// all other threads must wait.
// do stuff
ReleaseSemaphore(Semaphore, 1, 0);
// gives back one ticket.
So the use we're putting semaphores to here isn't quite the one for which they were designed.
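The ticket analogy can be sketched in portable C++ as well. Everything named here is hypothetical: `TicketSemaphore` is a small counting semaphore built from `std::mutex` and `std::condition_variable` (it is not the Win32 semaphore object and has no maximum count), and `max_concurrent_holders` is just a way to observe how many threads hold a ticket at once.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical counting semaphore; the count is the number of free tickets.
class TicketSemaphore {
public:
    explicit TicketSemaphore(int initial_count) : count_(initial_count) {
        assert(initial_count >= 0);
    }
    void acquire() {  // take a ticket, blocking while none are free
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }
    void release(int tickets = 1) {  // give tickets back
        {
            std::lock_guard<std::mutex> lock(mutex_);
            count_ += tickets;
        }
        cv_.notify_all();
    }
private:
    std::mutex mutex_;
    std::condition_variable cv_;
    int count_;
};

// Runs `thread_count` workers through a region guarded by `tickets` tickets
// and returns the largest number of workers seen inside at the same time.
int max_concurrent_holders(int tickets, int thread_count) {
    TicketSemaphore semaphore(tickets);
    std::atomic<int> inside{0};
    std::atomic<int> max_inside{0};
    std::vector<std::thread> workers;
    for (int i = 0; i < thread_count; ++i) {
        workers.emplace_back([&] {
            semaphore.acquire();
            const int now = ++inside;
            int seen = max_inside.load();
            while (now > seen && !max_inside.compare_exchange_weak(seen, now)) {
            }
            std::this_thread::yield();  // linger briefly inside the region
            --inside;
            semaphore.release();
        });
    }
    for (std::thread& worker : workers) {
        worker.join();
    }
    return max_inside.load();
}
```

With 2 tickets and 8 workers, the returned maximum should never exceed 2. Starting the count at zero instead means nobody gets in until some thread calls `release`, which is the behaviour the barrier relies on.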
It's a bit hard to guess exactly what you might be running into. Parallel programming is one of those places that (IMO) it pays to follow the philosophy of "keep it so simple it's obviously correct", and unfortunately I can't say that your Barrier code seems to qualify. Personally, I think I'd have something like this:
// define and initialize the array of events used for the barrier:
HANDLE barrier_[thread_count];
for (int i=0; i<thread_count; i++)
barrier_[i] = CreateEvent(NULL, true, false, NULL);
// ...
void Barrier(size_t thread_num) {
// Signal that this thread has reached the barrier:
SetEvent(barrier_[thread_num]);
// Then wait for all the threads to reach the barrier:
WaitForMultipleObjects(thread_count, barrier_, true, INFINITE);
}
Edit:
Okay, now that the intent has been clarified (need to handle multiple iterations), I'd modify the answer, but only slightly. Instead of one array of Events, have two: one for the odd iterations and one for the even iterations:
// define and initialize the arrays of events used for the barrier:
HANDLE barrier_[2][thread_count];
for (int i=0; i<thread_count; i++) {
barrier_[0][i] = CreateEvent(NULL, true, false, NULL);
barrier_[1][i] = CreateEvent(NULL, true, false, NULL);
}
// ...
void Barrier(size_t thread_num, int iteration) {
// Signal that this thread has reached the barrier:
SetEvent(barrier_[iteration & 1][thread_num]);
// Then wait for all the threads to reach the barrier:
WaitForMultipleObjects(thread_count, barrier_[iteration & 1], true, INFINITE);
ResetEvent(barrier_[iteration & 1][thread_num]);
}
In your barrier, what prevents this line:
*counter = 0;
from being executed while this other one is being executed by another thread?
LONG wait_count = InterlockedIncrement(counter);
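One way to sidestep that race is to keep the counter and its reset under a single lock and to track a generation number. The sketch below is a different technique from the semaphore-based Barrier in the question: it uses portable `std::mutex` and `std::condition_variable` rather than Win32 calls, and the names `ReusableBarrier` and `threads_stay_in_step` are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical reusable barrier: the arrival count and its reset are both
// protected by one mutex, so they cannot interleave.
class ReusableBarrier {
public:
    explicit ReusableBarrier(std::size_t thread_count)
        : thread_count_(thread_count) {
        assert(thread_count > 0);
    }
    void arrive_and_wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        const std::size_t my_generation = generation_;
        if (++waiting_ == thread_count_) {
            waiting_ = 0;   // reset happens under the same lock
            ++generation_;  // opens the barrier for this round only
            cv_.notify_all();
        } else {
            cv_.wait(lock, [&] { return generation_ != my_generation; });
        }
    }
private:
    std::mutex mutex_;
    std::condition_variable cv_;
    const std::size_t thread_count_;
    std::size_t waiting_ = 0;
    std::size_t generation_ = 0;
};

// Returns true if, after every barrier, all threads had already arrived
// for that round (that is, no thread ran ahead of the others).
bool threads_stay_in_step(std::size_t thread_count, int rounds) {
    ReusableBarrier barrier(thread_count);
    std::atomic<long> arrivals{0};
    std::atomic<bool> in_step{true};
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < thread_count; ++t) {
        workers.emplace_back([&] {
            for (int round = 1; round <= rounds; ++round) {
                ++arrivals;
                barrier.arrive_and_wait();
                const long expected =
                    static_cast<long>(round) * static_cast<long>(thread_count);
                if (arrivals.load() < expected) {
                    in_step = false;
                }
            }
        });
    }
    for (std::thread& worker : workers) {
        worker.join();
    }
    return in_step.load() &&
           arrivals.load() ==
               static_cast<long>(rounds) * static_cast<long>(thread_count);
}
```

The generation number is what makes the barrier safe to reuse: a fast thread that re-enters for the next round increments the count for the new generation and cannot release waiters from the previous one twice.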
Why does the code sample below cause one thread to execute way more than another but a mutex does not?
#include <windows.h>
#include <conio.h>
#include <process.h>
#include <iostream>
using namespace std;
typedef struct _THREAD_INFO_ {
COORD coord; // a structure containing x and y coordinates
INT threadNumber; // each thread has its own number
INT count;
}THREAD_INFO, * PTHREAD_INFO;
void gotoxy(int x, int y);
BOOL g_bRun;
CRITICAL_SECTION g_cs;
unsigned __stdcall ThreadFunc( void* pArguments )
{
PTHREAD_INFO info = (PTHREAD_INFO)pArguments;
while(g_bRun)
{
EnterCriticalSection(&g_cs);
//if(TryEnterCriticalSection(&g_cs))
//{
gotoxy(info->coord.X, info->coord.Y);
cout << "T" << info->threadNumber << ": " << info->count;
info->count++;
LeaveCriticalSection(&g_cs);
//}
}
ExitThread(0);
return 0;
}
int main(void)
{
// OR unsigned int
unsigned int id0, id1; // a place to store the thread ID returned from CreateThread
HANDLE h0, h1; // handles to threads
THREAD_INFO tInfo[2]; // only one of these - not optimal!
g_bRun = TRUE;
ZeroMemory(&tInfo, sizeof(tInfo)); // win32 function - memset(&buffer, 0, sizeof(buffer))
InitializeCriticalSection(&g_cs);
// setup data for the first thread
tInfo[0].threadNumber = 1;
tInfo[0].coord.X = 0;
tInfo[0].coord.Y = 0;
h0 = (HANDLE)_beginthreadex(
NULL, // no security attributes
0, // default stack size
&ThreadFunc, // pointer to function
&tInfo[0], // each thread gets its own data to output
0, // 0 for running or CREATE_SUSPENDED
&id0 ); // return thread id - reused here
// setup data for the second thread
tInfo[1].threadNumber = 2;
tInfo[1].coord.X = 15;
tInfo[1].coord.Y = 0;
h1 = (HANDLE)_beginthreadex(
NULL, // no security attributes
0, // default stack size
&ThreadFunc, // pointer to function
&tInfo[1], // each thread gets its own data to output
0, // 0 for running or CREATE_SUSPENDED
&id1 ); // return thread id - reused here
_getch();
g_bRun = FALSE;
return 0;
}
void gotoxy(int x, int y) // x=column position and y=row position
{
HANDLE hdl;
COORD coords;
hdl = GetStdHandle(STD_OUTPUT_HANDLE);
coords.X = x;
coords.Y = y;
SetConsoleCursorPosition(hdl, coords);
}
This may not answer your question, but the behavior of critical sections changed on Windows Server 2003 SP1 and later.
If you have bugs related to critical sections on Windows 7 that you can't reproduce on an XP machine you may be affected by that change.
My understanding is that on Windows XP critical sections used a FIFO based strategy that was fair for all threads while later versions use a new strategy aimed at reducing context switching between threads.
There's a short note about this on the MSDN page about critical sections.
You may also want to check this forum post.
Critical sections, like mutexes, are designed to protect a shared resource against conflicting access (such as concurrent modification). Critical sections are not meant to replace thread priority.
You have artificially introduced a shared resource (the screen) and made it into a bottleneck. As a result, the critical section is highly contended. Since both threads have equal priority, there is no reason for Windows to prefer one thread over another. Reduction of context switches, however, is a reason to pick one thread over another. As a result of that reduction, the utilization of the shared resource goes up. That is a good thing; it means that one thread will be finished a lot earlier and the other thread will finish a bit earlier.
To see the effect graphically, compare
A B A B A B A B A B
to
AAAAA BBBBB
The second sequence is shorter because there's only one switch from A to B.
In hand-wavy terms:
CriticalSection is saying the thread wants control to do some things together.
Mutex is making a marker to show 'being busy' so others can wait and notifying of completion so somebody else can start. Somebody else already waiting for the mutex will grab it before you can start the loop again and get it back.
So what you are getting with CriticalSection is a failure to yield between loops. You might see a difference if you had Sleep(0); after LeaveCriticalSection.
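As a rough illustration of yielding between loop iterations, here is a portable C++ sketch in which `std::this_thread::yield()` plays the part of Sleep(0) and `std::mutex` stands in for the critical section. The function name `run_two_counters` is hypothetical, and yielding is only a hint to the scheduler: it does not guarantee that the two threads alternate.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical demo: two workers update their own counter under one lock and
// offer the CPU to the other thread after every unlock.
std::pair<int, int> run_two_counters(int iterations) {
    assert(iterations >= 0);
    std::mutex console_mutex;  // stands in for g_cs
    int shared_updates = 0;    // protected by console_mutex
    int counts[2] = {0, 0};    // one slot per worker

    auto worker = [&](int slot) {
        for (int i = 0; i < iterations; ++i) {
            {
                std::lock_guard<std::mutex> lock(console_mutex);
                ++counts[slot];
                ++shared_updates;
            }  // lock released here
            std::this_thread::yield();  // the analogue of Sleep(0)
        }
    };

    std::thread first(worker, 0);
    std::thread second(worker, 1);
    first.join();
    second.join();

    assert(shared_updates == 2 * iterations);
    return std::make_pair(counts[0], counts[1]);
}
```

Because each worker runs a fixed number of iterations, both counters end at the same value. In the original program, which loops until a key is pressed, the yield would only make it more likely that the other thread gets the lock next.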
I can't say why you're observing this particular behavior, but it's probably to do with the specifics of the implementation of each mechanism. What I CAN say is that unlocking then immediately locking a mutex is a bad thing. You will observe odd behavior eventually.
From some MSDN docs (http://msdn.microsoft.com/en-us/library/ms682530.aspx):
Starting with Windows Server 2003 with Service Pack 1 (SP1), threads waiting on a critical section do not acquire the critical section on a first-come, first-serve basis. This change increases performance significantly for most code