So App Verifier is throwing this exception. From what I gather, the text of this message is a little misleading. The problem appears to be that the critical section was created by a thread that is being destroyed before the critical section is destroyed.
It's a relatively simple fix, but does anyone know what the ramifications are of having a thread other than the creating one destroy the critical section? How dangerous is it? Is the concern only that the critical section handle will "leak", or is there a more insidious side effect?
Some other info:
App written in C++ (on Windows, of course)
Critical section created with InitializeCriticalSection
Critical section is eventually deleted with DeleteCriticalSection
I believe you are correct in your interpretation of the message. The only reference I can find is the one below; the stack trace is a good clue, as the author suggests:
http://jpassing.wordpress.com/2008/02/18/application-verifier-thread-cannot-own-a-critical-section/
I dug around for a bit and cannot find any specific reason why you cannot create and delete a critical section on different threads. However, I do wonder why you want to do so. It seems like best practice to have one thread own a critical section, so to speak. Handing the critical section off between threads introduces another means of communication and a potential source of error (it can be done; it's just more fun).
The accepted answer talks about the creation of the critical section, which is not what this message is about. Eran's answer covered the real cause of the message, but here it is in TL;DR terms:
Application Verifier detected that a thread that acquired a critical section lock is attempting to exit while the section is still locked.
It is not complaining that the thread created the critical section and is now terminating. This has nothing to do with who creates and destroys the section. It is complaining, quite legitimately, that the thread owns the lock on that critical section and is terminating. They could have made the wording of that message so much clearer.
As opposed to beasts such as COM objects, a critical section's life cycle is not bound to a particular thread. Critical sections are constructed in two stages: upon creation, they are merely a few structures. Upon contention between two or more threads, a kernel event object is created to handle the synchronization properly. Both the structures and the kernel object can be created, accessed, and destroyed by any thread, whether or not it is the one that created the critical section.
To answer your questions: the above means you should have no problem creating the CS in one thread and destroying it in another. It might, however, point to a problem with the design. If you're not doing so already, consider wrapping the CS with a class that initializes the CS in its constructor and destroys it in its destructor (MFC's CCriticalSection does that). Then have the wrapper created at a higher scope than the code that uses it (global, static class member, whatever). That should make the creation and cleanup much easier.
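A minimal sketch of the wrapper idea, using std::mutex as a portable stand-in for a CRITICAL_SECTION (the comments mark where the corresponding Win32 calls would go; the class names are illustrative, not MFC's):

```cpp
#include <mutex>

// RAII wrapper: the lock's lifetime is tied to the wrapper object, not to
// any particular thread, mirroring the Initialize/DeleteCriticalSection pairing.
class Lock {
public:
    Lock() = default;                  // InitializeCriticalSection would go here
    ~Lock() = default;                 // DeleteCriticalSection would go here
    Lock(const Lock&) = delete;
    Lock& operator=(const Lock&) = delete;
    void Acquire() { m_.lock(); }      // EnterCriticalSection
    void Release() { m_.unlock(); }    // LeaveCriticalSection
private:
    std::mutex m_;
};

// Scoped guard: the lock is released when the guard goes out of scope,
// even if an exception is thrown inside the protected region.
class ScopedLock {
public:
    explicit ScopedLock(Lock& l) : l_(l) { l_.Acquire(); }
    ~ScopedLock() { l_.Release(); }
    ScopedLock(const ScopedLock&) = delete;
    ScopedLock& operator=(const ScopedLock&) = delete;
private:
    Lock& l_;
};
```

Declaring the Lock at a higher scope than the threads that use it (global, static member, or owned by whatever object outlives the threads) makes the create/destroy ordering a non-issue.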
And lastly, regarding the error message itself: is it possible the thread that is being deleted has entered the CS without having had the chance to leave it (due to an exception or similar)? If that is the case, you've got a weird-looking message that points to a real problem.
Related
So we have an assertion engine.
What it does is create an assert helper thread, suspend every other thread, then pop up some interactive UI in the helper thread to talk to the user about the assertion failure. (We suspend the other threads because we want a snapshot of the program state at the point the assert failed, and we don't want the other threads to advance.)
This works well, most of the time.
A small percentage of the time, one of the suspended threads has held a lock -- usually the debug heap critical section -- and the assert helper thread blocks on its next allocation (which is hard to avoid doing).
I can see two ways around this. The first is to do away with in-process assertion handling (launch an out-of-process assertion dialog and use IPC to communicate back and forth). It is possible that this way we could manage that communication without heap allocations. Maybe.
That is a bunch of work, because it means we have to move the in-process stack-walking code out of process, etc.
The way we are trying right now is to add a watchdog thread. It notices if the assert helper thread has failed to make progress (maybe a timer message fails to be sent, maybe its instruction counter stops moving; the implementation detail is irrelevant).
When it detects this case, it tries to break the deadlock.
Our current method is to pick threads essentially at random, wake them up, then suspend them again, until we detect progress from the assert helper thread. This is ... haphazard, and slow.
To make picking the right thread faster, I'd like to determine whether a given Windows thread currently holds a critical section (and maybe other synchronization primitives). Then we can try those threads first.
So, is there a way to determine whether a Windows thread holds a CriticalSection while it is suspended?
I don't think there's a documented way to tell if a thread is in a critical section, and, if there was, I don't think it would be the right approach to your problem.
But to answer the question: you can peek inside the CRITICAL_SECTION data structure and see the ID of the thread that owns it (the OwningThread field). This doesn't directly answer the question "Is this thread inside any critical section?", but it does let you answer "Is this thread inside this critical section?" At least until some key implementation detail of CRITICAL_SECTION changes.
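As a sketch of that peek: the struct below is a simplified, hypothetical mirror of the RTL_CRITICAL_SECTION layout from winnt.h, used here only so the idea compiles portably. Real code would include <windows.h> and read the actual CRITICAL_SECTION, and this layout is an undocumented implementation detail that can change between Windows versions.

```cpp
#include <cstdint>

// Simplified mirror of the RTL_CRITICAL_SECTION layout (illustrative only;
// real code would use the actual CRITICAL_SECTION type from <windows.h>).
struct CriticalSectionMirror {
    void*     DebugInfo;
    long      LockCount;
    long      RecursionCount;
    void*     OwningThread;   // owner's thread ID, stored as a handle-sized value
    void*     LockSemaphore;  // kernel wait object, created on demand
    uintptr_t SpinCount;
};

// "Is this thread inside this critical section?" -- compare the suspended
// thread's ID (e.g. from GetThreadId on its handle) against OwningThread.
bool OwnsCriticalSection(const CriticalSectionMirror& cs, uintptr_t threadId) {
    return cs.OwningThread == reinterpret_cast<void*>(threadId);
}
```

The watchdog could run this check over the CRITICAL_SECTION objects it knows about (the debug heap lock being the obvious candidate) and try the owning threads first.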
For your actual problem, I'd ask what benefit your assertion engine gives that isn't better handled by attaching a debugger when an assertion fails. A debugger is external, bypassing any deadlocks, and already knows how to walk the stacks, so you don't have to re-implement that.
I was just playing with Intel Parallel inspector on my project, and it displays a warning:
One or more threads in the application accessed the stack of another
thread. This may indicate one or more bugs in your application.
I do indeed have some objects that are allocated on stack shared between threads. I don't see why this is a problem. Any hints?
It's not wrong; it's just possibly wrong. Tools like Intel Parallel Inspector, which provide additional diagnostics for your program, must make a trade-off between false positives and false negatives. In this case, it seems the developers judged that accessing the stack of another thread is much more likely to be a bug than not, so reporting it yields few false positives, while staying silent would yield many false negatives.
Valgrind is another example of a tool that can signal errors in code that is correct.
The real question here is, "what is the other thread doing?" If you think, "maybe it will return from that function and the stack frame will be invalid," then you are doing parallel programming wrong. No answer about multithreaded behavior should be qualified with "maybe". You had better make sure that that thread doesn't return, for example, by making it wait on a semaphore or condition variable, or by making it join with the other threads.
Discussion
Pubby: "AFAIK it's hugely inefficient."
The only reason it would be inefficient is that you might have multiple cores modifying the same cache lines, which is the same problem you have with other kinds of shared memory.
Collin: How do you know the stack frame is still good in another thread?
If you use something in multiple threads, you use some kind of synchronization mechanism to ensure that it's not modified in an invalid way. This situation is no different.
H2CO3: Well, is there a reason you should not walk into another person's house?
If we're going to play with analogies, I'd say that the process is the house, and each of the threads are people in the house. If Dave keeps a list of chores in his room, I'll go into his room every time I need to look at the list. If he stops doing that, he'd better tell me or else I'll start writing on random pieces of paper on his desk.
Conclusion
It's a matter of style whether this program behavior is acceptable or not.
Imagine this -- a thread is executing and a method is called which has a local (stack) variable (an object). It adds this object to a work queue, a queue which is processed by a separate thread.
That thread gets to the item added by the first thread and accesses the object, on the stack, of the first thread.
What has the first thread done in the meantime? It may have exited the method and freed up that stack space. That freed space may or may not be re-used. The second thread accessing the stack of the first thread may or may not work correctly, depending on timing and the call graph.
If you know the stack variable will exist while the second thread processes it then it can be safe to do; for example, if Thread 1 queues a stack variable and then blocks until Thread 2 notifies it has finished processing, that is a safe operation.
A warning rather than an error is issued because this may or may not be a legitimate operation, and there's no way for an analyzer to be certain.
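The safe hand-off described above (Thread 1 queues a stack variable, then blocks until Thread 2 signals it has finished) can be sketched portably with std::thread and a condition variable; the names here are illustrative:

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// One work item: a pointer into the producer's stack frame, plus the
// synchronization needed for the producer to wait until processing is done.
struct WorkItem {
    int* value = nullptr;   // points at a local variable on the producer's stack
    bool done = false;
    std::mutex m;
    std::condition_variable cv;
};

int RunHandOff() {
    int local = 20;         // lives on this thread's stack
    WorkItem item;
    item.value = &local;

    std::thread worker([&item] {
        {
            std::lock_guard<std::mutex> lk(item.m);
            *item.value += 22;  // touches the other thread's stack -- safe here,
            item.done = true;   // because the producer is blocked below
        }
        item.cv.notify_one();
    });

    // Block until the worker is finished, so 'local' cannot go out of scope
    // while the worker still holds a pointer into this stack frame.
    std::unique_lock<std::mutex> lk(item.m);
    item.cv.wait(lk, [&item] { return item.done; });
    lk.unlock();
    worker.join();
    return local;
}
```

The key property is that the producer's stack frame is pinned for the whole time the consumer can see it; remove the wait and the code becomes exactly the hazard the warning describes.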
So I have called CreateMutex like so:
while (1) {
    HANDLE h;
    h = CreateMutex(NULL, TRUE, "mutex1");
    y = WaitForSingleObject(h, INFINITE);
    // random code
    ReleaseMutex(h);
}
It runs fine after looping twice, but deadlocks on WaitForSingleObject(h, INFINITE) after the third loop. This is with two threads running concurrently. How can it deadlock when ReleaseMutex is called? Is CreateMutex being called correctly?
You're waiting on a mutex your thread already owns: you create it with bInitialOwner = TRUE, then call WaitForSingleObject on it, which just bumps the ownership count to two. The single ReleaseMutex per iteration only brings the count back down to one, so the mutex is never actually released and the other thread waits forever. Please don't do that.
Also, you're not destroying the mutex, only releasing it. The next call should give you ERROR_ALREADY_EXISTS. The complete quote from MSDN is "If the mutex is a named mutex and the object existed before this function call, the return value is a handle to the existing object, GetLastError returns ERROR_ALREADY_EXISTS, bInitialOwner is ignored, and the calling thread is not granted ownership."
If any of the "random code" waits for the other thread to make progress, it could deadlock while owning the mutex. In which case the other thread will wait forever trying to acquire the mutex, which is the behavior you're seeing.
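A corrected shape of the loop, sketched with a portable std::mutex stand-in since the bug is in the create/acquire/release pattern rather than the API. The Win32 fix is the same shape: call CreateMutex once, with bInitialOwner = FALSE, before the loop; pair each WaitForSingleObject with exactly one ReleaseMutex; and CloseHandle the handle when done.

```cpp
#include <mutex>

// The shared lock is created once, outside the loop, and never acquired
// at creation time -- the equivalent of CreateMutex(NULL, FALSE, "mutex1").
std::mutex gMutex;
long gCounter = 0;  // stands in for the shared state the "random code" touches

void Worker(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        gMutex.lock();      // WaitForSingleObject(h, INFINITE)
        ++gCounter;         // random code
        gMutex.unlock();    // ReleaseMutex(h): exactly one release per acquire
    }
}
```

With one acquire matched by one release per iteration, two threads running Worker concurrently take turns instead of deadlocking.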
I suspect you are trying to implement mutual exclusion within a single process. If so, the correct synchronization object is the critical section. The naming of these objects is a little confusing, because both mutexes and critical sections perform mutual exclusion.
The interface for the critical section is much simpler to use, it being essentially an acquire function and a corresponding release function. If you are synchronizing within a single process, and you need a simple lock (rather than, say, a semaphore), you should use critical sections rather than mutexes.
In fact, very recently here on Stack Overflow, I wrote a more detailed answer to a question which described the standard usage pattern for critical sections. That post has lots of links to the pertinent sections of MSDN documentation.
You only need a mutex when you are performing cross-process synchronization. Indeed, you should only use a mutex when you are synchronizing across processes, because critical sections perform so much better (i.e., faster).
I used to use critical sections (in C++) to block threads' execution while accessing shared data, but for that to work, the threads need to wait until the data is no longer in use before blocking, so maybe it's better to use them in the main thread, or in the worker thread.
So if I want my main program to have priority and not be blocked, must I use critical sections inside it to block the other thread, or the contrary?
You seem to have rather a misconception over what critical sections are and how they work.
Speaking generically, a critical section (CS) is a piece of code that needs to run "exclusively" -- i.e., you need to ensure that only one thread is executing that piece of code at any given time.
As the term is used in most environments, a CS is really a mutex -- a mutual exclusion semaphore (aka binary semaphore). It's a data structure (and set of functions) you use to ensure that a section of code gets executed exclusively (rather than referring to the code itself).
In any case, a CS only makes sense at all when/if you have some code that will execute in more than one thread, and you need to ensure that it only ever executes in one thread at any given time. This is typically when you have some shared data that could and would be corrupted if more than one thread tried to manipulate it at one time. When/if that arises, you need to "use" the critical section for every thread that manipulates that data to assure that the shared data isn't corrupted.
Assuring that a particular thread remains responsive is a whole separate question. In most cases, this means using a queue (for one possibility) to allow the thread to "hand off" a task to some other thread quickly, with minimal contention (i.e., instead of using a CS for the duration of processing the data, the CS only lasts long enough to put a data structure into a queue, and some other thread takes the processing from there).
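The queue hand-off described above, where the lock is held only long enough to enqueue an item and the heavy processing happens in another thread, can be sketched like this (portable std::mutex/std::condition_variable stand-ins; names are illustrative):

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>

// The responsive thread holds the lock only for the duration of Push();
// the worker does its long-running processing outside the lock entirely.
class WorkQueue {
public:
    void Push(int task) {
        {
            std::lock_guard<std::mutex> lk(m_);  // lock held just for the enqueue
            q_.push(task);
        }
        cv_.notify_one();
    }

    // Blocks until a task is available; returns false once closed and drained.
    bool Pop(int& task) {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return closed_ || !q_.empty(); });
        if (q_.empty()) return false;
        task = q_.front();
        q_.pop();
        return true;                              // process 'task' after unlocking
    }

    void Close() {
        { std::lock_guard<std::mutex> lk(m_); closed_ = true; }
        cv_.notify_all();
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<int> q_;
    bool closed_ = false;
};
```

Because contention is limited to the few instructions inside Push() and Pop(), the producing thread stays responsive no matter how long each task takes to process.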
You cannot say "I am using the critical section in thread A but not in thread B." A critical section is a piece of code that accesses a shared resource. When this code is executed by two threads that run in parallel, the shared resource might get corrupted, so you need to synchronise access to it: you need to use one of the synchronisation objects (mutexes, semaphores, events... depending on the platform and API you are using). Thread A locks the critical section, so Thread B needs to wait till Thread A releases it.
If you want your main thread to block (wait) less than the working thread, set the working thread's priority lower than the priority of the main thread.
Is there a limit to the number of critical sections I can initialize and use?
My app creates a number of (a couple of thousand) objects that need to be thread-safe. If I have a critical section within each, will that use up too many resources?
I thought that because I need to declare my own CRITICAL_SECTION object, I don't waste kernel resources like I would with a Win32 Mutex or Event? But I just have a nagging doubt...?
To be honest, not all those objects probably need to be thread-safe for my application, but the critical section is in some low-level base class in a library, and I do need a couple of thousand of them!
I may have the opportunity to modify this library, so I was wondering if there is any way to lazily create (and then use from then on) the critical section only when I detect the object is being used from a different thread to the one it was created in? Or is this what Windows would do for me?
There's no limit to the number of CRITICAL_SECTION structures that you can declare -- they're just POD data structures at the lowest level. There may be some limit to the number that you can initialize with InitializeCriticalSection(). According to the documentation, it might raise a STATUS_NO_MEMORY exception on Windows 2000/XP/Server 2003, but apparently it's guaranteed to succeed on Vista. They don't occupy any kernel resources until you initialize them (if they take any at all).
If you find that the STATUS_NO_MEMORY exception is being raised, you can try initializing the CRITICAL_SECTION for a given object only if there's a chance it could be used by multiple threads. If you know a particular object will only be used by one thread, set a flag, and then skip all calls to InitializeCriticalSection(), EnterCriticalSection(), LeaveCriticalSection(), and DeleteCriticalSection().
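The flag idea can be sketched like this, with std::optional<std::mutex> standing in for a conditionally initialized CRITICAL_SECTION (with the real API, emplace() would correspond to InitializeCriticalSection and the optional's destruction to DeleteCriticalSection; the class name is illustrative):

```cpp
#include <mutex>
#include <optional>

// The lock is only constructed -- and only used -- when the object is
// flagged as shared between threads; single-threaded objects pay nothing.
class MaybeLocked {
public:
    explicit MaybeLocked(bool shared) {
        if (shared) lock_.emplace();   // InitializeCriticalSection, only if needed
    }
    void Enter() { if (lock_) lock_->lock(); }    // EnterCriticalSection or no-op
    void Leave() { if (lock_) lock_->unlock(); }  // LeaveCriticalSection or no-op
    bool IsShared() const { return lock_.has_value(); }
private:
    std::optional<std::mutex> lock_;   // destroyed automatically: DeleteCriticalSection
};
```

Windows will not do this for you: once a CRITICAL_SECTION is initialized, Enter/Leave always execute their fast path, so skipping them entirely for known single-threaded objects is something the library has to decide itself.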
If you read the documentation for InitializeCriticalSectionWithSpinCount() carefully, it is clear that each critical section is backed by an Event object, although the API for critical sections treats them as opaque structures. Additionally, the 'Windows 2000' comment on the dwSpinCount parameter states that the event object is "allocated on demand."
I do not know of any documentation that says what conditions satisfy 'on demand,' but I would suspect that it is not created until a thread blocks while entering the critical section. For critical sections with a spin count, it may not be until the spin wait is exhausted.
Empirically speaking, I have worked on an application that I know to have created at least 60,000 live COM objects, each of which synchronizes itself with its own CRITICAL_SECTION. I have never seen any errors that suggested I had exhausted the supply of kernel objects.
AFAIK, most handle/resource types on Windows are limited by memory or by maxint, whichever comes first. (In theory, on 64-bit, hitting maxint could happen, I guess.)
The sometimes weaselly texts that you find on this subject are usually relevant only to Win9x, which had some limitations (64k kernel objects in total).