I am developing with Visual Studio 2008 in standard (unmanaged) C++ under Windows XP Pro SP 3.
I have created a thread-safe wrapper around std::cout. This wrapper object is a drop-in replacement (i.e. same name) for what used to be a macro that was #defined to cout. It is used by a lot of code. Its behavior is probably pretty much as you would expect:
At construction, it creates a critical section.
During calls to operator<<(), it locks the critical section, passes the data to be printed to cout, and finally releases the critical section.
At destruction, it destroys the critical section.
This wrapper lives in static storage (it's global). Like all such objects, it constructs before main() starts and it destructs after main() exits.
Using my wrapper in the destructor of another object that also lives in static storage is problematic. Since the construction / destruction order of such objects is indeterminate, I may very well try to lock a critical section that has been destroyed. The symptom I'm seeing is that my program blocks on the lock attempt (though I suppose anything is liable to happen).
As far as ways to deal with this...
I could do nothing in the destructor; specifically, I would let the critical section continue to live. The C++ standard guarantees that cout never dies during program execution, and this would be the best possible attempt at making my wrapper behave similarly. Of course, my wrapper would "officially" be dead after its empty destructor runs, but it would probably (I hate that word) be as functional as it was before its destructor ran. On my platform, this does seem to be the case. But oh my gosh is this ugly, non-portable, and liable to future breakage...
I hold the critical section (but not the stream reference to cout) in a pimpl. All critical section accesses via the pimpl are preceded by a check for non-nullness of the pimpl. It so happens that I forgot to set the pimpl to 0 after calling delete on it in the destructor. If I were to set it to 0 (which I should do anyway), calls into my wrapper after it has been destroyed would do nothing with the critical section but would still pass data to be printed to cout (see the sketch after this list). On my platform, this also seems to work. Again, ugly...
I could tell my teammates to not use my wrapper after main() exits. Unfortunately, the aerodynamics of this would be about the same as that of a tank.
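To make option 2 concrete, here is a minimal sketch of the destructed-but-still-usable state described above (the names and the template operator<< are made up; the real wrapper presumably differs, and the check-then-use on the pimpl is itself racy):

#include <windows.h>
#include <iostream>

class SafeCout
{
public:
    SafeCout() : pimpl(new Impl) {}
    ~SafeCout() { delete pimpl; pimpl = 0; }         // null the pimpl so later calls can tell

    template <typename T>
    SafeCout& operator<<(const T& value)
    {
        if (pimpl) EnterCriticalSection(&pimpl->cs);
        std::cout << value;                           // still forwards to cout after destruction
        if (pimpl) LeaveCriticalSection(&pimpl->cs);
        return *this;
    }

private:
    struct Impl
    {
        CRITICAL_SECTION cs;
        Impl()  { InitializeCriticalSection(&cs); }
        ~Impl() { DeleteCriticalSection(&cs); }
    };
    Impl* pimpl;
};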
QUESTIONS:
* Question 1 *
For case 1, if I leave the critical section undestroyed, there will be a resource leak of a critical section in the OS. Will this leak persist after my program has fully exited? If not, case 1 becomes more viable.
* Question 2 *
For cases 1 and 2, does anybody know if on my particular platform I can indeed safely continue to use my wrapper after its empty destructor runs? It appears I can, but I want to see if anybody knows anything definitive about how my platform behaves in this case...
* Question 3 *
My proposals are obviously imperfect, but I do not see a truly correct solution. Does anybody out there know of a correct solution to this problem?
Side note: Of course, there is a converse problem that could occur if I try to use my wrapper in the constructor of another object that also lives in static storage. In that case, I may try to lock a critical section that has not yet been created. I would like to use the "construct on first use" idiom to fix this, but that entails a syntactic change of all the code that uses my wrapper. This would require giving up the naturalness of using the << operator. And there's way too much code to change anyway. So, this is not a workable option. I'm not very far into the thought process on this half of the problem, but I will ask one question that might be part of another imperfect way to deal with the problem...
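For reference, a construct-on-first-use sketch would look like the following (hypothetical name safeCout); the trailing parentheses at every call site are exactly the syntactic change mentioned above:

// Construct-on-first-use: the wrapper is built the first time it is needed,
// never earlier, so static constructors that log can safely use it.
SafeCout& safeCout()
{
    static SafeCout instance;
    return instance;
}

// Call sites would have to change from 'safeCout << ...' to:
// safeCout() << "value: " << 42 << "\n";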
* Question 4 *
As I've said, my wrapper lives in static storage (it's global) and it has a pimpl (hormonal problem :) ). I have the impression that the raw bytes of a variable in static storage are set to 0 at load time (unless initialized differently in code). This would mean that my wrapper's pimpl has a value of 0 before construction of my wrapper. Is this correct?
Thank You,
Dave
The first thing is that I would reconsider what you are doing altogether. You cannot create a thread-safe interface by merely adding locking to each one of the operations; thread safety must be designed into the interface. The problem with a drop-in replacement such as the one you propose is that it will make each single operation thread-safe (I think they already are), but that does not avoid unwanted interleaving.
Consider two threads that each execute cout << "Hi" << endl;. Locking each operation does not exclude "HiHi\n\n" as output, and things get much more complicated with manipulators: one thread might change the format for the next value to be printed, but another thread might trigger the next write, in which case both values come out with the wrong format.
On the particular question that you ask, you can consider using the same approach that the standard library takes with the iostreams:
Instead of creating the objects as globals, create a helper type that performs reference counting on the number of instances of the type. The constructor would check if the object is the first of its type to be created and initialize the thread safe wrapper. The last object to be destroyed would destroy your wrapper. The next piece of the puzzle is creating a global static variable of that type in a header that in turn includes the iostreams header. The last piece of the puzzle is that your users should include your header instead of iostreams.
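A minimal sketch of that counting scheme (the classic "nifty counter" idiom, the same one std::ios_base::Init uses for cout; all names here are assumptions, and alignment of the raw storage is hand-waved):

// SafeCoutInit.h -- every translation unit that includes this header gets one
// initializer object; the first one constructed builds the wrapper, the last
// one destroyed tears it down.
#include <new>

class SafeCout;                       // the thread-safe wrapper (assumed name)
extern SafeCout& safeCout;            // what user code actually refers to

struct SafeCoutInitializer
{
    SafeCoutInitializer();
    ~SafeCoutInitializer();
};
static SafeCoutInitializer theSafeCoutInitializer;   // one instance per translation unit

// SafeCoutInit.cpp -- in reality this file also includes the full SafeCout definition
static int nifty_counter = 0;
static char storage[sizeof(SafeCout)];                // raw storage for the wrapper
SafeCout& safeCout = reinterpret_cast<SafeCout&>(storage);

SafeCoutInitializer::SafeCoutInitializer()
{
    if (nifty_counter++ == 0)
        new (storage) SafeCout;        // first initializer constructs the wrapper
}
SafeCoutInitializer::~SafeCoutInitializer()
{
    if (--nifty_counter == 0)
        safeCout.~SafeCout();          // last initializer destroys it
}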
Related
I recently started using C++ instead of Delphi.
And there are some things that seem to be quite different.
For example I don't know how to initialize variables like Semaphores and CriticalSections.
By now I only know 2 possible ways:
1. Initializing a Critical Section in the constructor is stupid since every instance would be using its own critical section without synchronizing anything, right?
2. Using a global var and initializing it when the form is created seems not to be the perfect solution, as well.
Can anyone tell me how to achieve this?
Just a short explanation of what I need the Critical Section for:
I'd like to fill a ListBox from different threads.
The Semaphore:
Different threads are moving the mouse, and this shouldn't be interrupted.
Thanks!
Contrary to Delphi, C++ has no concept of unit initialization/finalization (but you already found out about that).
What we are left with is very little. You need to distinguish two things:
where you declare your variable (global, static class member, class member, local to a function, static in a function -- I guess that covers it all)
where you initialize your variable (since you are concerned with a C API you have to call the initialization function yourself)
Fact is, in your case it hardly matters where you declare your variable as long as it is accessible to all the other parts of your program that need it, and the only requirement as to where you should initialize it is: before you actually start using it (which implies, before you start other threads).
In your case I would probably use a singleton pattern. But C++ being what it is, singletons suffer from a race condition during their initialization, and there is no clean way around that. So, in addition to your singleton, you should ensure that it is correctly created before you start using it in a multithreaded context. A simple call to getInstance() at the start of your main() will do the trick (or anywhere else you see fit). As you see, this only takes care of where you declare your variable, not where you initialize it, but unfortunately C++ has important limitations when it comes to multithreading (it is under-specified), so there is no way around that.
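A minimal sketch of that approach, assuming a Win32 CRITICAL_SECTION and made-up names:

#include <windows.h>

class Sync
{
public:
    static Sync& getInstance()
    {
        static Sync instance;          // constructed on first use
        return instance;
    }
    void lock()   { EnterCriticalSection(&cs); }
    void unlock() { LeaveCriticalSection(&cs); }
private:
    Sync()  { InitializeCriticalSection(&cs); }
    ~Sync() { DeleteCriticalSection(&cs); }
    CRITICAL_SECTION cs;
};

int main()
{
    Sync::getInstance();               // touch it once before any threads exist
    // ... create worker threads; they call Sync::getInstance().lock()/unlock() ...
    return 0;
}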
To sum it up: just do what you want (as long as it works) and stop worrying.
In my opinion you only need a critical section to synchronize updates to the list box from the various threads. The mouse will keep moving; a semaphore does not fit this problem. Initialize the critical section in the constructor of the class where the list box lives, and write a method to update the list box:
// Sketch, assuming a CRITICAL_SECTION member 'cs' initialized in the
// constructor and a Win32 list box handle 'hListBox'.
void UpdateListBox(const wchar_t* item)
{
    EnterCriticalSection(&cs);                              // enter critical section
    SendMessageW(hListBox, LB_ADDSTRING, 0, (LPARAM)item);  // update the list box
    LeaveCriticalSection(&cs);                              // leave critical section
}
All the threads will call this method to update the list box.
Information about critical sections is here:
http://msdn.microsoft.com/en-us/library/windows/desktop/ms683472%28v=vs.85%29.aspx
A few days ago my friend told me about the situation, they had in their project.
Someone decided that it would be good to destroy an object of NotVerySafeClass in a parallel thread (i.e. asynchronously). It was implemented some time ago.
Now they get crashes, because some method is called in the main thread while the object is being destroyed.
Some workaround was created to handle the situation.
Of course, this is just an example of a not-very-good solution, but still the question:
Is there some way to prevent this situation internally in NotVerySafeClass (deny running the methods if the destructor has already been called, and force the destructor to wait until any running method is over; let's assume there is only one method)?
No, no and no. This is a fundamental design issue, and it shows a common misconception in thinking about multithreaded situations and race conditions in general.
There is one thing that is equally likely to happen, and it really shows that you need an ownership concept: the calling thread could invoke the function just after the object has been destroyed. There is no object anymore, so trying to call a member function on it is UB, and since the object no longer exists, it also has no chance to prevent any interaction between the dtor and a member function.
What you need is a sound ownership policy. Why is the code destroying the object when it is still needed?
Without more details about the code, a std::shared_ptr would probably solve this issue. Depending on your specific situation, you may be able to solve it with a more lightweight policy.
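A rough illustration of the shared_ptr idea (NotVerySafeClass is the class from the question; the worker function and doSomething() are made-up stand-ins):

#include <memory>

void worker(std::shared_ptr<NotVerySafeClass> obj)
{
    obj->doSomething();   // safe: this copy keeps the instance alive
}                         // this thread's copy is released here

// Main thread (sketch):
//   std::shared_ptr<NotVerySafeClass> obj = std::make_shared<NotVerySafeClass>();
//   std::thread t(worker, obj);   // the worker thread holds its own copy
//   obj.reset();                  // the main thread may drop its copy at any time
//   t.join();                     // the object dies only after the last copy is gone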
Sounds like a horrible design. Can't you use a smart pointer to make sure the object is destroyed only when no one holds any references to it?
If not, I'd use some external synchronization mechanism. Synchronizing the destructor with a method is really awkward.
There are no methods that can be used to prevent this scenario.
In multithreaded programming, you need to make sure that an object will not be deleted while other threads are still accessing it.
If you are dealing with such code, it needs a fundamental fix.
(Not to promote bad design) but to answer your two questions:
... deny running the methods if the destructor has already been called
You can do this with the solution proposed by #snemarch and #Simon (a lock). To handle the situation where one thread is inside the destructor while another one is waiting for the lock at the beginning of your method, you need to keep track of the state of the object in a thread-safe way, in memory shared between threads, e.g. a static atomic int that is set to 0 by the destructor before releasing the lock. The method checks the int once it acquires the lock and bails out if it's 0 (sketched at the end of this answer).
... force the destructor to wait until any running method is over
The solution proposed by #snemarch and #Simon (a lock) will handle this.
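A rough sketch of that workaround (Win32 flavour, made-up names; as noted elsewhere it remains fundamentally unsafe, because calling a method on an already-destroyed object is still UB):

#include <windows.h>

class NotVerySafeClass
{
public:
    void method()
    {
        EnterCriticalSection(&lock);
        if (alive)
        {
            // ... the real work on *this ...
        }
        LeaveCriticalSection(&lock);
    }
    ~NotVerySafeClass()
    {
        EnterCriticalSection(&lock);   // waits until a running method finishes
        alive = 0;                     // later calls skip the real work
        LeaveCriticalSection(&lock);
    }
private:
    static CRITICAL_SECTION lock;      // static: must outlive every instance;
    static LONG alive;                 // assumed initialized once at startup
};

CRITICAL_SECTION NotVerySafeClass::lock;   // InitializeCriticalSection() called elsewhere
LONG NotVerySafeClass::alive = 1;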
No. You just need to design the program properly so that it is thread-safe.
Why not make use of a mutex/semaphore? At the beginning of any method the mutex is locked, and the destructor waits until the mutex is unlocked. It's a fix, not a solution; maybe you should change the design of that part of your application.
Simple answer: no.
Somewhat longer answer: you could guard each and every member function and the destructor in your class with a mutex... welcome to deadlock opportunities and performance nightmares.
Gather a mob and beat some design sense into the 'someone' who thought parallel destruction was a good idea :)
Let's say we have a thread-safe compare-and-swap function like
long CAS(long* Dest, long Val, long Cmp)
which atomically compares *Dest and Cmp, copies Val to *Dest if the comparison succeeds, and returns the original value of *Dest.
So I would like to ask you if the code below is thread-safe.
while (true)
{
    long dummy = *DestVar;
    if (dummy == CAS(DestVar, Value, dummy))
    {
        break;
    }
}
EDIT:
The Dest and Val parameters refer to variables that are created on the heap.
InterlockedCompareExchange is an example of our CAS function.
Edit: an edit to the question means most of this isn't relevant. Still, I'll leave it, as all the concerns in the C# case also carry over to the C++ case, and the C++ case brings many more concerns as stated, so it's not entirely irrelevant.
Yes, but...
Assuming you mean that this CAS is atomic (which is the case with C#'s Interlocked.CompareExchange and with some things available in some C++ libraries), then it's thread-safe in and of itself.
However DestVar = Value could be thread-safe in and of itself too (it will be in C#, whether it is in C++ or not is implementation dependent).
In C# a write to an integer is guaranteed to be atomic. As such, doing DestVar = Value will not fail due to something happening in another thread. It's "thread-safe".
In C++ there are no such guarantees, but there are on some processors (in fact, let's just drop C++ for now; there's enough complexity when it comes to the stronger guarantees of C#, and C++ has all of those complexities and more when it comes to these sorts of issues).
Now, the use of atomic CAS operations in themselves will always be "thread-safe", but this is not where the complexity of thread safety comes in. It's the thread-safety of combinations of operations that is important.
In your code, at each loop iteration either the value will be atomically overwritten, or it won't. In the case where it isn't, it'll try again and keep going until it is. It could end up spinning for a while, but it will eventually work.
And in doing so it will have exactly the same effect as simple assignment - including the possibility of messing with what's happening in another thread and causing a serious thread-related bug.
Take a look, for comparison, with the answer at Is this use of a static queue thread-safe? and the explanation of how it works. Note that in each case a CAS is either allowed to fail because its failure means another thread has done something "useful" or when it's checked for success more is done than just stopping the loop. It's combinations of CASs that each pay attention to the possible state caused by other operations that allow for lock-free wait-free code that is thread-safe.
And now that we're done with that, note also that you couldn't port that directly to C++ (it depends on garbage collection to make some possible ABA scenarios of little consequence; with C++ there are situations where there could be memory leaks). It really does matter which language you are talking about.
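For comparison, here is the usual shape of a CAS retry loop for a genuine read-modify-write, sketched with Win32 InterlockedCompareExchange (an atomic increment; the helper name is made up):

#include <windows.h>

void AtomicIncrement(volatile LONG* dest)
{
    for (;;)
    {
        LONG oldVal = *dest;                       // snapshot the current value
        LONG newVal = oldVal + 1;                  // compute the desired value
        if (InterlockedCompareExchange(dest, newVal, oldVal) == oldVal)
            break;   // our update landed; nobody interfered in between
        // otherwise another thread changed *dest; recompute and retry
    }
}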
It's impossible to tell, for any environment. You do not define the following:
What are the memory locations of DestVar and Value? On the heap or on the stack? If they are on the stack, then it is thread safe, as there is not another thread that can access that memory location.
If DestVar and Value are on the heap, then are they reference types or value types (have copy by assignment semantics). If the latter, then it is thread safe.
Does CAS synchronize access to itself? In other words, does it have some sort of mutual exclusion structure that allows for only one call at a time? If so, then it is thread-safe.
If any of the conditions mentioned above are untrue, then it is indeterminable whether or not this is all thread safe. With more information about the conditions mentioned above (as well as whether or not this is C++ or C#, yes, it does matter) an answer can be provided.
Actually, this code is kind of broken. Either you need to know how the compiler is reading *DestVar (before or after CAS), which has wildly different semantics, or you are trying to spin on *DestVar until some other thread changes it. It's certainly not the former, since that would be crazy. If it's the latter, then you should use your original code. As it stands, your revision is not thread safe, since it isn't safe at all.
I'm using boost singletons (from serialization).
For example, there are some classes which inherit boost::serialization::singleton. Each of them has a #define like this near its definition (in the header file):
#define appManager ApplicationManager::get_const_instance()
class ApplicationManager: public boost::serialization::singleton<ApplicationManager> { ... };
And I have to call some method from that class each update (nearly 17 ms), for example, 200 times. So the code is like:
for (int i = 0; i < 200; ++i)
    appManager.get_some_var();
I looked at the function call stack with gprof and saw that boost::get_const_instance is called each time. Maybe the compiler will optimize this away in release mode?
My idea is to make some global variable like:
ApplicationManager &handle = ApplicationManager::get_const_instance();
And use handle, so it wouldn't call get_const_instance each time. Is that right?
Instead of using the Singleton anti-pattern, just use a global variable and be done with it. It's more honest.
The main benefit of Singleton is when you want lazy initialization, or more fine grained control over initialization order than a global variable would allow you. It doesn't look like either of these things are a concern for you, so just use a global.
Personally, I think designs with global variables or Singletons are almost certainly broken. But to each h(is/er) own.
If you are bent on using a Singleton, the performance concern you raise is interesting, but likely not an issue as the function call overhead is probably less than 100ns. As was pointed out, you should profile. If it really concerns you a whole lot, store a local reference to the Singleton before the loop:
ApplicationManager& myAppManager = appManager;
for (int i = 0; i < 200; ++i)
    myAppManager.get_some_var();
BTW, using that #define in that way is a serious mistake. Almost all cases where you use the preprocessor for anything other than conditional compilation based on compile-time flags are probably poor uses. Boost does make extensive use of the preprocessor, but mostly to get around C++ limitations. Do not emulate it.
Lastly, that function is probably doing something important. One of the jobs of a get_instance method for Singletons is to avoid having multiple threads initialize the same Singleton at the same time. With global variables this shouldn't be an issue because they should be initialized before you've started any threads.
Is it really a problem? I mean, does your application really suffer from this behaviour?
I would despise such a solution because, in effect, you are negating one of the benefits of the Singleton pattern, namely avoiding global variables. If you want to use a global variable, then don't use a Singleton at all, right?
Yes, that is certainly a possible solution. I'm not entirely sure what boost is doing with its singleton behind the scenes; you can look that up yourself in the code.
The singleton pattern is just like creating a global object and accessing the global object, in most respects. There are some differences:
1) The singleton object instance is not created until it is first accessed, whereas the global object is created at program startup.
2) Because the singleton object is not created until it is first accessed, it is actually created when the program is running. Thus the singleton instance has access to other fully constructed objects in the program when the constructor is actually running.
3) Because you access the singleton through the getInstance() method (boost's get_const_instance method) there is a little bit of overhead for executing that method call.
So if you're not concerned about when the singleton is actually created, and can live with it being created at program startup, you could just go with a global variable and access that. If you really need the singleton created after the program starts up, then you need the singleton. In that case, you can grab and hold onto a reference to the object returned by get_const_instance() and use that reference.
Something that bit me in the past, though, that you should be aware of: you're actually getting a reference to an object that is owned by the singleton. You don't own that object.
1) Do not write code that would cause the destructor to execute (say, using a shared pointer on the returned reference), or write any other code that could cause the object to end up in a bad state.
2) In a multi-threaded app, take care to correctly lock fields in the object if the object may be used by more than one thread.
3) In a multi-threaded app, make sure that all threads that hold onto references to the object terminate before the program is unloaded. I've seen a case where the singleton's code resides in one DLL library; a thread that holds the reference lives in another DLL library. When the program ends, the thread was still active. The DLL holding the singleton's code was unloaded first; the thread that was still alive tried to do something to the singleton's object and caused a crash.
Singletons have their advantages in situations where you want to control access to something at process or application scope in a more elegant way than a global variable would allow.
However, most singleton objects provided by a library will be designed to ensure some level of thread safety, and most likely access to the instance is locked via a mutex or some other kind of critical section, which can affect performance.
In the case of a game or 3d application where performance is key you may want to consider making your own lightweight singleton if thread safety is not a concern and gain some performance.
At some point I remember reading that threads can't be safely created until the first line of main(), because compilers insert special code to make threading work that runs during static initialization time. So if you have a global object that creates a thread on construction, your program may crash. But now I can't find the original article, and I'm curious how strong a restriction this is -- is it strictly true by the standard? Is it true on most compilers? Will it remain true in C++0x? Is it possible for a standards conforming compiler to make static initialization itself multithreaded? (e.g. detecting that two global objects don't touch one another, and initializing them on separate threads to accelerate program startup)
Edit: To clarify, I'm trying to at least get a feel for whether implementations really differ significantly in this respect, or if it's something that's pseudo-standard. For example, technically the standard allows for shuffling the layout of members that belong to different access specifiers (public/protected/etc.). But no compiler I know of actually does this.
What you're talking about isn't strictly in the language but in the C Run Time Library (CRT).
For a start, if you create a thread using a native call such as CreateThread() on Windows, then you can do it anywhere you'd like, because it goes straight to the OS with no intervention from the CRT.
The other option you usually have is to use _beginthread() which is part of the CRT. There are some advantages to using _beginthread() such as having a thread-safe errno. Read more about this here. If you're going to create threads using _beginthread() there could be some issues because the initializations needed for _beginthread() might not be in place.
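A minimal _beginthread sketch (CRT thread creation; error handling and proper synchronization are omitted, and the names are made up):

#include <process.h>    // _beginthread
#include <windows.h>    // Sleep

void __cdecl worker(void* /*arg*/)
{
    // per-thread CRT state such as errno is set up here because the CRT
    // created this thread via _beginthread
}

int main()
{
    _beginthread(worker, 0, 0);   // stack size 0 = default, no argument
    Sleep(100);                   // crude wait, for illustration only
    return 0;
}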
This touches on a more general issue of what exactly happens before main() and in what order. Basically you have the program's entry-point function, which takes care of everything that needs to happen before main(). With Visual Studio you can actually look at this piece of code, which is in the CRT, and find out for yourself exactly what's going on there. The easiest way to get to that code is to set a breakpoint in your code and look at the stack frames before main().
The underlying problem is a Windows restriction on what you can and cannot do in DllMain. In particular, you're not supposed to create threads in DllMain. Static initialization often happens from DllMain. Then it follows logically that you can't create threads during static initialization.
As far as I can tell from reading the C++0x/1x draft, starting a thread prior to main() is fine, but it is still subject to the normal pitfalls of static initialization. A conforming implementation will have to make sure the code to initialize threading executes before any static or thread constructors do.