Recently my company has begun the process of upgrading from Visual Studio 2010 to Visual Studio 2015. The problem we're currently running into seems to stem from a change in the behavior of the compiler. We can build and run our solution, but at runtime it appears to deadlock: it just idles, with CPU usage near 0.
Stepping through with the debugger, we've discovered an issue where a singleton object depends on itself during initialization. Here's an extremely stripped down version:
#include <iostream>
using namespace std;

struct Singleton
{
    Singleton( int n )
    {
        cout << "Singleton( " << n << " )" << endl;
        // Recursively re-enters Instance() while the instance is still being constructed
        cout << Singleton::Instance().mN << endl;
        mN = n;
    }

    static Singleton& Instance()
    {
        static Singleton instance( 5 );
        return instance;
    }

    int mN;
};

int main() {
    cout << Singleton::Instance().mN << endl;
    return 0;
}
Naturally in our code there is a lot more going on, but this snippet exhibits the same behavior that we're seeing in the main project. In VS2010 this builds, runs, and terminates "normally". In VS2015 it deadlocks.
I've also tried this in ideone.com with various versions of C++ and all of those reproduce the deadlocking behavior. It makes sense to me that this doesn't work (nor should it work), because the object shouldn't depend on itself.
What I'm more curious about is why did this "work" in VS2010? What does the standard have to say about static variable initialization? Was this just a VS2010 (and possibly earlier) compiler bug?
The standard says that:
If control enters the declaration concurrently while the [block-scope variable with static or thread storage duration] is being initialized, the concurrent execution shall wait for completion of the initialization. If control re-enters the declaration recursively while the variable is being initialized, the behavior is undefined.
([stmt.dcl]/4)
The change made in C++11 is that initialization of local static variables is required to be thread-safe. The standard disallows recursion that would pass through the declaration again during the initialization, and the resulting UB manifests as deadlock in your case. That makes perfect sense, since the second pass through the declaration waits forever for the first one to complete.
Now, this was undefined behavior in C++03 as well, but in a C++03 implementation, the initialization is not required to be thread-safe, so what probably happens is this: on the first pass through the declaration, a flag is set and then the constructor is called; the second pass sees the flag, assumes the variable is already initialized, and then returns a reference to it. Then the initialization completes.
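For intuition, here is a rough sketch of that scenario. It is not actual compiler output, just a guess at what a non-thread-safe, C++03-style lowering of Instance() (such as VS2010's) might look like, reusing the Singleton struct from the question; Instance_cxx03_style is a hypothetical name. Because the guard flag is set before the constructor finishes, the recursive call skips the initialization and reads the uninitialized mN instead of blocking:

#include <new>   // placement new

static Singleton& Instance_cxx03_style()
{
    alignas(Singleton) static unsigned char storage[ sizeof(Singleton) ];
    static bool constructed = false;       // the guard flag
    if ( !constructed )
    {
        constructed = true;                // set before the constructor runs
        ::new ( storage ) Singleton( 5 );  // the recursive call re-enters, sees the flag, returns early
    }
    return *reinterpret_cast<Singleton*>( storage );
}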
You should rewrite your code, obviously, to avoid this recursive initialization.
Before coroutines, we used callbacks to run asynchronous operations. Callbacks were normal functions and could have thread_local variables.
Let's look at this example:
void TcpConnected(void);   // forward declaration; called when the connection completes

void StartTcpConnection(void)
{
    using namespace std;
    thread_local int my_thread_local = 1;
    cout << "my_thread_local = " << my_thread_local << endl;
    auto tcp_connection = tcp_connect("127.0.0.1", 8080);
    tcp_connection.async_wait(TcpConnected);
}

void TcpConnected(void)
{
    using namespace std;
    thread_local int my_thread_local = 2;
    cout << "my_thread_local = " << my_thread_local << endl;
}
As we can see from the code, I have some (undocumented here) tcp_connect function that connects to a TCP endpoint and returns a tcp_connection object. This object can wait until the TCP connection actually occurs and then call the TcpConnected function. Because we don't know the specific implementation of tcp_connect and tcp_connection, we don't know whether it will call TcpConnected on the same thread or on a different one; both implementations are possible. But we know for sure that my_thread_local is different in the two functions, because each function has its own scope.
If we need this variable to be the same (as long as the thread is the same), we can create a third function that returns a reference to the thread_local variable:
int& GetThreadLocalInt(void)
{
    thread_local int my_variable = 1;
    return my_variable;
}
So we have full control and predictability: we know for sure that the variables will be different if TcpConnected and StartTcpConnection run on different threads, and we can have them be different or the same, as we choose, when these functions run on the same thread.
Now let's look at the coroutine version of the same operation:
// Illustrative only: a real coroutine would need a coroutine return type
// (e.g. some task<> type) rather than void.
void Tcp(void)
{
    thread_local int my_thread_local = 1;
    auto tcp_connection = co_await tcp_connect("127.0.0.1", 8080);
    cout << "my_thread_local = " << my_thread_local << endl;
}
This situation is a bit questionable for me. I still need thread-local storage; it is an important language feature that I don't want to abandon. However, here we have two cases:
The thread before co_await is the same one as after co_await. What will happen with my_thread_local? Will it be the same variable before and after co_await, especially if we use the GetThreadLocalInt function to get a reference to it instead of its value?
The thread changes after co_await. Will the C++ runtime reinitialize my_thread_local to the value from the new thread, make a copy of the previous thread's value, or perhaps use a reference to the same data? A similar question applies to GetThreadLocalInt: it returns a reference to a thread_local object, but the reference itself has automatic storage duration. Will the coroutine rebind it to the new thread, or will we get a (dangerous!!!) race condition, because thread 2 unexpectedly gets a reference to thread 1's thread-local data and potentially uses it in parallel?
Even though it is easy to debug and test what happens on any specific compiler, the important question is whether the standard says anything about this; otherwise, even if we test it on VC++ or gcc and see how it behaves on these two popular compilers, the code may lose portability and compile differently on some exotic compiler.
For global thread_local variables, the coroutine behavior ought to be as expected (and MSVC seems to have a bug with this). But for function-local thread_local variables in coroutines, there seems to be a hole in the specification. Indeed, I'm not sure the wording makes sense even without coroutines.
[stmt.dcl]/3 says:
Dynamic initialization of a block variable with static storage duration or thread storage duration is performed the first time control passes through its declaration; such a variable is considered initialized upon the completion of its initialization.
The problem is that, while by C++'s rules there is only one thread_local variable, there are multiple objects represented by that one variable. And those objects need to be initialized. So... how does that happen?
The only sane interpretation of this is that the object for a particular thread gets initialized the first time control flow passes through the declaration on that thread. And that's the problem.
If, because of co_await, the thread a function executes on changes, then how could control ever pass through the declaration on the new thread? Which would mean that the thread_local object for that thread should be zero-initialized.
Ultimately, I would say that you should never use thread_local in a coroutine function. It's just not clear what value the thread_local ought to have. And the only logical value for the new thread_local to have is the one from the previous thread. But that's not how a thread_local is supposed to work. Overall, the idea feels inherently nonsensical (and I would say that the standard should have explicitly forbidden the declaration of thread_locals in a coroutine, just as it did for using co_await in a thread_local's initializer).
Just use a namespace-scoped thread_local variable. Outside of the aforementioned MSVC bug, it ought to work and make sense.
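For reference, here is a minimal non-coroutine sketch (with hypothetical names) of what a namespace-scope thread_local gives you: one object per thread, shared by every function that runs on that thread. The open question above is specifically what happens when a coroutine resumes on a different thread.

#include <iostream>
#include <thread>

thread_local int my_thread_local = 1;   // namespace scope: one instance per thread

void worker()
{
    my_thread_local = 2;                                   // modifies only this thread's instance
    std::cout << "worker: " << my_thread_local << '\n';    // prints 2
}

int main()
{
    std::thread t(worker);
    t.join();
    std::cout << "main:   " << my_thread_local << '\n';    // still 1 on the main thread
}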
Rule 3.3.1 of the HIC++ Coding Standard restricts using variables with static storage duration even if they are declared in a block scope:
The order of initialization of block scope objects with static storage duration is well defined. However, the lifetime of such an object ends at program termination, which may be incompatible with future uses of the code, e.g. as a shared library.
Application const & theApp()
{
    static Application app; // Non-Compliant
    return app;
}
The question is what incompatibilities can occur.
UPD. After I got reasonable remarks from #Employed-Russian, I realized that some clarification is needed. I can imagine some issues with multi-process access to static variables. For example, on some Linux implementations the same memory is shared with a forked process until the first write to it; this is called copy-on-write. So if we have the following code executed on such a system
#include <iostream>
#include <unistd.h>
using namespace std;

struct A
{
    A() {cout << __FUNCTION__ << '\n';}
    ~A() {cout << __FUNCTION__ << '\n';}
};

static void f()
{
    static A a;
}

int main()
{
    f();
    fork();   // both the parent and the child run static destructors at exit
    return 0;
}
we could get output like
A
~A
~A
that is, two calls to the destructor for a single call to the constructor, which may not be how it's supposed to work, because on another system we could get A ~A A ~A. So we can imagine some issues with static variables in general, but what is the particular problem with shared libraries and block-scoped statics?
The question is what incompatibilities can occur.
As an example, the library may become unusable in a multi-threaded program in which not all threads are terminated before exit is called.
That is, if thread T0 called exit, the Application destructor will be called at some point (because it was registered with atexit on the first call to theApp()).
If thread T1 is still running and holds a reference to the Application returned by theApp(), it will now have a reference to a destructed object, and will likely crash.
This is known as an at-exit race, and it may be highly irreproducible (if T0 reaches sys_exit before T1 has a chance to crash, you will observe a normal exit).
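A minimal sketch of that race, assuming the theApp() singleton from the question (the detached thread and the endless loop are just illustration; the point is that main returns while another thread is still using the singleton):

#include <chrono>
#include <thread>

struct Application { int value = 42; };

Application const & theApp()
{
    static Application app;   // its destructor is registered to run at program exit
    return app;
}

int main()
{
    std::thread t([]
    {
        for (;;)
        {
            // Once main returns and static destructors run, this reads a
            // destroyed object: the at-exit race described above.
            int v = theApp().value;
            (void)v;
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    });
    t.detach();
    return 0;   // exit begins while t may still be using theApp()
}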
Update:
two calls to the destructor for a single call to the constructor, which may not be how it's supposed to work
You are mistaken: this is working exactly as it should. There is no double call to destructor here: you have two processes, and each destructs its own object (as it should).
#include <iostream>
using namespace std;

class TestClass
{
public:
    int x, y;
    TestClass();
};

TestClass::TestClass()
{
    cout << "TestClass ctor" << endl;
}

TestClass GlobalTestClass;

int main()
{
    cout << "main " << endl;
    return 0;
}
In this code, as is well known, the first output will be "TestClass ctor".
My question: does the constructor call run before main() (I mean, does the entry point change?), or right after main() is entered but before its executable statements, or is there a different mechanism? (Sorry for my English.)
The question as stated is not very meaningful, because
main is not the machine code level entry point to the program (main is called by the same code that e.g. executes constructors of non-local static class type variables), and
the notion of “right after main() and before the executable statements” isn't very meaningful: the executable statements are in main.
Generally, in practice you can count on the static variable being initialized before main in your concrete example, but the standard does not guarantee that.
C++11 §3.6.2/4:
” It is implementation-defined whether the dynamic initialization of a non-local variable with static storage duration is done before the first statement of main. If the initialization is deferred to some point in time after the first statement of main, it shall occur before the first odr-use (3.2) of any function or variable defined in the same translation unit as the variable to be initialized.
It's a fine point whether the automatic call of main qualifies as odr-use. I would think not, because one special property of main is that it cannot be called (in valid code), and its address cannot be taken.
Apparently the above wording is in support of dynamically loaded libraries, and constitutes the only support of such libraries.
In particular, I would be wary of using thread local storage with dynamically loaded libraries, at least until I learned more about the guarantees offered by the standard in that respect.
Yes, objects with static storage duration are initialized before main(), so indeed the "entry point" is before main(). See e.g. http://en.cppreference.com/w/cpp/language/storage_duration
In fact (although not recommended), you can run a whole program with a trivial main(){}, by putting everything in global instances.
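A minimal illustration of that (not a recommended style; Program and theProgram are hypothetical names): all the work happens in the constructor of a namespace-scope object, and main() is empty:

#include <iostream>

struct Program
{
    Program() { std::cout << "all the work happens here\n"; }
};

Program theProgram;   // dynamically initialized before main() in practice

int main() {}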
I noticed that if you initialize a static variable in C++ in code, the initialization only runs the first time you run the function.
That is cool, but how is that implemented? Does it translate to some kind of twisted if statement? (if given a value, then ..)
#include <iostream>
using namespace std;

void go( int x )
{
    static int j = x;
    cout << ++j << endl;   // prints 6, 7, 8
}

int main()
{
    go( 5 );
    go( 5 );
    go( 5 );
}
Yes, it does normally translate into an implicit if statement with an internal boolean flag. So, in the most basic implementation your declaration normally translates into something like
void go( int x ) {
    static int j;
    static bool j_initialized;
    if (!j_initialized) {
        j = x;
        j_initialized = true;
    }
    ...
}
On top of that, if your static object has a non-trivial destructor, the language has to obey another rule: such static objects have to be destructed in the reverse order of their construction. Since the construction order is only known at run-time, the destruction order becomes defined at run-time as well. So, every time you construct a local static object with non-trivial destructor, the program has to register it in some kind of linear container, which it will later use to destruct these objects in proper order.
Needless to say, the actual details depend on implementation.
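As a rough sketch (not real compiler output; go_lowered is a hypothetical name, and real implementations typically use an ABI hook such as __cxa_atexit rather than plain std::atexit), the guard flag and the destructor registration for static std::string s = "Hello World!"; might conceptually combine like this:

#include <cstdlib>
#include <memory>
#include <new>
#include <string>

void go_lowered()
{
    alignas(std::string) static unsigned char storage[ sizeof(std::string) ];
    static bool initialized = false;
    if (!initialized) {
        ::new (storage) std::string("Hello World!");              // construct in place
        std::atexit([] {                                          // register destruction for program exit
            std::destroy_at(reinterpret_cast<std::string*>(storage));   // C++17
        });
        initialized = true;
    }
    std::string& s = *reinterpret_cast<std::string*>(storage);    // the function's "s"
    (void)s;
}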
It is worth adding that when it comes to static objects of "primitive" types (like int in your example) initialized with compile-time constants, the compiler is free to initialize that object at startup. You will never notice the difference. However, if you take a more complicated example with a "non-primitive" object
void go( int x ) {
    static std::string s = "Hello World!";
    ...
then the above approach with if is what you should expect to find in the generated code even when the object is initialized with a compile-time constant.
In your case the initializer is not known at compile time, which means that the compiler has to delay the initialization and use that implicit if.
Yes, the compiler usually generates a hidden boolean "has this been initialized?" flag and an if that runs every time the function is executed.
There is more reading material here: How is static variable initialization implemented by the compiler?
While it is indeed "some kind of twisted if", the twist may be more than you imagined...
ZoogieZork's comment on AndreyT's answer touches on an important aspect: the initialisation of static local variables, on some compilers including GCC, is by default thread safe (a compiler command-line option can disable it). Consequently, it uses some inter-thread synchronisation mechanism (a mutex or an atomic operation of some kind), which can be relatively slow. If you wouldn't be comfortable, performance-wise, with explicit use of such an operation in your function, then you should consider whether there's a lower-impact alternative to the lazy initialisation of the variable (i.e. explicitly construct it in a thread-safe way yourself, somewhere, just once). Very few functions are so performance sensitive that this matters, though: don't let it spoil your day or make your code more complicated unless your program is too slow and your profiler is fingering that area.
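A hedged sketch of such a lower-impact alternative, with hypothetical names: construct the object once at namespace scope, so the hot function contains no guard check or synchronisation of its own. The trade-off is that namespace-scope statics bring their own cross-translation-unit initialisation-order concerns.

#include <string>

namespace {
    const std::string s_greeting = "Hello World!";   // constructed once, before main() in practice
}

const std::string& greeting()
{
    return s_greeting;   // no hidden guard flag or lock here
}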
They are initialized only once because that's what the C++ standard mandates. How this happens is entirely up to compiler vendors. In my experience, a local hidden flag is generated and used by the compiler.
#include <iostream>
using namespace std;

class Foo
{
public:
    Foo(): initialised(0)
    {
        cout << "Foo() gets called AFTER test() ?!" << endl;
    };

    Foo test()
    {
        cout << "initialised= " << initialised << " ?! - ";
        cout << "but I expect it to be 0 from the 'initialised(0)' initialiser on Foo()" << endl;
        cout << "this method test() is clearly working on an uninitialised object ?!" << endl;
        return Foo();
    }

    ~Foo()
    {};

private:
    int initialised;
};

int main()
{
    //SURE this is bad coding but it compiles and runs
    //I want my class to DETECT and THROW an error to prevent this type of coding
    //in other words how to catch it at run time and throw "not initialised" or something
    Foo foo = foo.test();
}
Yes, it is calling the function on a not-yet-constructed object, which is undefined behavior. You can't detect it reliably. I would argue you also should not try to detect it. It's not something that's likely to happen by accident, compared to, for example, calling a function on an already deleted object. Trying to catch each and every possible mistake is just about impossible. The declared name is already visible in its initializer for other, useful purposes. Consider this:
Type *t = (Type*)malloc(sizeof(*t));
Which is a common idiom in C programming, and which still works in C++.
Personally, I like this story by Herb Sutter about null references (which are likewise invalid). The gist is: don't try to protect against cases that the language clearly forbids and that are, in the general case, impossible to diagnose reliably. You will get a false sense of security over time, which becomes quite dangerous. Instead, train your understanding of the language and design interfaces in a way (avoid raw pointers, ...) that reduces the chance of making mistakes.
In C++, and likewise in C, many cases are not explicitly forbidden but rather are left undefined, partially because some things are rather difficult to diagnose efficiently, and partially because undefined behavior lets the implementation design alternative behavior for it instead of completely ignoring it, which existing compilers often take advantage of.
In the above case, for example, any implementation is free to throw an exception. There are other situations that are likewise undefined behavior and much harder for the implementation to diagnose efficiently: accessing an object in a different translation unit before it has been constructed is such an example, known as the static initialization order fiasco.
The constructor is the method you want (it doesn't run before initialization but rather on initialization, which should be OK). The reason it doesn't work in your case is that you have undefined behavior here.
In particular, you use the not-yet-existent foo object to initialize itself (i.e. the foo in foo.test() doesn't exist yet). You can solve it by creating an object explicitly:
Foo foo = Foo().test();
You cannot check for it in the program, but maybe valgrind could find this type of bug (as any other uninitialized memory access).
You can't prevent people from coding poorly, really. It works just like it "should":
Allocate memory for Foo (its address becomes the value of the "this" pointer).
Call Foo::test, effectively as Foo::test(this), in which:
It reads this->initialised, which is random junk, then
it calls Foo's default constructor (because of return Foo();), then
it calls Foo's copy constructor to copy the right-hand Foo().
Just like it should. You can't prevent people from not knowing the right way to use C++.
The best you could do is have a magic number:
#include <cassert>

class A
{
public:
    A(void) :
        _magicFlag(1337)
    {
    }

    void some_method(void)
    {
        assert (_magicFlag == 1337); /* make sure the constructor has been called */
    }

private:
    unsigned _magicFlag;
};
This "works" because the chances _magicFlag gets allocated where the value is already 1337 is low.
But really, don't do this.
You're getting quite a few responses that basically say, "you shouldn't expect the compiler to help you with this". However, I'd agree with you that the compiler should help with this problem with some sort of diagnostic. Unfortunately (as the other answers point out), the language spec doesn't help here: once you get to the initializer part of the declaration, the newly declared identifier is in scope.
A while back, DDJ had an article about a simple debugging class called "DogTag" that could be used as a debugging aid to help with:
using an object after deletion
overwriting an object's memory with garbage
using an object before initializing it
I haven't used it much, but it did come in handy on an embedded project that was running into some memory overwrite bugs.
It's basically an elaboration of the "MagicFlag" technique that GMan described.