C++ static initialization order - c++

When I use static variables in C++, I often end up wanting to initialize one variable passing another to its constructor. In other words, I want to create static instances that depend on each other.
Within a single .cpp or .h file this is not a problem: the instances will be created in the order they are declared. However, when you want to initialize a static instance with an instance in another compilation unit, the order seems impossible to specify. The result is that, depending on the weather, it can happen that the instance that depends on another is constructed, and only afterwards the other instance is constructed. The result is that the first instance is initialized incorrectly.
Does anyone know how to ensure that static objects are created in the correct order? I have searched a long time for a solution, trying all of them (including the Schwarz Counter solution), but I begin to doubt there is one that really works.
One possibility is the trick with the static function member:
Type& globalObject()
{
static Type theOneAndOnlyInstance;
return theOneAndOnlyInstance;
}
Indeed, this does work. Regrettably, you have to write globalObject().MemberFunction(), instead of globalObject.MemberFunction(), resulting in somewhat confusing and inelegant client code.
Update: Thank you for your reactions. Regrettably, it indeed seems like I have answered my own question. I guess I'll have to learn to live with it...

You have answered your own question. Static initialization order is undefined, and the most elegant way around it (while still doing static initialization i.e. not refactoring it away completely) is to wrap the initialization in a function.
Read the C++ FAQ items starting from https://isocpp.org/wiki/faq/ctors#static-init-order

Maybe you should reconsider whether you need so many global static variables. While they can sometimes be useful, often it's much simpler to refactor them to a smaller local scope, especially if you find that some static variables depend on others.
But you're right, there's no way to ensure a particular order of initialization, and so if your heart is set on it, keeping the initialization in a function, like you mentioned, is probably the simplest way.

Most compilers (linkers) actually do support a (non-portable) way of specifying the order. For example, with visual studio you can use the init_seg pragma to arrange the initialization into several different groups. AFAIK there is no way to guarantee order WITHIN each group. Since this is non-portable you may want to consider if you can fix your design to not require it, but the option is out there.

Indeed, this does work. Regrettably, you have to write globalObject().MemberFunction(), instead of globalObject.MemberFunction(), resulting in somewhat confusing and inelegant client code.
But the most important thing is that it works, and that it is failure proof, ie. it is not easy to bypass the correct usage.
Program correctness should be your first priority. Also, IMHO, the () above is purely stylistic - ie. completely unimportant.
Depending on your platform, be careful of too much dynamic initialization. There is a relatively small amount of clean up that can take place for dynamic initializers (see here). You can solve this problem using a global object container that contains members different global objects. You therefore have:
Globals & getGlobals ()
{
static Globals cache;
return cache;
}
There is only one call to ~Globals() in order to clean up for all global objects in your program. In order to access a global you still have something like:
getGlobals().configuration.memberFunction ();
If you really wanted you could wrap this in a macro to save a tiny bit of typing using a macro:
#define GLOBAL(X) getGlobals().#X
GLOBAL(object).memberFunction ();
Although, this is just syntactic sugar on your initial solution.

dispite the age of this thread, I would like to propose the solution I've found.
As many have pointed out before of me, C++ doesn't provide any mechanism for static initialization ordering. What I propose is to encapsule each static member inside a static method of the class that in turn initialize the member and provide an access in an object-oriented fashion.
Let me give you an example, supposing we want to define the class named "Math" which, among the other members, contains "PI":
class Math {
public:
static const float Pi() {
static const float s_PI = 3.14f;
return s_PI;
}
}
s_PI will be initialized the first time Pi() method is invoked (in GCC). Be aware: the local objects with static storage have an implementation dependent lifecyle, for further detail check 6.7.4 in 2.
Static keyword, C++ Standard

Wrapping the static in a method will fix the order problem, but it isn't thread safe as others have pointed out but you can do this to also make it thread if that is a concern.
// File scope static pointer is thread safe and is initialized first.
static Type * theOneAndOnlyInstance = 0;
Type& globalObject()
{
if(theOneAndOnlyInstance == 0)
{
// Put mutex lock here for thread safety
theOneAndOnlyInstance = new Type();
}
return *theOneAndOnlyInstance;
}

Related

Static variable from a static linked lib used before its construction [duplicate]

When I use static variables in C++, I often end up wanting to initialize one variable passing another to its constructor. In other words, I want to create static instances that depend on each other.
Within a single .cpp or .h file this is not a problem: the instances will be created in the order they are declared. However, when you want to initialize a static instance with an instance in another compilation unit, the order seems impossible to specify. The result is that, depending on the weather, it can happen that the instance that depends on another is constructed, and only afterwards the other instance is constructed. The result is that the first instance is initialized incorrectly.
Does anyone know how to ensure that static objects are created in the correct order? I have searched a long time for a solution, trying all of them (including the Schwarz Counter solution), but I begin to doubt there is one that really works.
One possibility is the trick with the static function member:
Type& globalObject()
{
static Type theOneAndOnlyInstance;
return theOneAndOnlyInstance;
}
Indeed, this does work. Regrettably, you have to write globalObject().MemberFunction(), instead of globalObject.MemberFunction(), resulting in somewhat confusing and inelegant client code.
Update: Thank you for your reactions. Regrettably, it indeed seems like I have answered my own question. I guess I'll have to learn to live with it...
You have answered your own question. Static initialization order is undefined, and the most elegant way around it (while still doing static initialization i.e. not refactoring it away completely) is to wrap the initialization in a function.
Read the C++ FAQ items starting from https://isocpp.org/wiki/faq/ctors#static-init-order
Maybe you should reconsider whether you need so many global static variables. While they can sometimes be useful, often it's much simpler to refactor them to a smaller local scope, especially if you find that some static variables depend on others.
But you're right, there's no way to ensure a particular order of initialization, and so if your heart is set on it, keeping the initialization in a function, like you mentioned, is probably the simplest way.
Most compilers (linkers) actually do support a (non-portable) way of specifying the order. For example, with visual studio you can use the init_seg pragma to arrange the initialization into several different groups. AFAIK there is no way to guarantee order WITHIN each group. Since this is non-portable you may want to consider if you can fix your design to not require it, but the option is out there.
Indeed, this does work. Regrettably, you have to write globalObject().MemberFunction(), instead of globalObject.MemberFunction(), resulting in somewhat confusing and inelegant client code.
But the most important thing is that it works, and that it is failure proof, ie. it is not easy to bypass the correct usage.
Program correctness should be your first priority. Also, IMHO, the () above is purely stylistic - ie. completely unimportant.
Depending on your platform, be careful of too much dynamic initialization. There is a relatively small amount of clean up that can take place for dynamic initializers (see here). You can solve this problem using a global object container that contains members different global objects. You therefore have:
Globals & getGlobals ()
{
static Globals cache;
return cache;
}
There is only one call to ~Globals() in order to clean up for all global objects in your program. In order to access a global you still have something like:
getGlobals().configuration.memberFunction ();
If you really wanted you could wrap this in a macro to save a tiny bit of typing using a macro:
#define GLOBAL(X) getGlobals().#X
GLOBAL(object).memberFunction ();
Although, this is just syntactic sugar on your initial solution.
dispite the age of this thread, I would like to propose the solution I've found.
As many have pointed out before of me, C++ doesn't provide any mechanism for static initialization ordering. What I propose is to encapsule each static member inside a static method of the class that in turn initialize the member and provide an access in an object-oriented fashion.
Let me give you an example, supposing we want to define the class named "Math" which, among the other members, contains "PI":
class Math {
public:
static const float Pi() {
static const float s_PI = 3.14f;
return s_PI;
}
}
s_PI will be initialized the first time Pi() method is invoked (in GCC). Be aware: the local objects with static storage have an implementation dependent lifecyle, for further detail check 6.7.4 in 2.
Static keyword, C++ Standard
Wrapping the static in a method will fix the order problem, but it isn't thread safe as others have pointed out but you can do this to also make it thread if that is a concern.
// File scope static pointer is thread safe and is initialized first.
static Type * theOneAndOnlyInstance = 0;
Type& globalObject()
{
if(theOneAndOnlyInstance == 0)
{
// Put mutex lock here for thread safety
theOneAndOnlyInstance = new Type();
}
return *theOneAndOnlyInstance;
}

class without instance data, or namespace + globals in a sub-namespace?

I have a few related methods, all of which use some global data (*), and all of which are to be implemented in a header. Should I...
place the methods in a class as static methods, with no instance data members, and the global variable(s) as a static class members?
place the methods in a namespace, and the global variable(s) in a ::detail sub-namespace?
Both of these options seem a bit ugly to me, but I'm wondering whether one of them has methodical benefits I'm missing, which should make me prefer it. I could try to bend-over-backwards and move the mutex into a .cpp file, so that might be a third option but it'll not be pretty...
(*) - in my specific case it's a mutex.
The only added bonus of using a class here, is that that compiler can complain if anyone ever reaches for that mutex (if you make it private, of course).
But, in both cases, you can only declare the data in the header, and you must still define it only once in some translation unit. I'd recommend you dispense with the in-header implementation, and hide the static mutex along with function implementation in the same translation unit.
From that point on, you can use a namespace, since it becomes a matter of organization only.
From a general "good practice" perspective, I would avoid any global data, and use static data as less as possible ( avoid the singleton syndrome ).
The first question is: Does it hurt to have several copy of those objects? Usually, if you are not managing hardware access or similar, it does not. Even if you are supposed to use only one, it is good to make it as general as possible and allows several instance of the class.
Also, static / global variables tend to carry issues with threads/processes, as it is some times difficult to figure how many copy of the static object you have.
So as a short answer: Put them in a class, as non-static, and put those "global" data into the class also.
If the reason to make those data is due to multiple threads or processes, then probably something is wrong at a design point of view.
If you do not have choice over a more appropriate solution, then I would use the namespace: a class using global parameters would break the encapsulation principle which any programmer would expect from your code.
EDITED:
After the edition in the question and trying to get closer of what the question is:
Supposing you require a wrapper around a mutex, which is unique per process.
class MyMutexWrapper // very minimal class, you should finish it.
{
MyMutex mutex;
public:
anyFunction();
};
Once you have this, you may create a single global instance of it and share it through the whole code. A better solution still to create it in any scope and pass it to any component that make use if it.
You could for example pass a reference.
This approach could seem more complex, but it avoid cross dependencies and make your code structure cleaner => you could then reuse it, extend it, maintain it...

How to make an old C codebase with many globals reentrant

I'm working with a large, old C codebase (an interpreter) that uses global variables a great deal, with the result that I cannot have two instances of it at once. Is there a straightforward (ideally automated) approach to convert this code to something reentrant? i.e. some refactor tool that would make all globals part of a struct and prepend the pointer to all variables?
Could I convert to C++ and wrap the entire thing in a class definition?
I would recommend to convert your project into C++11 project and change all your static vars into threadlocal.
This can be up to several days of work depending on the size of your project. In certain cases this will work.
I'm not aware of any "ready made" solution for this type of problem.
As a general rule, global variables are going to make it hard to make the code reentrant.
If you can remove all the global variables [simply delete the globals and see where you get compiler errors]. Replace the globals with a structure, and then use a structure per instance that is passed along, you'd be pretty much done (as long as the state of the interpreter instances is independent, and the instances don't need to know about each other). [Of course, you may need to have more than a single structure to solve the problem, but your global variables should be possible to "stick in a structure"].
Of course, making the structure and the code go together as a C++ class (which may have smaller classes as part of the solution) would be the "next step", but it's not entirely straight forward to do this, if you are not familiar with C++ and class designs.
Are you trying to make it reentrant in order to be able to make it multi-thread, and divide the work between threads?
If so, I would consider making it multi process, instead of multy-thread,
What I usually do with interpreters is go straight to a class with instance vars rather than globals. Not sure what you are interpreting, but it could be possible to pass in a file path or string container that the class interprets with an internal thread, so encapsulating the entire interpretation run.
It is possible to wrap the whole thing in a class definition, but it will not work for code that takes addresses of functions and passes them to C code. Also, converting a large legacy code base to be compilable by a C++ compiler is tedious enough that it probably outweighs the effort of removing the global variables by hand.
Barring that one, you have two options:
Since you need reentrancy to implement threading, it might be easiest to declare all global variables thread-local. If you have a C compiler that supports thread-locals, this is typically as easy as slapping a __thread (or other compiler-specific keyword) before every declaration. Then you create a new interpreter simply by creating a new thread and initializing the interpreter in the normal way.
If you cannot use __thread or equivalent, then you have to do a bit more footwork. You need to place all global variables in a structure, and replace every access to global variable foo with get_ctx()->foo. This is tedious, but straightforward. Once you are done, get_ctx() can allocate and return a thread-local interpreter state, using an API of your choosing.
A program transformation tool that can handle the C language would be able to automate such changes.
It needs to be able to resolve each symbol to its declaration, and preprocessor conditionals are likely to be trouble. I'd name one, but SO zealots object when I do that.
Your solution of a struct containing the globals is the right idea; you need rewrite that replaces each global declaration with a slot member, and each access to a global with access to the corresponding struct member.
The remaining question is, where does the struct pointer come from? One answer is a global variable that is multiplexed when threads are switched; a better answer if available under your OS is the get the struct pointer from thread local variables.

Use of global variables in simulation code [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Are global variables bad?
I am writing a simulation code making use of material and energy specific data. This data is stored in global arrays, because once uploaded, they are used during the simulation and should be accessible by most of the functions.
I read everywhere that it is not good practice to use global variables.
Could someone explain me or point me to material on the web explaining how I could avoid using global arrays in simulation application coding while massive data arrays need to be used. I try to code in C++ and make use as much as possible of object oriented features.
Thanks in advance for your help.
You are right about the fact that, using globals are not recommended. You can declare those unrelated golbals inside a namespace,
//Globals.h
namespace Globals
{
extern int a[100];
extern double d;
}
and define them in a .cpp file.
//Globals.cpp
int Globals::a[100] = { ... };
double Globals::d = 3.14;
Now use them as Globals::a, Globals::d etc. My answer is in code management perspective.
Yes you are right, global variables are not good.
This is a useful link which explains why global variables are bad and how to avoid them.
http://c2.com/cgi/wiki?GlobalVariablesAreBad
EDIT:
#sergio's post also points to the same link, you can ignore this answer
Could someone explain me or point me to material on the web explaining how I could avoid using global arrays in simulation application coding while massive data arrays need to be used.
The same way you avoid globals in general: by using locals instead. The natural way to get data into a function is to pass it as a parameter. This is not expensive, especially if you pass by reference where appropriate.
Have a look at this article about global variables. This is an excerpt:
Why Global Variables Should Be Avoided When Unnecessary
Non-locality -- Source code is easiest to understand when the scope of its individual elements are limited. Global variables can be read or modified by any part of the program, making it difficult to remember or reason about every possible use.
No Access Control or Constraint Checking -- A global variable can be get or set by any part of the program, and any rules regarding its use can be easily broken or forgotten. (In other words, get/set accessors are generally preferable over direct data access, and this is even more so for global data.) By extension, the lack of access control greatly hinders achieving security in situations where you may wish to run untrusted code (such as working with 3rd party plugins).
Implicit coupling -- A program with many global variables often has tight couplings between some of those variables, and couplings between variables and functions. Grouping coupled items into cohesive units usually leads to better programs.
Concurrency issues -- if globals can be accessed by multiple threads of execution, synchronization is necessary (and too-often neglected). When dynamically linking modules with globals, the composed system might not be thread-safe even if the two independent modules tested in dozens of different contexts were safe.
Namespace pollution -- Global names are available everywhere. You may unknowingly end up using a global when you think you are using a local (by misspelling or forgetting to declare the local) or vice versa. Also, if you ever have to link together modules that have the same global variable names, if you are lucky, you will get linking errors. If you are unlucky, the linker will simply treat all uses of the same name as the same object.
Memory allocation issues -- Some environments have memory allocation schemes that make allocation of globals tricky. This is especially true in languages where "constructors" have side-effects other than allocation (because, in that case, you can express unsafe situations where two globals mutually depend on one another). Also, when dynamically linking modules, it can be unclear whether different libraries have their own instances of globals or whether the globals are shared.
Testing and Confinement - source that utilizes globals is somewhat more difficult to test because one cannot readily set up a 'clean' environment between runs. More generally, source that utilizes global services of any sort (e.g. reading and writing files or databases) that aren't explicitly provided to that source is difficult to test for the same reason. For communicating systems, the ability to test system invariants may require running more than one 'copy' of a system simultaneously, which is greatly hindered by any use of shared services - including global memory - that are not provided for sharing as part of the test.
It also discusses several alternatives. Possibly in your case, you could consider:
hiding your globals (e.g., private static variables);
stateful procedures: setter and getter functions allowing access to the arrays while also "masking" it;
the singleton pattern.
EDIT:
I understand that a part of the development community are against the use of the singleton pattern. I fully respect this opinion. Anyway, in the context of the present discussion, the singleton offers several advantages over the raw use of globals:
improved access control;
opportunity for synchronization;
ability to abstract away the implementation.
In this respect, it is not better from a setter/getter set of functions, but still, not worse. I leave to the OP the hard task of choosing what to do with his own code. (BTW, the article discusses more approaches, like Context Objects, DependencyInjection, etc).
Introducing global state into your code can make it difficult to do things in a multi-threaded way.
I would also argue it can make the intent of your code more difficult to follow. If you pass all of the arguments to a function as parameters at least it's clear what data the function has access to, and what has the potential of changing. The use of global variables doesn't give someone reading the code this chance...
It's also not generally true that using global variables is in any way faster. If you have large objects that you need to pass to functions, pass these arguments via references and there won't be any issues with copying.
Without knowing more about your setup, it's difficult to make any recommendations, but if you have a large amount of data that needs to be passed around to a series of routines I would be tempted to put it all in a struct, and to pass that struct by reference:
struct my_data_type
{
double some_data;
double some_other_data;
std::vector<double> some_coefficients;
std::vector<double> some_other_coefficients;
std::string some_name;
std::string some_other_name;
// add more members as necessary...
};
void foo(my_data_type &data)
{
// there's no big overhead passing data by reference
}
If you only need to access the data in a read-only fashion, it's best to pass as a const reference:
void foo(my_data_type const&data)
{
// again, efficient, but ensures that data can't be modified
}
EDIT: In answer to your comment, I'm not talking about a "global" structure. You would need to declare a local struct variable, read the data from your file into the struct and then pass it by reference to any functions that need to be called:
int main()
{
// a local struct to encapsulate the file data
my_data_type file_data;
// read the data from your file into file_data
// the struct containing the data is passed by reference
foo(file_data);
return 0;
}

What to use instead of static variables

in a C++ program I need some helper constant objects that would be instantiated once, preferably when the program starts. Those objects would mostly be used within the same translation unit, so the simplest way to do this would be to make them static:
static const Helper h(params);
But then there is this static initialization order problem, so if Helper refers to some other statics (via params), this might lead to UB.
Another point is that I might eventually need to share this object between several units. If I just leave it static and put in a .h file, that would lead to multiple objects. I could avoid that by bothering with extern etc, but this can finally provoke the same initialization order issues (and not to say it looks very C-ish).
I thought about singletons, but that would be overkill due to the boilerplate code and inconvenient syntax (e.g. MySingleton::GetInstance().MyVar) - those objects are helpers, so they are supposed to simplify things, not to complicate them...
The same C++ FAQ mentions this option:
Fred& x()
{
static Fred* ans = new Fred();
return *ans;
}
Is this really used and considered a good thing? Should I do it this way, or would you suggest other alternatives? Thanks.
EDIT: I should have clarified why I actually need that helpers: they are very like normal constants, and could have been pre-calculated, but it is more convenient to do that at runtime. I would prefer to instantiate them before main, as it automatically resolves multi-threading issues (which local statics are not protected against in C++03). Also, as I said, they would often be limited to a translation unit, so it does not make sense to export them and initialize in main(). You can think of them as just constants but only known at runtime.
There are several possibilities for global state (whether mutable or not).
If you fear that you'll have an initialization issue, then you should use the local static approach to create your instance.
Note that the clunky singleton design you present is not mandatory design:
class Singleton
{
public:
static void DoSomething(int i)
{
Singleton& s = Instance();
// do something with i
}
private:
Singleton() {}
~Singleton() {}
static Singleton& Instance()
{
static Singleton S; // no dynamic allocation, it's unnecessary
return S;
}
};
// Invocation
Singleton::DoSomething(i);
Another design is somewhat similar, though I much prefer it because it makes transition to a non-global design much easier.
class Monoid
{
public:
Monoid()
{
static State S;
state = &s;
}
void doSomething(int i)
{
state->count += i;
}
private:
struct State
{
int count;
};
State* state;
};
// Use
Monoid m;
m.doSomething(1);
The net advantage here is that the "global-ness" of the state is hidden, it's an implementation details that clients need not worry about. Very useful for caches.
Let us, will you, question the design:
do you actually need to enforce the singularity ?
do you actually need the object be built before main starts ?
Singularity is generally over-emphasized. C++0x will help here, but even then, technically enforcing singularity rather than relying on programmers to behave themselves can be very annoying... for example when writing tests: do you really want to unload/reload your program between each unit test just to change the configuration between each one ? Ugh. Much more simple to instantiate it once and have faith in your fellow programmers... or the functional tests ;)
The second question is more technical, than functional. If you do need the configuration before the entry point of your program, then you can simply read it when it starts.
It may sound naive, but there is actually one issue with computing during the library load: how do you handle errors ? If you throw, the library is not loaded. If you do not throw and go on, you are in an invalid state. Not so funny, is it ? Things are much simpler once the real work has begun, because you can use the regular control-flow logic.
And if you think about testing whether the state is valid or not... why not simply building everything at the point where you'd test ?
Finally, the very issue with global is the hidden dependencies that are introduced. It's much better when dependencies are implicit to reason about the flow of execution, or the impacts of a refactoring.
EDIT:
Regarding initialization order issues: objects within a single translation unit are guaranteed to be initialized in the order they are defined.
Therefore, the following code is valid according to the standard:
static int foo() { return std::numeric_limits<int>::max() / 2; }
static int bar(int c) { return c*2; }
static int const x = foo();
static int const y = bar(x);
The initialization order is only an issue when referencing constants / variables defined in another translation unit. As such, static objects can naturally be expressed without issues as long as they only refer to static objects within the same translation unit.
Regarding the space issue: the as-if rule can do wonders here. Informally the as-if rule means that you specify a behavior and leave it up to the compiler/linker/runtime to provide it, without a care in the world for how it is provided. This is what actually enables optimizations.
Therefore, if the compiler chain can infer that the address of a constant is never taken, it may elide the constant altogether. If it can infer that several constants will always be equal, and once again that their address are never inspected, it may merge them together.
Yes, you can use Construct On First Use Idiom if it simplifies your problem. It's always better than global objects whose initialization depend on other global objects.
The other alternative is Singleton Pattern. Both can solve similar problem. But you've to decide which suits the situation better and fulfill your requirement.
To the best of my knowledge, there is nothing "better" than these two appproaches.
Singletons and global objects are often considered evil. The simplest and most flexible way is to instantiate the object in your main function and pass this object to other functions:
void doSomething(const Helper& h);
int main() {
const Parameters params(...);
const Helper h(params);
doSomething(h);
}
Another way is to make the helper functions non-members. Maybe they don't need any state at all, and if they do, you can pass a stateful object when you call them.
I think nothing speaks against the local static idiom mentioned in the FAQ. It is simple and should be thread-safe, and if the object isn't mutable, it should also be easily mockable and introduce no action at a distance.
Does Helper need to exist before main runs? If not, make a (set of?) global pointer variables initialized to 0. Then use main to populate them with the constant state in a definitive order. If you like you can even make helper functions that do the dereference for you.