What are the rules regarding initialization of non-local statics? - c++

Suppose I have a class whose only purpose is the side-effects caused during construction of its objects (e.g., registering a class with a factory):
class SideEffectCauser {
public:
SideEffectCauser() { /* code causing side-effects */ }
};
Also suppose I'd like to have an object create such side-effects once for each of several translation units. For each such translation unit, I'd like to be able to just put a SideEffectCauser object at namespace scope in the .cpp file, e.g.,
SideEffectCauser dummyGlobal;
but 3.6.2/3 of the C++03 standard suggests that this object need not be constructed at all unless an object or function in the .cpp file is used, and articles such as this and online discussions such as this suggest that such objects are sometimes not initialized.
On the other hand, Is there a way to instantiate objects from a string holding their class name? has a solution that is claimed to work, and I note that it's based on using an object of a type like SideEffectCauser as a static data member, not as a global, e.g.,
class Holder {
static SideEffectCauser dummyInClass;
};
SideEffectCauser Holder::dummyInClass;
Both dummyGlobal and dummyInClass are non-local statics, but a closer look at 3.6.2/3 of the C++03 standard shows that that passage applies only to objects at namespace scope. I can't actually find anything in the C++03 standard that says when non-local statics at class scope are dynamically initialized, though 9.4.2/7 suggests that the same rules apply to them as to non-local statics at namespace scope.
Question 1: In C++03, is there any reason to believe that dummyInClass is any more likely to be initialized than dummyGlobal? Or may both go uninitialized if no functions or objects in the same translation unit are used?
Question 2: Does anything change in C++11? The wording in 3.6.2 and 9.4.2 is not the same as the C++03 versions, but, from what I can tell, there is no behavioral difference specified for the scenarios I describe above.
Question 3: Is there a reliable way to use objects of a class like SideEffectCauser outside a function body to force side-effects to take place?

I think the only reliable solution is to design this for specific compiler(s) and runtime. No standard covers the initialization of globals in a shared library which I think is the most intricate case, as this is much dependent on the loader and thus OS dependent.
Q1: No
Q2: Not in any practical sense
Q3: Not in a standard way

I'm using something similar with g++ / C++11 under Linux and get my factories registered as expected. I'm not sure why you wouldn't get the functions called. If what you describe were implemented, it would mean that every single function in that unit has to call the initialization function. I'm not too sure how that could be done. My factories are also inside namespaces, although they are named namespaces. But I don't see why it wouldn't be called.
namespace snap {
namespace plugin_name {
class plugin_name_factory {
public:
plugin_name_factory() { plugin_register(this, name); }
...
} g_plugin_name_factory;
}
}
Note that the static keyword should not be used anymore in C++ anyway. It is often slower to have a static definition than a global.
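As a rough illustration of the registration pattern discussed in this thread, here is a minimal single-file sketch. The names registry, Registrar and is_registered are hypothetical and not taken from any answer above. It only covers the case where the registering object is defined in the same translation unit as a function that is later called. It says nothing about static or shared libraries, where the linker or loader may behave differently.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical registry. It is a function-local static so that it already
// exists whenever a registrar's constructor runs.
std::map<std::string, int>& registry() {
    static std::map<std::string, int> r;
    return r;
}

// Hypothetical side-effect-only class: constructing it registers a name.
struct Registrar {
    Registrar(const std::string& name, int id) { registry()[name] = id; }
};

namespace {
// Namespace-scope object. The standard requires it to be initialized before
// the first use of any function or object defined in this translation unit.
Registrar g_widget_registrar("widget", 1);
}

// Defined in the same translation unit as the registrar, so calling it
// implies that g_widget_registrar has already been constructed.
bool is_registered(const std::string& name) {
    return registry().count(name) != 0;
}
```

Calling is_registered("widget") from main() should return true here, because is_registered lives in the same translation unit as the registrar object.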

Related

Why is static initialization order STILL unspecified?

Doesn't a compiler have all the information it needs to generate a dependency tree of all globals and create a well defined and correct initialization order for them? I realize you could write a cyclic dependency with globals - make only that case undefined behavior - and the compiler could warn and maybe error about it.
Usually the reason for this sort of thing is that it would be burdensome to compiler makers or cause compilation to slow significantly. I have no metrics or evidence that indicates either of these wouldn't be true in this case, but my inclination is that neither would be true.
Hm, imagine the following setup, which is perfectly valid C++, but tricky to analyze:
// TU #1
bool c = coin();
// TU #2
extern bool c;
extern int b;
int a = c ? b : 10;
// TU #3
extern bool c;
extern int a;
int b = c ? 20 : a;
It is clear that TU #1 needs to be initialized first, but then what? The standard solution with references-to-statics allows you to write this code correctly with standard C++, but solving this by fixing the global initialization order seems tricky.
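For illustration, the three translation units above can be rewritten with the references-to-statics idiom. This is only a sketch collapsed into one file: coin() is replaced by a fixed stand-in so the outcome is reproducible, and the accessor functions a(), b() and c() are names chosen here, not part of the original example.

```cpp
#include <cassert>

// Stand-in for coin() from the example; fixed to true for reproducibility.
bool coin() { return true; }

int& a();
int& b();

// Each former global becomes a function-local static, initialized on first use.
bool& c() {
    static bool value = coin();
    return value;
}

int& a() {
    // Only one branch is evaluated, so a() calls b() only when c() is true.
    static int value = c() ? b() : 10;
    return value;
}

int& b() {
    // When c() is true this never calls a(), so there is no recursive
    // re-entry into an initialization that is still in progress (which
    // would be undefined behavior).
    static int value = c() ? 20 : a();
    return value;
}
```

With coin() returning true, a() triggers b(), which yields 20, so both end up as 20. With false, a() would be 10 and b() would pick that up. In either case the order falls out of the calls rather than from link order.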
The part the compiler can deal with is actually defined: objects with static storage duration are constructed in the order their definitions appear in the translation unit. The destruction order is just the reverse.
When it comes to ordering objects between translation units, the dependency group for objects is typically not explicitly represented. However, even if the dependencies were explicitly represented, they wouldn't actually help much: on small projects the dependencies between objects with static storage duration can be managed relatively easily. Where things become interesting is large projects, but these have a much higher chance of including initializations of the form
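The within-one-translation-unit guarantee can be observed with a small sketch. The Tracked type and construction_log() helper are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Log of construction order. A function-local static, so it is created the
// first time a Tracked constructor needs it.
std::vector<std::string>& construction_log() {
    static std::vector<std::string> log;
    return log;
}

// Records its name when constructed.
struct Tracked {
    explicit Tracked(const std::string& name) {
        construction_log().push_back(name);
    }
};

// All three are defined in one translation unit, so they are constructed in
// exactly this order.
Tracked first("first");
Tracked second("second");
Tracked third("third");
```

Reading construction_log() from main() should show the entries in definition order. Nothing comparable can be said if the three definitions are spread over different .cpp files.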
static T global = functionWhichMayuseTheword();
i.e., in the case where the ordering would be useful it is bound not to work.
There is a trivial way to make sure objects are constructed in time which is even thread-safe in C++11 (it wasn't thread-safe in C++03 as that standard didn't mention any concept of threads in the first place): use a function-local static object and return a reference to it. The objects will be constructed upon demand, but if there are dependencies between them this is generally acceptable:
static T& global() {
static T rc = someInitialization();
return rc;
}
Given that there is a simple work-around, and neither a proposal nor a working implementation demonstrating that such a proposal would work, there is little interest in changing how global objects are initialized. Not to mention that improving the support for global objects seems as useful as making goto better.
I am not a compiler author so take what I say with a grain of salt. I think the reasons are as follows.
1) Desire to preserve the C model of separate compilation. Link time analysis is certainly allowed, but I suspect they did not want to make it required.
2) Meyers Singleton (especially now that it has been made thread-safe) provides a good enough alternative in that it is almost as easy to use as a global variable but provides the guarantees you are looking for.

is static const string member variable always initialized before used?

In C++, if I want to define some non-local const string which can be used in different classes, functions, files, the approaches that I know are:
use define directives, e.g.
#define STR_VALUE "some_string_value"
const class member variable, e.g.
class Demo {
public:
static const std::string ConstStrVal;
};
// then in cpp
const std::string Demo::ConstStrVal = "some_string_value";
const class member function, e.g.
class Demo{
public:
static const std::string GetValue(){return "some_string_value";}
};
What I am not clear about is this: if we use the 2nd approach, is the variable ConstStrVal always initialized to "some_string_value" before it is actually used by any code in any case? I'm concerned about this because of the "static initialization order fiasco". If this issue is valid, why is everybody using the 2nd approach?
Which is the best approach, 2 or 3? I know that #define directives have no respect of scope, most people don't recommend it.
Thanks!
if we use the 2nd approach, is the variable ConstStrVal always initialized to "some_string_value" before it is actually used by any code in any case?
No
It depends on the value it's initialized to, and the order of initialization. ConstStrVal has a global constructor.
Consider adding another global object with a constructor:
static const std::string ConstStrVal2(ConstStrVal);
The order is not defined by the language, and ConstStrVal2's constructor may be called before ConstStrVal has been constructed.
The initialization order can vary for a number of reasons, but it's often specified by your toolchain. Altering the order of linked object files could (for example) change the order of your image's initialization and then the error would surface.
why is everybody using the 2nd approach?
many people use other approaches for very good reasons…
Which is the best approach, 2 or 3?
Number 3. You can also avoid multiple constructions like so:
class Demo {
public:
static const std::string& GetValue() {
// this is constructed exactly once, when the function is first called
static const std::string s("some_string_value");
return s;
}
};
Caution: this approach is still capable of the initialization problem seen in ConstStrVal2(ConstStrVal). However, you have more control over initialization order and it's an easier problem to solve portably when compared to objects with global constructors.
In general, I (and many others) prefer to use functions to return values rather than variables, because functions give greater flexibility for future enhancement. Remember that most of the time spent on a successful software project is maintaining and enhancing the code, not writing it in the first place. It's hard to predict if your constant today might not be a compile time constant tomorrow. Maybe it will be read from a configuration file some day.
So I recommend approach 3 because it does what you want today and leaves more flexibility for the future.
Avoid using the preprocessor with C++. Also, why would you have a string in a class, but need it in other classes? I would re-evaluate your class design to allow better encapsulation. If you absolutely need this global string then I would consider adding a globals.h/cpp module and then declare/define string there as:
const char* const kMyErrorMsg = "This is my error message!";
Don't use preprocessor directives in C++, unless you're trying to achieve a holy purpose that can't possibly be achieved any other way.
From the standard (3.6.2):
Objects with static storage duration (3.7.1) shall be zero-initialized
(8.5) before any other initialization takes place. A reference with
static storage duration and an object of POD type with static storage
duration can be initialized with a constant expression (5.19); this is
called constant initialization. Together, zero-initialization and
constant initialization are called static initialization; all other
initialization is dynamic initialization. Static initialization shall
be performed before any dynamic initialization takes place. Dynamic
initialization of an object is either ordered or unordered.
Definitions of explicitly specialized class template static data
members have ordered initialization. Other class template static data
members (i.e., implicitly or explicitly instantiated specializations)
have unordered initialization. Other objects defined in namespace
scope have ordered initialization. Objects defined within a single
translation unit and with ordered initialization shall be initialized
in the order of their definitions in the translation unit. The order
of initialization is unspecified for objects with unordered
initialization and for objects defined in different translation units.
So, the fate of 2 depends on whether your variable is static initialised or dynamic initialised. For instance, in your concrete example, if you use const char * Demo::ConstStrVal = "some_string_value"; (better yet const char Demo::ConstStrVal[] if the value will stay constant in the program) you can be sure that it will be initialised no matter what. With a std::string, you can't be sure since it's not a POD type (I'm not dead sure on this one, but fairly sure).
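A small sketch of the static-versus-dynamic distinction, assuming a C++11 compiler. The names kConstStrVal and kCopy are hypothetical. The constexpr keyword is used here only so the compiler enforces constant initialization; a plain const char array initialized from a literal gets the same treatment.

```cpp
#include <cassert>
#include <string>

// Constant-initialized: constexpr makes the compiler reject this definition
// unless the initializer is a constant expression, so the bytes are in place
// before any dynamic initialization runs.
constexpr char kConstStrVal[] = "some_string_value";

// Evaluated at compile time, which is possible only because kConstStrVal is
// constant-initialized.
static_assert(kConstStrVal[0] == 's', "kConstStrVal is a compile-time constant");

// Dynamically initialized: std::string's constructor runs at start-up.
// Reading kConstStrVal here is safe, because static initialization is
// completed before any dynamic initialization begins.
const std::string kCopy(kConstStrVal);
```

The std::string object is still subject to ordering concerns with respect to other dynamically initialized objects in other translation units; only the character array is immune.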
3rd method allows you to be sure and the method in Justin's answer makes sure that there are no unnecessary constructions. Though keep in mind that the static method has a hidden overhead of checking whether or not the variable is already initialised on every call. If you're returning a simple constant, just returning your value is definitely faster since the function will probably be inlined.
All of that said, try to write your programs so as not to rely on static initialisation. Static variables are best regarded as a convenience, they aren't convenient any more when you have to juggle their initialisation orders.

Why doesn't C++ need forward declarations for class members?

I was under the impression that everything in C++ must be declared before being used.
In fact, I remember reading that this is the reason why the use of auto in return types is not valid C++0x without something like decltype: the compiler must know the declared type before evaluating the function body.
Imagine my surprise when I noticed (after a long time) that the following code is in fact perfectly legal:
[Edit: Changed example.]
class Foo
{
Foo(int x = y);
static const int y = 5;
};
So now I don't understand:
Why doesn't the compiler require a forward declaration inside classes, when it requires them in other places?
The standard says (section 3.3.7):
The potential scope of a name declared in a class consists not only of the declarative region following the name’s point of declaration, but also of all function bodies, brace-or-equal-initializers of non-static data members, and default arguments in that class (including such things in nested classes).
This is probably accomplished by delaying processing bodies of inline member functions until after parsing the entire class definition.
Function definitions within the class body are treated as if they were actually defined after the class has been defined. So your code is equivalent to:
class Foo
{
Foo();
int x, *p;
};
inline Foo::Foo() { p = &x; }
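To make the class-scope rule concrete, here is a short sketch with a hypothetical Widget type (not the Foo from the question). Both a default argument and a member function body refer to members that are declared further down in the class. It assumes C++11 for the default member initializer.

```cpp
#include <cassert>

struct Widget {
    // The default argument names `limit`, which is declared later in the class.
    int clamp(int value, int max = limit) const {
        return value > max ? max : value;
    }

    // The function body names `offset`, also declared later.
    int shifted(int value) const { return value + offset; }

    static const int limit = 5;
    int offset = 2;  // C++11 default member initializer
};
```

The same forward use at namespace scope (calling a function or naming a variable before its declaration) would not compile.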
Actually, I think you need to reverse the question to understand it.
Why does C++ require forward declaration ?
Because of the way C++ works (include files, not modules), it would otherwise need to wait for the whole Translation Unit before being able to assess, for sure, what the functions are. There are several downsides here:
compilation time would take yet another hit
it would be nigh impossible to provide any guarantee for code in headers, since any introduction of a later function could invalidate it all
Why is a class different ?
A class is by definition contained. It's a small unit (or should be...). Therefore:
there is little compilation time issue, you can wait until the class end to start analyzing
there is no risk of dependency hell, since all dependencies are clearly identified and isolated
Therefore we can eschew this annoying forward-declaration rule for classes.
Just guessing: the compiler saves the body of the function and doesn't actually process it until the class declaration is complete.
Unlike a namespace, a class's scope cannot be reopened. It is bounded.
Imagine implementing a class in a header if everything needed to be declared in advance. I presume that since it is bounded, it was more logical to write the language as it is, rather than requiring the user to write forward declarations in the class (or requiring definitions separate from declarations).

Why field inside a local class cannot be static?

void foo (int x)
{
struct A { static const int d = 0; }; // error
}
Other than the reference from standard, is there any motivation behind this to disallow static field inside an inner class ?
error: field `foo(int)::A::d' in local class cannot be static
Edit: However, static member functions are allowed. I have one use case for such scenario. Suppose I want foo() to be called only for PODs then I can implement it like,
template<typename T>
void foo (T x)
{
struct A { static const T d = 0; }; // many compilers allow double, float etc.
}
foo() should pass for PODs only (if static is allowed) and not for other data types. This is just one use case which comes to my mind.
Because static members of a class need to be defined in a global scope, e.g.
foo.h
class A {
static int dude;
};
foo.cpp
int A::dude = 314;
Since the scope inside void foo(int x) is local to that function, there is no scope to define its static member[s].
Magnus Skog has given the real answer: a static data member is just a declaration; the object must be defined elsewhere, at namespace scope, and the class definition isn't visible at namespace scope.
Note that this restriction only applies to static data members. Which means that there is a simple work-around:
class Local
{
static int& static_i()
{
static int value;
return value;
}
};
This provides you with exactly the same functionality, at the cost of
using the function syntax to access it.
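A minimal sketch of that work-around in use, with a hypothetical next_id() function wrapping the local class. The static member function is allowed in a local class, and the function-local static inside it plays the role of the forbidden static data member.

```cpp
#include <cassert>

// Returns 1, 2, 3, ... on successive calls.
int next_id() {
    struct Local {
        // Static member functions are permitted in local classes.
        static int& counter() {
            static int value = 0;  // one object, shared by every call
            return value;
        }
    };
    return ++Local::counter();
}
```

The counter keeps its value between calls to next_id(), just as a static data member of Local would if the language permitted one.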
Because nobody saw any need for it ?
[edit]: static variables need be defined only once, generally outside of the class (except for built-ins). Allowing them within a local class would require designing a way to define them also. [/edit]
Any feature added to a language has a cost:
it must be implemented by the compiler
it must be maintained in the compiler (and may introduce bugs, even in other features)
it lives in the compiler (and thus may cause some slow down even when unused)
Sometimes, not implementing a feature is the right decision.
Local functions, and classes, add difficulty already to the language, for little gain: they can be avoided with static functions and unnamed namespaces.
Frankly, if I had to make the decision, I'd remove them entirely: they just clutter the grammar.
A single example: The Most Vexing Parse.
I think this is the same naming problem that has prevented us from using local types in template instantiations.
The name foo()::A::d is not a good name for the linker to resolve, so how should it find the definition of the static member? What if there is another struct A in function baz()?
Interesting question, but I have difficulty understanding why you'd want a static member in a local class. Statics are typically used to maintain state across program flow, but in this case wouldn't it be better to use a static variable whose scope was foo()?
If I had to guess why the restriction exists, I'd say it was something to do with the difficulty for the compiler in knowing when to perform the static initialisation. The C++ standards docs might provide a more formal justification.
Just because.
One annoying thing about C++ is that there's a strong dependence on a "global context" concept where everything must be uniquely named. Even the nested namespaces machinery is just string trickery.
I suppose (just a wild guess) that one serious technical issue is working with linkers that were designed for C and that just got some tweak to get them working with C++ (and C++ code needs C interoperability).
It would be nice to be able to get any C++ code and "wrap it" to be able to use it without conflicts in a larger project, but this is not the case because of linkage problems. I don't think there is any reasonable philosophical reason for forbidding statics or non-inline methods (or even nested functions) at the function level but this is what we got (for now).
Even the declaration/definition duality with all its annoying verbosity and implications is just about implementation problems (and to give the ability to sell usable object code without providing the source, something that is now a lot less popular for good reasons).

Difference between static in C and static in C++??

What is the difference between the static keyword in C and C++?
The static keyword serves the same purposes in C and C++.
When used at file level (outside of a function), it sets the visibility of the item it's applied to. Static items are not visible outside of their compilation unit (e.g., to the linker). Their duration is the same as the duration of the program.
These file-level items (functions and data) should be static unless there's a specific need to access them from outside (and there's almost never a need to give direct access to data since that breaks the central tenet of encapsulation).
If (as your comment to the question indicates) this is the only use of static you're concerned with then, no, there is no difference between C and C++.
When used within a function, it sets the duration of the item. Again, the duration is the same as the program and the item continues to exist between invocations of that function.
It does not affect the visibility of that item since it's visible only within the function. An example is a random number generator that needs to keep its seed value between invocations but doesn't want that value visible to other functions.
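The random number generator case might look like the following sketch. It compiles as C++, and the same idea works in C with the C headers. The function name next_random and the generator constants are illustrative choices, not something from this answer.

```cpp
#include <cassert>
#include <cstdint>

// Minimal linear congruential generator; the constants are illustrative.
std::uint32_t next_random() {
    // Initialized once; keeps its value between calls, yet cannot be named
    // from outside this function.
    static std::uint32_t seed = 1;
    seed = seed * 1664525u + 1013904223u;
    return seed;
}
```

Each call advances the hidden seed, so successive calls return different values even though the function takes no arguments.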
C++ has one more use, static within a class. When used there, it becomes a single class variable that's common across all objects of that class. One classic example is to store the number of objects that have been instantiated for a given class.
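The object-counting example could be sketched as follows. Counted is a hypothetical class name, and this version tracks objects currently alive rather than the total ever created.

```cpp
#include <cassert>

class Counted {
public:
    Counted() { ++count_; }
    Counted(const Counted&) { ++count_; }
    ~Counted() { --count_; }

    // Static member function: callable without an object.
    static int live() { return count_; }

private:
    static int count_;  // declaration only; shared by all Counted objects
};

// The single definition of the static data member, at namespace scope.
int Counted::count_ = 0;
```

Counted::live() can be called without any object in hand, which is the point of making both the counter and its accessor static.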
As others have pointed out, the use of file-level static has been deprecated in favour of unnamed namespaces. However, I believe it'll be a cold day in a certain warm place before it's actually removed from the language - there's just too much code using it at the moment. And ISO C has only just gotten around to removing gets() despite the amount of time we've all known it was a dangerous function.
And even though it's deprecated, that doesn't change its semantics now.
The use of static at the file scope to restrict access to the current translation unit is deprecated in C++, but still acceptable in C.
Instead, use an unnamed namespace
namespace
{
int file_scope_x;
}
Variables declared this way are only available within the file, just as if they were declared static.
The main reason for the deprecation is to remove one of the several overloaded meanings of the static keyword.
Originally, it meant that the variable, such as in a function, would be given storage for the lifetime of the program in an area for such variables, and not stored on the stack as is usual for function local variables.
Then the keyword was overloaded to apply to file scope linkage. It's not desirable to make up new keywords as needed, because they might break existing code. So this one was used again with a different meaning without causing conflicts, because a variable declared as static can't be both inside a function and at the top level, and functions didn't have the modifier before. (The storage connotation is totally lost when referring to functions, as they are not stored anywhere.)
When classes came along in C++ (and in Java and C#) the keyword was used yet again, but the meaning is at least closer to the original intention. Variables declared this way are stored in a global area, as opposed to on the stack as for function variables, or on the heap as for object members. Because variables cannot be both at the top level and inside a class definition, extra meaning can be unambiguously attached to class variables. They can only be referenced via the class name or from within an object of that class.
It has the same meaning in both languages.
But C++ adds classes. In the context of a class (and thus a struct) it has the extra meaning of making the method/variable a member of the class rather than a member of the object.
class Plop
{
static int x; // This is a member of the class not an instance.
public:
static int getX() // method is a member of the class.
{
return x;
}
};
int Plop::x = 5;
Note that the use of static to mean "file scope" (aka namespace scope) is only deprecated by the C++ Standard for objects, not for functions. In other words:
// foo.cpp
static int x = 0; // deprecated
static int f() { return 1; } // not deprecated
To quote Annex D of the Standard:
The use of the static keyword is
deprecated when declaring objects in
namespace scope.
You cannot declare a static variable inside a structure in C, but it is allowed in C++, where it is defined with the help of the scope resolution operator.
Also, in C++ a static member function can directly access only static member variables, whereas in C a static function can use both static and non-static variables...😊