Why is static initialization order STILL unspecified? - c++

Doesn't a compiler have all the information it needs to generate a dependency tree of all globals and create a well defined and correct initialization order for them? I realize you could write a cyclic dependency with globals - make only that case undefined behavior - and the compiler could warn and maybe error about it.
Usually the reason for this sort of thing is that it would be burdensome to compiler makers or cause compilation to slow significantly. I have no metrics or evidence that indicates either of these wouldn't be true in this case, but my inclination is that neither would be true.

Hm, imagine the following setup, which is perfectly valid C++, but tricky to analyze:
// TU #1
bool c = coin();
// TU #2
extern bool c;
extern int b;
int a = c ? b : 10;
// TU #3
extern bool c;
extern int a;
int b = c ? 20 : a;
It is clear that TU #1 needs to be initialized first, but then what? The standard solution with references-to-statics allows you to write this code correctly with standard C++, but solving this by fixing the global initialization order seems tricky.
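A minimal sketch of the references-to-statics approach applied to this setup, assuming each global is replaced by an accessor function of the same name. The body of coin() is a hypothetical stand-in, since the original does not show it. Each value is computed on first use, so only the branch actually taken is ever evaluated:

```cpp
#include <cstdlib>

// Hypothetical stand-in for the coin() used above.
bool coin() { return std::rand() % 2 == 0; }

// Each former global becomes a function returning a reference to a
// function-local static, which is initialized on first call.
bool& c() {
    static bool value = coin();
    return value;
}

int& a();
int& b();

int& a() {
    static int value = c() ? b() : 10;  // needs b() only when c() is true
    return value;
}

int& b() {
    static int value = c() ? 20 : a();  // needs a() only when c() is false
    return value;
}
```

Whichever way the coin falls, a() and b() never wait on each other at the same time. A genuinely cyclic dependency would re-enter an initialization that is still in progress, which is undefined behavior, so this technique does not make real cycles safe.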

The part the compiler can deal with is actually defined: objects with static storage duration are constructed in the order their definitions appear in the translation unit. The destruction order is just the reverse.
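A small sketch of that within-translation-unit guarantee. Tracer and event_log are hypothetical names introduced only for illustration:

```cpp
#include <string>
#include <vector>

// Hypothetical log; a function-local static so it exists before any Tracer uses it.
std::vector<std::string>& event_log() {
    static std::vector<std::string> events;
    return events;
}

// Records its own construction so the order can be inspected later.
struct Tracer {
    explicit Tracer(const std::string& name) {
        event_log().push_back("ctor " + name);
    }
};

// Both objects live in one translation unit, so they are
// constructed in the order of their definitions.
Tracer first("first");
Tracer second("second");
```

Moving the definition of second into another translation unit would remove the guarantee: the relative order of the two constructors would then be unspecified.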
When it comes to ordering objects between translation units, the dependency group for objects is typically not explicitly represented. However, even if the dependencies were explicitly represented, they wouldn't actually help much: on small projects the dependencies between objects with static storage duration can be managed relatively easily. Where things become interesting are large projects, but these have a much higher chance to include initializations of the form
static T global = functionWhichMayuseTheword();
i.e., in the case where the ordering would be useful it is bound not to work.
There is a trivial way to make sure objects are constructed in time, which is even thread-safe since C++11 (it wasn't thread-safe in C++03, as that standard didn't mention any concept of threads in the first place): use a function-local static object and return a reference to it. The objects will be constructed upon demand, but if there are dependencies between them this is generally acceptable:
static T& global() {
    static T rc = someInitialization();
    return rc;
}
Given that there is a simple work-around and neither a proposal nor a working implementation demonstrating that the proposal does work, there is little interest to change the state of how global objects are initialized. Not to mention that improving the support for global objects seems as useful as making goto better.

I am not a compiler author so take what I say with a grain of salt. I think the reasons are as follows.
1) Desire to preserve the C model of separate compilation. Link time analysis is certainly allowed, but I suspect they did not want to make it required.
2) Meyers Singleton (especially now that it has been made thread-safe) provides a good enough alternative in that it is almost as easy to use as a global variable but provides the guarantees you are looking for.

Related

Will const and constexpr eventually be the same thing?

I just read the answer to
const vs constexpr on variables
and am watching this Google Tech Talk about C++11/14 features, in which it is said that, well, constexpr might not be necessary in the future when it comes to functions, since compilers will evolve to figure it out on their own. Finally, I know that Java compilers and JVMs work hard to figure out that classes (or any variable maybe) are immutable after construction - without you explicitly saying so - and doing all sorts of wicked optimization based on this fact.
So, here's the question: Is the fate of const and constexpr to eventually be the same thing? That is, even though a compiler is not guaranteed to do compile-time initialization etc., will it not eventually do so whenever possible (basically)? And when that happens, won't one of the keywords be redundant? (Just like inline is becoming, maybe)?
No, neither one will replace the other, they have different roles. Bjarne Stroustrup tells us in his C++ FAQ that constexpr is not a replacement for const and outlines the different roles of each feature:
Please note that constexpr is not a general purpose replacement for
const (or vice versa):
const's primary function is to express the idea that an object is not modified through an interface (even though the object may very well be
modified through other interfaces). It just so happens that declaring
an object const provides excellent optimization opportunities for the
compiler. In particular, if an object is declared const and its
address isn't taken, a compiler is often able to evaluate its
initializer at compile time (though that's not guaranteed) and keep
that object in its tables rather than emitting it into the generated
code.
constexpr's primary function is to extend the range of what can be computed at compile time, making such computation type safe. Objects
declared constexpr have their initializer evaluated at compile time;
they are basically values kept in the compiler's tables and only
emitted into the generated code if needed.
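A short sketch of the distinction drawn in the quote. read_setting() is a hypothetical stand-in for any value that is only known at run time:

```cpp
// Hypothetical stand-in for a value that is only known at run time.
int read_setting() { return 5; }

constexpr int square(int x) { return x * x; }

// constexpr: the initializer must be a constant expression.
constexpr int kSquared = square(7);
static_assert(kSquared == 49, "usable in constant expressions");

// const: the object cannot be modified through this name, but its
// initializer may still run at program start (dynamic initialization).
const int kSetting = read_setting();

// constexpr int kBad = read_setting();  // error: not a constant expression

int table[kSquared];      // OK: kSquared is a constant expression
// int bad[kSetting];     // error: kSetting is not a constant expression
```

An optimizer may well fold read_setting() into a constant here, but that is not something the code can rely on: kSetting still cannot be used where the language requires a constant expression.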

What are the rules regarding initialization of non-local statics?

Suppose I have a class whose only purpose is the side-effects caused during construction of its objects (e.g., registering a class with a factory):
class SideEffectCauser {
public:
SideEffectCauser() { /* code causing side-effects */ }
};
Also suppose I'd like to have an object create such side-effects once for each of several translation units. For each such translation unit, I'd like to be able to just put a SideEffectCauser object at namespace scope in the .cpp file, e.g.,
SideEffectCauser dummyGlobal;
but 3.6.2/3 of the C++03 standard suggests that this object need not be constructed at all unless an object or function in the .cpp file is used, and articles such as this and online discussions such as this suggest that such objects are sometimes not initialized.
On the other hand, Is there a way to instantiate objects from a string holding their class name? has a solution that is claimed to work, and I note that it's based on using an object of a type like SideEffectCauser as a static data member, not as a global, e.g.,
class Holder {
    static SideEffectCauser dummyInClass;
};
SideEffectCauser Holder::dummyInClass;
Both dummyGlobal and dummyInClass are non-local statics, but a closer look at 3.6.2/3 of the C++03 standard shows that that passage applies only to objects at namespace scope. I can't actually find anything in the C++03 standard that says when non-local statics at class scope are dynamically initialized, though 9.4.2/7 suggests that the same rules apply to them as to non-local statics at namespace scope.
Question 1: In C++03, is there any reason to believe that dummyInClass is any more likely to be initialized than dummyGlobal? Or may both go uninitialized if no functions or objects in the same translation unit are used?
Question 2: Does anything change in C++11? The wording in 3.6.2 and 9.4.2 is not the same as the C++03 versions, but, from what I can tell, there is no behavioral difference specified for the scenarios I describe above.
Question 3: Is there a reliable way to use objects of a class like SideEffectCauser outside a function body to force side-effects to take place?
I think the only reliable solution is to design this for specific compiler(s) and runtime. No standard covers the initialization of globals in a shared library which I think is the most intricate case, as this is much dependent on the loader and thus OS dependent.
Q1: No
Q2: Not in any practical sense
Q3: Not in a standard way
I'm using something similar with g++ / C++11 under Linux and get my factories registered as expected. I'm not sure why you wouldn't get the functions called. If what you describe were to be implemented, it would mean that every single function in that unit has to call the initialization function. I'm not too sure how that could be done. My factories are also inside namespaces, although they are named namespaces. But I don't see why it wouldn't be called.
namespace snap {
namespace plugin_name {
class plugin_name_factory {
public:
plugin_name_factory() { plugin_register(this, name); }
...
} g_plugin_name_factory;
}
}
Note that the static keyword should not be used anymore in C++ anyway. It is often slower to have a static definition than a global.
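For illustration, a self-contained sketch of the registration idiom discussed in this thread. The names registry and Registrar are hypothetical and do not come from either poster's code. It assumes everything lives in one translation unit that is linked directly into the program. As the question notes, the standard allows the initialization to be deferred for a translation unit in which nothing is used, and objects in static libraries or shared libraries are the fragile cases:

```cpp
#include <map>
#include <string>

// Hypothetical registry. A function-local static guarantees the map
// exists before the first registrar touches it.
std::map<std::string, int>& registry() {
    static std::map<std::string, int> entries;
    return entries;
}

// Exists only for the side effect of its constructor.
struct Registrar {
    Registrar(const std::string& name, int id) { registry()[name] = id; }
};

namespace {
// Namespace-scope objects whose construction performs the registration.
Registrar widget_registrar("widget", 1);
Registrar gadget_registrar("gadget", 2);
}
```

Routing access through registry() rather than a plain global map avoids a second ordering problem: a registrar in another translation unit can never run before the map it writes to has been constructed.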

is static const string member variable always initialized before used?

In C++, if I want to define some non-local const string which can be used in different classes, functions, files, the approaches that I know are:
use define directives, e.g.
#define STR_VALUE "some_string_value"
const class member variable, e.g.
class Demo {
public:
static const std::string ConstStrVal;
};
// then in cpp
const std::string Demo::ConstStrVal = "some_string_value";
const class member function, e.g.
class Demo{
public:
static const std::string GetValue(){return "some_string_value";}
};
Now what I am not clear about is: if we use the 2nd approach, is the variable ConstStrVal always initialized to "some_string_value" before it is actually used by any code in any case? I'm concerned about this because of the "static initialization order fiasco". If this issue is valid, why is everybody using the 2nd approach?
Which is the best approach, 2 or 3? I know that #define directives have no respect of scope, most people don't recommend it.
Thanks!
if we use the 2nd approach, is the variable ConstStrVal always initialized to "some_string_value" before it is actually used by any code in any case?
No
It depends on the value it's initialized to, and the order of initialization. ConstStrVal has a global constructor.
Consider adding another global object with a constructor:
static const std::string ConstStrVal2(ConstStrVal);
The order is not defined by the language, and ConstStrVal2's constructor may be called before ConstStrVal has been constructed.
The initialization order can vary for a number of reasons, but it's often specified by your toolchain. Altering the order of linked object files could (for example) change the order of your image's initialization and then the error would surface.
why is everybody using the 2nd approach?
Many people use other approaches for very good reasons…
Which is the best approach, 2 or 3?
Number 3. You can also avoid multiple constructions like so:
class Demo {
public:
static const std::string& GetValue() {
// this is constructed exactly once, when the function is first called
static const std::string s("some_string_value");
return s;
}
};
Caution: this approach is still capable of the initialization problem seen in ConstStrVal2(ConstStrVal). However, you have more control over initialization order, and it's an easier problem to solve portably when compared to objects with global constructors.
In general, I (and many others) prefer to use functions to return values rather than variables, because functions give greater flexibility for future enhancement. Remember that most of the time spent on a successful software project is maintaining and enhancing the code, not writing it in the first place. It's hard to predict if your constant today might not be a compile time constant tomorrow. Maybe it will be read from a configuration file some day.
So I recommend approach 3 because it does what you want today and leaves more flexibility for the future.
Avoid using the preprocessor with C++. Also, why would you have a string in a class, but need it in other classes? I would re-evaluate your class design to allow better encapsulation. If you absolutely need this global string then I would consider adding a globals.h/cpp module and then declare/define string there as:
const char* const kMyErrorMsg = "This is my error message!";
Don't use preprocessor directives in C++, unless you're trying to achieve a holy purpose that can't possibly be achieved any other way.
From the standard (3.6.2):
Objects with static storage duration (3.7.1) shall be zero-initialized
(8.5) before any other initialization takes place. A reference with
static storage duration and an object of POD type with static storage
duration can be initialized with a constant expression (5.19); this is
called constant initialization. Together, zero-initialization and
constant initialization are called static initialization; all other
initialization is dynamic initialization. Static initialization shall
be performed before any dynamic initialization takes place. Dynamic
initialization of an object is either ordered or unordered.
Definitions of explicitly specialized class template static data
members have ordered initialization. Other class template static data
members (i.e., implicitly or explicitly instantiated specializations)
have unordered initialization. Other objects defined in namespace
scope have ordered initialization. Objects defined within a single
translation unit and with ordered initialization shall be initialized
in the order of their definitions in the translation unit. The order
of initialization is unspecified for objects with unordered
initialization and for objects defined in different translation units.
So, the fate of 2 depends on whether your variable is static initialised or dynamic initialised. For instance, in your concrete example, if you use const char * Demo::ConstStrVal = "some_string_value"; (better yet const char Demo::ConstStrVal[] if the value will stay constant in the program) you can be sure that it will be initialised no matter what. With a std::string, you can't be sure since it's not a POD type (I'm not dead sure on this one, but fairly sure).
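A sketch of the statically initialized variant suggested above, using the char-array form of ConstStrVal. EarlyObserver is a hypothetical helper that is deliberately defined before the string, to show that the contents are already in place when dynamic initialization starts:

```cpp
#include <cstddef>
#include <cstring>

class Demo {
public:
    static const char ConstStrVal[];
};

// Hypothetical observer: its constructor runs during dynamic
// initialization and reads the string.
struct EarlyObserver {
    EarlyObserver() : length(std::strlen(Demo::ConstStrVal)) {}
    std::size_t length;
};

// Defined before ConstStrVal on purpose.
EarlyObserver early_observer;

// A char array initialized from a string literal is constant-initialized,
// so it is ready before any dynamic initialization takes place.
const char Demo::ConstStrVal[] = "some_string_value";
```

Swapping the array for a std::string whose constructor runs at program start would make early_observer read an object that has not been constructed yet, which is exactly the ordering problem described in this thread.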
3rd method allows you to be sure and the method in Justin's answer makes sure that there are no unnecessary constructions. Though keep in mind that the static method has a hidden overhead of checking whether or not the variable is already initialised on every call. If you're returning a simple constant, just returning your value is definitely faster since the function will probably be inlined.
All of that said, try to write your programs so as not to rely on static initialisation. Static variables are best regarded as a convenience, they aren't convenient any more when you have to juggle their initialisation orders.

Why field inside a local class cannot be static?

void foo (int x)
{
struct A { static const int d = 0; }; // error
}
Other than the reference from standard, is there any motivation behind this to disallow static field inside an inner class ?
error: field `foo(int)::A::d' in local class cannot be static
Edit: However, static member functions are allowed. I have one use case for such scenario. Suppose I want foo() to be called only for PODs then I can implement it like,
template<typename T>
void foo (T x)
{
struct A { static const T d = 0; }; // many compilers allow double, float etc.
}
foo() should pass for PODs only (if static is allowed) and not for other data types. This is just one use case which comes to my mind.
Because static members of a class need to be defined in a global scope, e.g.
foo.h
class A {
static int dude;
};
foo.cpp
int A::dude = 314;
Since the scope inside void foo(int x) is local to that function, there is no scope to define its static member[s].
Magnus Skog has given the real answer: a static data member is just a declaration; the object must be defined elsewhere, at namespace scope, and the class definition isn't visible at namespace scope.
Note that this restriction only applies to static data members. Which means that there is a simple work-around:
class Local
{
public:
    // public, so that code outside the class can call it
    static int& static_i()
    {
        static int value;
        return value;
    }
};
This provides you with exactly the same functionality, at the cost of
using the function syntax to access it.
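A minimal sketch of that work-around used inside an actual function. count_calls is a hypothetical name chosen for illustration:

```cpp
// Returns how many times it has been called so far.
int count_calls() {
    // A local class may not declare a static data member, but a static
    // member function containing a function-local static is allowed.
    struct Local {
        static int& counter() {
            static int value = 0;  // one instance shared by all calls
            return value;
        }
    };
    return ++Local::counter();
}
```

The state lives in the function-local static, so it persists across calls just as a static data member would.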
Because nobody saw any need for it?
[edit]: static variables need be defined only once, generally outside of the class (except for built-ins). Allowing them within a local class would require designing a way to define them also. [/edit]
Any feature added to a language has a cost:
it must be implemented by the compiler
it must be maintained in the compiler (and may introduce bugs, even in other features)
it lives in the compiler (and thus may cause some slow down even when unused)
Sometimes, not implementing a feature is the right decision.
Local functions, and classes, add difficulty already to the language, for little gain: they can be avoided with static functions and unnamed namespaces.
Frankly, if I had to make the decision, I'd remove them entirely: they just clutter the grammar.
A single example: The Most Vexing Parse.
I think this is the same naming problem that has prevented us from using local types in template instantiations.
The name foo()::A::d is not a good name for the linker to resolve, so how should it find the definition of the static member? What if there is another struct A in function baz()?
Interesting question, but I have difficulty understanding why you'd want a static member in a local class. Statics are typically used to maintain state across program flow, but in this case wouldn't it be better to use a static variable whose scope was foo()?
If I had to guess why the restriction exists, I'd say it was something to do with the difficulty for the compiler in knowing when to perform the static initialisation. The C++ standards docs might provide a more formal justification.
Just because.
One annoying thing about C++ is that there's a strong dependence on a "global context" concept where everything must be uniquely named. Even the nested namespaces machinery is just string trickery.
I suppose (just a wild guess) that one serious technical issue is working with linkers that were designed for C and that just got some tweak to get them working with C++ (and C++ code needs C interoperability).
It would be nice to be able to get any C++ code and "wrap it" to be able to use it without conflicts in a larger project, but this is not the case because of linkage problems. I don't think there is any reasonable philosophical reason for forbidding statics or non-inline methods (or even nested functions) at the function level but this is what we got (for now).
Even the declaration/definition duality with all its annoying verbosity and implications is just about implementation problems (and to give the ability to sell usable object code without providing the source, something that is now a lot less popular for good reasons).

Which of these statements about objects is true?

Given this:
struct { int x; } ix;
struct A { A() {}; int x; };
A ia;
Which of these is true?
a. ix is an object
b. ia is an object
c. both are objects
d. both are not objects.
Many of these answers have ignored the C++ tag. In C++, "an object is a region of storage. [Note: a function is not an object, regardless of whether or not it occupies storage in the same way that objects do.]" (The C++ Standard, 1.8/1).
If the homework question is about C++, then no other definition of object is applicable, not even "anything that is visible or tangible and is relatively stable in form" (dictionary.reference.com). It's not asking for your opinion about OOP principles, it's in effect asking whether ix and ia are variables.
Since it's homework I'll not tell you the answer, but do note that struct { int x; } ix; is not the same thing as struct ix { int x; };.
On the other hand, if the homework assignment is about OOP principles, then knock yourself out with whatever definition your lecturer has given you of "object". Since I don't know what that is, I can't tell you what answer he'll consider correct...
Given the C++ tag, the answer is pretty much "take your choice."
The C standard defines an object as meaning (in essence) anything that has an address, including all instances of native/primitive types (e.g. int). Since C++ depends so heavily on C, that definition still carries some weight in C++. By this definition, essentially every variable is an object, and so are a few other things (e.g. character string literals, dynamically allocated blocks of memory).
In Smalltalk (at rather the opposite extreme) the answer would be none of them is an object -- an object never has public data. Its behavior is defined entirely in terms of responses to messages.
The word "object" is a rather ambiguous specification without some more context, but in general objects have identity, behavior, and state.
Neither ix nor ia have all three; ix fails because it lacks identity or behavior, and ia fails because it has no behavior. Both are essentially just blobs of data.
There are two commonly used definitions of "object" in C++.
One is official according to the C++ standard, and says that everything that has storage allocated for it is an object. A struct is an object, an int is an object, a bool is an object, a pointer is an object, a string literal is an object, and so on. By this definition, ix, ia and x are all objects. But this probably isn't what your teacher meant. You have to be a bit of a language lawyer to use this definition, and it's not that widely known among "average" C++ users. It's also not a very relevant definition for someone just learning the language.
The definition you are probably expected to use is that of an "object" in the object-oriented sense. Here (at least in the C++ family of languages), an object is typically meant to be an instance of a class.
Which leaves the next obvious question: Is an instance of a struct also an object? Depends. In C++, a class and a struct are essentially the same, so semantically, yes, but technically, you're not using the class keyword, so syntactically, probably not.
In short: It's a silly, and badly worded question, and only you know what your teacher means or wants to hear, because you're the one who attended the classes, not us. All we can do is guess at what he thinks defines a class.
These questions are impossible to answer without extra clarification. The question is tagged C++, which means that the language is supposedly C++.
In this case, if the declarations are made in namespace scope, the ix declaration is invalid. It is illegal to use an unnamed class type (which has no linkage) to declare an object with external linkage. The declaration of ix would work in local scope
void foo() {
struct { int x; } ix; // OK, no linkage
}
It might also work if ix was declared with internal linkage at namespace scope
static struct { int x; } ix; // OK? Internal linkage?
although I personally believe that this was intended to be ill-formed as well (Comeau somehow allows it).
But a namespace-scope declaration with external linkage is ill-formed
// In namespace scope
struct { int x; } ix; // ERROR
So, if the namespace scope is assumed and if the above declarations are meant to be taken as a single piece of code, there are no meaningful answers to these questions. The whole code is simply invalid. It is meaningless. It is not C++.
Otherwise, if ix is declared with no linkage (local) or with internal linkage, then ix is an object.
As for ia, it is an object regardless of where it is declared, since the class type is named.
Note though that the notion of object in C++ has nothing to do with classes. Object in C++ is a region of storage (memory). A variable of int type is an object in C++, for one example.
Added later: The bit about legality of ix declaration is an interesting issue. Apparently C++98 allowed such declarations, which was proposed to be outlawed in DR#132. However, later the proposal was rejected (for a rather weird reason) and the things were left as is. Yet, Comeau Online refuses to accept a declaration of an object with external linkage with unnamed type (internal linkage is OK). It could quite possibly be a formal bug in Comeau compiler (not that I'd complain about it).
Added even later: Oh, I see that there's an even later DR#389, which finally outlaws such declarations, but the status of this DR is still CD1.
By my definition, I'd say an object has properties and methods. Both nouns and verbs.
You can kick a ball, you can invade a country, and you can eat, milk, or punch a cow. Those are therefore objects.
You might have a data structure that represents the properties of a ball (radius), country (population), or cow (daily milk output in liters), but that data structure doesn't represent an object in my mind until you tell it how to process pertinent behaviors.
I recognize this definition may not work in 100% of cases, but it's close enough for my needs.
Technically, an object is an instance of a class, but objects' true usefulness lies in their ability to encapsulate information and aid in the design of systems. They are an analysis tool.
An object is an instance of a type (be it POD or class).
As such you are able to extract the address of an object. All objects take up at least one byte. The reason for this is that you don't have to add special code for handling zero-sized objects, because every object in memory has a distinct address (by making everything at least one byte, the compiler will automatically have a unique address for each object).
int main()
{
struct { int x; } ix;
struct A { A() {}; int x; };
A ia;
ix.x = 5; // Assigned a value.
// Thus it has state and thus is an object.
ia.x = 6; // Assigned a value.
}
So they are both objects.
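The size and address claims above can be checked with a short sketch. Empty, have_distinct_addresses and sum_fields are hypothetical names introduced for illustration:

```cpp
struct Empty {};

// Even a class with no members has a nonzero size.
static_assert(sizeof(Empty) >= 1, "every complete object type has size >= 1");

// Two separate objects of the same type never share an address.
bool have_distinct_addresses() {
    Empty first;
    Empty second;
    return &first != &second;
}

// Both variables from the question occupy storage and hold state,
// whether or not their type has a name or a constructor.
int sum_fields() {
    struct { int x; } ix;
    struct A { A() {} int x; };
    A ia;
    ix.x = 5;
    ia.x = 6;
    return ix.x + ia.x;
}
```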
The real answer is "e. Whoever writes code like this should be coached to improve their legibility." Okay, that was a joke.
This question isn't so complex. But elsewhere, I've seen programming tests written purposely complex for the purpose of seeing if you can solve puzzles. It's completely pointless, because code that is that complex shouldn't and usually does not exist. If it's that hard to read, it's poorly written code.
Remember, code is not written for computers. Code is written for the next developer after you to read and understand.
And don't write code just so it works. That's not a high enough standard. The worst junk in the world will run, but it's a nightmare to fix or upgrade.