is static const string member variable always initialized before used? - c++

In C++, if I want to define some non-local const string which can be used in different classes, functions, files, the approaches that I know are:
use define directives, e.g.
#define STR_VALUE "some_string_value"
const class member variable, e.g.
class Demo {
public:
static const std::string ConstStrVal;
};
// then in cpp
std::string Demo::ConstStrVal = "some_string_value";
const class member function, e.g.
class Demo{
public:
static const std::string GetValue(){return "some_string_value";}
};
Now what I am not clear is, if we use the 2nd approach, is the variable ConstStrVal always initialized to "some_string_value" before it is actually used by any code in any case? Im concerned about this because of the "static initialization order fiasco". If this issue is valid, why is everybody using the 2nd approach?
Which is the best approach, 2 or 3? I know that #define directives have no respect of scope, most people don't recommend it.
Thanks!

if we use the 2nd approach, is the variable ConstStrVal always initialized to "some_string_value" before it is actually used by any code in any case?
No
It depends on the value it's initialized to, and the order of initialization. ConstStrVal has a global constructor.
Consider adding another global object with a constructor:
static const std::string ConstStrVal2(ConstStrVal);
The order is not defined by the language, and ConstStrVal2's constructor may be called before ConstStrVal has been constructed.
The initialization order can vary for a number of reasons, but it's often specified by your toolchain. Altering the order of linked object files could (for example) change the order of your image's initialization and then the error would surface.
why is everybody using the 2nd approach?
many people use other approaches for very good reasons…
Which is the best approach, 2 or 3?
Number 3. You can also avoid multiple constructions like so:
class Demo {
public:
static const std::string& GetValue() {
// this is constructed exactly once, when the function is first called
static const std::string s("some_string_value");
return s;
}
};
caution: this is approach is still capable of the initialization problem seen in ConstStrVal2(ConstStrVal). however, you have more control over initialization order and it's an easier problem to solve portably when compared to objects with global constructors.

In general, I (and many others) prefer to use functions to return values rather than variables, because functions give greater flexibility for future enhancement. Remember that most of the time spent on a successful software project is maintaining and enhancing the code, not writing it in the first place. It's hard to predict if your constant today might not be a compile time constant tomorrow. Maybe it will be read from a configuration file some day.
So I recommend approach 3 because it does what you want today and leaves more flexibility for the future.

Avoid using the preprocessor with C++. Also, why would you have a string in a class, but need it in other classes? I would re-evaluate your class design to allow better encapsulation. If you absolutely need this global string then I would consider adding a globals.h/cpp module and then declare/define string there as:
const char* const kMyErrorMsg = "This is my error message!";

Don't use preprocessor directives in C++, unless you're trying to achieve a holy purpose that can't possibly be achieved any other way.
From the standard (3.6.2):
Objects with static storage duration (3.7.1) shall be zero-initialized
(8.5) before any other initialization takes place. A reference with
static storage duration and an object of POD type with static storage
duration can be initialized with a constant expression (5.19); this is
called constant initialization. Together, zero-initialization and
constant initialization are called static initialization; all other
initialization is dynamic initialization. Static initialization shall
be performed before any dynamic initialization takes place. Dynamic
initialization of an object is either ordered or unordered.
Definitions of explicitly specialized class template static data
members have ordered initialization. Other class template static data
members (i.e., implicitly or explicitly instantiated specializations)
have unordered initialization. Other objects defined in namespace
scope have ordered initialization. Objects defined within a single
translation unit and with ordered initialization shall be initialized
in the order of their definitions in the translation unit. The order
of initialization is unspecified for objects with unordered
initialization and for objects defined in different translation units.
So, the fate of 2 depends on whether your variable is static initialised or dynamic initialised. For instance, in your concrete example, if you use const char * Demo::ConstStrVal = "some_string_value"; (better yet const char Demo::ConstStrVal[] if the value will stay constant in the program) you can be sure that it will be initialised no matter what. With a std::string, you can't be sure since it's not a POD type (I'm not dead sure on this one, but fairly sure).
3rd method allows you to be sure and the method in Justin's answer makes sure that there are no unnecessary constructions. Though keep in mind that the static method has a hidden overhead of checking whether or not the variable is already initialised on every call. If you're returning a simple constant, just returning your value is definitely faster since the function will probably be inlined.
All of that said, try to write your programs so as not to rely on static initialisation. Static variables are best regarded as a convenience, they aren't convenient any more when you have to juggle their initialisation orders.

Related

Why can't we declare static variables in the class definition? [duplicate]

I have the following working code:
#include <string>
#include <iostream>
class A {
public:
const std::string test = "42";
//static const std::string test = "42"; // fails
};
int main(void){
A a;
std::cout << a.test << '\n';
}
Is there a good reason why it is not possible to make the test a static const ? I do understand prior to c++11 it was constrained by the standard. I thought that c++11 introduced in-class initializations to make it a little bit friendlier. I also not such semantic are available for integral type since quite some time.
Of course it works with the out-of class initialization in form of const std::string A::test = "42";
I guess that, if you can make it non-static, then the problem lies in one of the two. Initializing it out-of-class scope (normally consts are created during the instantiation of the object). But I do not think this is the problem if you are creating an object independant of any other members of the class. The second is having multiple definitions for the static member. E.g. if it were included in several .cpp files, landing into several object-files, and then the linker would have troubles when linking those object together (e.g. into one executable), as they would contain copies of the same symbol. To my understanding, this is exactly equal to the situation when ones provides the out-of-class right under the class declaration in the header, and then includes this common header in more than one place. As I recall, this leads to linker errors.
However, now the responsibility of handling this is moved onto user/programmer. If one wants to have a library with a static they need to provide a out-of-class definition, compile it into a separate object file, and then link all other object to this one, therefore having only one copy of the binary definition of the symbol.
I read the answers in Do we still need to separately define static members, even if they are initialised inside the class definition? and Why can't I initialize non-const static member or static array in class?.
I still would like to know:
Is it only a standard thing, or there is deeper reasoning behind it?
Can this be worked-around with the constexpr and user-defined
literals mechanisms. Both clang and g++ say the variable cannot have non-literal type. Maybe I can make one. (Maybe for some reason its also a bad idea)
Is it really such a big issue for linker to include only one copy of
the symbol? Since it is static const all should be binary-exact
immutable copies.
Plese also comment if I am missing or missunderstanding something.
Your question sort of has two parts. What does the standard say? And why is it so?
For a static member of type const std::string, it is required to be defined outside the class specifier and have one definition in one of the translation units. This is part of the One Definition Rule, and is specified in clause 3 of the C++ standard.
But why?
The problem is that an object with static storage duration needs unique static storage in the final program image, so it needs to be linked from one particular translation unit. The class specifier doesn't have a home in one translation unit, it just defines the type (which is required to be identically defined in all translation units where it is used).
The reason a constant integral doesn't need storage, is that it is used by the compiler as a constant expression and inlined at point of use. It never makes it to the program image.
However a complex type, like a std::string, with static storage duration need storage, even if they are const. This is because they may need to be dynamically initialized (have their constructor called before the entry to main).
You could argue that the compiler should store information about objects with static storage duration in each translation unit where they are used, and then the linker should merge these definitions at link-time into one object in the program image. My guess for why this isn't done, is that it would require too much intelligence from the linker.

Why can't I make in-class initialized `const const std::string` a static member

I have the following working code:
#include <string>
#include <iostream>
class A {
public:
const std::string test = "42";
//static const std::string test = "42"; // fails
};
int main(void){
A a;
std::cout << a.test << '\n';
}
Is there a good reason why it is not possible to make the test a static const ? I do understand prior to c++11 it was constrained by the standard. I thought that c++11 introduced in-class initializations to make it a little bit friendlier. I also not such semantic are available for integral type since quite some time.
Of course it works with the out-of class initialization in form of const std::string A::test = "42";
I guess that, if you can make it non-static, then the problem lies in one of the two. Initializing it out-of-class scope (normally consts are created during the instantiation of the object). But I do not think this is the problem if you are creating an object independant of any other members of the class. The second is having multiple definitions for the static member. E.g. if it were included in several .cpp files, landing into several object-files, and then the linker would have troubles when linking those object together (e.g. into one executable), as they would contain copies of the same symbol. To my understanding, this is exactly equal to the situation when ones provides the out-of-class right under the class declaration in the header, and then includes this common header in more than one place. As I recall, this leads to linker errors.
However, now the responsibility of handling this is moved onto user/programmer. If one wants to have a library with a static they need to provide a out-of-class definition, compile it into a separate object file, and then link all other object to this one, therefore having only one copy of the binary definition of the symbol.
I read the answers in Do we still need to separately define static members, even if they are initialised inside the class definition? and Why can't I initialize non-const static member or static array in class?.
I still would like to know:
Is it only a standard thing, or there is deeper reasoning behind it?
Can this be worked-around with the constexpr and user-defined
literals mechanisms. Both clang and g++ say the variable cannot have non-literal type. Maybe I can make one. (Maybe for some reason its also a bad idea)
Is it really such a big issue for linker to include only one copy of
the symbol? Since it is static const all should be binary-exact
immutable copies.
Plese also comment if I am missing or missunderstanding something.
Your question sort of has two parts. What does the standard say? And why is it so?
For a static member of type const std::string, it is required to be defined outside the class specifier and have one definition in one of the translation units. This is part of the One Definition Rule, and is specified in clause 3 of the C++ standard.
But why?
The problem is that an object with static storage duration needs unique static storage in the final program image, so it needs to be linked from one particular translation unit. The class specifier doesn't have a home in one translation unit, it just defines the type (which is required to be identically defined in all translation units where it is used).
The reason a constant integral doesn't need storage, is that it is used by the compiler as a constant expression and inlined at point of use. It never makes it to the program image.
However a complex type, like a std::string, with static storage duration need storage, even if they are const. This is because they may need to be dynamically initialized (have their constructor called before the entry to main).
You could argue that the compiler should store information about objects with static storage duration in each translation unit where they are used, and then the linker should merge these definitions at link-time into one object in the program image. My guess for why this isn't done, is that it would require too much intelligence from the linker.

What are the rules regarding initialization of non-local statics?

Suppose I have a class whose only purpose is the side-effects caused during construction of its objects (e.g., registering a class with a factory):
class SideEffectCauser {
public:
SideEffectCauser() { /* code causing side-effects */ }
};
Also suppose I'd like to have an object create such side-effects once for each of several translation units. For each such translation unit, I'd like to be able to just put an a SideEffectCauser object at namespace scope in the .cpp file, e.g.,
SideEffectCauser dummyGlobal;
but 3.6.2/3 of the C++03 standard suggests that this object need not be constructed at all unless an object or function in the .cpp file is used, and articles such as this and online discussions such as this suggest that such objects are sometimes not initialized.
On the other hand, Is there a way to instantiate objects from a string holding their class name? has a solution that is claimed to work, and I note that it's based on using an object of a type like SideEffectCauser as a static data member, not as a global, e.g.,
class Holder {
static SideEffectHolder dummyInClass;
};
SideEffectHolder Holder::dummyInClass;
Both dummyGlobal and dummyInClass are non-local statics, but a closer look at 3.6.2/3 of the C++03 standard shows that that passage applies only to objects at namespace scope. I can't actually find anything in the C++03 standard that says when non-local statics at class scope are dynamically initialized, though 9.4.2/7 suggests that the same rules apply to them as to non-local statics at namespace scope.
Question 1: In C++03, is there any reason to believe that dummyInClass is any more likely to be initialized than dummyGlobal? Or may both go uninitialized if no functions or objects in the same translation unit are used?
Question 2: Does anything change in C++11? The wording in 3.6.2 and 9.4.2 is not the same as the C++03 versions, but, from what I can tell, there is no behavioral difference specified for the scenarios I describe above.
Question 3: Is there a reliable way to use objects of a class like SideEffectHolder outside a function body to force side-effects to take place?
I think the only reliable solution is to design this for specific compiler(s) and runtime. No standard covers the initialization of globals in a shared library which I think is the most intricate case, as this is much dependent on the loader and thus OS dependent.
Q1: No
Q2: Not in any practical sense
Q3: Not in a standard way
I'm using something similar with g++ / C++11 under Linux and get my factories registered as expected. I'm not sure why you wouldn't get the functions called. If what you describes is to be implemented it will mean that every single function in that unit has to call the initialization function. I'm not too sure how that could be done. My factories are also inside namespaces, although it is named namespaces. But I don't see why it wouldn't be called.
namespace snap {
namespace plugin_name {
class plugin_name_factory {
public:
plugin_name_factory() { plugin_register(this, name); }
...
} g_plugin_name_factory;
}
}
Note that the static keyword should not be used anymore in C++ anyway. It is often slower to have a static definition than a global.

When are static C++ class members initialized?

There appears to be no easy answer to this, but are there any assumptions that can be safely made about when a static class field can be accessed?
EDIT: The only safe assumption seems to be that all statics are initialized before the program commences (call to main). So, as long as I don't reference statics from other static initialization code, I should have nothing to worry about?
The standard guarantees two things - that objects defined in the same translation unit (usually it means .cpp file) are initialized in order of their definitions (not declarations):
3.6.2
The storage for objects with static storage duration (basic.stc.static) shall be zero-initialized (dcl.init) before any other initialization takes place. Zero-initialization and initialization with a constant expression are collectively called static initialization; all other initialization is dynamic initialization. Objects of POD types (basic.types) with static storage duration initialized with constant expressions (expr.const) shall be initialized before any dynamic initialization takes place. Objects with static storage duration defined in namespace scope in the same translation unit and dynamically initialized shall be initialized in the order in which their definition appears in the translation unit.
The other guaranteed thing is that initialization of static objects from a translation unit will be done before use of any object or function from this translation unit:
It is implementation-defined whether or not the dynamic initialization (dcl.init, class.static, class.ctor, class.expl.init) of an object of namespace scope is done before the first statement of main. If the initialization is deferred to some point in time after the first statement of main, it shall occur before the first use of any function or object defined in the same translation unit as the object to be initialized.
Nothing else i guaranteed (especially order of initialization of objects defined in different translation units is implementation defined).
EDIT
As pointed in Suma's comment, it is also guaranteed that they are initialized before main is entered.
They're initialized before the program starts (i.e. before main is entered).
When there are two or more definitions (of static data) in a single CPP file, then they're initialized in the sequence in which they're defined in the file (the one defined earlier/higher in the file is initialized before the next one is).
When there are two or more definitions (of static data) in more than one CPP file, the sequence in which the CPP files are processed is undefined/implementation-specific. This is a problem if the constructor of a global variable (called before the program is started) references another global variable defined in a different CPP file, which might not have been constructed yet. However, item 47 of Meyers' Effective C++ (which is titled Ensure that global objects are initialized before they're used) does describes a work-around ...
Define a static variable in a header file (it's static so you can have multiple instances of it without the linker complaining)
Have the constructor of that variable invoke whatever you need it to (in particular, construct the global singletons declared in the headers)
... which it says is a technique which may be used in some system header files e.g. to ensure that the cin global variable is initialized before even your static variables' constructors use it.
Your final conclusion in the Edit is correct. But the problem is the class static themselves. It's easier to say that my code will have class static members that don't refer to other global data/ class static members but once you take this route, things will go wrong soon. One approach that I have found useful in practice to not have class static data members but class static wrapper methods. These methods can then hold the static object within itself. For e.g.
TypeX* Class2::getClass1Instance()
{
static TypeX obj1;
return &obj1;
}
Note: An earlier answer says:
The other guaranteed thing is that initialization of static objects
from a translation unit will be done before use of any object or
function from this translation unit
This is not completely correct and the standard is incorrectly inferred here. This may not hold true if the function from a translation unit is called before main is entered.
I believe it can be accessed anytime during the execution. What remains undefined is the initialization order of the static variables.
They can be initialized in an implementation file (.c/cpp/cc) files. Dont initialize them in .h as compiler will complain about multiple definitions.
They are typically initialized before main, however order is uknown, hence avoid dependencies. They can certainly be accessed within member function. Keep in mind, order of initialization is unknown for static members. I would suggest to encapsulate a static member into the static function that will check if the member has been initialized.
There isn't a totally trivial answer to this question, but basically they're initialized just before control is passed to the entry point (main) of your program. The order in which they are initialized is (to my knowledge) undefined and may be compiler specific.
EDIT: To clarify, your added assumption is correct. As long as you're only accessing it post main-entry, you don't really have to worry about when/how it's initialized. It will be initialized by that time.
i think the main thread of a proccess will execute the following five steps in order
initialization of CRT library
static initialization
execution of main() function
static unitialization
unitialization of CRT library
you want reference statics from other static initialization code?
maybe the following codes work:
class A;
static auto_ptr<A> a(auto_ptr<A>(&GetStaticA()));
A &GetStaticA(void)
{
static A *a = NULL; //the static basic type variables initialized with constant experession will be initialized earlier than the other static ones
if (a == NULL)
{
a = new A();
return *a;
}
}

Difference between static in C and static in C++??

What is the difference between the static keyword in C and C++?
The static keyword serves the same purposes in C and C++.
When used at file level (outside of a function), it sets the visibility of the item it's applied to. Static items are not visible outside of their compilation unit (e.g., to the linker). Their duration is the same as the duration of the program.
These file-level items (functions and data) should be static unless there's a specific need to access them from outside (and there's almost never a need to give direct access to data since that breaks the central tenet of encapsulation).
If (as your comment to the question indicates) this is the only use of static you're concerned with then, no, there is no difference between C and C++.
When used within a function, it sets the duration of the item. Again, the duration is the same as the program and the item continues to exist between invocations of that function.
It does not affect the visibility of that item since it's visible only within the function. An example is a random number generator that needs to keep its seed value between invocations but doesn't want that value visible to other functions.
C++ has one more use, static within a class. When used there, it becomes a single class variable that's common across all objects of that class. One classic example is to store the number of objects that have been instantiated for a given class.
As others have pointed out, the use of file-level static has been deprecated in favour of unnamed namespaces. However, I believe it'll be a cold day in a certain warm place before it's actually removed from the language - there's just too much code using it at the moment. And ISO C have only just gotten around to removing gets() despite the amount of time we've all known it was a dangerous function.
And even though it's deprecated, that doesn't change its semantics now.
The use of static at the file scope to restrict access to the current translation unit is deprecated in C++, but still acceptable in C.
Instead, use an unnamed namespace
namespace
{
int file_scope_x;
}
Variables declared this way are only available within the file, just as if they were declared static.
The main reason for the deprecation is to remove one of the several overloaded meanings of the static keyword.
Originally, it meant that the variable, such as in a function, would be given storage for the lifetime of the program in an area for such variables, and not stored on the stack as is usual for function local variables.
Then the keyword was overloaded to apply to file scope linkage. It's not desirable to make up new keywords as needed, because they might break existing code. So this one was used again with a different meaning without causing conflicts, because a variable declared as static can't be both inside a function and at the top level, and functions didn't have the modifier before. (The storage connotation is totally lost when referring to functions, as they are not stored anywhere.)
When classes came along in C++ (and in Java and C#) the keyword was used yet again, but the meaning is at least closer to the original intention. Variables declared this way are stored in a global area, as opposed to on the stack as for function variables, or on the heap as for object members. Because variables cannot be both at the top level and inside a class definition, extra meaning can be unambiguously attached to class variables. They can only be referenced via the class name or from within an object of that class.
It has the same meaning in both languages.
But C++ adds classes. In the context of a class (and thus a struct) it has the extra meaning of making the method/variable class members rather members of the object.
class Plop
{
static int x; // This is a member of the class not an instance.
public:
static int getX() // method is a member of the class.
{
return x;
}
};
int Plop::x = 5;
Note that the use of static to mean "file scope" (aka namespace scope) is only deoprecated by the C++ Standard for objects, not for functions. In other words,:
// foo.cpp
static int x = 0; // deprecated
static int f() { return 1; } // not deprecated
To quote Annex D of the Standard:
The use of the static keyword is
deprecated when declaring objects in
namespace scope.
You can not declare a static variable inside structure in C... But allowed in Cpp with the help of scope resolution operator.
Also in Cpp static function can access only static variables but in C static function can have static and non static variables...😊