Why can't we declare static variables in the class definition? [duplicate] - c++

I have the following working code:
#include <string>
#include <iostream>
class A {
public:
const std::string test = "42";
//static const std::string test = "42"; // fails
};
int main(void){
A a;
std::cout << a.test << '\n';
}
Is there a good reason why it is not possible to make test a static const? I understand that prior to C++11 this was constrained by the standard. I thought that C++11 introduced in-class initialization to make this a little friendlier. I also note that such semantics have been available for integral types for quite some time.
Of course it works with the out-of-class initialization in the form const std::string A::test = "42";
I guess that, if you can make it non-static, then the problem lies in one of two things. The first is initializing it outside the class scope (normally consts are created during the instantiation of the object). But I do not think this is the problem if you are creating an object independent of any other members of the class. The second is having multiple definitions of the static member. E.g. if the header were included in several .cpp files, the definition would land in several object files, and the linker would have trouble linking those objects together (e.g. into one executable), as they would contain copies of the same symbol. To my understanding, this is exactly the situation when one provides the out-of-class definition right under the class declaration in the header, and then includes this common header in more than one place. As I recall, this leads to linker errors.
However, now the responsibility for handling this is moved onto the user/programmer. If one wants to have a library with a static member, one needs to provide an out-of-class definition, compile it into a separate object file, and then link all other objects against it, thereby having only one copy of the binary definition of the symbol.
I read the answers in Do we still need to separately define static members, even if they are initialised inside the class definition? and Why can't I initialize non-const static member or static array in class?.
I still would like to know:
Is it only a standard thing, or is there deeper reasoning behind it?
Can this be worked around with the constexpr and user-defined literals mechanisms? Both clang and g++ say the variable cannot have a non-literal type. Maybe I can make one. (Maybe for some reason it's also a bad idea.)
Is it really such a big issue for the linker to include only one copy of the symbol? Since it is static const, all copies should be binary-exact and immutable.
Please also comment if I am missing or misunderstanding something.

Your question sort of has two parts. What does the standard say? And why is it so?
A static member of type const std::string must be defined outside the class specifier, with exactly one definition in one of the translation units. This follows from the One Definition Rule, which is specified in clause 3 of the C++ standard (as numbered in C++11, [basic.def.odr]).
But why?
The problem is that an object with static storage duration needs unique static storage in the final program image, so it needs to be linked from one particular translation unit. The class specifier doesn't have a home in one translation unit, it just defines the type (which is required to be identically defined in all translation units where it is used).
A const integral static member doesn't need storage as long as it is only used as a constant expression: the compiler inlines its value at the point of use, so it never makes it to the program image.
However, an object of a complex type like std::string with static storage duration needs storage, even if it is const. This is because it may need to be dynamically initialized (have its constructor called before the entry to main).
You could argue that the compiler should store information about objects with static storage duration in each translation unit where they are used, and then the linker should merge these definitions at link-time into one object in the program image. My guess for why this isn't done, is that it would require too much intelligence from the linker.
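The rule described above can be sketched roughly as follows. This is a minimal single-file illustration, assuming the class would normally sit in a header and the definition in exactly one .cpp file; get_test is a hypothetical helper added only for the example.

```cpp
#include <string>

// Header part: the class only declares the static member.
class A {
public:
    static const std::string test;  // declaration, no storage yet
};

// Source-file part: exactly one translation unit provides the definition.
// The std::string constructor runs during dynamic initialization, before main.
const std::string A::test = "42";

// Hypothetical helper: reads the member without creating any A object.
const std::string& get_test() { return A::test; }
```

If the definition line were placed in the header instead, every translation unit including it would define A::test, which is the multiple-definition situation the question describes.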

Related

Why C++ static data members are needed to define but non-static data members do not?

I am trying to understand the difference between the declaration and definition of static and non-static data members. Apologies if I have fundamentally misunderstood the concepts. Your explanations are highly appreciated.
Code Trying to understand
class A
{
public:
int ns; // declare non-static data member.
static int s; // declare static data member.
void foo();
};
int A::s; // define static data member.
// int A::ns; //This gives an error if defined.
void A::foo()
{
ns = 10;
s = 5; // if s is not defined this gives an error 'undefined reference'
}
When you declare something, you're telling the compiler that the name being declared exists and what kind of name it is (type, variable, function, etc.) The definition could be with the declaration (as with your class A) or be elsewhere—the compiler and linker will have to connect the two later.
The key point of a variable or function definition is that it tells the compiler and linker where this variable/function will live. If you have a variable, there needs to be a place in memory for it. If you have a function, there needs to be a place in the binary containing the function's instructions.
For non-static data members, the declaration is also the definition. That is, you're giving them a place to live¹. This place is within each instance of the class. Every time you make a new A object, it comes with an ns as part of it.
Static data members, on the other hand, have no associated object. Without a definition, you've got a situation where you have N instances of A all sharing the same s, but nowhere to put s. Therefore, C++ makes you choose one translation unit for it via a definition, most often the source file that accompanies that header.
You could argue that the compiler should just pick one instance for it, but this won't work for various reasons, one being that you can use static data members before ever creating an instance, after the last instance is gone, or without having instances at all.
Now you might wonder why the compiler and linker still can't just figure it out on their own, and... that's actually pretty much what happens if you slap an inline on the variable (since C++17) or function. You can end up with multiple definitions, but only one will be chosen.
1: Giving them a place to live is a little beside the point here. All the compiler needs to know when it creates an object of that class is how much space to give it and which parts of that space are which data members. You could think of it as the compiler doing the definition part for you since there's only one place that data member could possibly live.
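As a rough illustration of the points above, the sketch below uses a trimmed-down version of the question's class A. The helper functions instances_share_s and use_without_instance are hypothetical names introduced only for this example, and the size check assumes a typical platform where a struct holding a single int has no padding.

```cpp
struct A {
    int ns;        // non-static: lives inside every A object
    static int s;  // static: declared here, lives outside every A object
};

int A::s = 0;  // the one definition that gives s a place to live

// The static member does not contribute to the size of an A object.
static_assert(sizeof(A) == sizeof(int), "s is not stored inside A objects");

// Every instance refers to the same s, but each has its own ns.
bool instances_share_s() {
    A a{1}, b{2};
    a.s = 5;
    return &a.s == &b.s && b.s == 5 && &a.ns != &b.ns;
}

// s can be used without any instance existing at all.
int use_without_instance() {
    A::s = 7;
    return A::s;
}
```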
static members are essentially global variables with a special name and access rules tied to the class. Hence, they inherit all the problems for usual global variables. Namely, in the whole C++ program (which is the union of all translation units aka .cpp files) there should be exactly one definition of each global variable, no more.
You can think of "variable definition" as "the place which will allocate memory for the variable".
However, classes are typically defined in a header file (.h/.hpp/etc) which is included in multiple translation units. So it's up to the programmer to specify which translation unit actually defines the variable. Note that since C++17 the inline keyword can also be applied to variables, which moves this burden to the compiler and linker; look for "inline variables". The naming is weird for historical reasons.
However, non-static members do not really exist until you create an instance of the class, i.e. an object. And it's the object lifetime and storage duration which define how each individual member is created/stored/destroyed. So there is no need to actually define them anywhere outside of the class.
static variables belong to the class definition. non-static variables belong to the instances created with the class definition.
int main()
{
A::s = 5; // this is ok
A a;
a.ns = 5; // this is also ok
}

Multiple static const int class variables in DLL [duplicate]

In the class:
class foo
{
public:
static int bar; //declaration of static data member
};
int foo::bar = 0; //definition of data member
We have to explicitly define the static variable, otherwise it will result in an
undefined reference to 'foo::bar'
My question is:
Why do we have to give an explicit definition of a static variable?
Please note that this is NOT a duplicate of previously asked undefined reference to static variable questions. This question intends to ask the reason behind explicit definition of a static variable.
From the beginning, the C++ language, just like C, was built on the principle of independent translation. Each translation unit is compiled by the compiler proper independently, without any knowledge of other translation units. The whole program only comes together later, at the linking stage. The linking stage is the earliest stage at which the entire program is seen by the linker (it is seen as a collection of object files prepared by the compiler proper).
In order to support this principle of independent translation, each entity with external linkage has to be defined in one translation unit, and in only one translation unit. The user is responsible for distributing such entities between different translation units. It is considered a part of user intent, i.e. the user is supposed to decide which translation unit (and object file) will contain each definition.
The same applies to static members of the class. Static members of the class are entities with external linkage. The compiler expects you to define that entity in some translation unit. The whole purpose of this feature is to give you the opportunity to choose that translation unit. The compiler cannot choose it for you. It is, again, a part of your intent, something you have to tell the compiler.
This is no longer as critical as it used to be, since the language is now designed to deal with (and eliminate) large numbers of identical definitions (templates, inline functions, etc.), but the One Definition Rule is still rooted in the principle of independent translation.
In addition to the above, in C++ language the point at which you define your variable will determine the order of its initialization with regard to other variables defined in the same translation unit. This is also a part of user intent, i.e. something the compiler cannot decide without your help.
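The initialization-order point can be sketched like this. The names init_log, record and Registry are hypothetical and exist only for this illustration; the sketch assumes both definitions live in the same translation unit.

```cpp
#include <string>
#include <vector>

// Hypothetical log. A function-local static is constructed on first use,
// so it is ready before any of the members below are initialized.
std::vector<std::string>& init_log() {
    static std::vector<std::string> log;
    return log;
}

// Records which initializer ran, then passes the value through.
int record(const char* name, int value) {
    init_log().push_back(name);
    return value;
}

struct Registry {
    static int first;
    static int second;
};

// Within one translation unit, dynamic initialization follows the order
// of these definitions, so `second` can safely read `first`.
int Registry::first = record("first", 1);
int Registry::second = record("second", Registry::first + 1);
```

Swapping the two definition lines would make second read first before its initializer has run, which is why the placement of the definitions is part of what the programmer decides.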
Starting from C++17 you can declare your static members as inline. This eliminates the need for a separate definition. By declaring them in that fashion you effectively tell the compiler that you don't care where this member is physically defined and, consequently, don't care about its initialization order.
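A minimal sketch of the C++17 form, assuming a compiler run in C++17 mode or later. The struct and the bump helper are illustrative names, not taken from the answers above.

```cpp
#include <string>

struct A {
    // C++17: with `inline`, the in-class declaration is also the definition.
    // The header may be included in many translation units; the linker
    // keeps a single copy of each variable.
    inline static const std::string test = "42";
    inline static int counter = 0;
};

// Hypothetical helper: modifies the single shared counter.
int bump() { return ++A::counter; }
```

No out-of-class line such as const std::string A::test = "42"; is needed in any .cpp file with this form.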
In early C++ it was allowed to define static data members inside the class, which certainly violates the idea that a class is only a blueprint and does not set memory aside. This has been dropped now.
Putting the definition of a static member outside the class emphasizes that memory is allocated only once for the static data member (its storage is reserved in the program image, not in each object). No object of that class has its own copy.
static is a storage type: when you declare the variable you are telling the compiler "this will be in the data section somewhere", and when you subsequently use it, the compiler emits code that loads a value from a TBD address.
In some contexts, the compiler can deduce that a static is really a compile-time constant and replace it with such, for example
static const int meaning = 42;
Inside a function that never takes the address of the value.
When dealing with class members, however, the compiler can't guess where this value should be created. It might be in a library you will link against, or a DLL, or you might be providing a library where the value must be provided by the library consumer.
Usually, when someone asks this, though, it is because they are misusing static members.
If all you want is a constant value, e.g.
static int MaxEntries;
...
int Foo::MaxEntries = 10;
You would be better off with one or other of the following
static const int MaxEntries = 10;
// or
enum { MaxEntries = 10 };
The static const version requires no separate definition until something tries to take the address of, or form a reference to, the variable; the enum version never does.
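The two alternatives can be sketched side by side. MaxSlots, the two array members and total_capacity are hypothetical additions for the example; both constants are used only by value here, so neither needs an out-of-class definition.

```cpp
struct Foo {
    static const int MaxEntries = 10;  // in-class initializer, usable as a constant
    enum { MaxSlots = 10 };            // enumerator: a pure value, never an object

    int entries[MaxEntries];  // array bounds must be constant expressions
    int slots[MaxSlots];
};

static_assert(Foo::MaxEntries == 10, "usable at compile time");
static_assert(sizeof(Foo::entries) / sizeof(int) == 10, "array sized by the constant");

// Reading the values does not require storage for either constant.
int total_capacity() { return Foo::MaxEntries + Foo::MaxSlots; }
```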
Inside the class you are only declaring the variable, i.e. you tell the compiler that there is something with this name.
However, a static variable must get some memory space to live in, and this must be inside one translation unit. The compiler reserves this space only when you DEFINE the variable.
A structure is not a variable, but an instance of it is. Hence we can include the same structure declaration in multiple modules, but we cannot have the same instance name defined globally in multiple modules.
A static variable of a structure is essentially a global variable. If we defined it in the structure declaration itself, we would not be able to use the structure declaration in multiple modules, because that would result in the same global name (of the static variable) being defined in multiple modules, causing the linker error "Multiple definitions of same symbol".

Definition of a class's private integral constant: in the header or in the cpp file?

Subject has been addressed mostly here (Where to declare/define class scope constants in C++?)
and in particular here.
What I would like to fully understand, in case of integral constants, is there any difference between:
//In the header
class A {
private:
static const int member = 0; //Declaration and definition
};
And:
//In the header
class A {
private:
static const int member; //Only declaration
};
//In the cpp
const int A::member = 0; //Definition
(I understand that the second might have the advantage that if I change the value of the constant, I have to recompile only one file)
Side questions:
What happens, for example, with an inline method defined in the header that accesses member? Will it simply not be inlined? What would happen if, going to one extreme, all methods were defined in the header file as inline methods and all constants were defined in the cpp file?
Edit:
My apologies: I thought it was not necessary, but I missed the fact that the member is static. My question stays, but now the code is legal.
If, as it was before the question was changed to make it static, it's a non-static member, then it can only be initialised in the constructor's initialiser list or (since 2011) in the member's declaration. Your second example was ill-formed.
If it's static, then you need a definition if it's odr-used: roughly speaking, if you do anything that requires its address rather than just its value. If you only use the value, then the first example is fine. But note that the comment is wrong - it's just a declaration, not a definition.
If you do need a definition, then it's up to you whether you specify the value in the declaration or the definition. Specifying it in the declaration allows better scope for optimisation, since the value is always available when the variable is used. Specifying it in the definition gives better encapsulation, only requiring one translation unit to be recompiled if it changes.
What happens, for example, with an inline method defined in the header that accesses member? Will it simply not be inlined?
There's no reason why accessing a data object defined in another translation unit should prevent a function from being inlined.
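The odr-use distinction from this answer can be sketched as follows. The member is made public and given the value 7 purely for illustration, and by_value, by_address and by_reference are hypothetical helpers.

```cpp
struct A {
    static const int member = 7;  // declaration with an initializer
};

// Definition without an initializer. It is only required because the
// functions below need the member's address, not just its value.
const int A::member;

// Uses only the value: this alone would not require the definition.
int by_value() { return A::member; }

// Taking the address or binding a reference is an odr-use, so the
// definition above must exist in exactly one translation unit.
const int* by_address() { return &A::member; }
const int& by_reference() { return A::member; }
```

Removing the definition line while keeping by_address or by_reference would typically produce the "undefined reference" linker error discussed in the other threads.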
There are two points of view to take into account, namely visibility and addressing.
Note that the two are orthogonal, for you can actually declare the variable as initialized and still define it in a translation unit so it has an effective address in memory.
Visibility
Visibility affects the usage of the variable, and has some technical impacts.
For usage in template code as a non-type template parameter, the value must be visible at the point of use. Also, in C++11, this might be necessary for constexpr usage. Otherwise, it is not necessary that the value be visible.
Technically, a visible value can trigger optimizations in the compiler. For example, if (A::member) is trivially false when member is 0, so the test can be elided. This is generally referred to as Constant Propagation. While this may seem a good thing at first glance, it has a profound impact: all clients of the header file potentially depend on this value, and thus any change to this value means they should be recompiled. If you deliver this header as part of a shared library, this means that changing this value breaks the ABI.
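A small sketch of why the value has to be visible for template use. Buffer, MemberBuffer and branch_on_member are hypothetical names, and the member is public with the value 4 only to keep the example short.

```cpp
struct A {
    static const int member = 4;  // value visible to every includer of the header
};

// A non-type template parameter needs the value at compile time.
template <int N>
struct Buffer {
    int data[N];
    static constexpr int capacity() { return N; }
};

// Works only because the initializer is visible in the class definition.
using MemberBuffer = Buffer<A::member>;
static_assert(MemberBuffer::capacity() == 4, "value known at compile time");

// With the value visible, the compiler is free to fold this branch away.
int branch_on_member() {
    if (A::member) return 1;
    return 0;
}
```

If member were only declared in the header and given its value in a .cpp file, the Buffer<A::member> line would not compile in other translation units, since the value would not be a usable constant there.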
Addressing
The rule here is quite simple: if the variable can be addressed (either passed by pointer or reference), then it needs to reside somewhere in memory. This requires a definition in one translation unit.
This is a question of data hiding: whether or not you want to unveil internal class fields. If you are shipping a class library and want to hide the implementation details, then it is better to show as few entities as possible in the interface; then even a declaration of the private field member is too much.
I would just declare this value as a static variable inside a .cpp file.
