According to Bjarne Stroustrup:
if (and only if) you use an initialized member in a way that requires it to be stored as an object in memory, the member must be (uniquely) defined somewhere. The initializer may not be repeated.
(The C++ Programming Language, 3rd Edition, Section 10.4.6.1)
He gives this example:
class curious{
public:
static const int c1=7;
//..
};
const int curious::c1; //necessary
Then why is it necessary to define a static member, given that we may not be initializing it at all?
Also, const and reference members are not defined anywhere outside the class, even though it is necessary to initialize them (no default constructor).
If you don't use c1 in a way that requires it to be stored in memory (such as taking the address, etc) the compiler can replace all uses of c1 with the value 7. However, if you use it in such a way that it needs to be stored somewhere, then you have to provide a definition so that it exists in some compilation unit.
Non-static member variables are not defined anywhere separately because they exist inside the object when it is created; each member variable lives inside the object that is created. static variables exist apart from any object instance (that is, the static variable exists regardless of whether the class is instantiated or not), so they (sometimes) need somewhere to live that is independent of a specific instance.
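As a minimal sketch of the distinction above (the class name Curious and the two helper functions are made up for illustration): reading only the value needs no storage, while taking the address does, so the out-of-class definition has to exist.

```cpp
// Hypothetical names, for illustration only.
struct Curious {
    static const int c1 = 7;   // declaration with an in-class initializer
};

// Definition: gives c1 storage in exactly one translation unit.
// The initializer is not repeated here.
const int Curious::c1;

// Uses only the value; the compiler may simply substitute 7.
int value_only() { return Curious::c1 + 1; }

// Takes the address, so c1 must exist as an object in memory.
// Without the definition above this typically fails at link time.
const int* address_of() { return &Curious::c1; }
```

Calling value_only() works with or without the definition on most compilers, whereas address_of() is the kind of use that makes the definition necessary.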
I am trying to understand the difference between the declaration and definition of static and non-static data members. Apologies if I have fundamentally misunderstood the concepts. Your explanations are highly appreciated.
The code I am trying to understand:
class A
{
public:
int ns; // declare non-static data member.
static int s; // declare static data member.
void foo();
};
int A::s; // define static data member.
// int A::ns; //This gives an error if defined.
void A::foo()
{
ns = 10;
s = 5; // if s is not defined this gives an error 'undefined reference'
}
When you declare something, you're telling the compiler that the name being declared exists and what kind of name it is (type, variable, function, etc.) The definition could be with the declaration (as with your class A) or be elsewhere—the compiler and linker will have to connect the two later.
The key point of a variable or function definition is that it tells the compiler and linker where this variable/function will live. If you have a variable, there needs to be a place in memory for it. If you have a function, there needs to be a place in the binary containing the function's instructions.
For non-static data members, the declaration is also the definition. That is, you're giving them a place to live¹. This place is within each instance of the class. Every time you make a new A object, it comes with an ns as part of it.
Static data members, on the other hand, have no associated object. Without a definition, you've got a situation where you have N instances of A all sharing the same s, but nowhere to put s. Therefore, C++ makes you choose one translation unit for it via a definition, most often the source file that accompanies that header.
You could argue that the compiler should just pick one instance for it, but this won't work for various reasons, one being that you can use static data members before ever creating an instance, after the last instance is gone, or without having instances at all.
Now you might wonder why the compiler and linker still can't just figure it out on their own, and... that's actually pretty much what happens if you slap an inline on the variable or function. You can end up with multiple definitions, but only one will be chosen.
1: Giving them a place to live is a little beside the point here. All the compiler needs to know when it creates an object of that class is how much space to give it and which parts of that space are which data members. You could think of it as the compiler doing the definition part for you since there's only one place that data member could possibly live.
static members are essentially global variables with a special name and access rules tied to the class. Hence, they inherit all the problems of ordinary global variables. Namely, in the whole C++ program (which is the union of all translation units, roughly the .cpp files) there should be exactly one definition of each global variable, no more.
You can think of "variable definition" as "the place which will allocate memory for the variable".
However, classes are typically defined in a header file (.h/.hpp/etc) which is included in multiple translation units. So it's up to the programmer to specify which translation unit actually defines the variable. Note that since C++17 the inline keyword can be applied to variables, which shifts this burden onto the compiler and linker; look for "inline variables". The naming is weird for historical reasons.
However, non-static members do not really exist until you create an instance of the class, i.e. an object. And it's the object lifetime and storage duration which define how each individual member is created/stored/destroyed. So there is no need to actually define them anywhere outside of the class.
static variables belong to the class itself. Non-static variables belong to the instances created from the class definition.
int main()
{
A::s = 5; // this is ok
A a;
a.ns = 5; // this is also ok
}
Must I provide an explicit initializer for the definition of static members of integral type outside the class body, or can I safely omit it? Omitting the initializer and accessing the value seems to return 0 every time, which suggests that it is indeed value-initialized and the initializer can be omitted. What does the standard say about this?
Static members of the class are entities with external linkage. The compiler expects you to define that entity in some translation unit. The whole purpose of this feature is to give you the opportunity to choose that translation unit. The compiler cannot choose it for you. It is, again, a part of your intent, something you have to tell the compiler.
Early C++ allowed static data members to be defined inside the class, which certainly violates the idea that a class is only a blueprint and does not set memory aside. This has since been dropped.
Putting the definition of a static member outside the class emphasizes that memory is allocated only once for a static data member (it has static storage duration). Each object of that class doesn't have its own copy.
Starting from C++17 you can declare your static members as inline. This eliminates the need for a separate definition. By declaring them in that fashion you effectively tell the compiler that you don't care where this member is physically defined and, consequently, don't care about its initialization order.
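A minimal sketch of the C++17 form described above, assuming a compiler in C++17 mode or later. Settings, retries and bump_retries are hypothetical names.

```cpp
// Requires C++17 or later. All names are hypothetical.
struct Settings {
    // 'inline' makes this declaration a definition as well, so the header
    // can be included in many translation units without a separate
    // "int Settings::retries = 3;" line in some .cpp file.
    inline static int retries = 3;
};

// Every translation unit that includes the header sees the same object.
int bump_retries() { return ++Settings::retries; }
```

The linker keeps a single copy of Settings::retries even if the header is included in several source files.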
In C++, if I want to define some non-local const string which can be used in different classes, functions, files, the approaches that I know are:
1. use #define directives, e.g.
#define STR_VALUE "some_string_value"
2. a static const class member variable, e.g.
class Demo {
public:
static const std::string ConstStrVal;
};
// then in cpp
const std::string Demo::ConstStrVal = "some_string_value";
3. a static class member function, e.g.
class Demo{
public:
static const std::string GetValue(){return "some_string_value";}
};
Now what I am not clear about is this: if we use the 2nd approach, is the variable ConstStrVal always initialized to "some_string_value" before it is actually used by any code, in every case? I'm concerned about this because of the "static initialization order fiasco". If this issue is valid, why is everybody using the 2nd approach?
Which is the best approach, 2 or 3? I know that #define directives do not respect scope, and most people don't recommend them.
Thanks!
if we use the 2nd approach, is the variable ConstStrVal always initialized to "some_string_value" before it is actually used by any code in any case?
No
It depends on the value it's initialized to, and the order of initialization. ConstStrVal has a global constructor.
Consider adding another global object with a constructor:
static const std::string ConstStrVal2(Demo::ConstStrVal);
If the two objects are defined in different translation units, the language does not define their relative order, and ConstStrVal2's constructor may be called before ConstStrVal has been constructed.
The initialization order can vary for a number of reasons, but it's often specified by your toolchain. Altering the order of linked object files could (for example) change the order of your image's initialization and then the error would surface.
why is everybody using the 2nd approach?
many people use other approaches for very good reasons…
Which is the best approach, 2 or 3?
Number 3. You can also avoid multiple constructions like so:
class Demo {
public:
static const std::string& GetValue() {
// this is constructed exactly once, when the function is first called
static const std::string s("some_string_value");
return s;
}
};
Caution: this approach is still capable of the initialization problem seen in ConstStrVal2(ConstStrVal). However, you have more control over initialization order, and it's an easier problem to solve portably than with objects that have global constructors.
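A short sketch of why the function-local static helps with construction order. The names greeting and greeting_copy are hypothetical; destruction order at program exit is a separate concern that this does not address.

```cpp
#include <string>

// Hypothetical accessor in the style of approach 3.
const std::string& greeting() {
    // Constructed on the first call, whenever that happens.
    static const std::string s("some_string_value");
    return s;
}

// A namespace-scope object whose dynamic initializer depends on the string.
// Calling the function forces the string to be constructed first,
// regardless of which translation unit is initialized first.
const std::string greeting_copy(greeting());
```

Because greeting_copy reaches the string through the function rather than through another global object, its initializer never sees an unconstructed std::string.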
In general, I (and many others) prefer to use functions to return values rather than variables, because functions give greater flexibility for future enhancement. Remember that most of the time spent on a successful software project is maintaining and enhancing the code, not writing it in the first place. It's hard to predict if your constant today might not be a compile time constant tomorrow. Maybe it will be read from a configuration file some day.
So I recommend approach 3 because it does what you want today and leaves more flexibility for the future.
Avoid using the preprocessor with C++. Also, why would you have a string in a class, but need it in other classes? I would re-evaluate your class design to allow better encapsulation. If you absolutely need this global string then I would consider adding a globals.h/cpp module and then declare/define string there as:
const char* const kMyErrorMsg = "This is my error message!";
Don't use preprocessor directives in C++, unless you're trying to achieve a holy purpose that can't possibly be achieved any other way.
From the standard (3.6.2):
Objects with static storage duration (3.7.1) shall be zero-initialized
(8.5) before any other initialization takes place. A reference with
static storage duration and an object of POD type with static storage
duration can be initialized with a constant expression (5.19); this is
called constant initialization. Together, zero-initialization and
constant initialization are called static initialization; all other
initialization is dynamic initialization. Static initialization shall
be performed before any dynamic initialization takes place. Dynamic
initialization of an object is either ordered or unordered.
Definitions of explicitly specialized class template static data
members have ordered initialization. Other class template static data
members (i.e., implicitly or explicitly instantiated specializations)
have unordered initialization. Other objects defined in namespace
scope have ordered initialization. Objects defined within a single
translation unit and with ordered initialization shall be initialized
in the order of their definitions in the translation unit. The order
of initialization is unspecified for objects with unordered
initialization and for objects defined in different translation units.
So the fate of approach 2 depends on whether your variable is statically or dynamically initialised. For instance, in your concrete example, if you use const char * Demo::ConstStrVal = "some_string_value"; (better yet const char Demo::ConstStrVal[] if the value will stay constant in the program) you can be sure that it will be initialised no matter what. With a std::string, you can't be sure since it's not a POD type (I'm not dead sure on this one, but fairly sure).
The 3rd method allows you to be sure, and the method in Justin's answer makes sure that there are no unnecessary constructions. Keep in mind, though, that the function-local static has a hidden overhead: each call checks whether the variable has already been initialised. If you're returning a simple constant, just returning your value is definitely faster since the function will probably be inlined.
All of that said, try to write your programs so as not to rely on static initialisation. Static variables are best regarded as a convenience; they aren't convenient any more when you have to juggle their initialisation orders.
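To illustrate the character-array variant suggested above, here is a small sketch. Demo mirrors the class from the question, while kLengthAtStartup is a hypothetical variable added only to show a dynamically initialized object reading the array.

```cpp
#include <cstddef>
#include <cstring>

struct Demo {
    static const char ConstStrVal[];
};

// Initialized from a string literal: this is constant (static)
// initialization, so the bytes are in place before any dynamic
// initialization runs.
const char Demo::ConstStrVal[] = "some_string_value";

// Hypothetical variable: its initializer calls a function, yet reading
// ConstStrVal here is safe because of the guarantee above.
const std::size_t kLengthAtStartup = std::strlen(Demo::ConstStrVal);
```

The same pattern with a std::string member would depend on the order in which the two objects are dynamically initialized.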
There appears to be no easy answer to this, but are there any assumptions that can be safely made about when a static class field can be accessed?
EDIT: The only safe assumption seems to be that all statics are initialized before the program commences (call to main). So, as long as I don't reference statics from other static initialization code, I should have nothing to worry about?
The standard guarantees two things - that objects defined in the same translation unit (usually it means .cpp file) are initialized in order of their definitions (not declarations):
3.6.2
The storage for objects with static storage duration (basic.stc.static) shall be zero-initialized (dcl.init) before any other initialization takes place. Zero-initialization and initialization with a constant expression are collectively called static initialization; all other initialization is dynamic initialization. Objects of POD types (basic.types) with static storage duration initialized with constant expressions (expr.const) shall be initialized before any dynamic initialization takes place. Objects with static storage duration defined in namespace scope in the same translation unit and dynamically initialized shall be initialized in the order in which their definition appears in the translation unit.
The other guaranteed thing is that initialization of static objects from a translation unit will be done before use of any object or function from this translation unit:
It is implementation-defined whether or not the dynamic initialization (dcl.init, class.static, class.ctor, class.expl.init) of an object of namespace scope is done before the first statement of main. If the initialization is deferred to some point in time after the first statement of main, it shall occur before the first use of any function or object defined in the same translation unit as the object to be initialized.
Nothing else is guaranteed (in particular, the order of initialization of objects defined in different translation units is unspecified).
EDIT
As pointed out in Suma's comment, it is also guaranteed that they are initialized before main is entered.
They're initialized before the program starts (i.e. before main is entered).
When there are two or more definitions (of static data) in a single CPP file, then they're initialized in the sequence in which they're defined in the file (the one defined earlier/higher in the file is initialized before the next one is).
When there are two or more definitions (of static data) in more than one CPP file, the sequence in which the CPP files are processed is undefined/implementation-specific. This is a problem if the constructor of a global variable (called before the program is started) references another global variable defined in a different CPP file, which might not have been constructed yet. However, item 47 of Meyers' Effective C++ (which is titled Ensure that global objects are initialized before they're used) does describe a work-around ...
Define a static variable in a header file (it's static so you can have multiple instances of it without the linker complaining)
Have the constructor of that variable invoke whatever you need it to (in particular, construct the global singletons declared in the headers)
... which it says is a technique which may be used in some system header files e.g. to ensure that the cin global variable is initialized before even your static variables' constructors use it.
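The single-file ordering rule mentioned above can be observed with a small sketch. init_log and Tracer are hypothetical helpers, not taken from any of the answers.

```cpp
#include <string>
#include <vector>

// Hypothetical log, held in a function-local static so that it exists
// before the first object below tries to write to it.
std::vector<std::string>& init_log() {
    static std::vector<std::string> entries;
    return entries;
}

// Records its name when constructed.
struct Tracer {
    explicit Tracer(const char* name) { init_log().push_back(name); }
};

// Both objects live in the same translation unit, so they are
// initialized in the order of their definitions.
Tracer first("first");
Tracer second("second");
```

If first and second were defined in two different source files, the relative order of the two log entries would no longer be guaranteed.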
Your final conclusion in the Edit is correct. But the problem is the class statics themselves. It's easy to say that your code will have class static members that don't refer to other global data or class static members, but once you take this route, things will go wrong soon. One approach that I have found useful in practice is to avoid class static data members and use class static wrapper methods instead. Each such method can then hold the static object within itself. For example:
TypeX* Class2::getClass1Instance()
{
static TypeX obj1;
return &obj1;
}
Note: An earlier answer says:
The other guaranteed thing is that initialization of static objects
from a translation unit will be done before use of any object or
function from this translation unit
This is not completely correct and the standard is incorrectly inferred here. This may not hold true if the function from a translation unit is called before main is entered.
I believe it can be accessed anytime during the execution. What remains undefined is the initialization order of the static variables.
They can be initialized in an implementation file (.c/.cpp/.cc). Don't initialize them in a .h file, as you will get multiple-definition errors once the header is included in more than one source file.
They are typically initialized before main; however, the order is unknown, so avoid dependencies between them. They can certainly be accessed within a member function. I would suggest encapsulating a static member in a static function that checks whether the member has been initialized.
There isn't a totally trivial answer to this question, but basically they're initialized just before control is passed to the entry point (main) of your program. The order in which they are initialized is (to my knowledge) undefined and may be compiler specific.
EDIT: To clarify, your added assumption is correct. As long as you're only accessing it post main-entry, you don't really have to worry about when/how it's initialized. It will be initialized by that time.
I think the main thread of a process executes the following five steps in order:
1. initialization of the CRT library
2. static initialization
3. execution of the main() function
4. static uninitialization
5. uninitialization of the CRT library
Do you want to reference statics from other static initialization code?
Maybe the following code works:
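#include <memory>

class A {}; // placeholder definition so the example compiles
A &GetStaticA(); // declared before its first use below
// std::auto_ptr was removed in C++17; std::unique_ptr plays the same role here
static std::unique_ptr<A> a(&GetStaticA());
A &GetStaticA()
{
    // static variables of basic type initialized with a constant expression
    // are initialized earlier than the dynamically initialized ones
    static A *instance = NULL;
    if (instance == NULL)
    {
        instance = new A();
    }
    return *instance; // always return, not only on the first call
}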
What is the difference between the static keyword in C and C++?
The static keyword serves the same purposes in C and C++.
When used at file level (outside of a function), it sets the visibility of the item it's applied to. Static items are not visible outside of their compilation unit (e.g., to the linker). Their duration is the same as the duration of the program.
These file-level items (functions and data) should be static unless there's a specific need to access them from outside (and there's almost never a need to give direct access to data since that breaks the central tenet of encapsulation).
If (as your comment to the question indicates) this is the only use of static you're concerned with then, no, there is no difference between C and C++.
When used within a function, it sets the duration of the item. Again, the duration is the same as the program and the item continues to exist between invocations of that function.
It does not affect the visibility of that item since it's visible only within the function. An example is a random number generator that needs to keep its seed value between invocations but doesn't want that value visible to other functions.
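A toy version of the random number generator mentioned above might look like this. The function name and constants are illustrative only, and the generator is not suitable for real use. The same code is valid in both C and C++.

```cpp
// A toy linear congruential generator; names and constants are illustrative.
unsigned next_random() {
    // Initialized once; keeps its value between calls and is not visible
    // to any other function.
    static unsigned seed = 12345u;
    seed = seed * 1103515245u + 12345u;   // unsigned arithmetic wraps around
    return (seed / 65536u) % 32768u;      // result is always below 32768
}
```

Each call advances the hidden seed, so successive calls continue the same sequence without exposing any state to the rest of the file.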
C++ has one more use, static within a class. When used there, it becomes a single class variable that's common across all objects of that class. One classic example is to store the number of objects that have been instantiated for a given class.
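The object-counting example mentioned above can be sketched as follows. Widget and its members are hypothetical names.

```cpp
// Hypothetical class that counts how many of its objects currently exist.
class Widget {
public:
    Widget() { ++count_; }
    Widget(const Widget&) { ++count_; }   // copies count as objects too
    ~Widget() { --count_; }

    // Callable without any Widget object: Widget::live()
    static int live() { return count_; }

private:
    static int count_;   // one variable shared by all Widget objects
};

// The single definition of the shared counter.
int Widget::count_ = 0;
```

Widget::live() returns 0 before any object is created, rises with each construction, and falls again as objects are destroyed.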
As others have pointed out, the use of file-level static has been deprecated in favour of unnamed namespaces. However, I believe it'll be a cold day in a certain warm place before it's actually removed from the language - there's just too much code using it at the moment. And ISO C has only just gotten around to removing gets() despite the amount of time we've all known it was a dangerous function.
And even though it's deprecated, that doesn't change its semantics now.
The use of static at file scope to restrict access to the current translation unit was deprecated in C++98/03 (C++11 withdrew the deprecation), but it has always been acceptable in C.
Instead, use an unnamed namespace
namespace
{
int file_scope_x;
}
Variables declared this way are only available within the file, just as if they were declared static.
The main reason for the deprecation is to remove one of the several overloaded meanings of the static keyword.
Originally, it meant that the variable, such as in a function, would be given storage for the lifetime of the program in an area for such variables, and not stored on the stack as is usual for function local variables.
Then the keyword was overloaded to apply to file scope linkage. It's not desirable to make up new keywords as needed, because they might break existing code. So this one was used again with a different meaning without causing conflicts, because a variable declared as static can't be both inside a function and at the top level, and functions didn't have the modifier before. (The storage connotation is totally lost when referring to functions, as they are not stored anywhere.)
When classes came along in C++ (and in Java and C#) the keyword was used yet again, but the meaning is at least closer to the original intention. Variables declared this way are stored in a global area, as opposed to on the stack as for function variables, or on the heap as for object members. Because variables cannot be both at the top level and inside a class definition, extra meaning can be unambiguously attached to class variables. They can only be referenced via the class name or from within an object of that class.
It has the same meaning in both languages.
But C++ adds classes. In the context of a class (and thus a struct) it has the extra meaning of making the method/variable a member of the class rather than a member of the object.
class Plop
{
static int x; // This is a member of the class not an instance.
public:
static int getX() // method is a member of the class.
{
return x;
}
};
int Plop::x = 5;
Note that the use of static to mean "file scope" (aka namespace scope) is only deprecated by the C++ Standard for objects, not for functions. In other words:
// foo.cpp
static int x = 0; // deprecated
static int f() { return 1; } // not deprecated
To quote Annex D of the Standard:
The use of the static keyword is
deprecated when declaring objects in
namespace scope.
You cannot declare a static variable inside a structure in C, but C++ allows it; the member is then defined outside the class using the scope resolution operator.
Also, in C++ a static member function can directly access only static members, whereas in C a static function (one with internal linkage) can use both static and non-static variables. 😊