I read the reason for defining static data members in the source file is because if they were in the header file and multiple source files included the header file- the definitions would get output multiple times. I can see why this would be a problem for the static const data member, but why is this a problem for the static data member?
I'm not too sure I fully understand why there is a problem if the definition is written in the header file...
The multiple definition problem for variables is due to two main deficiencies in the language definition.
As shown below you can easily work around it. There is no technical reason why there is no direct support. It has to do with the feature not being in sufficient high demand that people on the committee have chosen to make it a priority.
First, why multiple definitions in general are a problem. Since C++ lacks support for separately compiled modules (deficiency #1), programmers have to emulate that feature by using textual preprocessing etc. And then it's easy to inadvertently introduce two or more definitions of the same name, which would most likely be in error.
For functions this was solved by the inline keyword and property. A freestanding function can only be explicitly inline, while a member function can be implicitly inline by being defined in the class definition. Either way, if a function is inline then it can be defined in multiple translation units, and it must be defined in every translation unit where it's used, and those definitions must be equivalent.
Mainly that solution allowed classes to be defined in header files.
No such language feature was needed to support data, variables defined in header files, so it just isn't there: you can't have inline variables. This is language deficiency #2.
However, you can obtain the effect of inline variables via a special exemption for static data members of class templates. The reason for the exemption is that class templates generally have to be fully defined in header files (unless the template is only used internally in a translation unit), and so for a class template to be able to have static data members, it's necessary with either an exemption from the general rules, or some special support. The committee chose the exemption-from-the-rules route.
template< class Dummy >
struct math_
{
static double const pi;
};
template< class Dummy >
double const math_<Dummy>::pi = 3.14;
typedef math_<void> math;
The above has been referred to as the templated const trick. As far as I know I was the one who once introduced it, in the [comp.lang.c++] Usenet group, so I can't give credit to someone else. I've also posted it a few times here on SO.
Anyway, this means that every C++ compiler and linker internally supports and must support the machinery needed for inline data, and yet the language doesn't have that feature.
However, on the third hand, C++11 has constexpr, where you can write the above as just
struct math
{
static double constexpr pi = 3.14;
};
Well, there is a difference, that you can't take the address of the C++11 math::pi, but that's a very minor limitation.
I think you're confusing two things: static data members and global variables markes as static.
The latter have internal linkage, which means that if you put their definition in a header file that multiple translation units #include, each translation unit will receive a private copy of those variables.
Global variables marked as const have internal linkage by default, so you won't need to specify static explicitly for those. Hence, the linker won't complain about multiple definitions of global const variable or of global non-const variables marked as static, while it will complain in the other cases (because those variables would have external linkage).
Concerning static data members, this is what Paragraph 9.4.2/5 of the C++11 Standard says:
static data members of a class in namespace scope have external linkage (3.5). A local class shall not have
static data members.
This means that if you put their definition in a header file #included by multiple translation units, you will end up with multiple definitions of the same symbol in the corresponding object files (exactly like non-const global variables), no matter what their const-qualification is. In that case, your program would violate the One Definition Rule.
Also, this Q&A on StackOverflow may give you a clearer understanding of the subject.
Related
According to Static data members on the IBM C++ knowledge center:
The declaration of a static data member in the member list of a class is not a definition. You must define the static member outside of the class declaration, in namespace scope.
Why is that? What's the schematic behind that regarding the memory allocation?
It's a rule of the language, known as the One Definition Rule. Within a program, each static object (if it's used) must be defined once, and only once.
Class definitions typically go in header files, included in multiple translation units (i.e. from multiple source files). If the static object's declaration in the header were a definition, then you'd end up with multiple definitions, one in each unit that includes the header, which would break the rule. So instead, it's not a definition, and you must provide exactly one definition somewhere else.
In principle, the language could do what it does with inline functions, allowing multiple definitions to be consolidated into a single one. But it doesn't, so we're stuck with this rule.
It's not about the memory allocation piece at all. It's about having a single point of definition in a linked compilation unit. #Nick pointed this out as well.
From Bjarne's webite https://www.stroustrup.com/bs_faq2.html#in-class
A class is typically declared in a header file and a header file is
typically included into many translation units. However, to avoid
complicated linker rules, C++ requires that every object has a unique
definition. That rule would be broken if C++ allowed in-class
definition of entities that needed to be stored in memory as objects.
As of C++17 you can now define static data members inside a class. See cppreference:
A static data member may be declared inline. An inline static data
member can be defined in the class definition and may specify an
initializer. It does not need an out-of-class definition:
struct X {
inline static int n = 1;
};
According to the definition of static data members, we can define static variables only for once (i.e in class only) and it is shared by every instance of the class. Also, static members can be accessed without any object.
As per OOPS guidelines compiler does not allocate memory to class instead of that it allocates memory to objects, but static members are independent of object, so to allocate memory to static variables, we define the static data members outside of the class, that's why once this variable is declared, it exists till the program executes. Generally static member functions are used to modify the static variables.
In the class:
class foo
{
public:
static int bar; //declaration of static data member
};
int foo::bar = 0; //definition of data member
We have to explicitly define the static variable, otherwise it will result in a
undefined reference to 'foo::bar'
My question is:
Why do we have to give an explicit definition of a static variable?
Please note that this is NOT a duplicate of previously asked undefined reference to static variable questions. This question intends to ask the reason behind explicit definition of a static variable.
From the beginning of time C++ language, just like C, was built on the principle of independent translation. Each translation unit is compiled by the compiler proper independently, without any knowledge of other translation units. The whole program only comes together later, at linking stage. Linking stage is the earliest stage at which the entire program is seen by linker (it is seen as collection of object files prepared by the compiler proper).
In order to support this principle of independent translation, each entity with external linkage has to be defined in one translation unit, and in only one translation unit. The user is responsible for distributing such entities between different translation units. It is considered a part of user intent, i.e. the user is supposed to decide which translation unit (and object file) will contain each definition.
The same applies to static members of the class. Static members of the class are entities with external linkage. The compiler expects you to define that entity in some translation unit. The whole purpose of this feature is to give you the opportunity to choose that translation unit. The compiler cannot choose it for you. It is, again, a part of your intent, something you have to tell the compiler.
This is no longer as critical as it used to be a while ago, since the language is now designed to deal with (and eliminate) large amount of identical definitions (templates, inline functions, etc.), but the One Definition Rule is still rooted in the principle of independent translation.
In addition to the above, in C++ language the point at which you define your variable will determine the order of its initialization with regard to other variables defined in the same translation unit. This is also a part of user intent, i.e. something the compiler cannot decide without your help.
Starting from C++17 you can declare your static members as inline. This eliminates the need for a separate definition. By declaring them in that fashion you effectively tell compiler that you don't care where this member is physically defined and, consequently, don't care about its initialization order.
In early C++ it was allowed to define the static data members inside the class which certainly violate the idea that class is only a blueprint and does not set memory aside. This has been dropped now.
Putting the definition of static member outside the class emphasize that memory is allocated only once for static data member (at compile time). Each object of that class doesn't have it own copy.
static is a storage type, when you declare the variable you are telling the compiler "this week be in the data section somewhere" and when you subsequently use it, the compiler emits code that loads a value from a TBD address.
In some contexts, the compiler can drive that a static is really a compile time constant and replace it with such, for example
static const int meaning = 42;
Inside a function that never takes the address of the value.
When dealing with class members, however, the compiler can't guess where this value should be created. It might be in a library you will link against, or a dll, or you might be providing a library where the value must be provided by the library consumer.
Usually, when someone asks this, though, it is because they are misusing static members.
If all you want us a constant value, e.g
static int MaxEntries;
...
int Foo::MaxEntries = 10;
You would be better off with one or other of the following
static const int MaxEntries = 10;
// or
enum { MaxEntries = 10 };
The static requires no separate definition until something tries to take the address of or form a reference to the variable, the enum version never does.
Inside the class you are only declaring the variable, ie: you tell the compiler that there is something with this name.
However, a static variable must get some memory space to live in, and this must be inside one translation unit. The compiler reserves this space only when you DEFINE the variable.
Structure is not variable, but its instance is. Hence we can include same structure declaration in multiple modules but we cannot have same instance name defined globally in multiple modules.
Static variable of structure is essentially a global variable. If we define it in structure declaration itself, we won't be able to use the structure declaration in multiple modules. Because that would result in having same global instance name (of static variable) defined in multiple modules causing linker error "Multiple definitions of same symbol"
Here are two variables declared with the keyword static:
void fcn() {
static int x = 2;
}
class cls() {
static int y;
};
We all know that in order for cls to link properly, int cls::y needs to be explicitly defined by the programmer exactly once.
Based on the answers to static variables in an inlined function , it seems that even though no out-of-class definition is required for fcn::x , it is guaranteed that even inlined versions of fcn from different compilation units will reference the same fcn::x. If this is true, then the linker has to be smart enough to reach between compilation units and connect multiple instances of "the same" variable to ensure that static function variables perform as expected.
If this is possible for static function variables, it seems to me that it should also be possible for static class members... so why does the standard require a single out-of-class definition of static class members?
Yes, the linker will indeed have to merge different instances of fcn::x. In other words, even though formally the language says that fcn::x has no linkage, physically it will have to be exposed as an external symbol in all object files that contain it. This is how it is typically implemented in practice: your compiler will expose fcx::x as some sort of heavily mangled external name ##$%^&_fcx_x or such (to ensure it can never clash with "real" external names). This is what the linker will use to merge all instances of fcn::x into one.
As for class members... Firstly, it is not really about what is "possible". It is about the language-level concepts of declarations and definitions. It is about One Definition Rule, which is a higher-level concept than what is "possible" based on raw linker features. According to that rule, objects with external linkage shall be defined by the user and shall have one and only one definition. Static data members of the class are objects with external linkage. The rest follows.
Secondly, and more practically, there's another serious issue with static data members. It is their order of initialization. Static data members are [guaranteed to be] initialized no later than when the first function from the containing translation unit is called (which refers to the translation unit that contains the data member definition). And static objects declared in a single translation unit are initialized the order of their definition, top-to bottom. This is an important property of static data member initialization process. Allowing static data members to be defined "automatically" would defy this part of the specification and would require massive changes to this part of the language.
In other words, when you provide a dedicated definition for a static data member of the class, you are not just doing it for ODR compliance, you are actually expressing your desired initialization order for that object.
Meanwhile, static variables inside functions are objects with no linkage. Hence they receive a different treatment at the conceptual level. And they have well-defined order-of-initialization semantics that is completely unaffected by the need to merge multiple definitions into one.
I am writing a header-only library, and I can’t make up my mind between declaring the functions I provide to the user static or inline. Is there any reason why I should prefer one to the other in that case?
They both provide different functionalities.
There are two implications of using the inline keyword(§ 7.1.3/4):
It hints the compiler that substitution of function body at the point of call is preferable over the usual function call mechanism.
Even if the inline substitution is omitted, the other rules(especially w.r.t One Definition Rule) for inline are followed.
The static keyword on the function forces the inline function to have an internal linkage(inline functions have external linkage) Each instance of such a function is treated as a separate function(address of each function is different) and each instance of these functions have their own copies of static local variables & string literals(an inline function has only one copy of these)
Note: this answer is for C++. For C, see caf's answer. The two languages differ.
static has two relevant meanings:
For a function at namespace scope, static gives the function internal linkage, which in practical terms means that the name is not visible to the linker. static can also be used this way for data, but for data this usage was deprecated in C++03 (§D.2 in C++03 Annex D, normative). Still, constants have internal linkage by default (it's not a good idea to make that explicit, since the usage for data is deprecated).
For a function in a class, static removes the implicit this argument, so that the function can be called without an object of the class.
In a header one would usually not use internal linkage for functions, because one doesn't want a function to be duplicated in every compilation unit where the header is included.
A common convention is to instead use a nested namespace called detail, when one needs classes or functions that are not part of the public module interface, and wants to reduce the pollution the ordinary namespace (i.e., reduce the potential for name conflict). This convention is used by the Boost library. In the same way as with include guard symbols this convention signifies a lack of module support in current C++, where one is essentially reduced to simulating some crucial language features via conventions.
The word inline also has two relevant meanings:
For a function at namespace scope it tells the compiler that the definition of the function is intentionally provided in every compilation unit where it’s used. In practical terms this makes the linker ignore multiple definitions of the function, and makes it possible to define non-template functions in header files. There is no corresponding language feature for data, although templates can be used to simulate the inline effect for data.
It also, unfortunately, gives the compiler a strong hint that calls to the function should preferably be expanded “inline” in the machine code.
The first meaning is the only guaranteed meaning of inline.
In general, apply inline to every function definition in a header file. There is one exception, namely a function defined directly in a class definition. Such a function is automatically declared inline (i.e., you avoid linker protests without explicitly adding the word inline, so that one practical usage in this context is to apply inline to a function declaration within a class, to tell a reader that a definition of that function follows later in the header file).
So, while it appears that you are a bit confused about the meanings of static and inline – they're not interchangable! – you’re essentially right that static and inline are somehow connected. Moving a (free) function out of a class to namespace scope, you would change static → inline, and moving a function from namespace scope into a class, you would change inline → static. Although it isn’t a common thing to do, I have found it to be not uncommon while refactoring header-only code.
Summing up:
Use inline for every namespace scope function defined in a header. In particular, do use inline for a specialization of a function template, since function templates can only be fully specialized, and the full specialization is an ordinary function. Failure to apply inline to a function template specialization in a header file, will in general cause linking errors.
Use some special nested namespace e.g. called detail to avoid pollution with internal implementation detail names.
Use static for static class members.
Don't use static to make explicit that a constant has internal linkage, because this use of static is deprecated in C++03 (even though apparently the deprecation was removed in C++11).
Keep in mind that while inline can’t be applied to data, it is possible to achieve just about the same (in-practice) effect by using templates. But where you do need some big chunk of shared constant data, implemented in a header, I recommend producing a reference to the data via an inline function. It’s much easier to code up, and much easier to understand for a reader of the code. :-)
static and inline are orthogonal. In otherwords, you can have either or both or none. Each has its own separate uses which determine whether or not to use them.
If that function is not sharing the class state: static. To make a function static can benefit from being called at anywhere.
If the function is relatively small and clear: inline
(This answer is based on the C99 rules; your question is tagged with both C and C++, and the rules in C++ may well be different)
If you declare the function as just inline, then the definition of your function is an inline definition. The compiler is not required to use the inline definition for all calls to the function in the translation unit - it is allowed to assume that there is an external definition of the same function provided in another translation unit, and use that for some or all calls. In the case of a header-only library, there will not be such an external definition, so programs using it may fail at link time with a missing definition.
On the other hand, you may declare the function as both inline and static. In this case, the compiler must use the definition you have provided for all calls to the function in the translation unit, whether it actually inlines them or not. This is appropriate for a header-only library (although the compiler is likely to behave exactly the same for a function declared inline static as for one declared static only, in both cases inlining where it feels it would be beneficial, so in practice there is probably little to be gained over static only).
I've recently seen a bit on SO about the static keyword before a function and I'm wondering how to use it properly.
1) When should I write the keyword static before a non-member function?
2) Is it dangerous to define a static non-member function in the header? Why (not)?
(Side Question)
3) Is it possible to define a class in the header file in a certain way, so that it would only be available in the translation unit where you use it first?
(The reason that I'm asking this is because I'm learning STL and it might be a good solution for my predicates etc (possibly functors), since I don't like to define functions other than member-functions in the cpp file)
(Also, I think it is related in a way to the original question because according to my current reasoning, it would do the same thing as static before a function does)
EDIT
Another question that came up while seeing some answers:
4) Many people tell me I have to declare the static function in the header, and define it in the source file. But the static function is unique to the translation unit. How can the linker know which translation unit it is unique to, since header files do not directly relate to a source file (only when you include them)?
static, as I think you're using it, is a means of symbol hiding. Functions declared static are not given global visibility (a Unix-like nm will show these as 't' rather than 'T'). These functions cannot be called from other translation units.
For C++, static in this sense has been replaced, more or less, by the anonymous namespace, e.g.,
static int x = 0;
is pretty equivalent to
namespace {
int x = 0;
}
Note that the anonymous namespace is unique for every compilation unit.
Unlike static, the anonymous namespace also works for classes. You can say something like
namespace {
class Foo{};
}
and reuse that class name for unrelated classes in other translation units. I think this goes to your point 3.
The compiler actually gives each of the symbols you define this way a unique name (I think it includes the compilation time). These symbols are never available to another translation unit and will never collide with a symbol from another translation unit.
Note that all non-member functions declared to be inline are also by default static. That's the most common (and implicit) use of static. As to point 2, defining a static but not inline function in a header is a pretty corner case: it's not dangerous per se but it's so rarely useful it might be confusing. Such a function might or might not be emitted in every translation unit. A compiler might generate warnings if you never actually call the function in some TUs. And if that static function has within it a static variable, you get a separate variable per translation unit even with one definition in a single .h which might be confusing. There just aren't many (non-inline) use cases.
As to point 4, I suspect those people are conflating the static member function meaning of static with that of the linkage meaning of static. Which is as good a reason as any for using the anonymous namespace for the latter.
The keyword "static" is overloaded to mean several different things:
It can control visibility (both C and C++)
It can persist a variable between subroutine invocations (both C and C++)
... and ...
It can make a method or member apply to an entire class (rather than just a class instance: C++ only)
Short answer: it's best not to use ANY language facility unless
a) you're pretty sure you need it
b) you're pretty sure you know what you're doing (i.e. you know WHY you need it)
There's absolutely nothing wrong with declaring static variables or standalone functions in a .cpp file. Declaring a static variable or standalone function in a header is probably unwise. And, if you actually need "static" for a class function or class member, then a header is arguably the BEST place to define it.
Here's a good link:
http://www.cprogramming.com/tutorial/statickeyword.html
'Hope that helps
You should define non-member functions as static when they are only to be visible inside the code file they were declared in.
This same question was asked on cplusplus.com