Static vs New/Malloc - c++

I was wondering if people could shed some light on the uses of "static." I have never run into an issue where I have explicitly declared a variable or method as static. I understand that when declaring something as "static" it gets stuffed into the data segment of your program, similar to globals, and hence the variable is accessible for the run of your program. If this is the case, why not just make a static variable a global variable? Hell, why not just throw this variable on the heap using a new or a malloc? Both methods ensure the variable will be available for you throughout the run of your program.

static has multiple meanings in C, and C++ heaps on even more.
In a file scope declaration (what I think the question is about), static controls the visibility of an identifier.
Let's set aside C++ and use the C concepts.
File scope identifiers which name objects or functions have linkage. Linkage can be external (program-wide) or internal (within one translation unit).
static specifies internal linkage.
This is important because if a name with internal linkage appears in multiple units, those occurrences are not related. One module can have a static foo function and another one in the same program can have a different foo function. They both exist and are reachable by the name foo from their respective units.
This is not possible with external linkage: there must be one foo.
malloc creates an object which is potentially available everywhere in a program, as long as it is not freed, but in a different sense. The object is available if you have its pointer. A pointer is a kind of "run time name": an access key to get to the object. Linkage makes an object or function available if you know its name (at compile time) and if that object or function has the right kind of linkage relative to where you're trying to access it from.
In a dynamic operating system in which multiple programs come into life and terminate, the storage for a program's static data and functions (whether they have external or internal linkage) is in fact dynamically allocated. The system routine which loads a program has to do something similar to malloc to fetch memory for all of the fixed areas of the program.
Sometimes C programs use malloc even for "singleton" objects that are referenced globally via global pointers. These objects behave like de-facto static variables since they basically have a lifetime which is almost that of the entire program, and are accessed through the pointer, which is accessed by name. This is useful if the objects have properties (such as size) that are not known until run time, or if their initialization is expensive and they are not always needed (only when certain cases occur in the program).
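A minimal sketch of the malloc-backed "singleton" idea just described, written in C++ for consistency with the rest of the thread. The names g_samples, g_sample_count, samples() and sample_count() are hypothetical, made up for illustration; the only point is that the object is reached by name through a file-scope pointer while its size is decided at run time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical names throughout; nothing here comes from a real library.
static double *g_samples = nullptr;     // file-scope pointer, internal linkage
static std::size_t g_sample_count = 0;  // size is only known at run time

// Returns the single buffer, allocating it on the first call only.
// `count` is honoured on the first successful call and ignored afterwards.
double *samples(std::size_t count) {
    if (g_samples == nullptr) {
        assert(count > 0);
        g_samples = static_cast<double *>(std::calloc(count, sizeof(double)));
        if (g_samples != nullptr) {
            g_sample_count = count;
        }
    }
    return g_samples;
}

std::size_t sample_count() { return g_sample_count; }
```

The buffer is deliberately never freed in this sketch, which mirrors the "lifetime which is almost that of the entire program" behaviour; a real program may want an explicit teardown function.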
Supplemental factoids about static and extern:
In C, at file scope, extern ensures that the declaration of an object, where an initializer is omitted, is in fact a declaration. Without extern it is a tentative definition, but if an initializer is present, then it is a definition.
In C, at file scope extern doesn't mean "this declaration has external linkage", surprisingly. An extern declaration inherits linkage from a previous declaration of the same name.
A block-scope extern in C means "this name, which is being introduced into this scope, refers to the external definition with external linkage". The linkage is inherited from a previous file-scope declaration of the name, if it exists, otherwise it is external.
A block-scope static on an object controls not linkage, but storage duration. A static object is not instantiated each time on entry into the block; a single copy of it exists, and can be initialized prior to program startup. (In C++, non-constant expressions can initialize such an object or its members; in that case, initialization occurs on the first execution of the block).
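To make the block-scope case concrete, here is a small C++ sketch. The functions make_start(), next_ticket() and always_first() and the counter init_calls are hypothetical names invented for this example; the non-constant initializer is there to show the "initialized on first execution" behaviour.

```cpp
#include <cassert>

int init_calls = 0;                 // counts how often the initializer runs

int make_start() {                  // hypothetical non-constant initializer
    ++init_calls;
    return 100;
}

int next_ticket() {
    // Static storage duration: one copy, initialized the first time control
    // passes through this declaration (because make_start() is not constant).
    static int ticket = make_start();
    return ticket++;
}

int always_first() {
    int ticket = 100;               // automatic: re-created on every call
    return ticket++;
}

// Usage, e.g. inside main():
//   assert(init_calls == 0);        // nothing initialized before the first call
//   assert(next_ticket() == 100);
//   assert(next_ticket() == 101);   // value survived between calls
//   assert(init_calls == 1);        // initializer ran exactly once
//   assert(always_first() == 100);
```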
A block-scope static function declaration declares a function with internal linkage.
There is no way, in a block scope, to declare an external object name which has internal linkage. Paradoxically, the first extern declaration in the following snippet is correct, but the second, block-scope one, is erroneous!
static int name; /* external name with internal linkage */
extern int name; /* redundant redeclaration of the above */
void foo(void)
{
    int name; /* local variable shadowing external one */
    {
        /* attempt to "punch through" shadow and reach external: */
        extern int name; /* ERROR! */
    }
}
Clearly, the word "external" has an ambiguous meaning between "outside of any function" and "program-wide linkage" and this ambiguity is embroiled in the extern keyword.
In C++, static takes on additional meanings. In a class declaration, it declares "static member functions" which belong to the class scope and have the same access to class instances as non-static member functions do, but are not invoked on objects (do not have the implicit this parameter). Class data members marked static have a single class-wide instance; they are not instantiated per-object. (Unfortunately, they don't participate properly in inheritance like true object-oriented class variables, which can be overridden in a derived class to be instance or vice versa.)
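A short sketch of the class-level meanings, assuming a made-up Sensor class (all names are hypothetical). It shows a static data member shared by every instance, and static member functions that have no this but can still reach private members of objects passed to them.

```cpp
#include <cassert>

class Sensor {
public:
    explicit Sensor(int raw) : raw_(raw) {}

    // Non-static: uses this->raw_ plus the class-wide offset.
    int value() const { return raw_ + offset_; }

    // Static member functions: no implicit `this`, called as Sensor::f().
    static void set_offset(int offset) { offset_ = offset; }

    // Same access rights as any member: may read private data of an
    // instance it is handed explicitly.
    static int raw_of(const Sensor &s) { return s.raw_; }

private:
    int raw_;            // one per object
    static int offset_;  // one for the whole class (declaration)
};

int Sensor::offset_ = 0; // the single definition

// Usage, e.g. inside main():
//   Sensor a(10), b(20);
//   Sensor::set_offset(5);          // affects every Sensor
//   assert(a.value() == 15);
//   assert(b.value() == 25);
//   assert(Sensor::raw_of(a) == 10);
```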
In C++, a privacy similar to internal linkage can be achieved using an unnamed namespace rather than static. Namespaces make the internal/external linkage concept mostly an obsolete mechanism for C compatibility.
C++ involves extern in the special extern "LANG" syntax (e.g. extern "C").
static_cast is unrelated to static; what they have in common is "static" meaning "prior to program run time": static storage is determined prior to run time, and the conversion of a static cast is also determined at compile time (without run-time type info).

Related

Multiple static const int class variables in DLL [duplicate]

In the class:
class foo
{
public:
    static int bar; //declaration of static data member
};
int foo::bar = 0; //definition of data member
We have to explicitly define the static variable, otherwise it will result in an
undefined reference to 'foo::bar'
My question is:
Why do we have to give an explicit definition of a static variable?
Please note that this is NOT a duplicate of previously asked undefined reference to static variable questions. This question intends to ask the reason behind explicit definition of a static variable.
From the beginning of time, the C++ language, just like C, was built on the principle of independent translation. Each translation unit is compiled by the compiler proper independently, without any knowledge of other translation units. The whole program only comes together later, at the linking stage. The linking stage is the earliest stage at which the entire program is seen by the linker (it is seen as a collection of object files prepared by the compiler proper).
In order to support this principle of independent translation, each entity with external linkage has to be defined in one translation unit, and in only one translation unit. The user is responsible for distributing such entities between different translation units. It is considered a part of user intent, i.e. the user is supposed to decide which translation unit (and object file) will contain each definition.
The same applies to static members of the class. Static members of the class are entities with external linkage. The compiler expects you to define that entity in some translation unit. The whole purpose of this feature is to give you the opportunity to choose that translation unit. The compiler cannot choose it for you. It is, again, a part of your intent, something you have to tell the compiler.
This is no longer as critical as it used to be a while ago, since the language is now designed to deal with (and eliminate) large numbers of identical definitions (templates, inline functions, etc.), but the One Definition Rule is still rooted in the principle of independent translation.
In addition to the above, in C++ language the point at which you define your variable will determine the order of its initialization with regard to other variables defined in the same translation unit. This is also a part of user intent, i.e. something the compiler cannot decide without your help.
Starting from C++17 you can declare your static members as inline. This eliminates the need for a separate definition. By declaring them in that fashion you effectively tell the compiler that you don't care where this member is physically defined and, consequently, don't care about its initialization order.
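For comparison with the foo::bar example from the question, a sketch of the C++17 form. It assumes a compiler in C++17 mode or later; bump_bar() is a hypothetical helper added only to show the member being used.

```cpp
#include <cassert>

class foo
{
public:
    // C++17: `inline` makes this declaration a definition as well, so no
    // separate `int foo::bar = 0;` is needed in any translation unit.
    inline static int bar = 0;
};

// Hypothetical helper: modifies and returns the single shared object.
int bump_bar() { return ++foo::bar; }

// Usage, e.g. inside main():
//   assert(bump_bar() == 1);
//   int *p = &foo::bar;     // taking the address links fine
//   assert(*p == 1);
```

If this class definition lives in a header included from several source files, the linker is expected to merge the copies into one object, much like it does for inline functions.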
In early C++ it was allowed to define static data members inside the class, which certainly violates the idea that a class is only a blueprint and does not set memory aside. This has since been dropped.
Putting the definition of a static member outside the class emphasizes that memory is allocated only once for the static data member (at compile time). Each object of that class doesn't have its own copy.
static is a storage type: when you declare the variable you are telling the compiler "this will be in the data section somewhere", and when you subsequently use it, the compiler emits code that loads a value from a TBD address.
In some contexts, the compiler can deduce that a static is really a compile-time constant and replace it with such, for example
static const int meaning = 42;
Inside a function that never takes the address of the value.
When dealing with class members, however, the compiler can't guess where this value should be created. It might be in a library you will link against, or a dll, or you might be providing a library where the value must be provided by the library consumer.
Usually, when someone asks this, though, it is because they are misusing static members.
If all you want is a constant value, e.g.
static int MaxEntries;
...
int Foo::MaxEntries = 10;
You would be better off with one or other of the following
static const int MaxEntries = 10;
// or
enum { MaxEntries = 10 };
The static requires no separate definition until something tries to take the address of or form a reference to the variable, the enum version never does.
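A sketch of the two suggested alternatives side by side. Foo here is a stand-in class, and MaxSlots and capacity() are hypothetical names added for the example. Only the values are used, never an address or reference, so no out-of-class definition is needed.

```cpp
#include <cassert>

struct Foo {
    static const int MaxEntries = 10;   // declaration with in-class initializer
    enum { MaxSlots = 10 };             // enumerator: a pure value, no object

    int entries[MaxEntries];            // both work as array bounds
    int slots[MaxSlots];
};

// Reading the values (not their addresses) needs no separate definition.
int capacity() { return Foo::MaxEntries + Foo::MaxSlots; }

// By contrast, something like `const int *p = &Foo::MaxEntries;` would
// require the usual `const int Foo::MaxEntries;` in one translation unit.
// The enumerator can never be used that way, so it never needs one.

// Usage, e.g. inside main():
//   assert(capacity() == 20);
```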
Inside the class you are only declaring the variable, i.e. you tell the compiler that there is something with this name.
However, a static variable must get some memory space to live in, and this must be inside one translation unit. The compiler reserves this space only when you DEFINE the variable.
A structure is not a variable, but an instance of it is. Hence we can include the same structure declaration in multiple modules, but we cannot have the same instance name defined globally in multiple modules.
A static variable of a structure is essentially a global variable. If we defined it in the structure declaration itself, we wouldn't be able to use the structure declaration in multiple modules, because that would result in the same global instance name (of the static variable) being defined in multiple modules, causing the linker error "Multiple definitions of same symbol".

C++ singleton lazy initialization implementation and linkage seem to conflict

I came across the question "C++ Singleton design pattern" and learned that there are two ways to implement the singleton pattern in C++.
1) allocate the single instance in heap and return it in the instance() call
2) return a static instance in the instance() call, this is also known as the lazy initialization implementation.
But I think the second, that is, the lazy initialization implementation, is wrong for the following reason.
The static object returned from the instance() call has internal linkage and will have unique copies in different translation units. So if the user modifies the singleton, it will not be reflected in any other translation unit.
But there are many statements that the second implementation is correct. Am I missing something?
In the context of a method, the static keyword is not about linkage. It just affects the "storage class" of the defined variable. And for static local variables the standard explicitly states:
9.3.6 A static local variable in a member function always refers to the same object, whether or not the member function is inline.
So it doesn't matter at all whether you put the code in a header or cpp file.
Note that for free / non-member function it does indeed depend on the linkage of the function, as KerrekSB pointed out.
The linkage of the name of implementation object does not matter. What matters is the linkage of the name of the function you use to access the object, and that name has, of course, external linkage:
thing.h:
Thing & TheThing(); // external linkage
thing.cpp:
#include "thing.h"
Thing & TheThing() { static Thing impl; return impl; }
Every use of the name TheThing in the program refers to the same entity, namely the function defined (uniquely) in thing.cpp.
Remember, linkage is a property of names, not of objects.
You are wrong, because the singleton is defined in one single translation unit, the one that contains the definition of the function that returns it. That means that all translation units that want to use the singleton ask the single one that actually defines it, and in the end all use the same object (as expected for a singleton pattern :-) ).
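A single-file sketch of the point made in these answers. Thing is given a made-up value member, and set_from_elsewhere() is a hypothetical stand-in for code that would normally live in another translation unit and only see the declaration of TheThing().

```cpp
#include <cassert>

struct Thing {
    int value = 0;      // hypothetical state, for illustration only
};

// One function with external linkage, one function-local static object.
Thing &TheThing() {
    static Thing impl;  // constructed on the first call, then reused
    return impl;
}

// Stand-in for a caller in some other source file.
void set_from_elsewhere(int v) { TheThing().value = v; }

// Usage, e.g. inside main():
//   set_from_elsewhere(7);
//   assert(TheThing().value == 7);       // the change is visible here
//   assert(&TheThing() == &TheThing());  // always the same object
```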

why is "static" both storage class and linkage specifier?

The static keyword defines how a variable is to be stored in memory, i.e., in the data segment if initialized or in the BSS if uninitialized. But the keyword also specifies how a variable is to be linked, i.e., local scope only.
How or why are these two things related? Can the two be separated, or was this a necessary design consideration?
IOW, why is it that if I want my variable to exist for the duration of the program, it must be linked internally?
The keyword static can probably be seen as somewhat "overloaded".
The following usage-options are all viable:
Static local variables
Static global variables
Static member variables
Static global functions
Static member functions
Variables:
In terms of runtime, all types of static variables are essentially the same. They all reside in the data-section of the program, and their addresses remain constant throughout the execution of the program. So the only difference between them is during compilation, in the scope of declaration:
Static local variable: recognized by the compiler only in the scope of the function
Static global variable: recognized by the compiler only in the scope of the file
Static member variable: recognized by the compiler only in the scope of the class
Functions:
In terms of runtime, all types of functions (static and non-static) are essentially the same. They all reside in the code-section of the program, and their addresses remain constant throughout the execution of the program. So the only difference between them is during compilation, in the scope of declaration:
Static global function: recognized by the compiler only in the scope of the file
Static member function: recognized by the compiler only in the scope of the class
Re
“why is it that if I want my variable to exist for the duration of the program, it must be linked internally?”
no that is not so.
Program-global variables are supported, and that's the default for a non-const namespace level variable. It's just wise to avoid them (to the extent possible).
Re the apparent conflation of concepts in the single keyword static: since standard C++ has no notion of dynamic libraries, linkage local to a translation unit is not meaningful for a lifetime shorter than the program's execution.

C++: Static Members can't be Defined at Declaration, but Static Function Variables can?

Here are two variables declared with the keyword static:
void fcn() {
    static int x = 2;
}
class cls {
    static int y;
};
We all know that in order for cls to link properly, int cls::y needs to be explicitly defined by the programmer exactly once.
Based on the answers to "static variables in an inlined function", it seems that even though no out-of-class definition is required for fcn::x, it is guaranteed that even inlined versions of fcn from different compilation units will reference the same fcn::x. If this is true, then the linker has to be smart enough to reach between compilation units and connect multiple instances of "the same" variable to ensure that static function variables perform as expected.
If this is possible for static function variables, it seems to me that it should also be possible for static class members... so why does the standard require a single out-of-class definition of static class members?
Yes, the linker will indeed have to merge different instances of fcn::x. In other words, even though formally the language says that fcn::x has no linkage, physically it will have to be exposed as an external symbol in all object files that contain it. This is how it is typically implemented in practice: your compiler will expose fcn::x as some sort of heavily mangled external name ##$%^&_fcn_x or such (to ensure it can never clash with "real" external names). This is what the linker will use to merge all instances of fcn::x into one.
As for class members... Firstly, it is not really about what is "possible". It is about the language-level concepts of declarations and definitions. It is about One Definition Rule, which is a higher-level concept than what is "possible" based on raw linker features. According to that rule, objects with external linkage shall be defined by the user and shall have one and only one definition. Static data members of the class are objects with external linkage. The rest follows.
Secondly, and more practically, there's another serious issue with static data members. It is their order of initialization. Static data members are [guaranteed to be] initialized no later than when the first function from the containing translation unit is called (which refers to the translation unit that contains the data member definition). And static objects declared in a single translation unit are initialized in the order of their definition, top to bottom. This is an important property of the static data member initialization process. Allowing static data members to be defined "automatically" would defy this part of the specification and would require massive changes to this part of the language.
In other words, when you provide a dedicated definition for a static data member of the class, you are not just doing it for ODR compliance, you are actually expressing your desired initialization order for that object.
Meanwhile, static variables inside functions are objects with no linkage. Hence they receive a different treatment at the conceptual level. And they have well-defined order-of-initialization semantics that is completely unaffected by the need to merge multiple definitions into one.
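The ordering argument can be observed with a small sketch. Everything here (Tracer, Widget, init_log(), local_tracer()) is hypothetical scaffolding: each Tracer records its name when constructed, so the log shows that static data members follow the order of their definitions (not of their in-class declarations), while a function-local static waits for the first call.

```cpp
#include <cassert>
#include <string>
#include <vector>

// The log itself is a function-local static so that it is guaranteed to
// exist before anything tries to write to it.
std::vector<std::string> &init_log() {
    static std::vector<std::string> log;
    return log;
}

struct Tracer {
    explicit Tracer(const char *name) { init_log().push_back(name); }
};

struct Widget {
    static Tracer first;    // declared first...
    static Tracer second;
};

Tracer &local_tracer() {
    static Tracer t("local");   // initialized on the first call only
    return t;
}

// ...but defined in the opposite order, and the definitions decide.
Tracer Widget::second("second");
Tracer Widget::first("first");

// Usage, e.g. inside main() of this same file:
//   assert(init_log()[0] == "second");
//   assert(init_log()[1] == "first");
//   local_tracer();
//   assert(init_log()[2] == "local");
```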

Difference between static in C and static in C++?

What is the difference between the static keyword in C and C++?
The static keyword serves the same purposes in C and C++.
When used at file level (outside of a function), it sets the visibility of the item it's applied to. Static items are not visible outside of their compilation unit (e.g., to the linker). Their duration is the same as the duration of the program.
These file-level items (functions and data) should be static unless there's a specific need to access them from outside (and there's almost never a need to give direct access to data since that breaks the central tenet of encapsulation).
If (as your comment to the question indicates) this is the only use of static you're concerned with then, no, there is no difference between C and C++.
When used within a function, it sets the duration of the item. Again, the duration is the same as the program and the item continues to exist between invocations of that function.
It does not affect the visibility of that item since it's visible only within the function. An example is a random number generator that needs to keep its seed value between invocations but doesn't want that value visible to other functions.
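A sketch of that random number generator example. The function name next_random() is hypothetical, and the constants are the ones used in the C standard's sample rand() implementation; this is an illustration of a function-local static, not a generator to use for anything serious.

```cpp
#include <cassert>
#include <cstdint>

// Returns a pseudo-random value in the range [0, 32767].
unsigned next_random() {
    // Kept between invocations, yet no other function can name it.
    static std::uint32_t seed = 1;
    seed = seed * 1103515245u + 12345u;   // wraps modulo 2^32 by design
    return (seed / 65536u) % 32768u;
}

// Usage, e.g. inside main():
//   assert(next_random() == 16838);   // first value for seed 1
//   unsigned r = next_random();       // continues from the stored seed
//   assert(r < 32768u);
```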
C++ has one more use, static within a class. When used there, it becomes a single class variable that's common across all objects of that class. One classic example is to store the number of objects that have been instantiated for a given class.
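The instance-counting example mentioned here might look like the following sketch. Tracked and created() are hypothetical names; the counter only ever goes up, so it reports how many objects have been instantiated, not how many are still alive.

```cpp
#include <cassert>

class Tracked {
public:
    Tracked() { ++created_; }
    Tracked(const Tracked &) { ++created_; }   // copies count as well

    static int created() { return created_; }  // callable without an object

private:
    static int created_;    // one counter shared by all Tracked objects
};

int Tracked::created_ = 0;  // the single definition

// Usage, e.g. inside main():
//   assert(Tracked::created() == 0);
//   Tracked a, b;
//   Tracked c(a);
//   assert(Tracked::created() == 3);
```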
As others have pointed out, the use of file-level static has been deprecated in favour of unnamed namespaces. However, I believe it'll be a cold day in a certain warm place before it's actually removed from the language - there's just too much code using it at the moment. And ISO C has only just gotten around to removing gets() despite the amount of time we've all known it was a dangerous function.
And even though it's deprecated, that doesn't change its semantics now.
The use of static at file scope to restrict access to the current translation unit was deprecated in C++03 (C++11 lifted the deprecation), but has always been acceptable in C.
In C++, you can instead use an unnamed namespace:
namespace
{
    int file_scope_x;
}
Variables declared this way are only available within the file, just as if they were declared static.
The main reason for the deprecation is to remove one of the several overloaded meanings of the static keyword.
Originally, it meant that the variable, such as in a function, would be given storage for the lifetime of the program in an area for such variables, and not stored on the stack as is usual for function local variables.
Then the keyword was overloaded to apply to file scope linkage. It's not desirable to make up new keywords as needed, because they might break existing code. So this one was used again with a different meaning without causing conflicts, because a variable declared as static can't be both inside a function and at the top level, and functions didn't have the modifier before. (The storage connotation is totally lost when referring to functions, as they are not stored anywhere.)
When classes came along in C++ (and in Java and C#) the keyword was used yet again, but the meaning is at least closer to the original intention. Variables declared this way are stored in a global area, as opposed to on the stack as for function variables, or on the heap as for object members. Because variables cannot be both at the top level and inside a class definition, extra meaning can be unambiguously attached to class variables. They can only be referenced via the class name or from within an object of that class.
It has the same meaning in both languages.
But C++ adds classes. In the context of a class (and thus a struct) it has the extra meaning of making the method/variable members of the class rather than members of the object.
class Plop
{
    static int x; // This is a member of the class not an instance.
public:
    static int getX() // method is a member of the class.
    {
        return x;
    }
};
int Plop::x = 5;
Note that the use of static to mean "file scope" (aka namespace scope) is only deprecated by the C++ Standard for objects, not for functions. In other words:
// foo.cpp
static int x = 0; // deprecated
static int f() { return 1; } // not deprecated
To quote Annex D of the Standard:
The use of the static keyword is deprecated when declaring objects in namespace scope.
You cannot declare a static variable inside a structure in C, but it is allowed in C++ (the member is then defined with the help of the scope resolution operator).
Also, in C++ a static member function can directly access only static member variables, whereas in C a static function can use both static and non-static variables... 😊