Proper Global Variable definition/declaration - c++

This is probably a pretty straightforward question, but for some reason I haven't been able to find an answer on the great interwebs so far.
I know global variables are bad and should for the most part be avoided, but on those rare occasions where a global variable gets the job done best, should it be both declared and initialized at once? I have recently been trying to drill into my head the mantra "always initialize variables upon declaration when possible", since this usually saves many headaches later and is encouraged in C++. Does this rule apply to global variables as well, though?
If you initialize a variable in its global scope when declaring it, how does this affect the program? Is this best practice?
Your advice is very much appreciated!

You want to initialize variables at the point they are declared whenever possible, regardless of their scope.
For global variables, if you don't initialize them when they are declared, you need to initialize them after your program starts running, which leaves open the possibility that the global will be used before this occurs. Objects of class types with constructors will be default-constructed, which can cause work to be done and then thrown away when the variable is later assigned its proper value.
Global variables have another problem: the order they are constructed/initialized is only partially defined by the language. If you have two global variables that are declared in different source modules, you do not know which one will be constructed first. This can lead to the static initialization order fiasco.
So you can run into problems if you initialize them, or different problems if you don't. This is one reason why global variables should be avoided.
What is the solution? You can wrap your global variables into an accessor function. This will ensure that the variable has been properly initialized. So rather than the plain declaration:
SomeType big = ReadBig();
you can place it into a function:
const SomeType &GetBig() {
static SomeType big = ReadBig();
return big;
}
This also has the advantage of making your global variable const, so that it cannot be modified, if that is what you need.
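As a minimal sketch of the accessor approach above: SomeType and ReadBig() are not defined in the answer, so the versions below are hypothetical stand-ins, and read_big_calls exists only to show that the initializer runs once.

```cpp
#include <string>

// Hypothetical stand-in for SomeType: any type with a non-trivial constructor.
struct SomeType {
    std::string payload;
};

// Illustration only: counts how many times ReadBig() runs.
int read_big_calls = 0;

// Hypothetical loader; imagine it reads a file or some configuration source.
SomeType ReadBig() {
    ++read_big_calls;
    return SomeType{"big data"};
}

// The function-local static is initialized the first time control passes
// through its declaration, so no caller can observe it uninitialized.
const SomeType &GetBig() {
    static const SomeType big = ReadBig();
    return big;
}
```

Every call to GetBig() returns a reference to the same object, and ReadBig() runs only on the first call, even if that call comes from the initializer of another global in a different source file.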

Yes, you do want to initialize global variables. As #paladin commented, if you do not initialize them, the compiler will attempt to initialize them with the default values.
Consider this simple example:
struct Foo {
Foo(int x, char *y, double z) {}
};
Foo f;
The compiler will try to initialize f, but there is no default constructor. This produces an error:
<source>:5:5: error: no matching constructor for initialization of 'Foo'
Foo f;
^
<source>:1:8: note: candidate constructor (the implicit copy constructor) not viable: requires 1 argument, but 0 were provided
So yes, you need to initialize your global variables, and if you don't the compiler will try to do it for you.
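To make the two cases concrete, here is a small sketch. The data members of Foo, the const on the char pointer, and the variable counter are assumptions added for illustration; they are not part of the example above.

```cpp
// Hypothetical variant of Foo that stores its arguments so they can be inspected.
struct Foo {
    Foo(int x, const char *y, double z) : x(x), y(y), z(z) {}
    int x;
    const char *y;
    double z;
};

// Built-in type with no initializer: a global like this is zero-initialized
// before main() starts.
int counter;

// Class type without a default constructor: an initializer is mandatory.
Foo f(1, "one", 2.5);
```

So the "default value" the compiler supplies is zero for built-in types and a call to the default constructor for class types; when no default constructor exists, the declaration only compiles with an explicit initializer.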

Related

Why C++ static data members are needed to define but non-static data members do not?

I am trying to understand the difference between the declaration and definition of static and non-static data members. Apologies if I have fundamentally misunderstood the concepts. Your explanations are highly appreciated.
Code Trying to understand
class A
{
public:
int ns; // declare non-static data member.
static int s; // declare static data member.
void foo();
};
int A::s; // define static data member.
// int A::ns; //This gives an error if defined.
void A::foo()
{
ns = 10;
s = 5; // if s is not defined this gives an error 'undefined reference'
}
When you declare something, you're telling the compiler that the name being declared exists and what kind of name it is (type, variable, function, etc.) The definition could be with the declaration (as with your class A) or be elsewhere—the compiler and linker will have to connect the two later.
The key point of a variable or function definition is that it tells the compiler and linker where this variable/function will live. If you have a variable, there needs to be a place in memory for it. If you have a function, there needs to be a place in the binary containing the function's instructions.
For non-static data members, the declaration is also the definition. That is, you're giving them a place to live¹. This place is within each instance of the class. Every time you make a new A object, it comes with an ns as part of it.
Static data members, on the other hand, have no associated object. Without a definition, you've got a situation where you have N instances of A all sharing the same s, but nowhere to put s. Therefore, C++ makes you choose one translation unit for it via a definition, most often the source file that accompanies that header.
You could argue that the compiler should just pick one instance for it, but this won't work for various reasons, one being that you can use static data members before ever creating an instance, after the last instance is gone, or without having instances at all.
Now you might wonder why the compiler and linker still can't just figure it out on their own, and... that's actually pretty much what happens if you slap an inline on the variable or function. You can end up with multiple definitions, but only one will be chosen.
1: Giving them a place to live is a little beside the point here. All the compiler needs to know when it creates an object of that class is how much space to give it and which parts of that space are which data members. You could think of it as the compiler doing the definition part for you since there's only one place that data member could possibly live.
static members are essentially global variables with a special name and access rules tied to the class. Hence, they inherit all the problems for usual global variables. Namely, in the whole C++ program (which is the union of all translation units aka .cpp files) there should be exactly one definition of each global variable, no more.
You can think of "variable definition" as "the place which will allocate memory for the variable".
However, classes are typically defined in a header file (.h/.hpp/etc) which is included in multiple translation units. So it's up to the programmer to specify which translation unit actually defines the variable. Note that since C++17 the inline keyword can shift this burden to the compiler and linker; look for "inline variables". The naming is weird for historical reasons.
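A short sketch of what inline variables look like, assuming a C++17 (or later) compiler. Config, verbosity, name, and request_counter are hypothetical names chosen for illustration.

```cpp
#include <string>

// Header-style content: each declaration below is also a definition, and the
// linker merges the copies that appear in different translation units.
struct Config {
    static inline int verbosity = 1;
    static inline const std::string name = "default";
};

// The same keyword works for variables at namespace scope.
inline int request_counter = 0;
```

With this form there is no separate `int Config::verbosity;` line in a .cpp file; including the header from several source files still yields exactly one variable.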
However, non-static members do not really exist until you create an instance of the class, i.e. an object. And it's the object lifetime and storage duration which define how each individual member is created/stored/destroyed. So there is no need to actually define them anywhere outside of the class.
static variables belong to the class definition. non-static variables belong to the instances created with the class definition.
int main()
{
A::s = 5; // this is ok
A a;
a.ns = 5; // this is also ok
}

A global variable in C++ can be called?

So I was looking through a few old C++ test books and I found the solution to one of the questions very cool! I have never seen this "syntax" before and wanted to ask if anyone knows how it actually works and why it isn't taught widely!
Question: Give the output to the following code ->
int g =10; //TAKE NOTE OF THIS VARIABLE
void func(int &x, int y){
x = x-y;
y = x*10;
cout << x << ',' << y << "\n";
}
int main(int argc, char** argv){
int g = 7; //Another NOTE
func(::g,g); // <----- "::g" is different from "g"
cout << g << ',' << ::g << "\n";
func(g,::g);
cout << g << ',' << ::g << "\n";
}
The Output:
3,30
7,3
4,40
4,3
My question was how does the "::(variable)" syntax work exactly? It gets the variable stored outside of the main but where is that memory stored(Stack/Heap)? Can we change the value of that "Global" variable through pointers?
I thought this might allow for some really cool implementations, and wanted to share this knowledge with those who, like me, did not know of this :)
My question was how does the "::(variable)" syntax work exactly?
:: is the scope resolution operator. With the name of a class or namespace before it, it means that the name after it is scoped inside that class or namespace. With, as here, nothing before it, it means that the name after it is scoped in the global namespace; that is, it's declared outside any classes, functions, or namespaces.
Often, you can refer to a global variable by name without ::. It's needed here since the global is hidden by a local variable with the same name. This is one reason to avoid global variables: the meaning of code can change if you add a declaration that hides it.
where is that memory stored(Stack/Heap)?
It's in static storage, neither on a stack nor the heap. The storage is allocated when the program begins, and lasts until it ends.
If the variable has a complicated type, it might not be initialised until some time after the program starts; and you might get obscure and painful bugs when your code uses it before initialisation. This is another reason to avoid global variables.
Can we change the value of that "Global" variable through pointers?
Yes. Your example does that, albeit with a reference rather than a pointer. It can also be changed directly, e.g. ::g = 42;, by any code at any time, so it's hard to reason about the state of a program that contains them. This is yet another reason to avoid global variables.
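For the pointer case specifically, a minimal sketch might look like this. The helper set_through_pointer and the function demo are hypothetical names, not taken from the question.

```cpp
int g = 10;  // global, lives in static storage

// Writes through a pointer; the pointee may be a global or a local.
void set_through_pointer(int *p, int value) {
    *p = value;
}

int demo() {
    int g = 7;                      // local g hides the global one
    set_through_pointer(&::g, 42);  // address of the global g
    set_through_pointer(&g, 1);     // address of the local g
    return ::g * 100 + g;           // 42 * 100 + 1 == 4201
}
```

The pointer itself carries no trace of where the int lives; only the expression used to take the address (&::g versus &g) decides which variable is modified.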
I thought this might allow for some really cool implementations
Global variables are nearly always more trouble than they're worth, for the reasons I've mentioned here and others. I'd avoid them if I were you.
Plain g means "use the most local g", ::g means use the global g.
More generally, example::g means "use the g from namespace example".
Also, if you can somehow avoid it (and you usually can), do not use global variables and do not use trickery like this, it is very error-prone.
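The three spellings (plain g, ::g, and example::g) can be compared side by side in a small sketch; the values and the function name pick are arbitrary choices for illustration.

```cpp
int g = 1;            // global namespace

namespace example {
int g = 2;            // namespace example
}

int pick() {
    int g = 3;        // local, hides the global g inside this function
    // plain g -> 3, example::g -> 2, ::g -> 1
    return g * 100 + example::g * 10 + ::g;  // 321
}
```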
Static variables are in a category of their own: they're not really on "the stack" or "the heap." There is a specific section of your process' memory space set aside to hold static variables which is distinct from the stack or the heap. See the link given for a full discussion.
You can still use this variable as you would any other, and you can indeed take a pointer to it and change its value via the pointer.
As others have suggested, don't get too excited about this: global variables are frowned upon. They are often a sign of poor design, and there are many real-world disadvantages and pitfalls.
By prepending :: to a variable or function, you tell the compiler that it should look for this variable/function in the global namespace, i.e. outside the function's scope.
:: is the scope resolution operator. If you have some bar identifier declared within the foo namespace, you can use foo::bar to denote it. If you have the scope resolution operator with nothing preceding it, then it denotes the global namespace. That's why ::g refers to the g in the global namespace.
It gets the variable stored outside of the main but where is that memory stored(Stack/Heap)? Can we change the value of that "Global" variable through pointers?
I'm not sure why these questions arose. They are using the global variable g just as you would be able to if the local g wasn't declared. The only reason the :: is required is because there are two g identifiers. You can do pretty much anything with it that you would be able to do with any other object with the same type.
The :: in C++ is called the scope resolution operator. It's documented in many places (here for example). It provides a way to name variables that may be in a different scope than the current code. There is an optional scope specification in front of the :: (A::g, for instance, for a namespace A); no scope specification indicates the "global" scope. Anything you can do with variables (including modifying through pointers) can be done with variables that have a scope resolution operator. Where the variable lives is defined by what scope it was defined in.
Static variables (global or not) are in one of a couple of places, depending on the compiler and possibly on how they are initialized. See, for example, this thread.

is static const string member variable always initialized before used?

In C++, if I want to define some non-local const string which can be used in different classes, functions, files, the approaches that I know are:
use define directives, e.g.
#define STR_VALUE "some_string_value"
const class member variable, e.g.
class Demo {
public:
static const std::string ConstStrVal;
};
// then in cpp
const std::string Demo::ConstStrVal = "some_string_value";
const class member function, e.g.
class Demo{
public:
static const std::string GetValue(){return "some_string_value";}
};
Now what I am not clear about is: if we use the 2nd approach, is the variable ConstStrVal always initialized to "some_string_value" before it is actually used by any code in any case? I'm concerned about this because of the "static initialization order fiasco". If this issue is valid, why is everybody using the 2nd approach?
Which is the best approach, 2 or 3? I know that #define directives have no respect of scope, most people don't recommend it.
Thanks!
if we use the 2nd approach, is the variable ConstStrVal always initialized to "some_string_value" before it is actually used by any code in any case?
No
It depends on the value it's initialized to, and the order of initialization. ConstStrVal has a global constructor.
Consider adding another global object with a constructor:
static const std::string ConstStrVal2(ConstStrVal);
The order is not defined by the language, and ConstStrVal2's constructor may be called before ConstStrVal has been constructed.
The initialization order can vary for a number of reasons, but it's often specified by your toolchain. Altering the order of linked object files could (for example) change the order of your image's initialization and then the error would surface.
why is everybody using the 2nd approach?
Many people use other approaches, for very good reasons…
Which is the best approach, 2 or 3?
Number 3. You can also avoid multiple constructions like so:
class Demo {
public:
static const std::string& GetValue() {
// this is constructed exactly once, when the function is first called
static const std::string s("some_string_value");
return s;
}
};
Caution: this approach is still capable of the initialization problem seen in ConstStrVal2(ConstStrVal). However, you have more control over initialization order, and it's an easier problem to solve portably when compared to objects with global constructors.
In general, I (and many others) prefer to use functions to return values rather than variables, because functions give greater flexibility for future enhancement. Remember that most of the time spent on a successful software project is maintaining and enhancing the code, not writing it in the first place. It's hard to predict if your constant today might not be a compile time constant tomorrow. Maybe it will be read from a configuration file some day.
So I recommend approach 3 because it does what you want today and leaves more flexibility for the future.
Avoid using the preprocessor with C++. Also, why would you have a string in a class, but need it in other classes? I would re-evaluate your class design to allow better encapsulation. If you absolutely need this global string then I would consider adding a globals.h/cpp module and then declare/define string there as:
const char* const kMyErrorMsg = "This is my error message!";
Don't use preprocessor directives in C++, unless you're trying to achieve a holy purpose that can't possibly be achieved any other way.
From the standard (3.6.2):
Objects with static storage duration (3.7.1) shall be zero-initialized
(8.5) before any other initialization takes place. A reference with
static storage duration and an object of POD type with static storage
duration can be initialized with a constant expression (5.19); this is
called constant initialization. Together, zero-initialization and
constant initialization are called static initialization; all other
initialization is dynamic initialization. Static initialization shall
be performed before any dynamic initialization takes place. Dynamic
initialization of an object is either ordered or unordered.
Definitions of explicitly specialized class template static data
members have ordered initialization. Other class template static data
members (i.e., implicitly or explicitly instantiated specializations)
have unordered initialization. Other objects defined in namespace
scope have ordered initialization. Objects defined within a single
translation unit and with ordered initialization shall be initialized
in the order of their definitions in the translation unit. The order
of initialization is unspecified for objects with unordered
initialization and for objects defined in different translation units.
So, the fate of 2 depends on whether your variable is statically initialised or dynamically initialised. For instance, in your concrete example, if you use const char * Demo::ConstStrVal = "some_string_value"; (better yet const char Demo::ConstStrVal[] if the value will stay constant in the program) you can be sure that it will be initialised no matter what. With a std::string, you can't be sure since it's not a POD type (I'm not dead sure on this one, but fairly sure).
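A sketch of the char-array variant suggested above. Demo and ConstStrVal follow the question; kLen is a hypothetical second global added to show another initializer reading the constant.

```cpp
#include <cstring>

class Demo {
public:
    static const char ConstStrVal[];
};

// Initialized from a string literal, i.e. a constant expression, so this is
// static (constant) initialization and happens before any dynamic initialization.
const char Demo::ConstStrVal[] = "some_string_value";

// Hypothetical global whose initializer calls a function at start-up.
// It can rely on ConstStrVal already holding its value, whichever
// translation unit it is defined in.
const std::size_t kLen = std::strlen(Demo::ConstStrVal);
```

Because the array never goes through dynamic initialization, the cross-translation-unit ordering problem quoted from the standard does not arise for it.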
The 3rd method allows you to be sure, and the method in Justin's answer makes sure that there are no unnecessary constructions. Keep in mind, though, that the static-local approach has a hidden overhead: every call checks whether the variable has already been initialised. If you're returning a simple constant, just returning your value is definitely faster since the function will probably be inlined.
All of that said, try to write your programs so as not to rely on static initialisation. Static variables are best regarded as a convenience, they aren't convenient any more when you have to juggle their initialisation orders.

Location of const in a function

A similar question was previously asked, but none of the answers really provided what I was looking for.
I am having trouble deciding where consts should be located in a function. I know a lot of people put them at the top, but if you put them as close as possible to where they are used, you'll reduce code span. I.e.
void f() {
const int FOO = 3;
...// some code
if ( bar > FOO ) {
...// do stuff
}
}
or
void f() {
...// some code
const int FOO = 3;
if ( bar > FOO ) {
...// do stuff
}
}
I'm leaning towards using the const at the top in small functions, and keeping the span as close as possible in large functions, but I was wondering what others' styles/thoughts are regarding this.
At the lowest scope possible, and directly before their first use.
As a matter of style, exceptions can be made for clarity/aesthetics, e.g., grouping conceptually similar constants.
Many times, const values are placed at the top of the file so that they are easily recognizable (and "findable") to individuals doing development. However, if you only need a const value for a very small piece of code, it would be better for it to be scoped to only where it is needed as you suggested.
I recommend putting them in the header file under a namespace or class.
Your approach sounds about right.
I even put these magic numbers at the top of a file sometimes, to make sure any "settings" or tweakables are highly visible to others.
It depends on what you really want to do. I usually put them very close where they are actually used.
I put them on top when they are grouped and in order to make sense of one you have to look at the others (for instance when a constant depends on another constant).
Especially if you are going to code some longer algorithm having all start values (including const values) and variables declared at the top of the function makes for a lot more clarity when reading the algorithm itself.
In pre-C99 versions of C, you could only define variables at the beginning of blocks. Consequently, the second alternative was not valid C code. I believe that Code Complete favors putting the declaration as close as possible to the first use, but some would have argued against that rule on the grounds that it makes things inconsistent between C and C++.
Now that both standard C and C++ allow you to move the declaration close to the first usage, that objection no longer holds.
There are times when there are compelling reasons why putting the declaration as late as possible is better for non-const variables than at the top. For example, a declaration without an initialization opens up the possibility of accidentally reading an uninitialized variable. Furthermore, in C++, declaring a class variable at the top of the function with no initialization invokes the default constructor. When it's later assigned, it invokes the assignment operator. If the variable were instead declared at the point of initialization, that invokes the copy constructor. The cost of a default constructor + assignment can often be larger than the cost of the copy constructor.
This last argument can only apply to non-const variables, obviously, since there is no assignment on a const variable. But why would you want to have to look in a different place for your const declarations? And if const int n=3; is obviously const, what about const char *s = "FOO";? Is that const enough to belong at the top or not? Or does it have to be const char * const s = "FOO";? Also, what if you don't yet know at the top what value you want your const variable initialized to? Then you must postpone declaring it until you know what it needs to be initialized to.
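The default-constructor-plus-assignment cost mentioned above can be made visible with an instrumented class. Tracer and the two helper functions are hypothetical names for this sketch; the counters simply record which special member functions run.

```cpp
// Hypothetical class that counts calls to its special member functions.
struct Tracer {
    static int default_ctor;
    static int copy_ctor;
    static int copy_assign;

    Tracer() { ++default_ctor; }
    Tracer(const Tracer &) { ++copy_ctor; }
    Tracer &operator=(const Tracer &) {
        ++copy_assign;
        return *this;
    }
};

int Tracer::default_ctor = 0;
int Tracer::copy_ctor = 0;
int Tracer::copy_assign = 0;

// Declared at the top, assigned later: default constructor, then assignment.
void declare_then_assign(const Tracer &src) {
    Tracer t;
    t = src;
}

// Declared at the point of initialization: a single copy construction.
void declare_at_init(const Tracer &src) {
    Tracer t = src;
}
```

Calling declare_then_assign bumps two counters (default_ctor and copy_assign), while declare_at_init bumps only copy_ctor. Whether two operations actually cost more than one depends on the type, which is why the text says "often" rather than "always".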

Difference between static in C and static in C++??

What is the difference between the static keyword in C and C++?
The static keyword serves the same purposes in C and C++.
When used at file level (outside of a function), it sets the visibility of the item it's applied to. Static items are not visible outside of their compilation unit (e.g., to the linker). Their duration is the same as the duration of the program.
These file-level items (functions and data) should be static unless there's a specific need to access them from outside (and there's almost never a need to give direct access to data since that breaks the central tenet of encapsulation).
If (as your comment to the question indicates) this is the only use of static you're concerned with then, no, there is no difference between C and C++.
When used within a function, it sets the duration of the item. Again, the duration is the same as the program and the item continues to exist between invocations of that function.
It does not affect the visibility of that item since it's visible only within the function. An example is a random number generator that needs to keep its seed value between invocations but doesn't want that value visible to other functions.
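As a sketch of the generator idea, here is a deliberately tiny (and statistically useless) toy; the function name next_value and the constants are assumptions chosen so the sequence is easy to follow by hand. The code is valid in both C and C++.

```cpp
// Toy generator: the state survives between calls but is invisible to
// every other function in the program.
unsigned next_value() {
    static unsigned seed = 0u;       // initialized once, not on every call
    seed = (seed * 5u + 1u) % 16u;   // 0 -> 1 -> 6 -> 15 -> ...
    return seed;
}
```

Successive calls return 1, 6, 15, and so on, because seed keeps its value from the previous invocation instead of being reset.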
C++ has one more use, static within a class. When used there, it becomes a single class variable that's common across all objects of that class. One classic example is to store the number of objects that have been instantiated for a given class.
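The object-counting example mentioned above might be sketched as follows; Tracked, live_, and live() are hypothetical names.

```cpp
// Counts how many Tracked objects currently exist.
class Tracked {
public:
    Tracked() { ++live_; }
    Tracked(const Tracked &) { ++live_; }  // copies count as objects too
    ~Tracked() { --live_; }

    // Static member function: callable without any object.
    static int live() { return live_; }

private:
    static int live_;  // one variable shared by all objects (declaration)
};

int Tracked::live_ = 0;  // the single definition
```

Tracked::live() can be called before any object exists and after the last one is destroyed, since the counter belongs to the class rather than to an instance.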
As others have pointed out, the use of file-level static has been deprecated in favour of unnamed namespaces. However, I believe it'll be a cold day in a certain warm place before it's actually removed from the language - there's just too much code using it at the moment. And ISO C has only just gotten around to removing gets() despite the amount of time we've all known it was a dangerous function.
And even though it's deprecated, that doesn't change its semantics now.
The use of static at the file scope to restrict access to the current translation unit is deprecated in C++, but still acceptable in C.
Instead, use an unnamed namespace
namespace
{
int file_scope_x;
}
Variables declared this way are only available within the file, just as if they were declared static.
The main reason for the deprecation is to remove one of the several overloaded meanings of the static keyword.
Originally, it meant that the variable, such as in a function, would be given storage for the lifetime of the program in an area for such variables, and not stored on the stack as is usual for function local variables.
Then the keyword was overloaded to apply to file scope linkage. It's not desirable to make up new keywords as needed, because they might break existing code. So this one was used again with a different meaning without causing conflicts, because a variable declared as static can't be both inside a function and at the top level, and functions didn't have the modifier before. (The storage connotation is totally lost when referring to functions, as they are not stored anywhere.)
When classes came along in C++ (and in Java and C#) the keyword was used yet again, but the meaning is at least closer to the original intention. Variables declared this way are stored in a global area, as opposed to on the stack as for function variables, or on the heap as for object members. Because variables cannot be both at the top level and inside a class definition, extra meaning can be unambiguously attached to class variables. They can only be referenced via the class name or from within an object of that class.
It has the same meaning in both languages.
But C++ adds classes. In the context of a class (and thus a struct) it has the extra meaning of making the method/variable a member of the class rather than a member of each object.
class Plop
{
static int x; // This is a member of the class not an instance.
public:
static int getX() // method is a member of the class.
{
return x;
}
};
int Plop::x = 5;
Note that the use of static to mean "file scope" (aka namespace scope) is only deprecated by the C++ Standard for objects, not for functions (this is the C++03 wording; C++11 later removed the deprecation altogether). In other words:
// foo.cpp
static int x = 0; // deprecated
static int f() { return 1; } // not deprecated
To quote Annex D of the Standard:
The use of the static keyword is
deprecated when declaring objects in
namespace scope.
You cannot declare a static variable inside a structure in C, but C++ allows it; the member is then defined outside the class using the scope resolution operator.
Also, in C++ a static member function can directly access only static members, whereas a static function in C can use both static and non-static variables. 😊