Initialising variables - why and what are the risks?

Having recently got feedback from Code Review stating the impropriety of non-initialized variables, my class variable initialization now seems very ugly:
class MyClass
{
private:
    int variable_one;
    int variable_two;
    int variable_three;
public:
    MyClass() : variable_one(0), variable_two(0), variable_three(0) {}
    //...
};
Previously, I wouldn't assign my variables until they were needed:
class MyClass
{
private:
    int variable_one;
    void MyFunction(int x)
    {
        variable_one = x;
    }
};
Why is my second example frowned upon? What are the risks involved in not initializing variables?

The risk with leaving variables uninitialized is that you might read them before they've been set up. That can lead to extremely hard-to-diagnose bugs and erratic behavior. You can also initialize them to sentinel values to make it easier to detect when they haven't been set up.
As a note, since C++11 (which most compilers now support), you can just do this:
class MyClass
{
private:
    int variable_one = 0;
    int variable_two = 0;
    int variable_three = 0;
};
Now there's little code overhead and it makes clear that they get default values unless you specifically set them to something else.
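To illustrate the sentinel idea mentioned above, here is a minimal sketch. The class Sensor, the constant kUnset and the member names are hypothetical and chosen only for illustration; the point is that a default member initializer can hold a value that means "not set yet", so an accidental early read can be detected.

```cpp
#include <limits>

// Hypothetical example: kUnset is an illustrative sentinel value.
class Sensor {
public:
    // A value that no real reading is expected to take (an assumption of this sketch).
    static constexpr int kUnset = std::numeric_limits<int>::min();

    void setReading(int value) { reading_ = value; }

    // True once setReading() has stored something other than the sentinel.
    bool hasReading() const { return reading_ != kUnset; }

    int reading() const { return reading_; }

private:
    int reading_ = kUnset;  // C++11 default member initializer
};
```

Callers can then check hasReading() before trusting reading(), instead of silently working with garbage.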

It is frowned upon because someone, someday, is going to use your class and assume that its internal state was set up during initialization. Perhaps you are worried about wasting time on initialization that isn't needed? Instead, depending on how many methods you have, you will end up repeating the initialization code in every one of them until you forget it in just one. It will still appear to work, because you ran a debug build in which memory happened to be cleared or filled with a known pattern. Then a month later someone compiles in release mode, and all of a sudden they think their code is broken because you didn't initialize your variables in a central location.
RAII (Resource Acquisition Is Initialization) whenever possible. When someone creates an instance of your class, make sure it is initialized.

The answers provided so far are all correct, but no one mentioned there's a more general OO principle at work: An object's methods transform it from one internally consistent state to another. It should never be possible to use an object where it does something silly (unless it's a silly object).
For example, if an object has an array and a count of currently active elements in the array, it should never be true that there are active elements when count is zero. The method that updates the array also updates the count, keeping the state consistent with itself. The momentary inconsistency -- after the element is added and before the count is updated -- is not visible to the user of the object.
In your example, MyClass gets off on the wrong foot by creating a nondeterministic initial state. Whatever relationship the member variables have to each other, their initial values are determined by happenstance (whatever was in that memory). The more the class is used, the closer the probability that those values are what you want gets to zero.
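The array-plus-count example can be sketched roughly as follows. SmallStack and its members are hypothetical names; the sketch only assumes a fixed capacity of 8 to keep things short. Both members get a defined initial state, and push() updates the array and the count together, so callers never observe them out of step.

```cpp
#include <array>
#include <cstddef>

// Hypothetical fixed-capacity stack used to illustrate a class invariant:
// the first count_ slots of items_ are the active elements.
class SmallStack {
public:
    // Returns false (and changes nothing) when the stack is full.
    bool push(int value) {
        if (count_ == items_.size()) return false;
        items_[count_] = value;  // momentarily inconsistent...
        ++count_;                // ...but consistent again before we return
        return true;
    }

    std::size_t size() const { return count_; }
    bool empty() const { return count_ == 0; }

    // Precondition: !empty()
    int top() const { return items_[count_ - 1]; }

private:
    std::array<int, 8> items_{};  // value-initialized to zeros
    std::size_t count_ = 0;       // starts consistent with an empty array
};
```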

The first method you've shown is called a member initializer list. Apart from C++11 default member initializers, it's the only way to initialize const or reference data members of your class.
If you don't explicitly initialize data members of class type, they are still default-initialized by calling their default constructors, and then the work is repeated when you assign to them later (as you do inside a function, when you deem it necessary).
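A small sketch of the point about const and reference members. The class Logger and its members are made up for illustration. Neither member below can be assigned in the constructor body, so both are set in the member initializer list; this also constructs the string once instead of default-constructing it and assigning later.

```cpp
#include <string>

// Hypothetical class with a const member and a reference member.
class Logger {
public:
    Logger(const std::string& prefix, int& counter)
        : prefix_(prefix),    // const member: must be initialized here
          counter_(counter)   // reference member: must be bound here
    {}

    // Counts the message and returns it with the prefix prepended.
    std::string format(const std::string& message) {
        ++counter_;
        return prefix_ + message;
    }

private:
    const std::string prefix_;
    int& counter_;
};
```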

Related

What's better to use and why?

class MyClass {
private:
    unsigned int currentTimeMS;
public:
    void update() {
        currentTimeMS = getTimeMS();
        // ...
    }
};
class MyClass {
public:
    void update() {
        unsigned int currentTimeMS = getTimeMS();
        // ...
    }
};
update() is called in the main game loop, so in the second case we get a lot of allocation operations (unsigned int currentTimeMS). In the first case we allocate only once and reuse that variable afterwards.
Which of these is better to use, and why?

I recommend the second variant because it is stateless and the scope of the variable is smaller. Use the first one only if you really experience a performance issue, which I consider unlikely.
If you do not modify the variable's value later, you should also consider making it const, both to express this intent in your code and to give the compiler additional optimization options.
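As a rough sketch of that advice: the question does not show getTimeMS(), so the version below is a hypothetical stand-in built on std::chrono, and lastUpdateMS_ is an invented member. The current time is a const local, and only the value that genuinely has to survive between calls is stored in the object.

```cpp
#include <chrono>

// Hypothetical stand-in for the question's getTimeMS(), which is not shown there.
unsigned int getTimeMS() {
    using namespace std::chrono;
    return static_cast<unsigned int>(
        duration_cast<milliseconds>(steady_clock::now().time_since_epoch()).count());
}

class MyClass {
public:
    // Returns the milliseconds elapsed since the previous call (or since construction).
    unsigned int update() {
        const unsigned int currentTimeMS = getTimeMS();  // local and read-only
        const unsigned int elapsedMS = currentTimeMS - lastUpdateMS_;
        lastUpdateMS_ = currentTimeMS;                   // state that must persist
        return elapsedMS;
    }

private:
    unsigned int lastUpdateMS_ = getTimeMS();
};
```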

It depends on your needs. If currentTimeMS is needed only temporarily in update(), then by all means declare it there (your second option).
But if its value is needed by the instance of the class (i.e. it is used in some other method), then you should declare it as a field (your first option).

In the first example, you are saving the state in the class object. In the second one you are not, so currentTimeMS is lost the instant update() returns.
It is really up to you to decide which one you need.

The first case defines a member variable, the second a local variable. Basic class stuff. A private member variable is available to any function (method) in that class. A local variable is only available in the function in which it is declared.

Which of these is better to use, and why?
First and foremost, the cited code is at best a tiny micro-optimization. Don't worry about such things unless you have to.
In fact, this is most likely a pessimization. Automatic variables are sometimes allocated on the stack. Stack allocation is extremely fast (and sometimes even free), so there is no need to worry. Other times, the compiler may place a small automatic variable such as the unsigned int used here in a register, in which case there is no allocation whatsoever.
Compare that to making the variable a data member of the class, and solely for the purpose of avoiding that allocation. Accessing that variable involves going through the this pointer. Pointer dereference has a cost, potentially well beyond that of adding an offset to a pointer. The dereference might result in a cache miss. Even worse, this dereferencing may well be performed every time the variable is referenced.
That said, sometimes it is better to create data members solely for the purpose of avoiding automatic variables in various member functions. Large arrays declared as local automatic variables might well result in stack overflow. Note, however, that making double big_array[2000][2000] a data member of MyClass will most likely make it impossible to have a variable of type MyClass be declared as a local automatic variable in some function.
The standard solution to the problems created by placing large arrays on the stack is to instead allocate them on the heap. This leads to another place where creating a data member to avoid a local variable can be beneficial. While stack allocation is extremely fast, heap allocation (e.g., new) is quite slow. A member function that is called repeatedly may benefit by making the automatic variable std::unique_ptr<double[]> big_array = std::make_unique<double[]>(2000*2000) a data member of MyClass.
Note that neither of the above applies to the sample code in the question. Note also that the last concern (making a heap-allocated variable a data member so as to avoid repeated allocations and deallocations) means that the code has to go through the this pointer to access that memory. In tight code, I've sometimes been forced to create a local automatic pointer variable such as double* local_pointer = this->some_pointer_member to avoid repeated traversals through this.
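The last two points can be sketched together. Grid, cells_ and kSize are hypothetical names, and the buffer is kept deliberately small here. The heap buffer is allocated once in the constructor and reused by every call, and the hot loops copy the member pointer into a local first. Whether the local copy actually helps depends on the compiler and the surrounding code, so treat it as something to measure rather than a rule.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical class that owns a heap buffer for its whole lifetime (needs C++14).
class Grid {
public:
    Grid() : cells_(std::make_unique<double[]>(kSize)) {}  // elements start at 0.0

    void fill(double value) {
        double* local = cells_.get();  // one read through this, then a plain pointer
        for (std::size_t i = 0; i < kSize; ++i) local[i] = value;
    }

    double sum() const {
        const double* local = cells_.get();
        double total = 0.0;
        for (std::size_t i = 0; i < kSize; ++i) total += local[i];
        return total;
    }

private:
    static constexpr std::size_t kSize = 256 * 256;
    std::unique_ptr<double[]> cells_;  // allocated once, not on every call
};
```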

Access member variables directly or pass as parameter?

I noticed that even when paying respect to the single responsibility principle of OOD, classes sometimes still grow large. Sometimes accessing member variables directly in methods feels like having global state, and a lot of stuff exists in the current scope. Just by looking at the method I am currently working in, it is no longer possible to determine where the individual variables accessible in the current scope come from.
When working together with a friend lately, I realized I write much more verbose code than him, because I pass member variables still as parameters into every single method.
Is this bad practice?
edit: example:
class AddNumbers {
public:
    int a, b;
    // ...
    int addNumbers() {
        // I could have called this without arguments like this:
        // return internalAlgorithmAddNumbers();
        // because the data needed to compute the result is in members.
        return internalAlgorithmAddNumbers(a, b);
    }
private:
    int internalAlgorithmAddNumbers(int sum1, int sum2) { return sum1 + sum2; }
};

If a class has member variables, use them. If you want to pass parameters explicitly, make it a free function. Passing member variables around not only makes the code more verbose, it also violates people's expectations and makes the code hard to understand.
The entire purpose of a class is to create a set of functions with an implicitly passed shared state. If this isn't what you want to do, don't use a class.
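A minimal sketch of the distinction, with made-up names (addNumbers, Accumulator). The free function receives everything it needs as parameters, while the member functions work on the object's own state and take only what comes from outside.

```cpp
// Free function: no hidden state, all inputs are explicit parameters.
int addNumbers(int lhs, int rhs) { return lhs + rhs; }

// Hypothetical class: its methods operate on the implicitly shared member total_.
class Accumulator {
public:
    explicit Accumulator(int start) : total_(start) {}

    // Only the new value is passed in; total_ is reached through this.
    void add(int value) { total_ = addNumbers(total_, value); }

    int total() const { return total_; }

private:
    int total_;
};
```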

Yes, definitely a bad practice.
Passing a member variable to a member function makes no sense at all, from my point of view.
It has several disadvantages:
It decreases code readability.
It costs performance, since the parameter has to be copied onto the stack.
Converting the method to a simple function may make sense, though. In fact, from a performance point of view, a call to a non-member function can be slightly faster, because it does not need to pass or dereference the this pointer.
EDIT:
In answer to your comment: if the function can do its job using only a few explicitly passed parameters and doesn't need any internal state, then there is probably no reason to declare it as a member function. Use a simple C-style function call and pass the parameters to it.

I understand the problem, having had to maintain large classes in code I didn't originally author. In C++ we have the keyword const to help identify methods that don't change the state:
void methodA() const;
Use of this helps maintainability because we can see if a method may change the state of an object.
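A short sketch of what const buys you, using a hypothetical Counter class. The compiler rejects any attempt to modify a member inside value(), and only const member functions can be called through a const reference, so the signature alone tells the reader which calls may change state.

```cpp
// Hypothetical class for illustration.
class Counter {
public:
    int value() const { return value_; }  // promises not to modify the object
    void increment() { ++value_; }        // non-const: may change state

private:
    int value_ = 0;
};

// Takes a const reference, so only const member functions are callable here.
int peek(const Counter& counter) {
    return counter.value();
    // counter.increment();  // would not compile: increment() is not const
}
```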
In other languages that don't have this concept, I prefer to be clear about whether I'm changing the state of an instance variable, either by having it passed in by reference or by returning the change:
this->mMemberVariable = this->someMethod();
rather than:
void someMethod()
{
    this->mMemberVariable = 1; // change object state but do so in non transparent way
}
I have found over the years that this makes for easier-to-maintain code.

What purpose does this code change serve?

I am trying to understand the implications / side effects / advantages of a recent code change someone made. The change is as follows:
Original
static List<type1> Data;
Modified
static List<type1> & getData (void)
{
    static List<type1> * iList = new List<type1>;
    return * iList;
}
#define Data getData()
What purpose could the change serve?

The benefit to the revision that I can see is an issue of 'initialization time'.
The old code triggered an initialization before main() is called.
The new code does not trigger initialization until getData() is called for the first time; if the function is never called, you never pay to initialize a variable you didn't use. The (minor) downside is that there is an initialization check in the generated code each time the function is used, and there is a function call every time you need to access the list of data.
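The lazy behaviour can be observed with a small sketch. List<type1> from the question is not shown, so this uses a hypothetical Registry type wrapping std::vector<int>, plus a global counter that exists purely for demonstration. Nothing is constructed until getRegistry() is first called, and later calls return the same object.

```cpp
#include <vector>

int g_constructions = 0;  // demonstration only: counts how often Registry is built

// Hypothetical stand-in for the question's List<type1>.
struct Registry {
    Registry() { ++g_constructions; }
    std::vector<int> items;
};

// Construct on first use: the function-local static pointer is initialized
// the first time control reaches it, not before main().
Registry& getRegistry() {
    static Registry* instance = new Registry;  // intentionally never deleted
    return *instance;
}
```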

If you have a variable with static storage duration, it is created when the application is initialized, and it is destroyed when the application terminates. It is not possible to control the order in which such objects are created across different translation units.
The change makes the object be created when it is first used, and (as it is allocated dynamically and never deleted) it will never be destroyed.
This can be a good thing if other objects need this object when they are destroyed.
Update
The original code accessed the object using the variable Data. Code that uses it does not have to be modified in any way. When the code uses Data it will, in fact, be using the macro Data, which expands to getData(). This function returns a reference to the actual (dynamically allocated) object. In practice, the new code works as a drop-in replacement for the old code, with the only noticeable difference being what I described in the original answer above.

Delaying construction until the first use of Data avoids the "static initialization order fiasco".
Making some guesses about your List: the default-constructed Data is probably an empty list of type1 items, so it's probably not at great risk of causing the fiasco in question. But perhaps someone felt it was better to be safe than sorry.

There are several reasons why that change might have been made:
to prevent the static initialization order fiasco
to delay the initialization of the static variable (for whatever reason)

class with no constructor forces the use of a pointer type?

I don't want to use pointers when I don't have to, but here's the problem: in the code below if I remove the asterisk and make level simply an object, and of course remove the line level = new Level; I get a runtime error, the reason being that level is then initialized on the first line, BEFORE initD3D and init_pipeline - the methods that set up the projection and view for use. You see the problem is that level uses these two things but when done first I get a null pointer exception.
Is this simply a circumstance where the answer is to use a pointer? I've had this problem before, basically it seems extremely vexing that when a class type accepts no arguments, you are essentially initializing it where you declare it.... or am I wrong about this last part?
Level* level;
D3DMATRIX* matProjection,* matView;
//called once
void Initialise(HWND hWnd)
{
    initD3D(hWnd);
    init_pipeline();
    level = new Level;
}
I'm coming from C#, and in C# you are simply declaring a name with the line Level level; whether or not there are arguments, you still have to initialize it at some point.

You are correct that if you do:
Level level;
then level will be instantiated at that point. That is because the line above, which appears to be at global scope, isn't just a declaration but also a definition.
If this is causing you problems because Level is being instantiated before something else is being instantiated, then you have encountered a classic reason why globals suck.
You have attempted to resolve this by making level a pointer and then "initializing" it later. What might surprise you is that level is still being instantiated at the same point. The difference now is the type of level. It's not a Level anymore; now it's a pointer to Level. If you examine the value of level when your code enters Initialise, you'll see that it has a value of NULL.
It has a value of NULL instead of a garbage value because globals are static initialized, which in the case here, means zero-initialized.
But this is all somewhat tangential to the real problem, which is that you are using globals in the first place. If you need to instantiate objects in a specific order, then instantiate them in that order. Don't use globals, and you may find that by doing that, you don't need to use pointers, either.
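One way to "instantiate them in that order" without globals or pointers might look like the sketch below. Pipeline, Level and Game here are simplified, hypothetical stand-ins (the real Direct3D setup from the question is not reproduced). Non-static data members are initialized in declaration order, so the owning object guarantees the pipeline exists before the level is built.

```cpp
// Hypothetical stand-in for the state set up by initD3D() and init_pipeline().
struct Pipeline {
    bool ready = true;
};

// Hypothetical Level that needs a working pipeline at construction time.
class Level {
public:
    explicit Level(const Pipeline& pipeline) : usable_(pipeline.ready) {}
    bool usable() const { return usable_; }

private:
    bool usable_;
};

// Members are initialized in the order they are declared.
struct Game {
    Pipeline pipeline;      // constructed first
    Level level{pipeline};  // constructed second, as a plain object
};
```

A Game created as a local variable in main() (or wherever the window is set up) then has a well-defined construction order.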

Is this simply a circumstance where the answer is to use a pointer
Yea, basically.

Should I use static data members? (C++)

Let's consider a C++ class. At the beginning of the execution I want to read a set of values from an XML file and assign them to 7 of the data members of this class. Those values do not change during the whole execution and they have to be shared by all the objects / instances of the class in question. Are static data members the most elegant way to achieve this behavior? (Of course, I do not consider global variables)

As others have mentioned, the use of static members in this case seems appropriate. Just remember that it is not foolproof; some of the issues with global data apply to static members:
You cannot control the order of initialization of your static members, so you need to make sure that no globals or other statics refer to these objects. See this C++ FAQ Question for more details and also for some tips for avoiding this problem.
If you're accessing these in a multi-threaded environment, you need to make sure that the members are fully initialized before you spawn any threads.

Sounds like a good use of static variables to me. You're using these more as fixed parameters than variables, and the values legitimately need to be shared.

static members would work here and are perfectly acceptable. Another option is to use a singleton pattern for the class that holds these members to ensure that they are constructed/set only once.

It is not a clean design. Static class members are global state and global state is bad.
It might not cause you trouble if this is a small- to medium-sized project and you do not have high goals for automatic testing, but since you ask: there are better ways.
A cleaner design would be to create a Factory for the class and have the Factory pass your seven variables to the class when it constructs it. It is then the Factory's responsibility to ensure that all instances share the same values.
That way your class becomes testable and you have properly separated your concerns.
PS. Don't use singletons either.
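The factory idea described above could be sketched like this. Settings, Widget and WidgetFactory are hypothetical names, two fields stand in for the seven values, and the XML parsing itself is left out. The factory holds the values once and hands every instance the same read-only copy, so a test can build a factory with whatever values it likes.

```cpp
#include <memory>
#include <utility>

// Hypothetical stand-in for the values read from the XML file.
struct Settings {
    int width;
    int height;
};

class Widget {
public:
    explicit Widget(std::shared_ptr<const Settings> settings)
        : settings_(std::move(settings)) {}

    int area() const { return settings_->width * settings_->height; }
    const Settings& settings() const { return *settings_; }

private:
    std::shared_ptr<const Settings> settings_;  // shared and read-only
};

class WidgetFactory {
public:
    explicit WidgetFactory(Settings settings)
        : settings_(std::make_shared<const Settings>(settings)) {}

    // Every Widget created here shares the same Settings object.
    Widget create() const { return Widget(settings_); }

private:
    std::shared_ptr<const Settings> settings_;
};
```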

Sounds like an appropriate use of static class members. Just don't forget that they're really global variables with a namespace and (maybe) some protection. Therefore, if there's a possibility that your application could someday evolve separate 'environments' or something else that would need its own set of these globals, you'll have painted yourself into a corner.
As suggested by Rob, consider using a singleton, which is easier to turn later into some kind of managed environment variable.

At the beginning of the execution I want to read a set of values from an XML file and assign them to 7 of the data members of this class. Those values do not change during the whole execution and they have to be shared by all the objects / instances of the class in question.
The sentence in boldface is the kicker here. As long as that statement holds, the use of static variables is OK. How will this be enforced?
It's hard to do. So, if for your use right now the statement is always true, go ahead. If you want to be safe from some future developer (or you) using your classes wrongly (such as by reading another XML file midway through the program), then do something like what Rasmus Farber says.

Yes, static data members are what you are looking for. But you have to take care with the initialization/destruction order of your static variables. There is no mechanism in C++ to ensure that static variables defined in other translation units are initialized before you use them. To be safe, use what looks like the singleton pattern and is well known to fix that issue. It works because:
All of these static objects are completely constructed before the construction of any xml_stuff instance completes.
The order of destruction of static objects in C++ is the exact opposite of the order in which their construction completed (when their constructor finishes execution).
Code:
class xml_stuff {
public:
    xml_stuff() {
        // 1. touch all members once
        // => 2. they are created before used
        // => 3. they are created before the first xml_stuff instance
        // => 4. they will be destructed after the last xml_stuff instance is
        // destructed at program exit.
        get_member1();
        get_member2();
        get_member3();
        // ...
    }
    // the first time their respective function is called, these
    // objects will be created and references to them are returned.
    static type1 & get_member1() { static type1 t; return t; }
    static type2 & get_member2() { static type2 t; return t; }
    static type1 & get_member3() { static type1 t; return t; }
    // ... all other 7 members
};
Now, the objects returned by xml_stuff::get_memberN() are valid for the whole lifetime of any xml_stuff instance, because all of those members were constructed before any xml_stuff instance finished construction. Using plain static data members, you cannot ensure that, because the order of creation across translation units is left unspecified in C++.

As long as you think of testability, you have another way to set the static variables besides reading in a file, and you don't rely on the data being unchanged for the entire execution time of the process, you should be fine.
I've found that thinking of writing tests when you design your code helps you keep the code well-factored and reusable.
