Initialize primitives or not - c++

When do I need to initialize variables in C++? Some people assert that it's important, but maybe this is more of an issue in C?
I am referring to primitives, i.e. char, int, long, double.
Let's say I have the following code snippet:
int len;
double sum, mean;
char ch;
while (true) {
// here I use these primitives where they are initialized.
}
So, should I initialize these primitives as good programming practice here?

In C++, compilers usually do not initialize local (automatic) variables. These variables are created on the stack and hold whatever values happen to be there (their values are indeterminate). You do not always need to initialize a variable at its definition, but read carefully what the compiler says. Try:
int main() {
int x;
x=x+1;
}
and compile it with the -Wall switch (I'm using gcc). When the message
x.cpp: In function ‘int main()’:
x.cpp:3:6: warning: ‘x’ is used uninitialized in this function [-Wuninitialized]
x=x+1;
is printed, it is better to initialize that variable.

The problem is, of course, the use of uninitialised variables as in
int x;
int y=1+x; // oops what is y?
AFAIK, the language standard allows the compiler to initialise x to 0, but also to leave it uninitialised; reading x in the latter case is undefined behaviour. In any case, most optimisations (-O) will omit an initialisation in the above situation.
If you use full warning compiler flags (e.g. -Wall -Wextra -pedantic) the compiler will almost certainly spot the usage of uninitialised variables (it will also warn about usage of uninitialised variables in library header files, such as boost headers; the boost developers appear not to use such useful diagnostics).
In general, whether or not to initialise all variables is a matter of style. I would provide an explicit initialisation whenever there is a sensible initial value for a variable and/or if there is the danger of it being used uninitialised. Unlike in C, the possibility of uninitialised variables is quite rare in C++, in particular when passing by return value (including move semantics).
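The point about returning by value can be illustrated with a small sketch. The `Stats` type and `compute_stats` function are hypothetical names invented for this example; the idea is only that the caller receives a fully initialised object and never has to declare a variable first and fill it in later.

```cpp
#include <numeric>
#include <vector>

// Hypothetical result type; the names are made up for this sketch.
struct Stats {
    double sum = 0.0;
    double mean = 0.0;
};

// Returns its result by value, so the caller never holds a declared-but-unset variable.
Stats compute_stats(const std::vector<double>& values) {
    Stats s;  // members start at 0.0 thanks to the default member initializers
    s.sum = std::accumulate(values.begin(), values.end(), 0.0);
    if (!values.empty()) {
        s.mean = s.sum / static_cast<double>(values.size());
    }
    return s;
}
```

A caller would write something like `const Stats s = compute_stats(data);`, which leaves no window in which `s` exists without a value.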

You should initialize all variables to prevent "garbage in, garbage out".

When do I need to initialize variables in c++?
You should initialize local variables with a sensible value when you define them. If you cannot give a variable a sensible value yet, then you should probably define it later.
The goal here is to minimize the amount of state in your functions in order to make them easier to understand. When all variables are defined at the beginning of the function, you don't know what they are used for. When they are defined at the point they are needed, it's clear that they are not used before that point. This also helps to limit the scope in which variables are declared (e.g. inside the loop instead of before it, thus less state outside the loop) and it allows you to define more variables as const (thus not adding state).
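As a rough sketch of that advice, applied to something like the len/sum/mean snippet from the question: the function name `row_means` and its input format are assumptions made for illustration. Each variable is defined inside the loop, at the point where a sensible value is available, and is marked const where it never changes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the mean of each non-empty row.
std::vector<double> row_means(const std::vector<std::vector<double>>& rows) {
    std::vector<double> means;
    for (const auto& row : rows) {
        if (row.empty()) {
            continue;  // nothing sensible to compute for an empty row
        }
        // Each variable is defined where its value is first known.
        const std::size_t len = row.size();
        double sum = 0.0;  // the accumulator has an obvious initial value
        for (const double v : row) {
            sum += v;
        }
        const double mean = sum / static_cast<double>(len);
        means.push_back(mean);
    }
    return means;
}
```

Nothing outside the loop can see `len`, `sum`, or `mean`, so there is no state left over from one iteration to the next.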

Related

Why can we use uninitialized variables in C++?

In programming languages like Java, C# or PHP we can't use uninitialized variables. This makes sense to me.
C++ dot com states that uninitialized variables have an undetermined value until they are assigned a value for the first time. But for the integer case, is it 0?
I've noticed we can use it without initializing and the compiler shows no error and the code is executed.
Example:
#include <iostream>
using namespace std;
int main()
{
int a;
char b;
a++; // This works... No error
cout<< a << endl; // Outputs 1
// This is false but also no error...
if(b == '0'){
cout << "equals" << endl;
}
return 0;
}
If I try to replicate the above code in other languages like C#, I get a compilation error. I can't find anything in the official documentation.
I highly value your help.
C++ gives you the ability to shoot yourself in the foot.
Initialising an integral type variable to 0 is a machine instruction typically of the form
REG XOR REG
Its presence is less than satisfactory if you want to initialise it to something else. That's abhorrent to a language that prides itself on being the fastest. Your assertion that integers are initialised to zero is not correct.
The behaviour of using an uninitialised variable in C++ is undefined.
It isn't feasible or even possible to detect or prove that a variable is used uninitialized in all cases. For example:
int a;
if (<complex condition>)
a = 0;
if (<another complex condition>)
a = 1;
++a;
Can there be a case when both conditions are false? You wouldn't know unless you did an extensive analysis of your program. Pointers to variables can be passed, multithreading might be involved, making analysis even harder.
So the decision was made to trust the programmer and simply declare such uses undefined behaviour (UB).
Modern compilers can issue warnings in many cases of uninitialized variable usage, and you should always use maximum warning level.
Anything is possible when your code has undefined behavior.
Correct code does not contain undefined behavior. Using the value of an uninitialized variable is undefined behavior.
The concept of undefined behavior is not unique to C++, but in C++ it is more important than elsewhere because there are so many chances to write wrong code without getting a compiler error.
However, the compiler is your friend. Use it! For example, with gcc, -Wall -Werror should be your default; it produces these error messages:
<source>: In function 'int main()':
<source>:9:6: error: 'a' is used uninitialized [-Werror=uninitialized]
9 | a++; // This works... No error
| ~^~
<source>:13:5: error: 'b' is used uninitialized [-Werror=uninitialized]
13 | if(b == '0'){
| ^~
cc1plus: all warnings being treated as errors
Though, not all cases of undefined behavior can be caught by warnings (that can be treated as errors).
C++ dot com states that uninitialized variables have an undetermined value until they are assigned a value for the first time. But for integer case it's 0?
The correct term is indeterminate. As you can see in the above compiler output, there is no difference for your int a;. When anything can happen then undefined behavior can look like correct behavior, nevertheless it must be fixed.
TL;DR: You cannot use the value of an uninitialized variable. Code that compiles without errors is not necessarily correct.
There is no way to "mark" a variable as being uninitialized unless you store an extra bit of information somewhere, or reserve a value in the range of values that the data type covers. Plus every reference to the variable would have to test for uninitializedness.
All of this is completely unacceptable.
Also note that automatic variables are not implicitly initialized to some value (say 0) because this has a cost at run-time, even if the variable is not used.
As others have stated it's not always feasible for the compiler to detect if the variable is uninitialized and C and C++ prefer performance in those cases.
However, there are some additional points:
There are dynamic checkers that will detect if any of your test-cases uses an uninitialized variable. That only works if you don't zero-initialize them "just in case".
In C++ you can mix statements and declarations, so instead of
int a,b,c;
...
c=2;
a=12*c;
b=...;
you can write:
...
int c=2;
int a=12*c;
int b=...;
and if you don't modify them further you can add const as well, and lambdas are also useful for this.
If you really need to represent a possibly uninitialized variable use std::optional<...>. It can avoid some of those 'possibly uninitialized' cases and can detect if you try to access it when uninitialized. But it has a cost.
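The last two points (const with a lambda, and std::optional) might look roughly like this. It is a minimal sketch assuming C++17; `parse_digit` and `describe` are hypothetical names that do not come from the answers above.

```cpp
#include <optional>
#include <string>

// Hypothetical parser: returns an empty optional instead of leaving a value unset.
std::optional<int> parse_digit(char ch) {
    if (ch >= '0' && ch <= '9') {
        return ch - '0';
    }
    return std::nullopt;  // "no value" is explicit, not an indeterminate int
}

// An immediately invoked lambda lets a variable with branching setup logic still be const.
std::string describe(char ch) {
    const std::string text = [&] {
        const std::optional<int> digit = parse_digit(ch);
        if (digit.has_value()) {
            return std::string("digit ") + std::to_string(*digit);
        }
        return std::string("not a digit");
    }();
    return text;
}
```

Calling `.value()` on an empty optional throws std::bad_optional_access, which is the detectable failure mentioned above; the cost is the extra flag stored alongside the value.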

C++ manipulators

I have started learning C++ and I think the language is great, but few things are baffling me while I am on my path learning it. In this example:
cout << setiosflags(ios::fixed) << setiosflags(ios::showpoint);
In this example why do we type the whole setiosflags(ios::...) when the program still does the same if I only type showpoint without setiosflags?
Second question I have is simple. If we have the following:
int x=0;
cin>>x;
Why do we define a value for int if we later change it to something different than 0?
Why do we type the whole setiosflags(ios::...) when the program still does the same if I only type showpoint without setiosflags?
We don't, unless we want the program to be more verbose than necessary. As you say, streaming setiosflags with a single flag is equivalent to streaming the corresponding manipulator itself. You might use setiosflags if you have a pre-computed set of flags you want to set.
Why do we define a value for int if we later change it to something different than 0?
Again, we don't, unless we like unnecessary verbiage. But it's a good habit to initialise variables, to avoid undefined behaviour if you later change the code to assume it has been initialised.
It's optional; the language gives you the flexibility to set manipulators either with setiosflags or as shown below:
float y= 1.45;
std::cout << std::fixed<<std::showpoint<<y;
The reason for insisting on initializing variables is that uninitialized local variables hold garbage values until you assign to them (this is still true in C++11 and later), which can cause unwanted issues and bugs. So it is better practice to always initialize variables when you define them.
Fundamental data types are initialized to zero if you use value-initialization syntax as follows (the brace form is available since C++11):
int i2 = int(); // initialized with zero
int i3{}; // initialized with zero (since C++11)
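A slightly larger sketch of the same idea, with the two forms wrapped in functions so the results can be checked. The `Point` struct and the function names are hypothetical and exist only for this illustration.

```cpp
// Hypothetical aggregate used only for this illustration.
struct Point {
    int x;
    double y;
};

int zero_from_parens() {
    return int();  // value-initialization, also valid before C++11
}

int zero_from_braces() {
    int i{};  // value-initialization with braces, C++11 and later
    return i;
}

Point zero_point() {
    Point p{};  // empty braces zero every member of the aggregate
    return p;
}
```

By contrast, a plain `Point p;` inside a function would leave both members indeterminate.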
The stream manipulator std::setiosflags(ios_base::fmtflags mask) is a function that sets the format flags specified by the parameter mask. It can be used for multiple flags simultaneously, by combining them with bitwise OR: |. It probably exists to provide full/complete functionality of the class it belongs to. Now regarding your question:
If you can access a flag(member) directly, why bother using a function(setter)?
I can't think of any reason why you shouldn't. However have in mind that manipulators are global functions and these constants, ios_base::fmtflags, are member constants. For more information on manipulators check this.
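To make the equivalence concrete, here is a small sketch that formats the same value both ways into a string stream. The two function names are made up for the example; the standard library calls are the ordinary `<iomanip>` and `<ios>` ones.

```cpp
#include <iomanip>
#include <ios>
#include <sstream>
#include <string>

// Formats a value using a pre-computed set of flags passed through std::setiosflags.
std::string format_with_setiosflags(double value) {
    const std::ios_base::fmtflags flags =
        std::ios_base::fixed | std::ios_base::showpoint;  // flags combined with bitwise OR
    std::ostringstream out;
    out << std::setiosflags(flags) << value;
    return out.str();
}

// Formats the same value by streaming the individual manipulators.
std::string format_with_manipulators(double value) {
    std::ostringstream out;
    out << std::fixed << std::showpoint << value;
    return out.str();
}
```

On a freshly constructed stream both functions produce the same text, so the choice is mostly about whether the flags are already available as a single mask.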
Regarding the second question: you initialize a variable when you define it to avoid undefined behaviour in case you use it, by mistake, before assigning it any value. Local variables need explicit initialization; global variables are zero-initialized by default.

How to store a C++ variable in a register

I would like some clarification regarding a point about the storage of register variables:
Is there a way to ensure that if we have declared a register variable in our code, that it will ONLY be stored in a register?
#include<iostream>
using namespace std;
int main()
{
register int i = 10;// how can we ensure this will store in register only.
i++;
cout << i << endl;
return 0;
}
You can't. It is only a hint to the compiler that suggests that the variable is heavily used. Here's the C99 wording:
A declaration of an identifier for an object with storage-class specifier register suggests that access to the object be as fast as possible. The extent to which such suggestions are effective is implementation-defined.
And here's the C++11 wording:
A register specifier is a hint to the implementation that the variable so declared will be heavily used. [ Note: The hint can be ignored and in most implementations it will be ignored if the address of the variable is taken. This use is deprecated (see D.2). —end note ]
In fact, the register storage class specifier is deprecated in C++11 (Annex D.2):
The use of the register keyword as a storage-class-specifier (7.1.1) is deprecated.
Note that you cannot take the address of a register variable in C because registers do not have an address. This restriction is removed in C++ and taking the address is pretty much guaranteed to ensure the variable won't end up in a register.
Many modern compilers simply ignore the register keyword in C++ (unless it is used in an invalid way, of course). They are simply much better at optimizing than they were when the register keyword was useful. I'd expect compilers for niche target platforms to treat it more seriously.
The register keyword has different meanings in C and C++. In C++ it is in fact redundant; it was deprecated in C++11 and removed as a storage class specifier in C++17.
In C it is different. First, don't take the name of the keyword literally: it does not always have to do with a "hardware register" on a modern CPU. The restriction that is imposed on register variables is that you can't take their address; the & operation is not allowed. This allows you to mark a variable for optimization and ensure that the compiler will shout at you if you try to take its address. In particular, a register variable that is also const qualified can never alias, so it is a good candidate for optimization.
Using register as in C systematically forces you to think of every place where you take the address of a variable. This is probably nothing you would want to do in C++, which heavily relies on references to objects and things like that. This might be a reason why C++ didn't copy this property of register variables from C.
Generally it's impossible. Specifically, one can take certain measures to increase the probability:
Use a proper optimization level, e.g. -O2
Keep the number of the variables small
register int a,b,c,d,e,f,g,h,i, ... z; // can also produce an error
// results in _spilling_ a register to stack
// as the CPU runs out of physical registers
Do not take an address of the register variable.
register int a;
int *b = &a; /* this would be an error in most compilers, but
especially in the embedded world the compilers
release the restrictions */
In some compilers, you can suggest
register int a asm ("eax"); // to put a variable to a specific register
Generally, C++ compilers (g++) apply quite a few optimizations to the code. So when you declare a register variable, the compiler will not necessarily store that value directly in a register; i.e. the code 'register int x' may not result in the compiler storing that int in a register. But if we can force the compiler to do so, we may be successful.
For example, with the following piece of code we may be able to force the compiler to do what we desire. Compiling it may produce an error, which indicates that the int is actually being stored directly in the register.
int main() {
volatile register int x asm ("eax");
int y = *(&x);
return 0;
}
For me, g++ compiler is throwing the following error in this case.
[nsidde#nsidde-lnx cpp]$ g++ register_vars.cpp
register_vars.cpp: In function ‘int main()’:
register_vars.cpp:3: error: address of explicit register variable ‘x’ requested
The line 'volatile register int x asm ("eax")' instructs the compiler to store the integer x in the 'eax' register and, in doing so, not to perform any optimizations. This makes sure that the value is stored in the register directly, which is why taking the address of the variable produces an error.
Alternatively, the C compiler (gcc) may error out on the following code itself.
int main() {
register int a=10;
int c = *(&a);
return 0;
}
For me, the gcc compiler is throwing the following error in this case.
[nsidde#nsidde-lnx cpp]$ gcc register.c
register.c: In function ‘main’:
register.c:5: error: address of register variable ‘a’ requested
It's just a hint to the compiler; you can't force it to place the variable in a register. In any event, the compiler writer probably has much better knowledge of the target architecture than the application programmer, and is therefore better placed to write code that makes register allocation decisions. In other words, you are unlikely to achieve anything by using register.
The "register" keyword is a remnant of the time when compilers had to fit on machines with 2MB of RAM (shared between 18 terminals with a user logged in on each). Or PC/Home computers with 128-256KB of RAM. At that point, the compiler couldn't really run through a large function to figure out which register to use for which variable, to use the registers most effectively. So if the programmer gave a "hint" with register, the compiler would put that in a register (if possible).
Modern compilers don't fit several times in 2MB of RAM, but they are much more clever at assigning variables to registers. In the example given, I find it very unlikely that the compiler wouldn't put it in a register. Obviously, registers are limited in number, and given a sufficiently complex piece of code, some variables will not fit in registers. But for such a simple example, a modern compiler will make i a register, and it will probably not touch memory until somewhere inside ostream& ostream::operator<<(ostream& os, int x).
The only way to ensure that you are using a register, is to use inline assembly. But, even if you do this, you are not guaranteed that the compiler won't store your value outside of the inline assembly block. And, of course, your OS may decide to interrupt your program at any point, storing all your registers to memory, in order to give the CPU to another process.
So, unless you write assembler code within the kernel with all interrupts disabled, there is absolutely no way to ensure that your variable will never hit memory.
Of course, that is only relevant if you are concerned about safety. From a performance perspective, compiling with -O3 is usually enough, the compiler usually does quite a good job at determining which variables to hold in registers. Anyway, storing variables in registers is only one small aspect of performance tuning, the much more important aspect is to ensure that no superfluous or expensive work gets done in the inner loop.
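As one illustration of keeping superfluous work out of the inner loop, the sketch below moves a loop-invariant square root out of the loop. The function names and the "scale by a standard deviation" scenario are assumptions chosen for the example; an optimizing compiler may well perform this hoisting on its own.

```cpp
#include <cmath>
#include <vector>

// Recomputes the same scale factor on every iteration.
double scaled_sum_slow(const std::vector<double>& values, double variance) {
    double total = 0.0;
    for (const double v : values) {
        total += v / std::sqrt(variance);  // loop-invariant work repeated each time
    }
    return total;
}

// Computes the loop-invariant factor once, before the loop.
double scaled_sum_fast(const std::vector<double>& values, double variance) {
    const double inv_sigma = 1.0 / std::sqrt(variance);
    double total = 0.0;
    for (const double v : values) {
        total += v * inv_sigma;
    }
    return total;
}
```

The two versions can differ in the last bits of the result because of floating-point rounding, so this kind of rewrite is worth measuring rather than assuming.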
Here you can use volatile register int i = 10 in C++ to ensure i to be stored in register. volatile keyword will not allow the compiler to optimize the variable i.

C++ : how to make sure all variables are initialized?

Recently I had lots of trouble with a non initialized variable.
In Java, the default value of a variable is null, so an exception is likely to be thrown if the non-initialized variable is used. If I understood correctly, in C++ the variable is initialized with whatever data happens to be in memory. This means that the program is likely to run, and it might be hard to even know that something is wrong with it.
What would be a clean way to deal with this? Is there some good programming habit that would reduce the risk? In my case, the variable was declared in the header file and should have been initialized in the cpp file, which is an example of the kind of thing that makes errors more likely.
Thanks.
Edit after receiving a few answers:
My apologies, my question was not specific enough.
The answer suggesting compiler flags that report non-initialized variables will be useful.
But there are rare cases where variables cannot be initialized at the beginning, because their values depend on the behavior of your system.
in header file
double learnedValue;
in cpp file
/* code that has nothing to do with learnedValue
...
*/
learnedValue = a*b*c; // values of a, b and c computed in the code above
/*code making use of learned value
...
*/
Now what happened is that I forgot the line "learnedValue=a*b*c".
But the program was working fine, just with learnedValue holding whatever was in memory when it was declared.
In Java, such an error is not an issue, because the code making use of the learned value is likely to crash or throw an exception (at least you get to know what was wrong).
In C++, you can apparently be happy and never get to know there is a problem at all. Or can you?
Please make sure you have appropriate warning levels set while compiling your program.
Compilers issue warnings in many cases where uninitialized variables are used.
On g++, the -Wall compiler option enables most of the common warnings.
On Visual Studio, you might have to use warning level 4.
Also, there are some static code analysis tools available.
Cppcheck is one such tool available for free.
You should not define a variable in a header (only declare it). Otherwise you will get other errors when you include the header in several .cpp files.
When actually defining a variable, you can also give it an initial value (like 0). In C++ it is also common to defer the definition of (local) variables until you have a value to assign to them.
In the header file
extern double learnedValue;
^^^^^^
In the cpp file
double learnedValue = 0;
/* code that has nothing to do with learnedValue
...
*/
learnedValue = a*b*c; // values of a, b and c computed in the code above
/*code making use of learned value
...
*/
You can initialize variables at the point where they are declared.
C++11 allows you to initialize member variables inside the class definition. If your compiler does not implement that yet, the constructor initialization list is the place to do it.
C# initializes variables for you, but C++ does not, so using a pointer that was never initialized is undefined behaviour and often crashes the program. You should make a habit of initializing all member variables in the class constructor.
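The two member-initialization options mentioned above could look like this. The `Learner` class is a hypothetical stand-in for whatever owns learnedValue in the question, and the `learned_` flag is an extra assumption added so that "not computed yet" can be detected instead of silently read.

```cpp
// Hypothetical class standing in for whatever owns learnedValue in the question.
class Learner {
public:
    Learner() = default;
    explicit Learner(double initial) : learnedValue_(initial) {}  // constructor initialization list

    void learn(double a, double b, double c) {
        learnedValue_ = a * b * c;
        learned_ = true;
    }

    bool hasLearned() const { return learned_; }
    double learnedValue() const { return learnedValue_; }

private:
    double learnedValue_ = 0.0;  // default member initializer (C++11)
    bool learned_ = false;       // records whether learn() has run
};
```

With this shape, forgetting the `learn` call leaves a well-defined 0.0 and a false flag rather than whatever happened to be in memory.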

Function and declaring a local variable

I just had a conversation with a colleague at work about how to declare variables.
I have already decided which style I prefer, but maybe I am wrong.
"C" style - all variables at the beginning of the function.
If you want to know the data type of a variable, just look at the beginning of the function.
bool Foo()
{
PARAM* pParam = NULL;
bool rc;
while (true)
{
rc = GetParam(pParam);
... do something with pParam
}
}
"C++" style - declare variables as local as possible.
bool Foo()
{
while (true)
{
PARAM* pParam = NULL;
bool rc = GetParam(pParam);
... do something with pParam
}
}
What do you prefer?
Update
The question is regarding POD variables.
The second one. (C++ style)
There are at least two good reasons for this:
This allows you to apply the YAGNI principle in the code, as you only declare variables when you need them, as close as possible to their use. That makes the code easier to understand quickly, as you don't have to go back and forth in the function to understand it all. The type of each variable is the main information about the variable and is not always obvious from the variable name. In short: the code is easier to read.
This allows better compiler optimizations (when possible). Read: http://www.tantalon.com/pete/cppopt/asyougo.htm#PostponeVariableDeclaration
If due to the language you are using you are required to declare variables at the top of the function then clearly you must do this.
If you have a choice then it makes more sense to declare variables where they are used. The rule of thumb I use is: Declare variables with the smallest scope that is required.
Reducing the scope of a variable prevents some types of errors, for example where you accidentally use a variable outside of a loop that was intended only to be used inside the loop. Reducing the scope of the variable will allow the compiler to spot the error instead of having code that compiles but fails at runtime.
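A small sketch of how a tight scope turns that mistake into a compile error. The function `first_negative` is a hypothetical example; the commented-out line marks the misuse that the compiler would reject.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical search helper: index of the first negative value, or values.size() if none.
std::size_t first_negative(const std::vector<int>& values) {
    std::size_t found = values.size();
    for (std::size_t i = 0; i < values.size(); ++i) {
        const int current = values[i];  // visible only inside the loop body
        if (current < 0) {
            found = i;
            break;
        }
    }
    // Using `current` or `i` here is rejected at compile time instead of
    // silently reading a stale value at runtime:
    // return current;  // does not compile: `current` is out of scope
    return found;
}
```

Had `current` been declared at the top of the function, the commented-out line would compile and return whatever the last iteration left behind.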
I prefer the "C++ style". Mainly because it allows RAII, which you do in both your examples for the bool variable.
Furthermore, having a tight scope for the variable provides the compiler with better opportunities for optimization.
This is probably a bit subjective.
I prefer as locally as possible because it makes it completely clear what scope is intended for the variable, and the compiler generates an error if you access it outside the intended useful scope.
This isn't a style issue. In C++, non-POD types will have their constructors called at the point of declaration and destructors called at the end of the scope. You have to be wise about selecting where to declare variables or you will cause unnecessary performance issues. For example, declaring a class variable inside a loop may not be the wisest idea since constructor/destructor will be called every iteration of the loop. But sometimes, declaring class variables at the top of the function may not be the best if there is a chance that variable doesn't get used at all (like a variable is only used inside some 'if' statement).
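The constructor cost described here can be made visible with an instrumented type. This is a sketch only: `Tracked` and the two counting functions are hypothetical, and a real class would do meaningful work in its constructor.

```cpp
// Hypothetical instrumented type that counts how often it is constructed.
struct Tracked {
    static int constructions;
    Tracked() { ++constructions; }
};
int Tracked::constructions = 0;

int constructions_inside_loop(int iterations) {
    Tracked::constructions = 0;
    for (int i = 0; i < iterations; ++i) {
        Tracked t;  // constructed and destroyed on every iteration
        (void)t;    // silence unused-variable warnings
    }
    return Tracked::constructions;
}

int constructions_outside_loop(int iterations) {
    Tracked::constructions = 0;
    Tracked t;  // constructed once, reused by every iteration
    for (int i = 0; i < iterations; ++i) {
        (void)t;
    }
    return Tracked::constructions;
}
```

For five iterations the first function constructs five objects and the second constructs one, which only matters when construction is expensive; for POD types like the ones in the question there is no such cost.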
I prefer C style because the C++ style has one major flaw to me: in a dense function it is very hard on eyes to find the declaration/initialization of the variable. (No syntax highlighting was able yet to cope reliably and predictably with my C++ coding hazards habits.)
Though I do adhere to no style strictly: only key variables are put there and most smallish minor variables live within the block where they are needed (like bool rc in your example).
But all important key variables in my code inevitably end up being declared at the top. And if I have too many local variables in a nested block, that is a sign that I have to start thinking about splitting the code into smaller functions.