Clang not reporting uninitalized variables in C++? - c++

I understand that local variables are not initialised automatically in C++, so before using them, you should always assign a value to them. However, at least in simple cases, the compiler should warn you in case you forget it. I'm more or less relying on and referring to this article.
Given this program, I'd assume to get a warning when sending x to std::cout...
#include <iostream>
int main(int argc, const char * argv[])
{
int x;
std::cout << x;
return 0;
}
...but no warning pops up. If, however, I run the static analyzer, I do get the expected warning: Function call argument is an uninitialized value.
I compile & run using Xcode 5.1 with the Apple LLVM 5.1 compiler. I use the standard build settings from Xcode's command line project template (C++), the language dialects are set to GNU99 (for C) and GNU++11 (C++).
The Uninitialized Variables option is set to Yes (Aggressive) (-Wconditional-uninitialized). Switching to just Yes (-Wuninitialized) raises the warning: Variable 'x' is uninitialized when used here. Question Part 1: Why does the warning not show with the default setting (-Wconditional-uninitialized)? The documentation in Xcode suggests that the aggressive option finds more issues:
You can toggle between the normal uninitialized value checking or the more aggressive (conservative) checking which finds more issues but the checking is much stricter.
Strangely enough, when I run the program, the value is always set to 0, so for some reason, it seems to be initialised. Question Part 2: Why is that?

Why not the warning?
Using clang with -Wall on my system correctly warns about the error. Apparently, the default settings do not include -Wall (may be to avoid generating warnings with correct code that was written before some of the warnings were introduced).
In general, you're however going to be in trouble if you rely on the compiler to help you with sloppy programming. Typing in code without thinking carefully and hoping the compiler will tell you all mistakes is bad in any language but a true total disaster with C++. The main philosophy of C++ is simply that the programmer doesn't make any error, so just don't make them ;-)
Think carefully and also always work with -Wall if you can.
Why is initialized?
Apparently, you've not understood what is the meaning of "undefined behavior". It doesn't mean the program crashes, it doesn't mean it will do anything funny. It means it can do anything and normally the programs do whatever will create the most problems for you in the future.
Often this most dangerous behavior is to make it look as if everything is fine (e.g. that your variable is indeed initialized). The bad values will only show once you put that code in production or only when you show your program running in front of a vast audience. At that point, the value will be different and the video of your public crash will go viral on youtube, your wife will change the door locks and even your parents will not answer your phone calls.
Just initialize your variables; it's better :-)

It's necessary but enough to turn on -Wall to make our code safe.
e.g. The following code, under MacOS's g++(it's clang) will print out 0, but it is definitely a UB:
#include <stdio.h>
#include <iostream>
using namespace std;
class Blob {
public:
int age;
void hello() { printf("hello\n"); }
};
int main() {
Blob a;
//a.hello();
cout << a.age << endl;
return 0;
}

Related

How c++ is compiled? (Regarding variable declaration)

So I went through this video - https://youtu.be/e4ax90XmUBc
Now, my doubt is that if C++ is compiled language, that is, it goes through the entire code and translates it, then if I do something like
void main() {
int a;
cout<<"This is a number = "<<a; //This will give an error (Why?)
a = 10;
}
Now, answer for this would be that I have not defined the value for a, which I learned in school. But if a compiler goes through the entire code and then translates it then I think it shouldn't give any error.
But by giving an error like this, it looks to me as if C++ is a interpreted language.
Can anyone put some light on this and help me solve my dilemma here?
Technically, the C++ standard doesn't mandate that the compiler has to compile C++ into machine code. As an example LLVM Clang first compiles it to IR (Intermediate Representation) and only then to machine code.
Similarly, a compiler could embed a copy of itself in a program that it compiles and then, when the program is executed compile the program, immediately invoke it and delete the executable afterwards which in practice would be very similar to the program being interpreted. In practice, all widely used C++ compilers parse and assemble programs beforehand.
Regarding your example, the statement "This will give an error" is a bit ambiguous. I'm not sure if you're saying that you're getting a compile-time error or a runtime error. As such, I will discuss both possibilities.
If you're getting a compile time error, then your compiler has noticed that your program has undefined behaviour. This is something that you always want to avoid (in some cases, such as when your application operates outside the scope of the C++ Standard, such as when interfacing with certain hardware, UB occurs by definition, as certain behaviour is not defined by the Standard). This is a simple form of static analysis. The Standard doesn't mandate the your compiler informs you of this error and it would usually be a runtime error, but your compiler informed you anyway because it noticed that you probably made a mistake. For example on g++ such behaviour could be achieved by using the -Wall -Werror flags.
In the case of the error being a runtime error then you're most likely seeing a message like "Memory Access Violation" (on Windows) or "Signal 11" (on Linux). This is due to the fact that your program accessed uninitialized memory which is Undefined Behaviour.
In practice, you wouldn't most likely get any error at all at runtime. Unless the compiler has embedded dynamic checks in your program, it would just silently print a (seemingly) random value and continue. The value comes from uninitialized memory.
Side note: main returns int rather than void. Also using namespace std; considered harmful.

Why can I compile a code with 2 returns?

Since am comming from a java island, I wounder why the compiler doesnt warns about unreachable code in something like:
int main(int argc, char** argV)
{
std::list<int> lst = {1,2,3,4};
return 0;
std::cout << "Done!!!" << std::endl;
return 0;
}
my question:
Why can I compile a code with 2 returns?
my Compiler is gcc for c++11, on Windows, code block
I wounder why the compiler doesnt warns about unreachable code in something like
It is pretty well explained in gcc documentaion about warnings:
-Wunreachable-code
Warn if the compiler detects that code will never be executed. This option is intended to warn when the compiler detects
that at least a whole line of source code will never be executed,
because some condition is never satisfied or because it is after a
procedure that never returns.
It is possible for this option to produce a warning even though there
are circumstances under which part of the affected line can be
executed, so care should be taken when removing apparently-unreachable
code.
For instance, when a function is inlined, a warning may mean that the
line is unreachable in only one inlined copy of the function.
This option is not made part of -Wall because in a debugging version
of a program there is often substantial code which checks correct
functioning of the program and is, hopefully, unreachable because the
program does work. Another common use of unreachable code is to
provide behavior which is selectable at compile-time.
Though g++ 5.1.0 does not produce any warnings for this code even with this option enabled.
Why shouldn't you be able to compile code that has multiple returns?
Because the code is unreachable? Most compilers can issue a warning for that.
However, I often see code like:
if(a)
{
// Do stuff
}
else
{
// Do other stuff
if(b)
{
// Do more stuff
}
else
{
// Do other more stuff
}
}
That could be simplified as
if(a)
{
// Do stuff
return;
}
// Do other stuff
if(b)
{
// Do more stuff
return;
}
// Do other more stuff
About a decade ago, people frowned on having more than one return in a function of method, but there really is no reason to continue frowning on it with modern compilers.
Because this part
std::cout << "Done!!!" << std::endl;
return 0;
will never be called because of the first return statement, but it is not an error aborting the compilation, rather the compiler might drop a warning, depending on what compiler you are using (e.g. Microsofts VC++ compiler warns you about that).
Unreachable code is not a compile error in C++, but usually gives a warning, depending on your compiler and flags.
You can try to add -Wall option when you call your compiler. This will
active many useful warning.
Mainly because more often than not the compiler cannot know for sure. (There have been attempts to do this in Java but there the criteria for defining reachability have been decided upon.)
In this case, indeed, it is obvious.
Some compilers do issue reachability warnings but the C++ standard does not require this.
No answer on reachability is complete without referencing this: https://en.wikipedia.org/wiki/Halting_problem
As a final remark on Java, consider these two Java snippets:
if (true){
return;
}
; // this statement is defined to be reachable
and
while (true){
return;
}
; // this statement is defined to be unreachable
The worst of both worlds is attained, in my humble opinion.
There are two reasons for this:
C++ has many standards (c++11, c++14, c++17 etc. ) unlike java (java is very rigid in standard and only thing that matters really with java is the version you are using), so, some compilers might warn you about the unreachable code while others might not.
The statements after return 0, though are unreachable logically, do not cause any fatal error like ambiguity, syntax error, etc. and can be compiled easily (if the compiler wills to ;) ).

G++ 4.4 "uninitialized" variables

I am using g++ on Ubuntu 10.10(64-bit) if the OS is at all relevant for the matter.
I saw something strange so i decided to check and for some reason this code
#include <iostream>
int main()
{
int a;
std::cout << a << std::endl;
return 0;
}
always prints 0. Obviously g++ does auto initialization of uninitialized variables to their corresponding null-value. The thing is I want to turn that feature off, or at least make g++ show warning about using uninitialized variables, since this way my code won't work well when compiled on VS for instance. Besides I'm pretty sure the C++ standard states that a variable which isn't implicitly initialized with some value has an undefined value amongst all it's possible values, which should in fact be different with every execution of the program, since different parts of the operating memory are used every time it's executed.
Explicit question: Is there a way to make g++ show warnings for uninitialized variables?
GCC does not initialize uninitialized variables to 0. It's just a case that a is 0.
If what you want to do is to receive warnings when you use uninitialized variables you could use GCC option -Wuninitialized (also included by -Wall).
However it can't statically spot any possible usage of uninitialized variables: you'll need some run time tools to spot that, and there's valgrind for this.
You might also try to use a tool like cppcheck. In general, in well written C++ there is rarely a reason to declare a variable without initializing it.

Why is it that an int in C++ that isnt initialized (then used) doesn't return an error?

I am new to C++ (just starting). I come from a Java background and I was trying out the following piece of code that would sum the numbers between 1 and 10 (inclusive) and then print out the sum:
/*
* File: main.cpp
* Author: omarestrella
*
* Created on June 7, 2010, 8:02 PM
*/
#include <cstdlib>
#include <iostream>
using namespace std;
int main() {
int sum;
for(int x = 1; x <= 10; x++) {
sum += x;
}
cout << "The sum is: " << sum << endl;
return 0;
}
When I ran it it kept printing 32822 for the sum. I knew the answer was supposed to be 55 and realized that its print the max value for a short (32767) plus 55. Changing
int sum;
to
int sum = 0;
would work (as it should, since the variable needs to be initialized!). Why does this behavior happen, though? Why doesnt the compiler warn you about something like this? I know Java screams at you when something isnt initialized.
Thank you.
Edit:
Im using g++. Here is the output from g++ --version:
Im on Mac OS X and am using g++.
nom24837c:~ omarestrella$ g++ --version
i686-apple-darwin10-g++-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5646)
Reading from an uninitialized variable results in undefined behavior and the compiler isn't required to diagnose the error.
Note that most modern compilers will warn you if you attempt to read an uninitialized variable. With gcc you can use the -Wuninitialized flag to enable that warning; Visual C++ will warn even at warning level 1.
Because it's not part of the C++ standard. The variable will just take whatever value is currently sitting in the memory it's assigned. This saves operations which can sometimes be unnecessary because the variable will be assigned a value later.
It's interesting to note and very important for Java/.Net programmers to note when switching to C/C++. A program written in C++ is native and machine-level. It is not running on a VM or a some other sort of framework. It is a collection of raw operations (for the most part). You do not have a virtual machine running in the background checking you variables and catching exceptions or segfaults for you. This is a big difference which can lead to a lot of confusion in the way C++ handles variables and memory, as opposed to Java or a .Net language. Hell, in .Net all your integers are implicitly initialised to 0!
Consider this code fragment:
int x;
if ( some_complicated_condition ) {
x = foo;
} else if ( another_condition ) {
// ...
if ( yet_another_condition ) x = bar;
} else {
return;
}
printf("%d\n",x);
Is x used uninitialized? How do you know? What if there are preconditions to the code?
These are hard questions to answer automatically, and enforcing initialization might be inefficient in some small way.
In point of fact modern compilers do a pretty good job of static analysis and can often tell if a variable is or might be used uninitialized, and they generally warn you in that case (at least if you turn the warning level up high enough). But c++ follows closely on the c tradition of expecting the programmer to know what he or she is doing.
It's how the spec is defined--that uninitialized variables have no guarantee's, and I believe the reason is that it has to do with optimizations (though I may be wrong here...)
Many compilers will warn you, depending on the warning level used. I know by default VC++ 2010 warns for this case.
What compiler are you using? There is surely a warning level you can turn on via command-line switch that would warn you of this. Try compiling with /W3 or /W4 on Windows and you'll get 4700 or 6001.
This is because C++ was designed as a superset of C, to allow easy upgrading of existing code. C works that way because it dates back to the 70s when CPU cycles were rare and precious things, so weren't wasted initialising variables when it might not be necessary (also, programmers were trusted to know that they'd have to do it themselves).
Obviously that wasn't really the case once Java appeared, so they found it a better tradeoff to spend a few CPU cycles to avoid that class of bugs. As others have noted though, modern C or C++ compilers will normally have warnings for this kind of thing.
Because C developers care about speed. Uselessly initializing a variable is a crime against performance.
Detecting an uninitialized variable is a Quality-of-Implementation (QoI) issue. It is not mandatory, since the language standard does not require it.
Most compilers I know will actually warn you about the potential problem with an initialized variable at compile-time. On top of that, compilers like MS Visual Studio 2005 will actually trap the use of an uninitialized variable during run-time in debug builds.
So, what compiler are you using?
Well, it depends on what compiler you use. Use a smart one. :)
http://msdn.microsoft.com/en-us/library/axhfhh6x.aspx
Initialising variables is one of the most important tenets of C/C++. Any types without a constructor should be initialised, period. The reason for compiler not enforcing this is largely historical. It stems from the fact sometimes it's not necessary to init something and it would be wasteful to do so.
These days this sort of optimisation is best left to compiler and it's a good habit to always initialise everything. You can get the compiler to generate a warning for you as others suggested. Also you can make it treat warnings as errors to further simulate javac behaviour.
C++ is not a pure object oriented programming language.
In C++ there is no implicit way of memory management and variable initialization.
If a variable is not initialized in C++ then it may take any value at runtime because the compiler does not understand such internal errors in C++.

Undefined/Unspecified/Implementation-defined behaviour warnings?

Can't a compiler warn (even better if it throws errors) when it notices a statement with undefined/unspecified/implementation-defined behaviour?
Probably to flag a statement as error, the standard should say so, but it can warn the coder at least. Is there any technical difficulties in implementing such an option? Or is it merely impossible?
Reason I got this question is, in statements like a[i] = ++i; won't it be knowing that the code is trying to reference a variable and modifying it in the same statement, before a sequence point is reached.
It all boils down to
Quality of Implementation: the more accurate and useful the warnings are, the better it is. A compiler that always printed: "This program may or may not invoke undefined behavior" for every program, and then compiled it, is pretty useless, but is standards-compliant. Thankfully, no one writes compilers such as these :-).
Ease of determination: a compiler may not be easily able to determine undefined behavior, unspecified behavior, or implementation-defined behavior. Let's say you have a call stack that's 5 levels deep, with a const char * argument being passed from the top-level, to the last function in the chain, and the last function calls printf() with that const char * as the first argument. Do you want the compiler to check that const char * to make sure it is correct? (Assuming that the first function uses a literal string for that value.) How about when the const char * is read from a file, but you know that the file will always contain valid format specifier for the values being printed?
Success rate: A compiler may be able to detect many constructs that may or may not be undefined, unspecified, etc.; but with a very low "success rate". In that case, the user doesn't want to see a lot of "may be undefined" messages—too many spurious warning messages may hide real warning messages, or prompt a user to compile at "low-warning" setting. That is bad.
For your particular example, gcc gives a warning about "may be undefined". It even warns for printf() format mismatch.
But if your hope is for a compiler that issues a diagnostic for all undefined/unspecified cases, it is not clear if that should/can work.
Let's say you have the following:
#include <stdio.h>
void add_to(int *a, int *b)
{
*a = ++*b;
}
int main(void)
{
int i = 42;
add_to(&i, &i); /* bad */
printf("%d\n", i);
return 0;
}
Should the compiler warn you about *a = ++*b; line?
As gf says in the comments, a compiler cannot check across translation units for undefined behavior. Classic example is declaring a variable as a pointer in one file, and defining it as an array in another, see comp.lang.c FAQ 6.1.
Different compilers trap different conditions; most compilers have warning level options, GCC specifically has many, but -Wall -Werror will switch on most of the useful ones, and coerce them to errors. Use \W4 \WX for similar protection in VC++.
In GCC You could use -ansi -pedantic, but pedantic is what it says, and will throw up many irrelevant issues and make it hard to use much third party code.
Either way, because compilers catch different errors, or produce different messages for the same error, it is therefore useful to use multiple compilers, not necessarily for deployment, but as a poor-man's static analysis. Another approach for C code is to attempt to compile it as C++; the stronger type checking of C++ generally results in better C code; but be sure that if you want C compilation to work, don't use the C++ compilation exclusively; you are likely to introduce C++ specific features. Again this need not be deployed as C++, but just used as an additional check.
Finally, compilers are generally built with a balance of performance and error checking; to check exhaustively would take time that many developers would not accept. For this reason static analysers exist, for C there is the traditional lint, and the open-source splint. C++ is more complex to statically analyse, and tools are often very expensive. One of the best I have used is QAC++ from Programming Research. I am not aware of any free or open source C++ analysers of any repute.
gcc does warn in that situation (at least with -Wall):
#include <stdio.h>
int main(int argc, char *argv[])
{
int a[5];
int i = 0;
a[i] = ++i;
printf("%d\n", a[0]);
return 0;
}
Gives:
$ make
gcc -Wall main.c -o app
main.c: In function ‘main’:
main.c:8: warning: operation on ‘i’ may be undefined
Edit:
A quick read of the man page shows that -Wsequence-point will do it, if you don't want -Wall for some reason.
Contrarily, compilers are not required to make any sort of diagnosis for undefined behavior:
§1.4.1:
The set of diagnosable rules consists of all syntactic and semantic rules in this International Standard except for those rules containing an explicit notation that “no diagnostic is required” or which are described as resulting in “undefined behavior.”
Emphasis mine. While I agree it may be nice, the compiler's have enough problem trying to be standards compliant, let alone teach the programmer how to program.
GCC warns as much as it can when you do something out of the norms of the language while still being syntactically correct, but beyond the certain point one must be informed enough.
You can call GCC with the -Wall flag to see more of that.
If your compiler won't warn of this, you can try a Linter.
Splint is free, but only checks C http://www.splint.org/
Gimpel Lint supports C++ but costs US $389 - maybe your company c an be persuaded to buy a copy? http://www.gimpel.com/