Looking at "What's the point of g++ -Wreorder?", I fully understand what -Wreorder is useful for. But it doesn't seem unreasonable that the compiler would be able to detect whether such a reordering is harmless:
struct Harmless {
Harmless() : b(1), a(2) {}
int a;
int b;
};
or broken:
struct Broken {
Broken() : b(1), a(b + 1) {}
int a;
int b;
};
My question is then: why doesn't GCC detect (and warn about) the actual use of an undefined member in an initializer instead of this blanket warning on the ordering of initializers?
As far as I understand, -Wuninitialized only applies to automatic variables, and indeed it does not detect the error above.
EDIT:
A stab at formalizing the behavior I want:
Given initializer list : a1(expr1), a2(expr2), a3(expr3) ... an(exprn), I want a warning if (and only if) the execution of any of the initializers, in the order they will be executed, would reference an uninitialized value. I.e. in the same manner as -Wuninitialized warns about use of uninitialized automatic variables.
Some additional background: I work in a mostly Windows-based company, where basically everybody but me uses Visual Studio. VS does not have this warning, so nobody cares about having the correct order (and nobody has any means of knowing when they screw up the ordering except manual inspection), leaving me with endless warnings that I have to fix every time someone breaks something. I would like to be informed only about the cases that are really problematic and ignore the benign ones. So my question is maybe better phrased as: is it technically feasible to implement a warning/error like this? My gut feeling says it is, but the fact that it isn't already implemented makes me doubt it.
My speculation is that it's for the same reason we have -Wold-style-cast: safety erring on the side of being too conservative. All it takes is a moment's inattention to transform Harmless into CarelessMistake. Maybe this developer's in a hurry or has an older version of GCC or sees that it's "just a warning" and presses on.
This is basically true of many warnings. They are often spurious and require a little bit of restructuring to compile cleanly, but on some occasions they represent real problems. Every good programmer will prefer working through some false positives if that means fewer false negatives.
I would be surprised if there's a valid direct answer to the question. There's no technical reason I see that it couldn't be done. It's just . . . why bother trying to figure out if something questionable is actually okay? Programming is the human's job.
As a personal reason, I think initializing variables in the order you declare them often makes sense.
Related
Why does the following code compile?
class Demo
{
public:
Demo() : a(this->a){}
int& a;
};
int main()
{
Demo d;
}
In this case, a is a reference to an integer. However, when I initialize Demo, I pass a reference to a reference of an integer which has not yet been initialized. Why does this compile?
This still compiles even if instead of int, I use a reference to a class which has a private default constructor. Why is this allowed?
Why does this compile?
Because it is syntactically valid.
C++ is not a safe programming language. There are several features that make it easy to do the right thing, but preventing someone from doing the wrong thing is not a priority. If you are determined to do something foolish, nothing will stop you. As long as you follow the syntax, you can try to do whatever you want, no matter how ludicrous the semantics. Keep that in mind: compiling is about syntax, not semantics.*
That being said, the people who write compilers are not without pity. They know the common mistakes (probably from personal experience), and they recognize that your compiler is in a good position to spot certain kinds of semantic mistakes. Hence, most compilers will emit warnings when you do certain things (not all things) that do not make sense. That is why you should always enable compiler warnings.
Warnings do not catch all logical errors, but for the ones they do catch (such as warning: 'Demo::a' is initialized with itself and warning: '*this.Demo::a' is used uninitialized), you've saved yourself a ton of debugging time.
* OK, there are some semantics involved in compiling, such as giving a meaning to identifiers. When I say compiling is not about semantics, I am referring to a higher level of semantics, such as the intended behavior.
Why does this compile?
Because there is no rule that would make the program ill-formed.
Why is this allowed?
To be clear, the program is well-formed, so it compiles. But the behaviour of the program is undefined, so from that perspective, the premise of your question is flawed. This isn't allowed.
It isn't possible to prove all cases where an indeterminate value is used, and it isn't easy to specify which of the easy cases should be detected by the compiler, and which would be considered to be too difficult. As such, the standard doesn't attempt to specify it, and leaves it up to the compiler to warn when it is able to detect it. For what it's worth, GCC is able to detect it in this case for example.
C++ allows you to pass a reference to a reference to uninitialized data because you might want to use the called function as the initializer.
#include <vector>
std::vector<int>::iterator foo();
void bar(void*) {}
int main()
{
void* p;
while (foo() != foo() && (p = 0, true))
{
bar(p);
}
return 0;
}
Results in error:
c:\users\jessepepper\source\repos\testcode\consoleapplication1\consoleapplication1.cpp(15): error C4703: potentially uninitialized local pointer variable 'p' used
It's kind of a bug, but very typical for the kind of code you write.
First, this isn't an error, it's a warning. C4703 is a level 4 warning (meaning that it isn't even enabled by default). So in order to get it reported as an error (and thus interrupt compilation), compiler arguments or pragmas were passed to enable this warning and turn it into an error (/W4 and /Werror are the most likely I think).
Then there's a trade-off in the compiler. How complex should the data flow analysis be to determine whether a variable is actually uninitialized? Should it be interprocedural? The more complex it is, the slower the compiler gets (and because of the halting problem, the issue may be undecidable anyway). The simpler it is, the more false positives you get because the condition that guarantees initialization is too complex for the compiler to understand.
In this case, I suspect that the compiler's analysis works as follows: the assignment to p is behind a conditional (it only happens if foo() != foo()). The usage of p is also behind a conditional (it only happens if that complex and-expression is true). The compiler cannot establish a relationship between these conditions (the analysis is not complex enough to realize that foo() != foo() is a precondition to the entire while loop condition being true). Thus, the compiler errs on the side of assuming that the access could happen without prior initialization and emits the warning.
So it's an engineering trade-off. You could report the bug, but if you do, I suggest you supply a more compelling real-world example of idiomatic code to argue in favor of making the analysis more complex. Are you sure you can't restructure your original code to make it more approachable to the compiler, and more readable for humans at the same time?
I did some experimenting with VC++2017 Preview.
It's definitely a bug. It makes it impossible to compile and link code that might be correct, albeit smelly.
A warning would be acceptable. (See @Sebastian Redl's answer.) But in the latest and greatest VC++2017, it is being treated as an error, not a warning, even with warnings turned off and "Treat warnings as errors" set to No. Something odd is happening. The "error" is being thrown late - after it says "Generating code". I would guess, and it's only a guess, that the "Generating code" pass is doing global analysis to determine if uninitialized access is possible, and it's getting it wrong. Even then, you should be able to disable the error, IMO.
I do not know if this is new behavior. Reading Sebastian's answer, I presume it is. When I get any kind of warning at any level, I always fix it in the code, so I would not know.
Jesse, click on the triangular flag near the top right of Visual Studio, and report it.
For sure it's a bug. I tried to remove it in all possible ways, including #pragma. The real problem is that this is reported as an error, not as a warning as Microsoft says. This is a big mistake from Microsoft. It's NOT a WARNING, it's an ERROR. Please, do not keep repeating that it's a warning, because it's NOT.
What I'm doing is trying to compile some third-party library whose sources I do not want to fix in any way, and which should compile in normal cases, but it DOESN'T compile in VS2017 because of the infamous "error C4703: potentially uninitialized local pointer variable *** used".
Someone found a solution for that?
Neither g++ (4.4 and 4.6) nor clang++ (3.2) nor coverity, with -Wall and -Wextra (+ some others) or -Weverything respectively gives me a warning for the following code snippet:
class B {
char *t2;
char *t;
public:
B() : t2(t), t(new char[100]) {}
};
I would at least expect a small warning about the usage of uninitialized (member-) variables.
Is there something I'm missing? Is this a wanted "no-warning"-scenario. I have (now had) at least one bug in my software which was hard to find.
EDIT: As can be read in this new question I realized that coverity warns about this problem in some cases.
There is no good reason not to issue a warning here.
G++ isn't smart enough to diagnose uninitialized members in constructors, see http://gcc.gnu.org/PR2972
I have a work-in-progress patch to fix it which I hope to finish "some time this year"
Even with my patch I'm not sure G++ would warn, because t2 is initialized, but it's initialized to an indeterminate value. For the compiler to track that is not trivial, but should be possible (so I'm surprised even Coverity misses it.) Run-time tools such as valgrind get it right though.
When I revisit my patch I'll consider this case and see whether I can make it warn without adding too much overhead. Currently my patch checks whether members without an initializer would leave data uninitialized. To catch this case I would also need to check members with an initializer, and check whether that initializer relies on another member which isn't yet initialized. That check would be needed for every member, which might have an effect on compilation speed for classes with lots of members.
The C++ standard says that using uninitialized variables leads to undefined behaviour. It does not mandate that the compiler issue a diagnostic about it. So getting a warning or not is a QOI (Quality of Implementation) thing.
Is it possible in either gcc/g++ or ms c++ to set a flag which only allows defined behavior? so something like the below gives me a warning or preferably an error
func(a++, a, ++a)
Undefined and unspecified behavior is designated so in the standard specifically because it could cause undue burden on the implementation to diagnose all examples of it (or it would be impossible to determine).
It's expected that the programmer take care to avoid those areas that are undefined.
For your stated example it should be fairly obvious to a programmer to just not write that code in the first place.
That being said, g++ -Wall will catch some bad code, such as missing return in a non-void function to give one example.
EDIT: @sehe also points out -Wsequence-point, which will catch this precise code construct (although there should be a sequence point between the evaluation of each argument; the order in which arguments are evaluated is unspecified, however).
GNU C++ has the following
-Wsequence-point
Warn about code that may have undefined semantics because of violations of sequence point rules in the C and C++ standards.
This will correctly flag the invocation you showed
Related options:
-Wstrict-overflow
-fstrict-aliasing
-fstrict-overflow
HTH
No. For example, consider the following:
int badfunc(int &a, int &b) {
return func(a++, b++);
}
This has undefined behavior if a and b have the same referent. In general the compiler cannot know what arguments will be passed to a function, so it can't reliably catch this case of undefined behavior. Therefore it can't catch all undefined behavior.
Compiler warnings serve to identify some instances of undefined behavior, but never all.
In theory you could write a C++ implementation that does vast numbers of checks at runtime to ensure that undefined behavior is always identified and dealt with in ways defined by that implementation. It still wouldn't tell you at compile time (see: halting problem), and in practice you'd probably be better off with C#, which was designed to make the necessary runtime checks reasonably efficient...
Even if you built that magical checking C++ implementation, it still might not tell you what you really want to know, which is whether your code is correct. Sometimes (hang on to your seats), it is implementation-defined whether or not behavior is undefined. For a simple example, tolower((char)-1); has defined behavior[*] if the char type is unsigned, but undefined behavior if the char type is signed.
So, unless your magical checking implementation makes all the same implementation choices as the "real" implementation that you want your code to run on, it won't tell you whether the code has defined behavior for the set of implementation choices made in the "real" implementation, only whether it has defined behavior for the implementation choices made in the magical checking implementation.
To know that your code is correct and portable, you need to know (for starters) that it produces no undefined behavior for any set of implementation choices. And, for that matter, for any input, not just the inputs used in your tests. You might think that this is a big deficiency in C++ compared to languages with no undefined behavior. Certainly it is inconvenient at times, and affects how you go about sandboxing programs for security. In practice, though, for you to consider your code correct you don't just need it to have defined behavior, you need the behavior to match the specification document. That's a much bigger problem, and in practice it isn't very much harder to write a bug in (say) Java or Python than it is in C++. I've written countless bugs in all three, and knowing that in Java or Python the behavior was defined but wrong didn't help me all that much.
[*] Well, the result is still implementation-defined, it depends on the execution character set, but the implementation has to return the correct result. If char is signed it's allowed to crash.
This gave me a good laugh. Sorry about that, didn't mean any offense; it's a good question.
There is no compiler on the planet that only allows 100% defined behavior. It's the undefined nature of things that makes it so hard. There are a lot of cases taken up in the standard, but they're often too vague to efficiently implement in a compiler.
I know Clang developers showed some interest to adding that functionality, but they haven't started as far as I know.
The only thing you can do now and in the near/far future is cranking up the warning level and strictness of your compiler. Sadly, even in recent versions, MSVC is a pain in that regard. On warning level 4 and up, it spits some stupid warnings that have nothing to do with code correctness, and you often have to jump through hoops to get them to go away.
GCC is better at that in my personal experience. I personally use these options, ensuring the strictest checks I currently know of:
-std=c++0x -pedantic -Wextra -Weffc++ -Wmissing-include-dirs -Wstrict-aliasing
I of course ensure zero warnings. If you want to enforce even that, just add -Werror to the line above and any warning will halt compilation. It's mostly the std and pedantic options that enforce Standard behavior; Wextra catches some off-chance semi-errors.
And of course, compile your code with different compilers if possible (and make sure they are correctly diagnosing the problem by asking here, where people know what the Standard says/means).
While I agree with Mark's answer, I just thought I should let you know...
#include <stdio.h>
int func(int a, int b, int c)
{
return a + b + c;
}
int main()
{
int a=0;
printf("%d\n", func(a++, a, ++a)); /* line 11 */
return 0;
}
When compiling the code above with gcc -Wall, I get the following warnings:
test.c:11: warning: operation on ‘a’ may be undefined
test.c:11: warning: operation on ‘a’ may be undefined
because of a++ and ++a, I suppose. So to some degree, it's been implemented. But obviously we can't expect all undefined behavior to be recognized by the compiler.
See, what I don't get is, why should programs like the following be legal?
int main()
{
static const int i = 0;
i < i > i;
}
I mean, surely, nobody actually has any current programs that have expressions with no side effects in them, since that would be very pointless, and it would make parsing & compiling the language much easier. So why not just disallow them? What benefit does the language actually gain from allowing this kind of syntax?
Another example being like this:
int main() {
static const int i = 0;
int x = (i);
}
What is the actual benefit of such statements?
And things like the most vexing parse. Does anybody, ever, declare functions in the middle of other functions? I mean, we got rid of things like implicit function declaration, and things like that. Why not just get rid of them for C++0x?
Probably because banning them would make the specification more complex, which would make compilers more complex.
it would make parsing & compiling the
language much easier
I don't see how. Why is it easier to parse and compile i < i > i if you're required to issue a diagnostic, than it is to parse it if you're allowed to do anything you damn well please provided that the emitted code has no side-effects?
The Java compiler forbids unreachable code (as opposed to code with no effect), which is a mixed blessing for the programmer, and requires a little bit of extra work from the compiler than what a C++ compiler is actually required to do (basic block dependency analysis). Should C++ forbid unreachable code? Probably not. Even though C++ compilers certainly do enough optimization to identify unreachable basic blocks, in some cases they may do too much. Should if (foo) { ...} be an illegal unreachable block if foo is a false compile-time constant? What if it's not a compile-time constant, but the optimizer has figured out how to calculate the value, should it be legal and the compiler has to realise that the reason it's removing it is implementation-specific, so as not to give an error? More special cases.
nobody actually has any current
programs that have expressions with no
side effects in them
Loads. For example, if NDEBUG is defined, then assert expands to a void expression with no effect. So that's yet more special cases needed in the compiler to permit some useless expressions, but not others.
The rationale, I believe, is that if it expanded to nothing then (a) compilers would end up throwing warnings for things like if (foo) assert(bar);, and (b) code like this would be legal in release but not in debug, which is just confusing:
assert(foo) // oops, forgot the semi-colon
foo.bar();
things like the most vexing parse
That's why it's called "vexing". It's a backward-compatibility issue really. If C++ now changed the meaning of those vexing parses, the meaning of existing code would change. Not much existing code, as you point out, but the C++ committee takes a fairly strong line on backward compatibility. If you want a language that changes every five minutes, use Perl ;-)
Anyway, it's too late now. Even if we had some great insight that the C++0x committee had missed, why some feature should be removed or incompatibly changed, they aren't going to break anything in the FCD unless the FCD is definitively in error.
Note that for all of your suggestions, any compiler could issue a warning for them (actually, I don't understand what your problem is with the second example, but certainly for useless expressions and for vexing parses in function bodies). If you're right that nobody does it deliberately, the warnings would cause no harm. If you're wrong that nobody does it deliberately, your stated case for removing them is incorrect. Warnings in popular compilers could pave the way for removing a feature, especially since the standard is authored largely by compiler-writers. The fact that we don't always get warnings for these things suggests to me that there's more to it than you think.
It's convenient sometimes to put useless statements into a program and compile it just to make sure they're legal - e.g. that the types involve can be resolved/matched etc.
Especially in generated code (macros, as well as more elaborate external mechanisms, and templates where policies or types may introduce meaningless expansions in some no-op cases), having fewer special uncompilable cases to avoid keeps things simpler.
There may be some temporarily commented code that removes the meaningful usage of a variable, but it could be a pain to have to similarly identify and comment all the variables that aren't used elsewhere.
While in your examples you show the variables being "int" immediately above the pointless usage, in practice the types may be much more complicated (e.g. operator<()) and whether the operations have side effects may even be unknown to the compiler (e.g. out-of-line functions), so any benefit's limited to simpler cases.
C++ needs a good reason to break backwards (and retained C) compatibility.
Why should doing nothing be treated as a special case? Furthermore, whilst the above cases are easy to spot, one could imagine far more complicated programs where it's not so easy to identify that there are no side effects.
As an iteration of the C++ standard, C++0x has to be backward compatible. Nobody can assert that the statements you wrote do not exist in some piece of critical software written/owned by, say, NASA or DoD.
Anyway regarding your very first example, the parser cannot assert that i is a static constant expression, and that i < i > i is a useless expression -- e.g. if i is a templated type, i < i > i is an "invalid variable declaration", not a "useless computation", and still not a parse error.
Maybe the operator was overloaded to have side effects, like cout << i;. This is the reason why they cannot be removed now. On the other hand, C# forbids expressions that are neither assignments nor method calls from being used as statements, and I believe this is a good thing, as it makes the code clearer and more semantically correct. However, C# had the opportunity to forbid this from the very beginning, which C++ does not.
Expressions with no side effects can turn up more often than you think in templated and macro code. If you've ever declared std::vector<int>, you've instantiated template code with no side effects. std::vector must destroy all its elements when releasing itself, in case you stored a class as type T. This requires, at some point, a statement similar to ptr->~T(); to invoke the destructor. int has no destructor though, so the call has no side effects and will be removed entirely by the optimizer. It's also likely to be inside a loop; then the entire loop has no side effects, so the entire loop is removed by the optimizer.
So if you disallowed expressions with no side effects, std::vector<int> wouldn't work, for one.
Another common case is assert(a == b). In release builds you want these asserts to disappear - but you can't redefine them as an empty macro, otherwise statements like if (x) assert(a == b); suddenly pull the next statement into the if statement - a disaster! In this case assert(x) can be redefined as ((void)0), which is a statement that has no side effects. Now the if statement works correctly in release builds too - it just does nothing.
These are just two common cases. There are many more you probably don't know about. So, while expressions with no side effects seem redundant, they're actually functionally important. An optimizer will remove them entirely so there's no performance impact, too.