In C++, imagine I have a function like
bool Aclass::func() {
    return true;
}
which is called in main like this:
if (!func()) {
    // do stuff
}
Since func() always returns true, the body of the if is dead code. Does the compiler still generate code for those lines?
Like all optimization questions, it depends on the compiler and the flags given. That said, a decent modern compiler will be able to remove dead code like this when optimization flags are provided. Try https://godbolt.org/ to see for yourself which compilers and flags succeed in removing the dead code.
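If you want to try it, a minimal self-contained version of the snippet from the question is enough to paste into Compiler Explorer and compare -O0 with -O2 (the class layout and the call site are filled in here as assumptions):

struct Aclass {
    bool func() { return true; }
};

int main() {
    Aclass a;
    if (!a.func()) {
        return 1; // dead branch: with optimization enabled it typically disappears
    }
    return 0;
}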
The compiler still treats those lines as normal code during compilation: if they contain an error, it will be flagged. So, for example, the following will not compile:
if (false) {
    auto s = std::string{1.0};
}
But most optimizers will not emit that code in the compiled output for that source file. Related code is still emitted where needed; for example, with
if (true) { ... }
else { ... }
the whole if/else will essentially be reduced to just the taken branch,
{
...
}
when the code gets converted to its compiled form.
@Yakk brings up a great point. The compiler omitting such code is called dead code elimination. Note, however, that labels can still be used to jump into the body, in which case it cannot simply be removed.
Also note that when the condition is evaluated at compile time, you can use a construct introduced in C++17 known as if constexpr. However, as mentioned above, compiler errors persist even in the dead branch of a runtime if; the situation is different for if constexpr, whose discarded branch (inside a template) is not instantiated. For more, see the code examples at http://en.cppreference.com/w/cpp/language/if#Constexpr_If and this answer: https://stackoverflow.com/a/38317834/5501675
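To illustrate the difference, here is a minimal sketch (the stringify function is invented for this example, not taken from the linked answer): inside a template, the branch discarded by if constexpr is not instantiated, so code that would be ill-formed for the deduced type is tolerated there, unlike the body of an ordinary runtime if.

#include <string>
#include <type_traits>

template <typename T>
auto stringify(T value) {
    if constexpr (std::is_arithmetic_v<T>) {
        return std::to_string(value);  // only instantiated for arithmetic T
    } else {
        return std::string(value);     // only instantiated for string-like T
    }
}

int main() {
    auto a = stringify(42);      // the else branch (std::string(42)) is never instantiated
    auto b = stringify("hello"); // the if branch (std::to_string(const char*)) is never instantiated
    (void)a; (void)b;
}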
Related
tl;dr: Can it be ensured somehow (e.g. by writing a unit test) that some things are optimized away, e.g. whole loops?
The usual approach to make sure that something is not included in the production build is to wrap it in #if...#endif. But I prefer to stay with C++ mechanisms instead. Even there, instead of complicated template specializations I like to keep implementations simple and argue, "hey, the compiler will optimize this out anyway".
The context is embedded automotive software (binary size matters), often with poor compilers. They are certified in the sense of safety, but usually not good at optimization.
Example 1: In a container the destruction of elements is typically a loop:
for (size_t i = 0; i < elements; i++)
    buffer[i].~T();
This also works for built-in types such as int, because the standard allows an explicit destructor call even for scalar types (C++11 12.4/15). In that case the loop does nothing and should be optimized out. GCC does remove it, but another compiler (for Aurix) did not: I saw a literally empty loop in the disassembly! That needed a template specialization to fix (a sketch of the kind of fix follows below).
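For reference, here is one way such a fix can look; this is only a sketch using tag dispatch on std::is_trivially_destructible (the names are invented, and very old toolchains may spell the trait differently), not the original Aurix workaround:

#include <cstddef>
#include <type_traits>

template <typename T>
void destroy_range(T*, std::size_t, std::true_type) {
    // trivially destructible elements: nothing to do, nothing to emit
}

template <typename T>
void destroy_range(T* buffer, std::size_t elements, std::false_type) {
    for (std::size_t i = 0; i < elements; ++i)
        buffer[i].~T();
}

template <typename T>
void destroy_range(T* buffer, std::size_t elements) {
    destroy_range(buffer, elements,
                  typename std::is_trivially_destructible<T>::type{});
}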
Example 2: code that is intended only for debugging, profiling, fault injection, etc.:
constexpr bool isDebugging = false; // somehow a global flag

void foo(int arg) {
    if (isDebugging) {
        // Albeit a 'dead' section, it may not appear in the production binary!
        // (size, security, safety...)
        // 'if constexpr...' is not an option (C++11)
        std::cout << "Arg was " << arg << std::endl;
    }
    // normal code here...
}
I can look at the disassembly, sure. But since this is upstream platform software, it's hard to control all the targets, compilers and options that downstream projects might use. The fear is that, for whatever reason, a downstream project ends up with code bloat or a performance issue.
Bottom line: is it possible to write the software in such a way that certain code is known to be optimized away, as reliably as #if would guarantee? Or to write a unit test that fails if the optimization is not as expected?
[Timing tests come to mind for the first problem, but being on bare metal I don't have convenient tools for that yet.]
There may be a more elegant way, and it's not a unit test, but if you're just looking for that particular string, and you can make it unique,
strings $COMPILED_BINARY | grep "Arg was"
should show you whether the string ends up in the binary.
if constexpr is the canonical C++ construct (since C++17) for this kind of test.
constexpr bool DEBUG = /*...*/;

int main() {
    if constexpr (DEBUG) {
        std::cerr << "We are in debugging mode!" << std::endl;
    }
}
If DEBUG is false, then the code that prints to the console won't be generated at all. So if you have things like log statements that you need for checking the behaviour of your code, but which you don't want in production, you can hide them behind if constexpr to eliminate that code entirely from the production build.
Looking at your question, I see several (sub-)questions in it that require an answer. Not all of the answers may be feasible with your bare-metal compilers, as hardware vendors often don't care that much about C++.
The first question is: how do I write code in a way that makes sure it gets optimized? The obvious answer is to put everything in a single compilation unit so the caller can see the implementation.
The second question is: how can I force the compiler to optimize? Here constexpr is a blessing. Depending on whether you have support for C++11, C++14, C++17 or even the upcoming C++20, you get different feature sets for what you can do in a constexpr function. As for usage:
constexpr char c = std::string_view{"my_very_long_string"}[7];
With the code above, c is defined as a constexpr variable. Because you apply constexpr to the variable, you require several things:
Your compiler has to evaluate the code so that the value of c is known at compile time. This holds even for -O0 builds!
All functions used to calculate c must be constexpr and available (which, as a result, enforces the behaviour asked about in the first question).
No undefined behaviour may be triggered in the calculation of c (for the given value).
The downside is that your input needs to be known at compile time.
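For example, here is a minimal sketch (sum_to is an invented function) of forcing the computation to happen at compile time by assigning the result to a constexpr variable, which triggers exactly the requirements listed above:

constexpr long sum_to(long n) {   // loops in constexpr functions need C++14
    long total = 0;
    for (long i = 1; i <= n; ++i)
        total += i;
    return total;
}

constexpr long kSum = sum_to(1000); // evaluated by the compiler, even at -O0

int use() {
    return static_cast<int>(kSum);  // the constant 500500 is baked into the binary
}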
C++17 also provides if constexpr, which has similar requirements: the condition needs to be evaluable at compile time. The result is that the discarded branch isn't compiled (inside a template it may even contain code that doesn't work for the type you are using).
Which then brings us to the question: how do I ensure sufficient optimization for my program to run fast enough, even if my compiler isn't well behaved? Here the only relevant answer is: create benchmarks and compare the results. Take the effort to set up a CI job that automates this for you. (And yes, you can even use external hardware, although that isn't as easy.) In the end, you have requirements such as: handling A should take less than X seconds. Do A several times and time it; a sketch of such a check follows below. Even if the benchmarks don't cover everything, as long as the results stay within the requirements, it's fine.
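A rough sketch of such a check (handle_A and the time budget are placeholders, not anything from the question) could look like this; a CI job can simply fail on the non-zero exit code:

#include <chrono>
#include <cstdio>
#include <cstdlib>

static void handle_A() {
    // stand-in for the real operation under test
    volatile int sink = 0;
    for (int i = 0; i < 1000; ++i)
        sink = sink + i;
}

int main() {
    using clock = std::chrono::steady_clock;
    const int runs = 1000;
    const auto budget = std::chrono::milliseconds(200); // "handling A takes less than X"

    const auto start = clock::now();
    for (int i = 0; i < runs; ++i)
        handle_A();
    const auto elapsed = clock::now() - start;

    if (elapsed > budget) {
        std::fprintf(stderr, "handle_A exceeded its time budget\n");
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}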
Note: as this is about debugging, you can most likely track the size of the executable as well. As soon as you start using streams, lots of conversions to string and so on, your executable size will grow. (And you'll find this a blessing, as you will immediately spot commits which add 10% to the image size.)
And then the final question: what if you have a buggy compiler that doesn't meet your requirements? Here the only answer is: replace it. In the end, you can use any compiler to compile your code for bare metal, as long as the linker scripts work. If you need a starting point, the talk C++Now 2018: Michael Caisse, “Modern C++ in Embedded Systems” gives you a very good idea of what is needed to switch to a different compiler (like a recent Clang or GCC, against which you can even file bugs if the optimization isn't good enough).
Insert a reference to an external function (or external data) into the block that should be verified to be optimised away, like this:
extern void nop();

constexpr bool isDebugging = false; // somehow a global flag

void foo(int arg) {
    if (isDebugging) {
        nop();
        std::cout << "Arg was " << arg << std::endl; // may not appear in production binary!
    }
    // normal code here...
}
In debug builds, link with an implementation of nop() in an extra compilation unit, nop.cpp:
void nop() {}
In release builds, don't provide an implementation: the release build will only link if the optimisable code has been eliminated.
- kisch
Here's another nice solution using inline assembly.
This uses assembler directives only, so it might even be kind of portable (checked with clang).
constexpr bool isDebugging = false; // somehow a global flag

void foo(int arg) {
    if (isDebugging) {
        asm(".globl _marker\n_marker:\n");
        std::cout << "Arg was " << arg << std::endl; // may not appear in production binary!
    }
    // normal code here...
}
This would leave an exported linker symbol in the compiled executable, if the code isn't optimised away. You can check for this symbol using nm(1).
clang can even stop the compilation right away:
constexpr bool isDebugging = false; // somehow a global flag

void foo(int arg) {
    if (isDebugging) {
        asm("_marker=1\n");
        std::cout << "Arg was " << arg << std::endl; // may not appear in production binary!
    }
    asm volatile (
        ".ifdef _marker\n"
        ".err \"code not optimised away\"\n"
        ".endif\n"
    );
    // normal code here...
}
This is not an answer to "how do I ensure some code is optimized away?" but to your summary line "can a unit test be written to check that, e.g., whole loops are optimized away?".
First, the answer depends on how broadly you define the scope of unit testing: if you count performance tests, you might have a chance.
If, in contrast, you understand unit testing as a way to test the functional behaviour of the code, then you don't: optimizations (if the compiler works correctly) must not change the behaviour of standard-conforming code.
With incorrect code (code that has undefined behaviour), optimizers can do what they want. (Strictly, the compiler may do so in the non-optimizing case as well, but sometimes only the deeper analyses performed during optimization let it detect that some code has undefined behaviour.) Thus, if you write unit tests for a piece of code with undefined behaviour, the results may differ between optimized and unoptimized builds. Strictly speaking, that only tells you that the compiler translated the code differently in the two cases; it does not guarantee that the code was optimized in the way you wanted.
Here's another way, which also covers the first example (the destructor loop).
You can verify (at runtime) that the code has been eliminated, by comparing two labels placed around it.
This relies on the GCC extension "Labels as Values" https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html
before:
    for (size_t i = 0; i < elements; i++)
        buffer[i].~T();
behind:
    if (intptr_t(&&behind) != intptr_t(&&before)) abort();
It would be nice if you could check this in a static_assert(), but sadly the difference of two &&label expressions is not accepted as a compile-time constant.
GCC insists on inserting a runtime comparison, even though both labels are in fact at the same address.
Interestingly, if you compare the addresses (as void*) directly, without casting them to intptr_t, GCC wrongly folds the comparison to "always true", whereas clang correctly optimises away the complete if() as "always false", even at -O1.
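For reference, the fragments above put together into one compilable sketch (destroy_all is an invented name; this relies on the GCC/Clang "labels as values" extension and is not standard C++):

#include <cstdint>
#include <cstdlib>
#include <cstddef>

template <typename T>
void destroy_all(T* buffer, std::size_t elements) {
before:
    for (std::size_t i = 0; i < elements; ++i)
        buffer[i].~T();
behind:
    // If the loop was eliminated, both labels end up at the same address and
    // abort() is never reached (GCC still emits the comparison, as noted above).
    if (std::intptr_t(&&behind) != std::intptr_t(&&before))
        std::abort();
}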
Since I'm coming from the Java world, I wonder why the compiler doesn't warn about unreachable code in something like:
int main(int argc, char** argV)
{
    std::list<int> lst = {1, 2, 3, 4};
    return 0;
    std::cout << "Done!!!" << std::endl;
    return 0;
}
My question: why can I compile code with two return statements?
My compiler is GCC with C++11, on Windows, using Code::Blocks.
I wonder why the compiler doesn't warn about unreachable code in something like
It is explained pretty well in the GCC documentation on warnings:
-Wunreachable-code
Warn if the compiler detects that code will never be executed. This option is intended to warn when the compiler detects that at least a whole line of source code will never be executed, because some condition is never satisfied or because it is after a procedure that never returns.

It is possible for this option to produce a warning even though there are circumstances under which part of the affected line can be executed, so care should be taken when removing apparently-unreachable code.

For instance, when a function is inlined, a warning may mean that the line is unreachable in only one inlined copy of the function.

This option is not made part of -Wall because in a debugging version of a program there is often substantial code which checks correct functioning of the program and is, hopefully, unreachable because the program does work. Another common use of unreachable code is to provide behavior which is selectable at compile-time.
That said, g++ 5.1.0 does not produce any warning for this code even with this option enabled.
Why shouldn't you be able to compile code that has multiple returns?
Because the code is unreachable? Most compilers can issue a warning for that.
However, I often see code like:
if (a)
{
    // Do stuff
}
else
{
    // Do other stuff
    if (b)
    {
        // Do more stuff
    }
    else
    {
        // Do other more stuff
    }
}
That could be simplified as
if (a)
{
    // Do stuff
    return;
}
// Do other stuff
if (b)
{
    // Do more stuff
    return;
}
// Do other more stuff
About a decade ago, people frowned on having more than one return in a function or method, but there really is no reason to keep frowning on it with modern compilers.
Because this part
std::cout << "Done!!!" << std::endl;
return 0;
will never be executed because of the first return statement. That is not an error that aborts compilation; rather, the compiler may emit a warning, depending on which compiler you are using (e.g. Microsoft's VC++ compiler warns about it).
Unreachable code is not a compile error in C++, but usually gives a warning, depending on your compiler and flags.
You can try adding the -Wall option when you invoke your compiler. This will activate many useful warnings.
Mainly because, more often than not, the compiler cannot know for sure. (This has been done in Java, but there the criteria for defining reachability have been fixed by the specification.)
In this case, indeed, it is obvious.
Some compilers do issue reachability warnings but the C++ standard does not require this.
No answer on reachability is complete without referencing this: https://en.wikipedia.org/wiki/Halting_problem
As a final remark on Java, consider these two Java snippets:
if (true) {
    return;
}
; // this statement is defined to be reachable
and
while (true) {
    return;
}
; // this statement is defined to be unreachable
The worst of both worlds is attained, in my humble opinion.
There are two reasons for this:
C++ has many standards (C++11, C++14, C++17, etc.), unlike Java (Java is very rigid about its standard, and the only thing that really matters is the version you are using), so some compilers might warn you about the unreachable code while others might not.
The statements after return 0, though logically unreachable, do not cause any fatal error such as an ambiguity or a syntax error, and so can be compiled without trouble (if the compiler wishes to ;) ).
I encountered a bit of weird-looking code, and it got me wondering whether there's any practical application to it, or whether it's just a random oddity.
The code essentially looks like this:
#ifdef PREPROCESSOR_CONDITION
if (runtime_condition) {
} else
#endif
{
//expression
}
I included the macro bit, though I doubt it has any bearing. There's no code that runs when runtime_condition is true, only the else block. I figure this ought to be identical to using if (!runtime_condition) with no else block (which would have been more straightforward), but maybe there's some kind of compiler-optimization thing happening?
Or, you know, it could be that there used to be something in the if block that got deleted and nobody bothered to change the expression.
The "macro bit" is significant.
Consider what happens if the programmer mistakenly changes the snippet to
#ifdef PREPROCESSOR_CONDITION
if (runtime_condition)
{
}
else
#endif
{
//expression
}
else
{
// another expression
}
This will result in a compilation error, regardless of whether PREPROCESSOR_CONDITION is defined or not.
With your change, viz.:
#ifdef PREPROCESSOR_CONDITION
if (!runtime_condition)
#endif
{
//expression
}
else
{
// another expression
}
it will compile if PREPROCESSOR_CONDITION is defined, but fail if it is not defined.
If the programmer who adds the else only attempts compilation with PREPROCESSOR_CONDITION defined, no problem will be found. There will then be a latent defect in the code that will not be exposed until the code is compiled with PREPROCESSOR_CONDITION undefined.
This may seem minor in a single code snippet, but compilation errors occurring by surprise (i.e. code breaking in unexpected places) are a significant productivity concern in larger projects.
Ensuring that the conditional always has two branches (on those occasions when it is compiled as a conditional) guarantees that someone working on a platform that always has PREPROCESSOR_CONDITION defined won't absent-mindedly add a real else block below, which clearly isn't how this block is supposed to work (or worse, "fix" it so that their code compiles everywhere, but in the process damage whatever the original author intended by constructing the blocks that way).
If that is the intent, though, it would normally make sense to communicate it explicitly by hiding the exact details of the if-empty-else behind a macro named something like UNLESS; a sketch follows below.
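A sketch of that idea (UNLESS is a hypothetical name; the guard is the macro from the question): the empty-if/dangling-else detail then lives in one place, and the call site reads as a plain block. Note that accidentally appending a real else after the block still fails to compile in both configurations, which preserves the property described above.

#ifdef PREPROCESSOR_CONDITION
#define UNLESS(cond) if (cond) {} else
#else
#define UNLESS(cond)
#endif

void example(bool runtime_condition) {
    UNLESS(runtime_condition) {
        // expression
    }
}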
It definitely isn't anything to do with optimization. If you think in terms of the jumps involved, the compiler already has to invert the predicate to decide whether to skip the block (i.e. if (A) {B;} C: ... really means if (!A) goto C; B; C: ...), so it folds a hand-written ! at the outermost level of the condition into the branch structure anyway. Since most instruction sets provide both jump-if-true and jump-if-false instructions, doing this is completely free, and both ways of writing "if not" in the source will produce the same machine code even with a very simple compiler and no optimization.
Okay, little oddity I discovered with my C++ compiler.
I had a not-overly-complex bit of code to refactor, and I accidentally managed to leave in a path that didn't have a return statement. My bad. On the other hand, it compiled, and then segfaulted when I ran it and that path was hit, obviously.
Here's my question: Is this a compiler bug, or is there no guarantee that a C++ compiler will enforce the need for a return statement in a non-void return function?
Oh, and to be clear: in this case it was an unnecessary if statement without an accompanying else. No gotos, no exits, no aborts.
Personally I think this should be an error:
int f() {
}

int main() {
    int n = f();
    return 0;
}
but most compilers treat it as a warning, and you may even have to use compiler switches to get that warning. For example, on g++ you need -Wall to get:
[neilb#GONERIL NeilB]$ g++ -Wall nr.cpp
nr.cpp: In function 'int f()':
nr.cpp:2: warning: no return statement in function returning non-void
Of course, with g++ you should always compile with at least -Wall anyway.
There is no guarantee that a C++ compiler will enforce that. A C++ function can leave its control flow by mechanisms unknown to the compiler. Context switches, when C++ is used to write an OS kernel, are one example of that; an uncaught exception thrown by a called function (whose code isn't necessarily available to the caller) is another.
Some other languages, like Java, explicitly enforce that, with the knowledge available at compile time, all paths return a value. In C++ this isn't the case, just as in many other parts of the language: accessing an array out of bounds isn't checked either, for example.
The compiler doesn't enforce this because you have knowledge about what paths are practically possible that the compiler doesn't. The compiler typically only knows about that particular file, not others that may affect the flow inside any given function. So, it isn't an error.
In Visual Studio, though, it is a warning. And we should pay attention to all warnings.... right? :)
Edit:
There seems to be some discussion about when this could happen. Here's a modified but real example from my personal code library:
enum TriBool { Yes, No, Maybe };

TriBool GetResult(int input) {
    if (TestOne(input)) {
        return Yes;
    } else if (TestTwo(input)) {
        return No;
    }
}
Bear with me, because this is old code. Originally there was an "else return Maybe" in there. :) If TestOne and TestTwo are in a different compilation unit, then when the compiler hits this code it cannot tell whether TestOne and TestTwo could both return false for a given input. You, as the programmer who wrote TestOne and TestTwo, know that if TestOne fails then TestTwo will succeed. Maybe those tests have side effects, so they have to be run. Would it be better to write it without the "else if"? Maybe. Probably. But the point is that this is legal C++, and the compiler can't know whether it is possible to reach the end without a return statement. It is, I agree, ugly and not good coding, but it is legal; Visual Studio will give you a warning, but it will compile.
Remember that C++ isn't about protecting you from yourself. It is about letting you do what your heart desires within the constraints of the language even if that includes shooting yourself in the foot.
I came across the following code that compiles fine (using Visual Studio 2005):
SomeObject SomeClass::getSomeThing()
{
for each (SomeObject something in someMemberCollection)
{
if ( something.data == 0 )
{
return something;
}
}
// No return statement here
}
Why does this compile if there is no return statement at the end of the method?
This is to support backwards compatibility with C, which did not strictly require a return from all functions. In those cases you were simply left with whatever value happened to be in the return position (stack or register).
If this compiles without a warning, though, you likely don't have your warning level set high enough. Most compilers will warn about this now.
It's possible to write code that is guaranteed to always return a value, but the compiler might not be able to figure that out. One trivial example would be:
int func(int x)
{
    if (x > 0)
        return 1;
    else if (x == 0)
        return 0;
    else if (x < 0)
        return -1;
}
As far as the compiler is concerned, it's possible that all 3 if statements evaluate to false, in which case control would fall off the end of the function, returning an undefined result. Mathematically, though, we know it's impossible for that to happen, so this function has defined behavior.
Yes, a smarter compiler might be able to figure this out, but imagine that instead of integer comparisons we had calls to external functions defined in a separate translation unit. Then we as humans might be able to prove that all control paths return a value, but the compiler certainly can't figure that out.
The reason this is allowed is for compatibility with C, and the reason that C allows it is for compatibility with legacy C code that was written before C was standardized (pre-ANSI). There was code that did exactly this, so to allow such code to remain valid and error-free, the C standard permitted this. Letting control fall off a function without returning a value is still undefined behavior, though.
Any decent compiler should provide a warning about this; depending on your compiler, you may have to turn your warning level way up. I believe the option for this warning with gcc is -Wextra, which also includes a bunch of other warnings.
Set the warning level to 4 and try it out. "Not all control paths return a value" is the warning I remember getting in this situation.
Probably your particular compiler is not doing as good a control-flow analysis as it should.
What compiler version are you using and what switches are you using to compile with?