I've already read a couple of times (e.g. here: Compiler: What if condition is always true / false) that any decent C++ compiler will optimize out something like
if(false)
{
...
}
But what if there is an intentional jump into this if(false) block? I have something like this in mind:
#include <iostream>
void func(int part){
switch (part) {
case 0:{
if(false)
case 1:{std::cout << "hello" << std::endl;}
break;
}
default:
break;
}
}
int main()
{
func(0);
func(1);
return 0;
}
Is any decent C++ compiler going to respect the jump, or will optimizing the block out eventually cause problems?
The code doesn't appear to have undefined behavior. Optimizations are therefore not allowed to change the observable behavior of the program, so the compiler must keep the code that is reachable through case 1 even if it drops the if(false) test itself.
Note: Related to this kind of code, one thing you are not allowed to do is jump (with goto or a case label) past the initialization of a local variable that is still in scope at the target. This code doesn't do that, so no problem.
Another note: If you have this kind of code in a "real" (not toy, experiment, obfuscation exercise etc) program, you should really refactor it into something which doesn't elicit quite so many WTFs from anybody reading the code.
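As a rough illustration of such a refactoring, here is a minimal sketch that keeps the same observable behavior as the func above (nothing printed for 0, "hello" for 1) without jumping into an if(false) block. The extra stream parameter and the run helper are assumptions added only so that the output can be inspected; they are not part of the original code.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical refactoring of func(): each case states directly what it does.
// The 'out' parameter is an addition for illustration; it defaults to std::cout.
void func(int part, std::ostream& out = std::cout) {
    switch (part) {
    case 1:
        out << "hello" << std::endl;
        break;
    case 0:   // the original skipped the output on this path
    default:
        break;
    }
}

// Hypothetical helper: collects whatever func() writes for a given argument.
std::string run(int part) {
    std::ostringstream captured;
    func(part, captured);
    return captured.str();
}
```

Calling func(0) and then func(1) as in the original main should behave exactly as before, only with fewer surprises for the reader.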
Recently, I was reviewing some code I maintain and I noticed a practice different than what I am used to. As a result, I'm wondering which method to use when performing an early return in a function.
Here's some example:
Version 1:
int MyFunction(int* ptr)
{
if(!ptr) { // oh no, NULL pointer!
return -1; // what was the caller doing? :(
}
// other code goes here to do work on the pointer
// ...
return 0; // we did it!
}
Version 2:
int MyFunction(int* ptr)
{
if(!ptr) { // oh no, NULL pointer!
return -1; // what was the caller doing? :(
} else { // explicitly show that this only gets called when the if statement fails
// other code goes here to do work on the pointer
// ...
return 0; // hooray!
}
}
As a result, I'm wondering which is considered the "best practice" for those of you who have endured (and survived) many code reviews. I know each effectively does the same thing, but does the "else" add much in terms of readability and clarity? Thanks for the help.
The else would only add clarity if the else clause is short, a few lines of code at best. And if you have several initial conditions you want to check, the source gets cluttered very quickly.
The only time I would use an else is in a small function with a small else clause, meaning fewer than about 10 source lines, and with no other initial checks to make.
In some cases I have used a single loop so that a series of initial checks can use a break to leave.
do {
...
} while (0);
I am loath to use a goto, which is practically guaranteed to get at least one true believer in goto-less programming up in arms.
So much would depend on any code standards of your organization. I tend to like minimalism so I use the first version you provide without the else.
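To make the single-loop idea concrete, here is a minimal sketch of a series of initial checks that leave through break. The function name, the checks, and the status codes are hypothetical and chosen only for illustration.

```cpp
#include <cstddef>

// Hypothetical checks; names and status codes are for illustration only.
int CheckBuffer(const char* buf, std::size_t len)
{
    int status = -1;                       // assume failure until every check passes
    do {
        if (!buf) break;                   // leaves the block, status stays -1
        if (len == 0) { status = -2; break; }
        if (buf[0] == '\0') { status = -3; break; }
        status = 0;                        // all checks passed
    } while (0);                           // the body runs exactly once
    return status;                         // single exit point
}
```

The do/while here is not a real loop; it only provides a block that break can leave, which is what makes it a substitute for goto in this situation.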
I might also do something like the following in a smaller function, say fewer than 20 or 30 lines:
int MyFunction(int* ptr)
{
int iRetStatus = -1; // we have an error condition
if (ptr) { // good pointer
// stuff to do in this function
iRetStatus = 0;
}
return iRetStatus; // we did it!
}
The only problem with returns in the body of the function is that people scanning the function sometimes do not realize that there is a return. In small functions where pretty much everything fits on a single screen, the chance of missing a return is small. In large functions, however, returns in the middle can be missed, especially in complex functions that have gone through several maintenance cycles and accumulated a lot of cruft and workarounds.
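One way to keep early returns easy to spot, when there are several initial checks, is to group them all at the very top of the function so that no return hides in the middle of the work. A minimal sketch, assuming a hypothetical SumPositive function and made-up error codes:

```cpp
#include <cstddef>

// Hypothetical example: all early returns sit together at the top,
// the actual work follows, and the success return is the last statement.
int SumPositive(const int* values, std::size_t count, long* out)
{
    if (!values) return -1;      // no input array
    if (!out) return -2;         // nowhere to store the result
    if (count == 0) return -3;   // nothing to sum

    long sum = 0;
    for (std::size_t i = 0; i < count; ++i) {
        if (values[i] > 0) sum += values[i];
    }
    *out = sum;
    return 0;                    // success
}
```

Whether this reads better than a single exit point is largely a matter of the coding standard in use.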
Let's say you have a function in C/C++, that behaves a certain way the first time it runs. And then, all other times it behaves another way (see below for example). After it runs the first time, the if statement becomes redundant and could be optimized away if speed is important. Is there any way to make this optimization?
bool val = true;
void function1() {
if (val == true) {
// do something
val = false;
}
else {
// do other stuff, val is never set to true again
}
}
gcc has a builtin function that lets you give the compiler a branch prediction hint:
__builtin_expect
http://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
For example in your case:
bool val = true;
void function1()
{
if (__builtin_expect(val, 0)) {
// do something
val = false;
}
else {
// do other stuff, val is never set to true again
}
}
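If the same source also has to build with compilers that lack this builtin, the hint can be hidden behind a macro. A minimal sketch, assuming a hypothetical UNLIKELY macro; the two counters are also hypothetical and exist only to make the behavior observable:

```cpp
// Hypothetical wrapper: uses the builtin where it is known to exist,
// and falls back to the plain condition everywhere else.
#if defined(__GNUC__) || defined(__clang__)
#  define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#  define UNLIKELY(x) (x)
#endif

bool val = true;
int first_runs = 0;   // counts executions of the first-call branch
int other_runs = 0;   // counts executions of the common branch

void function1()
{
    if (UNLIKELY(val)) {
        ++first_runs; // do something
        val = false;
    }
    else {
        ++other_runs; // do other stuff, val is never set to true again
    }
}
```

The hint only influences code layout; the program behaves the same with or without it.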
You should only make the change if you're certain that it truly is a bottleneck. With hardware branch prediction, the if statement costs next to nothing, since it follows a very predictable pattern.
That said, you can use callbacks:
#include <iostream>
using namespace std;
typedef void (*FunPtr) (void);
FunPtr method;
void subsequentRun()
{
std::cout << "subsequent call" << std::endl;
}
void firstRun()
{
std::cout << "first run" << std::endl;
method = subsequentRun;
}
int main()
{
method = firstRun;
method();
method();
method();
}
produces the output:
first run
subsequent call
subsequent call
You could use a function pointer but then it will require an indirect call in any case:
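void firstCall();  // forward declarations, so the pointer can be initialized
void otherCalls();
void (*yourFunction)(void) = &firstCall;
void firstCall() {
// ...
yourFunction = &otherCalls; // every later call goes straight to otherCalls
}
void otherCalls() {
// ...
}
int main()
{
yourFunction();
}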
One possible method is to compile two different versions of the function (this can be done from a single function in the source with templates), and use a function pointer or object to decide at runtime. However, the pointer overhead will likely outweigh any potential gains unless your function is really expensive.
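A minimal sketch of that idea, under the assumption that both variants can be generated from one function template with a bool parameter. The names functionImpl and calls are hypothetical; calls only records which variant ran.

```cpp
#include <string>
#include <vector>

// Hypothetical record of which variant ran, only so the behavior can be observed.
std::vector<std::string> calls;

template <bool FirstCall>
void functionImpl();

// Callers go through this pointer; it starts at the first-call variant.
void (*function1)() = &functionImpl<true>;

template <bool FirstCall>
void functionImpl()
{
    if (FirstCall) {                       // compile-time constant in each instantiation
        calls.push_back("first");          // do something
        function1 = &functionImpl<false>;  // later calls skip this variant entirely
    } else {
        calls.push_back("other");          // do other stuff
    }
}
```

Each instantiation contains only one of the two branches after optimization, but every call now goes through the pointer, which is the overhead mentioned above.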
You could use a static member variable instead of a global variable.
Or, if the code you're running the first time changes something for all future uses (e.g., opening a file), you could use that change as the check that determines whether or not to run the code (i.e., check whether the file is open). This would save you the extra variable. It might also help with error checking: if for some reason the initial change is undone by another operation (e.g., the file is on removable media that is removed improperly), your check could try to redo the change.
A compiler can only optimize what is known at compile time.
In your case, the value of val is only known at runtime, so it can't be optimized.
The if test is very quick, you shouldn't worry about optimizing it.
If you'd like to make the code a little bit cleaner you could make the variable local to the function using static:
void function() {
static bool firstRun = true;
if (firstRun) {
firstRun = false;
...
}
else {
...
}
}
On entering the function for the first time, firstRun would be true, and it would persist so each time the function is called, the firstRun variable will be the same instance as the ones before it (and will be false each subsequent time).
This could be combined well with @ouah's solution.
Compilers like g++ (and I'm sure msvc) support generating profile data during a first run, then using that data to better guess which branches are most likely to be followed and optimizing accordingly. If you're using gcc, look at the -fprofile-generate option and its counterpart -fprofile-use.
The expected behavior is that the compiler will lay out that if statement so that the else comes first, thus avoiding the jmp operation on all your subsequent calls and making it pretty much as fast as if it weren't there, especially if you return somewhere in that else (thus avoiding having to jump past the 'if' statements).
One way to make this optimization is to split the function in two. Instead of:
void function1()
{
if (val == true) {
// do something
val = false;
} else {
// do other stuff
}
}
Do this:
void function1()
{
// do something
}
void function2()
{
// do other stuff
}
One thing you can do is put the logic into the constructor of an object, which is then defined static. If such a static object occurs in a block scope, the constructor is run the first time that an execution of that scope takes place. The once-only check is emitted by the compiler.
You can also put static objects at file scope, and then they are initialized before main is called.
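A minimal sketch of the block-scope variant, with a hypothetical FirstCallSetup class and a counter that exists only to show how often the constructor runs. Note one difference from the original function1: here the "other stuff" also runs on the first call, after the setup.

```cpp
int setup_runs = 0;  // hypothetical counter, for illustration only

// Hypothetical class whose constructor holds the first-call logic.
struct FirstCallSetup {
    FirstCallSetup() { ++setup_runs; /* do something */ }
};

void function1()
{
    // Constructed the first time control passes through this declaration;
    // the compiler emits the once-only check.
    static FirstCallSetup setup;
    (void)setup;  // silence unused-variable warnings

    // do other stuff
}
```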
I'm giving this answer because perhaps you're not making effective use of C++ classes.
(Regarding C/C++, there is no such language. There is C and there is C++. Are you working in C that has to also compile as C++ (sometimes called, unofficially, "Clean C"), or are you really working in C++?)
What is "Clean C" and how does it differ from standard C?
To remain compiler-independent, you can put the body of the if in one function and the body of the else in another. Almost all compilers optimize if/else well. Since the else branch is the likely one here, keep the rarely executed code in the if and move the rest into a separate function that is called from the else.
I have a "MyFunction" I keep obsessing over if I should or shouldn't use goto on it and in similar (hopefully rare) circumstances. So I'm trying to establish a hard-and-fast habit for this situation. To-do or not-to-do.
int MyFunction()
{ if (likely_condition)
{
condition_met:
// ...
return result;
}
else /*unlikely failure*/
{ // meet condition
goto condition_met;
}
}
I was intending to net the benefits of the failed conditional jump instruction for the likely case. However I don't see how the compiler could know which to streamline for case probability without something like this.
1. Does it work correctly?
2. Are the benefits worth the confusion?
3. Are there better (less verbose, more structured, more expressive) ways to enable this optimization?
It appears to me that the optimization you're trying to do is mostly obsolete. Most modern processors have branch prediction built in, so (assuming it's used enough to notice) they track how often a branch is taken or not and predict whether the branch is likely to be taken or not based on its past pattern of being taken or not. In this case, speed depends primarily on how accurate that prediction is, not whether the prediction is for taken vs. not taken.
As such, you're probably best off with rather simpler code:
int MyFunction() {
if (!likely_condition) {
meet_condition();
}
// ...
return result;
}
A modern CPU will take that branch either way with equal performance if it makes the correct branch prediction. So if that is in an inner loop, the performance of if (unlikely) { meet condition } common code; will match what you have written.
Also, if you spell out the common code in both branches the compiler will generate code that is identical to what you have written: The common case will be emitted for the if clause and the else clause will jmp to the common code. You see this all the time with simpler terminal cases like *out = whatever; return result;. When debugging it can be hard to tell which return you're looking at because they've all been merged.
1. It looks like the code should work as you expect, as long as jumping to condition_met: doesn't skip any variable initializations.
2. No, and you don't even know that the obfuscated version compiles into more optimal code. Compiler optimizations (and processor branch prediction) have become very smart.
3.
int MyFunction()
{
if (!likely_condition)
{
// meet condition
}
condition_met:
// ...
return result;
}
or, if it helps your compiler (check the assembly)
int MyFunction()
{
if (likely_condition); else
{
// meet condition
}
condition_met:
// ...
return result;
}
I would highly recommend using the __builtin_expect() built-in (GCC) or the equivalent for your particular C++ compiler (see Portable branch prediction hints) instead of using goto:
int MyFunction()
{ if (__builtin_expect(likely_condition, 1))
{
// ...
return result;
}
else /*unlikely failure*/
{ // meet condition
}
}
As others also mentioned, goto is error-prone and evil to the bone.
As we all know, it's not that easy to break out of an outer loop from within a nested loop without either:
a goto (Example code.)
another condition check in the outer loop (Example code.)
putting both loops in an extra function and returning instead of breaking (Example code.)
Though, you gotta admit, all of those are kinda clumsy. The function version in particular suffers from losing the context in which the loops are called, as you'd need to pass everything the loops need as parameters.
Additionally, the second one gets worse for each nested loop.
So, I personally, still consider the goto version to be the cleanest.
Now, thinking all C++0x and stuff, the third option brought me this idea utilizing lambda expressions:
#include <iostream>
bool CheckCondition(){
return true;
}
bool CheckOtherCondition(){
return false;
}
int main(){
[&]{while(CheckCondition()){
for(;;){
if(!CheckOtherCondition())
return;
// do stuff...
}
// do stuff...
}}();
std::cout << "yep, broke out of it\n";
}
(Example at Ideone.)
This allows for the semantic beauty of a simple return that the third option offers while not suffering from the context problems and being (nearly) as clean as the goto version. It's also even shorter (character-wise) than any of the above options.
Now, I've learned to keep my joy down after finding beautiful (ab)uses of the language, because there's almost always some kind of drawback. Are there any on this one? Or is there even a better approach to the problem?
Please don't do that in a project I'm managing. That's an awkward abuse of lambdas in my opinion.
Use a goto where a goto is useful.
Perfectly valid in my opinion. Though I prefer to give mine names, making the code more self-documenting, i.e.
int main(){
auto DoThatOneThing = [&]{while(CheckCondition()){
for(;;){
if(!CheckOtherCondition())
return;
// do stuff...
}
// do stuff...
}};
DoThatOneThing();
std::cout << "yep, broke out of it\n";
}
In which way is that an improvement over
void frgleTheBrgls()
{
while(CheckCondition()) {
for(;;) {
if(!CheckOtherCondition())
return;
// do stuff...
}
// do stuff...
}
}
int main()
{
frgleTheBrgls();
std::cout << "yep, broke out of it\n";
}
This is much better known (functions, you know, as in BASIC), clearer (the algorithm has a nice name explaining what it does), and does exactly the same as yours does.
The function version in particular suffers from losing the context in which the loops are called, as you'd need to pass everything the loops need as parameters.
I see that as an advantage. You see exactly what is needed to frgle the brgls. Explicitness, when programming, is often a good thing.
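To illustrate the point about explicit parameters, here is a minimal sketch of a named function whose return leaves both loops at once. The function findFirst, its grid parameter, and the (-1, -1) "not found" convention are assumptions made up for this example.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical example: everything the loops need arrives as a parameter,
// and a plain return breaks out of both loops at the same time.
std::pair<int, int> findFirst(const std::vector<std::vector<int>>& grid, int target)
{
    for (std::size_t row = 0; row < grid.size(); ++row) {
        for (std::size_t col = 0; col < grid[row].size(); ++col) {
            if (grid[row][col] == target)
                return std::make_pair(static_cast<int>(row), static_cast<int>(col));
        }
    }
    return std::make_pair(-1, -1);  // assumed convention for "not found"
}
```

The signature alone tells a reader that the search depends on nothing but the grid and the target.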
One drawback of your proposed syntax: with more than two nested loops you cannot break out to an intermediate level, because return always leaves all of them. The 'goto' syntax allows this:
int main()
{
for (;;)
{
for (;;)
{
for (;;)
{
if (CheckCondition1()) goto BREAK_ON_COND1;
if (CheckCondition2()) goto BREAK_ON_COND2;
if (CheckCondition3()) break;
// Do stuff when all conditions are false
}
// Do stuff when condition 3 becomes true
}
BREAK_ON_COND2:
; // Do stuff when condition 2 becomes true (a label must be followed by a statement)
}
BREAK_ON_COND1: // When condition 1 becomes true
std::cout << "yep, broke out of it\n";
}
Sun Studio 12.1 prints the warning
Warning: The last statement should return a value.
frequently for functions like that:
int f()
{
/* some code that may return */
// if we end up here, something is broken
throw std::runtime_error("Error ...");
}
It is perfectly clear that we do not need a return value at the end of the function. I hesitate to insert something like
// Silence a compiler warning
return 42;
at the end of such a function, since it is dead code anyway. For more complicated return types, it might actually be difficult to construct a 'sensible' bogus value.
What is the recommended way to silence such a warning?
Can you reorganize the code in the function in such a way (hopefully more logical as well) that the normal path happens at the end of the function so that a return can be used, and the exceptional path happens earlier, NOT as the last statement?
EDIT: If reorganizing the function really doesn't make sense, you can always just put a dummy return 0; with a comment. It's better to squelch the warning that way than more globally.
If you really want to quiet the warning permanently, you can use #pragma error_messages (off, wnoretvalue) but note that the warning really is useful most of the time so I absolutely don't suggest turning it off. You can use the on version of the pragma to re-enable the warning after the function, but the compiler will still emit the warning if your function is ever inlined. If you put the function in its own source file and use the pragma that should shush the warning relatively safely though, since it can't affect other translation units.
Another really wacky possibility is to switch to g++. Unless you're compiling for SPARC, g++ may actually generate better code than Sun Studio.
I find it a perfect spot for abort(). According to you, execution should never end up there, so something like:
UNREACHABLE("message")
which is defined as:
#ifdef NDEBUG
#define UNREACHABLE(Message_) abort();
#else
#define UNREACHABLE(Message_) assert(0 && Message_);
#endif
Looks appropriate
Since you know the exception will always be thrown, why don't you simply return 0?
Perhaps encapsulate the contents in a do { } while (false); construct:
int my_function()
{
int result = DEFAULT_VALUE;
do
{
result = /*...*/
// Whatever
if (error)
{
throw std::runtime_error("Error ...");
}
} while (false);
return result;
}
The idea is for normal operation to set the result value then let the execution flow to the end or use a break to jump to the return statement.
I don't know of a "recommended" way to deal with it, but to answer your question about coping with more complex types, what about:
ComplexType foo()
{
...
throw std::runtime_error( "Error..." );
return *(ComplexType*)(0);
}
This would then work with any return type. I realise it looks evil, but it's there just to silence the warning. As you say, this code will never be executed, and it may even be optimised out.
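For the complex-return-type case, an alternative that avoids fabricating a value is to move the throw into a helper marked [[noreturn]]. This is only a sketch under the assumption that a C++11 compiler is available; whether a particular compiler (Sun Studio included) then drops the warning is not something this example can promise. The names fail and f are hypothetical.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: the attribute tells the compiler that control
// never returns from this call, so nothing needs to follow it.
[[noreturn]] void fail(const std::string& message)
{
    throw std::runtime_error(message);
}

// Hypothetical function in the style of f() above.
int f(int x)
{
    if (x >= 0)
        return x * 2;        // some code that may return

    // if we end up here, something is broken
    fail("Error ...");
}
```

Because no dummy value is ever constructed, the same pattern works unchanged for any return type.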