IF statements and code efficiency - if-statement

this is not related to any problem in particular but just me thinking.Does the presence of lots of IF statements in code signify bad code design and reduce efficiency or not.

If you really want to optimize the code, be aware of this:
if (complexCalculation(someVariable) > 10)
{
}
else if (complexCalculation(someVariable) > 5)
{
}
the point is, if you are trying to optimize some code, try to "cache" the result of calculations in variables, instead of redoing many times the same calculation
int cached = complexCalculation(someVariable);
if (cached > 10)
{
}
else if (cached > 5)
{
}
Why this? Now... If complexCalculation is deterministic based on its parameters (so complexCalculation(N) == complexCalculation(N) always, in simple words, you call it twice with the same parameters and you will receive both times the same result always) and is without side-effects (so it doesn't modify anything else), then the compiler could optimize it freely. The problem is that quite often the compiler isn't able to verify if a function is deterministic and without side-effects, and very very few languages (primarily the functional languages like F#, Haskell...) make it easy to tell it to the compiler (technically in the functiona languages all the functions should be deterministic and without side effects :-) ).

Theoretically, lots of 'if' statements would not significantly reduce code efficiency. The code simply determines the boolean value of the expression and decides whether or not to continue. If there are many 'if' statements within an iterative loop, however, that could cause a larger problem in terms of efficiency.
Bad design is a whole other issue that I'm not going to mention (Ziminji has covered it better than I could).

In general, the presence of a lot of "if" statement is considered bad design. Consider replacing conditionals with polymorphism. This one of the topics in Martin Fowler's book "Refactoring: Improving the Design of Existing Code" on page 255. Checkout the following article if you don't have the book: http://sourcemaking.com/refactoring/replace-conditional-with-polymorphism

Related

Is it a good style to write constants on the left of equal to == in If statement in C++? [duplicate]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
Okay, we know that the following two lines are equivalent -
(0 == i)
(i == 0)
Also, the first method was encouraged in the past because that would have allowed the compiler to give an error message if you accidentally used '=' instead of '=='.
My question is - in today's generation of pretty slick IDE's and intelligent compilers, do you still recommend the first method?
In particular, this question popped into my mind when I saw the following code -
if(DialogResult.OK == MessageBox.Show("Message")) ...
In my opinion, I would never recommend the above. Any second opinions?
I prefer the second one, (i == 0), because it feel much more natural when reading it. You ask people, "Are you 21 or older?", not, "Is 21 less than or equal to your age?"
It doesn't matter in C# if you put the variable first or last, because assignments don't evaluate to a bool (or something castable to bool) so the compiler catches any errors like "if (i = 0) EntireCompanyData.Delete()"
So, in the C# world at least, its a matter of style rather than desperation. And putting the variable last is unnatural to english speakers. Therefore, for more readable code, variable first.
If you have a list of ifs that can't be represented well by a switch (because of a language limitation, maybe), then I'd rather see:
if (InterstingValue1 == foo) { } else
if (InterstingValue2 == foo) { } else
if (InterstingValue3 == foo) { }
because it allows you to quickly see which are the important values you need to check.
In particular, in Java I find it useful to do:
if ("SomeValue".equals(someString)) {
}
because someString may be null, and in this way you'll never get a NullPointerException. The same applies if you are comparing constants that you know will never be null against objects that may be null.
(0 == i)
I will always pick this one. It is true that most compilers today do not allow the assigment of a variable in a conditional statement, but the truth is that some do. In programming for the web today, I have to use myriad of langauges on a system. By using 0 == i, I always know that the conditional statement will be correct, and I am not relying on the compiler/interpreter to catch my mistake for me. Now if I have to jump from C# to C++, or JavaScript I know that I am not going to have to track down assignment errors in conditional statements in my code. For something this small and to have it save that amount of time, it's a no brainer.
I used to be convinced that the more readable option (i == 0) was the better way to go with.
Then we had a production bug slip through (not mine thankfully), where the problem was a ($var = SOME_CONSTANT) type bug. Clients started getting email that was meant for other clients. Sensitive type data as well.
You can argue that Q/A should have caught it, but they didn't, that's a different story.
Since that day I've always pushed for the (0 == i) version. It basically removes the problem. It feels unnatural, so you pay attention, so you don't make the mistake. There's simply no way to get it wrong here.
It's also a lot easier to catch that someone didn't reverse the if statement in a code review than it is that someone accidentally assigned a value in an if. If the format is part of the coding standards, people look for it. People don't typically debug code during code reviews, and the eye seems to scan over a (i = 0) vs an (i == 0).
I'm also a much bigger fan of the java "Constant String".equals(dynamicString), no null pointer exceptions is a good thing.
You know, I always use the if (i == 0) format of the conditional and my reason for doing this is that I write most of my code in C# (which would flag the other one anyway) and I do a test-first approach to my development and my tests would generally catch this mistake anyhow.
I've worked in shops where they tried to enforce the 0==i format but I found it awkward to write, awkward to remember and it simply ended up being fodder for the code reviewers who were looking for low-hanging fruit.
Actually, the DialogResult example is a place where I WOULD recommend that style. It places the important part of the if() toward the left were it can be seen. If it's is on the right and the MessageBox have more parameters (which is likely), you might have to scroll right to see it.
OTOH, I never saw much use in the "(0 == i) " style. If you could remember to put the constant first, you can remember to use two equals signs,
I'm trying always use 1st case (0==i), and this saved my life a few times!
I think it's just a matter of style. And it does help with accidentally using assignment operator.
I absolutely wouldn't ask the programmer to grow up though.
I prefer (i == 0), but I still sort of make a "rule" for myself to do (0 == i), and then break it every time.
"Eh?", you think.
Well, if I'm making a concious decision to put an lvalue on the left, then I'm paying enough attention to what I'm typing to notice if I type "=" for "==". I hope. In C/C++ I generally use -Wall for my own code, which generates a warning on gcc for most "=" for "==" errors anyway. I don't recall seeing that warning recently, perhaps because the longer I program the more reflexively paranoid I am about errors I've made before...
if(DialogResult.OK == MessageBox.Show("Message"))
seems misguided to me. The point of the trick is to avoid accidentally assigning to something.
But who is to say whether DialogResult.OK is more, or less likely to evaluate to an assignable type than MessageBox.Show("Message")? In Java a method call can't possibly be assignable, whereas a field might not be final. So if you're worried about typing = for ==, it should actually be the other way around in Java for this example. In C++ either, neither or both could be assignable.
(0==i) is only useful because you know for absolute certain that a numeric literal is never assignable, whereas i just might be.
When both sides of your comparison are assignable you can't protect yourself from accidental assignment in this way, and that goes for when you don't know which is assignable without looking it up. There's no magic trick that says "if you put them the counter-intuitive way around, you'll be safe". Although I suppose it draws attention to the issue, in the same way as my "always break the rule" rule.
I use (i == 0) for the simple reason that it reads better. It makes a very smooth flow in my head. When you read through the code back to yourself for debugging or other purposes, it simply flows like reading a book and just makes more sense.
My company has just dropped the requirement to do if (0 == i) from its coding standards. I can see how it makes a lot of sense but in practice it just seems backwards. It is a bit of a shame that by default a C compiler probably won't give you a warning about if (i = 0).
Third option - disallow assignment inside conditionals entirely:
In high reliability situations, you are not allowed (without good explanation in the comments preceeding) to assign a variable in a conditional statement - it eliminates this question entirely because you either turn it off at the compiler or with LINT and only under very controlled situations are you allowed to use it.
Keep in mind that generally the same code is generated whether the assignment occurs inside the conditional or outside - it's simply a shortcut to reduce the number of lines of code. There are always exceptions to the rule, but it never has to be in the conditional - you can always write your way out of that if you need to.
So another option is merely to disallow such statements, and where needed use the comments to turn off the LINT checking for this common error.
-Adam
I'd say that (i == 0) would sound more natural if you attempted to phrase a line in plain (and ambiguous) english. It really depends on the coding style of the programmer or the standards they are required to adhere to though.
Personally I don't like (1) and always do (2), however that reverses for readability when dealing with dialog boxes and other methods that can be extra long. It doesn't look bad how it is not, but if you expand out the MessageBox to it's full length. You have to scroll all the way right to figure out what kind of result you are returning.
So while I agree with your assertions of the simplistic comparison of value types, I don't necessarily think it should be the rule for things like message boxes.
both are equal, though i would prefer the 0==i variant slightly.
when comparing strings, it is more error-prone to compare "MyString".equals(getDynamicString())
since, getDynamicString() might return null.
to be more conststent, write 0==i
Well, it depends on the language and the compiler in question. Context is everything.
In Java and C#, the "assignment instead of comparison" typo ends up with invalid code apart from the very rare situation where you're comparing two Boolean values.
I can understand why one might want to use the "safe" form in C/C++ - but frankly, most C/C++ compilers will warn you if you make the typo anyway. If you're using a compiler which doesn't, you should ask yourself why :)
The second form (variable then constant) is more readable in my view - so anywhere that it's definitely not going to cause a problem, I use it.
Rule 0 for all coding standards should be "write code that can be read easily by another human." For that reason I go with (most-rapidly-changing value) test-against (less-rapidly-changing-value, or constant), i.e "i == 0" in this case.
Even where this technique is useful, the rule should be "avoid putting an lvalue on the left of the comparison", rather than the "always put any constant on the left", which is how it's usually interpreted - for example, there is nothing to be gained from writing
if (DateClass.SATURDAY == dateObject.getDayOfWeek())
if getDayOfWeek() is returning a constant (and therefore not an lvalue) anyway!
I'm lucky (in this respect, at least) in that these days in that I'm mostly coding in Java and, as has been mentioned, if (someInt = 0) won't compile.
The caveat about comparing two booleans is a bit of a red-herring, as most of the time you're either comparing two boolean variables (in which case swapping them round doesn't help) or testing whether a flag is set, and woe-betide-you if I catch you comparing anything explicitly with true or false in your conditionals! Grrrr!
In C, yes, but you should already have turned on all warnings and be compiling warning-free, and many C compilers will help you avoid the problem.
I rarely see much benefit from a readability POV.
Code readability is one of the most important things for code larger than a few hundred lines, and definitely i == 0 reads much easier than the reverse
Maybe not an answer to your question.
I try to use === (checking for identical) instead of equality. This way no type conversion is done and it forces the programmer do make sure the right type is passed,
You are right that placing the important component first helps readability, as readers tend to browse the left column primarily, and putting important information there helps ensure it will be noticed.
However, never talk down to a co-worker, and implying that would be your action even in jest will not get you high marks here.
I always go with the second method. In C#, writing
if (i = 0) {
}
results in a compiler error (cannot convert int to bool) anyway, so that you could make a mistake is not actually an issue. If you test a bool, the compiler is still issuing a warning and you shouldn't compare a bool to true or false. Now you know why.
I personally prefer the use of variable-operand-value format in part because I have been using it so long that it feels "natural" and in part because it seems to the predominate convention. There are some languages that make use of assignment statements such as the following:
:1 -> x
So in the context of those languages it can become quite confusing to see the following even if it is valid:
:if(1=x)
So that is something to consider as well. I do agree with the message box response being one scenario where using a value-operand-variable format works better from a readability stand point, but if you are looking for constancy then you should forgo its use.
This is one of my biggest pet peeves. There is no reason to decrease code readability (if (0 == i), what? how can the value of 0 change?) to catch something that any C compiler written in the last twenty years can catch automatically.
Yes, I know, most C and C++ compilers don't turn this on by default. Look up the proper switch to turn it on. There is no excuse for not knowing your tools.
It really gets on my nerves when I see it creeping into other languages (C#,Python) which would normally flag it anyway!
I believe the only factor to ever force one over the other is if the tool chain does not provide warnings to catch assignments in expressions. My preference as a developer is irrelevant. An expression is better served by presenting business logic clearly. If (0 == i) is more suitable than (i == 0) I will choose it. If not I will choose the other.
Many constants in expressions are represented by symbolic names. Some style guides also limit the parts of speech that can be used for identifiers. I use these as a guide to help shape how the expression reads. If the resulting expression reads loosely like pseudo code then I'm usually satisfied. I just let the expression express itself and If I'm wrong it'll usually get caught in a peer review.
We might go on and on about how good our IDEs have gotten, but I'm still shocked by the number of people who turn the warning levels on their IDE down.
Hence, for me, it's always better to ask people to use (0 == i), as you never know, which programmer is doing what.
It's better to be "safe than sorry"
if(DialogResult.OK == MessageBox.Show("Message")) ...
I would always recommend writing the comparison this way. If the result of MessageBox.Show("Message") can possibly be null, then you risk a NPE/NRE if the comparison is the other way around.
Mathematical and logical operations aren't reflexive in a world that includes NULLs.

What are the advantages of squashing assignment and error checking in one line?

This question is inspired by this question, which features the following code snippet.
int s;
if((s = foo()) == ERROR)
print_error();
I find this style hard to read and prone to error (as the original question demonstrates -- it was prompted by missing parentheses around the assignment). I would instead write the following, which is actually shorter in terms of characters.
int s = foo();
if(s == ERROR)
print_error();
This is not the first time I've seen this idiom though, and I'm guessing there are reasons (perhaps historical) for it being so often used. What are those reasons?
I think it's for hysterical reasons, that early compilers were not so smart at optimizing. By putting it on one line as a single expression, it gives the compiler a hint that the same value fetched from foo() can be tested rather than specifically loading the value from s.
I prefer the clarity of your second example, with the assignment and test done later. A modern compiler will have no trouble optimizing this into registers, avoiding unnecessary loads from memory store.
When you are writing a loop, it is sometimes desirable to use the first form, as in this famous example from K&R:
int c;
while ((c = getchar()) != EOF) {
/* stuff */
}
There is no elegant "second-form" way of writing this without a repetition:
int c = getchar();
while (c != EOF) {
/* stuff */
c = getchar();
}
Or:
int c;
for (c = getchar(); c != EOF; c = getchar()) {
/* stuff */
}
Now that the assignment to c is repeated, the code is more error-prone, because one has to keep both the statements in sync.
So one has to be able to learn to read and write the first form easily. And given that, it seems logical to use the same form in if conditions as well.
I tend to use the first form mostly because I find it easy to read—as someone else said, it couples the function call and the return value test much more closely.
I make a conscious attempt at combining the two whenever possible. The "penalty" in size isn't enough to overcome the advantage in clarity, IMO.
The advantage in clarity comes from one fact: for a function like this, you should always think of calling the function and testing the return value as a single action that cannot be broken into two parts ("atomic", if you will). You should never call such a function without immediately testing its return value.
Separating the two (at all) leads to a much greater likelihood that you'll sometimes skip checking the return value completely. Other times, you'll accidentally insert some code between the call and the test of the return value that actually depends on that function having succeeded. If you always combine it all into a single statement, it (nearly) eliminates any possibility of falling into these traps.
I would always go for the second. It's easier to read, there's no danger of omitting the parentheses around the assignment and it is easier to step through with a debugger.
I often find the separation of the assignment out into a different line makes debugger watch or "locals" windows behave better vis-a-vis the presence and correct value of "s", at least in non-optimized builds.
It also allows the use of step-over separately on the assignment and test lines (again, in non-optimized builds), which can be helpful if you don't want to go mucking around in disassembly or mixed view.
YMMV per compiler and debugger and for optimized builds, of course.
I personally prefer for assignments and tests to be on different lines. It is less syntactically complicated, less error prone, and more easily understood. It also allows the compiler to give you more precise error/warning locations and often makes debugging easier.
It also allows me to more easily do things like:
int rc = function();
DEBUG_PRINT(rc);
if (rc == ERROR) {
recover_from_error();
} else {
keep_on_going(rc);
}
I prefer this style so much that in the case of loops I would rather:
while (1) {
int rc = function();
if (rc == ERROR) {
break;
}
keep_on_going(rc);
}
than do the assignment in the while conditional. I really don't like for my tests to have side-effects.
I often prefer the first form. I couldn't say exactly why, but it has something to do with the semantic involved.
The second style feels to me more like 2 separate operations. Call the function and then do something with the result, 2 different things. In the first style it's one logical unit. Call the function, save the temprary result and eventually handle the error case.
I know it's pretty vague and far from being completely rational, so I will use one or the other depending on the importance of the saved variable or the test case.
I believe that clarity should always prime over optimizations or "simplifications" based only on the amount of characters typed. This belief has stopped me from making many silly mistakes.
Separating the assignement and the comparison makes both clearer and so less error-prone, even if the duplication of the comparison might introduce a mistake once in a while. Among other things, parentheses become quickly hard to distinguish and keeping everything on one line introduces more parentheses. Also, splitting it up limits statements to doing only one of either fetching a value or assigning one.
However, if you expect people who will read your code to be more comfortable using the one-line idiom, then it is wide-spread enough not to cause any problems for most programmers. C programmers will definately be aware of it, even those that might find it awkward.

Does replacing statements by expressions using the C++ comma operator could allow more compiler optimizations?

The C++ comma operator is used to chain individual expressions, yielding the value of the last executed expression as the result.
For example the skeleton code (6 statements, 6 expressions):
step1;
step2;
if (condition)
step3;
return step4;
else
return step5;
May be rewritten to: (1 statement, 6 expressions)
return step1,
step2,
condition?
step3, step4 :
step5;
I noticed that it is not possible to perform step-by-step debugging of such code, as the expression chain seems to be executed as a whole. Does it means that the compiler is able to perform special optimizations which are not possible with the traditional statement approach (specially if the steps are const or inline)?
Note: I'm not talking about the coding style merit of that way of expressing sequence of expressions! Just about the possible optimisations allowed by replacing statements by expressions.
Most compilers will break your code down into "basic blocks", which are stretches of code with no jumps/branches in or out. Optimisations will be performed on a graph of these blocks: that graph captures all the control flow in the function. The basic blocks are equivalent in your two versions of the code, so I doubt that you'd get different optimisations. That the basic blocks are the same isn't entirely obvious: it relies on the fact that the control flow between the steps is the same in both cases, and so are the sequence points. The most plausible difference is that you might find in the second case there is only one block including a "return", and in the first case there are two. The blocks are still equivalent, since the optimiser can replace two blocks that "do the same thing" with one block that is jumped to from two different places. That's a very common optimisation.
It's possible, of course, that a particular compiler doesn't ignore or eliminate the differences between your two functions when optimising. But there's really no way of saying whether any differences would make the result faster or slower, without examining what that compiler is doing. In short there's no difference between the possible optimisations, but it doesn't necessarily follow that there's no difference between the actual optimisations.
The reason you can't single-step your second version of the code is just down to how the debugger works, not the compiler. Single-step usually means, "run to the next statement", so if you break your code into multiple statements, you can more easily debug each one. Otherwise, if your debugger has an assembly view, then in the second case you could switch to that and single-step the assembly, allowing you to see how it progresses. Or if any of your steps involve function calls, then you may be able to "do the hokey-cokey", by repeatedly doing "step in, step out" of the functions, and separate them that way.
Using the comma operator neither promotes nor hinders optimization in any circumstances I'm aware of, because the C++ standard guarantee is only that evaluation will be in left-to-right order, not that statement execution necessarily will be. (This is the same guarantee you get with statement line order.)
What it is likely to do, though, is turn your code into a confusing mess, since many programmers are unaware that the comma-as-operator even exists, and are apt to confuse it with commas used as parameter separators. (Want to really make your code unreadable? Call a function like my_func((++i, y), x).)
The "best" use of the comma operator I've seen is to work with multiple variables in the iteration statement of a for loop:
for (int i = 0, j = 0;
i < 10 && j < 12;
i += j, ++j) // each time through the loop we're tinkering with BOTH i and j
{
}
Very unlikely IMHO. The thing get's compiled down to assembler/machine code, then further low-level optimizations are done, so it probably turns out to the same thing.
OTOH, if the comma operator is overloaded, the game changes completely. But I'm sure you know that. ;)
The obligatory list:
Don't worry about rewriting almost equivalent code to gain performance
If you have a perf-problem, profile to see what the problem is
If you can't get it faster by algorithmic ops, look at the disassembly and see that the compiler does what you intended
If not, ask here and post source and disassembly for both versions. :)

How could this manner of writing code be called?

I'm reviewing a quite old project and see code like this for the second time already (C++ - like pseudocode):
if( conditionA && conditionB ) {
actionA();
actionB();
} else {
if( conditionA ) {
actionA();
}
if( conditionB ) {
actionB();
}
}
in this code conditionA evaluates to the same result on both computations and the same goes for conditionB. So the code is equivalent to just:
if( conditionA ) {
actionA();
}
if( conditionB ) {
actionB();
}
So the former variant is just twice more code for the same effect. How could such manner (I mean the former variant) of writing code be called?
This is indeed bad coding practice, but be warned that if condition A and B evaluations have any side effects (var increments, etc.) the two fragments are not equivalent.
I would call it bad code. Though I've tended to find similar constructs in project that grew without any code review being done. (Or other lax development practices).
Guys? Look at this part: ( conditionA && conditionB )
Basically, if conditionA happens to be false, then it won't evaluate conditionB.
Now, it would be a bad coding style but if conditionA and conditionB aren't just evaluating data but if there's also some code behind these conditions that change data, there could be a huge difference between both notations!
if conditionA is false then conditionA is evaluated twice and conditionB is evaluated just once.
If conditionA is true and conditionB is false, then both conditions are evaluated twice.
If both conditions are true, both are executed just once.
In the second suggestion, both conditions are executed just once... Thus, these methods are only equivalent if both methods evaluate to true.
To make things more complex, if conditionB is false then actionA could change something that would change this validation! Thus the else branch would then execute actionB too. But if both conditions evaluates to true and actionA would change the evaluation of conditionB to false, it would still execute actionB.
I tend to refer to this kind of code as: "Why make things easy when you can do it the hard way?" and consider this a design pattern. Actually, it's "Mortgage-Driven development" where code is made more complex so the main developer will be the only one to understand it, while other developers will just become confused and hopefully give up to redesign the whole thing. As a result, the original developer is required to stay just to maintain this code, which is called "Job security" and thus be able to pay his mortgage for many, many years.I wonder why something like this would be used, then realized that I use a similar structure in my own code. Similar, but not the same:
if (A&&B){
action1;
} elseif(A){
action2;
} elseif(B){
action3;
} else{action4}
In this case, every action would be the display of a different message. A message that could not be generated by just concatenating two strings. Say, it's a part of a game and A checks if you have enough energy while B checks if you have enough mass. Of you don't have enough mass AND energy, you can't build anything anymore and a critical warning needs to be displayed. If you only have energy, a warning that you have to find more mass would be enough. With only energy, your builders would need to recharge. And with both you can continue to build. Four different actions, thus this weird construction.
However, the sample in the Q shows something completely different. Basically, you'd get one message that you're out of mass and another that you're out of energy. These two messages aren't combined into a single message.
Now, in the example, if conditionA would detect energy levels and conditionB would detect mass levels then both solution would just work fine. Now, if actionA tells your builders to drop their mass and start recharging, you'd suddenly gain a little bit of mass again in your game. But if conditionB indicated that you ran out of mass, that would not be true anymore! Simply because actionA released mass again. if actionB would be the command to tell builders to start collecting mass as soon as they're able then the first solution will give all builders this command and they would start collecting mass first, then they would continue their other actions. In the second solution, no such command would be given. The builders are recharged again and start using the little mass that was just released. If this check is done every 5 minutes then those builders would e.g. recharge in one minute to be idle for 4 more minutes because they ran out of mass. In the first solution, they would immediately start collecting mass.
Yeah, it's a stupid example. Been playing Supreme Commander again. :-) Just wanted to come up with a possible scenario, but it could be improved a lot!...
It's code written by someone who doesn't know how to use a Karnaugh Map.
This is very close to the 'fizzbuzz' Design Pattern:
if( fizz && buzz ) {
printFizz();
printBuzz();
} else {
if( fizz ) {
printFizz();
}
else if( buzz ) {
printBuzz();
}
else {
printValue();
}
}
Maybe the code started life as an instance of fizzbuzz (maybe copied-n-pasted), then was refactored slightly into what you see today due to slightly different requirements, but the refactoring didn't go as far as it probably should have (boolean logic can sometimes be a bit trickier than one might think - hence fizzbuzz as an interview weed-out technique).
I would call it bad code too.
There is no best way on indenting, but there is one golden rule : choose one and stick with it.
This is "redundant" code, and yes it is bad. If some precondition must be added to the calls to actionA (assuming that the precondition can't be put into actionA itself), we now have to modify the code in 2 places, and therefore run the risk of overlooking one of them.
This is one of those times where you can feel better about deleting some lines of code, than in writing new ones.
inefficient code?
Also, could be called Paid per line
I would call it 'twice is better'. It's made like that to be sure that the runtime really understood the question ;).
(although in multi-threaded, not-safe environment, the result may differ between the two variants.)
I might call it "code I wrote last month while I was in a hurry / not focused / tired". It happens, we all make or have made these kind of mistakes. Just change it. If you want to you can try and find out who did this, hope it is not you, and give him/her feedback.
Since you said you've seen this more than once, it seems that it's more than a one-time error due to being tired. I see several reasons for someone to repeatedly come up with such code:
The code was originally different, got refactored, but whoever did this oversaw that this is redundant.
Whoever did this didn't have a good grasp of boolean logic.
(Also, there's the slight possibility that there might be more to this than what your simplified snipped shows.)
As pgast has said in a comment there is nothing wrong with this code if actionA effects conditionB (note that this is not a condition with a side effect but an action with a side effect (which you kind of expect))

Is returning early from a function more elegant than an if statement?

Myself and a colleague have a dispute about which of the following is more elegant. I won't say who's who, so it is impartial. Which is more elegant?
public function set hitZone(target:DisplayObject):void
{
if(_hitZone != target)
{
_hitZone.removeEventListener(MouseEvent.ROLL_OVER, onBtOver);
_hitZone.removeEventListener(MouseEvent.ROLL_OUT, onBtOut);
_hitZone.removeEventListener(MouseEvent.MOUSE_DOWN, onBtDown);
_hitZone = target;
_hitZone.addEventListener(MouseEvent.ROLL_OVER, onBtOver, false, 0, true);
_hitZone.addEventListener(MouseEvent.ROLL_OUT, onBtOut, false, 0, true);
_hitZone.addEventListener(MouseEvent.MOUSE_DOWN, onBtDown, false, 0, true);
}
}
...or...
public function set hitZone(target:DisplayObject):void
{
if(_hitZone == target)return;
_hitZone.removeEventListener(MouseEvent.ROLL_OVER, onBtOver);
_hitZone.removeEventListener(MouseEvent.ROLL_OUT, onBtOut);
_hitZone.removeEventListener(MouseEvent.MOUSE_DOWN, onBtDown);
_hitZone = target;
_hitZone.addEventListener(MouseEvent.ROLL_OVER, onBtOver, false, 0, true);
_hitZone.addEventListener(MouseEvent.ROLL_OUT, onBtOut, false, 0, true);
_hitZone.addEventListener(MouseEvent.MOUSE_DOWN, onBtDown, false, 0, true);
}
In most cases, returning early reduces the complexity and makes the code more readable.
It's also one of the techniques applied in Spartan programming:
Minimal use of Control
Minimizing the use of conditionals by using specialized
constructs such ternarization,
inheritance, and classes such as Class
Defaults, Class Once and Class
Separator
Simplifying conditionals with early return.
Minimizing the use of looping constructs, by using action applicator
classes such as Class Separate and
Class FileSystemVisitor.
Simplifying logic of iteration with early exits (via return,
continue and break statements).
In your example, I would choose option 2, as it makes the code more readable. I use the same technique when checking function parameters.
This is one of those cases where it's ok to break the rules (i.e. best practices). In general you want to have as few return points in a function as possible. The practical reason for this is that it simplifies your reading of the code, since you can just always assume that each and every function will take its arguments, do its logic, and return its result. Putting in extra returns for various cases tends to complicate the logic and increase the amount of time necessary to read and fully grok the code. Once your code reaches the maintenance stage then multiple returns can have a huge impact on the productivity of new programmers as they try to decipher the logic (its especially bad when comments are sparse and the code unclear). The problem grows exponentially with respect to the length of the function.
So then why in this case does everyone prefer option 2? It's because you're are setting up a contract that the function enforces through validating incoming data, or other invariants that might need to be checked. The prettiest syntax for constructing the validation is the check each condition, returning immediately if the condition fails validity. That way you don't have to maintain some kind of isValid boolean through all of your checks.
To sum things up: we're really looking at how to write validation code and not general logic; option 2 is better for validation code.
As long as the early returns are organized as a block at the top of the function/method body, then I think they're much more readable than adding another layer of nesting.
I try to avoid early returns in the middle of the body. Sometimes they're the best way, but most of the time I think they complicate.
Also, as a general rule I try to minimize nesting control structures. Obviously you can take this one too far, so you have to use some discretion. Converting nested if's to a single switch/case is much clearer to me, even if the predicates repeat some sub-expressions (and assuming this isn't a performance critical loop in a language too dumb to do subexpression elimination). Particularly I dislike the combination of nested ifs in long function/method bodies, since if you jump into the middle of the code for some reason you end up scrolling up and down to mentally reconstruct the context of a given line.
In my experience, the issue with using early returns in a project is that if others on the project aren't used to them, they won't look for them. So early returns or not - if there are multiple programmers involved, make sure everyone's at least aware of their presence.
I personally write code to return as soon as it can, as delaying a return often introduces extra complexity eg trying to safely exit a bunch of nested loops and conditions.
So when I look at an unfamiliar function, the very first thing I do is look for all the returns. What really helps there is to set up your syntax colouring to give return a different colour from anything else. (I go for red.) That way, the returns become a useful tool for determining what the function does, rather than hidden stumbling blocks for the unwary.
Ah the guardian.
Imho, yes - the logic of it is clearer because the return is explicit and right next to the condition, and it can be nicely grouped with similar structures. This is even more applicable where "return" is replaced with "throw new Exception".
As said before, early return is more readable, specially if the body of a function is long, you may find that deleting a } by mistake in a 3 page function (wich in itself is not very elegant) and trying to compile it can take several minutes of non-automatable debugging.
It also makes the code more declarative, because that's the way you would describe it to another human, so probably a developer is close enough to one to understand it.
If the complexity of the function increases later, and you have good tests, you can simply wrap each alternative in a new function, and call them in case branches, that way you mantain the declarative style.
In this case (one test, no else clause) I like the test-and-return. It makes it clear that in that case, there's nothing to do, without having to read the rest of the function.
However, this is splitting the finest of hairs. I'm sure you must have bigger issues to worry about :)
option 2 is more readable, but the manageability of the code fails when a else may be required to be added.
So if you are sure, there is no else go for option 2, but if there could be scope for an else condition then i would prefer option 1
Option 1 is better, because you should have a minimal number of return points in procedure.
There are exceptions like
if (a) {
return x;
}
return y;
because of the way a language works, but in general it's better to have as few exit points as it is feasible.
I prefer to avoid an immediate return at the beginning of a function, and whenever possible put the qualifying logic to prevent entry to the method prior to it being called. Of course, this varies depending on the purpose of the method.
However, I do not mind returning in the middle of the method, provided the method is short and readable. In the event that the method is large, in my opinion, it is already not very readable, so it will either be refactored into multiple functions with inline returns, or I will explicitly break from the control structure with a single return at the end.
I am tempted to close it as exact duplicate, as I saw some similar threads already, including Invert “if” statement to reduce nesting which has good answers.
I will let it live for now... ^_^
To make that an answer, I am a believer that early return as guard clause is better than deeply nested ifs.
I have seen both types of codes and I prefer first one as it is looks easily readable and understandable for me but I have read many places that early exist is the better way to go.
There's at least one other alternative. Separate the details of the actual work from the decision about whether to perform the work. Something like the following:
public function setHitZone(target:DisplayObject):void
{
if(_hitZone != target)
setHitZoneUnconditionally(target);
}
public function setHitZoneUnconditionally(target:DisplayObject):void
{
_hitZone.removeEventListener(MouseEvent.ROLL_OVER, onBtOver);
_hitZone.removeEventListener(MouseEvent.ROLL_OUT, onBtOut);
_hitZone.removeEventListener(MouseEvent.MOUSE_DOWN, onBtDown);
_hitZone = target;
_hitZone.addEventListener(MouseEvent.ROLL_OVER, onBtOver, false, 0, true);
_hitZone.addEventListener(MouseEvent.ROLL_OUT, onBtOut, false, 0, true);
_hitZone.addEventListener(MouseEvent.MOUSE_DOWN, onBtDown, false, 0, true);
}
Any of these three (your two plus the third above) are reasonable for cases as small as this. However, it would be A Bad Thing to have a function hundreds of lines long with multiple "bail-out points" sprinkled throughout.
I've had this debate with my own code over the years. I started life favoring one return and slowly have lapsed.
In this case, I prefer option 2 (one return) simply because we're only talking about 7 lines of code wrapped by an if() with no other complexity. It's far more readable and function-like. It flows top to bottom. You know you start at the top and end at the bottom.
That being said, as others have said, if there were more guards at the beginning or more complexity or if the function grows, then I would prefer option 1: return immediately at the beginning for a simple validation.