Theory for IF statement refactoring?

There are some known rules, which refer to various combinations of IF statements (or if/else or switch/case statements etc). These rules can be used for refactoring.
Example 1:
"Nested IFs with single body can be always replaced by one IF with AND conditions."
if (a) {
if (b) {
doSomething();
}
}
can be rewritten as:
if (a && b) {
doSomething();
}
Example 2:
"Subsequent IFs with the same body can be always replaced by one IF with OR condition."
int i = 0;
if (a) {i = 1;}
if (b) {i = 1;}
can be rewritten as:
if (a || b) {i = 1;}
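Both rewrites above can be checked by brute force over every combination of a and b. The following is a minimal C++14 sketch of that check, not a general theory. The function names are made up for illustration, and a and b are assumed to be plain booleans whose evaluation has no side effects.

```cpp
// Example 1, nested form: returns 1 if the body would run, 0 otherwise.
constexpr int nested_if(bool a, bool b) {
    int ran = 0;
    if (a) {
        if (b) {
            ran = 1;
        }
    }
    return ran;
}

// Example 1, merged form with AND.
constexpr int merged_and(bool a, bool b) {
    int ran = 0;
    if (a && b) {
        ran = 1;
    }
    return ran;
}

// Example 2, two subsequent IFs with the same body.
constexpr int sequential_ifs(bool a, bool b) {
    int i = 0;
    if (a) { i = 1; }
    if (b) { i = 1; }
    return i;
}

// Example 2, merged form with OR.
constexpr int merged_or(bool a, bool b) {
    int i = 0;
    if (a || b) { i = 1; }
    return i;
}

// Compares both pairs for all four (a, b) combinations.
constexpr bool rules_hold() {
    for (int a = 0; a < 2; ++a) {
        for (int b = 0; b < 2; ++b) {
            if (nested_if(a != 0, b != 0) != merged_and(a != 0, b != 0)) return false;
            if (sequential_ifs(a != 0, b != 0) != merged_or(a != 0, b != 0)) return false;
        }
    }
    return true;
}

static_assert(rules_hold(), "both rewrites agree for every combination of a and b");
```

If a or b were function calls with side effects, the OR rewrite would need more care: with two separate IFs both conditions are always evaluated, while a || b skips b whenever a is true.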
I'm sure there are many more similar rules. I would like to know if there is a known theory behind this, which gives a generic pattern for creating such rules. I would like to see a list of other known rules. I think boolean algebra does not exactly handle this, because it uses mathematical syntax which does not exactly translate to programming syntax.
There are similar questions, which always ask about one rule or another. This question is completely generic, not about particular rules, but about the theory behind them.

Related

Is `if let` just another way of writing other `if` statements?

So I'm learning Rust and I'm learning about pattern matching and "if let" statements as alternatives to matching expressions. I was watching this video regarding "if let" which is mentioned at 11:00 and they give this example:
fn main() {
let some_value: Option<i32> = Some(3);
if let Some(3) = some_value {
println!("three");
}
}
I get that this is useful if you only have one specific pattern you want to match and the matching expression is too verbose, but if this is the case, couldn't you simply do this?:
fn main() {
let some_value: Option<i32> = Some(3);
if some_value == Some(3) {
println!("three");
}
}
Is there something about this expression that is inferior to the "if let" statement that I'm not aware of?
Well the example isn't very good. But consider this:
fn foo(o: Option<i32>) {
if let Some(n) = o {
println!("{n}");
}
}
Which prints the value inside the option only if it is Some(...).
if let is a shorthand version of match for when you're only interested in one specific case.
Also note that == can only be used with types implementing PartialEq, while if let works with any type you can pattern match on.
The if let form lets you be more generic than your example. You can't do if some_value == Some(n) unless n is defined, but you can do:
if let Some(n) = some_value { ... }
Where that now gives you an n you can work with.
I think you'll find as you work with Rust more that the match form Some(n) shows up a lot more often than Some(1) for some specific value.
Don't forget you can also do this:
if let Some(1) | Some(2) = some_value { ... }
Where that's a lot more thorny with a regular if unless you're using matching like this:
if matches!(some_value, Some(1) | Some(2)) { ... }
So there's a number of useful options here that might fit better in some situations.
Don't forget about cargo clippy, which will point out if there are better ways of expressing something.
Another advantage of if let, besides what is mentioned in the other answers, is that you cannot always use ==. There are types that do not support comparisons (they do not implement PartialEq) for various possible reasons (the author of the library didn't think about it, the types contain some structs that don't support comparisons, or comparisons don't make sense for those types). In those cases, you cannot use ==, but you can still use if let.

Else keyword in non void function in C++ [duplicate]

I am always in the habit of using if, else-if statement instead of multiple if statements.
Example:
int val = -1;
if (a == b1) {
return c1;
} else if (a == b2) {
return c2;
} ...
...
} else {
return c11;
}
How does it compare to example 2:
if (a == b1) {
return c1;
}
if (a == b2) {
return c2;
}
....
if (a == b11) {
return c11;
}
I know functionality wise they are the same. But is it best practice to do if else-if, or not? It's raised by one of my friends when I pointed out he could structure the code base differently to make it cleaner. It's already a habit for me for long but I have never asked why.
if-elseif-else statements stop doing comparisons as soon as they find one that's true. if-if-if does every comparison. The first is more efficient.
Edit: It's been pointed out in comments that you do a return within each if block. In these cases, or in cases where control will leave the method (exceptions), there is no difference between doing multiple if statements and doing if-elseif-else statements.
However, it's best practice to use if-elseif-else anyhow. Suppose you change your code such that you don't do a return in every if block. Then, to remain efficient, you'd also have to change to an if-elseif-else idiom. Having it be if-elseif-else from the beginning saves you edits in the future, and is clearer to people reading your code (witness the misinterpretation I just gave you by doing a skim-over of your code!).
What about the case where b1 == b2? (And if a == b1 and a == b2?)
When that happens, generally speaking, the following two chunks of code will very likely have different behavior:
if (a == b1) {
/* do stuff here, and break out of the test */
}
else if (a == b2) {
/* this block is never reached */
}
and:
if (a == b1) {
/* do stuff here */
}
if (a == b2) {
/* do this stuff, as well */
}
If you want to clearly delineate functionality for the different cases, use if-else or switch-case to make one test.
If you want different functionality for multiple cases, then use multiple if blocks as separate tests.
It's not a question of "best practices" so much as defining whether you have one test or multiple tests.
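The difference described above can be made concrete by counting how many blocks run in each form. This is a minimal C++14 sketch with made-up function names; it assumes plain int comparisons and no return statements inside the blocks.

```cpp
// One test: at most one block runs.
constexpr int blocks_run_else_if(int a, int b1, int b2) {
    int count = 0;
    if (a == b1) {
        ++count;
    } else if (a == b2) {
        ++count;  // skipped whenever the first test succeeded
    }
    return count;
}

// Separate tests: every matching block runs.
constexpr int blocks_run_separate_ifs(int a, int b1, int b2) {
    int count = 0;
    if (a == b1) { ++count; }
    if (a == b2) { ++count; }
    return count;
}

// When b1 == b2 and a matches both, the two forms diverge.
static_assert(blocks_run_else_if(5, 5, 5) == 1, "else-if enters only the first block");
static_assert(blocks_run_separate_ifs(5, 5, 5) == 2, "separate ifs enter both blocks");

// When b1 and b2 differ, the two forms agree.
static_assert(blocks_run_else_if(5, 5, 7) == blocks_run_separate_ifs(5, 5, 7),
              "forms agree for distinct b1 and b2");
```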
They are NOT functionally equivalent.
The only way it would be functionally equivalent is if you did an "if" statement for every single possible value of a (ie: every possible int value, as defined in limits.h in C; using INT_MIN and INT_MAX, or equivalent in Java).
The else statement allows you to cover every possible remaining value without having to write millions of "if" statements.
Also, it's better coding practice to use if...else if...else, just like how in a switch/case statement, your compiler will nag you with a warning if you don't provide a "default" case statement. This prevents you from overlooking invalid values in your program. eg:
#include <math.h>
#include <stdio.h>

double square_root(double x) {
if (x > 0.0) {
return sqrt(x);
} else if (x == 0.0) {
return x;
} else {
/* negative input (or NaN): report it and fall back to 0 */
printf("INVALID VALUE: x must not be negative\n");
return 0.0;
}
}
Do you want to type millions of if statements for each possible value of x in this case? Doubt it :)
Cheers!
This depends entirely on the conditions you're testing. In your example it makes no difference in the end, but as a best practice, if you want only ONE of the branches to be executed, you'd better use if/else:
if (x > 1) {
System.out.println("Hello!");
}else if (x < 1) {
System.out.println("Bye!");
}
Also note that if the first condition is TRUE the second will NOT be checked at all but if you use
if (x > 1) {
System.out.println("Hello!");
}
if (x < 1) {
System.out.println("Bye!");
}
The second condition will be checked even if the first condition is TRUE. The optimizer might eventually remove the redundant check, but as far as I know this is how it behaves. Also, the first form is the one that is meant for this situation and behaves accordingly, so it is always the best choice for me unless the logic requires otherwise.
if and else if is different from two consecutive if statements. In the first, when the CPU takes the first if branch, the else if won't be checked. With two consecutive if statements, even if the first if is checked and taken, the next if will also be checked, and taken if its condition is true.
I tend to think that using else if is easier and more robust in the face of code changes. If someone were to adjust the control flow of the function and replace a return with a side effect, or a function call with a try-catch, the else-if would fail hard if all conditions are truly exclusive. It really depends too much on the exact code you are working with to make a general judgment, and you need to consider the possible trade-offs with brevity.
With return statements in each if branch.
In your code, you have return statements in each of the if conditions. When you have a situation like this, there are two ways to write this. The first is how you've written it in Example 1:
if (a == b1) {
return c1;
} else if (a == b2) {
return c2;
} else {
return c11;
}
The other is as follows:
if (a == b1) {
return c1;
}
if (a == b2) {
return c2;
}
return c11; // no if or else around this return statement
These two ways of writing your code are identical.
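As a quick sanity check of that claim, here is a minimal C++14 sketch that compares the two shapes over a range of inputs. The concrete numbers are hypothetical stand-ins (b1 = 1, b2 = 2, c1 = 10, c2 = 20, c11 = 110), and the function names are made up.

```cpp
// Shape 1: if / else if / else, returning in every branch.
constexpr int lookup_else_if(int a) {
    if (a == 1) {
        return 10;
    } else if (a == 2) {
        return 20;
    } else {
        return 110;
    }
}

// Shape 2: early returns followed by an unconditional final return.
constexpr int lookup_early_return(int a) {
    if (a == 1) { return 10; }
    if (a == 2) { return 20; }
    return 110;  // no if or else around this return statement
}

// True if both shapes return the same value for every a in [lo, hi].
constexpr bool same_results(int lo, int hi) {
    for (int a = lo; a <= hi; ++a) {
        if (lookup_else_if(a) != lookup_early_return(a)) return false;
    }
    return true;
}

static_assert(same_results(-100, 100), "both shapes return the same value");
```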
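The way you wrote your code in example 2 wouldn't compile in Java, because the compiler doesn't know that you've covered all possible values of a, so it thinks there's a code path through the function that can get you to the end of the function without returning a value. C and C++ compilers will accept it, usually with a warning, but actually reaching the end of a non-void function is undefined behavior in C++, and in C it is undefined behavior if the caller uses the return value.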
if (a == b1) {
return c1;
}
if (a == b2) {
return c2;
}
...
if (a == b11) {
return c11;
}
// what if you set a to some value c12?
Without return statements in each if branch.
Without return statements in each if branch, your code would be functionally identical only if the following statements are true:
You don't mutate the value of a in any of the if branches.
== is an equivalence relation (in the mathematical sense) and none of the b1 thru b11 are in the same equivalence class.
== doesn't have any side effects.
To clarify further about point #2 (and also point #3):
== is an equivalence relation in C or Java and never has side effects, with one exception: for floating-point values NaN == NaN is false, so == is not reflexive there.
In languages that let you override the == operator, such as C++, Ruby, or Scala, the overridden == operator may not be an equivalence relation, and it may have side effects. We certainly hope that whoever overrides the == operator was sane enough to write an equivalence relation that doesn't have side effects, but there's no guarantee.
In JavaScript and certain other programming languages with loose type conversion rules, there are cases built into the language where == is not transitive, or not symmetric. (In JavaScript, === is an equivalence relation, apart from NaN, which is not equal to itself.)
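To illustrate the overloaded-operator case, here is a minimal C++14 sketch. Fuzzy is a hypothetical type invented for this example; its == means "within 1 of each other", which is reflexive and symmetric but not transitive, so it is not an equivalence relation.

```cpp
// Hypothetical type: two values compare equal when they differ by at most 1.
struct Fuzzy {
    int value;
};

constexpr bool operator==(Fuzzy lhs, Fuzzy rhs) {
    return (lhs.value > rhs.value ? lhs.value - rhs.value
                                  : rhs.value - lhs.value) <= 1;
}

// Chained tests: stops at the first match.
constexpr int blocks_else_if(Fuzzy a, Fuzzy b1, Fuzzy b2) {
    int count = 0;
    if (a == b1) {
        ++count;
    } else if (a == b2) {
        ++count;
    }
    return count;
}

// Independent tests: every match counts.
constexpr int blocks_separate_ifs(Fuzzy a, Fuzzy b1, Fuzzy b2) {
    int count = 0;
    if (a == b1) { ++count; }
    if (a == b2) { ++count; }
    return count;
}

// b1 and b2 are "different" according to ==, yet a matches both of them.
static_assert(!(Fuzzy{0} == Fuzzy{2}), "0 and 2 are not equal");
static_assert(blocks_else_if(Fuzzy{1}, Fuzzy{0}, Fuzzy{2}) == 1, "else-if enters one block");
static_assert(blocks_separate_ifs(Fuzzy{1}, Fuzzy{0}, Fuzzy{2}) == 2, "separate ifs enter two");
```

So even with distinct b1 and b2 and no mutation of a, the two forms differ once == stops being transitive.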
In terms of performance, example #1 is guaranteed not to perform any comparisons after the one that matches. It may be possible for the compiler to optimize #2 to skip the extra comparisons, but it's unlikely. In the following example, it probably can't, and if the strings are long, the extra comparisons aren't cheap.
if (strcmp(str, "b1") == 0) {
...
}
if (strcmp(str, "b2") == 0) {
...
}
if (strcmp(str, "b3") == 0) {
...
}
I prefer if/else structures, because it's much easier to evaluate all possible states of your problem in every variation together with switches. I find it more robust and quicker to debug, especially when you do multiple Boolean evaluations in a weakly typed environment such as PHP. An example of why elseif is bad (exaggerated for demonstration):
if(a && (c == d))
{
} elseif ( b && (!d || a))
{
} elseif ( d == a && ( b^2 > c))
{
} else {
}
This problem has more than 2^4=16 boolean states, which simply demonstrates the weak-typing effects that make things even worse. It isn't so hard to imagine a three-variable problem with three-state variables written in an if ab elseif bc kind of way.
Leave optimization to the compiler.
In most cases, using if-elseif-else and switch statements over if-if-if statements is more efficient (since it makes it easier for the compiler to create jump/lookup tables) and better practice since it makes your code more readable, plus the compiler makes sure you include a default case in the switch. This answer, along with this table comparing the three different statements was synthesized using other answer posts on this page as well as those of a similar SO question.
I think these code snippets are equivalent for the simple reason that you have many return statements. If you had a single return statement, you would be using else constructs that here are unnecessary.
Your comparison relies on the fact that the body of the if statements return control from the method. Otherwise, the functionality would be different.
In this case, they perform the same functionality. The latter is much easier to read and understand in my opinion and would be my choice as which to use.
They potentially do different things.
If a is equal to b1 and b2, you enter two if blocks. In the first example, you only ever enter one. I imagine the first example is faster as the compiler probably does have to check each condition sequentially as certain comparison rules may apply to the object. It may be able to optimise them out... but if you only want one to be entered, the first approach is more obvious, less likely to lead to developer mistake or inefficient code, so I'd definitely recommend that.
CanSpice's answer is correct. An additional consideration for performance is to find out which conditional occurs most often. For example, if a==b1 only occurs 1% of the time, then you get better performance by checking the other case first.
Gir Loves Tacos' answer is also good. Best practice is to ensure you have all cases covered.

Why we can have a semicolon in if but not in while loop

Why can I do this:
if (int result=getValue(); result > 100) {
}
but cannot do this:
while (int result=getValue(); result > 100) {
}
Why discriminate against while? A condition is a condition. Why can't while evaluate it like if can?
In order to achieve the desired behavior with while, I'd have to implement it this way:
int result = getValue();
while (result > 100) {
//do something
result = getValue();
}
Because we already have a while-loop-with-initializer. It's spelled:
for (int result=getValue(); result > 100;) {
}
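The for loop above leaves its third clause empty. To reproduce the while version from the question, which fetches a fresh value at the end of each pass, the re-fetch can go into that clause. A minimal C++14 sketch, assuming a hypothetical ValueSource that stands in for getValue() by handing out numbers from a fixed array:

```cpp
#include <cstddef>

// Hypothetical stand-in for getValue(): returns the next array element,
// or 0 once the array is exhausted.
struct ValueSource {
    const int* values;
    std::size_t count;
    std::size_t next = 0;

    constexpr int getValue() {
        return next < count ? values[next++] : 0;
    }
};

// Sums values for as long as each freshly fetched value exceeds 100.
constexpr int sum_while_large(const int* values, std::size_t count) {
    ValueSource src{values, count};
    int sum = 0;
    // init; condition; re-fetch. result is scoped to the loop, as with if (init; cond).
    for (int result = src.getValue(); result > 100; result = src.getValue()) {
        sum += result;
    }
    return sum;
}

// Sample data for illustration only.
constexpr int kSample[] = {150, 200, 50, 300};

// 150 and 200 are consumed; 50 ends the loop, so 300 is never reached.
static_assert(sum_while_large(kSample, 4) == 350, "loop stops at the first value <= 100");
```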
There are 3 reasons I can think of for not adding this syntax.
There is a perfectly suitable construct, namely the for loop, that can be used for exactly this purpose.
Adding any feature to the language is a lot of work, and a very strong case needs to be made for such a proposal. Given the first point, I don't see this happening.
In my opinion this is the most important one: If this feature were added to the language (let's say for a convenient syntax, or something like that), then it can essentially never be removed from the language. This means that the while (;) syntax is forever banned, and there could very well be some other semantics that we would like to express using such a syntax, and giving up this option is not something that should be done without careful thought.

In recursive DP, break up recursion call by storing variables: inefficient?

Suppose I am solving a dynamic programming problem recursively (top down). For example, a recursive solution to the longest common subsequence problem:
LCS(S,n,T,m)
{
if (n==0 || m==0) return 0;
if (S[n] == T[m]) result = 1 + LCS(S,n-1,T,m-1);
else result = max( LCS(S,n-1,T,m), LCS(S,n,T,m-1) );
return result;
}
Often in such a DP problem at some point we have to take the max of some expressions, representing returns to different choices we can make. In the above case we have the max of two simple expressions, but in worse cases it can be the max of three or four quite complicated expressions involving long function calls. In such situations, I am often tempted to give these complicated expressions their own variable names, to make the code more readable. In the above case that would mean I would write
LCS(S,n,T,m)
{
if (n==0 || m==0) return 0;
if (S[n] == T[m]) result = 1 + LCS(S,n-1,T,m-1);
else {
a = LCS(S,n-1,T,m);
b = LCS(S,n,T,m-1);
result = max(a, b);
}
return result;
}
(In this simplified case a and b are not complicated, but in other cases they are, and there may be even more arguments to the max function, so this could really help it be more understandable.)
My Question: Is this a terrible idea? As I understand it, I'm adding a variable to each layer of the call stack, and I'm thinking that could be wasteful. But on the other hand, at each layer it has to calculate the temporary variable LCS(S,n,T,m) anyway (I'm thinking in terms of C++, say), and as far as I know, there might be not much difference in cost between the two ways.
If this is a terrible idea, is there a more efficient way to break up a complicated recursive function call to make it more readable?
C++ has the "As-If" rule, which states that a compiler can do whatever it wants so long as the observable effects are indistinguishable from what is defined by the standard to happen. In this case, it's trivial to prove both fragments have the same meaning, and a compiler will likely emit identical instructions for both.
Note: You aren't doing dynamic programming here, as you don't memoise parameter / result pairs.
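For illustration, here is a minimal C++14 sketch of the recursion with the named temporaries a and b kept, plus the memoisation that the note above says is missing. Everything beyond the recurrence itself is an assumption made for the example: the Memo struct, the kMaxLen bound, the lcs_length wrapper, and the use of 0-based C strings (so the characters compared are s[n - 1] and t[m - 1]).

```cpp
// Hypothetical upper bound on the string lengths this sketch supports.
constexpr int kMaxLen = 16;

// Cache of results; 0 means "not computed yet", otherwise it holds result + 1.
struct Memo {
    int cell[kMaxLen + 1][kMaxLen + 1] = {};
};

// Length of the LCS of the first n characters of s and the first m of t.
constexpr int lcs(const char* s, int n, const char* t, int m, Memo& memo) {
    if (n == 0 || m == 0) return 0;
    if (memo.cell[n][m] != 0) return memo.cell[n][m] - 1;

    int result = 0;
    if (s[n - 1] == t[m - 1]) {
        result = 1 + lcs(s, n - 1, t, m - 1, memo);
    } else {
        // Named temporaries purely for readability.
        const int a = lcs(s, n - 1, t, m, memo);
        const int b = lcs(s, n, t, m - 1, memo);
        result = a > b ? a : b;
    }
    memo.cell[n][m] = result + 1;
    return result;
}

// Convenience wrapper that owns the cache; n and m must not exceed kMaxLen.
constexpr int lcs_length(const char* s, int n, const char* t, int m) {
    Memo memo{};
    return lcs(s, n, t, m, memo);
}

// "GTAB" is a longest common subsequence of these two strings.
static_assert(lcs_length("AGGTAB", 6, "GXTXAYB", 7) == 4, "LCS length is 4");
```

The temporaries a and b are plain ints, so whether they are named or not should make little practical difference next to the cost of the recursive calls themselves.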

Conditional Operator: How much flexibility?

I would like to perform the following:
if(x == true)
{
// do this on behalf of x
// do this on behalf of x
// do this on behalf of x
}
Using a conditional operator, is this correct?
x == true ? { /*do a*/, /*do b*/, /*do c*/ } : y == true ? ... ;
Is this malformed?
I am not nesting more than one level with a conditional operator.
The expressions I intend to use are highly terse and simple making a conditional operator, in my opinion, worth using.
P.S. I am not asking A. Which I should use? B. Which is better C. Which is more appropriate
P.S. I am asking how to convert an if-else statement to a ternary conditional operator.
Any advice given on this question regarding coding standards etc. are simply undesired.
Don't compare booleans to true and false. There's no point because they're true or false already! Just write
if (x)
{
// do this on behalf of x
// do this on behalf of x
// do this on behalf of x
}
Your second example doesn't compile because you use { and }. But this might:
x ? ( /*do a*/, /*do b*/, /*do c*/ ) : y ? ... ;
but it does depend on what /*do a*/ etc are.
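To make the parenthesised form concrete, here is a minimal C++14 sketch where each "do" step is an expression, which is what the comma operator requires. The function name and the additions to log are made-up placeholders for /*do a*/, /*do b*/ and /*do c*/; this only shows that the construct is well-formed, not that it is a good idea.

```cpp
// Each branch is a single expression; the comma operator evaluates its
// operands left to right and yields the last one.
constexpr int run_actions(bool x, bool y) {
    int log = 0;
    x ? (log += 1, log += 10, log += 100)  // do a, do b, do c on behalf of x
      : y ? (log += 1000)                  // on behalf of y
          : (log += 0);                    // neither x nor y
    return log;
}

static_assert(run_actions(true, false) == 111, "all three x actions ran");
static_assert(run_actions(false, true) == 1000, "only the y action ran");
static_assert(run_actions(false, false) == 0, "nothing ran");
```

Statements such as loops, declarations or return cannot appear inside the parentheses, which is the sense in which it depends on what the actions are.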
Using the comma operator to string different expressions together is within the rules of the language, but it makes the code harder to read (because you have to spot the comma, which isn't always easy, especially if the expression isn't really simple).
The other factor is of course that you can ONLY do this for conditionals of the if (x) ... else if (y) ... type.
Sometimes it seems like people prefer "short code" over "readable code", which is of course great if you are in a competition of "who can write this in the fewest lines". But for everything else, particularly code that is "on show" or shared with colleagues who also need to understand it, keep in mind that once a software project gets sufficiently large, it usually becomes hard to understand how the code works even WITHOUT obfuscation that makes the code harder to read. I don't really see any benefit in using the conditional operator in the way your second example described. It is possible that the example is bad, but generally, I'd say "don't do that".
Of course it works (with C++11). I have not tried a solution, but following Herb Sutter's way you can use either a function call or a lambda which is immediately executed:
cond ?
[&]{
int i = some_default_value;
if (someConditionIsTrue)
{
// do some operations and calculate the value of i
i = some_calculated_value;
}
return i;
} ()
:
somefun();
I have not tried to compile it, but here you have a result which is computed either by a lambda or by a normal function.