Shall I cache it?

When having a class:
class X
{
int data_;
void f()
{
// shall I cache data_ by doing this?
int cached = data_;
// and instead of this:
if (data_ >= 0 && data_ < 1000 || data_ < 0 && data_ > -1000)//first version
{
//do something
}
else
{
//do something else
}
// have this:
if (cashed >= 0 && cashed < 1000 || cashed < 0 && cashed > -1000)//second version
// in my opinion this is bad code
{
//do something
}
else
{
//do something else
}
}
};
Please see comments in code.
I'm asking this question because a classmate of mine from university states that this kind of code (line 1) is just bad code. I think he's talking rubbish, but I'm very interested in your opinion on this subject.
Thank you.

*Cached
And unless your data_ variable may change halfway through execution, there is no difference whatsoever between those two code segments.

Don't optimize unless it is already too slow: write the clearest, simplest possible code, measure it, and if it is too slow, use tools to discover (not guess) which part is the slowest. If possible, use a better algorithm before deploying caching and other minor optimizations. In any case, once you've optimized that part, measure it again, and if it's still too slow, repeat.
That being said, @Zhais is correct. There's no functional difference between the two code snippets.

This kind of optimization is called premature optimization. You should not optimize like that until it is proven to be the bottleneck.
In the first case the variable is formally accessed as this->data_ each time, but you can't be sure it is actually re-read, because the compiler may cache it itself.
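To make the comparison concrete, here is a minimal sketch of the two versions side by side. The method names direct() and with_local() are made up for illustration, and the range check is the one from the question with the grouping spelled out in parentheses. Both functions return the same result for every input; whether the generated machine code differs is something only a measurement or a look at the compiler output can tell.

```cpp
// Sketch: reading a member directly vs. through a local copy.
// direct() and with_local() are hypothetical names, not from the question.
class X {
public:
    explicit X(int data) : data_(data) {}

    // First version: names the member (this->data_) in every comparison.
    bool direct() const {
        return (data_ >= 0 && data_ < 1000) || (data_ < 0 && data_ > -1000);
    }

    // Second version: copies the member into a local variable first.
    bool with_local() const {
        const int cached = data_;
        return (cached >= 0 && cached < 1000) || (cached < 0 && cached > -1000);
    }

private:
    int data_;
};
```

A local copy only changes behaviour if data_ can be modified between the reads, for example by a function called in between.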

Related

Nested if statements vs extra else if?

I come across a lot of logic work where I'm not sure which design pattern is better for if statements. In these situations, I can usually either add an extra else if or put in a nested if statement. These two cases are shown below. What are the pros and cons of each, and is there a standard I should follow?
if (val > 0 && is_on)
{
    // (1)
}
else if (val > 0)
{
    // (2)
}
else
{
    // (3)
}

if (val > 0)
{
    if (is_on)
    {
        // same as (1)
    }
    else
    {
        // same as (2)
    }
}
else
{
    // same as (3)
}

I don't think there are any specific pros or cons to either approach. It all depends on how you want to design your code and what you think is more readable to someone looking at your code for the first time.
In my opinion, the first approach looks better, as it's more readable and contains fewer lines of code.

The first approach is more readable. But as the logic expressions (e.g. "val > 0 && is_on") get longer, it starts to make more sense to merge towards the second approach. The second one is easier to debug, so you could start there and then merge back. I'd match the style of the surrounding code/"code-policy" ultimately.

While the other answers are absolutely right in that your primary focus should be readability, I want to address another difference: execution performance.
In the first example, there are 2 conditions that need to be evaluated before the else branch can run. As we scale up the number of conditions baked into this else/if ladder, the amount of evaluation needed to get to the else branch grows linearly. Now, we aren't expecting to have ten thousand conditions or anything, but it is something to take note of nonetheless.
Now, in your second example, we check the common condition between the first two branches, and if that fails, we quick-fail to the else branch, with no extra tests. In the extreme case, this can somewhat resemble a binary search for the correct code block- branching left and right until it finds its match, as opposed to a linear scan that checks each in order one-by-one.
Now, does this mean you should use the latter? Not necessarily: readability is more important, and if you're writing in a compiled language, the compiler will likely optimize all of that away anyway. And even if you're in an interpreted language, the performance hit is probably going to be negligible compared to everything else, unless this is the hot section of a hot loop.
However, if you are bothered by the "wastefulness" of the repetition in the first example, but would rather avoid huge amounts of nesting, often languages will provide an assignment expression syntax, giving you a 3rd option, where you compute the result once and store it to a variable inline, for reuse in subsequent code.
For example:
if (expensive_func1() > 0 && is_on)
{
// (1)
}
else if (expensive_func1() > 0 && expensive_func2() > 0)
{
// (2)
}
else
{
// (3)
}
Becomes:
if ((is_alive = expensive_func1() > 0) && is_on)
{
// (1)
}
else if (is_alive && expensive_func2() > 0)
{
// (2)
}
else
{
// (3)
}
This saves us from recomputing the common subexpressions between our conditionals, in languages where we can't rely on a compiler to do that for us. Sure, we could just assign these to variables explicitly before the if statements, but then we bite the bullet of evaluating all shared expressions up front, rather than lazily evaluating them as needed (imagine we compute expensive_func2() > 0 for reuse in a third if/else, only to find out we didn't need it because we're taking the first branch).
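In C++17 specifically, the same idea can be written with an if statement that has an initializer, which also gives is_alive a declaration. The following is only a sketch: expensive_func1 and expensive_func2 are hypothetical stand-ins that always return 1, the call counters exist purely so the number of evaluations can be observed, and pick_branch is an invented wrapper.

```cpp
// Hypothetical stand-ins for the expensive calls; the counters only
// exist so that the number of evaluations can be checked.
int calls_func1 = 0;
int calls_func2 = 0;

int expensive_func1() { ++calls_func1; return 1; }
int expensive_func2() { ++calls_func2; return 1; }

// Returns 1, 2 or 3 to indicate which branch was taken.
int pick_branch(bool is_on) {
    // is_alive is computed once and stays in scope for the whole
    // if / else-if / else chain (C++17 if-with-initializer).
    if (bool is_alive = expensive_func1() > 0; is_alive && is_on) {
        return 1;  // (1)
    } else if (is_alive && expensive_func2() > 0) {
        return 2;  // (2): expensive_func2 is only called when we get here
    } else {
        return 3;  // (3)
    }
}
```

With is_on set to true, expensive_func1 runs once and expensive_func2 is never called, which is the lazy behaviour described above.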

Branching when mixing template parameters and variables in C++

I'm trying to carry out some loop optimization as described here: Optimizing a Loop vs Code Duplication
I have the additional complication that some code inside the loop only needs to be executed depending on a combination of run-time-known variables external to the loop (which can be replaced with template parameters for optimization, as discussed in the link above) and a run-time-known variable that is only known inside the loop.
Here is the completely un-optimized version of the code:
for (int i = 0; i < 100000; i++){
    if (external_condition_1 || (external_condition_2 && internal_condition[i])){
        run_some_code;
    }
    else{
        run_some_other_code;
    }
    run_lots_of_other_code;
}
This is my attempt at wrapping the loop in a templated function as suggested in the question linked above to optimize performance and avoid code duplication by writing multiple versions of the loop:
template<bool external_condition_1, bool external_condition_2>
void myloop(){
    for (int i = 0; i < 100000; i++){
        if (external_condition_1 || (external_condition_2 && internal_condition[i])){
            run_some_code;
        }
        else{
            run_some_other_code;
        }
        run_lots_of_other_code;
    }
}
My question is: how can the code be written to avoid branching and code duplication?
Note that the code is sufficiently complex that the function probably can't be inlined, and compiler optimization also likely wouldn't sort this out in general.

My question is: how can the code be written to avoid branching and code duplication?
Well, you already wrote your template to avoid code duplication, right? So let's look at what branching is left. To do this, we should look at each function that is generated from your template (there are four of them). We should also apply the expected compiler optimizations based upon the template parameters.
First up, set condition 1 to true. This should produce two functions that are essentially (using a bit of pseudo-syntax) the following:
myloop<true, bool external_condition_2>() {
    for (int i = 0; i < 100000; i++){
        // if ( true || whatever ) <-- optimized out
        run_some_code;
        run_lots_of_other_code;
    }
}
No branching there. Good. Moving on to the first condition being false and the second condition being true.
myloop<false, true>(){
    for (int i = 0; i < 100000; i++){
        if ( internal_condition[i] ){ // simplified from (false || (true && i_c[i]))
            run_some_code;
        }
        else{
            run_some_other_code;
        }
        run_lots_of_other_code;
    }
}
OK, there is some branching going on here. However, each i needs to be analyzed to see which code should execute. I think there is nothing more that can be done here without more information about internal_condition. I'll give some thoughts on that later, but let's move on to the fourth function for now.
myloop<false, false>() {
    for (int i = 0; i < 100000; i++){
        // if ( false || (false && whatever) ) <-- optimized out
        run_some_other_code;
        run_lots_of_other_code;
    }
}
No branching here. You already have done a good job avoiding branching and code duplication.
OK, let's go back to myloop<false,true>, where there is branching. The branching is largely unavoidable simply because of how your situation is set up. You are going to iterate many times. Some iterations you want to do one thing while other iterations should do another. To get around this, you would need to re-envision your setup so that you can do the same thing each iteration. (The optimization you are working from is based upon doing the same thing each iteration, even though it might be a different thing the next time the loop starts.)
The simplest, yet unlikely, scenario would be where internal_condition[i] is equivalent to something like i < 5000. It would also be convenient if you could do all of the "some code" before any of the "lots of other code". Then you could loop from 0 to 4999, running "some code" each iteration. Then loop from 5000 to 99999, running "other code". Then a third loop to run "lots of other code".
Any solution I can think of would involve adapting your situation to make it more like the unlikely simple scenario. Can you calculate how many times internal_condition[i] is true? Can you iterate that many times and map your (new) loop control variable to the appropriate value of i (the old loop control variable)? (Or maybe the exact value of i is not important?) Then do a second loop to cover the remaining cases? In some scenarios, this might be trivial. In others, far from it.
There might be other tricks that could be done, but they depend on more details about what you are doing, what you need to do, and what you think you need to do but don't really. (It's possible that the required level of detail would overwhelm StackOverflow.) Is the order important? Is the exact value of i important?
In the end, I would opt for profiling the code. Profile the code without code duplication but with branching. Profile the code with minimal branching but with code duplication. Is there a measurable change? If so, think about how you can re-arrange your internal condition so that i can cover large ranges without changing the value of the internal condition. Then divide your loop into smaller pieces.
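The "count the true cases and remap the loop variable" idea can be sketched with std::partition over a list of indices. This only applies if the order of iterations does not matter, and every name here is an assumption: internal_condition is taken to be a std::vector<bool>, and the counters in the hypothetical Counts struct stand in for run_some_code, run_some_other_code and run_lots_of_other_code.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Counters standing in for the three pieces of work in the question.
struct Counts {
    int some_code = 0;
    int some_other_code = 0;
    int lots_of_other_code = 0;
};

Counts run_partitioned(const std::vector<bool>& internal_condition) {
    // Build the index list 0, 1, ..., n-1.
    std::vector<std::size_t> idx(internal_condition.size());
    std::iota(idx.begin(), idx.end(), std::size_t{0});

    // Move every index whose condition is true to the front.
    auto middle = std::partition(idx.begin(), idx.end(),
        [&](std::size_t i) { return internal_condition[i]; });

    Counts c;
    // Loop 1: only indices where the condition holds, no per-item test.
    for (auto it = idx.begin(); it != middle; ++it) {
        ++c.some_code;           // stand-in for run_some_code
        ++c.lots_of_other_code;  // stand-in for run_lots_of_other_code
    }
    // Loop 2: the remaining indices.
    for (auto it = middle; it != idx.end(); ++it) {
        ++c.some_other_code;     // stand-in for run_some_other_code
        ++c.lots_of_other_code;
    }
    return c;
}
```

Note that the partition pass tests every element itself, so the branching is moved rather than removed; whether this is a net win is exactly what the profiling above is meant to answer.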

In C++17, to guarantee that no extra branches are evaluated, you might do:
template <bool external_condition_1, bool external_condition_2>
void myloop()
{
    for (int i = 0; i < 100000; i++){
        if constexpr (external_condition_1) {
            run_some_code;
        } else if constexpr (external_condition_2){
            if (internal_condition[i]) {
                run_some_code;
            } else {
                run_some_other_code;
            }
        } else {
            run_some_other_code;
        }
        run_lots_of_other_code;
    }
}
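For experimenting with this, here is a compilable variant of the same structure. It is a sketch under assumptions: internal_condition is passed in as a std::array<bool, N>, the function returns a hypothetical Tally struct, and the counter increments stand in for run_some_code, run_some_other_code and run_lots_of_other_code.

```cpp
#include <array>
#include <cstddef>

// Counters standing in for the three placeholders in the answer.
struct Tally {
    int some = 0;
    int other = 0;
    int lots = 0;
};

template <bool external_condition_1, bool external_condition_2, std::size_t N>
Tally myloop([[maybe_unused]] const std::array<bool, N>& internal_condition) {
    Tally t;
    for (std::size_t i = 0; i < N; ++i) {
        if constexpr (external_condition_1) {
            ++t.some;                      // condition 1 decides alone
        } else if constexpr (external_condition_2) {
            if (internal_condition[i]) {   // the only run-time test left
                ++t.some;
            } else {
                ++t.other;
            }
        } else {
            ++t.other;                     // both external conditions false
        }
        ++t.lots;
    }
    return t;
}
```

Calling myloop<true, false>(flags) or myloop<false, true>(flags) selects the instantiation; N is deduced from the array argument.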

Should I use ELSE IF? for better performance?

Newbie here and I just want to know should I use ELSE IF for something like below:
(function)
IF x==1
IF x==2
IF x==3
That is the way I am doing it now, because x will not be anything else. However, I think that if x is equal to 1, the program is still going to run through the following checks (which turn out to be FALSE, FALSE, ...). Should I use ELSE IF so it won't have to run the rest? Will that help performance?
Why don't I want to use ELSE IF? because I'd like each code block (IF x==n) to be similar, not like this:
IF x==1
ELSE IF x==2
ELSE IF x==3
(each ELSE IF block is part of the block above it)
But the program will repeatedly call this function so I am worried about the performance or delay.

Short answer: If you do not need to handle a case where multiple conditions might be true at the same time, always use
if (condition) {
//do something
}
else if (other_condition) {
//do something else
}
else { //in all other conditions
//default behaviour
}
Long answer:
As others have already stated, performance is not really a big concern (unless you are writing production code for enterprise software targeted at colossal businesses). In case performance is indeed crucial though, you should go for the above format anyway. So that might be a good practice/habit to get used to (especially if you are now starting your code journey)
Switch could be an alternative, but since you haven't specified the language I would avoid suggesting it since, in some languages, it defaults to fall-through (which might get you where you started in the first place and confuse you even more)
Performance might not be a concern. But keep in mind that logic errors are a huge enemy in programming, and your solution is prone to them if you don't actually need it to be able to match more than one case. Consider the following case.
if (x == 1) {
x = x + 1
}
if (x == 2) {
x = x + 2
}
if (x >= 3) {
print("Error: x should only be 1 or 2!")
}
In this case, you would expect the error to be reported only if x >= 3, since you only had in mind handling the values 1 or 2. In fact, even if the value of x is 1 or 2 (which you considered valid), the same error message is printed! That's because you have allowed more than one condition to be checked and the respective code block to be executed each time. Note that this is an oversimplified example. At times, this can be a great pain, especially if you collaborate with others, share the code, and aim for extendable and maintainable code.
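The effect is easy to check by putting both shapes into small functions. This is a C++ rendering of the example above; the function names are invented, and instead of printing, each function returns whether the error branch was reached.

```cpp
// Independent ifs: each condition is tested against the *current* x,
// so a change made in one block can trigger a later block.
bool reports_error_independent(int x) {
    bool error = false;
    if (x == 1) { x = x + 1; }
    if (x == 2) { x = x + 2; }
    if (x >= 3) { error = true; }
    return error;
}

// Chained else-ifs: at most one block runs per call.
bool reports_error_chained(int x) {
    bool error = false;
    if (x == 1)      { x = x + 1; }
    else if (x == 2) { x = x + 2; }
    else if (x >= 3) { error = true; }
    return error;
}
```

With x equal to 1 or 2, the first function reports an error and the second does not; with x equal to 3, both do.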
To conclude, do not use a simpler solution if you haven't thought it through. Go for the complete one instead and keep in mind all possible outcomes (usually the worst-case scenarios, and even future features and code).
Best Regards!

If the value being tested is expected to be able to match multiple conditions in a single call, then test each (IF, IF, ...).
If the value is expected to only match one, then check for it and stop if you find it (IF, ELSE IF, ELSE IF...).
If the values are expected to be one of a known set, then go right to it (switch).
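As an illustration of the third case, here is what the switch form could look like in C++, chosen only as an example language since the question does not name one. The function name describe and the returned strings are made up. Each case returns immediately, which sidesteps the fall-through behaviour mentioned in another answer.

```cpp
#include <string>

// Hypothetical example: x is expected to be 1, 2 or 3.
std::string describe(int x) {
    switch (x) {
        case 1:  return "one";
        case 2:  return "two";
        case 3:  return "three";
        default: return "unexpected";  // anything outside the known set
    }
}
```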

Assuming this is JavaScript, but this should be about the same for anything else.
The code inside the if statement will only be run if the condition you provide it is true. For example, if you declare x = 1, we could have something like:
function something() {
    if(x == 1) {
        //do this
    }
    if(x == 2) {
        //do that
    }
    if(x == 3) {
        //do this and that
    }
}
With x = 1, only the first block would run; the other two conditions are still evaluated, but their blocks are skipped. An else-if statement is only evaluated if the preceding if statement was false.
function something() {
    if(x == 1) {
        //do this
    }
    else if(x == 2) {
        //do that
    }
}
So if x == 1 was false, the next statement would be evaluated.
As for performance, the difference is way too little for you to care about. If you have many conditions you need to test, you may want to look into a switch statement.

Does this function have explicit return values on all control paths?

I have a Heaviside step function centered on unity for any data type, which I've encoded using:
template <typename T>
int h1(const T& t){
if (t < 1){
return 0;
} else if (t >= 1){
return 1;
}
}
In code review, my reviewer told me that there is not an explicit return on all control paths. And the compiler does not warn me either. But I don't agree; the conditions are mutually exclusive. How do I deal with this?

It depends on how the template is used. For an int, you're fine.
But if t is an IEEE 754 floating-point double with a value set to NaN, neither t < 1 nor t >= 1 is true, and so program control reaches the end of the if block! This causes the function to return without an explicit value, the behaviour of which is undefined.
(In a more general case, where T overloads the < and >= operators in such a way as to not cover all possibilities, program control will reach the end of the if block with no explicit return.)
The moral of the story here is to decide on which branch should be the default, and make that one the else case.
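The NaN case can be checked directly. In this sketch, falls_through is a hypothetical helper that reports whether both conditions from the question are false for a given value, and h1 is rewritten with an explicit else; treating "1" as the default result is an assumption about what the function should do for such inputs.

```cpp
#include <limits>

// True when neither condition from the question holds for t.
template <typename T>
bool falls_through(const T& t) {
    return !(t < 1) && !(t >= 1);
}

// Variant with a default branch: every control path returns a value.
template <typename T>
int h1(const T& t) {
    if (t < 1) {
        return 0;
    } else {
        return 1;  // also taken for NaN, by choice of default
    }
}
```

For a double NaN obtained from std::numeric_limits<double>::quiet_NaN(), falls_through returns true, while it returns false for ordinary values such as 0.5 or 2.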

Just because code is correct, that doesn't mean it can't be better. Correct execution is the first step in quality, not the last.
if (t < 1) {
return 0;
} else if (t >= 1){
return 1;
}
The above is "correct" for any datatype of t that has sane behavior for < and >=. But this:
if (t < 1) {
return 0;
}
return 1;
Is easier to see by inspection that every case is covered, and avoids the second unneeded comparison altogether (that some compilers might not have optimized out). Code is not only read by compilers, but by humans, including you 10 years from now. Give the humans a break and write more simply for their understanding as well.

As noted, some special values (such as NaN) are neither < 1 nor >= 1, so your reviewer is simply right.
The question is: what made you want to code it like this in the first place? Why do you even consider making life so hard for yourself and others (the people that need to maintain your code)? Just the fact that you are smart enough to deduce that < and >= should cover all cases doesn't mean that you have to make the code more complex than necessary. What goes for physics goes for code too: make things as simple as possible, but not simpler (I believe Einstein said this).
Think about it. What are you trying to achieve? Must be something like this: 'Return 0 if the input is less than 1, return 1 otherwise.' What you've done is add intelligence by saying ... oh but that means that I return 1 if t is greater or equal 1. This sort of needless 'x implies y' is requiring extra think work on behalf of the maintainer. If you think that is a good thing, I would advise to do a couple of years of code maintenance yourself.
If it were my review, I'd make another remark. If you use an 'if' statement, then you can basically do anything you want in all branches. But in this case, you do not do 'anything'. All you want to do is return 0 or 1 depending on whether t<1 or not. In those cases, I think the '?:' operator is much better and more readable than the if statement. Thus:
return t<1 ? 0 : 1;
I know the ?: operator is forbidden in some companies, and I find that a horrible thing to do. ?: usually matches much better with specifications, and it can make code so much easier to read (if used with care) ...

Is using std::out_of_range for logic bad?

In my project, I have a lot of situations like this:
constexpr size_t element_count = 42;
std::array<bool, element_count> elements;
for(size_t i = 0; i < element_count; ++i){
if(i > 0 && elements[i - 1]){/*do something*/}
else{/*do something else*/}
if(i < element_count - 1 && elements[i + 1]){/*do something*/}
else{/*do something else*/}
}
Without checking that i > 0 or i < element_count - 1, I'll get undefined behavior. If I use std::array::at instead of operator[], I can get std::out_of_range exceptions instead. I was wondering if there were any problems with just relying on the exception like this:
for(size_t i = 0; i < element_count; ++i){
try{
if(elements.at(i - 1)){/*do something*/}
}
catch(const std::out_of_range& e){/*do something else*/}
try{
if(elements.at(i + 1)){/*do something*/}
}
catch(const std::out_of_range& e){/*do something else*/}
}
In this example it's more code, but in my real project it would reduce the amount of code because I'm using lots of multidimensional arrays and doing bounds checking for multiple dimensions.

There isn't a problem in the sense that it will work, but that's about it. Using exceptions for basic flow control (which is what you seem to be doing here) is usually frowned upon, with reason, and I don't think I've ever seen it like this in a loop:
- it makes reading and reasoning about code harder, also because it's unexpected that one uses exceptions for flow control (instead of for error handling, which is what they are meant for in C++)
- harder to read usually also means harder to write, and makes it harder to spot mistakes
- you actually made a mistake already, or at least introduced a behaviour change: i > 0 && elements[i - 1] evaluating to false does not result in 'something else' being called anymore
- reducing the amount of code isn't a good goal anymore if it results in less readable or worse code
- it might be less performant
Now it would be interesting to see the actual code, but I suspect it could probably do without any bounds checking whatsoever, e.g. by making the loop start at 1 instead of 0. Or, if this is a recurring pattern, you'd write a helper function (or use an existing one) for iteration with access to multiple elements in one iteration. That would be a reduction in code amount which is actually worth it.
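One possible shape for such a helper is sketched below. The name neighbor_or and its signature are invented for illustration; it returns the neighbouring element, or a caller-supplied fallback when the neighbour would lie outside the array, so no exception is involved.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: value of elements[i + offset], or `fallback`
// when that index lies outside the array.
template <std::size_t N>
bool neighbor_or(const std::array<bool, N>& elements,
                 std::size_t i, std::ptrdiff_t offset, bool fallback) {
    // Work in a signed type so that i + offset can go below zero safely.
    const std::ptrdiff_t j = static_cast<std::ptrdiff_t>(i) + offset;
    if (j < 0 || j >= static_cast<std::ptrdiff_t>(N)) {
        return fallback;
    }
    return elements[static_cast<std::size_t>(j)];
}
```

With a fallback of false, if (neighbor_or(elements, i, -1, false)) takes the same branches as the original if (i > 0 && elements[i - 1]), including running the else branch when the neighbour exists but is false.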
