To check an int within range [1, ∞) or not, I can use the following ways (use #1, #2 a lot):
if (a>=1)
if (a>0)
if (a>1 || a==1)
if (a==1 || a>1)
Is there any difference that I should pay attention to among the four versions?
Functionally there is no difference between the 4 ways you listed. This is mainly an issue of style. I would venture that #1 and #2 are the most common forms though, if I saw #3 or #4 on a code review I would suggest a change.
Perf wise I suppose it is possible that some compiler out there optimizes one better than the other. But I really doubt it. At best it would be a micro-optimization and nothing I would ever base my coding style on without direct profiler input
I don't really see why you would use 3 or 4. Apart from being longer to type, they will generate more code. Since in a or condition the second check is skipped if the first is true, there shouldn't be a performance hit except for version 4 if value is not 1 often(of course hardware with branch prediction will mostly negate that).
1. if (a>=1)
2. if (a>0)
3. if (a>1 || a==1)
4. if (a==1 || a>1)
On x86, options 1 and 2 produce a cmp instruction. This will set various registers. The cmp is then followed by a condition branch/jump based on registers. For the first, it emits bge, for the second it emits bgt.
Option 3 and 4 - in theory - require two cmps and two branches, but chances are the compiler will simply optimize them to be the same as 1.
You should generally choose whichever (a) follows the conventions in the code you are working on (b) use whichever most clearly expresses the algorithm you are implementing.
There are times when explicitly writing "if a is equal to one, or it has a value greater than 1", and in those times you should write if (a == 1 || a > 1). But if you are just checking that a has a positive, non-zero, integer value, you should write if (a > 0), since that is what that says.
If you find that such a case is a part of a performance bottleneck, you should inspect the assembly instructions and adjust accordingly - e.g. if you find you have two cmps and branches, then write the code to use one compare and one branch.
Nope! They all are the same for an int. However, I would prefer to use if (a>0).
Related
Which operator is faster: > or ==?
Example: I want to test a value (which can have a positive value or -1) against -1 :
if(time > -1)
// or
if (time != -1)
time has type "int"
The standard doesn't say. So it's up to what opcodes the given compiler generates in its given version, and how fast a given CPU executes them.
I.e., implementation / platform defined.
You can find out for a specific compiler / platform combination by looking at / benchmarking the executable code.
But I seriously doubt it will make much of a difference; this is the kind of micro-optimization that is almost always dwarfed by higher-level architectural decisions.
It is platform-dependent. Generally though, those two operations will translate directly to the assembler instructions "branch if greater than" and "branch if not equal". It is unlikely that there is any performance difference between those two, and if there would be, it would be non-significant.
The only branch instruction which is ever so slightly faster than the others is usually "branch if zero"/"branch if not zero".
(In the dark ages when compilers sucked, C programmers therefore liked to write loops as down-counting to zero, instead of up-counting, so that comparisons would be done against zero instead of a value, in order to gain a few nanoseconds. Modern compilers can do that optimization themselves, but you still see such loops now and then.)
In general, you shouldn't concern yourself with micro-management of performance. If you spend time pondering if > is faster than !=, instead of pondering about program design, readability and functionality, you need to set your priorities straight asap.
Semantically these conditions are different. The first one checks whether object time is positive or zero.
if(time > -1)
In this case it would be better to write
if( time >= 0 )
However some functions return either a non-negative value or -1. For example a search function can return -1 if it did not find an element in an array. Or -1 can signal an error state or an absence of a value.
In this case it is better to use condition
if ( time != -1 )
As for the speed when the compiler can generate only one mashine instruction to make the comparison in the both cases.
It is not the case when you should think about the speed. You should think about what condition is more expressive and shows the intention of the programmer.
I have a bottleneck (about 20% CPU time) in my code which is in following if statement:
if (a == 0) { // here
...
}
where a is a uint8_t, so a number from 0 to 255.
Are there any low level optimizations to make it faster?
I thought about using bitwise NOR (~(a| 0)), but that would work only if a was a 1-bit, right?
Just in case: I don't care about code readability in this particular case.
Unless your compiler is garbage, there is nothing you can do to speed up integer comparison.
However, it is possible that the bottleneck you observe is not really the comparison itself, but rather the result of unlucky branch prediction.
There are two ways of getting around this:
If "to branch or not to branch" follows a pattern, move this last second decision further up in your program logic where you can use the pattern, just don't branch in your hot function. This might require serious thinking. A hacky way to find out whether you have patterns: Print 1 if you branch and 0 else for enough calls, Zip is up and see whether the resulting archive gets much smaller (in bits) than the number of values you printed. (Of course there are also smart formulas for that if you like it more theoretical.)
If you choose one branch over the other most of the time, you can tell the compiler which branch is the likely one. With gcc, checkout __builtin_expect, for other compilers, read the manual.
Important for both solutions: You will need to measure whether that actually helped. Especially the second one will not be magically be better, it might even make things much worse.
Is
if(!test)
faster than
if(test==-1)
I can produce assembly but there is too much assembly produced and I can never locate the particulars I'm after. I was hoping someone just knows the answer. I would guess they are the same unless most CPU architectures have some sort of "compare to zero" short cut.
thanks for any help.
Typically, yes. In typical processors testing against zero, or testing sign (negative/positive) are simple condition code checks. This means that instructions can be re-ordered to omit a test instruction. In pseudo assembly, consider this:
Loop:
LOADCC r1, test // load test into register 1, and set condition codes
BCZS Loop // If zero was set, go to Loop
Now consider testing against 1:
Loop:
LOAD r1, test // load test into register 1
SUBT r1, 1 // Subtract Test instruction, with destination suppressed
BCNE Loop // If not equal to 1, go to Loop
Now for the usual pre-optimization disclaimer: Is your program too slow? Don't optimize, profile it.
It depends.
Of course it's going to depend, not all architectures are equal, not all µarchs are equal, even compilers aren't equal but I'll assume they compile this in a reasonable way.
Let's say the platform is 32bit x86, the assembly might look something like
test eax, eax
jnz skip
Vs:
cmp eax, -1
jnz skip
So what's the difference? Not much. The first snippet takes a byte less. The second snippet might be implemented with an inc to make it shorter, but that would make it destructive so it doesn't always apply, and anyway, it's probably slower (but again it depends).
Take any modern Intel CPU. They do "macro fusion", which means they take a comparison and a branch (subject to some limitations), and fuse them. The comparison becomes essentially free in most cases. The same goes for test. Not inc though, but the inc trick only really applied in the first place because we just happened to compare to -1.
Apart from any "weird effects" (due to changed alignment and whatnot), there should be absolutely no difference on that platform. Not even a small difference.
Even if you got lucky and got the test for free as a result of a previous arithmetic instruction, it still wouldn't be any better.
It'll be different on other platforms, of course.
On x86 there won't be any noticeably difference, unless you are doing some math at the same time (e.g. while(--x) the result of --x will automatically set the condition code, where while(x) ... will necessitate some sort of test on the value in x before we know if it's zero or not.
Many other processors do have a "automatic updates of the condition codes on LOAD or MOVE instructions", which means that checking for "postive", "negative" and "zero" is "free" with every movement of data. Of course, you pay for that by not being able to backward propagate the compare instruction from the branch instruction, so if you have a comparison, the very next instruction MUST be a conditional branch - where an extra instruction between these would possibly help with alleviating any delay in the "result" from such an instruction.
In general, these sort of micro-optimisations are best left to compilers, rather than the user - the compiler will quite often convert for(i = 0; i < 1000; i++) into for(i = 1000-1; i >= 0; i--) if it thinks that makes sense [and the order of the loop isn't important in the compiler's view]. Trying to be clever with these sort of things tend to make the code unreadable, and performance can suffer badly on other systems (because when you start tweaking "natural" code to "unnatural", the compiler tends to think that you really meant what you wrote, and not optimise it the same way as the "natural" version).
I was wondering, if we have if-else condition, then what is computationally more efficient to check: using the equal to operator or the not equal to operator? Is there any difference at all?
E.g., which one of the following is computationally efficient, both cases below will do same thing, but which one is better (if there's any difference)?
Case1:
if (a == x)
{
// execute Set1 of statements
}
else
{
// execute Set2 of statements
}
Case 2:
if (a != x)
{
// execute Set2 of statements
}
else
{
// execute Set1 of statements
}
Here assumptions are most of the time (say 90% of the cases) a will be equal to x. a and x both are of unsigned integer type.
Generally it shouldn't matter for performance which operator you use. However it is recommended for branching that the most likely outcome of the if-statement comes first.
Usually what you should consider is; what is the simplest and clearest way to write this code? IMHO, the first, positive is the simplest (not requiring a !)
In terms of performance there is no differences as the code is likely to compile to the same thing. (Certainly in the JIT for Java it should)
For Java, the JIT can optimise the code so the most common branch is preferred by the branch prediction.
In this simple case, it makes no difference. (assuming a and x are basic types) If they're class-types with overloaded operator == or operator != they might be different, but I wouldn't worry about it.
For subsequent loops:
if ( c1 ) { }
else if ( c2 ) { }
else ...
the most likely condition should be put first, to prevent useless evaluations of the others. (again, not applicable here since you only have one else).
GCC provides a way to inform the compiler about the likely outcome of an expression:
if (__builtin_expect(expression, 1))
…
This built-in evaluates to the value of expression, but it informs the compiler that the likely result is 1 (true for Booleans). To use this, you should write expression as clearly as possible (for humans), then set the second parameter to whichever value is most likely to be the result.
There is no difference.
The x86 CPU architecture has two opcodes for conditional jumps
JNE (jump if not equal)
JE (jump if equal)
Usually they both take the same number of CPU cycles.
And even when they wouldn't, you could expect the compiler to do such trivial optimizations for you. Write what's most readable and what makes your intention more clear instead of worrying about microseconds.
If you ever manage to write a piece of Java code that can be proven to be significantly more efficient one way than the other, you should publish your result and raise an issue against whatever implementation you observed the difference on.
More to the point, just asking this kind of question should be a sign of something amiss: it is an indication that you are focusing your attention and efforts on a wrong aspect of your code. Real-life application performance always suffers from inadequate architecture; never from concerns such as this.
Early optimization is the root of all evil
Even for branch prediction, I think you should not care too much about this, until it is really necessary.
Just as Peter said, use the simplest way.
Let the compiler/optimizer do its work.
It's a general rule of thumb (most nowadays) that the source code should express your intention in the most readable way. You are writing it to another human (and not to the computer), the one year later yourself or your team mate who will need to understand your code with the less effort.
It shouldn't make any difference performance wise but you consider what is easiest to read. Then when you are looking back on your code or if someone is looking at it, you want it to be easy to understand.
it has a little advantage (from point of readability) if the first condition is the one that is true in most cases.
Write the conditions that way that you can read them best. You will not benefit from speed by negating a condition
Most processors use an electrical gate for equality/inequality checks, this means all bits are checked at once. Therefore it should make no difference, but you want to truly optimise your code it is always better to benchmark things yourself and check the results.
If you are wondering whether it's worth it to optimise like that, imagine you would have this check multiple times for every pixel in your screen, or scenarios like that. Imho, it is alwasy worth it to optimise, even if it's only to teach yourself good habits ;)
Only the non-negative approach which you have used at the first seems to be the best .
The only way to know for sure is to code up both versions and measure their performance. If the difference is only a percent or so, use the version that more clearly conveys the intent.
It's very unlikely that you're going to see a significant difference between the two.
Performance difference between them is negligible. So, just think about readability of the code. For readability I prefer the one which has a more lines of code in the If statement.
if (a == x) {
// x lines of code
} else {
// y lines of code where y < x
}
In my code I must choose one of this two expressions (where mask and i non constant integer numbers -1 < i < (sizeof(int) << 3) + 1). I don't think that this will make preformance of my programm better or worse, but it is very interesting for me. Do you know which is better and why?
First of all, whenever you find yourself asking "which is faster", your first reaction should be to profile, measure and find out for yourself.
Second of all, this is such a tiny calculation, that it almost certainly has no bearing on the performance of your application.
Third, the two are most likely identical in performance.
C expressions cannot be "faster" or "slower", because CPU cannot evaluate them directly.
Which one is "faster" depends on the machine code your compiler will be able to generate for these two expressions. If your compiler is smart enough to realize that in your context both do the same thing (e.g. you simply compare the result with zero), it will probably generate the same code for both variants, meaning that they will be equally fast. In such case it is quite possible that the generated machine code will not even remotely resemble the sequence of operations in the original expression (i.e. no shift and/or no bitwise-and). If what you are trying to do here is just test the value of one bit, then there are other ways to do it besides the shift-and-bitwise-and combination. And many of those "other ways" are not expressible in C. You can't use them in C, while the compiler can use them in machine code.
For example, the x86 CPU has a dedicated bit-test instruction BT that extracts the value of a specific bit by its number. So a smart compiler might simply generate something like
MOV eax, i
BT mask, eax
...
for both of your expressions (assuming it is more efficient, of which I'm not sure).
Use either one and let your compiler optimize it however it likes.
If "i" is a compile-time constant, then the second would execute fewer instructions -- the 1 << i would be computed at compile time. Otherwise I'd imagine they'd be the same.
Depends entirely on where the values mask and i come from, and the architecture on which the program is running. There's also nothing to stop the compiler from transforming one into the other in situations where they are actually equivalent.
In short, not worth worrying about unless you have a trace showing that this is an appreciable fraction of total execution time.
It is unlikely that either will be faster. If you are really curious, compile a simple program that does both, disassemble, and see what instructions are generated.
Here is how to do that:
gcc -O0 -g main.c -o main
objdump -d main | less
You could examine the assembly output and then look-up how many clock cycles each instruction takes.
But in 99.9999999 percent of programs, it won't make a lick of difference.
The 2 expressions are not logically equivalent, performance is not your concern!
If performance was your concern, write a loop to do 10 million of each and measure.
EDIT: You edited the question after my response ... so please ignore my answer as the constraints change things.