In many codes, people are using left shift to accomplish the multiplication. For example, 3 * i, some people prefer to use i << 1 + i, is there any benefit over the common one by using left shift.
One reason that jumps to my mind is to make the point that bitwise operations are being performed; The code can become a lot cleaner and intentions clearer with this style of notation than using hex (matter of opinion).
For example - to make the point that the 3rd bit is being set an engineer might do
uint32 foo = 1 << 3;
Note that in this case, the compiler will optimize this away completely and replace with an assign of 0x08; but the intent of 3rd bit being set is very clear.
This has a lot of use in embedded programming, encryption and other bit specific twiddling that don't jump to my mind.
It is not for performance optimizations, since the compiler can be expected to optimize a multiply by 2 to a shift it's not any faster; and in any case you'd be unlikely to notice the gains in most applications.
Your example however, 3*i being replaced by i << 1+i is poor since they are not the same operation.
Assuming I have two atomic variables of types int32, I could instead chose to represent them as std::atomic<int64> both and reserve the first 32 bits for my first in and the last for my second int.
This seems like quite a space & time saver on x64 architectures, not to mention it allows for all sorts of black magic since one can abstract over various operations and make them atomic:
first == a && second ==b
becomes
both == ( int64(a) + int64(b) << 32 )
//Or some such... I'm not 100% sure this is correct but you get the idea
The one problem with this trick that I see is that I'm not particularly found with operating at the bit level and C++ is not very kind when it comes to operation at the bit level, especially once you try to accomplish more complex operations or pack more than two variables (e.g. two numbers and several bools) into the same atomic.
So I'm wondering if there is a standardized way to apply this kind of trick. A pattern or even std functionality that is easily recognizable by other coder when seen and easier to work with for the implementer ? Likewise, is this pattern useful enough to warrant such a standardization, or does its usefulness quickly become obsolete when compares to the possible annoyances and UB it can bring?
The way to get around Read-Then-Write with atomics is using a loop:
void setBit(atomic<int64_t>& bitset, int bit)
{
int64_t val = 1LL << bit;
int64_t prev = bitset;
while ((!(bitset & val)) &&
!bitset.compare_exchange_weak(prev, (prev | val))
;
}
You can extend this method to create generic bitwise operation functions
Does using bitwise operations in normal flow or conditional statements like for, if, and so on increase overall performance and would it be better to use them where possible? For example:
if(i++ & 1) {
}
vs.
if(i % 2) {
}
Unless you're using an ancient compiler, it can already handle this level of conversion on its own. That is to say, a modern compiler can and will implement i % 2 using a bitwise AND instruction, provided it makes sense to do so on the target CPU (which, in fairness, it usually will).
In other words, don't expect to see any difference in performance between these, at least with a reasonably modern compiler with a reasonably competent optimizer. In this case, "reasonably" has a pretty broad definition too--even quite a few compilers that are decades old can handle this sort of micro-optimization with no difficulty at all.
TL;DR Write for semantics first, optimize measured hot-spots second.
At the CPU level, integer modulus and divisions are among the slowest operations. But you are not writing at the CPU level, instead you write in C++, which your compiler translates to an Intermediate Representation, which finally is translated into assembly according to the model of CPU for which you are compiling.
In this process, the compiler will apply Peephole Optimizations, among which figure Strength Reduction Optimizations such as (courtesy of Wikipedia):
Original Calculation Replacement Calculation
y = x / 8 y = x >> 3
y = x * 64 y = x << 6
y = x * 2 y = x << 1
y = x * 15 y = (x << 4) - x
The last example is perhaps the most interesting one. Whilst multiplying or dividing by powers of 2 is easily converted (manually) into bit-shifts operations, the compiler is generally taught to perform even smarter transformations that you would probably think about on your own and who are not as easily recognized (at the very least, I do not personally immediately recognize that (x << 4) - x means x * 15).
This is obviously CPU dependent, but you can expect that bitwise operations will never take more, and typically take less, CPU cycles to complete. In general, integer / and % are famously slow, as CPU instructions go. That said, with modern CPU pipelines having a specific instruction complete earlier doesn't mean your program necessarily runs faster.
Best practice is to write code that's understandable, maintainable, and expressive of the logic it implements. It's extremely rare that this kind of micro-optimisation makes a tangible difference, so it should only be used if profiling has indicated a critical bottleneck and this is proven to make a significant difference. Moreover, if on some specific platform it did make a significant difference, your compiler optimiser may already be substituting a bitwise operation when it can see that's equivalent (this usually requires that you're /-ing or %-ing by a constant).
For whatever it's worth, on x86 instructions specifically - and when the divisor is a runtime-variable value so can't be trivially optimised into e.g. bit-shifts or bitwise-ANDs, the time taken by / and % operations in CPU cycles can be looked up here. There are too many x86-compatible chips to list here, but as an arbitrary example of recent CPUs - if we take Agner's "Sunny Cove (Ice Lake)" (i.e. 10th gen Intel Core) data, DIV and IDIV instructions have a latency between 12 and 19 cycles, whereas bitwise-AND has 1 cycle. On many older CPUs DIV can be 40-60x worse.
By default you should use the operation that best expresses your intended meaning, because you should optimize for readable code. (Today most of the time the scarcest resource is the human programmer.)
So use & if you extract bits, and use % if you test for divisibility, i.e. whether the value is even or odd.
For unsigned values both operations have exactly the same effect, and your compiler should be smart enough to replace the division by the corresponding bit operation. If you are worried you can check the assembly code it generates.
Unfortunately integer division is slightly irregular on signed values, as it rounds towards zero and the result of % changes sign depending on the first operand. Bit operations, on the other hand, always round down. So the compiler cannot just replace the division by a simple bit operation. Instead it may either call a routine for integer division, or replace it with bit operations with additional logic to handle the irregularity. This may depends on the optimization level and on which of the operands are constants.
This irregularity at zero may even be a bad thing, because it is a nonlinearity. For example, I recently had a case where we used division on signed values from an ADC, which had to be very fast on an ARM Cortex M0. In this case it was better to replace it with a right shift, both for performance and to get rid of the nonlinearity.
C operators cannot be meaningfully compared in therms of "performance". There's no such thing as "faster" or "slower" operators at language level. Only the resultant compiled machine code can be analyzed for performance. In your specific example the resultant machine code will normally be exactly the same (if we ignore the fact that the first condition includes a postfix increment for some reason), meaning that there won't be any difference in performance whatsoever.
Here is the compiler (GCC 4.6) generated optimized -O3 code for both options:
int i = 34567;
int opt1 = i++ & 1;
int opt2 = i % 2;
Generated code for opt1:
l %r1,520(%r11)
nilf %r1,1
st %r1,516(%r11)
asi 520(%r11),1
Generated code for opt2:
l %r1,520(%r11)
nilf %r1,2147483649
ltr %r1,%r1
jhe .L14
ahi %r1,-1
oilf %r1,4294967294
ahi %r1,1
.L14: st %r1,512(%r11)
So 4 extra instructions...which are nothing for a prod environment. This would be a premature optimization and just introduce complexity
Always these answers about how clever compilers are, that people should not even think about the performance of their code, that they should not dare to question Her Cleverness The Compiler, that bla bla bla… and the result is that people get convinced that every time they use % [SOME POWER OF TWO] the compiler magically converts their code into & ([SOME POWER OF TWO] - 1). This is simply not true. If a shared library has this function:
int modulus (int a, int b) {
return a % b;
}
and a program launches modulus(135, 16), nowhere in the compiled code there will be any trace of bitwise magic. The reason? The compiler is clever, but it did not have a crystal ball when it compiled the library. It sees a generic modulus calculation with no information whatsoever about the fact that only powers of two will be involved and it leaves it as such.
But you can know if only powers of two will be passed to a function. And if that is the case, the only way to optimize your code is to rewrite your function as
unsigned int modulus_2 (unsigned int a, unsigned int b) {
return a & (b - 1);
}
The compiler cannot do that for you.
Bitwise operations are much faster.
This is why the compiler will use bitwise operations for you.
Actually, I think it will be faster to implement it as:
~i & 1
Similarly, if you look at the assembly code your compiler generates, you may see things like x ^= x instead of x=0. But (I hope) you are not going to use this in your C++ code.
In summary, do yourself, and whoever will need to maintain your code, a favor. Make your code readable, and let the compiler do these micro optimizations. It will do it better.
I've seen this a couple of times, but it seems to me that using the bitwise shift left hinders readability. Why is it used? Is it faster than just multiplying by 2?
You should use * when you are multiplying, and << when you are bit shifting. They are mathematically equivalent, but have different semantic meanings. If you are building a flag field, for example, use bit shifting. If you are calculating a total, use multiplication.
It is faster on old compilers that don't optimize the * 2 calls by emitting a left shift instruction. That optimization is really easy to detect and any decent compiler already does.
If it affects readability, then don't use it. Always write your code in the most clear and concise fashion first, then if you have speed problems go back and profile and do hand optimizations.
It's used when you're concerned with the individual bits of the data you're working with. For example, if you want to set the upper byte of a word to 0x9A, you would not write
n |= 0x9A * 256
You'd write:
n |= 0x9A << 8
This makes it clearer that you're working with bits, rather than the data they represent.
For some architectures, bit shifting is faster than multiplying. However, any compiler worth its salt will optimize *2 (or any multiplication by a power of 2) to a left bit shift (when a bit shift would be faster).
For readability of values used as bitfields:
enum Flags { UP = (1<<0),
DOWN = (1<<1),
STRANGE = (1<<2),
CHARM = (1<<3),
...
which I think is preferable to either '=1,...,=2,...=4' or '=1,...=2, =2*2,...=2*3' especially if you have 8+ flags.
If you are using a old C compiler, it is preferrable to use bitwise. For readability you can comment you code though.
In C++ it's possible to use a logical operator where a biwise operator was intended:
int unmasked = getUnmasked(); //some wide value
int masked = unmasked & 0xFF; // izolate lowest 8 bits
the second statement could be easily mistyped:
int masked = unmasked && 0xFF; //&& used instead of &
This will cause incorrect behaviour - masked will now be either 0 or 1 when it is inteded to be from 0 to 255. And C++ will not ever complain.
Is is possible to design code in such a way that such errors are detected at compiler level?
Ban in your coding standards the direct use of any bitwise operations in an arbitrary part of the code. Make it mandatory to call a function instead.
So instead of:
int masked = unmasked & 0xFF; // izolate lowest 8 bits
You write:
int masked = GetLowestByte(unmasked);
As a bonus, you'll get a code base which doesn't have dozens of error prone bitwise operations spread all over it.
Only in one place (the implementation of GetLowestByte and its sisters) you'll have the actual bitwise operations. Then you can read these lines two or three times to see if you blew it. Even better, you can unit test that part.
This is a bit Captain Obvious, but you could of course apply encapsulation and just hide the bitmask inside a class. Then you can use operator overloading to make sure that the boolean operator&&() as you see fit.
I guess that a decent implementation of such a "safe mask" need not be overly expensive performance-wise, either.
In some instances you might get a compiler warning (I wouldn't expect one in your example though). A tool like lint might be able to spot possible mistakes.
I think the only way to be sure is to define your coding standards to make the difference between the two operators more obvious - something like:
template<typename T>
T BitwiseAnd( T value, T mask ) { return value & mask; }
and ban the bitwise operators & and |
Both operators represent valid operations on integers, so I don't see any way of detecting a problem. How is the compiler supposed to know which operation you really wanted?