Is if(A | B) always faster than if(A || B)? - c++

I am reading this book by Fedor Pikus, and it has some very interesting examples that surprised me.
In particular, this benchmark caught my attention; the only difference between the two versions is that one uses || in the if condition and the other uses |.
void BM_misspredict(benchmark::State& state)
{
    std::srand(1);
    const unsigned int N = 10000;
    std::vector<unsigned long> v1(N), v2(N);
    std::vector<int> c1(N), c2(N);
    for (int i = 0; i < N; ++i)
    {
        v1[i] = rand();
        v2[i] = rand();
        c1[i] = rand() & 0x1;
        c2[i] = !c1[i];
    }
    unsigned long* p1 = v1.data();
    unsigned long* p2 = v2.data();
    int* b1 = c1.data();
    int* b2 = c2.data();
    for (auto _ : state)
    {
        unsigned long a1 = 0, a2 = 0;
        for (size_t i = 0; i < N; ++i)
        {
            if (b1[i] || b2[i]) // Only difference
            {
                a1 += p1[i];
            }
            else
            {
                a2 *= p2[i];
            }
        }
        benchmark::DoNotOptimize(a1);
        benchmark::DoNotOptimize(a2);
        benchmark::ClobberMemory();
    }
    state.SetItemsProcessed(state.iterations());
}
void BM_predict(benchmark::State& state)
{
    std::srand(1);
    const unsigned int N = 10000;
    std::vector<unsigned long> v1(N), v2(N);
    std::vector<int> c1(N), c2(N);
    for (int i = 0; i < N; ++i)
    {
        v1[i] = rand();
        v2[i] = rand();
        c1[i] = rand() & 0x1;
        c2[i] = !c1[i];
    }
    unsigned long* p1 = v1.data();
    unsigned long* p2 = v2.data();
    int* b1 = c1.data();
    int* b2 = c2.data();
    for (auto _ : state)
    {
        unsigned long a1 = 0, a2 = 0;
        for (size_t i = 0; i < N; ++i)
        {
            if (b1[i] | b2[i]) // Only difference
            {
                a1 += p1[i];
            }
            else
            {
                a2 *= p2[i];
            }
        }
        benchmark::DoNotOptimize(a1);
        benchmark::DoNotOptimize(a2);
        benchmark::ClobberMemory();
    }
    state.SetItemsProcessed(state.iterations());
}
I will not go into all the details explained in the book about why the latter is faster, but the idea is that the hardware branch predictor gets two chances to mispredict in the slower version, versus only one in the | (bitwise or) version. See the benchmark results below.
So the question is why don't we always use | instead of || in branches?

Is if(A | B) always faster than if(A || B)?
No, if(A | B) is not always faster than if(A || B).
Consider a case where A is true and the B expression is a very expensive operation. Not doing the operation can save that expense.
So the question is why don't we always use | instead of || in branches?
Besides the cases where logical or is more efficient, efficiency is not the only concern. There are often operations that have preconditions, and there are cases where the result of the left-hand operation signals whether the precondition is satisfied for the right-hand operation. In such cases, we must use the logical operator.
if (b1[i]) // maybe this exists somewhere in the program
    b2 = nullptr;

if (b1[i] || b2[i]) // OK
if (b1[i] | b2[i])  // NOT OK; indirection through null pointer
It is this possibility that is typically the problem when the optimiser is unable to replace logical with bitwise. In the example of if(b1[i] || b2[i]), the optimiser can make such a replacement only if it can prove that b2 is valid at least whenever b1[i] != 0. That condition might not exist in your example, but that doesn't mean it would necessarily be easy, or sometimes even possible, for the optimiser to prove that it doesn't exist.
Furthermore, there can be a dependency on the order of the operations, for example if one operand modifies a value read by the other operation:
if(a || (a = b)) // OK
if(a | (a = b)) // NOT OK; undefined behaviour
Also, there are types that are convertible to bool and thus are valid operands for ||, but are not valid operands for |:
if(ptr1 || ptr2) // OK
if(ptr1 | ptr2) // NOT OK; no bitwise or for pointers
TL;DR: If we could always use bitwise or instead of logical operators, then there would be no need for logical operators and they probably wouldn't be in the language. But such a replacement is not always a possibility, which is the reason why we use logical operators, and also the reason why the optimiser sometimes cannot use the faster option.

If evaluating A is fast, B is slow, and the short circuit happens (A returns true), then if (A || B) will avoid the slow path, whereas if (A | B) will not.
If evaluating A almost always gives the same result, the processor's branch prediction may give if (A || B) performance better than if (A | B) even if B is fast.
As others have mentioned, there are cases where the short circuit is mandatory: you only want to execute B if A is known to evaluate to false:
if (p == NULL || test(*p)) { ... } // null pointer would crash on *p
if (should_skip() || try_update()) { ... } // active use of side effects

Bitwise-or is a branchless arithmetic operator corresponding to a single ALU instruction. Logical-or is defined as implying short-circuit evaluation, which involves a (costly) conditional branch. The effect of the two can differ when the evaluations of the operands have side effects.
In the case of two boolean variables, a smart compiler might evaluate logical-or as a bitwise-or, or using a conditional move, but who knows...

So the question is why don't we always use | instead of || in branches?
Branch prediction is relevant only to fairly hot pieces of code, and it depends on the branch being predictable enough to matter. In most places, | has little or no performance benefit over ||.
Also, taking A and B as expressions of suitable type (not necessarily single variables), key relevant differences include:
In A || B, B is evaluated only if A evaluates to 0, but in A | B, B is always evaluated. Conditionally avoiding evaluation of B is sometimes exactly the point of using the former.
In A || B there is a sequence point between evaluation of A and evaluation of B, but there isn't one in A | B. Even if you don't care about short-circuiting, you may care about the sequencing, even in relatively simple examples. For example, given an integer x, x-- || x-- has well-defined behavior, but x-- | x-- has undefined behavior.
When used in a conditional context, the intent of A || B is clear to other humans, but the reason for substituting A | B is less so. Code clarity is extremely important. And after all, if the compiler can see that it is safe to do so (and in most cases it is more reliable than a human at making that determination), then it is at liberty to compile one of those expressions as if it were the other.
If you cannot be sure that A and B both have built-in types -- in a template, for example -- you have to account for the possibility that one or both of | and || have been overloaded. In that case, it is reasonable to suppose that || will still do something that makes sense for branch control, but it is much less safe to assume that | will do something equivalent or even suitable.
As an additional minor matter, the precedence of | is different from that of ||. This can bite you if you rely on precedence instead of parentheses for grouping, and you need to watch out for that if you are considering modifying existing code to change || expressions to | expressions. For example, A && B || C && D groups as (A && B) || (C && D), but A && B | C && D groups as (A && (B | C)) && D.
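To sketch the overloading point above, here is a hypothetical Flags type (purely illustrative; not from the original question), where operator| does something sensible for flag sets but is not a drop-in replacement for || in a branch:
#include <iostream>

struct Flags {
    unsigned bits;
    // A plausible operator| overload: it combines flag sets and returns
    // Flags, not bool, so it behaves quite differently from ||.
    friend Flags operator|(Flags a, Flags b) { return {a.bits | b.bits}; }
    explicit operator bool() const { return bits != 0; }
};

int main()
{
    Flags a{1}, b{0};
    if (a || b)        // built-in ||: contextually converts each operand to bool, short-circuits
        std::cout << "logical or\n";
    if (bool(a | b))   // calls the overloaded operator|, then converts the result
        std::cout << "overloaded bitwise or\n";
}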

Even if a and b are automatic-duration Boolean flags, that doesn't mean that an expression like a||b will be evaluated by checking the state of one flag, and then if necessary checking the state of the other. If a section of code performs:
x = y+z;
flag1 = (x==0);
... code that uses flag1
a compiler could replace that with:
x = y+z;
if (processor's Z flag was set)
{
    ... copy of the code that uses flag1, but with flag1 replaced by constant 1
}
else
{
    ... copy of the code that uses flag1, but with flag1 replaced by constant 0
}
Although hardly required to do so, a compiler may base some of its decisions about whether to perform such a substitution upon the programmer's choice of whether to write (flag1 || flag2) or (flag1 | flag2), and many factors may cause the aforementioned substitution to run faster or slower than the original code.

Code readability, short-circuiting, and the fact that | is not guaranteed to always outperform the || operator.
Computer systems are more complicated than expected, even though they are man-made.
There was a case where a for loop with a much more complicated condition ran faster on an IBM machine; the CPU not cooling down, and thus executing instructions faster, was one possible reason. What I am trying to say is: focus on other areas of the code to improve, rather than fighting micro-cases that will differ depending on the CPU and on how the compiler handles the boolean evaluation (compiler optimizations).

The expression A | B might be faster in a loop that the compiler can optimize to a bitwise or of two vectors. Another case where | might be slightly more optimized is when the compiler would want to optimize away a branch by combining the two operands with bitmasks. If the operand on the right is something that might have visible side-effects, the compiler must instead insert a branch to guarantee the correct short-circuiting.
In other situations I can think of, A || B will be as fast or faster, including just about all the ones I can think of where you’re comparing a single pair of operands rather than a vector. However, those are almost never critical to performance.

Adding to the list:
Consider the case where A and B are each individually unpredictable, but A||B is usually true (i.e. when A is false, then usually B is true, and vice versa).
In this case A||B may lead to a lot of mispredictions, but A|B is predictable and most likely faster. This is exactly the setup of the benchmark at the top, where c2[i] = !c1[i].

So the question is why don't we always use | instead of || in branches?
To my mind there are three reasons, but that might be just me.
First and foremost: code tends to be read many more times than it is written. So, in the example you have an array of ints with values of either zero or one. What the code is really meant to do is hidden from the reader. It might be clear to the original writer exactly what is supposed to be done, but years later, and after lots and lots of added lines of code, it is probably anyone's guess. In my world, use what shows the intended comparison: either it is a bit comparison or a logical one. It should not be both.
Secondly: does the performance gain really matter? The example basically does nothing and could be coded a lot more efficiently. No? Well, don't create the arrays in the first place; all the code does now is check the quality of the rand function. This optimization is an example where a better analysis of the problem would probably give a much larger gain.
Third: a different compiler or a different processor will probably change the relative speeds. What you are doing here is applying your knowledge of the internals of the current compiler and the current processor; the next version of either might turn the question around completely. You really cannot know, and we all know that the only way to find out whether an optimization actually is quicker is by testing, which this very specific example has done.
So, if, for some reason, getting the very last bit of efficiency out of the code is worth it, I would mark the choice as a "hack". I would probably include a macro, something like "DO_PERFORMANCE_HACKS", and have two versions of the code, one with | and one with ||, with comments explaining the intention. By changing the macro, it will be possible for the next readers to see what and where the hack is, and in the future they may elect not to compile it.

Related

Efficient check that two floating point values have distinct signs

I need to find whether two finite floating point values A and B have different signs or one of them is zero.
In many code examples I see the test as follows:
if ( (A <= 0 && B >= 0) || (A >= 0 && B <= 0) )
It works fine but looks inefficient to me since many conditions are verified here and each condition branch is a slow operation for modern CPUs.
The other option is to use multiplication of the floating-point values:
if ( A * B <= 0 )
It shall work even in case of overflow, since infinity values preserve proper sign.
What is the most efficient way for performing the check (one of these two or probably some other)?
Efficient check that two floating point values have distinct signs
if ( A * B <= 0 ) fails to distinguish the signs when A or B is one of -0.0, +0.0.
Consider signbit()
if (signbit(A) == signbit(B)) {
    ; // same sign
} else {
    ; // distinct signs
}
Trust the compiler to form efficient code - or use a better compiler.
OP then formed a different goal: "whether two finite floating point values A and B have distinct signs or one of them is zero."
Of course if (signbit(A) != signbit(B) || A == 0.0 || B == 0.0) meets this newer functionality without the rounding-to-0.0 issue (noted by Eric Postpischil) of if ( A * B <= 0 ), but it may not meet the fuzzy requirement of the most efficient way.
The best way to form efficient code and avoid premature optimization (really the root of all evil) is to see the larger context.
It works fine but looks inefficient to me since many conditions are verified here and each condition branch is a slow operation for modern CPUs.
An optimizing compiler won't branch unnecessarily. Don't worry about premature optimization unless you're in a very tight hot loop. Your first snippet is compiled to only 2 branches on x86-64 and ARM64 in the compilers I tried. It can also be compiled to a branchless version. Of course it may still be slower than a simple A * B <= 0, but there's no way to know that for sure without a proper benchmark.
If you don't care about NaN then you can simply do some bitwise operations:
// needs #include <bit> (for std::bit_cast, C++20) and #include <cstdint>
auto a = std::bit_cast<uint64_t>(A);
auto b = std::bit_cast<uint64_t>(B);
const auto sign_mask = 1ULL << 63;
return ((a ^ b) & sign_mask)       // sign bits differ
    || ((a & ~sign_mask) == 0)     // A is a zero of either sign
    || ((b & ~sign_mask) == 0);    // B is a zero of either sign
If A and B have different sign bits, that is caught by (a ^ b) & sign_mask; the remaining conditions catch the case where either value is a zero of either sign. But this works in the integer domain, so it may incur a cross-domain penalty when moving the value from the float domain to the int domain.
If std::bit_cast is not available, just replace it with memcpy(&a, &A, sizeof A).
Demo on Godbolt
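For toolchains without std::bit_cast, here is a self-contained sketch of that memcpy fallback (the function name is mine, and a 64-bit IEEE double is assumed):
#include <cstdint>
#include <cstring>

// True when A and B have different sign bits, or when either value
// is a zero of either sign (matching A * B <= 0 for finite inputs).
bool distinct_signs_or_zero(double A, double B)
{
    std::uint64_t a, b;
    std::memcpy(&a, &A, sizeof A); // type-pun without undefined behaviour
    std::memcpy(&b, &B, sizeof B);
    const std::uint64_t sign_mask = 1ULL << 63;
    return ((a ^ b) & sign_mask) != 0
        || (a & ~sign_mask) == 0
        || (b & ~sign_mask) == 0;
}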
Again, do a benchmark to determine what's best for your target; there's no solution that's fastest on every microarchitecture available. If you really run this a lot of times in a loop, then you should use SIMD instead, to check multiple values at the same time. You should also use profile-guided optimization so that the compiler knows where and when to branch.

Is a boolean expression as onerous as branching with if or switch?

Often I convert some if statements into boolean expressions for code compactness. For instance, if I have something like
int foo(int x)
{
    if (x > 5) return 100 + 5;
    return 100;
}
I'll do it like
int foo(int x)
{
    return 100 + (x > 5) * 5;
}
This is very simple, so no problem; the thing is, when I have multiple tests, I can greatly simplify them (at the expense of readability, but that's a different issue).
So the question is whether that (x > 5) evaluation is as onerous as explicitly branching on it.
In both cases the expression (x > 5) has to be checked for whether it evaluates to true. And as has been demonstrated already, both versions compile to the same assembly even without any optimization enabled.
However, the Philosophy section of C++ Core Guidelines has these two rules you would do well to pay heed to:
P.1: Express ideas directly in code
P.3: Express intent
Though these rules cannot be enforced in any way, adhering to them will make you adopt the version with the if statement.
Doing so will make the code less onerous for those who have to maintain it after you, and even for yourself a few months later.
You seem to be conflating C++ language constructs with patterns in the assembly. It may have been viable to reason about code on this level given the compilers of the late eighties or early nineties. At this point, however, compilers apply a lot of optimizations and transformations whose correctness or utility is not even obvious to the average programmer. A very simple example is the common beginner's mistake of assuming the following equivalences:
std::uint16_t a = ...;
a *= 2; // a multiplication in assembly
a *= 17; // ditto
a /= 3; // a division in assembly
They may then be surprised to find out that their compiler of choice translates these into the assembly equivalent of e.g.:
a <<= 1u;
a = (a << 4u) + a; // or even (a << 4u) | a if a < 16
a *= 43691u;
Note that the last transformation is only allowed if a is known to be a multiple of the divisor, so you may not see this kind of optimization all too often. How does it even work? In mathematical terms, uint16_t can be thought of as the residue class ring Z/(2^16)Z, and in this ring, there exists a multiplicative inverse for any element that is coprime to 2^16 (i.e. not divisible by 2). If d (e.g. 3) is coprime to 2, it has such an inverse, and then dividing by d is simply equivalent to multiplying by the inverse of d if the remainder is known to be zero. (I won't go into how this inverse can be calculated here.)
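For the curious, the constants above can at least be verified mechanically; here is a minimal self-contained check (the exact test values are mine):
#include <cstdint>
#include <iostream>

int main()
{
    // 43691 is the multiplicative inverse of 3 in Z/(2^16)Z:
    // 3 * 43691 = 131073 = 2 * 65536 + 1, i.e. 1 modulo 2^16.
    std::uint16_t inv3 = 43691u;
    std::cout << static_cast<std::uint16_t>(3u * inv3) << '\n';  // prints 1

    // For a multiple of 3, multiplying by the inverse divides by 3.
    std::uint16_t a = 999;                                       // 999 == 3 * 333
    std::cout << static_cast<std::uint16_t>(a * inv3) << '\n';   // prints 333
}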
Here is another surprising optimization:
long arithsum(long n)
{
    long result = 0;
    for (long i = 0; i <= n; ++i)
        result += i;
    return result;
}
GCC with -O3 rather mundanely translates this into an unrolled loop of additions. My version (9.0.0svn-something) of Clang, however, will pull a Gauss on you if you do this, and translate this into something like:
long arithsum(long n)
{
    return (n * (n+1)) >> 1;
}
Anyway, the same caveats apply to if/switch etc. – while these are control flow structures, and so you'd think they correspond to branching, this may not be so. Likewise, what appears to be a non-branching operation might be translated to a branching operation if the compiler has an optimization rule under which this seems beneficial, or even if it is just unable to translate its own AST or intermediate representation into machine code without use of branching (on the given architecture).
TL;DR: Before you try to outsmart your compiler, figure out which assembly the compiler produces for the straightforward / readable code in the first place. If that assembly is good, there is no point in making the code more subtle / less readable.
Assuming that by "onerous" you mean relying on the 1/0 conversion: it might work in C/C++ due to implicit type conversion, but it might not in other languages. If that's what you want to achieve, why not use the ternary operator (? :), which also makes the code more readable:
int foo(int x) {
    return (x > 5) ? (100 + 5) : 100;
}
Also read this stackoverflow article -- bool to int conversion

How far does GCC's __builtin_expect go?

While answering another question I got curious about this. I'm well aware that
if( __builtin_expect( !!a, 0 ) ) {
    // not likely
} else {
    // quite likely
}
will make the "quite likely" branch faster (in general), by doing something along the lines of hinting to the processor / changing the assembly code order / some kind of magic. (If anyone can clarify that magic, that would also be great.)
But does this work for a) inline ifs, b) variables and c) values other than 0 and 1? i.e. will
__builtin_expect( !!a, 0 ) ? /* unlikely */ : /* likely */;
or
int x = __builtin_expect( t / 10, 7 );
if( x == 7 ) {
    // likely
} else {
    // unlikely
}
or
if( __builtin_expect( a, 3 ) ) {
    // likely
    // uh-oh, what happens if a is 2?
} else {
    // unlikely
}
have any effect? And does all of this depend on the architecture being targeted?
Did you read the GCC documentation?
Built-in Function: long __builtin_expect (long exp, long c)
You may use __builtin_expect to provide the compiler with branch
prediction information. In general, you should prefer to use actual
profile feedback for this (-fprofile-arcs), as programmers are
notoriously bad at predicting how their programs actually perform.
However, there are applications in which this data is hard to collect.
The return value is the value of exp, which should be an integral
expression. The semantics of the built-in are that it is expected that
exp == c. For example:
if (__builtin_expect (x, 0))
foo ();
indicates that we do not expect to call foo, since we expect x to be zero. Since you are limited to integral expressions for exp, you should use constructions such as
if (__builtin_expect (ptr != NULL, 1))
foo (*ptr);
when testing pointer or floating-point values.
To explain this a bit... __builtin_expect is specifically useful for communicating which branch you think the program's likely to take. You ask how the compiler can use that insight - well, consider this code:
if (x == 0)
return 10 * y;
else
return 39;
In machine code, the CPU can typically be asked to "goto" another line (which takes time, and depending on the CPU may prevent other execution optimisations beneath the level of machine code; for example, see the Branches heading under http://en.wikipedia.org/wiki/Instruction_pipeline), or to call some other code. But there's not really an if/else concept where the true and false code paths are equal: you have to branch away to find the code for one or the other. The way that's done is basically, in pseudo-code:
test whether x is 0
if it was goto else_return_39
return 10 * y
else_return_39:
return 39
Given most CPUs are slower following the goto down to the else_return_39: label than to just fall through to return 10 * y, code for the "true" branch will be reached faster than for the false branch. Of course, the machine code could test whether x is not 0, put the "false" code (return 39) first and thereby reverse the performance characteristics.
This is what __builtin_expect controls - you can tell the compiler to put the true or the false branch where less branching is needed to reach it, thereby getting a tiny performance boost.
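This is also exactly what the widespread likely/unlikely macros (used, for example, in the Linux kernel) wrap; a conventional definition looks like:
// The !! normalises any scalar condition to 0 or 1 before the
// comparison against the expected value.
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

// Usage: tell the compiler the error path is rare.
// if (unlikely(err != 0)) { handle_error(err); }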
But does this work for a) inline ifs, b) variables and c) values other than 0 and 1?
a) Whether the surrounding function is inlined or not doesn't change the need for branching where the if statement appears (unless the optimiser sees that the condition the if statement tests is always true or always false, so that one of the branches could never run). So, it's equally applicable to inlined code.
[ Your comment shows you were interested in conditional expressions - a ? b : c - I'm not sure - there's a disputed answer to that question at https://stackoverflow.com/questions/14784481/can-i-use-gccs-builtin-expect-with-ternary-operator-in-c that might prove insightful one way or the other, or the basis for further exploration ]
b) variables - you postulated:
int x = __builtin_expect( t / 10, 7 );
if( x == 7 ) {
That won't work - the compiler's not obliged to associate such expectations with variables and remember them the next time an if is seen. You can verify this (as I did for gcc 3.4.4) using gcc -S to produce assembly language output: the assembly doesn't change regardless of the expected value.
c) values other than 0 and 1
It works for integral (long) values, so yes. The last paragraph of the documentation quoted above addresses this, specifically:
you should use constructions such as
if (__builtin_expect (ptr != NULL, 1))
foo (*ptr);
when testing pointer or floating-point values.
Why? Well, if the pointer type is larger than long, then calling __builtin_expect(long, long) would effectively slice off some of the less-significant bits for the test, and fail to incorporate the (high-order) rest in the test. Similarly, floating point values might be larger than a long, and the conversion might not produce the result you expect. By using a boolean expression such as ptr != NULL (given that true converts to 1 and false to 0) you're sure to get the intended results.
But does this work for a) inline ifs, b) variables and c) values other than 0 and 1?
It works for an expression context that is used to determine branching.
So, a) Yes. b) No. c) Yes.
And does all of this depend on the architecture being targeted?
Yes!
It leverages architectures that use instruction pipelining, which allow a CPU to begin working on upcoming instructions before the current instruction has been completed.
(if anyone can clarify that magic that would also be great).
("Branch prediction" complicates this description, so I'm intentionally omitting it)
Any code resembling an if statement implies that an expression may result in the CPU jumping to a different location in the program. These jumps invalidate what's in the CPU's instruction pipeline.
__builtin_expect allows (without guarantee) gcc to try to assemble the code so the likely scenario involves fewer jumps than the alternate.
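As an aside beyond these answers: since C++20, the standard [[likely]] and [[unlikely]] attributes express the same hint without a compiler-specific builtin. A minimal sketch:
int process(int err)
{
    // C++20 attribute form of the same hint; no GCC-specific builtin needed.
    if (err != 0) [[unlikely]] {
        return -1;   // rarely-taken error path
    }
    return 0;        // hot path, laid out for fall-through
}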

Will a modern compiler automatically optimize the following C++ code?

If I use the following code, will the compiler optimize it like a switch structure, which uses a binary tree to search for values?
if ( X == "aaaa" || X == "bbbb" || X == "cccc" || X == "dddd" )
{
}
else if ( X == "abcd" || X == "cdef" || X == "qqqq" )
{
}
It's just an example; there's no pattern to what's inside the quote symbols.
UPDATE
OK, X is a string, but I don't really think it matters here. I just want to know: when everything inside the if is about a single variable, will it be optimized?
The values WILL be compared one after the other, as that is required by ||, the short-circuit OR operator. So, two things will happen here:
X will be compared against the values one by one, from left to right.
There will be NO MORE comparisons after any comparison that succeeds (since it is the short-circuit OR operator).
For example:
#include <iostream>

int hello() {
    std::cout << "Hello";
    return 10;
}
int world() {
    std::cout << "World";
    return 11;
}
int hello2() {
    std::cout << "Hello2";
    return 9;
}

int a = 10;
bool dec  = (a == hello() || a == world());
bool dec2 = (a == hello2() || a == hello() || a == world());
The output for the first statement will be:
Hello
as a == world() will not be executed. For the second it will be:
Hello2Hello
as the comparisons keep on happening until the first success.
In the case of the && operator, the comparisons keep on happening until the first failure (as that is enough to determine the outcome of the entire expression).
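For completeness, here is a mirrored && example using the same helper functions as above:
bool dec3 = (a == world() && a == hello());
// Prints only "World": a == world() is 10 == 11, i.e. false, so the
// && short-circuits and hello() is never called.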
Almost certainly not. Binary search requires some sort of ordering relationship, and an ability to compare for less-than. The compiler cannot assume such exists, and even if it does find one, it cannot assume that it defines an equivalence relation which corresponds to ==. It's also possible that the compiler can't determine that the function defining the ordering relationship has no side effects. (If it has side effects, or if any operation in the expression has side effects, the compiler must respect the short-circuiting behavior of ||.) Finally, even if the compiler did all this... what happens if I carefully chose the order of the comparisons so that the most frequent case comes first? Such an “optimization” could even end up being a pessimization.
The “correct” way to handle this is to create a map, mapping the strings to pointers to functions (or to polymorphic function objects, if some state is involved).
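A minimal sketch of that approach, reusing the strings from the question (the handler names are mine):
#include <iostream>
#include <map>
#include <string>

void handleGroup1() { std::cout << "group 1\n"; }
void handleGroup2() { std::cout << "group 2\n"; }

int main()
{
    // Each string maps to its handler once; lookup is O(log n) in the
    // number of alternatives instead of a chain of == comparisons.
    const std::map<std::string, void (*)()> dispatch = {
        {"aaaa", handleGroup1}, {"bbbb", handleGroup1},
        {"cccc", handleGroup1}, {"dddd", handleGroup1},
        {"abcd", handleGroup2}, {"cdef", handleGroup2},
        {"qqqq", handleGroup2},
    };
    const std::string X = "cdef";
    if (auto it = dispatch.find(X); it != dispatch.end())
        it->second(); // prints "group 2"
}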
It probably depends on the flags you set. Binary trees are faster to search but usually require more code to handle, so if you optimize for size it probably won't. I'm not sure whether it will do it anyway. GCC's behaviour is controlled by a LOT of flags; -O1, -O2 and -O3 are just shorthand for big groups of them. You can find a list of all the optimization flags here: http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
Try searching for strings, binary trees, etc. on that page.
The best way to test this is with an example like the one below.
int a = 0;
int b = 0;
if (a == a || b++) {
    ...
}
cout << b;
The value of the b variable will be 0 at the end: since a == a is true, the short circuit guarantees that the b++ part is never executed. This behaviour is required by the language, regardless of the compiler and its optimization settings. Many scripting languages, such as JavaScript, behave the same way.

Using bitwise operators for Booleans in C++

Is there any reason not to use the bitwise operators &, |, and ^ for "bool" values in C++?
I sometimes run into situations where I want exactly one of two conditions to be true (XOR), so I just throw the ^ operator into a conditional expression. I also sometimes want all parts of a condition to be evaluated whether the result is true or not (rather than short-circuiting), so I use & and |. I also need to accumulate Boolean values sometimes, and &= and |= can be quite useful.
I've gotten a few raised eyebrows when doing this, but the code is still meaningful and cleaner than it would be otherwise. Is there any reason NOT to use these for bools? Are there any modern compilers that give bad results for this?
|| and && are boolean operators and the built-in ones are guaranteed to return either true or false. Nothing else.
|, & and ^ are bitwise operators. When the domain of numbers you operate on is just 1 and 0, then they are exactly the same, but in cases where your booleans are not strictly 1 and 0 – as is the case with the C language – you may end up with some behavior you didn't want. For instance:
BOOL two = 2;
BOOL one = 1;
BOOL and = two & one; //and = 0
BOOL cand = two && one; //cand = 1
In C++, however, the bool type is guaranteed to be only either a true or a false (which convert implicitly to respectively 1 and 0), so it's less of a worry from this stance, but the fact that people aren't used to seeing such things in code makes a good argument for not doing it. Just say b = b && x and be done with it.
Two main reasons. In short, consider carefully; there could be a good reason for it, but if there is, be VERY explicit in your comments, because it can be brittle and, as you say yourself, people aren't generally used to seeing code like this.
Bitwise xor != Logical xor (except for 0 and 1)
Firstly, if you are operating on values other than false and true (or 0 and 1, as integers), the ^ operator can introduce behavior not equivalent to a logical xor. For example:
int one = 1;
int two = 2;

// bitwise xor
if (one ^ two)
{
    // executes, because the expression is 3 and any non-zero integer evaluates to true
}

// logical xor; more correctly would be coded as
//     if (bool(one) != bool(two))
// but spelled out to be explicit in the context of the problem
if ((one && !two) || (!one && two))
{
    // does not execute, because the expression is ((true && false) || (false && true)),
    // which evaluates to false
}
Credit to user Patrick for expressing this first.
Order of operations
Second, |, &, and ^, as bitwise operators, do not short-circuit. In addition, when multiple bitwise operators are chained together in a single statement -- even with explicit parentheses -- the operands may be evaluated in any order by optimizing compilers, since their evaluation order is unspecified (and the operations themselves are commutative). This is important if the order of the operations matters.
In other words
bool result = true;
result = result && a() && b();
// will not call a() if result false, will not call b() if result or a() false
will not always give the same result (or end state) as
bool result = true;
result &= (a() & b());
// a() and b() both will be called, but not necessarily in that order in an
// optimizing compiler
This is especially important because you may not control methods a() and b(), or somebody else may come along and change them later not understanding the dependency, and cause a nasty (and often release-build only) bug.
The raised eyebrows should tell you enough to stop doing it. You don't write the code for the compiler, you write it for your fellow programmers first and then for the compiler. Even if the compilers work, surprising other people is not what you want - bitwise operators are for bit operations not for bools.
I suppose you also eat apples with a fork? It works but it surprises people so it's better not to do it.
I think
a != b
is what you want
Disadvantages of the bitlevel operators.
You ask:
“Is there any reason not to use the bitwise operators &, |, and ^ for "bool" values in C++? ”
Yes, the logical operators, that is the built-in high level boolean operators !, && and ||, offer the following advantages:
Guaranteed conversion of arguments to bool, i.e. to 0 and 1 ordinal value.
Guaranteed short circuit evaluation where expression evaluation stops as soon as the final result is known.
This can be interpreted as a three-valued logic, with True, False and Indeterminate.
Readable textual equivalents not, and and or, even if I don't use them myself.
As reader Antimony notes in a comment, the bitlevel operators also have alternative tokens, namely bitand, bitor, xor and compl, but in my opinion these are less readable than and, or and not.
Simply put, each such advantage of the high level operators is a disadvantage of the bitlevel operators.
In particular, since the bitwise operators lack argument conversion to 0/1 you get e.g. 1 & 2 → 0, while 1 && 2 → true. Also ^, bitwise exclusive or, can misbehave in this way. Regarded as boolean values 1 and 2 are the same, namely true, but regarded as bitpatterns they're different.
How to express logical either/or in C++.
You then provide a bit of background for the question,
“I sometimes run into situations where I want exactly one of two conditions to be true (XOR), so I just throw the ^ operator into a conditional expression.”
Well, the bitwise operators have higher precedence than the logical operators. This means in particular that in a mixed expression such as
a && b ^ c
you get the perhaps unexpected result a && (b ^ c).
Instead write just
(a && b) != c
expressing more concisely what you mean.
For the multiple-argument either/or there is no C++ operator that does the job. For example, if you write a ^ b ^ c, then that is not an expression that says “exactly one of a, b or c is true”. Instead it says “an odd number of a, b and c are true”, which might be 1 of them or all 3…
To express the general either/or when a, b and c are of type bool, just write
(a + b + c) == 1
or, with non-bool arguments, convert them to bool:
(!!a + !!b + !!c) == 1
Using &= to accumulate boolean results.
You further elaborate,
“I also need to accumulate Boolean values sometimes, and &= and |= can be quite useful.”
Well, this corresponds to checking whether respectively all or any condition is satisfied, and de Morgan’s law tells you how to go from one to the other; i.e. you only need one of them. You could in principle use *= as a &&=-operator (for, as good old George Boole discovered, logical AND can very easily be expressed as multiplication), but I think that would perplex and perhaps mislead maintainers of the code.
Consider also:
struct Bool
{
    bool value;
    void operator&=( bool const v ) { value = value && v; }
    operator bool() const { return value; }
};

#include <iostream>
int main()
{
    using namespace std;
    Bool a = {true};
    a &= true || false;
    a &= 1234;
    cout << boolalpha << a << endl;

    bool b = {true};
    b &= true || false;
    b &= 1234;
    cout << boolalpha << b << endl;
}
Output with Visual C++ 11.0 and g++ 4.7.1:
true
false
The reason for the difference in results is that the bitlevel &= does not provide a conversion to bool of its right hand side argument.
So, which of these results do you desire for your use of &=?
If the former, true, then better define an operator (e.g. as above) or named function, or use an explicit conversion of the right hand side expression, or write the update in full.
Contrary to Patrick's answer, C++ has no ^^ operator for performing a short-circuiting exclusive or. If you think about it for a second, having a ^^ operator wouldn't make sense anyway: with exclusive or, the result always depends on both operands. However, Patrick's warning about non-bool "Boolean" types holds equally well when comparing 1 & 2 to 1 && 2. One classic example of this is the Windows GetMessage() function, which returns a tri-state BOOL: nonzero, 0, or -1.
Using & instead of && and | instead of || is not an uncommon typo, so if you are deliberately doing it, it deserves a comment saying why.
Patrick made good points, and I'm not going to repeat them. However, might I suggest reducing 'if' statements to readable English wherever possible by using well-named boolean variables. For example (this uses boolean operators, but you could equally use bitwise ones and name the bools appropriately):
bool onlyAIsTrue = (a && !b); // you could use bitwise XOR here
bool onlyBIsTrue = (b && !a); // and not need this second line
if (onlyAIsTrue || onlyBIsTrue)
{
.. stuff ..
}
You might think that using a boolean seems unnecessary, but it helps with two main things:
Your code is easier to understand because the intermediate boolean for the 'if' condition makes the intention of the condition more explicit.
If you are using non-standard or unexpected code, such as bitwise operators on boolean values, people can much more easily see why you've done this.
EDIT: You didn't explicitly say you wanted the conditionals for 'if' statements (although this seems most likely); that was my assumption. But my suggestion of an intermediate boolean value still stands.
Using bitwise operations for bool helps avoid unnecessary branch prediction work by the processor, which results from the compare-and-branch instructions brought in by the logical operations.
Replacing the logical with bitwise operations (where all operands are bool) generates more efficient, branch-free code offering the same result. Ideally, this efficiency gain should outweigh whatever short-circuit benefits could have been leveraged by ordering the operands of the logical operations.
This can make the code a bit unreadable, so the programmer should comment it with the reasons why it was done.
IIRC, many C++ compilers will warn when attempting to cast the result of a bitwise operation as a bool. You would have to use a type cast to make the compiler happy.
Using a bitwise operation in an if expression would serve the same criticism, though perhaps not by the compiler. Any non-zero value is considered true, so something like "if (7 & 3)" will be true. This behavior may be acceptable in Perl, but C/C++ are very explicit languages. I think the Spock eyebrow is due diligence. :) I would append "== 0" or "!= 0" to make it perfectly clear what your objective was.
But anyway, it sounds like a matter of personal preference. I would run the code through lint or a similar tool and see if it also thinks it's an unwise strategy. Personally, it reads like a coding mistake.