bit representation of unsigned int zero - c++

I ran into a behavior which I didn't expect using bitwise operations on unsigned ints. I'll cut right to my example.
unsigned int a = 0;
unsigned int b = 0;
std::printf("a & b: %u\n", a & b);
std::printf("a == b: %i\n", a == b);
std::printf("a & b == a: %i\n", a & b == a);
The above code produces the following output:
a & b: 0
a == b: 1
a & b == a: 0
The last line is what confuses me. Shouldn't a & b == a evaluate to true, since a & b == (unsigned int)0 and a == (unsigned int)0?

You're getting this behavior because == has higher precedence than & in the C and C++ operator precedence tables, so the comparison is grouped first. In fact, a good compiler will warn you straight away about your code:
t.cpp:10:35: warning: & has lower precedence than ==; == will be evaluated first [-Wparentheses]
std::printf("a & b == a: %i\n", a & b == a);
^~~~~~~~
t.cpp:10:35: note: place parentheses around the '==' expression to silence this warning
std::printf("a & b == a: %i\n", a & b == a);
^
( )
t.cpp:10:35: note: place parentheses around the & expression to evaluate it first
std::printf("a & b == a: %i\n", a & b == a);
^
( )
Make sure your warnings are turned on, like g++ -Wall -Wextra -Werror.

You should write:
(a & b) == a
Now you'll get 1 since a & b will be evaluated first:
(a & b) is 0, and 0 == 0 is 1.
In your case, a & b == a is evaluated as a & (b == a), b == a is 1, and a & 1 is 0.

Due to =='s precedence over &, a & b == a gets evaluated as a & (b == a) (and not as (a & b) == a as you appear to have expected).
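To make the two groupings easy to compare, here is a minimal sketch. The function names as_parsed, as_intended and print_both are made up for illustration; the parentheses are written out explicitly so the code compiles without the warning shown above.

```cpp
#include <cstdio>

// How the compiler groups `a & b == a`: the comparison happens first,
// and its result (0 or 1) is then ANDed with a.
unsigned as_parsed(unsigned a, unsigned b) {
    return a & static_cast<unsigned>(b == a);
}

// What the question intended: mask first, then compare.
bool as_intended(unsigned a, unsigned b) {
    return (a & b) == a;
}

// Prints both results side by side for a given pair of inputs.
void print_both(unsigned a, unsigned b) {
    std::printf("a & (b == a): %u\n", as_parsed(a, b));
    std::printf("(a & b) == a: %d\n", static_cast<int>(as_intended(a, b)));
}
```

Note that the unparenthesized form only yields a non-zero value when a == b and the lowest bit of a is set, so even a = 2, b = 2 gives 0.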

Related

How does llvm ir handle different kind of short circuits?

For the following code in C:
void test(int a, int b, int c){
if(a > 1 && (b == 2 || c == 3))
b = a + c;
else
c = a + b;
return;
}
The CFG is as follows:
But when I change the if on line 2 to while:
void test(int a, int b, int c){
while(a > 1 && (b == 2 || c == 3))
b = a + c;
c = a + b;
return;
}
Then the CFG is changed to:
So the basic blocks lor.end and land.end are generated, but they do not correspond to any statement in the program. Why does while.cond not connect to while.end directly, just like entry -> if.else in the first CFG? In other words, why are lor.end and land.end generated? It seems that short circuits are handled differently in if statements than in loop statements. What causes this difference?
Compile command (clang/llvm 7.0.1): clang -emit-llvm -c -g -fno-discard-value-names file_name.c

Can we use, and how safe is, the "signed" to "unsigned" trick to save one comparison in this case?

For example
bool CheckWithinBoundary(int x, int b) {
return (x>=0) && (x <= b);
}
bool CheckWithinBoundary2(int x, int b) {
return static_cast<uint32>(x) <= static_cast<uint32>(b);
}
CheckWithinBoundary2 can save one comparison.
My questions are:
1. Can today's compilers optimize code this way? If not, how can I make the compiler do this kind of optimization?
2. Is there any danger in using this trick?
The answer to 2 is: yes, there is, because the two functions are not the same. It seems that you are silently assuming that b >= 0, too. Consider e.g. x == 1 and b == -1: this gives false for the first function and true for the second.
(I am switching to C notation; it is easier for me, and you also seem to be interested in it.)
So we have that in fact
static_assert(INT_MAX < UINT_MAX);
bool CheckWithinBoundary(int x, int b) {
return (b >=0) && (x>=0) && (x <= b);
}
bool CheckWithinBoundary2(unsigned x, unsigned b) {
return (b <= INT_MAX) && (x <= b); /* b >= 0 is always true for an unsigned b, so test against INT_MAX instead */
}
If this compiles, the two functions are equivalent on all architectures where INT_MAX < UINT_MAX; the implicit conversion int --> unsigned then does the right thing.
But be careful: note that I use unsigned and not uint32_t, because you have to be sure to use an unsigned type with the same width as int. I don't know whether there are architectures with 64-bit int, but your method would fail there.
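The counterexample above (x == 1, b == -1) can be checked directly. This is only a sketch: the function names are made up for illustration, and the question's uint32 is replaced by plain unsigned so the cast has the same width as int.

```cpp
// Two-comparison version from the question.
bool within_two_compares(int x, int b) {
    return x >= 0 && x <= b;
}

// Single-comparison version: a negative x wraps to a huge unsigned value,
// but so does a negative b, which is where the two versions diverge.
bool within_one_compare(int x, int b) {
    return static_cast<unsigned>(x) <= static_cast<unsigned>(b);
}

// Hypothetical guarded variant: only trust the cast when b is non-negative.
bool within_guarded(int x, int b) {
    return b >= 0 && static_cast<unsigned>(x) <= static_cast<unsigned>(b);
}
```

If the caller can guarantee b >= 0 (for example because b is an array size), the guard is unnecessary and the single comparison is enough.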

Find out (in C++) if binary number is prefix of another

I need a function with a header like this:
bool is_prefix(int a, int b, int* c) {
// ...
}
If a is, read as a binary number string, a prefix of b, then set *c to be the rest of b (i.e. "what b has more than a") and return true. Otherwise, return false. Assume that binary strings always start with "1".
Of course, it is easy to do by comparing bit by bit (right-shift b until b == a). But is there a more efficient solution that does not iterate over the bits?
Example: a=100 (4), b=1001 (9). Now set *c to 1.
You can use your favorite "fast" method to find the highest set bit. Let's call the function msb().
bool is_prefix (int a, int b, int *c) {
if (a == 0 || b == 0 || c == 0) return false;
int d = msb(b) - msb(a);
if (d < 0) return false;
if ((b >> d) == a) {
*c = b ^ (a << d);
return true;
}
return false;
}
Shift b so its high order bit aligns with a, and compare that with a. If they are equal, then a is a "prefix" of b.
This algorithm's performance depends on the performance of msb(). If it is constant, then this algorithm is constant. If msb() is expensive, then the "easy approach" may be the fastest approach.
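For completeness, here is a self-contained sketch with a simple loop-based msb(). The helper is an assumption, since the answer leaves it unspecified, and the parameters are changed to unsigned to keep the shifts well defined. GCC and Clang offer __builtin_clz, which can replace the loop.

```cpp
// Hypothetical helper: 0-based index of the highest set bit, or -1 for zero.
int msb(unsigned v) {
    int index = -1;
    while (v != 0) {
        v >>= 1;
        ++index;
    }
    return index;
}

// Same idea as above: align the top bits of a and b, then compare.
bool is_prefix(unsigned a, unsigned b, unsigned* rest) {
    if (a == 0 || b == 0 || rest == nullptr) return false;
    const int d = msb(b) - msb(a);   // number of extra low bits in b
    if (d < 0) return false;         // a is longer than b
    if ((b >> d) != a) return false; // top bits of b differ from a
    *rest = b ^ (a << d);            // clear the prefix, keep the low d bits
    return true;
}
```

With a = 4 (100) and b = 9 (1001) this returns true and stores 1, matching the example in the question.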
I'm not too sure, but would something like the following work:
bool
is_prefix( unsigned a, unsigned b, unsigned* c )
{
unsigned mask = -1;
while ( mask != 0 && a != (b & mask) ) {
a <<= 1;
mask <<= 1;
}
*c = b & ~mask;
return mask != 0;
}
(Just off the top of my head, so there could be errors.)

Relatively Prime Numbers

How to make a function in c++ to determine if two entered numbers are relatively prime (no common factors)?
For example "1, 3" would be valid, but "2, 4" wouldn't.
Galvanised into action by Jim Clay's incautious comment, here is Euclid's algorithm in six lines of code:
bool RelativelyPrime (int a, int b) { // Assumes a, b > 0
for ( ; ; ) {
if (!(a %= b)) return b == 1 ;
if (!(b %= a)) return a == 1 ;
}
}
Updated to add: I have been out-obfuscated by this answer from Omnifarious, who programs the gcd function thus:
constexpr unsigned int gcd(unsigned int const a, unsigned int const b)
{
return (a < b) ? gcd(b, a) : ((a % b == 0) ? b : gcd(b, a % b));
}
So now we have a three-line version of RelativelyPrime:
bool RelativelyPrime (int a, int b) { // Assumes a, b > 0
return (a<b) ? RelativelyPrime(b,a) : !(a%b) ? (b==1) : RelativelyPrime (b, a%b);
}
Use one of the many algorithms for computing the greatest common divisor; the two numbers are relatively prime exactly when it is 1.
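For readers who prefer the unobfuscated form, here is a plain iterative sketch of Euclid's algorithm. The names gcd_iterative and relatively_prime are made up for illustration; since C++17 the standard library also provides std::gcd in <numeric>.

```cpp
// Greatest common divisor by Euclid's algorithm: repeatedly replace
// (a, b) with (b, a mod b) until the remainder is zero.
unsigned gcd_iterative(unsigned a, unsigned b) {
    while (b != 0) {
        const unsigned r = a % b;
        a = b;
        b = r;
    }
    return a;
}

// Two numbers are relatively prime when their only common divisor is 1.
bool relatively_prime(unsigned a, unsigned b) {
    return gcd_iterative(a, b) == 1;
}
```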

Neatest / Fastest Algorithm for Smallest Positive Number

Simple question - In c++, what's the neatest way of getting which of two numbers (u0 and u1) is the smallest positive number? (that's still efficient)
Every way I try it involves big if statements or complicated conditional statements.
Thanks,
Dan
Here's a simple example:
bool lowestPositive(int a, int b, int& result)
{
//checking code
result = b;
return true;
}
lowestPositive(5, 6, result);
If the values are represented in two's complement, then
result = ((unsigned )a < (unsigned )b) ? a : b;
will work, since negative values in two's complement are larger, when treated as unsigned, than positive values. As with Jeff's answer, this assumes at least one of the values is positive.
return result >= 0;
I prefer clarity over compactness:
bool lowestPositive( int a, int b, int& result )
{
if (a > 0 && a <= b) // a is positive and smaller than or equal to b
result = a;
else if (b > 0) // b is positive and either smaller than a or a is negative
result = b;
else
result = a; // at least b is negative, we might not have an answer
return result > 0; // zero is not positive
}
Might get me modded down, but just for kicks, here is the result without any comparisons, because comparisons are for wimps. :-)
bool lowestPositive(int u, int v, int& result)
{
result = (u + v - abs(u - v))/2;
return (bool) result - (u + v + abs(u - v)) / 2;
}
Note: Fails if (u + v) > max_int. At least one number must be positive for the return code to be correct. Also kudos to polythinker's solution :)
unsigned int mask = 1u << 31; // sign bit, assuming 32-bit int
unsigned int m = mask;
while (m != 0 && (a & m) == (b & m)) { // stop at m == 0 so that a == b does not loop forever
m >>= 1;
}
result = (a & m) ? b : a;
return ! ((a & mask) && (b & mask));
EDIT: I thought this was not so interesting, so I deleted it. But on second thought, I'll just leave it here for fun :) This can be considered a dumb version of Doug's answer :)
Here's a fast solution in C using bit twiddling to find min(x, y). It is a modified version of @Doug Currie's answer and inspired by the answer to the Find the Minimum Positive Value question:
bool lowestPositive(int a, int b, int* pout)
{
/* exclude zero: make zero and negative numbers larger than any positive number */
unsigned x = (unsigned)a - 1u, y = (unsigned)b - 1u; /* unsigned arithmetic avoids signed overflow at INT_MIN */
/* min(x, y) + 1 */
*pout = y + ((x - y) & -(x < y)) + 1;
return *pout > 0;
}
Example:
/** gcc -std=c99 *.c && a */
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdbool.h>
void T(int a, int b)
{
int result = 0;
printf("%d %d ", a, b);
if (lowestPositive(a, b, &result))
printf(": %d\n", result);
else
printf(" are not positive\n");
}
int main(int argc, char *argv[])
{
T(5, 6);
T(6, 5);
T(6, -1);
T(-1, -2);
T(INT_MIN, INT_MAX);
T(INT_MIN, INT_MIN);
T(INT_MAX, INT_MIN);
T(0, -1);
T(0, INT_MIN);
T(-1, 0);
T(INT_MIN, 0);
T(INT_MAX, 0);
T(0, INT_MAX);
T(0, 0);
return 0;
}
Output:
5 6 : 5
6 5 : 5
6 -1 : 6
-1 -2 are not positive
-2147483648 2147483647 : 2147483647
-2147483648 -2147483648 are not positive
2147483647 -2147483648 : 2147483647
0 -1 are not positive
0 -2147483648 are not positive
-1 0 are not positive
-2147483648 0 are not positive
2147483647 0 : 2147483647
0 2147483647 : 2147483647
0 0 are not positive
This will handle all possible inputs as you request.
bool lowestPositive(int a, int b, int& result)
{
if ( a < 0 and b < 0 )
return false;
result = std::min<unsigned int>( a, b );
return true;
}
That being said, the signature you supply allows sneaky bugs to appear, as it is easy to ignore the return value of this function or not even remember that there is a return value that has to be checked to know if the result is correct.
You may prefer one of these alternatives that makes it harder to overlook that a success result has to be checked:
boost::optional<int> lowestPositive(int a, int b)
{
boost::optional<int> result;
if ( a >= 0 or b >= 0 )
result = std::min<unsigned int>( a, b );
return result;
}
or
void lowestPositive(int a, int b, int& result, bool &success)
{
success = ( a >= 0 or b >= 0 );
if ( success )
result = std::min<unsigned int>( a, b );
}
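If C++17 is available, std::optional can play the role of boost::optional without the extra dependency. The following is only a sketch under that assumption: the name lowest_positive_opt is made up, and unlike the versions above it treats zero as not positive.

```cpp
#include <algorithm>
#include <optional>

// Returns the smallest strictly positive argument,
// or std::nullopt when neither argument is positive.
std::optional<int> lowest_positive_opt(int a, int b) {
    if (a > 0 && b > 0) return std::min(a, b);
    if (a > 0) return a;
    if (b > 0) return b;
    return std::nullopt;
}
```

The caller has to test the result (for example with has_value()) before reading it, which is exactly the property this answer is after.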
Tons of the answers here are ignoring the fact that zero isn't positive :)
With tricky casting and the ternary operator:
bool leastPositive(int a, int b, int& result) {
result = ((unsigned) a < (unsigned) b) ? a : b;
return result > 0;
}
less cute:
bool leastPositive(int a, int b, int& result) {
if(a > 0 && b > 0)
result = a < b ? a : b;
else
result = a > b ? a : b;
return result > 0;
}
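One caveat with the cast-based version above: zero converts to the smallest unsigned value, so for inputs such as (0, 5) it selects 0 and reports failure even though 5 is positive. Here is a sketch of the difference, using made-up names and a shift-by-one workaround that is not part of the original answer:

```cpp
// The cast-based selection: zero is the smallest unsigned value,
// so it wins the comparison against any positive number.
bool least_positive_cast(int a, int b, int& result) {
    result = (static_cast<unsigned>(a) < static_cast<unsigned>(b)) ? a : b;
    return result > 0;
}

// Hypothetical workaround: subtract 1 in unsigned arithmetic first, which
// wraps zero around to the largest value so that it sorts last.
bool least_positive_shifted(int a, int b, int& result) {
    const unsigned x = static_cast<unsigned>(a) - 1u;
    const unsigned y = static_cast<unsigned>(b) - 1u;
    result = (x < y) ? a : b;
    return result > 0;
}
```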
I suggest you refactor the function into simpler functions. Furthermore, this allows your compiler to better enforce expected input data.
unsigned int minUnsigned( unsigned int a, unsigned int b )
{
return ( a < b ) ? a : b;
}
bool lowestPositive( int a, int b, int& result )
{
if ( a < 0 && b < 0 ) // SO comments refer to the previous version that had || here
{
return false;
}
result = minUnsigned( (unsigned)a, (unsigned)b ); // negative signed integers become large unsigned values
return true;
}
This works on all three signed-integer representations allowed by ISO C: two's complement, one's complement, and even sign/magnitude. All we care about is that any positive signed integer (MSB cleared) compares below anything with the MSB set.
This actually compiles to really nice code with clang for x86, as you can see on the Godbolt Compiler Explorer. gcc 5.3 unfortunately does a much worse job.
Hack using "magic constant" -1:
enum
{
INVALID_POSITIVE = -1
};
int lowestPositive(int a, int b)
{
return (a>=0 ? ( b>=0 ? (b > a ? a : b ) : INVALID_POSITIVE ) : INVALID_POSITIVE );
}
This makes no assumptions about the numbers being positive.
Pseudocode because I have no compiler on hand:
bool lowestPositive(int u0, int u1, int& result)
{
// 0 if neither is positive, 1 if only u0 is, 2 if only u1 is, 3 if both are
switch ((u0 > 0 ? 1 : 0) + (u1 > 0 ? 2 : 0)) {
case 0:
return false; // note that this leaves result unset
case 1:
result = u0;
return true;
case 2:
result = u1;
return true;
case 3:
result = (u0 < u1 ? u0 : u1);
return true;
default: // cannot happen: the sum is always in 0..3
return false;
}
}
This is compact without a lot of if statements, but relies on the ternary "? :" operator, which is just a compact if/then/else. (true ? "yes" : "no") yields "yes", and (false ? "yes" : "no") yields "no".
In a normal switch statement after every case you should have a break;, to exit the switch. In this case we have a return statement, so we're exiting the entire function.
With all due respect, your problem may be that the English phrase used to describe the problem really does hide some complexity (or at least some unresolved questions). In my experience, this is a common source of bugs and/or unfulfilled expectations in the "real world" as well. Here are some of the issues I observed:
Some programmers use a naming convention in which a leading u implies unsigned, but you didn't state explicitly whether your "numbers" are unsigned or signed (or, for that matter, whether they are even supposed to be integral!)
I suspect that all of us who read it assumed that if one argument is positive and the other is not, then the (only) positive argument value is the correct response, but that is not explicitly stated.
The description also doesn't define the required behavior if both values are non-positive.
Finally, some of the responses offered prior to this post seem to imply that the responder thought (mistakenly) that 0 is positive! A more specific requirements statement might help prevent any misunderstanding (or make it clear that the issue of zero hadn't been thought out completely when the requirement was written).
I'm not trying to be overly critical; I'm just suggesting that a more precisely-written requirement will probably help, and will probably also make it clear whether some of the complexity you're concerned about in the implementation is really implicit in the nature of the problem.
Three lines with the use (abuse?) of the ternary operator
int *smallest_positive(int *u1, int *u2) {
if (*u1 < 0) return *u2 >= 0 ? u2 : NULL;
if (*u2 < 0) return u1;
return *u1 < *u2 ? u1 : u2;
}
Don't know about efficiency or what to do if both u1 and u2 are negative. I opted to return NULL (which has to be checked in the caller); a return of a pointer to a static -1 might be more useful.
Edited to reflect the changes in the original question :)
bool smallest_positive(int u1, int u2, int& result) {
if (u1 < 0) {
if (u2 < 0) return false; /* result unchanged */
result = u2;
} else {
if (u2 < 0) result = u1;
else result = u1 < u2 ? u1 : u2;
}
return true;
}
uint lowestPos(uint a, uint b) { return (a < b ? a : b); }
You are looking for the smallest positive, so it would be wise to accept only positive values in that case. You don't have to catch the negative-values problem in your function; you should solve it at an earlier point, in the caller. For the same reason I left the boolean out.
A precondition is that they are not equal, so you would use it like this:
if (a == b)
cout << "equal";
else
{
uint lowest = lowestPos(a, b);
cout << (lowest == a ? "a is lowest" : "b is lowest");
}
You can introduce const when you want to prevent changes, or references if you want to change the result. Under normal conditions the compiler will optimize and even inline the function.
No cleverness, reasonable clarity, works for ints and floats:
template<class T>
inline
bool LowestPositive( const T a, const T b, T* result ) {
const bool b_is_pos = b > 0;
if( a > 0 && ( !b_is_pos || a < b ) ) {
*result = a;
return true;
}
if( b_is_pos ) {
*result = b;
return true;
}
return false;
}
Note that 0 (zero) is not a positive number.
OP asks for dealing with numbers (I interpret this as ints and floats).
Only dereference result pointer if there is a positive result (performance)
Only test a and b for positiveness once (performance -- not sure if such a test is expensive?)
Note also that the accepted answer (by tvanfosson) is wrong. It fails if a is positive and b is negative (saying that "neither is positive"). (This is the only reason I add a separate answer -- I don't have reputation enough to add comments.)
My idea is based on using min and max, and on sorting the result into three cases:
min <= 0 and max <= 0
min <= 0 and max > 0
min > 0 and max > 0
The best thing is that it doesn't look too complicated.
Code:
bool lowestPositive(int a, int b, int& result)
{
int min = (a < b) ? a : b;
int max = (a > b) ? a : b;
bool smin = min > 0;
bool smax = max > 0;
if(!smax) return false;
if(smin) result = min;
else result = max;
return true;
}
After my first post was rejected, allow me to suggest that you are prematurely optimizing the problem and shouldn't worry about having lots of if statements. The code you're writing naturally requires multiple 'if' statements. Whether they are expressed with the ternary operator (A ? B : C) or classic if blocks, the execution time is the same: the compiler is going to optimize almost all of the code posted into very nearly the same logic.
Concern yourself with the readability and reliability of your code rather than trying to outwit your future self or anyone else who reads the code. Every solution posted is O(1) from what I can tell, that is, every single solution will contribute insignificantly to the performance of your code.
I would like to suggest that this post be tagged "premature optimization," the poster is not looking for elegant code.