I'm working with a protocol where I don't have control of the input types, but I need to compute the difference of two 64-bit unsigned integers (currently typed as std::uint64_t), and that difference might be negative or positive. I don't want to do this:
uint64_t a{1};
uint64_t b{2};
int64_t x = a - b; // -1; correct, but what if a and b were /enormous/?
So I was looking at Boost's safe_numerics here. The large-values case is handled as I would like:
boost::safe_numerics::safe<uint64_t> a{UINT64_MAX};
boost::safe_numerics::safe<uint64_t> b{1};
boost::safe_numerics::safe<int64_t> x = a - b;
// ^^ Throws "converted unsigned value too large: positive overflow error"
Great! But ... they're a little too safe:
boost::safe_numerics::safe<uint64_t> a{1}; //UINT64_MAX;
boost::safe_numerics::safe<uint64_t> b{2};
boost::safe_numerics::safe<int64_t> x = a - b;
// ^^ Throws "subtraction result cannot be negative: negative overflow error"
// ... even though `x` is signed
I have a suspicion that it's a - b that actually throws, not the assignment, but I've tried every kind of cast in the book to get a - b into a safe, signed integer, with no joy.
There are some inelegant ways to deal with this, like comparing a and b so I always subtract the smaller from the larger. Or I could do a lot of casting with boost::numeric_cast, or old-school range checking. Or...god forbid...I just throw myself when a or b exceeds 63 bits. But all of that is a bit lame.
But my real question is: Why does Boost detect a negative overflow in the final example above? Am I using safe_numerics incorrectly?
I'm targeting C++17 with GCC on a 64-bit system and using Boost 1.71.
The behavior I was looking for is actually implemented in boost::safe_numerics::checked_result:
https://www.boost.org/doc/libs/develop/libs/safe_numerics/doc/html/checked_result.html
checked::subtract allows a negative result when the difference of two unsigned integers is negative (and is being stored in a signed integer of adequate size), but it throws when the result does not fit. For example:
using namespace std;
using namespace boost::safe_numerics;
safe<uint64_t> a{2};
safe<uint64_t> b{1};
checked_result<int64_t> x0 = checked::subtract<int64_t>(b, a);
assert(x0 == -1);
checked_result<int64_t> x1 = checked::subtract<int64_t>(a, b);
assert(x1 == 1);
a = UINT64_MAX;
checked_result<int64_t> x2 = checked::subtract<int64_t>(a, b); // throws
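Here's a minimal sketch of wrapping that into a helper, in case it's useful. The function name sub_u64 is mine, and the exception() member and header paths are as I read them in the Boost 1.71 docs:
#include <cstdint>
#include <stdexcept>
#include <boost/safe_numerics/checked_integer.hpp>
#include <boost/safe_numerics/checked_result.hpp>
// Difference of two uint64_t values: negative results are fine,
// but a difference that doesn't fit in int64_t throws.
std::int64_t sub_u64(std::uint64_t a, std::uint64_t b)
{
    namespace sn = boost::safe_numerics;
    const sn::checked_result<std::int64_t> r =
        sn::checked::subtract<std::int64_t>(a, b);
    if (r.exception())
        throw std::range_error("difference does not fit in int64_t");
    return static_cast<std::int64_t>(r);
}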
Related
Consider the following program:
#include <iostream>
int main()
{
    unsigned int a = 3;
    unsigned int b = 7;
    std::cout << (a - b) << std::endl; // underflow here!
    return 0;
}
In the line starting with std::cout an underflow happens: because a is less than b, a-b is less than 0, but since a and b are unsigned, so is a-b.
Is there a compiler flag (for G++) that gives me a warning when I try to calculate the difference of two unsigned integers?
Now, one could argue that an overflow/underflow can happen in any calculation using any operator. But I think it is more dangerous to apply operator - to unsigned ints, because with unsigned integers this error can happen with quite low (to me: "more common") numbers.
A (static analysis) tool that finds such things would also be great but I much prefer a compiler flag and warning.
GCC does not (afaict) support it, but Clang's UBSanitizer has the following option [emphasis mine]:
-fsanitize=unsigned-integer-overflow: Unsigned integer overflow, where the result of an unsigned integer computation cannot be represented in its type. Unlike signed integer overflow, this is not undefined behavior, but it is often unintentional. This sanitizer does not check for lossy implicit conversions performed before such a computation
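For example, compiling the program above (saved here as main.cpp, a name of my choosing) with Clang; the exact diagnostic wording is approximate and may vary by version:
clang++ -fsanitize=unsigned-integer-overflow main.cpp
./a.out
# runtime error: unsigned integer overflow:
# 3 - 7 cannot be represented in type 'unsigned int'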
Related
For example:
short a = 10;
int b = a & 0xffff;
Similarly, if I want to convert from int to short, how do I do it using bitwise operators? I don't want to use the usual cast with (short).
If you want sign extension:
int b = a;
If you don't (i.e. negative values of a will yield (weird) positive values of b)
// note that Standard Conversion of shorts to int happens before &
int b = a & std::numeric_limits<unsigned short>::max();
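To see the difference concretely, here's a small demo (assuming the usual 16-bit short and 32-bit int):
#include <iostream>
#include <limits>
int main()
{
    short a = -1;
    int sign_extended = a;  // -1: the sign bit is replicated
    int zero_extended = a & std::numeric_limits<unsigned short>::max();  // 65535
    std::cout << sign_extended << " " << zero_extended << "\n";
}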
Doing bit operations on signed types may not be a good idea and can lead to surprising results: Are the results of bitwise operations on signed integers defined? Why do you need bit operations?
short int2short(int x) {
    if (x > std::numeric_limits<short>::max()) {
        // what to do now? Throw exception, return default value ...
    }
    else if (x < std::numeric_limits<short>::min()) {
        // what to do now? Throw exception, return default value ...
    }
    else {
        return static_cast<short>(x);
    }
}
This could be generalized into a template function and also have policies for the error cases.
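A minimal sketch of that generalization for signed integer types (the names checked_cast and ThrowPolicy are illustrative, not from any library; mixed signedness would need more care):
#include <limits>
#include <stdexcept>
struct ThrowPolicy {
    template <typename To, typename From>
    static To handle(From) { throw std::range_error("value out of range"); }
};
template <typename To, typename Policy = ThrowPolicy, typename From>
To checked_cast(From x)
{
    if (x > std::numeric_limits<To>::max() || x < std::numeric_limits<To>::min())
        return Policy::template handle<To>(x);  // delegate out-of-range handling
    return static_cast<To>(x);
}
// usage: short s = checked_cast<short>(70000);  // throws std::range_error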
Why not use (short)? That's the easiest way and gets you what you want.
Unless it's an interview problem, in which case you need to assume how many bits a short and an int contain. If the number is positive, just use bitwise AND. If the number is negative, flip it to a positive number, do the bitwise AND, and then set the highest bit of the result to 1.
Related
Suppose I have these two types:
typedef unsigned long long uint64;
typedef signed long long sint64;
And I have these variables:
uint64 a = ...;
uint64 b = ...;
sint64 c;
I want to subtract b from a and assign the result to c. Clearly, if the absolute value of the difference is greater than 2^63 then it will wrap (or be undefined), which is OK. But for cases where the absolute difference is less than 2^63 I want the result to be correct.
Of the following three ways:
c = a - b; // sign conversion warning ignored
c = sint64(a - b);
c = sint64(a) - sint64(b);
Which of the them are guaranteed to work by the standard? (and why/how?)
None of the three work. The first fails if the difference is negative (no matter the absolute value), the second is the same as the first, and the third fails if either operand is too large.
It's impossible to implement without a branch.
c = b < a ? a - b : -static_cast<sint64>(b - a);
Fundamentally, unsigned types use modulo arithmetic without any kind of sign bit. They don't know they wrapped around, and the language spec doesn't identify wraparound with negative numbers. Also, assigning a value outside the range of a signed integral variable results in an implementation-defined, potentially nonsense result (integral overflow).
Consider a machine with no hardware to convert between native negative integers and two's complement. It can perform two's complement subtraction using bitwise negation and native two's complement addition, though. (Bizarre, maybe, but that is what C and C++ currently require.) The language leaves it up to the programmer, then, to convert the negative values. The only way to do that is to negate a positive value, which requires that the computed difference be positive. So…
The best solution is to avoid any attempt to represent a negative number as a large positive number in the first place.
EDIT: I forgot the cast before, which would have produced a large unsigned value, equivalent to the other solutions!
Potatoswatter's answer is probably the most pragmatic solution, but "impossible to implement without a branch" is like a red rag to a bull for me. If your hypothetical system implements undefined overflow/cast operations like that, my hypothetical system implements branches by killing puppies.
So I'm not completely familiar with what the standard(s) would say, but how about this:
sint64 c, d, r;
c = a >> 1;       // top 63 bits of a
d = b >> 1;       // top 63 bits of b
r = (c - d) * 2;  // difference of the top bits, scaled back up
c = a & 1;        // low bit of a
d = b & 1;        // low bit of b
r += c - d;       // correction for the low bits
I've written it in a fairly verbose fashion so the individual operations are clear, but have left some implicit casts. Is anything there undefined?
Steve Jessop rightly points out that this does fail in the case where the difference is exactly 2^63-1, as the multiply overflows before the 1 is subtracted.
So here's an even uglier version which should cover all underflow/overflow conditions:
sint64 c, d, r, ov;
c = a >> 1;            // top 63 bits of a
d = b >> 1;            // top 63 bits of b
ov = a >> 63;          // high bit of a, used as a bias
r = (c - d - ov) * 2;  // difference of the top bits, biased to avoid overflow
c = a & 1;             // low bit of a
d = b & 1;             // low bit of b
r += ov + ov + c - d;  // undo the bias and correct for the low bits
"if the absolute value of the difference is greater than 2^63 then it will wrap (or be undefined) which is ok. But for cases where the absolute difference is less than 2^63 I want the result to be correct."
Then all three of the notations you suggest work, assuming a conventional architecture. The notable difference is that the third one, sint64(a) - sint64(b), invokes undefined behavior when the difference is not representable, whereas the first two wrap around: unsigned arithmetic overflow is guaranteed to wrap, and the conversion from unsigned to signed is implementation-defined but wraps on conventional two's complement implementations (and is required to wrap since C++20), whereas signed arithmetic overflow is undefined.
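A quick sanity check of the first form under those assumptions:
#include <cassert>
#include <cstdint>
int main()
{
    std::uint64_t a = 1, b = 2;
    // a - b wraps to UINT64_MAX; converting that to a signed 64-bit type
    // yields -1 on a two's complement implementation.
    std::int64_t c = static_cast<std::int64_t>(a - b);
    assert(c == -1);
}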
Related
Possible duplicate: Best way to detect integer overflow in C/C++
I am writing a function in C, but the question is generic. The function takes three integers and returns some information about them.
The problem I suspect is that the integers can be at their maximum values, which can cause overflow.
For example: if I pass a as the maximum possible value and b as anything from 1 up to the maximum, will the expression (a+b)>c in the if condition cause overflow? If so, how do I handle it?
My solution was to keep a long integer as a temporary variable for the value of a+b and use that in the expression, but that seems like a dirty way to do it.
Refer to this snippet:
int
triangle_type(int a, int b, int c) {
    if (!(((a+b) > c) && ((b+c) > a) && ((a+c) > b))) {
        return -1;  // not a valid triangle
    }
    // ... (rest of the classification elided)
}
On current processors, there is no real signaling overflow on integers. So on a 32-bit processor, integer arithmetic is done modulo 2^32 at the bit level. When you add two ints and some "overflow" happens, an overflow (or carry) bit is set in some status register (and the arithmetic operation is done modulo 2^32). If (as is usually the case) no machine instruction tests that overflow status bit, nothing happens.
So the control flow won't change because of an overflow (it usually will change on division by zero, e.g. with a SIGFPE signal).
If you want to portably catch the overflow case in C, you could test e.g. that the sum of two positive ints stays positive (if it is negative, an overflow happened).
You could also be interested in bignums, e.g. the GMP library. You could also use <stdint.h> and use int32_t and int64_t carefully, with explicit casts. Lastly, you could (as most coders do) choose to ignore the issue.
NB: As Jonathan noticed, you may run into the undefined-behavior or unspecified-behavior cases. If you really care, use bignums. However, you may choose not to care at all.
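Another portable option (my sketch, not part of the answer above): rearrange the comparison so the overflowing sum is never computed. This assumes the operands are non-negative, as triangle sides would be, and the function name is mine:
#include <limits.h>
/* Returns 1 if a + b > c, without ever computing an overflowing sum.
   Assumes a, b and c are non-negative. */
int sum_greater_than(int a, int b, int c)
{
    if (a > INT_MAX - b)  /* a + b would exceed INT_MAX, hence certainly > c */
        return 1;
    return (a + b) > c;
}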
You could do something like this
// returns true if a+b > c
inline int safe_sum_greater(int a, int b, int c) {
    int a1 = a / 4; int a2 = a % 4;
    int b1 = b / 4; int b2 = b % 4;
    int c1 = c / 4; int c2 = c % 4;
    int s2 = a2 + b2;
    int s1 = a1 + b1 + s2 / 4;
    s2 = s2 % 4;
    return (s1 > c1) || ((s1 == c1) && (s2 > c2));
}
Performance won't be bad, since division and remainder by 4 compile down to cheap shift and mask operations (at least for non-negative values).
I have not thought about this extensively for negative numbers, so use with care.
Related
I have an 8-character string representing a hexadecimal number and I need to convert it to an int. This conversion has to preserve the bit pattern for strings "80000000" and higher, i.e., those numbers should come out negative. Unfortunately, the naive solution:
int hex_str_to_int(const string hexStr)
{
    stringstream strm;
    strm << hex << hexStr;
    unsigned int val = 0;
    strm >> val;
    return static_cast<int>(val);
}
doesn't work for my compiler if val > INT_MAX (the returned value is 0). Changing the type of val to int also results in a 0 for the larger numbers. I've tried several different solutions from various answers here on SO and haven't been successful yet.
Here's what I do know:
I'm using HP's C++ compiler on OpenVMS (using, I believe, an Itanium processor).
sizeof(int) will be at least 4 on every architecture my code will run on.
Casting from a number > INT_MAX to int is implementation-defined. On my machine, it usually results in a 0 but interestingly casting from long to int results in INT_MAX when the value is too big.
This is surprisingly difficult to do correctly, or at least it has been for me. Does anyone know of a portable solution to this?
Update:
Changing static_cast to reinterpret_cast results in a compiler error. A comment prompted me to try a C-style cast: return (int)val in the code above, and it worked. On this machine. Will that still be safe on other architectures?
Quoting the C++03 standard, §4.7/3 (Integral Conversions):
If the destination type is signed, the value is unchanged if it can be represented in the destination type (and bit-field width); otherwise, the value is implementation-defined.
Because the result is implementation-defined, by definition it is impossible for there to be a truly portable solution.
While there are ways to do this using casts and conversions, most rely on undefined behavior that happens to have well-defined results on some machines / with some compilers. Instead of relying on undefined behavior, copy the data:
#include <cstring>  // for std::memcpy
int signed_val;
std::memcpy(&signed_val, &val, sizeof(int));
return signed_val;
You can negate an unsigned twos-complement number by taking the complement and adding one. So let's do that for negatives:
if (val < 0x80000000) // positive values need no conversion
    return val;
if (val == 0x80000000) // complement-and-add would overflow, so special-case this
    return INT_MIN;    // note: -0x80000000 would itself be an unsigned expression
else
    return -(int)(~val + 1);
This assumes that your ints are represented with 32-bit twos-complement representation (or have similar range). It does not rely on any undefined behavior related to signed integer overflow (note that the behavior of unsigned integer overflow is well-defined - although that should not happen here either!).
Note that if your ints are not 32-bit, things get more complex. You may need to use something like ~(~0U >> 1) instead of 0x80000000. Further, if your ints are not twos-complement, you may have overflow issues with certain values (for example, on a ones-complement machine, -0x80000000 cannot be represented in a 32-bit signed integer). However, non-twos-complement machines are very rare today, so this is unlikely to be a problem.
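As a quick sanity check of that expression (assuming a 32-bit unsigned int, so the hex constant applies):
// ~0U is all ones; shifting right by one clears the top bit; complementing
// then leaves only the top bit set.
static_assert(~(~0U >> 1) == 0x80000000U, "high-bit mask");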
Here's another solution that worked for me:
if (val <= INT_MAX) {
    return static_cast<int>(val);
}
else {
    int ret = static_cast<int>(val & ~INT_MIN);
    return ret | INT_MIN;
}
If I mask off the high bit, I avoid overflow when casting. I can then OR it back safely.
C++20 will have std::bit_cast that copies bits verbatim:
#include <bit>
#include <cassert>
#include <iostream>
int main()
{
    int i = -42;
    auto u = std::bit_cast<unsigned>(i);
    // Prints 4294967254 on two's complement platforms where int is 32 bits
    std::cout << u << "\n";
    auto roundtripped = std::bit_cast<int>(u);
    assert(roundtripped == i);
    std::cout << roundtripped << "\n"; // Prints -42
    return 0;
}
cppreference shows an example of how one can implement their own bit_cast in terms of memcpy (under Notes).
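For reference, here's a sketch along the lines of that note (non-constexpr, and the name bit_cast_compat is mine):
#include <cstring>
#include <type_traits>
template <class To, class From>
To bit_cast_compat(const From& src) noexcept
{
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable<From>::value &&
                  std::is_trivially_copyable<To>::value,
                  "types must be trivially copyable");
    To dst;  // To must also be default-constructible in this simplified version
    std::memcpy(&dst, &src, sizeof(To));
    return dst;
}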
While OpenVMS is not likely to gain C++20 support anytime soon, I hope this answer helps someone arriving at the same question via internet search.
unsigned int u = ~0U;
int s = *reinterpret_cast<int*>(&u); // -1
Contrariwise:
int s = -1;
unsigned int u = *reinterpret_cast<unsigned int*>(&s); // all ones
(This pun doesn't violate strict aliasing: a signed integer type may be aliased by its corresponding unsigned type, and vice versa.)