Mapping signed integer ranges to unsigned - c++

I'm facing a problem where signed integers should be converted to unsigneds, preserving their range and order.
Given the following definition:
#include <limits>
#define MIN(X) std::numeric_limits<X>::min()
#define MAX(X) std::numeric_limits<X>::max()
What is the fastest and correct way to map the signed range [MIN(T), MAX(T)] to the unsigned range [0, MAX(U)]?
where:
T is a signed integer type
U is an unsigned integer type
sizeof(T) == sizeof(U)
I tried various bit twiddling and numeric methods to come up with a solution, without success.

unsigned int signedToUnsigned(signed int s) {
    unsigned int u = 1U + std::numeric_limits<int>::max();
    u += s;
    return u;
}
This adds signed_max + 1 to the signed int, ensuring [MIN(int), MAX(int)] is mapped to [0, MAX(unsigned int)].
Why does this answer work and map correctly?
When you add a signed integral number to an unsigned one, the signed number is converted to the unsigned type. From section 4.7 [conv.integral]:
If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type). [ Note: In a two's complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). — end note ]

Related

c++ safeness of code with implicit conversion between signed and unsigned

According to the rules on implicit conversions between signed and unsigned integer types, when summing an unsigned int with an int, the signed int is first converted to an unsigned int.
Consider, for example, the following minimal program:
#include <iostream>
int main()
{
    unsigned int n = 2;
    int x = -1;
    std::cout << n + x << std::endl;
    return 0;
}
The output of the program is, nevertheless, 1 as expected: x is first converted to an unsigned int, and the sum with n wraps around modulo 2^n, giving the "right" answer.
In a code like the previous one, if I know for sure that n + x is positive, can I assume that the sum of unsigned int n and int x gives the expected value?
Yes.
First, the signed value is converted to unsigned, using modulo arithmetic:
If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type).
Then the two unsigned values are added using modulo arithmetic:
Unsigned integers shall obey the laws of arithmetic modulo 2^n where n is the number of bits in the value representation of that particular size of integer.
This means that you'll get the expected answer.
Even if the result would be negative in the mathematical sense, the result in C++ is a number congruent to that negative value modulo 2^n.
Note that I've supposed here that you add two same-sized integers.
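To make the modulo-equality concrete, here is a small variation of the program above where the mathematical sum is negative; the figures in the comments assume a 32-bit unsigned int:
#include <iostream>

int main()
{
    unsigned int n = 2;
    int x = -5;
    // Mathematically 2 + (-5) = -3; the unsigned result is congruent to -3
    // modulo 2^32, i.e. 4294967293.
    std::cout << n + x << std::endl;
    return 0;
}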
I think you can be sure, and it is not implementation-defined, although this statement requires some interpretation of the standard when it comes to systems that do not use two's complement for representing negative values.
First, let's state the things that are clear: unsigned integrals do not overflow but wrap around modulo 2^n (cf. this online C++ standard draft):
6.7.1 Fundamental types
(7) Unsigned integers shall obey the laws of arithmetic modulo 2^n where n is the number of bits in the value representation of that particular size of integer.
So it's just a matter of whether a negative value nv is converted correctly into an unsigned integral bit pattern nv(conv) such that x + nv(conv) will always be the same as x - nv. For the case of a system using two's complement, things are clear, since the two's complement is actually designed such that this arithmetic works immediately.
For systems using other representations of negative values, we'll have to read the standard carefully:
7.8 Integral conversions
(2) If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type). [ Note: In a two's complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). — end note ]
As the note explicitly says that in a two's complement representation there is no change in the bit pattern, we may assume that on systems other than two's complement a real conversion takes place, such that x + nv(conv) == x - nv.
So due to 7.8 (2), I'd say that your assumption is valid.

Negative size_t

Is it well-specified (for unsigned types in general), that:
static_assert(-std::size_t{1} == ~std::size_t{0}, "!");
I just looked into libstdc++'s std::align implementation and note using std::size_t negation:
inline void*
align(size_t __align, size_t __size, void*& __ptr, size_t& __space) noexcept
{
  const auto __intptr = reinterpret_cast<uintptr_t>(__ptr);
  const auto __aligned = (__intptr - 1u + __align) & -__align;
  const auto __diff = __aligned - __intptr;
  if ((__size + __diff) > __space)
    return nullptr;
  else
    {
      __space -= __diff;
      return __ptr = reinterpret_cast<void*>(__aligned);
    }
}
Unsigned integer types are defined to wrap around, and the highest possible value representable in an unsigned integer type is the number with all bits set to one - so yes.
As cppreference states (arithmetic operators / overflow):
Unsigned integer arithmetic is always performed modulo 2^n where n is the number of bits in that particular integer. E.g. for unsigned int, adding one to UINT_MAX gives 0, and subtracting one from 0 gives UINT_MAX.
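Expressed directly in code, the quoted wraparound behaviour can even be checked at compile time:
#include <climits>

static_assert(UINT_MAX + 1u == 0u, "adding one to UINT_MAX wraps to 0");
static_assert(0u - 1u == UINT_MAX, "subtracting one from 0 wraps to UINT_MAX");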
Related: Is it safe to use negative integers with size_t?
Is it well-specified (for unsigned types in general), that:
static_assert(-std::size_t{1} == ~std::size_t{0}, "!");
No, it is not.
For calculations using unsigned types, the assertion must hold. However, this assertion is not guaranteed to use unsigned types. Unsigned types narrower than int would be promoted to signed int or unsigned int (depending on the types' ranges) before - or ~ is applied. If it is promoted to signed int, and signed int does not use two's complement for representing negative values, the assertion can fail.
libstdc++'s code, as shown, does not perform any arithmetic in any unsigned type narrower than int, though. The 1u in __aligned ensures that each of the calculations uses unsigned int or size_t, whichever is larger. This applies even to the subtraction in __space -= __diff.
Unsigned types at least as wide as unsigned int do not undergo integer promotions, so arithmetic and logical operations on them are applied in their own type, for which Johan Lundberg's answer applies: that's specified to be performed modulo 2^N.
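The promotion issue is easy to demonstrate with decltype. This sketch assumes int is wider than 16 bits and size_t is at least as wide as unsigned int, which holds on mainstream platforms:
#include <cstddef>
#include <cstdint>
#include <type_traits>

// uint16_t is narrower than int, so it is promoted before unary minus:
static_assert(std::is_same<decltype(-std::uint16_t{1}), int>::value,
              "narrow unsigned operands are negated as int");
// size_t undergoes no promotion, so negation stays in the unsigned type:
static_assert(std::is_same<decltype(-std::size_t{1}), std::size_t>::value,
              "size_t is negated in its own type");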

How to take twos complement of a byte in c++?

I am looking at some C++ code and I see:
byte b = someByteValue;
// take twos complement
byte TwosComplement = -b;
Is this code taking the twos complement of b? If not, What is it doing?
This code definitely does compute the twos-complement of an 8-bit binary number, on any implementation where stdint.h defines uint8_t:
#include <stdint.h>
uint8_t twos_complement(uint8_t val)
{
    return -(unsigned int)val;
}
That is because, if uint8_t is available, it must be an unsigned type that is exactly 8 bits wide. The conversion to unsigned int is necessary because uint8_t is definitely narrower than int. Without the conversion, the value will be promoted to int before it is negated, so, if you're on a non-twos-complement machine, it will not take the twos-complement.
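For illustration, a couple of example values for the function above (the unary + promotes the uint8_t result so it prints as a number rather than a character):
#include <iostream>

int main()
{
    std::cout << +twos_complement(1) << '\n';    // prints 255 (0xFF)
    std::cout << +twos_complement(0x80) << '\n'; // prints 128 (0x80 negates to itself)
}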
More generally, this code computes the twos-complement of a value with any unsigned type (using C++ constructs for illustration - the behavior of unary minus is the same in both languages, assuming no user-defined overloads):
#include <cstdint>
#include <type_traits>

template <typename T>
T twos_complement(T val,
                  // "allow this template to be instantiated only for unsigned types"
                  typename std::enable_if<std::is_unsigned<T>::value>::type* = 0)
{
    return -std::uintmax_t(val);
}
because unary minus is defined to take the twos-complement when applied to unsigned types. We still need a cast to an unsigned type that is no narrower than int, but now we need it to be at least as wide as any possible T, hence uintmax_t.
However, unary minus does not necessarily compute the twos-complement of a value whose type is signed, because C (and C++) still explicitly allow implementations based on CPUs that don't use twos-complement for signed quantities. As far as I know, no such CPU has been manufactured in at least 20 years, so the continued provision for them is kind of silly, but there it is. If you want to compute the twos-complement of a value even if its type happens to be signed, you have to do this: (C++ again)
#include <cstdint>
#include <type_traits>

template <typename T>
T twos_complement(T val)
{
    typedef typename std::make_unsigned<T>::type U;
    return T(-std::uintmax_t(U(val)));
}
i.e. convert to the corresponding unsigned type, then to uintmax_t, then apply unary minus, then back-convert to the possibly-signed type. (The cast to U is required to make sure the value is zero- rather than sign-extended from its natural width.)
(If you find yourself doing this, though, stop and change the types in question to unsigned instead. Your future self will thank you.)
The correct expression will look like this:
byte TwosComplement = ~b + 1;
Note: provided that byte is defined as unsigned char
On a two's complement machine negation computes the two's complement, yes.
On the Unisys something-something, hopefully now dead and buried (but was still extant a few years ago), no for a signed type.
C and C++ support two's complement, one's complement and sign-and-magnitude representations of signed integers, and only with two's complement does negation do a two's complement.
With byte as an unsigned type, negation plus conversion to byte produces the two's complement bitpattern, regardless of integer representation, because conversion to unsigned as well as unsigned arithmetic is modulo 2^n, where n is the number of value representation bits.
That is, the resulting value after assigning or initializing with -x is 2^n - x, which is the two's complement of x.
This does not mean that the negation itself necessarily computes the two's complement bitpattern. To understand this, note that with byte defined as unsigned char, and with sizeof(int) > 1, the byte value is promoted to int before the negation, i.e. the negation operation is done with a signed type. But converting the resulting negative value to unsigned byte creates the two's complement bitpattern by definition and the C++ guarantee of modulo arithmetic and conversion to unsigned type.
The usefulness of 2's complement form follows from 2^n - x = 1 + ((2^n - 1) - x), where the last parenthesis is an all-ones bitpattern minus x, i.e. a simple bitwise inversion of x.
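That identity can be checked directly at compile time for unsigned int:
// 2^n - x == 1 + ((2^n - 1) - x): unsigned negation is inversion plus one.
static_assert(-42u == ~42u + 1u, "-x equals ~x + 1 for unsigned x");
static_assert(-0u == ~0u + 1u, "the identity also holds at zero");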
twos_complement code for a byte represented as an array of bits:
#include <iostream>

int main()
{
    int byte[] = {1, 0, 1, 1, 1, 1, 1, 1};
    if (byte[0] != 0) {
        // invert every bit...
        for (int i = 0; i < 8; i++) {
            if (byte[i] == 1)
                byte[i] = 0;
            else
                byte[i] = 1;
        }
        // ...then add one, propagating the carry from the least significant bit
        for (int j = 7; j >= 0; j--) {
            if (byte[j] == 0) {
                byte[j] = 1;
                break;
            }
            else {
                byte[j] = 0;
            }
        }
    }
    for (int i = 0; i < 8; i++)
        std::cout << byte[i];
    std::cout << std::endl;
}

C++ signed integer conversion to unsigned with more bits

I wonder about type conversion from a smaller signed integer to a larger unsigned integer. It appears that the compiler first converts the signed integer to a signed integer of the same size as the destination, then to the unsigned integer.
Regard the following C++ code:
#include <assert.h>
#include <iostream>

typedef int sint;
typedef unsigned __int64 luint;

int main(int, char**) {
    assert(sizeof(luint) > sizeof(sint));
    sint i = -10;
    luint j = i;
    std::cout << std::hex << j;
}
Under Visual C++ this yields: fffffffffffffff6.
This is as I like it. Can I be sure that all compilers will behave this way? If the signed integer were converted to unsigned first and then to the new size, the result would have been fffffff6.
Signed to unsigned conversion uses modulo 2^n arithmetic. From the C++11 Standard, section 4.7 Integral conversions [conv.integral] (§4.7/2):
If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type).
So j takes the value 2^64 − 10, which is 0xfffffffffffffff6.
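That value can also be verified at compile time; this check assumes a 64-bit unsigned long long, as on the platforms discussed here:
static_assert(static_cast<unsigned long long>(-10) == 0xFFFFFFFFFFFFFFF6ULL,
              "-10 converts to 2^64 - 10 in a 64-bit unsigned type");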

Is there a safe way to get the unsigned absolute value of a signed integer, without triggering overflow?

Consider a typical absolute value function (where for the sake of argument the integral type of maximum size is long):
unsigned long abs(long input);
A naive implementation of this might look something like:
unsigned long abs(long input)
{
    if (input >= 0)
    {
        // input is positive
        // We know this is safe, because the maximum positive signed
        // integer is always less than the maximum positive unsigned one
        return static_cast<unsigned long>(input);
    }
    else
    {
        return static_cast<unsigned long>(-input); // ut oh...
    }
}
This code triggers undefined behavior, because the negation of input may overflow, and triggering signed integer overflow is undefined behavior. For instance, on 2s complement machines, the absolute value of std::numeric_limits<long>::min() will be 1 greater than std::numeric_limits<long>::max().
What can a library author do to work around this problem?
One can cast to the unsigned variant first to avoid any undefined behavior:
unsigned long uabs(long input)
{
    if (input >= 0)
    {
        // input is positive
        return static_cast<unsigned long>(input);
    }
    else
    {
        return -static_cast<unsigned long>(input); // read on...
    }
}
In the above code, we invoke two well-defined operations. Converting the signed integer to the unsigned one is well defined by N3485 4.7 [conv.integral]/2:
If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type). [ Note: In a two’s complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). — end note ]
This basically says that when making the specific conversion of going from signed to unsigned, one can assume unsigned-style wraparound.
The negation of the unsigned integer is well defined by 5.3.1 [expr.unary.op]/8:
The negative of an unsigned quantity is computed by subtracting its value from 2^n , where n is the number of bits in the promoted operand.
These two requirements effectively force implementations to operate like a 2s complement machine would, even if the underlying machine is a 1s complement or signed magnitude machine.
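A worked check of the hardest case, input == LONG_MIN: the conversion yields 2^(n-1), and unsigned negation then yields 2^n − 2^(n-1) = 2^(n-1), which is exactly LONG_MAX + 1. Assuming a two's complement long (as on all mainstream platforms), this can be verified at compile time:
#include <climits>

static_assert(-static_cast<unsigned long>(LONG_MIN)
                  == static_cast<unsigned long>(LONG_MAX) + 1UL,
              "uabs(LONG_MIN) is LONG_MAX + 1");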
A generalized C++11 version that returns the unsigned version of an integral type:
#include <type_traits>

template <typename T>
constexpr typename std::make_unsigned<T>::type uabs(T x)
{
    // compare signed x; negate after conversion to unsigned
    // (a single return statement keeps this a valid C++11 constexpr function)
    return (x < 0) ? -static_cast<typename std::make_unsigned<T>::type>(x)
                   : static_cast<typename std::make_unsigned<T>::type>(x);
}
This compiles on the Godbolt compiler explorer, with a test case showing that gcc -O3 -fsanitize=undefined finds no UB in uabs(std::numeric_limits<long>::min()); after constant-propagation, but does in std::abs().
Further template stuff should be possible to make a version that would return the unsigned version of integral types, but return T for floating-point types, if you want a general-purpose replacement for std::abs.
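For example, a compile-time test of the template above on the INT_MIN case, the one a naive implementation gets wrong:
#include <limits>

static_assert(uabs(-1) == 1u, "the simple case");
static_assert(uabs(std::numeric_limits<int>::min())
                  == static_cast<unsigned>(std::numeric_limits<int>::max()) + 1u,
              "INT_MIN maps to INT_MAX + 1 without overflow");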
Just add one if negative.
unsigned long absolute_value(long x) {
    if (x >= 0) return (unsigned long)x;
    x = -(x + 1);                 // representable for every negative x: no overflow
    return (unsigned long)x + 1;  // add the 1 back in unsigned arithmetic
}