C++ Unexpected Integer Promotion


I was writing some code recently that was actually supposed to test other code, and I stumbled upon a surprising case of integer promotion. Here's the minimal testcase:
#include <cstdint>
#include <limits>

int main()
{
    std::uint8_t a, b;
    a = std::numeric_limits<std::uint8_t>::max();
    b = a;
    a = a + 1;
    if (a != b + 1)
        return 1;
    else
        return 0;
}
Surprisingly this program returns 1. Some debugging and a hunch revealed that b + 1 in the conditional was actually returning 256, while a + 1 in assignment produced the expected value of 0.
Section 8.10.6 (on the equality/inequality operators) of the C++17 draft states that
If both operands are of arithmetic or enumeration type, the usual arithmetic conversions are performed on
both operands; each of the operators shall yield true if the specified relationship is true and false if it is
false.
What are "the usual arithmetic conversions", and where are they defined in the standard? My guess is that they implicitly promote smaller integers to int or unsigned int for certain operators. This is supported by the fact that replacing std::uint8_t with unsigned int yields 0, and also by the fact that the assignment operator lacks the "usual arithmetic conversions" clause.

What are "the usual arithmetic conversions", and where are they defined in the standard?
[expr.arith.conv]/1
Many binary operators that expect operands of arithmetic or
enumeration type cause conversions and yield result types in a similar
way. The purpose is to yield a common type, which is also the type of
the result. This pattern is called the usual arithmetic conversions,
which are defined as follows:
(1.1) If either operand is of scoped enumeration type, no conversions
are performed; if the other operand does not have the same type, the
expression is ill-formed.
(1.2) If either operand is of type long double, the other shall be
converted to long double.
(1.3) Otherwise, if either operand is double, the other shall be
converted to double.
(1.4) Otherwise, if either operand is float, the other shall be
converted to float.
(1.5) Otherwise, the integral promotions ([conv.prom]) shall be
performed on both operands.[59] Then the following rules shall be
applied to the promoted operands:
(1.5.1) If both operands have the same type, no further conversion is
needed.
(1.5.2) Otherwise, if both operands have signed integer types or both
have unsigned integer types, the operand with the type of lesser
integer conversion rank shall be converted to the type of the operand
with greater rank.
(1.5.3) Otherwise, if the operand that has unsigned integer type has
rank greater than or equal to the rank of the type of the other
operand, the operand with signed integer type shall be converted to
the type of the operand with unsigned integer type.
(1.5.4) Otherwise, if the type of the operand with signed integer type
can represent all of the values of the type of the operand with
unsigned integer type, the operand with unsigned integer type shall be
converted to the type of the operand with signed integer type.
(1.5.5) Otherwise, both operands shall be converted to the unsigned
integer type corresponding to the type of the operand with signed
integer type.
59) As a consequence, operands of type bool, char8_t, char16_t,
char32_t, wchar_t, or an enumerated type are converted to some
integral type.
For uint8_t vs int (for operator+ and operator!= later), #1.5 is applied, uint8_t will be promoted to int, and the result of operator+ is int too.
On the other hand, for unsigned int vs int (for operator+), #1.5.3 is applied, int will be converted to unsigned int, and the result of operator+ is unsigned int.

Your guess is correct. Operands to many operators in C++ (e.g., binary arithmetic and comparison operators) are subject to the usual arithmetic conversions. In C++17, the usual arithmetic conversions are specified in [expr]/11. I'm not going to quote the whole paragraph here because it's rather large, but for integral types, the usual arithmetic conversions boil down to two steps. First, the integral promotions are applied to both operands. Then, if the two promoted types still differ, the operands are converted to a common type; when both are signed or both are unsigned, that means converting the operand of lower rank to the type of the other. The integral promotions basically mean that any integer type of lower rank than int is promoted to int if int can represent all possible values of the original type, and to unsigned int otherwise, which is mainly what is causing the behavior in your example.
As you have already figured out yourself, in your code, the usual arithmetic conversions happen in a = a + 1; and, most noticeably, in the condition of your if
if (a != b + 1)
…
where they cause b to be promoted to int, so b + 1 has type int and the value 256. a is promoted to int as well, so the != compares two values of type int (0 and 256), which causes the condition to be true rather than false…
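If the intent is to compare with wrap-around, one option is to convert the sum back to std::uint8_t before comparing. This is only a sketch; the helper names differs_promoted and differs_wrapped are hypothetical, and it assumes int is wider than 8 bits.

```cpp
#include <cstdint>

// Hypothetical helper: compares the way the question's code does.
// Both sides are int after promotion, so b + 1 can be 256.
bool differs_promoted(std::uint8_t a, std::uint8_t b)
{
    return a != b + 1;
}

// Hypothetical helper: reduces the sum modulo 256 before comparing,
// so the comparison sees the same value that a = a + 1 would store.
bool differs_wrapped(std::uint8_t a, std::uint8_t b)
{
    return a != static_cast<std::uint8_t>(b + 1);
}
```

With b equal to 255 and a equal to 0, the first helper reports a difference and the second does not.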

Related

What causes this signed int to unsigned int conversion using a ternary, but not shorts?

When using a ternary operator within list initialization, what causes the implicit conversion of int to unsigned int (and similarly for long long) but not short to unsigned short (and similarly for char)?
Specifically, I am surprised that the i32v2 function compiles fine whereas the others do not:
unsigned short f16(unsigned short x);
unsigned int f32(unsigned int x);

void i16(short value) {
    unsigned short encoded{value}; // narrowing, makes sense
}

void i32(int value) {
    unsigned int encoded{value}; // narrowing, makes sense
}

void i16v2(short value) {
    unsigned short encoded{false ? value : f16(value)}; // narrowing, makes sense
}

void i32v2(int value) {
    unsigned int encoded{false ? value : f32(value)}; // not narrowing, huh?
}
Complete example here: https://godbolt.org/z/fVTcrr
I am guessing the ternary operator implicitly converts int to unsigned int but I do not understand why it is unable to convert short to unsigned short similarly.
I would expect, if it was possible for int, then the ternary operator should also be able to convert any of the other signed types to the unsigned when possible:
If the destination type is unsigned, the resulting value is the smallest unsigned value equal to the source value modulo 2^n
where n is the number of bits used to represent the destination type.
(https://en.cppreference.com/w/cpp/language/implicit_conversion)
Can someone explain this behavior, and if possible, reference the standard or applicable cppreference page?

The standard says (quotes from latest draft):
[expr.cond]
Lvalue-to-rvalue, array-to-pointer, and function-to-pointer standard conversions are performed on the second and third operands.
After those conversions, one of the following shall hold:
The second and third operands have the same type; ... [does not apply]
The second and third operands have arithmetic [applies] or enumeration type; the usual arithmetic conversions are performed to bring them to a common type, and the result is of that type.
...
[expr.arith.conv]
Many binary operators that expect operands of arithmetic or enumeration type cause conversions and yield result types in a similar way.
The purpose is to yield a common type, which is also the type of the result.
This pattern is called the usual arithmetic conversions, which are defined as follows:
If either operand is of scoped enumeration type ... [does not apply]
If either operand is of type long double ... [does not apply]
Otherwise, if either operand is double ... [does not apply]
Otherwise, if either operand is float ... [does not apply]
Otherwise, the integral promotions ([conv.prom]) shall be performed on both operands.
Then the following rules shall be applied to the promoted operands:
...
[conv.prom]
A prvalue of an integer type other than bool, char16_t, char32_t, or wchar_t [applies] whose integer conversion rank ([conv.rank]) is less than the rank of int [applies] can be converted to a prvalue of type int if int can represent all the values of the source type [evidently applies, see footnote 1]; otherwise, the source prvalue can be converted to a prvalue of type unsigned int.
These conversions are called integral promotions.
So, in the case of i16v2, the second and third operands are short and unsigned short. Both evidently (see footnote 1) promote to int on your system, and the int result of the conditional operator is then used to initialise the unsigned short.
In the case of i32v2, no promotions apply and the common type of int and unsigned int is unsigned int.
Footnote 1: I say evidently, because technically, unsigned short could promote to unsigned int on some exotic system where their size is the same, in which case int couldn't represent all values of unsigned short. The outcome that you observe shows that is not the case for your system, which is to be expected.

Note that for false ? value : f16(value), integral promotion is performed on the operands first. For arithmetic operators:
If the operand passed to an arithmetic operator is integral or unscoped enumeration type, then before any other action (but after lvalue-to-rvalue conversion, if applicable), the operand undergoes integral promotion.
and
The following implicit conversions are classified as integral
promotions:
signed char or signed short can be converted to int;
That means the type of false ? value : f16(value) is int, which then causes the narrowing conversion to unsigned short.
On the other hand, the common type for false ? value : f32(value) is unsigned int, so unsigned int encoded{false ? value : f32(value)}; is fine.
Otherwise, the operand has integer type (because bool, char, char8_t,
char16_t, char32_t, wchar_t, and unscoped enumeration were promoted at
this point) and integral conversions are applied to produce the common
type, as follows:
...
Otherwise, if the unsigned operand's conversion rank is greater or equal to the conversion rank of the signed operand, the signed operand
is converted to the unsigned operand's type.
long and long long are not promoted to int, so they don't have this issue.

Arithmetic operations on unsigned variables produce signed values, is it standard behavior?

Subtracting two unsigned variables I expect an unsigned result. I do realize that overflow happens but that's ok, I'm actually counting on it.
Seems like that's not the case when the result needs to be used in another operation. Is this standard or undefined behavior?
uint8_t n1 = 255;
uint8_t z = 0;
uint8_t n = 1;
printf("n1 is %" PRIu8 "\n", n1);
printf("z - n is %" PRIu8 "\n", z - n);
printf("n1 < z: %s\n", n1 < z ? "yes" : "no");
printf("z - n < z: %s\n", z - n < z ? "yes" : "no");
printf("(uint8_t)(z - n) < (uint8_t)z: %s\n", (uint8_t)(z - n) < (uint8_t)z ? "yes" : "no");
Output:
n1 is 255
z - n is 255
n1 < z: no
z - n < z: yes
(uint8_t)(z - n) < (uint8_t)z: no

When the variables are of type uint8_t, they are both promoted to (signed) int and then the subtraction occurs between the promoted values, yielding a (signed) int value. It is mandated behaviour.
In C11, §6.3.1.8 Usual arithmetic conversions says:
Many operators that expect operands of arithmetic type cause conversions and yield result types in a similar way. The purpose is to determine a common real type for the operands and result. For the specified operands, each operand is converted, without change of type domain, to a type whose corresponding real type is the common real type. Unless explicitly stated otherwise, the common real type is also the corresponding real type of the result, whose type domain is the type domain of the operands if they are the same, and complex otherwise. This pattern is called the usual arithmetic conversions:
First, if the corresponding real type of either operand is long double, the other operand is converted, without change of type domain, to a type whose corresponding real type is long double.
Otherwise, if the corresponding real type of either operand is double, the other operand is converted, without change of type domain, to a type whose corresponding real type is double.
Otherwise, if the corresponding real type of either operand is float, the other operand is converted, without change of type domain, to a type whose corresponding real type is float.
Otherwise, the integer promotions are performed on both operands. Then the following rules are applied to the promoted operands:
If both operands have the same type, then no further conversion is needed.
Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank.
Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type corresponding to the type of the operand with signed integer type.
See §6.3.1 Arithmetic operands and §6.3.1.1 Boolean, characters, and integers for more information about 'integer promotions'.
The following may be used in an expression wherever an int or unsigned int may be used:
An object or expression with an integer type (other than int or unsigned int) whose integer conversion rank is less than or equal to the rank of int and unsigned int.
A bit-field of type _Bool, int, signed int, or unsigned int.
If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.
The term 'rank' is defined in that section; it's complex, but basically, long has a higher rank than int, and int has a higher rank than char.
The rules are undoubtedly slightly different in C++, but the net result is essentially the same.

In arithmetic, integers narrower than int are promoted to int, and then arithmetic on them is done in the int type. If you store the result in a uint8_t or other type, it will be converted to that type. But if you pass it to printf, it will remain an int.
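Here is a small sketch of that difference, written in C++ rather than C but relying on the same promotion behaviour. The helper names are hypothetical, and it assumes int is wider than 8 bits.

```cpp
#include <cstdint>
#include <type_traits>

// The subtraction itself is carried out in int.
static_assert(std::is_same<decltype(std::uint8_t{} - std::uint8_t{}), int>::value,
              "uint8_t - uint8_t has type int");

// Hypothetical helper: keeps the int result, so 0 - 1 is -1.
int difference_as_int(std::uint8_t z, std::uint8_t n)
{
    return z - n;
}

// Hypothetical helper: converts the result back to uint8_t, so 0 - 1 is 255.
std::uint8_t difference_as_u8(std::uint8_t z, std::uint8_t n)
{
    return static_cast<std::uint8_t>(z - n);
}
```

This matches the question's output: z - n < z is true because -1 < 0, while (uint8_t)(z - n) < (uint8_t)z is false because 255 is not less than 0.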
In C, the usual arithmetic conversions for real numbers are:
If either type is long double, the other is converted to long double.
Otherwise, if either is double, the other is converted to double.
Otherwise, if either is float, the other is converted to float.
Otherwise, the integer promotions are performed on each operand. Then:
If both operands have the same type, no further conversion is performed.
Otherwise, if both are signed or both are unsigned, the narrower (see footnote 1) operand is converted to the type of the wider operand.
Otherwise, if the unsigned operand is as wide as or wider than the other, the signed operand is converted to the unsigned type.
Otherwise, if the signed type can represent all the values of the unsigned type, the unsigned operand is converted to the signed type.
Otherwise, both operands are converted to the unsigned type corresponding to the signed type.
The integer promotions are:
If a type is wider (see footnote 1) than unsigned int, it is not changed.
Otherwise, if an int can represent all values of the type, the value is converted to int.
Otherwise, the value is converted to unsigned int.
Footnote
1 The C standard actually uses a technical classification of rank, which involves further details. It affects C implementations where multiple integer types can have the same width, aside from just being signed and unsigned.

Arithmetic conversion VS integral promotion

char cval;
short sval;
long lval;
sval + cval; // sval and cval promoted to int
cval + lval; // cval converted to long
This is a piece of code from C++ Primer.
I know sval + cval produces an int, according to this passage:
convert the small integral types to a larger integral type. The types
bool, char, signed char, unsigned char, short, and unsigned short are
promoted to int if all possible values of that type fit in an int.
But for the last one, I couldn't understand why it says "converted". Why is cval not promoted to int first, with the int then converted to long? (Or maybe promoted; I'm not sure whether "promoted" can be used for int to long, because I have only seen promotion defined for smaller types to int.) I didn't see any explanation or examples of char going straight to long in that part of the book. Is there anything wrong with my understanding?
I'm quite new at C++, someone please enlighten me! Many thanks in advance!

The additive operators perform what is called the usual arithmetic conversion on their operands which can include integral promotions and then after that we can have further conversions. The purpose is to yield a common type and if the promotions do not accomplish that then a further conversion is required.
This is covered in section 5 [expr] of the draft C++ standard which says (emphasis mine):
Many binary operators that expect operands of arithmetic or enumeration type cause conversions and yield
result types in a similar way. The purpose is to yield a common type, which is also the type of the result.
This pattern is called the usual arithmetic conversions, which are defined as follows:
and includes the following bullet:
Otherwise, the integral promotions (4.5) shall be performed on both operands. Then the following
rules shall be applied to the promoted operands:
which has the following bullets:
If both operands have the same type, no further conversion is needed
Otherwise, if both operands have signed integer types or both have unsigned integer types, the
operand with the type of lesser integer conversion rank shall be converted to the type of the
operand with greater rank.
Otherwise, if the operand that has unsigned integer type has rank greater than or equal to the
rank of the type of the other operand, the operand with signed integer type shall be converted to
the type of the operand with unsigned integer type.
Otherwise, if the type of the operand with signed integer type can represent all of the values of
the type of the operand with unsigned integer type, the operand with unsigned integer type shall
be converted to the type of the operand with signed integer type.
Otherwise, both operands shall be converted to the unsigned integer type corresponding to the
type of the operand with signed integer type.
So in the first case, after promotions both operands have the same type (int), so no further conversion is needed.
In the second case, after promotions they do not (int and long), so a further conversion is required.
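Both cases from the book can be checked at compile time. This is a sketch under the assumption that char and short are narrower than int; the helper names are made up for illustration.

```cpp
#include <type_traits>

// Hypothetical helper mirroring sval + cval: both operands are promoted to int.
int add_short_char(short sval, char cval)
{
    static_assert(std::is_same<decltype(sval + cval), int>::value,
                  "short + char has type int");
    return sval + cval;
}

// Hypothetical helper mirroring cval + lval: cval is promoted to int,
// then converted to long because long has the greater rank.
long add_char_long(char cval, long lval)
{
    static_assert(std::is_same<decltype(cval + lval), long>::value,
                  "char + long has type long");
    return cval + lval;
}
```

The intermediate int step is not observable in the result type; only the final common type (long) shows up in decltype.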

From the C++11 Standard:
4 Standard conversions
1 Standard conversions are implicit conversions with built-in meaning. Clause 4 enumerates the full set of such
conversions. A standard conversion sequence is a sequence of standard conversions in the following order:
— Zero or one conversion from the following set: lvalue-to-rvalue conversion, array-to-pointer conversion, and function-to-pointer conversion.
— Zero or one conversion from the following set: integral promotions, floating point promotion, integral conversions, floating point conversions, floating-integral conversions, pointer conversions, pointer to member conversions, and boolean conversions.
— Zero or one qualification conversion.
In the expression,
cval + lval;
since cval is not of type long, it has to be converted to long. However, in the process of applying the standard conversions, integral promotion comes ahead of conversions. Hence, cval is promoted to an int first before being converted to a long.

subtraction of two unsigned gives signed

I have the following piece of code:
#include <cstdint>
template <typename T>
T test(T a, T b)
{
float aabb = reinterpret_cast<float>(a - b);
}
int main(int argc, const char *argv[])
{
std::uint8_t a8, b8;
test(a8, b8);
return 0;
}
I know that the reinterpret_cast<float> can't work and that it gives an error at compile time. I am using that error so that the compiler tells me the type of a - b.
The problem is that in this case, it says that the type of a - b is int when both of them are uint8_t (unsigned char). The same happens with uint16_t, but not with uint32_t, for which it says that a - b is unsigned int.
So, my question is: Is this intended behaviour (that unsigned char - unsigned char gives an int), or is this some kind of weird compiler bug (tested with both GCC and clang)?

Yes, this is expected, as part of the so-called usual arithmetic conversions combined with the rules for integral promotion.
The exact wording changed between C++03 and C++11, but the end result is the same in this case.
[C++03: 4.5/1]: An rvalue of type char, signed char, unsigned char, short int, or unsigned short int can be converted to an rvalue of type int if int can represent all the values of the source type; otherwise, the source rvalue can be converted to an rvalue of type unsigned int.
[C++03: 4.5/5]: These conversions are called integral promotions.
[C++03: 5/9]: Many binary operators that expect operands of arithmetic or enumeration type cause conversions and yield result types in a similar way. The purpose is to yield a common type, which is also the type of the result.
This pattern is called the usual arithmetic conversions, which are defined as follows:
If either operand is of type long double, the other shall be converted to long double.
Otherwise, if either operand is double, the other shall be converted to double.
Otherwise, if either operand is float, the other shall be converted to float.
Otherwise, the integral promotions (4.5) shall be performed on both operands.
Then, if either operand is unsigned long the other shall be converted to unsigned long.
Otherwise, if one operand is a long int and the other unsigned int, then if a long int can represent all the values of an unsigned int, the unsigned int shall be converted to a long int; otherwise both operands shall be converted to unsigned long int.
Otherwise, if either operand is long, the other shall be converted to long.
Otherwise, if either operand is unsigned, the other shall be converted to unsigned.
[Note: otherwise, the only remaining case is that both operands are int ]
[C++11: 4.5/1]: A prvalue of an integer type other than bool, char16_t, char32_t, or wchar_t whose integer conversion rank (4.13) is less than the rank of int can be converted to a prvalue of type int if int can represent all the values of the source type; otherwise, the source prvalue can be converted to a prvalue of type unsigned int.
[C++11: 4.5/7]: These conversions are called integral promotions.
[C++11: 5/9]: Many binary operators that expect operands of arithmetic or enumeration type cause conversions and yield result types in a similar way. The purpose is to yield a common type, which is also the type of the result.
This pattern is called the usual arithmetic conversions, which are defined as follows:
If either operand is of scoped enumeration type (7.2), no conversions are performed; if the other operand does not have the same type, the expression is ill-formed.
If either operand is of type long double, the other shall be converted to long double.
Otherwise, if either operand is double, the other shall be converted to double.
Otherwise, if either operand is float, the other shall be converted to float.
Otherwise, the integral promotions (4.5) shall be performed on both operands. Then the following rules shall be applied to the promoted operands:
If both operands have the same type, no further conversion is needed.
Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank shall be converted to the type of the operand with greater rank.
Otherwise, if the operand that has unsigned integer type has rank greater than or equal to the rank of the type of the other operand, the operand with signed integer type shall be converted to the type of the operand with unsigned integer type.
Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, the operand with unsigned integer type shall be converted to the type of the operand with signed integer type.
Otherwise, both operands shall be converted to the unsigned integer type corresponding to the type of the operand with signed integer type.
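As an aside, decltype gives the type of a - b directly, without relying on a compiler error. The sketch below assumes a platform with a 32-bit int (so uint8_t and uint16_t promote to int and uint32_t does not); difference_t and widened_difference are hypothetical names.

```cpp
#include <cstdint>
#include <type_traits>
#include <utility>

// Type of a - b when both operands have type T.
template <typename T>
using difference_t = decltype(std::declval<T>() - std::declval<T>());

static_assert(std::is_same<difference_t<std::uint8_t>, int>::value,
              "uint8_t - uint8_t is int");
static_assert(std::is_same<difference_t<std::uint16_t>, int>::value,
              "uint16_t - uint16_t is int (assumes int is wider than 16 bits)");
static_assert(std::is_same<difference_t<std::uint32_t>, std::uint32_t>::value,
              "uint32_t - uint32_t stays unsigned (assumes int is at most 32 bits)");

// Hypothetical helper: widens a - b so the sign of the result is visible.
template <typename T>
long long widened_difference(T a, T b)
{
    return static_cast<long long>(a - b);
}
```

For 0 - 1 the helper yields -1 with the 8-bit and 16-bit types, but 4294967295 with uint32_t, because only the narrower types are promoted to a signed type.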

Pointer arithmetic and integral promotion

In the expression p + a where p is a pointer type and a is an integer, will integer promotion rules apply? For example, if a is a char, on a 64-bit machine it will surely be extended to 64 bit before being added to the pointer value (in the compiled assembly), but is it specified by the standards? What will it be promoted to? int, intptr_t or ptrdiff_t? What will unsigned char or size_t be converted to?

It does not seem required by the standard for any promotion to occur since char is an integral type:
For addition, either both operands shall have arithmetic or unscoped enumeration type,
or one operand shall be a pointer to a completely-defined object type and the other shall have integral or unscoped enumeration type
It seems implementations may depend on the type of pointer additions allowed by the underlying architecture: if the architecture supports address+BYTE, all is good with char; if not, it will likely promote to the smallest address offset size supported.
The result of subtraction of pointers is defined to be of type std::ptrdiff_t:
When two pointers to elements of the same array object are subtracted, the result is the difference of the subscripts of the two array elements. The type of the result is an implementation-defined signed integral type; this type shall be the same type that is defined as std::ptrdiff_t in the <cstddef> header

C++11 §5.7/1:
“The additive operators + and - group left-to-right. The usual arithmetic conversions are performed for operands of arithmetic or enumeration type.”
This apparently reduces the problem to considering the usual arithmetic conversions, defined by …
C++11 §5/9:
“Many binary operators that expect operands of arithmetic or
enumeration type cause conversions and yield result types in a similar
way. The purpose is to yield a common type, which is also the type of
the result. This pattern is called the usual arithmetic conversions,
which are defined as follows:
If either operand is of scoped enumeration type (7.2), no conversions are performed; if the other operand does not have the same type, the expression is ill-formed.
If either operand is of type long double, the other shall be converted to long double.
Otherwise, if either operand is double, the other shall be converted to double.
Otherwise, if either operand is float, the other shall be converted to float.
Otherwise, the integral promotions (4.5) shall be performed on both operands. Then the following rules shall be applied to the promoted operands:
If both operands have the same type, no further conversion is needed.
Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank shall be converted to the type of the operand with greater rank.
Otherwise, if the operand that has unsigned integer type has rank greater than or equal to the rank of the type of the other operand, the operand with signed integer type shall be converted to the type of the operand with unsigned integer type.
Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, the operand with unsigned integer type shall be converted to the type of the operand with signed integer type.
Otherwise, both operands shall be converted to the unsigned integer type corresponding to the type of the operand with signed integer type.”
Followed mechanically, this set of rules would end up in the last bullet point (dash in the standard) and convert a pointer operand to the unsigned integer type corresponding to something non-existent, which is just wrong. So the wording “The usual arithmetic conversions are performed for operands of arithmetic or enumeration type” cannot be interpreted literally (it's IMHO defective) but must be interpreted like “The usual arithmetic conversions are performed for invocations where both operands are of arithmetic or enumeration type”.
So, promotions as such, which are invoked via the usual arithmetic conversions, don't come into play when one operand is a pointer.
But a bit further down in §5.7 one finds …
C++11 §5.7/5:
“When an expression that has integral type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integral expression.”
This defines the result entirely in terms of array indexing. For a char array the difference of subscripts can exceed the range of ptrdiff_t. A reasonable way for an implementation to arrange this, is to convert the non-pointer argument to the unsigned integral type size_t (effectively sign extension at the bit level), and use that value with modular arithmetic to compute the resulting pointer value.
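What can be checked from within the language is the result type and the resulting element, not how the offset is widened internally. A minimal sketch, with the hypothetical helper advance_by and signed char used so that negative offsets are well defined regardless of whether plain char is signed:

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// pointer + integer: the result has the type of the pointer operand.
static_assert(std::is_same<decltype(std::declval<int*>() + std::declval<signed char>()),
                           int*>::value,
              "int* + signed char has type int*");

// pointer - pointer: the result has type std::ptrdiff_t.
static_assert(std::is_same<decltype(std::declval<int*>() - std::declval<int*>()),
                           std::ptrdiff_t>::value,
              "int* - int* has type std::ptrdiff_t");

// Hypothetical helper: offsets a pointer by a narrow signed integer.
// The caller must keep the result inside the same array (or one past its end).
const int* advance_by(const int* p, signed char offset)
{
    return p + offset;
}
```

The result is defined in terms of array subscripts, so the offset behaves the same as it would with any wider integer type holding the same value.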

I'd say normal integer promotion is applied to a. The C-Standard does not provide any specific rules for the conversion of the integer part of an arithmetic operation on a pointer.
That is, as a is declared char, it is converted to an int prior to being passed to the + operator.
If one adds a size_t, it either stays what size_t is defined to be or, if (for whatever reason) it has a smaller rank than int, it is promoted to an int.

Yes, it is specified in the C++ Standard (paragraph #1 section 5.7 Additive operators) that
the usual arithmetic conversions are performed for operands of
arithmetic or enumeration type.
For types (for example char or unsigned char) that have rank less than int, the integral promotion will be performed. For size_t (size_t has a rank that is not less than the rank of int or unsigned int) nothing will be done because there is no second operand of arithmetic type.
