Pointer arithmetic and integral promotion - c++

In the expression p + a where p is a pointer type and a is an integer, will integer promotion rules apply? For example, if a is a char, on a 64-bit machine it will surely be extended to 64 bit before being added to the pointer value (in the compiled assembly), but is it specified by the standards? What will it be promoted to? int, intptr_t or ptrdiff_t? What will unsigned char or size_t be converted to?

It does not seem required by the standard for any promotion to occur since char is an integral type:
For addition, either both operands shall have arithmetic or unscoped enumeration type,
or one operand shall be a pointer to a completely-defined object type and the other shall have integral or unscoped enumeration type
It seems implementations may depend on the type of pointer additions allowed by the underlying architecture: if the architecture supports address+BYTE, all is good with char; if not, it will likely promote to the smallest address offset size supported.
The result of subtraction of pointers is defined to be of type std::ptrdiff_t:
When two pointers to elements of the same array object are subtracted, the result is the difference of the subscripts of the two array elements. The type of the result is an implementation-defined signed integral type; this type shall be the same type that is defined as std::ptrdiff_t in the <cstddef> header
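The statement about pointer subtraction can be checked at compile time. This is only a minimal sketch: the array `table` and the helper `element_distance` are made-up names for illustration, not something from the question.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Subtracting two pointers yields std::ptrdiff_t, whatever the pointee type is.
static_assert(std::is_same<decltype(std::declval<int*>() - std::declval<int*>()),
                           std::ptrdiff_t>::value,
              "int* - int* has type std::ptrdiff_t");
static_assert(std::is_same<decltype(std::declval<const char*>() - std::declval<const char*>()),
                           std::ptrdiff_t>::value,
              "const char* - const char* has type std::ptrdiff_t");

// Illustrative data (hypothetical name).
constexpr int table[5] = {0, 10, 20, 30, 40};

// Hypothetical helper: distance between two elements of the same array.
constexpr std::ptrdiff_t element_distance(const int* first, const int* last) {
    return last - first;  // difference of the subscripts, not of the byte addresses
}
```

Both pointers must point into the same array (or one past its end) for the subtraction to be defined.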

C++11 §5.7/1:
“The additive operators + and - group left-to-right. The usual arithmetic conversions are performed for operands of arithmetic or enumeration type.”
This apparently reduces the problem to considering the usual arithmetic conversions, defined by …
C++11 §5/9:
“Many binary operators that expect operands of arithmetic or
enumeration type cause conversions and yield result types in a similar
way. The purpose is to yield a common type, which is also the type of
the result. This pattern is called the usual arithmetic conversions,
which are defined as follows:
If either operand is of scoped enumeration type (7.2), no conversions are performed; if the other operand does not have the same type, the expression is ill-formed.
If either operand is of type long double, the other shall be converted to long double.
Otherwise, if either operand is double, the other shall be converted to double.
Otherwise, if either operand is float, the other shall be converted to float.
Otherwise, the integral promotions (4.5) shall be performed on both operands. Then the following rules shall be applied to the promoted operands:
If both operands have the same type, no further conversion is needed.
Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank shall be converted to the type of the operand with greater rank.
Otherwise, if the operand that has unsigned integer type has rank greater than or equal to the rank of the type of the other operand, the operand with signed integer type shall be converted to the type of the operand with unsigned integer type.
Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, the operand with unsigned integer type shall be converted to the type of the operand with signed integer type.
Otherwise, both operands shall be converted to the unsigned integer type corresponding to the type of the operand with signed integer type.”
Followed mechanically, this set of rules would end up in the last bullet point (dash in the standard) and convert a pointer operand to the unsigned integer type corresponding to something non-existent. Which is just wrong. So the wording “The usual arithmetic conversions are performed for operands of arithmetic or enumeration type” cannot be interpreted literally – it's IMHO defective – but must be interpreted like “The usual arithmetic conversions are performed for invocations where both operands are of arithmetic or enumeration type”.
So, promotions as such, which are invoked via the usual arithmetic conversions, don't come into play when one operand is a pointer.
But a bit further down in §5.7 one finds …
C++11 §5.7/5:
“When an expression that has integral type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integral expression.”
This defines the result entirely in terms of array indexing. For a char array the difference of subscripts can exceed the range of ptrdiff_t. A reasonable way for an implementation to arrange this is to convert the non-pointer argument to the unsigned integral type size_t (effectively sign extension at the bit level), and use that value with modular arithmetic to compute the resulting pointer value.
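What §5.7/5 does guarantee can be observed directly: the result has the pointer's type and the offset is used by value, whatever integral type it has. A minimal sketch, where `cells` and `read_from_middle` are hypothetical names chosen for the example; it says nothing about how a compiler widens the offset internally.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// pointer + integer has the type of the pointer operand, for any integral type.
static_assert(std::is_same<decltype(std::declval<int*>() + std::declval<char>()),
                           int*>::value, "int* + char is int*");
static_assert(std::is_same<decltype(std::declval<int*>() + std::declval<unsigned char>()),
                           int*>::value, "int* + unsigned char is int*");
static_assert(std::is_same<decltype(std::declval<int*>() + std::declval<std::size_t>()),
                           int*>::value, "int* + size_t is int*");

// Illustrative data (hypothetical name).
constexpr int cells[6] = {10, 11, 12, 13, 14, 15};

// Hypothetical helper: read the element `offset` positions away from cells[3].
template <typename Offset>
constexpr int read_from_middle(Offset offset) {
    return *(&cells[3] + offset);  // only the value of offset matters, not its type
}
```

The offsets must keep the result inside the array (or one past the end); anything else is undefined regardless of the integer type used.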

I'd say normal integer promotion is applied to a. The C-Standard does not provide any specific rules for the conversion of the integer part of an arithmetic operation on a pointer.
That is, as a is declared char, it is converted to an int prior to being passed to the + operator.
If one adds a size_t, it either stays what size_t is defined to be or, if (for whatever reason) it has a smaller rank than int, it is promoted to an int.

Yes, it is specified in the C++ Standard (paragraph #1 section 5.7 Additive operators) that
the usual arithmetic conversions are performed for operands of
arithmetic or enumeration type.
For types (for example char or unsigned char) that have rank less than int the integral promotion will be performed. For size_t (size_t has a rank that is not less than the rank of int or unsigned int) nothing will be done because there is no second operand of arithmetic type.

Related

C++: Does Comparing different sized integers cause UB? [duplicate]

So this is probably a really simple question and if it was not about C++ I would just go ahead and check if it works on my computer or not, but unfortunately in C++ things usually tend to work on a couple of systems while still being UB and therefore not working on other systems.
Consider the following code snippet:
unsigned long long int a = std::numeric_limits< unsigned long long int >::max();
unsigned int b = 12;
bool test = a > b;
My question is: Can we compare integers of different size with one another without explicitly casting the smaller type to the bigger one using e.g. static_cast without running into undefined behavior (UB)?
In general there are three ways I can imagine this turning out:
The smaller type is implicitly cast to the bigger type before comparison (either via a real cast or by some clever way of being able to "pretend" it had been cast)
The bigger type is truncated to the size of the smaller one before comparison
This is not defined and one needs to add in an explicit cast in order to arrive at defined behavior
This is not undefined behavior. This is covered by the usual arithmetic conversions which are detailed in section 8p11.5 of the C++17 standard:
The integral promotions (7.6) shall be performed on both operands.
Then the following rules shall be applied to the promoted operands:
(11.5.1) If both operands have the same type, no further conversion is needed.
(11.5.2) Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser
integer conversion rank shall be converted to the type of the operand
with greater rank.
(11.5.3) Otherwise, if the operand that has unsigned integer type has rank greater than or equal to the rank of the type of the other
operand, the operand with signed integer type shall be converted to
the type of the operand with unsigned integer type.
(11.5.4) Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with
unsigned integer type, the operand with unsigned integer type shall be
converted to the type of the operand with signed integer type.
(11.5.5) Otherwise, both operands shall be converted to the unsigned integer type corresponding to the type of the operand with signed
integer type.
Bullet 11.5.2 is what applies here. Since both types are unsigned, the smaller type is converted to the larger type, which can hold every value of the smaller one.
This is safe. C++ has what are called the usual arithmetic conversions, and they govern how operands passed to the built-in binary operators are implicitly converted.
In this case, b is converted to an unsigned long long int for you (strictly an integral conversion rather than a promotion, since unsigned int is not narrower than int) and then operator > is evaluated.
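If you want to convince yourself, the common type and the outcome can both be checked at compile time. A minimal sketch; `greater_than` is a hypothetical helper that just mirrors the comparison from the question.

```cpp
#include <limits>
#include <type_traits>

// Both operands are unsigned, so the operand of lesser rank (unsigned int)
// is converted to the type of greater rank (unsigned long long).
static_assert(std::is_same<decltype(0ull + 0u), unsigned long long>::value,
              "common type of unsigned long long and unsigned int");

// Hypothetical helper mirroring the comparison from the question.
constexpr bool greater_than(unsigned long long a, unsigned int b) {
    return a > b;  // b is widened; a is never truncated
}
```

Note that this only covers the case where both operands are unsigned. Mixing a signed and an unsigned operand is still defined behavior, but the signed value may be converted to unsigned first, which can give surprising results for negative numbers.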

C++ Unexpected Integer Promotion

I was writing some code recently that was actually supposed to test other code, and I stumbled upon a surprising case of integer promotion. Here's the minimal testcase:
#include <cstdint>
#include <limits>
int main()
{
    std::uint8_t a, b;
    a = std::numeric_limits<std::uint8_t>::max();
    b = a;
    a = a + 1;
    if (a != b + 1)
        return 1;
    else
        return 0;
}
Surprisingly this program returns 1. Some debugging and a hunch revealed that b + 1 in the conditional was actually returning 256, while a + 1 in assignment produced the expected value of 0.
Section 8.10.6 (on the equality/inequality operators) of the C++17 draft states that
If both operands are of arithmetic or enumeration type, the usual arithmetic conversions are performed on
both operands; each of the operators shall yield true if the specified relationship is true and false if it is
false.
What are "the usual arithmetic conversions", and where are they defined in the standard? My guess is that they implicitly promote smaller integers to int or unsigned int for certain operators (which is also supported by the fact that replacing std::uint8_t with unsigned int yields 0, and further in that the assignment operator lacks the "usual arithmetic conversions" clause).
What are "the usual arithmetic conversions", and where are they defined in the standard?
[expr.arith.conv]/1
Many binary operators that expect operands of arithmetic or
enumeration type cause conversions and yield result types in a similar
way. The purpose is to yield a common type, which is also the type of
the result. This pattern is called the usual arithmetic conversions,
which are defined as follows:
(1.1) If either operand is of scoped enumeration type, no conversions
are performed; if the other operand does not have the same type, the
expression is ill-formed.
(1.2) If either operand is of type long double, the other shall be
converted to long double.
(1.3) Otherwise, if either operand is double, the other shall be
converted to double.
(1.4) Otherwise, if either operand is float, the other shall be
converted to float.
(1.5) Otherwise, the integral promotions ([conv.prom]) shall be
performed on both operands.[59] Then the following rules shall be
applied to the promoted operands:
(1.5.1) If both operands have the same type, no further conversion is
needed.
(1.5.2) Otherwise, if both operands have signed integer types or both
have unsigned integer types, the operand with the type of lesser
integer conversion rank shall be converted to the type of the operand
with greater rank.
(1.5.3) Otherwise, if the operand that has unsigned integer type has
rank greater than or equal to the rank of the type of the other
operand, the operand with signed integer type shall be converted to
the type of the operand with unsigned integer type.
(1.5.4) Otherwise, if the type of the operand with signed integer type
can represent all of the values of the type of the operand with
unsigned integer type, the operand with unsigned integer type shall be
converted to the type of the operand with signed integer type.
(1.5.5) Otherwise, both operands shall be converted to the unsigned
integer type corresponding to the type of the operand with signed
integer type.
[59] As a consequence, operands of type bool, char8_t, char16_t,
char32_t, wchar_t, or an enumerated type are converted to some
integral type.
For uint8_t vs int (for operator+ and operator!= later), #1.5 is applied, uint8_t will be promoted to int, and the result of operator+ is int too.
On the other hand, for unsigned int vs int (for operator+), #1.5.3 is applied, int will be converted to unsigned int, and the result of operator+ is unsigned int.
Your guess is correct. Operands to many operators in C++ (e.g., binary arithmetic and comparison operators) are subject to the usual arithmetic conversions. In C++17, the usual arithmetic conversions are specified in [expr]/11. I'm not going to quote the whole paragraph here because it's rather large (you can just click on the link), but for integral types, the usual arithmetic conversions boil down to integral promotions being applied followed by effectively some more promoting in the sense that if the types of the two operands after the initial integral promotions are not the same, the smaller type is converted to the larger one of the two. The integral promotions basically mean that any type smaller than an int will be promoted to int or unsigned int, whichever of the two can represent all possible values of the original type, which is mainly what is causing the behavior in your example.
As you have already figured out yourself, in your code, the usual arithmetic conversions happen in a = a + 1; and, most noticeably, in the condition of your if
if (a != b + 1)
…
where they cause b to be promoted to int, making the result of b + 1 to be of type int, as well as a being promoted to int and the !=, thus, happening on values of type int, which causes the condition to be true rather than false…
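A small sketch of the usual fix, which is to convert the result back to the narrow type before comparing. The helper names (`promoted_sum`, `wrapped_sum`, `equal_after_wrap`) are invented for this example.

```cpp
#include <cstdint>

// b is promoted to int before the addition, so nothing wraps here.
constexpr int promoted_sum(std::uint8_t b) {
    return b + 1;  // for b == 255 this is 256, not 0
}

// Converting the int result back to uint8_t reduces it modulo 256.
constexpr std::uint8_t wrapped_sum(std::uint8_t b) {
    return static_cast<std::uint8_t>(b + 1);
}

// The comparison from the question, with the right-hand side narrowed first.
constexpr bool equal_after_wrap(std::uint8_t a, std::uint8_t b) {
    return a == static_cast<std::uint8_t>(b + 1);
}
```

With the cast in place, the condition in the question would compare 0 with 0 and the program would return 0.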

Arithmetic conversion VS integral promotion

char cval;
short sval;
long lval;
sval + cval; // sval and cval promoted to int
cval + lval; // cval converted to long
This is a piece of code from C++ Primer.
I know sval+cval generates an int type according to
convert the small integral types to a larger integral type. The types
bool, char, signed char, unsigned char, short, and unsigned short are
promoted to int if all possible values of that type fit in an int.
But for the last one I couldn't understand why it uses "converted". Why is cval not promoted to int first and then the int converted to long? (Or maybe promoted; I'm not sure whether "promoted" can be used for int to long, because I have only seen promotion defined for smaller types to int.) I didn't see any explanation or examples of char going straight to long in that part of the book. Is there anything wrong with my understanding?
I'm quite new to C++, so someone please enlighten me! Many thanks in advance!
The additive operators perform what is called the usual arithmetic conversion on their operands which can include integral promotions and then after that we can have further conversions. The purpose is to yield a common type and if the promotions do not accomplish that then a further conversion is required.
This is covered in section 5 [expr] of the draft C++ standard which says (emphasis mine):
Many binary operators that expect operands of arithmetic or enumeration type cause conversions and yield
result types in a similar way. The purpose is to yield a common type, which is also the type of the result.
This pattern is called the usual arithmetic conversions, which are defined as follows:
and includes the following bullet:
Otherwise, the integral promotions (4.5) shall be performed on both operands. Then the following
rules shall be applied to the promoted operands:
which has the following bullets:
If both operands have the same type, no further conversion is needed
Otherwise, if both operands have signed integer types or both have unsigned integer types, the
operand with the type of lesser integer conversion rank shall be converted to the type of the
operand with greater rank.
Otherwise, if the operand that has unsigned integer type has rank greater than or equal to the
rank of the type of the other operand, the operand with signed integer type shall be converted to
the type of the operand with unsigned integer type.
Otherwise, if the type of the operand with signed integer type can represent all of the values of
the type of the operand with unsigned integer type, the operand with unsigned integer type shall
be converted to the type of the operand with signed integer type.
Otherwise, both operands shall be converted to the unsigned integer type corresponding to the
type of the operand with signed integer type.
So in the first case, after promotions they both have the same type (int), so no further conversion is needed.
In the second case, after promotions they do not (int and long), so a further conversion is required.
From the C++11 Standard:
4 Standard conversions
1 Standard conversions are implicit conversions with built-in meaning. Clause 4 enumerates the full set of such
conversions. A standard conversion sequence is a sequence of standard conversions in the following order:
— Zero or one conversion from the following set: lvalue-to-rvalue conversion, array-to-pointer conversion, and function-to-pointer conversion.
— Zero or one conversion from the following set: integral promotions, floating point promotion, integral conversions, floating point conversions, floating-integral conversions, pointer conversions, pointer to member conversions, and boolean conversions.
— Zero or one qualification conversion.
In the expression,
cval + lval;
since cval is not of type long, it has to be converted to long. However, in the process of applying the standard conversions, integral promotion comes ahead of conversions. Hence, cval is promoted to an int first before being converted to a long.
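The resulting types of both expressions can be verified with decltype. A minimal sketch, assuming a typical platform where int can represent every value of char and short; `sum_t` is a hypothetical alias introduced only for this check.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical alias: the type of the expression a + b for operands of type A and B.
template <typename A, typename B>
using sum_t = decltype(std::declval<A>() + std::declval<B>());

// sval + cval: both operands are promoted to int, which is already a common type.
static_assert(std::is_same<sum_t<short, char>, int>::value, "short + char is int");

// cval + lval: cval is promoted to int, then converted to long to match lval.
static_assert(std::is_same<sum_t<char, long>, long>::value, "char + long is long");

// Unary plus applies the integral promotion on its own.
static_assert(std::is_same<decltype(+std::declval<char>()), int>::value, "+char is int");
```

The intermediate int step is not observable in the result type of cval + lval, which is probably why the book simply says "converted to long".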

subtraction of two unsigned gives signed

I have the following piece of code:
#include <cstdint>
template <typename T>
T test(T a, T b)
{
    float aabb = reinterpret_cast<float>(a - b);
}
int main(int argc, const char *argv[])
{
    std::uint8_t a8, b8;
    test(a8, b8);
    return 0;
}
I know that the reinterpret_cast<float> can't work and that it gives an error at compile time. I am using that error so that the compiler tells me the type of a - b.
The problem is that in this case, it says that the type of a - b is int when both of them are uint8_t (unsigned char). The same happens with uint16_t, but not with uint32_t, for which it says that a - b is unsigned int.
So, my question is: Is this intended behaviour (that unsigned char - unsigned char gives an int), or is this some kind of weird compiler bug (tested with both GCC and clang)?
Yes, this is expected, as part of the so-called usual arithmetic conversions combined with the rules for integral promotion.
The exact wording changed between C++03 and C++11, but the end result is the same in this case.
[C++03: 4.5/1]: An rvalue of type char, signed char, unsigned char, short int, or unsigned short int can be converted to an rvalue of type int if int can represent all the values of the source type; otherwise, the source rvalue can be converted to an rvalue of type unsigned int.
[C++03: 4.5/5]: These conversions are called integral promotions.
[C++03: 5/9]: Many binary operators that expect operands of arithmetic or enumeration type cause conversions and yield result types in a similar way. The purpose is to yield a common type, which is also the type of the result.
This pattern is called the usual arithmetic conversions, which are defined as follows:
If either operand is of type long double, the other shall be converted to long double.
Otherwise, if either operand is double, the other shall be converted to double.
Otherwise, if either operand is float, the other shall be converted to float.
Otherwise, the integral promotions (4.5) shall be performed on both operands.
Then, if either operand is unsigned long the other shall be converted to unsigned long.
Otherwise, if one operand is a long int and the other unsigned int, then if a long int can represent all the values of an unsigned int, the unsigned int shall be converted to a long int; otherwise both operands shall be converted to unsigned long int.
Otherwise, if either operand is long, the other shall be converted to long.
Otherwise, if either operand is unsigned, the other shall be converted to unsigned.
[Note: otherwise, the only remaining case is that both operands are int ]
[C++11: 4.5/1]: A prvalue of an integer type other than bool, char16_t, char32_t, or wchar_t whose integer conversion rank (4.13) is less than the rank of int can be converted to a prvalue of type int if int can represent all the values of the source type; otherwise, the source prvalue can be converted to a prvalue of type unsigned int.
[C++11: 4.5/7]: These conversions are called integral promotions.
[C++11: 5/9]: Many binary operators that expect operands of arithmetic or enumeration type cause conversions and yield result types in a similar way. The purpose is to yield a common type, which is also the type of the result.
This pattern is called the usual arithmetic conversions, which are defined as follows:
If either operand is of scoped enumeration type (7.2), no conversions are performed; if the other operand does not have the same type, the expression is ill-formed.
If either operand is of type long double, the other shall be converted to long double.
Otherwise, if either operand is double, the other shall be converted to double.
Otherwise, if either operand is float, the other shall be converted to float.
Otherwise, the integral promotions (4.5) shall be performed on both operands. Then the following rules shall be applied to the promoted operands:
If both operands have the same type, no further conversion is needed.
Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank shall be converted to the type of the operand with greater rank.
Otherwise, if the operand that has unsigned integer type has rank greater than or equal to the rank of the type of the other operand, the operand with signed integer type shall be converted to the type of the operand with unsigned integer type.
Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, the operand with unsigned integer type shall be converted to the type of the operand with signed integer type.
Otherwise, both operands shall be converted to the unsigned integer type corresponding to the type of the operand with signed integer type.
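Instead of provoking a compiler error, the type of a - b can be queried with decltype. A minimal sketch, assuming int is exactly 32 bits wide as on most current desktop platforms; `difference_t`, `narrow_difference` and `wide_difference` are hypothetical names made up for the example.

```cpp
#include <cstdint>
#include <type_traits>
#include <utility>

// Hypothetical alias: the type of a - b when both operands have type T.
template <typename T>
using difference_t = decltype(std::declval<T>() - std::declval<T>());

// Types narrower than int are promoted to int before the subtraction.
static_assert(std::is_same<difference_t<std::uint8_t>, int>::value, "uint8_t - uint8_t is int");
static_assert(std::is_same<difference_t<std::uint16_t>, int>::value, "uint16_t - uint16_t is int");

// Assumption: int is 32 bits, so uint32_t is not promoted and the result stays unsigned.
static_assert(std::is_same<difference_t<std::uint32_t>, std::uint32_t>::value,
              "uint32_t - uint32_t stays uint32_t");

// The promoted result can be negative ...
constexpr int narrow_difference(std::uint8_t a, std::uint8_t b) { return a - b; }

// ... while the unpromoted unsigned result wraps modulo 2^32.
constexpr std::uint32_t wide_difference(std::uint32_t a, std::uint32_t b) { return a - b; }
```

So the sign of the result can differ between the 8-bit and the 32-bit case, which matters if the difference is compared against zero.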

Why auto is deduced to int instead of uint16_t

I have the following code:
uint16_t getLastMarker(const std::string &number);
...
const auto msgMarker = getLastMarker(msg.number) + static_cast<uint16_t>(1);
static_assert(std::is_same<decltype(msgMarker), const int>::value, "Should fail");
static_assert(std::is_same<decltype(msgMarker), const uint16_t>::value, "Should not fail");
and I expect that the first assertion will fail and the second one will not. However, gcc 4.9.2 and clang 3.6 do the opposite. If I use uint16_t instead of auto in my code, the expected assertion fails and the other one succeeds.
P.S. Initially I had just 1 instead of static_cast<uint16_t>(1) and thought that the issue is caused by the fact that numeric literal 1 has type int but wrong assertion fails even after explicit cast here.
Addition will perform the usual arithmetic conversions on its operands, which in this case will result in the operands being promoted to int due to the integer promotions, and the result will also be int.
You can use uint16_t instead of auto to force a conversion back or in the general case you can use static_cast.
For a rationale as to why types smaller than int are promoted to larger types see Why must a short be converted to an int before arithmetic operations in C and C++?.
For reference, from the draft C++ standard section 5.7 Additive operators:
[...]The usual arithmetic conversions are performed for operands of
arithmetic or enumeration type[...]
and from section 5 Expressions:
[...]Otherwise, the integral promotions (4.5) shall be performed on
both operands. Then the following rules shall be applied
to the promoted operands[...]
and from section 4.5 Integral promotions (emphasis mine):
A prvalue of an integer type other than bool, char16_t, char32_t, or
wchar_t whose integer conversion rank (4.13) is less than the rank
of int can be converted to a prvalue of type int if int can represent
all the values of the source type; otherwise, the source prvalue can
be converted to a prvalue of type unsigned int.
Assuming int is larger than 16-bit.
Arithmetic operations don't work on any type smaller than int. So, if uint16_t is smaller than int, it will be promoted to int (or possibly a larger type, if necessary to match the other operand) before performing the addition.
The result of the addition will be the promoted type. If you want another type, you'll have to convert afterwards.
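Converting afterwards could look like the following. This is only a sketch, assuming int is wider than 16 bits; `last_marker` is a hypothetical constexpr stand-in for getLastMarker from the question, and the variable names are made up.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for getLastMarker(); returns the largest uint16_t value.
constexpr std::uint16_t last_marker() { return 65535; }

// Both operands are promoted to int, so auto deduces int and nothing wraps.
constexpr auto promoted = last_marker() + static_cast<std::uint16_t>(1);
static_assert(std::is_same<decltype(promoted), const int>::value,
              "auto deduces int after promotion");

// Casting the whole expression converts the int result back to uint16_t.
constexpr auto narrowed = static_cast<std::uint16_t>(last_marker() + 1);
static_assert(std::is_same<decltype(narrowed), const std::uint16_t>::value,
              "auto deduces uint16_t after the cast");
```

Casting only the literal, as in the question, does not help, because the promotion happens to both operands of the addition after the cast.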