Why does the expression below characterize a narrowing conversion? - c++

This expression can be found in the Example in §8.5.4/7 in the Standard (N3797)
unsigned int ui1 = {-1}; // error: narrows
Given §8.5.4/7 and its 4th bullet point:
A narrowing conversion is an implicit conversion:
from an integer type or unscoped enumeration type to an integer type that cannot represent all the values of the original type,
except where the source is a constant expression whose value after
integral promotions will fit into the target type.
I would say there's no narrowing here, as -1 is a constant expression, whose value after integral promotion fits into an unsigned int.
See also §4.5/1 about Integral Promotion:
A prvalue of an integer type other than bool, char16_t, char32_t, or
wchar_t whose integer conversion rank (4.13) is less than the rank of
int can be converted to a prvalue of type int if int can represent all
the values of the source type; otherwise, the source prvalue can be
converted to a prvalue of type unsigned int.
From 4.13 we have that the rank of -1 (an int) is equal to the rank of an unsigned int, and so it can be converted to an unsigned int.
Edit
Unfortunately Jerry Coffin removed his answer from this thread. I believe he was on the right track, if we accept the fact that the current reading of the 4th bullet point in §8.5.4/7 is wrong, after this change in the Standard.

There is no integral promotion from int to unsigned int, therefor it is still illformed.
That would be an integral conversion.

The change in the wording of the standard is intended to confirm the understanding that converting a negative value into an unsigned type is and always has been a narrowing conversion.
Informally, -1 cannot be represented within the range of any unsigned type, and the bit pattern that represents it does not represent the same value if stored in an unsigned int. Therefore this is a narrowing conversion and promotion/widening is not involved.
This is about the dainty art of reading the standard. As usual, the compiler knows best.

Narrowing is an implicit conversion from an integer type to an integer type that cannot represent all the values of the original type
Conversion from integer type int to integer type unsigned int, of course, cannot represent all the values of the original type -- the standard is quite unambigious here. If you really need it, you can do
unsigned int ui1 = {-1u};
This works without any errors/warnings since 1u is a literal of type unsigned int which is then negated. This is well-defined as [expr.unary.op] in the standard states:
The negative of an unsigned quantity is computed by subtracting its value from 2n, where n is the number of bits in the promoted operand.
-1 however is an int before and after the negation and hence it becomes a narrowing conversion. There are no negative literals; see this answer for details.
Promotion is, informally, copying the same value into a larger space, so it shouldn't be confused here as the sizes (of signed and unsigned) are equal. Even if you try to convert it to some type of larger size, say unsigned long long it's still a narrowing conversion as it cannot represent a negative number truly.

Related

Why is signed and unsigned addition converted differently for 16 and 32 bit integers?

It seems the GCC and Clang interpret addition between a signed and unsigned integers differently, depending on their size. Why is this, and is the conversion consistent on all compilers and platforms?
Take this example:
#include <cstdint>
#include <iostream>
int main()
{
std::cout <<"16 bit uint 2 - int 3 = "<<uint16_t(2)+int16_t(-3)<<std::endl;
std::cout <<"32 bit uint 2 - int 3 = "<<uint32_t(2)+int32_t(-3)<<std::endl;
return 0;
}
Result:
$ ./out.exe
16 bit uint 2 - int 3 = -1
32 bit uint 2 - int 3 = 4294967295
In both cases we got -1, but one was interpreted as an unsigned integer and underflowed. I would have expected both to be converted in the same way.
So again, why do the compilers convert these so differently, and is this guaranteed to be consistent? I tested this with g++ 11.1.0, clang 12.0. and g++ 11.2.0 on Arch Linux and Debian, getting the same result.
When you do uint16_t(2)+int16_t(-3), both operands are types that are smaller than int. Because of this, each operand is promoted to an int and signed + signed results in a signed integer and you get the result of -1 stored in that signed integer.
When you do uint32_t(2)+int32_t(-3), since both operands are the size of an int or larger, no promotion happens and now you are in a case where you have unsigned + signed which results in a conversion of the signed integer into an unsigned integer, and the unsigned value of -1 wraps to being the largest value representable.
So again, why do the compilers convert these so differently,
Standard quotes for [language-lawyer]:
[expr.arith.conv]
Many binary operators that expect operands of arithmetic or enumeration type cause conversions and yield result types in a similar way.
The purpose is to yield a common type, which is also the type of the result.
This pattern is called the usual arithmetic conversions, which are defined as follows:
...
Otherwise, the integral promotions ([conv.prom]) shall be performed on both operands.
Then the following rules shall be applied to the promoted operands:
If both operands have the same type, no further conversion is needed.
Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank shall be converted to the type of the operand with greater rank.
Otherwise, if the operand that has unsigned integer type has rank greater than or equal to the rank of the type of the other operand, the operand with signed integer type shall be converted to the type of the operand with unsigned integer type.
Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, the operand with unsigned integer type shall be converted to the type of the operand with signed integer type.
Otherwise, both operands shall be converted to the unsigned integer type corresponding to the type of the operand with signed integer type.
[conv.prom]
A prvalue of an integer type other than bool, char8_­t, char16_­t, char32_­t, or wchar_­t whose integer conversion rank ([conv.rank]) is less than the rank of int can be converted to a prvalue of type int if int can represent all the values of the source type; otherwise, the source prvalue can be converted to a prvalue of type unsigned int.
These conversions are called integral promotions.
std::uint16_t type may have a lower conversion rank than int in which case it will be promoted when used as an operand. int may be able to represent all values of std::uint16_t in which case the promotion will be to int. The common type of two int is int.
std::uint32_t type may have the same or a higher conversion rank than int in which case it won't be promoted. The common type of an unsigned type and a signed of same rank is an unsigned type.
For an explanation why this conversion behaviour was chosen, see chapter "6.3.1.1 Booleans, characters, and integers" of "
Rationale for
International Standard—
Programming Languages—
C". I won't quote the entire chapter here.
is this guaranteed to be consistent?
The consistency depends on relative sizes of the integer types which are implementation defined.
Why is this,
C (and hence C++) has a rule that effectively says when a type smaller than int is used in an expression it is first promoted to int (the actual rule is a little more complex than that to allow for multiple distinct types of the same size).
Section 6.3.1.1 of the Rationale for International Standard Programming Languages C claims that in early C compilers there were two versions of the promotion rule. "unsigned preserving" and "value preserving" and talks about why they chose the "value preserving" option. To summarise they believed it would produce correct results in a greater proportion of situations.
It does not however explain why the concept of promotion exists in the first place. I would speculate that it existed because on many processors, including the PDP-11 for which C was originally designed, arithmetic operations only operated on words, not on units smaller than words. So it was simpler and more efficient to convert everything smaller than a word to a word at the start of an expression.
On most platforms today int is 32 bits. So both uint16_t and int16_t are promoted to int. The artithmetic proceeds to produce a result of type int with a value of -1.
OTOH uint32_t and int32_t are not smaller than int, so they retain their original size and signedness through the promotion step. The rules for when the operands to an arithmetic operator are of different types come into play and since the operands are the same size the signed operand is converted to unsigned.
The rationale does not seem to talk about this rule, which suggests it goes back to pre-standard C.
and is the conversion consistent on all compilers and platforms?
On an Ansi C or ISO C++ platform it depends on the size of int. With 16 bit int both examples would give large positive values. With 64-bit int both examples would give -1.
On pre-standard implementations it's possible that both expressions might return large positive numbers.
A 16-bit unsigned int can be promoted to a 32-bit int without any lost values due to range differences, so that's what happens. Not so for the 32-bit integers.

C++: Does Comparing different sized integers cause UB? [duplicate]

This question already has answers here:
Comparing int with long and others
(2 answers)
Closed 1 year ago.
So this is probably a really simple question and if it was not about C++ I would just go ahead and check if it works on my computer or not, but unfortunately in C++ things usually tend to work on a couple of systems while still being UB and therefore not working on other systems.
Consider the following code snippet:
unsigned long long int a = std::numeric_limits< unsigned long long int >::max();
unsigned int b = 12;
bool test = a > b;
My question is: Can we compare integers of different size with one another without explicitly casting the smaller type to the bigger one using e.g. static_cast without running into undefined behavior (UB)?
In general there are three ways I can imagine this turning out:
The smaller type is implicitly cast to the bigger type before conversion (either via a real cast or by some clever way of being able to "pretend" it had been casted)
The bigger type is truncated to the size of the smaller one before comparison
This is not defined and one needs to add in an explicit cast in order to arrive at defined behavior
This is not undefined behavior. This is covered by the usual arithmetic conversions which are detailed in section 8p11.5 of the C++17 standard:
The integral promotions (7.6) shall be performed on both operands.
Then the following rules shall be applied to the promoted operands:
(11.5.1) If both operands have the same type, no further conversion is needed.
(11.5.2) Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser
integer conversion rank shall be converted to the type of the operand
with greater rank.
(11.5.3) Otherwise, if the operand that has unsigned integer type has rank greater than or equal to the rank of the type of the other
operand, the operand with signed integer type shall be converted to
the type of the operand with unsigned integer type.
(11.5.4)Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with
unsigned integer type, the operand with unsigned integer type shall be
converted to the type of the operand with signed integer type.
(11.5.5)Otherwise, both operands shall be converted to the unsigned integer type corresponding to the type of the operand with signed
integer type.
The passage in bold is what applies here. Since both types are unsigned, the the smaller type is converted to the larger type as the format can hold a subset of values the latter can hold.
This is safe. C++ has what are called the Usual arithmetic conversions and they handle how to implicitly convert the objects passed to the built in binary operators.
In this case, integer promotion happens and b is converted to a unsigned long long int for you and then operator > is evaluated.

Why is the type of (-i), where i is unsigned int, still unsigned int?

int main() {
unsigned int i = 1;
typeof(-i) j = -i;
}
When I look into the type of j in gdb, it shows it to be of type unsigned int. Why didn't the type move to the corresponding signed type? Is there a reason behind choosing this behavior? I find if the type would have converted to corresponding signed type more intuitive.
The unary negation operator - doesn't alter the type of its argument, other than applying integer promotions. Section 6.5.3.3p3 of the C standard states:
The result of the unary - operator is the negative of its (promoted)
operand. The integer promotions are performed on the operand, and the
result has the promoted type
The details of integer promotions are detailed in section 6.3.1.1p2:
The following may be used in an expression wherever an int or
unsigned int may be used:
An object or expression with an integer type (other than int or unsigned int) whose integer conversion rank is less than or equal to
the rank of int and unsigned int.
A bit-field of type _Bool, int, signed int, or unsigned int.
If an int can represent all values of the original type (as
restricted by the width, for a bit-field), the value is converted to
an int; otherwise, it is converted to an unsigned int. These are
called the integer promotions. All other types are unchanged by the
integer promotions.
The type of i is unsigned int which according to the above passage is not subject to integer promotions (indeed, it is a destination type of integer promotions). Therefore no promotion is performed and the type of -i is the same as the type of i.
The negation of an unsigned type doesn't produce a signed, negative value, because unsigned negation is meaningful and useful.
Firstly, for an given unsigned value a there is an unsigned value b such that a + b == 0. It makes sense to identify b as -a, and vice versa: a kind of additive inverse operation under the type's modulo arithmetic.
Unsigned arithmetic can emulate two's complement; in that situation, the negation serves as the two's complement operator: it "flips all the bits and adds one". For instance, -0x1U is 0xFF...FFU.

Why does applying bitwise NOT to a char yield an int? [duplicate]

This question already has answers here:
What is going on with bitwise operators and integer promotion?
(4 answers)
Closed 6 years ago.
On my laptop, running the following code:
#include <iostream>
using namespace std;
int main()
{
char a;
cout << sizeof(~a) << endl;
}
prints 4.
I expected the result of ~a to be a char, but apparently, it is an int.
Why is that?
~ is an arithemtic operator (bitwise NOT), and a is being promoted from signed char to int (and in many implementations sizeof(int) == 4). See below for an explanation:
http://en.cppreference.com/w/cpp/language/implicit_conversion#integral_promotion
Prvalues of small integral types (such as char) may be converted to
prvalues of larger integral types (such as int). In particular,
arithmetic operators do not accept types smaller than int as
arguments, and integral promotions are automatically applied after
lvalue-to-rvalue conversion, if applicable. This conversion always
preserves the value.
The standard says (§[expr.primary]/10):
The operand of ~ shall have integral or unscoped enumeration type; the result is the one’s complement of its operand. Integral promotions are performed. The type of the result is the type of the promoted operand.
"Integral promotions" means (§[conv.prom]/1):
A prvalue of an integer type other than bool, char16_t, char32_t, or wchar_t whose integer conversion rank (4.13) is less than the rank of int can be converted to a prvalue of type int if int can represent all the values of the source type; otherwise, the source prvalue can be converted to a prvalue of type unsigned int.
In your case, a has type char, which has a conversion rank less than the rank of int1, so it's being promoted to either int or unsigned int, both of which have the same size (apparently 4 in your implementation).
As to why things were done this way: I think a great deal is that it just simplifies both the language definition and the compiler quite a bit. Rather than having to generate code separately for nearly every type, it does its best to collapse everything down to a few types, and most code is generated only for those types. That's not so much the case any more (now that we have multiple types larger than int), but back when C was young, the integer types were: char, short, int (and unsigned versions of those), so all the other types were promoted to int, and all code to manipulate anything was done with ints.
Note that this applied to function calls and such too: in early versions of C there were no function prototypes, so any parameter of type char or short was promoted to int before being passed to a function too.
The same basic idea was followed with floating point types: under most circumstances (including passing them to functions) floats were promoted to double, and all the actual processing was done on doubles (after which you could convert back to float, if necessary.
In case you really want the quote for that too (§[conv.rank]:
1.3 A prvalue of an integer type other than bool, char16_t, char32_t, or wchar_t whose integer conversion rank (4.13) is less than the rank of int can be converted to a prvalue of type int if int can represent all the values of the source type; otherwise, the source prvalue can be converted to a prvalue of type unsigned int.
[...]
1.6 The rank of char shall equal the rank of signed char and unsigned char.

Why is unsigned short (multiply) unsigned short converted to signed int? [duplicate]

This question already has answers here:
Implicit type conversion rules in C++ operators
(9 answers)
Closed 7 years ago.
Why is unsigned short * unsigned short converted to int in C++11?
The int is too small to handle max values as demonstrated by this line of code.
cout << USHRT_MAX * USHRT_MAX << endl;
overflows on MinGW 4.9.2
-131071
because (source)
USHRT_MAX = 65535 (2^16-1) or greater*
INT_MAX = 32767 (2^15-1) or greater*
and (2^16-1)*(2^16-1) = ~2^32.
Should I expect any problems with this solution?
unsigned u = static_cast<unsigned>(t*t);
This program
unsigned short t;
cout<<typeid(t).name()<<endl;
cout<<typeid(t*t).name()<<endl;
gives output
t
i
on
gcc version 4.4.7 20120313 (Red Hat 4.4.7-16) (GCC)
gcc version 4.8.2 (GCC)
MinGW 4.9.2
with both
g++ p.cpp
g++ -std=c++11 p.cpp
which proves that t*t is converted to int on these compilers.
Usefull resources:
Signed to unsigned conversion in C - is it always safe?
Signed & unsigned integer multiplication
https://bytes.com/topic/c-sharp/answers/223883-multiplication-types-smaller-than-int-yields-int
http://www.cplusplus.com/reference/climits
http://en.cppreference.com/w/cpp/language/types
Edit: I have demonstrated the problem on the following image.
You may want to read about implicit conversions, especially the section about numeric promotions where it says
Prvalues of small integral types (such as char) may be converted to prvalues of larger integral types (such as int). In particular, arithmetic operators do not accept types smaller than int as arguments
What the above says is that if you use something smaller than int (like unsigned short) in an expression that involves arithmetic operators (which of course includes multiplication) then the values will be promoted to int.
It's the usual arithmetic conversions in action.
Commonly called argument promotion, although the standard uses that term in a more restricted way (the eternal conflict between reasonable descriptive terms and standardese).
C++11 §5/9:
” Many binary operators that expect operands of arithmetic or enumeration type cause conversions and yield
result types in a similar way. The purpose is to yield a common type, which is also the type of the result. This pattern is called the usual arithmetic conversions […]
The paragraph goes on to describe the details, which amount to conversions up a ladder of more general types, until all arguments can be represented. The lowest rung on this ladder is integral promotion of both operands of a binary operation, so at least that is performed (but the conversion can start at a higher rung). And integral promotion starts with this:
C++11 §4.5/1:
” A prvalue of an integer type other than bool, char16_t, char32_t, or wchar_t whose integer conversion
rank (4.13) is less than the rank of int can be converted to a prvalue of type int if int can represent all
the values of the source type; otherwise, the source prvalue can be converted to a prvalue of type unsigned int
Crucially, this is about types, not arithmetic expressions. In your case the arguments of the multiplication operator * are converted to int. Then the multiplication is performed as an int multiplication, yielding an int result.
As pointed out by Paolo M in comments, USHRT_MAX has type int (this is specified by 5.2.4.2.1/1: all such macros have a type at least as big as int).
So USHRT_MAX * USHRT_MAX is already an int x int, no promotions occur.
This invokes signed integer overflow on your system, causing undefined behaviour.
Regarding the proposed solution:
unsigned u = static_cast<unsigned>(t*t);
This does not help because t*t itself causes undefined behaviour due to signed integer overflow. As explained by the other answers, t is promoted to int before the multiplication occurs, for historical reasons.
Instead you could use:
auto u = static_cast<unsigned int>(t) * t;
which, after integer promotion, is an unsigned int multiplied by an int; and then according to the rest of the usual arithmetic conversions, the int is promoted to unsigned int, and a well-defined modular multiplication occurs.
With integer promotion rules
USHRT_MAX value is promoted to int.
then we do the multiplication of 2 int (with possible overflow).
It seems that nobody has answered this part of the question yet:
Should I expect any problems with this solution?
u = static_cast<unsigned>(t*t);
Yes, there is a problem here: it first computes t*t and allows it to overflow, then it converts the result to unsigned. Integer overflow causes undefined behavior according to the C++ standard (even though it may always work fine in practice). The correct solution is:
u = static_cast<unsigned>(t)*t;
Note that the second t is promoted to unsigned before the multiplication because the first operand is unsigned.
As it has been pointed out by other answers, this happens due to integer promotion rules.
The simplest way to avoid the conversion from an unsigned type with a smaller rank than a signed type with a larger rank, is to make sure the conversion is done into an unsigned int and not int.
This is done by multiplying by the value 1 that is of type unsigned int. Due to 1 being a multiplicative identity, the result will remain unchanged:
unsigned short c = t * 1U * t;
First the operands t and 1U are evaluated. Left operand is signed and has a smaller rank than the unsigned right operand, so it gets converted to the type of the right operand. Then the operands are multiplied and the same happens with the result and the remaining right operand. The last paragraph in the Standard cited below is used for this promotion.
Otherwise, the integer promotions are performed on both operands. Then the
following rules are applied to the promoted operands:
-If both operands have the same type, then no further conversion is needed.
-Otherwise, if both operands have signed integer types or both have unsigned
integer types, the operand with the type of lesser integer conversion rank is
converted to the type of the operand with greater rank.
-Otherwise, if the operand that has unsigned integer type has rank greater or
equal to the rank of the type of the other operand, then the operand with
signed integer type is converted to the type of the operand with unsigned
integer type.