I am not sure if promotion just means converting a data type to a larger data type (for example short to int).
Or does promotion mean converting a data type to another "compatible" data type, for example converting a short to an int, which will keep the same bit pattern (the extra space will be filled with zeros)? And does conversion mean converting something like an int to a float, which will create a completely different bit pattern?
There are two things that are called promotions: integral promotions and floating point promotions. Integral promotion refers to integral types (including bitfields and enums) being converted to "larger" integral types and floating point promotion is specifically just float to double.
Both types of promotions are subsets of a wider range of conversions.
char -> int: integral promotion
float -> double: floating point promotion
int -> char: [narrowing] conversion (not a promotion)
int -> float: conversion
const char* -> std::string: conversion
Foo -> Bar: possibly undefined conversion?
etc.
A promotion is a specific kind of conversion for built-in types that is guaranteed not to change the value.
The type you are promoting to must be able to accurately represent any possible value of the type you are promoting from.
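One way to observe the difference, as a minimal sketch: overload resolution ranks a promotion above any other conversion, so the overload that gets picked reveals which of the two applies. The `pick` overloads are hypothetical names made up for this illustration, and the checks assume int can represent every char value (true on mainstream platforms).

```cpp
// Hypothetical overload set: the return value tells us which overload was chosen.
constexpr int pick(int)         { return 1; }  // reached from char/short by integral promotion
constexpr int pick(long)        { return 2; }  // reached from char/short only by integral conversion
constexpr int pick(double)      { return 3; }  // reached from float by floating point promotion
constexpr int pick(long double) { return 4; }  // reached from float only by floating point conversion

// A promotion is a better match than a conversion, so each call is unambiguous.
static_assert(pick('a') == 1, "char -> int is a promotion");
static_assert(pick(static_cast<short>(7)) == 1, "short -> int is a promotion");
static_assert(pick(1.5f) == 3, "float -> double is a promotion");
```

The checks run at compile time, so compiling the file is enough to confirm them.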
Here is a list of the applicable conversions.
Promotion
char or short values (signed or unsigned) are promoted to int (or unsigned) before anything else happens
this is done because int is assumed to be the most efficient integral datatype, and it is guaranteed that no information will be lost by going from a smaller datatype to a larger one
Conversion
after integral promotion, the arguments to an operator are checked
if both are the same datatype, evaluation proceeds
if the arguments are of different datatypes, conversion will occur
Casts
the type of an expression can be forced using casts.
a cast is simply any valid datatype enclosed in parentheses and placed next to a constant, variable or expression
Please Refer to this : website
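The three steps above (promotion, conversion, cast) can be sketched as follows. The helper names are hypothetical, and the expected values assume an 8-bit char and an int wider than char.

```cpp
#include <type_traits>

// Promotion: both unsigned char operands become int before the addition.
constexpr int sum_in_int(unsigned char a, unsigned char b) { return a + b; }

// Cast: force the int result back into unsigned char (the value is reduced modulo 256).
constexpr unsigned char sum_cast_back(unsigned char a, unsigned char b) {
    return (unsigned char)(a + b);
}

static_assert(std::is_same<decltype((unsigned char)1 + (unsigned char)1), int>::value,
              "the sum of two unsigned chars has type int");
// Conversion: after promotion the operands still differ (int vs double), so the int is converted.
static_assert(std::is_same<decltype(1 + 2.0), double>::value, "int + double yields double");
static_assert(sum_in_int(200, 100) == 300, "no wrap-around, the sum is computed in int");
static_assert(sum_cast_back(200, 100) == 44, "300 modulo 256, assuming 8-bit char");
```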
It seems that GCC and Clang interpret addition between signed and unsigned integers differently, depending on their size. Why is this, and is the conversion consistent on all compilers and platforms?
Take this example:
#include <cstdint>
#include <iostream>
int main()
{
std::cout <<"16 bit uint 2 - int 3 = "<<uint16_t(2)+int16_t(-3)<<std::endl;
std::cout <<"32 bit uint 2 - int 3 = "<<uint32_t(2)+int32_t(-3)<<std::endl;
return 0;
}
Result:
$ ./out.exe
16 bit uint 2 - int 3 = -1
32 bit uint 2 - int 3 = 4294967295
In both cases we got -1, but one was interpreted as an unsigned integer and underflowed. I would have expected both to be converted in the same way.
So again, why do the compilers convert these so differently, and is this guaranteed to be consistent? I tested this with g++ 11.1.0, clang 12.0. and g++ 11.2.0 on Arch Linux and Debian, getting the same result.
When you do uint16_t(2)+int16_t(-3), both operands are types that are smaller than int. Because of this, each operand is promoted to an int and signed + signed results in a signed integer and you get the result of -1 stored in that signed integer.
When you do uint32_t(2)+int32_t(-3), since both operands are the size of an int or larger, no promotion happens and now you are in a case where you have unsigned + signed which results in a conversion of the signed integer into an unsigned integer, and the unsigned value of -1 wraps to being the largest value representable.
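The two cases can be checked at compile time. This is a sketch that assumes int is 32 bits wide, as on the platforms tested in the question; with a different int width the first two assertions may not hold. The alias and variable names are made up for the illustration.

```cpp
#include <cstdint>
#include <type_traits>

// Result types of the two expressions from the question.
using Sum16 = decltype(std::uint16_t(2) + std::int16_t(-3));
using Sum32 = decltype(std::uint32_t(2) + std::int32_t(-3));

constexpr Sum16 sum16 = std::uint16_t(2) + std::int16_t(-3);
constexpr Sum32 sum32 = std::uint32_t(2) + std::int32_t(-3);

// Assumption: 32-bit int. Both 16-bit operands are promoted to int.
static_assert(std::is_same<Sum16, int>::value, "16-bit operands are added as int");
// The 32-bit operands are not promoted; the signed one is converted to unsigned.
static_assert(std::is_same<Sum32, std::uint32_t>::value, "32-bit operands are added as unsigned");
static_assert(sum16 == -1, "signed result");
static_assert(sum32 == 4294967295u, "-1 wrapped to the largest 32-bit unsigned value");
```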
So again, why do the compilers convert these so differently,
Standard quotes for [language-lawyer]:
[expr.arith.conv]
Many binary operators that expect operands of arithmetic or enumeration type cause conversions and yield result types in a similar way.
The purpose is to yield a common type, which is also the type of the result.
This pattern is called the usual arithmetic conversions, which are defined as follows:
...
Otherwise, the integral promotions ([conv.prom]) shall be performed on both operands.
Then the following rules shall be applied to the promoted operands:
If both operands have the same type, no further conversion is needed.
Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank shall be converted to the type of the operand with greater rank.
Otherwise, if the operand that has unsigned integer type has rank greater than or equal to the rank of the type of the other operand, the operand with signed integer type shall be converted to the type of the operand with unsigned integer type.
Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, the operand with unsigned integer type shall be converted to the type of the operand with signed integer type.
Otherwise, both operands shall be converted to the unsigned integer type corresponding to the type of the operand with signed integer type.
[conv.prom]
A prvalue of an integer type other than bool, char8_t, char16_t, char32_t, or wchar_t whose integer conversion rank ([conv.rank]) is less than the rank of int can be converted to a prvalue of type int if int can represent all the values of the source type; otherwise, the source prvalue can be converted to a prvalue of type unsigned int.
These conversions are called integral promotions.
std::uint16_t type may have a lower conversion rank than int in which case it will be promoted when used as an operand. int may be able to represent all values of std::uint16_t in which case the promotion will be to int. The common type of two int is int.
std::uint32_t type may have the same or a higher conversion rank than int in which case it won't be promoted. The common type of an unsigned type and a signed of same rank is an unsigned type.
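As a rough illustration of the quoted rules, the result type of a mixed expression can be inspected with decltype. The last case depends on whether long can represent every unsigned int value, so the sketch computes the expected type instead of hard-coding it; the names `long_holds_unsigned` and `ExpectedMixed` are made up here.

```cpp
#include <limits>
#include <type_traits>

// Same rank, mixed signedness: the signed operand is converted to unsigned.
static_assert(std::is_same<decltype(1 + 1u), unsigned int>::value, "int + unsigned int");

// Both signed, different rank: the lower-ranked operand is converted.
static_assert(std::is_same<decltype(1 + 1L), long>::value, "int + long");

// Signed operand has the greater rank: the outcome depends on whether long can
// represent every unsigned int value, which varies between implementations.
constexpr bool long_holds_unsigned =
    std::numeric_limits<long>::digits >= std::numeric_limits<unsigned int>::digits;
using ExpectedMixed = std::conditional<long_holds_unsigned, long, unsigned long>::type;
static_assert(std::is_same<decltype(1u + 1L), ExpectedMixed>::value, "unsigned int + long");
```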
For an explanation why this conversion behaviour was chosen, see chapter "6.3.1.1 Booleans, characters, and integers" of "Rationale for International Standard - Programming Languages - C". I won't quote the entire chapter here.
is this guaranteed to be consistent?
The consistency depends on relative sizes of the integer types which are implementation defined.
Why is this,
C (and hence C++) has a rule that effectively says when a type smaller than int is used in an expression it is first promoted to int (the actual rule is a little more complex than that to allow for multiple distinct types of the same size).
Section 6.3.1.1 of the Rationale for International Standard Programming Languages C claims that in early C compilers there were two versions of the promotion rule. "unsigned preserving" and "value preserving" and talks about why they chose the "value preserving" option. To summarise they believed it would produce correct results in a greater proportion of situations.
It does not however explain why the concept of promotion exists in the first place. I would speculate that it existed because on many processors, including the PDP-11 for which C was originally designed, arithmetic operations only operated on words, not on units smaller than words. So it was simpler and more efficient to convert everything smaller than a word to a word at the start of an expression.
On most platforms today int is 32 bits. So both uint16_t and int16_t are promoted to int. The arithmetic proceeds to produce a result of type int with a value of -1.
OTOH uint32_t and int32_t are not smaller than int, so they retain their original size and signedness through the promotion step. The rules for when the operands to an arithmetic operator are of different types come into play and since the operands are the same size the signed operand is converted to unsigned.
The rationale does not seem to talk about this rule, which suggests it goes back to pre-standard C.
and is the conversion consistent on all compilers and platforms?
On an ANSI C or ISO C++ platform it depends on the size of int. With 16-bit int both examples would give large positive values. With 64-bit int both examples would give -1.
On pre-standard implementations it's possible that both expressions might return large positive numbers.
A 16-bit unsigned int can be promoted to a 32-bit int without any lost values due to range differences, so that's what happens. Not so for the 32-bit integers.
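If the same signed result is wanted regardless of the width of int, one option is to widen explicitly before adding. This is only a sketch of that idea, not something the standard prescribes; `signed_sum` is a hypothetical helper.

```cpp
#include <cstdint>

// Hypothetical helper: convert the unsigned operand to a signed 64-bit type first,
// so the addition is carried out on signed operands whatever the width of int is.
constexpr std::int64_t signed_sum(std::uint32_t a, std::int32_t b) {
    return static_cast<std::int64_t>(a) + b;
}

static_assert(signed_sum(2, -3) == -1, "no wrap-around to a large unsigned value");
static_assert(signed_sum(4294967295u, 1) == 4294967296, "the full 32-bit unsigned range still fits");
```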
So this is probably a really simple question and if it was not about C++ I would just go ahead and check if it works on my computer or not, but unfortunately in C++ things usually tend to work on a couple of systems while still being UB and therefore not working on other systems.
Consider the following code snippet:
unsigned long long int a = std::numeric_limits< unsigned long long int >::max();
unsigned int b = 12;
bool test = a > b;
My question is: Can we compare integers of different size with one another without explicitly casting the smaller type to the bigger one using e.g. static_cast without running into undefined behavior (UB)?
In general there are three ways I can imagine this turning out:
The smaller type is implicitly cast to the bigger type before comparison (either via a real cast or by some clever way of being able to "pretend" it had been cast)
The bigger type is truncated to the size of the smaller one before comparison
This is not defined and one needs to add in an explicit cast in order to arrive at defined behavior
This is not undefined behavior. This is covered by the usual arithmetic conversions which are detailed in section 8p11.5 of the C++17 standard:
The integral promotions (7.6) shall be performed on both operands.
Then the following rules shall be applied to the promoted operands:
(11.5.1) If both operands have the same type, no further conversion is needed.
(11.5.2) Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank shall be converted to the type of the operand with greater rank.
(11.5.3) Otherwise, if the operand that has unsigned integer type has rank greater than or equal to the rank of the type of the other operand, the operand with signed integer type shall be converted to the type of the operand with unsigned integer type.
(11.5.4) Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, the operand with unsigned integer type shall be converted to the type of the operand with signed integer type.
(11.5.5) Otherwise, both operands shall be converted to the unsigned integer type corresponding to the type of the operand with signed integer type.
The passage that applies here is (11.5.2). Since both types are unsigned, the smaller type is converted to the larger type, as the former can hold only a subset of the values the latter can hold.
This is safe. C++ has what are called the Usual arithmetic conversions and they handle how to implicitly convert the objects passed to the built in binary operators.
In this case, b is converted to unsigned long long int for you (an integral conversion rather than an integral promotion, since unsigned int is not narrower than int) and then operator > is evaluated.
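A small sketch of the comparison from the question, plus the mixed-sign case where the same rules give a surprising (but still well-defined) answer. The helper names are hypothetical.

```cpp
#include <limits>

// Both operands unsigned: b is converted to unsigned long long, no value is lost.
constexpr bool greater(unsigned long long a, unsigned int b) { return a > b; }

// Mixed signedness with equal rank: a is converted to unsigned int before comparing.
constexpr bool less_mixed(int a, unsigned int b) { return a < b; }

static_assert(greater(std::numeric_limits<unsigned long long>::max(), 12u),
              "the comparison from the question is true");
static_assert(!less_mixed(-1, 1u),
              "-1 becomes the largest unsigned int, so -1 < 1u is false");
```

Neither comparison is undefined behavior; the second one is simply evaluated on the converted values.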
On my laptop, running the following code:
#include <iostream>
using namespace std;
int main()
{
char a;
cout << sizeof(~a) << endl;
}
prints 4.
I expected the result of ~a to be a char, but apparently, it is an int.
Why is that?
~ is an arithmetic operator (bitwise NOT), and a is being promoted from char to int (and in many implementations sizeof(int) == 4). See below for an explanation:
http://en.cppreference.com/w/cpp/language/implicit_conversion#integral_promotion
Prvalues of small integral types (such as char) may be converted to
prvalues of larger integral types (such as int). In particular,
arithmetic operators do not accept types smaller than int as
arguments, and integral promotions are automatically applied after
lvalue-to-rvalue conversion, if applicable. This conversion always
preserves the value.
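A compile-time sketch of what happens to the operand of ~. It assumes an 8-bit char and two's complement representation (guaranteed only since C++20); the helper names are made up for the illustration.

```cpp
#include <type_traits>

// The operand is promoted first, so the result has type int, not char.
static_assert(std::is_same<decltype(~char{}), int>::value, "~ on a char yields an int");
static_assert(sizeof(~char{}) == sizeof(int), "which is why sizeof(~a) printed 4 in the question");

// The complement is taken on the promoted int value.
constexpr int complement(unsigned char x) { return ~x; }

// Converting back keeps only the low bits.
constexpr unsigned char complement_narrow(unsigned char x) {
    return static_cast<unsigned char>(~x);
}

static_assert(complement(0) == -1, "all bits of the int are set");
static_assert(complement_narrow(0) == 255, "assuming 8-bit char");
```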
The standard says (§[expr.unary.op]/10):
The operand of ~ shall have integral or unscoped enumeration type; the result is the one’s complement of its operand. Integral promotions are performed. The type of the result is the type of the promoted operand.
"Integral promotions" means (§[conv.prom]/1):
A prvalue of an integer type other than bool, char16_t, char32_t, or wchar_t whose integer conversion rank (4.13) is less than the rank of int can be converted to a prvalue of type int if int can represent all the values of the source type; otherwise, the source prvalue can be converted to a prvalue of type unsigned int.
In your case, a has type char, which has a conversion rank less than the rank of int (see the [conv.rank] quote below), so it's being promoted to either int or unsigned int, both of which have the same size (apparently 4 in your implementation).
As to why things were done this way: I think a great deal is that it just simplifies both the language definition and the compiler quite a bit. Rather than having to generate code separately for nearly every type, it does its best to collapse everything down to a few types, and most code is generated only for those types. That's not so much the case any more (now that we have multiple types larger than int), but back when C was young, the integer types were: char, short, int (and unsigned versions of those), so all the other types were promoted to int, and all code to manipulate anything was done with ints.
Note that this applied to function calls and such too: in early versions of C there were no function prototypes, so any parameter of type char or short was promoted to int before being passed to a function too.
The same basic idea was followed with floating point types: under most circumstances (including passing them to functions) floats were promoted to double, and all the actual processing was done on doubles (after which you could convert back to float, if necessary).
In case you really want the quote for that too (§[conv.rank]):
1.3 The rank of long long int shall be greater than the rank of long int, which shall be greater than the rank of int, which shall be greater than the rank of short int, which shall be greater than the rank of signed char.
[...]
1.6 The rank of char shall equal the rank of signed char and unsigned char.
This expression can be found in the Example in §8.5.4/7 in the Standard (N3797)
unsigned int ui1 = {-1}; // error: narrows
Given §8.5.4/7 and its 4th bullet point:
A narrowing conversion is an implicit conversion:
from an integer type or unscoped enumeration type to an integer type that cannot represent all the values of the original type,
except where the source is a constant expression whose value after
integral promotions will fit into the target type.
I would say there's no narrowing here, as -1 is a constant expression, whose value after integral promotion fits into an unsigned int.
See also §4.5/1 about Integral Promotion:
A prvalue of an integer type other than bool, char16_t, char32_t, or
wchar_t whose integer conversion rank (4.13) is less than the rank of
int can be converted to a prvalue of type int if int can represent all
the values of the source type; otherwise, the source prvalue can be
converted to a prvalue of type unsigned int.
From 4.13 we have that the rank of -1 (an int) is equal to the rank of an unsigned int, and so it can be converted to an unsigned int.
Edit
Unfortunately Jerry Coffin removed his answer from this thread. I believe he was on the right track, if we accept the fact that the current reading of the 4th bullet point in §8.5.4/7 is wrong, after this change in the Standard.
There is no integral promotion from int to unsigned int, therefore it is still ill-formed.
That would be an integral conversion.
The change in the wording of the standard is intended to confirm the understanding that converting a negative value into an unsigned type is and always has been a narrowing conversion.
Informally, -1 cannot be represented within the range of any unsigned type, and the bit pattern that represents it does not represent the same value if stored in an unsigned int. Therefore this is a narrowing conversion and promotion/widening is not involved.
This is about the dainty art of reading the standard. As usual, the compiler knows best.
Narrowing is an implicit conversion from an integer type to an integer type that cannot represent all the values of the original type
Converting from int to unsigned int falls under this definition, since unsigned int cannot represent all the values of int (the negative ones, in particular); the standard is quite unambiguous here. If you really need it, you can do
unsigned int ui1 = {-1u};
This works without any errors/warnings since 1u is a literal of type unsigned int which is then negated. This is well-defined as [expr.unary.op] in the standard states:
The negative of an unsigned quantity is computed by subtracting its value from 2^n, where n is the number of bits in the promoted operand.
-1 however is an int before and after the negation and hence it becomes a narrowing conversion. There are no negative literals; see this answer for details.
Promotion is, informally, copying the same value into a larger space, so it shouldn't be confused here as the sizes (of signed and unsigned) are equal. Even if you try to convert it to some type of larger size, say unsigned long long it's still a narrowing conversion as it cannot represent a negative number truly.
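A short sketch of the alternatives that do compile. The rejected initializer is left as a comment because it is ill-formed; the variable names are made up here.

```cpp
#include <limits>

// unsigned int bad = {-1};  // ill-formed: converting the int value -1 is narrowing

// -1u already has type unsigned int, so list-initialization involves no conversion.
constexpr unsigned int from_braces = {-1u};

// An explicit conversion is allowed as well; the value wraps modulo 2^n.
constexpr unsigned int from_cast = static_cast<unsigned int>(-1);

static_assert(from_braces == std::numeric_limits<unsigned int>::max(),
              "negating 1u yields the largest unsigned int");
static_assert(from_cast == from_braces, "both spellings give the same value");
```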
So, if I understood it well, integral promotion provides that: char, wchar_t, bool, enum, short types ALWAYS are converted to int (or unsigned int). Then, if there are different types in an expression, further conversion will be applied.
Am I understanding this well?
And if yes, then my question: Why is it good? Why? Doesn't this make char/wchar_t/bool/enum/short unnecessary? I mean for example:
char c1;
char c2;
c1 = c2;
As I described before, char ALWAYS is converted to int, so in this case after automatic converting this looks like this:
int c1;
int c2;
c1 = c2;
But I can't understand why this is good, if I know that the char type will be enough for my needs.
Storage types are never automatically converted. You only get automatic integer promotion as soon as you start doing integer arithmetic (+, -, bit shifts, ...) on those variables.
char c1, c2; // stores them as char
char c3 = c1 + c2; // equivalent to
char c3 = (char)((int)c1 + (int)c2);
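The same point as compile-time checks, as a sketch: the stored object stays a char, and only the arithmetic expression has type int. `add_chars` is a hypothetical helper mirroring the c3 line above, and the type check assumes int can represent every char value.

```cpp
#include <type_traits>

// Storage is untouched: a char object always occupies exactly one byte.
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");

// Only the arithmetic expression is carried out in int.
static_assert(std::is_same<decltype(char{} + char{}), int>::value, "char + char has type int");

// Hypothetical helper: the int sum is converted back when stored into a char.
constexpr char add_chars(char c1, char c2) { return static_cast<char>(c1 + c2); }

static_assert(add_chars(40, 2) == 42, "the value survives the round trip through int");
static_assert(std::is_same<decltype(add_chars(40, 2)), char>::value, "the stored result is a char again");
```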
The conversions you're asking about are the usual arithmetic conversions and the integer promotions, defined in section 6.3.1.8 of the latest ISO C standard. They're applied to the operands of most binary operators ("binary" meaning that they take two operands, such as +, *, etc.). (The rules are similar for C++. In this answer, I'll just refer to the C standard.)
Briefly the usual arithmetic conversions are:
If either operand is long double, the other operand is converted to long double.
Otherwise, if either operand is double, the other operand is converted to double.
Otherwise, if either operand is float, the other operand is converted to float.
Otherwise, the integer promotions are performed on both operands, and then some other rules are applied to bring the two operands to a common type.
The integer promotions are defined in section 6.3.1.1 of the C standard. For a type narrower than int, if the type int can hold all the values of the type, then an expression of that type is converted to int; otherwise it's converted to unsigned int. (Note that this means that an expression of type unsigned short may be converted either to int or to unsigned int, depending on the relative ranges of the types.)
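The unsigned short case can be checked without hard-coding the platform. In this sketch (written in C++, where the rule is the same) the expected type is computed from the value ranges; the names are made up for the illustration.

```cpp
#include <limits>
#include <type_traits>

// Unary + applies the integer promotions to its operand.
using PromotedUShort = decltype(+static_cast<unsigned short>(0));

// int can hold every unsigned short value only if it has at least as many value bits.
constexpr bool int_holds_ushort =
    std::numeric_limits<int>::digits >= std::numeric_limits<unsigned short>::digits;

using ExpectedPromotion = std::conditional<int_holds_ushort, int, unsigned int>::type;

static_assert(std::is_same<PromotedUShort, ExpectedPromotion>::value,
              "unsigned short promotes to int if int can hold its range, else to unsigned int");
```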
The integer promotions are also applied to function arguments when the declaration doesn't specify the type of the parameter. For example:
short s = 2;
printf("%d\n", s);
promotes the short value to int. This promotion does not occur for non-variadic functions.
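A minimal sketch of the same effect inside a user-written variadic function: the callee has to read the promoted type. The helper names are hypothetical, and the example is written so that it also compiles as C++.

```cpp
#include <cassert>
#include <cstdarg>

// Hypothetical helper: reads the first variadic argument as int.
int read_int(int count, ...) {
    va_list args;
    va_start(args, count);
    int value = va_arg(args, int);  // a short argument arrives here as int
    va_end(args);
    return value;
}

// Hypothetical helper: reads the first variadic argument as double.
double read_double(int count, ...) {
    va_list args;
    va_start(args, count);
    double value = va_arg(args, double);  // a float argument arrives here as double
    va_end(args);
    return value;
}

void demo() {
    short s = 2;
    float f = 1.5f;
    assert(read_int(1, s) == 2);       // s was promoted to int at the call
    assert(read_double(1, f) == 1.5);  // f was promoted to double at the call
}
```

Reading the arguments with va_arg(args, short) or va_arg(args, float) instead would not match what was actually passed.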
The quick answer for why this is done is that the standard says so.
The underlying reason for all this complexity is to allow for the restricted set of arithmetic operations available on most CPUs. With this set of rules, all arithmetic operators (other than the shift operators, which are a special case) are only required to work on operands of the same type. There is no short + long addition operator; instead, the short operand is implicitly converted to long. And there are no arithmetic operators for types narrower than int; if you add two short values, both arguments are promoted to int, yielding an int result (which might then be converted back to short).
Some CPUs can perform arithmetic on narrow operands, but not all can do so. Without this uniform set of rules, either compilers would have to emulate narrow arithmetic on CPUs that don't support it directly, or the behavior of arithmetic expressions would vary depending on what operations the target CPU supports. The current rules are a good compromise between consistency across platforms and making good use of CPU operations.
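The operator rules described above can be confirmed with a few compile-time checks. This sketch is written in C++, where the rules are the same; `add_shorts` is a hypothetical helper.

```cpp
#include <type_traits>

// No arithmetic on types narrower than int: both shorts are promoted.
static_assert(std::is_same<decltype(short{} + short{}), int>::value, "short + short has type int");

// No short + long operator: the short operand is converted to long.
static_assert(std::is_same<decltype(short{} + long{}), long>::value, "short + long has type long");

// Shifts are the special case: the result has the type of the promoted left operand.
static_assert(std::is_same<decltype(short{} << long{}), int>::value, "short << long has type int");

// The int result can be converted back to short when it is stored.
constexpr short add_shorts(short a, short b) { return static_cast<short>(a + b); }
static_assert(add_shorts(1000, 2000) == 3000, "the sum fits in a short here");
```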
if I understood it well, integral promotion provides that: char, wchar_t, bool, enum, short types ALWAYS converted to int (or unsigned int).
Your understanding is only partially correct: short types are indeed promoted to int, but only when you use them in expressions. The conversion is done immediately before the use. It is also "undone" when the result is stored back.
The way the values are stored remains consistent with the properties of the type, letting you control the way you use your memory for the variables that you store. For example,
struct Test {
char c1;
char c2;
};
will be a quarter of the size of
struct Test {
int c1;
int c2;
};
on systems with 32-bit ints.
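The size difference can be checked directly. In this sketch the two structs are renamed (hypothetically) so that both fit in one file, and the exact sizes assume a typical implementation that adds no padding.

```cpp
#include <type_traits>

struct TestChars { char c1; char c2; };
struct TestInts  { int c1; int c2; };

// Storage follows the declared member types, not the promoted type.
static_assert(sizeof(TestChars) == 2, "two chars, no padding on typical implementations");
static_assert(sizeof(TestInts) >= 2 * sizeof(int), "two ints");
static_assert(sizeof(TestChars) < sizeof(TestInts), "the char version is smaller");

// Promotion only shows up once the members are used in an expression.
static_assert(std::is_same<decltype(TestChars{}.c1 + TestChars{}.c2), int>::value,
              "c1 + c2 is still computed as int");
```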
The conversion is not performed when you store the value in the variable. It is performed when you explicitly cast the value or when you use it in an operation such as arithmetic.
It really depends on your underlying microprocessor architecture. For example, if your processor is 32-bit, that is its native integer size. Using its native integer size in integer computations is better optimized.
Type conversion takes place when arithmetic operations, shift operations, unary operations are performed. See what standard says about it:
C11; 6.3.1.1 Boolean, characters, and integers:
If an int can represent all values of the original type (as restricted by the width, for a
bit-field), the value is converted to an int; otherwise, it is converted to an unsigned
int. These are called the integer promotions.58) All other types are unchanged by the
integer promotions.
58. The integer promotions are applied only: as part of the usual arithmetic conversions, to certain argument expressions, to the operands of the unary +, -, and ~ operators, and to both operands of the shift operators,[1] as specified by their respective subclauses.
1. Emphasis is mine.