Is promotion and widening the same thing? - c++

Is there a difference between promotion and widening? I've heard that widening only describes integral promotion.

Widening "typically" refers to integral/floating point types (as in a char going to a long or float to double), but it can also refer to character widening (as in going from a char type to a wchar_t type).
Widening conversions are also known as "promotions" and narrowing conversions are known as "coercion".
The notions of "promotion" and "coercion" can also be used in the OO sense (polymorphism), as in promotion of a base class to a derived type, or coercion of a derived type to base. In this sense it's still a "widening" and "narrowing", as the address space used for the base is "less" than that of the derived type (hence you are widening/promoting your types when "up-casting", or narrowing/coercing your types when "down-casting").
So to answer directly: is there a difference between promotion and widening? No, not really (unless you are feeling pedantic), though I probably wouldn't say "widen that class type" over "promote that class type" if I were talking about non-integrals (just to avoid any possible initial confusion).

It really depends on context, because the term "widening" is an informal term, and the meaning varies a bit depending on who is telling the story. I'll describe some common interpretations (but not the only ones).
Before doing that, it is necessary to describe what promotions are.
The C++ standard describes integral promotions (between integral types) and floating point promotions (between floating point types). Conversion between an integral type and a floating point type is not described as a promotion.
The common features are that promotions are generally value preserving (except from signed to unsigned integral types, which uses modulo arithmetic) but need not involve increasing the size of a variable (or range of values it can represent). For example, a short may be promoted to an int, but a short and an int may also be the same size (albeit that is implementation/compiler dependent).
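To make the promotion rules concrete, here is a small sketch. It relies on the fact that unary plus applies the integral promotions to its operand, so decltype(+x) shows the promoted type. The function name demo_promotions is made up for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Unary plus applies the integral promotions, so decltype(+x) is the promoted type.
static_assert(std::is_same<decltype(+static_cast<signed char>(0)), int>::value,
              "signed char promotes to int");
static_assert(std::is_same<decltype(+static_cast<short>(0)), int>::value,
              "short promotes to int, even where short and int have the same size");
static_assert(std::is_same<decltype(+true), int>::value, "bool promotes to int");
// Types of rank int or higher are left alone.
static_assert(std::is_same<decltype(+0L), long>::value, "long is not promoted");
// Unary plus does not apply the floating point promotion.
static_assert(std::is_same<decltype(+1.0f), float>::value, "float stays float here");

// Hypothetical helper: promoting a short keeps its value.
void demo_promotions() {
    short s = -123;
    int promoted = +s;
    assert(promoted == -123);
}
```

Calling demo_promotions() from main runs the one runtime check; the rest is verified at compile time.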
The C++ standard doesn't use the term "widening" at all (except in some contexts in the library, unrelated to type conversions). A common informal meaning, in the context of integral and floating point conversions, is a promotion that is BOTH value preserving AND to a larger type. The implementation typically fills the additional bits in the result with zeros, or with copies of the sign bit for a negative signed value (i.e. it makes the value wider without fiddling with the bits that represent it). So signed char to short, short to long, and unsigned char to unsigned short are widening conversions (assuming none of the types are of equal size). Similarly, float to double is a widening conversion (the standard guarantees that the set of values a float can represent is a subset of the set of values a double can represent). Conversion from int to double is not a widening (e.g. not necessarily value preserving, bits may be fiddled).
Widening is also sometimes used to describe a conversion of a pointer to derived class into a pointer to base class (or between similar references). The reverse is called "narrowing" and - in C++ - can only be forced with an explicit type conversion.
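The pointer case can be sketched as follows. Base, Derived, to_base and to_derived are names invented for this example; the point is only that the derived-to-base direction compiles without a cast while the reverse needs one.

```cpp
#include <cassert>

struct Base { int id = 1; };
struct Derived : Base { int extra = 2; };

// Derived* -> Base* is an implicit conversion.
Base* to_base(Derived* d) { return d; }

// Base* -> Derived* must be forced with an explicit cast. The static_cast is
// only valid if the object really is a Derived.
Derived* to_derived(Base* b) { return static_cast<Derived*>(b); }

void demo_pointer_conversions() {
    Derived d;
    Base* b = to_base(&d);
    assert(b->id == 1);
    assert(to_derived(b) == &d);
    assert(to_derived(b)->extra == 2);
}
```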

Related

Is comparing a short to a long an implicit conversion?

From what I understand, comparing two different types, including a short and a long will still cause a conversion. I believe that a short will be promoted to an int. However, I can't seem to find a direct answer on comparing a short to a long.
For example:
Is it improper to compare a Uint32 to a Uint8?
Is it improper to add a Uint32 to a Uint8?
Uint32/Uint8 are shorthand typedefs in SDL for uint32_t and uint8_t, respectively.
EDIT:
I suppose I should be a bit more explicit about my overall question. I'm really wondering whether comparing or evaluating two different types of ints that have the same signedness (in the example case, unsigned) but differ in SIZE (uint8_t and uint32_t) is an improper thing to do.
Perhaps it is improper due to an implicit conversion. Perhaps it is improper because of a performance issue other than conversion. Maybe it is frowned upon because of some sort of readability issue I am unaware of.
In the comments two similar questions were linked, however they are comparing an int to a long. Which I believe is very similar, but doesn't an int just take the form of whichever version is needed (uint8_t, sint16_t, etc.)?
I believe this question is answered with this: http://en.cppreference.com/w/cpp/language/operator_arithmetic under the subsection 'Conversions'.
Otherwise, the operand has integer type (because bool, char, char16_t, char32_t, wchar_t, and unscoped enumeration were promoted at this point) and integral conversions are applied to produce the common type, as follows:
If both operands are signed or both are unsigned, the operand with lesser conversion rank is converted to the operand with the greater integer conversion rank
So the overall answer to my question is: yes, there is a conversion. And from what I can tell, there are no issues with comparing two unsigned int types, as long as you don't compare signed and unsigned.
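The rule quoted above can be checked with a short sketch. It assumes int is at most 32 bits wide, so that std::uint32_t has at least the rank of int; the function names are made up for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// uint8_t is first promoted to int, so two of them added together give an int.
static_assert(std::is_same<decltype(std::uint8_t{} + std::uint8_t{}), int>::value,
              "uint8_t + uint8_t is int");
// Assumption: int is at most 32 bits, so the common type is the 32-bit unsigned type.
static_assert(std::is_same<decltype(std::uint8_t{} + std::uint32_t{}), std::uint32_t>::value,
              "uint8_t + uint32_t is uint32_t");

// Both operands end up unsigned, so the comparison behaves as expected.
bool less_than(std::uint8_t small, std::uint32_t big) { return small < big; }

// The addition is done in uint32_t and wraps modulo 2^32.
std::uint32_t add(std::uint8_t small, std::uint32_t big) { return small + big; }

void demo_mixed_width() {
    assert(less_than(200, 70000));
    assert(!less_than(255, 255));
    assert(add(255, UINT32_MAX) == 254u);
}
```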

Why does C++ standard specify signed integer be cast to unsigned in binary operations with mixed signedness?

The C and C++ standards stipulate that, in binary operations between a signed and an unsigned integer of the same rank, the signed integer is cast to unsigned. There are many questions on SO caused by this... let's call it strange behavior: unsigned to signed conversion, C++ Implicit Conversion (Signed + Unsigned), A warning - comparison between signed and unsigned integer expressions, % (mod) with mixed signedness, etc.
But none of these give any reasons as to why the standard goes this way, rather than casting towards signed ints. I did find a self-proclaimed guru who says it's the obvious right thing to do, but he doesn't give a reasoning either: http://embeddedgurus.com/stack-overflow/2009/08/a-tutorial-on-signed-and-unsigned-integers/.
Looking through my own code, wherever I combine signed and unsigned integers, I always need to cast from unsigned to signed. There are places where it doesn't matter, but I haven't found a single example of code where it makes sense to cast the signed integer to unsigned.
What are cases where casting to unsigned is the correct thing to do? Why is the standard the way it is?
Casting from unsigned to signed results in implementation-defined behaviour if the value cannot be represented. Casting from signed to unsigned is always modulo two to the power of the unsigned's bitsize, so it is always well-defined.
The standard conversion is to the signed type if every possible unsigned value is representable in the signed type. Otherwise, the unsigned type is chosen. This guarantees that the conversion is always well-defined.
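A minimal sketch of both halves of this rule. The helper names are invented, and the long/unsigned check is written so that it holds whether or not long is wider than unsigned on a given platform.

```cpp
#include <cassert>
#include <limits>
#include <type_traits>

// Same rank, mixed signedness: the signed operand is converted to unsigned.
static_assert(std::is_same<decltype(-1 + 1u), unsigned>::value, "int + unsigned is unsigned");

// long vs unsigned: the result is long only if long can hold every unsigned value.
using Expected = std::conditional<
    (std::numeric_limits<long>::digits >= std::numeric_limits<unsigned>::digits),
    long, unsigned long>::type;
static_assert(std::is_same<decltype(0L + 0u), Expected>::value,
              "long + unsigned picks the signed type only when it is lossless");

// Signed to unsigned is always well-defined: the value is reduced modulo 2^N.
unsigned to_unsigned(int v) { return static_cast<unsigned>(v); }

// s is converted to unsigned before the comparison.
bool mixed_less(int s, unsigned u) { return s < u; }

void demo_signed_to_unsigned() {
    assert(to_unsigned(-1) == std::numeric_limits<unsigned>::max());
    assert(!mixed_less(-1, 1u));  // -1 became a very large unsigned value
    assert(mixed_less(0, 1u));
}
```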
Notes
As indicated in comments, the conversion algorithm for C++ was inherited from C to maintain compatibility, which is technically the reason it is so in C++.
When this note was written, the C++ standard allowed three binary representations, including sign-magnitude and ones' complement. That's no longer the case, and there's every reason to believe that it won't be the case for C either in the reasonably near future. I'm leaving the footnote as a historical relic, but it says nothing relevant to the current language.
It has been suggested that the decision in the standard to define signed to unsigned conversions and not unsigned to signed conversions is somehow arbitrary, and that the other possible decision would be symmetric. However, the possible conversions are not symmetric.
In both of the non-2's-complement representations contemplated by the standard, an n-bit signed representation can represent only 2^n − 1 values, whereas an n-bit unsigned representation can represent 2^n values. Consequently, a signed-to-unsigned conversion is lossless and can be reversed (although one unsigned value can never be produced). The unsigned-to-signed conversion, on the other hand, must collapse two different unsigned values onto the same signed result.
In a comment, the formula sint = uint > sint_max ? uint - uint_max : uint is proposed. This coalesces the values uint_max and 0; both are mapped to 0. That's a little weird even for non-2s-complement representations, but for 2's-complement it's unnecessary and, worse, it requires the compiler to emit code to laboriously compute this unnecessary conflation. By contrast the standard's signed-to-unsigned conversion is lossless and in the common case (2's-complement architectures) it is a no-op.
If conversion to signed had been chosen, then a simple a+1 would always result in a signed type (unless the constant was typed as 1U).
Assume a is an unsigned int; then this seemingly innocent increment a+1 could lead to things like undefined overflow or "index out of bounds" in the case of arr[a+1].
Thus, "unsigned casting" seems like a safer approach because people probably don't even expect casting to be happening in the first place, when simply adding a constant.
This is sort of a half-answer, because I don't really understand the committee's reasoning.
From the C90 committee's rationale document: https://www.lysator.liu.se/c/rat/c2.html#3-2-1-1
Since the publication of K&R, a serious divergence has occurred among implementations of C in the evolution of integral promotion rules. Implementations fall into two major camps, which may be characterized as unsigned preserving and value preserving. The difference between these approaches centers on the treatment of unsigned char and unsigned short, when widened by the integral promotions, but the decision has an impact on the typing of constants as well (see §3.1.3.2).
... and apparently also on the conversions done to match the two operands for any operator. It continues:
Both schemes give the same answer in the vast majority of cases, and both give the same effective result in even more cases in implementations with twos-complement arithmetic and quiet wraparound on signed overflow --- that is, in most current implementations.
It then specifies a case where ambiguity of interpretation arises, and states:
The result must be dubbed questionably signed, since a case can be made for either the signed or unsigned interpretation. Exactly the same ambiguity arises whenever an unsigned int confronts a signed int across an operator, and the signed int has a negative value. (Neither scheme does any better, or any worse, in resolving the ambiguity of this confrontation.) Suddenly, the negative signed int becomes a very large unsigned int, which may be surprising --- or it may be exactly what is desired by a knowledgable programmer. Of course, all of these ambiguities can be avoided by a judicious use of casts.
and:
The unsigned preserving rules greatly increase the number of situations where unsigned int confronts signed int to yield a questionably signed result, whereas the value preserving rules minimize such confrontations. Thus, the value preserving rules were considered to be safer for the novice, or unwary, programmer. After much discussion, the Committee decided in favor of value preserving rules, despite the fact that the UNIX C compilers had evolved in the direction of unsigned preserving.
Thus, they consider the case of int + unsigned an unwanted situation, and chose conversion rules for char and short that yield as few of those situations as possible, even though most compilers at the time followed a different approach. If I understand right, this choice then forced them to follow the current choice of int + unsigned yielding an unsigned operation.
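The effect of the value preserving rules can be seen in a short sketch. It assumes int can represent every unsigned char value, which is the case on mainstream platforms; diff and udiff are names made up for the example.

```cpp
#include <cassert>
#include <type_traits>

// Value preserving promotion: unsigned char becomes (signed) int.
static_assert(std::is_same<decltype(static_cast<unsigned char>(1) -
                                    static_cast<unsigned char>(2)), int>::value,
              "unsigned char - unsigned char is computed as int");

// The subtraction is done in int, so a negative result is possible.
int diff(unsigned char a, unsigned char b) { return a - b; }

// With unsigned int operands nothing is promoted and the result wraps around.
unsigned udiff(unsigned a, unsigned b) { return a - b; }

void demo_value_preserving() {
    assert(diff(200, 201) == -1);
    assert(udiff(200u, 201u) == ~0u);  // the largest unsigned value
}
```

Under unsigned preserving rules the first subtraction would have been done in unsigned int and would have given the same wrapped result as the second.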
I still find all of this truly bizarre.
Why does C++ standard specify signed integer be cast to unsigned in binary operations with mixed signedness?
I suppose that you mean converted rather than "cast". A cast is an explicit conversion.
As I'm not the author nor have I encountered documentation about this decision, I cannot promise that my explanation is the truth. However, there is a fairly reasonable potential explanation: Because that's how C works, and C++ was based on C. Unless there was an opportunity to improve upon the rules, there would be no reason to change what works and what programmers have been used to. I don't know if the committee even deliberated changing this.
I know what you may be thinking: "Why does C standard specify signed integer...". Well, I'm also not the author of the C standard, but there is at least a fairly extensive document titled "Rationale for American National Standard for Information Systems - Programming Language - C". As extensive as it is, it unfortunately doesn't cover this question (it does cover a very similar question of how to promote integer types narrower than int, in which regard the standard differs from some of the C implementations that pre-date the standard).
I don't have access to pre-standard K&R documents, but I did find a passage from the book "Expert C Programming: Deep C Secrets" which quotes rules from the pre-standard K&R C (in the context of comparing the rule with the standardised ones):
Section 6.6 Arithmetic Conversions
A great many operators cause conversions and yield result types in a similar way. This pattern will be called the "usual arithmetic conversions."
First, any operands of type char or short are converted to int, and any of type float are converted to double. Then if either operand is double, the other is converted to double and that is the type of the result. Otherwise, if either operand is long, the other is converted to long and that is the type of the result. Otherwise, if either operand is unsigned, the other is converted to unsigned and that is the type of the result. Otherwise, both operands must be int, and that is the type of the result.
So, it appears that this has been the rule since before the standardisation of C and was presumably chosen by the designer himself. Unless someone can find a written rationale, we may never know the answer.
What are cases where casting to unsigned is the correct thing to do?
Here is an extremely simple case:
unsigned u = INT_MAX;
u + 42;
The type of the literal 42 is signed, so with your proposed / designer rule, u + 42 would also be signed. This would be quite surprising and would cause the program shown to have undefined behaviour due to signed integer overflow.
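Under the actual rules the same expression is well-defined, which the following sketch checks. The function names are invented, and it assumes the usual situation where unsigned has one more value bit than int.

```cpp
#include <cassert>
#include <climits>
#include <type_traits>

// The signed literal 42 is converted to unsigned, so the sum is unsigned.
static_assert(std::is_same<decltype(0u + 42), unsigned>::value, "unsigned + int is unsigned");

// Same computation as in the snippet above, wrapped in a function.
unsigned past_int_max() {
    unsigned u = INT_MAX;
    return u + 42;  // unsigned arithmetic: no overflow, no undefined behaviour
}

// Even at the top of the unsigned range the result is defined: it wraps.
unsigned add42(unsigned u) { return u + 42; }

void demo_unsigned_add() {
    assert(past_int_max() == static_cast<unsigned>(INT_MAX) + 42u);
    assert(past_int_max() > static_cast<unsigned>(INT_MAX));
    assert(add42(UINT_MAX) == 41u);
}
```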
Basically, implicit conversion to signed and to unsigned each have their problems.

Why cannot floating-point promotion work for arithmetics as well?

I have read a bit about floating-point promotion. I know that it doesn't apply on binary arithmetic operations, only on e.g. overload resolution. But why?
The C++ standard guarantees that double must be at least as precise as float [basic.fundamental.8] and the floating point promotion is required to keep the value unchanged [conv.fpprom]. Yet this question makes it very clear that it does not happen. Stroustrup, 4th edition has the subject even errata-ed (here, Chapter 10, p. 267).
However, I cannot see any reason why the promotion cannot be done in usual arithmetic conversions [expr.10], even if all prerequisites are met. Is there any?
The latest C++14 working draft can be found here, the final version is purchase-only.
Converting a float to a double costs something, and it's likely more expensive than a short to int conversion (it needs several shifts and bit combining operations). And unlike e.g. short, the float type is considered something on which the processor can operate directly (just like it can on int).
Given the facts above, why should floating-point promotion happen when it's not necessary? That is, if you're adding two floats, why convert them to double, add them, and then convert them back to float?(1)
Note that a floating-point promotion will indeed happen when you're adding mixed arguments (e.g. a float + double), by the very ruling in C++14 [expr] you're referring to.
(10.3) Otherwise, if either operand is double, the other shall be converted to double.
As per [conv.fpprom], this conversion from float to double is carried out by floating point promotion.
(1) Of course, it is perfectly possible this will happen internally if the processor cannot operate on floats directly, and [expr].12 explicitly allows that. But that very paragraph says
the types are not changed thereby.
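The resulting types can be checked directly. This is a small sketch; promotion_is_exact is a name made up for the example.

```cpp
#include <cassert>
#include <type_traits>

// Two floats: no promotion, the result type is float.
static_assert(std::is_same<decltype(1.0f + 2.0f), float>::value, "float + float is float");
// Mixed operands: the float operand is converted to double.
static_assert(std::is_same<decltype(1.0f + 2.0), double>::value, "float + double is double");
// An int operand is converted to float, not the other way round.
static_assert(std::is_same<decltype(1.0f + 2), float>::value, "float + int is float");

// float -> double keeps the value, so converting back gives the same float.
bool promotion_is_exact(float f) {
    double d = f;
    return static_cast<float>(d) == f;
}

void demo_float_arithmetic() {
    assert(promotion_is_exact(0.1f));
    assert(promotion_is_exact(-3.5f));
}
```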
It does!
I don't know what you call "work", but the scope of definition for floating-point promotion and usual arithmetic conversions is different.
usual arithmetic conversions : Applies to binary operators that expect operands of arithmetic or enumeration type.
floating-point promotion : Applies to prvalues of type float.
Some expressions, like a + b, qualify for both, while 1.0f qualifies only as a prvalue.
The standard you linked says (about usual arithmetic conversions)
(10.3) if either operand is double, the other shall be converted to double
...
(10.5) — Otherwise, the integral promotions shall be performed on both operands
It doesn't restrict how the other operand is converted to double, so I would assume that double + float follow the floating-point promotion rule.
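Overload resolution, which the question mentions, is where the promotion visibly matters: float to double is a promotion and ranks better than float to long double, which is an ordinary conversion. A sketch, with Chosen and pick as invented names:

```cpp
#include <cassert>

enum class Chosen { Double, LongDouble };

Chosen pick(double) { return Chosen::Double; }
Chosen pick(long double) { return Chosen::LongDouble; }

void demo_overload_resolution() {
    // float -> double is a promotion, so this call is not ambiguous.
    assert(pick(1.0f) == Chosen::Double);
    // Exact matches are unaffected.
    assert(pick(1.0) == Chosen::Double);
    assert(pick(1.0L) == Chosen::LongDouble);
}
```

Without the promotion rule, pick(1.0f) would have two equally ranked candidates and the call would be ambiguous.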

In C++, what happens when I use static_cast<char> on an integer value outside -128,127 range?

In code compiled on i386 Linux using g++, I have used a static_cast<char>() cast on a value that might exceed the valid range of -128,127 for a char. There were no errors or exceptions and so I used the code in production.
The problem is now I don't know how this code might behave when a value outside this range is thrown at it. There is no problem if data is modified or truncated, I only need to know how this modification behaves on this particular platform.
Also, what would happen if a C-style cast ((char)value) had been used? Would it behave differently?
In your case this would be an explicit type conversion. Or to be more precise, an integral conversion.
The standard says about this(4.7):
If the destination type is signed, the value is unchanged if it can be represented in the destination type (and bit-field width); otherwise, the value is implementation-defined.
So your problem is implementation-defined. On the other hand I have not yet seen a compiler that does not just truncate the larger value to the smaller one. And I have never seen any compiler that uses the rule mentioned above.
So it should be fairly safe to just cast your integer/short to the char.
I don't know the rules for a C cast by heart and I really try to avoid them because it is not easy to say which rule will kick in.
This is dealt with in §4.7 of the standard (integral conversions).
The answer depends on whether char is signed or unsigned in the implementation in question. If it is unsigned, then modulo arithmetic is applied. §4.7/2 of C++11 states: "If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type)." This means that if the input integer is not negative, the normal bit truncation you expect will arise. If it is negative, the same will apply if negative numbers are represented by 2's complement, otherwise the conversion will be bit altering.
If char is signed, §4.7/3 of C++11 applies: "If the destination type is signed, the value is unchanged if it can be represented in the destination type (and bit-field width); otherwise, the value is implementation-defined." So it is up to the documentation for the particular implementation you use. Having said that, on 2's complement systems (i.e. all those in normal use) I have not seen a case where anything other than normal bit truncation occurs for char types: apart from anything else, by virtue of §3.9.1/1 of the C++11 standard all character types (char, unsigned char and signed char) must have the same object representation and alignment.
The effect of a C-style cast, an explicit static_cast and an implicit narrowing conversion is the same.
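A sketch of what this means in practice, assuming 8-bit chars. The function names are invented. The unsigned results follow from the modulo rule; the out-of-range signed result is implementation-defined under C++11 and is shown as what GCC documents (reduction modulo 2^N), which C++20 later made mandatory.

```cpp
#include <cassert>
#include <climits>

static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit chars");

unsigned char to_uchar(int v) { return static_cast<unsigned char>(v); }
unsigned char to_uchar_c_style(int v) { return (unsigned char)v; }
signed char to_schar(int v) { return static_cast<signed char>(v); }

void demo_char_conversions() {
    // Unsigned destination: least unsigned value congruent modulo 2^8.
    assert(to_uchar(300) == 44);   // 300 - 256
    assert(to_uchar(-1) == 255);   // -1 + 256
    // The C-style cast performs the same conversion.
    assert(to_uchar_c_style(300) == to_uchar(300));
    // Signed destination, value representable: unchanged.
    assert(to_schar(100) == 100);
    // Signed destination, value out of range: implementation-defined before C++20.
    // GCC and other 2's complement compilers truncate, giving 200 - 256.
    assert(to_schar(200) == -56);
}
```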
Technically, the language specs for unsigned types agree in imposing a plain base-2 representation. And for unsigned plain base-2 it's pretty obvious what extension and truncation do.
When going to signed, however, the specs are more "tolerant", allowing different kinds of processors to use different ways to represent signed numbers. And since the same number may have different representations on different platforms, it is practically not possible to describe what happens to it when adding or removing bits.
For this reason, the language specification remains more vague by saying that "the value is unchanged if it can be represented in the destination type (and bit-field width); otherwise, the value is implementation-defined"
In other words, compiler manufacturers are required to do the best they can to keep the numeric value. But when this cannot be done, they are free to adopt whatever is most efficient for them.

difference between widening and narrowing in c++?

What is the difference between widening and narrowing in c++ ?
What is meant by casting and what are the types of casting ?
This is a general casting thing, not C++ specific.
A "widening" cast is a cast from one type to another, where the "destination" type has a larger range or precision than the "source" (e.g. int to long, float to double). A "narrowing" cast is the exact opposite (long to int). A narrowing cast introduces the possibility of overflow.
Widening casts between built-in primitives are implicit, meaning you do not have to specify the new type with the cast operator, unless you want the type to be treated as the wider type during a calculation. By default, types are cast to the widest actual type used on the variable's side of a binary expression or assignment (not counting any types on the other side).
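Narrowing casts, on the other hand, must be written explicitly in languages such as C#, where overflow exceptions must also be handled unless the code is marked as not being checked for overflow (the keyword in C# is unchecked; I do not know if it's unique to that language). C++ is more permissive: it performs narrowing conversions implicitly, except in brace initialization, and never raises an overflow exception.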
A widening conversion is when you go from an integer to a double; you are increasing the precision of the value.
A narrowing conversion is the inverse of that, when you go from a double to an integer. You are losing precision.
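In the sense used here, the two directions look like this in C++. It is a sketch assuming double has at least as many value bits as int (true for IEEE 754 double and 32-bit int); widen and narrow are made-up names.

```cpp
#include <cassert>
#include <climits>
#include <limits>

// Assumption: every int value is exactly representable as a double.
static_assert(std::numeric_limits<double>::digits >= std::numeric_limits<int>::digits,
              "double must be able to hold every int exactly");

// int -> double happens implicitly and loses nothing under the assumption above.
double widen(int i) { return i; }

// double -> int drops the fractional part (truncation toward zero). The result
// is undefined if the truncated value does not fit in an int.
int narrow(double d) { return static_cast<int>(d); }

void demo_widen_narrow() {
    assert(widen(7) == 7.0);
    assert(narrow(widen(INT_MAX)) == INT_MAX);
    assert(narrow(3.99) == 3);
    assert(narrow(-3.99) == -3);
}
```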
There are two types of casting: implicit and explicit. The page below will be helpful. Also, the entire website is pretty much the go-to for C/C++ needs.
Tutorial on casting and conversion
Take home exam? :-)
Let's take casting first. Every object in C or C++ has a type, which is nothing more than the name given to two kinds of information: how much memory the thing takes up, and what operations you can do on it.
So
int i;
just means that i refers to some location in memory, usually 32 bits wide, on which you can do +,-,*,/,%,++,-- and some others.
C isn't really picky about it, though:
int * ip;
defines another type, called pointer to integer, which represents an address in memory. It has an additional operation, prefix-*. On many machines, that also happens to be 32 bits wide.
A cast, or typecast, tells the compiler to treat memory identified as one type as if it were another type. Typecasts are written as (typename).
So
(int*) i;
means "treat i as if it were a pointer", and
(int) ip;
means "treat the pointer ip as just an integer number".
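On platforms where pointers are wider than int, the (int) ip form loses bits and C++ compilers may reject it. A hedged sketch of the same idea in C++ uses std::uintptr_t, an optional integer type from <cstdint> that is wide enough to hold a pointer; address_of and pointer_from are names invented for the example.

```cpp
#include <cassert>
#include <cstdint>

// Treat the pointer as just an integer number.
std::uintptr_t address_of(int* p) { return reinterpret_cast<std::uintptr_t>(p); }

// Treat the integer as if it were a pointer again.
int* pointer_from(std::uintptr_t a) { return reinterpret_cast<int*>(a); }

void demo_pointer_integer_round_trip() {
    int x = 5;
    std::uintptr_t raw = address_of(&x);
    // Converting back to the same pointer type gives the original pointer.
    assert(pointer_from(raw) == &x);
    assert(*pointer_from(raw) == 5);
}
```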
Now, in this context, widening and narrowing mean casting from one type to another that has more or fewer bits respectively.