C++ type conversion FAQ - c++

Where I can find an excellently understandable article on C++ type conversion covering all of its types (promotion, implicit/explicit, etc.)?
I've been learning C++ for some time and, for example, virtual functions mechanism seems clearer to me than this topic. My opinion is that it is due to the textbook's authors who are complicating too much (see Stroustroup's book and so on).

(Props to Crazy Eddie for a first answer, but I feel it can be made clearer)
Type Conversion
Why does it happen?
Type conversion can happen for two main reasons. One is because you wrote an explicit expression, such as static_cast<int>(3.5). Another reason is that you used an expression at a place where the compiler needed another type, so it will insert the conversion for you. E.g. 2.5 + 1 will result in an implicit cast from 1 (an integer) to 1.0 (a double).
The explicit forms
There are only a limited number of explicit forms. First off, C++ has 4 named versions: static_cast, dynamic_cast, reinterpret_cast and const_cast. C++ also supports the C-style cast (Type) Expression. Finally, there is a "constructor-style" cast Type(Expression).
The 4 named forms are documented in any good introductory text. The C-style cast expands to a static_cast, const_cast or reinterpret_cast, and the "constructor-style" cast is a shorthand for a static_cast<Type>. However, due to parsing problems, the "constructor-style" cast requires a singe identifier for the name of the type; unsigned int(-5) or const float(5) are not legal.
The implicit forms
It's much harder to enumerate all the contexts in which an implicit conversion can happen. Since C++ is a typesafe OO language, there are many situations in which you have an object A in a context where you'd need a type B. Examples are the built-in operators, calling a function, or catching an exception by value.
The conversion sequence
In all cases, implicit and explicit, the compiler will try to find a conversion sequence. A conversion sequence is a series of steps that gets you from type A to type B. The exact conversion sequence chosen by the compiler depends on the type of cast. A dynamic_cast is used to do a checked Base-to-Derived conversion, so the steps are to check whether Derived inherits from Base, via which intermediate class(es). const_cast can remove both const and volatile. In the case of a static_cast, the possible steps are the most complex. It will do conversion between the built-in arithmetic types; it will convert Base pointers to Derived pointers and vice versa, it will consider class constructors (of the destination type) and class cast operators (of the source type), and it will add const and volatile. Obviously, quite a few of these step are orthogonal: an arithmetic type is never a pointer or class type. Also, the compiler will use each step at most once.
As we noted earlier, some type conversions are explicit and others are implicit. This matters to static_cast because it uses user-defined functions in the conversion sequence. Some of the conversion steps consiered by the compiler can be marked as explicit (In C++03, only constructors can). The compiler will skip (no error) any explicit conversion function for implicit conversion sequences. Of course, if there are no alternatives left, the compiler will still give an error.
The arithmetic conversions
Integer types such as char and short can be converted to "greater" types such as int and long, and smaller floating-point types can similarly be converted into greater types. Signed and unsigned integer types can be converted into each other. Integer and floating-point types can be changed into each other.
Base and Derived conversions
Since C++ is an OO language, there are a number of casts where the relation between Base and Derived matters. Here it is very important to understand the difference between actual objects, pointers, and references (especially if you're coming from .Net or Java). First, the actual objects. They have precisely one type, and you can convert them to any base type (ignoring private base classes for the moment). The conversion creates a new object of base type. We call this "slicing"; the derived parts are sliced off.
Another type of conversion exists when you have pointers to objects. You can always convert a Derived* to a Base*, because inside every Derived object there is a Base subobject. C++ will automatically apply the correct offset of Base with Derived to your pointer. This conversion will give you a new pointer, but not a new object. The new pointer will point to the existing sub-object. Therefore, the cast will never slice off the Derived part of your object.
The conversion the other way is trickier. In general, not every Base* will point to Base sub-object inside a Derived object. Base objects may also exist in other places. Therefore, it is possible that the conversion should fail. C++ gives you two options here. Either you tell the compiler that you're certain that you're pointing to a subobject inside a Derived via a static_cast<Derived*>(baseptr), or you ask the compiler to check with dynamic_cast<Derived*>(baseptr). In the latter case, the result will be nullptr if baseptr doesn't actually point to a Derived object.
For references to Base and Derived, the same applies except for dynamic_cast<Derived&>(baseref) : it will throw std::bad_cast instead of returning a null pointer. (There are no such things as null references).
User-defined conversions
There are two ways to define user conversions: via the source type and via the destination type. The first way involves defining a member operator DestinatonType() const in the source type. Note that it doesn't have an explicit return type (it's always DestinatonType), and that it's const. Conversions should never change the source object. A class may define several types to which it can be converted, simply by adding multiple operators.
The second type of conversion, via the destination type, relies on user-defined constructors. A constructor T::T which can be called with one argument of type U can be used to convert a U object into a T object. It doesn't matter if that constructor has additional default arguments, nor does it matter if the U argument is passed by value or by reference. However, as noted before, if T::T(U) is explicit, then it will not be considered in implicit conversion sequences.
it is possible that multiple conversion sequences between two types are possible, as a result of user-defined conversion sequences. Since these are essentially function calls (to user-defined operators or constructors), the conversion sequence is chosen via overload resolution of the different function calls.

Don't know of one so lets see if it can't be made here...hopefully I get it right.
First off, implicit/explicit:
Explicit "conversion" happens everywhere that you do a cast. More specifically, a static_cast. Other casts either fail to do any conversion or cover a different range of topics/conversions. Implicit conversion happens anywhere that conversion is happening without your specific say-so (no casting). Consider it thusly: Using a cast explicitly states your intent.
Promotion:
Promotion happens when you have two or more types interacting in an expression that are of different size. It is a special case of type "coercion", which I'll go over in a second. Promotion just takes the small type and expands it to the larger type. There is no standard set of sizes for numeric types but generally speaking, char < short < int < long < long long, and, float < double < long double.
Coercion:
Coercion happens any time types in an expression do not match. The compiler will "coerce" a lesser type into a greater type. In some cases, such as converting an integer to a double or an unsigned type into a signed type, information can be lost. Coercion includes promotion, so similar types of different size are resolved in that manner. If promotion is not enough then integral types are converted to floating types and unsigned types are converted to signed types. This happens until all components of an expression are of the same type.
These compiler actions only take place regarding raw, numeric types. Coercion and promotion do not happen to user defined classes. Generally speaking, explicit casting makes no real difference unless you are reversing promotion/coercion rules. It will, however, get rid of compiler warnings that coercion often causes.
User defined types can be converted though. This happens during overload resolution. The compiler will find the various entities that resemble a name you are using and then go through a process to resolve which of the entities should be used. The "identity" conversion is preferred above all; this means that a f(t) will resolve to f(typeof_t) over anything else (see Function with parameter type that has a copy-constructor with non-const ref chosen? for some confusion that can generate). If the identity conversion doesn't work the system then goes through this complex higherarchy of conversion attempts that include (hopefully in the right order) conversion to base type (slicing), user-defined constructors, user-defined conversion functions. There's some funky language about references which will generally be unimportant to you and that I don't fully understand without looking up anyway.
In the case of user type conversion explicit conversion makes a huge difference. The user that defined a type can declare a constructor as "explicit". This means that this constructor will never be considered in such a process as I described above. In order to call an entity in such a way that would use that constructor you must explicitly do so by casting (note that syntax such as std::string("hello") is not, strictly speaking, a call to the constructor but instead a "function-style" cast).
Because the compiler will silently look through constructors and type conversion overloads during name resolution, it is highly recommended that you declare the former as 'explicit' and avoid creating the latter. This is because any time the compiler silently does something there's room for bugs. People can't keep in mind every detail about the entire code tree, not even what's currently in scope (especially adding in koenig lookup), so they can easily forget about some detail that causes their code to do something unintentional due to conversions. Requiring explicit language for conversions makes such accidents much more difficult to make.

For integer types, check the book Secure Coding n C and C++ by Seacord, the chapter about integer overflows.
As for implicit type conversions, you will find the books Effective C++ and More Effective C++ to be very, very useful.
In fact, you shouldn't be a C++ developer without reading these.

Related

in C++ , can I define an implicit conversion to a class without modifying the class?

Recently I have to work with some C libraries in my C++ code. The C library I am using defined a complex number class as follow:
typedef struct sfe_complex_s
{
float real;
float img;
} sfe_complex_t;
Naturally I do not want to work with this C-style data structure in C++, so for convenience I want to define an implicit conversion from this type to std::complex<float>. Is there a way to do so? Or I have to explicitly do the conversion?
Implicit conversion is supposed to mean something. It represents a strong relationship between the source and destination types of that conversion. That one of them is, on some level, designed to be equivalent to another in some degree.
As such, only code which is intimately associated with either the source or destination type can define that relationship. That is, if you don't have control over the source or destination types, then C++ doesn't feel that you are qualified to create an implicit conversion relationship between them.
3rd parties cannot make types implicitly convertible.
An implicit conversion can only take place in a constructor of the class converted to or a conversion operator of the class converted from, which has to be a member function. Since you cannot modify either one, you are out of luck.
It may be appealing to define an own class with implicit constructors from and conversion operators to both std::complex<float> and sfe_complex_t, but this will not give you implicit conversions between these two clases, since user defined conversions cannot be chained. You could a class defined in this way in your code and it would always be converted implicitly if you put it somewhere either of the others are expected, but you should really consider if that is what you want. It would introduce another standard and make your code less readable.
To me, it seems like calling an explicit conversion is the cleanest variant. Depending on the number of functions of your C library you could perhaps for each one provide an overload taking std::complex and explicitly converting it before passing it on.
No, you cannot define an implicit conversion between types that you don't have access to change. Your only option will be to define your own function that explicitly takes a sfe_complex_t as input and returns a std::complex<float> as output.

Does the C++ specification say how types are chosen in the static_cast/const_cast chain to be used in a C-style cast?

This question concerns something I noticed in the C++ spec when I was trying to answer this earlier, intriguing question about C-style casts and type conversions.
The C++ spec talks about C-style casts in §5.4. It says that the cast notation will try the following casts, in this order, until one is found that is valid:
const_cast
static_cast
static_cast followed by const_cast
reinterpret_cast
reinterpret_cast followed by const_cast.
While I have a great intuitive idea of what it means to use a static_cast followed by a const_cast (for example, to convert a const Derived* to a Base* by going through a const_cast<Base*>(static_cast<const Base*>(expr))), I don't see any wording in the spec saying how, specifically, the types used in the static_cast/const_cast series are to be deduced. In the case of simple pointers it's not that hard, but as seen in the linked question the cast might succeed if an extra const is introduced in one place and removed in another.
Are there any rules governing how a compiler is supposed to determine what types to use in the casting chain? If so, where are they? If not, is this a defect in the language, or are there sufficient implicit rules to uniquely determine all the possible casts to try?
If not, is this a defect in the language, or are there sufficient implicit rules to uniquely determine all the possible casts to try?
What about constructing all types that can be cast to the target type using only const_cast, i.e. all "intermediate types"?
Given target type T, if static_cast doesn't work, identify all positions where one could add cv-qualifiers such that the resulting type can be cast back to T by const_cast1. Sketch of the algorithm: take the cv-decomposition ([conv.qual]/1) of T; each cvj can be augmented. If T is a reference, we can augment the referee type's cv-qualification.
Now add const volatile to all these places. Call the resulting type CT. Try static_casting the expression to CT instead. If that works, our cast chain is const_cast<T>(static_cast<CT>(e)).
If this doesn't work, there very probably is no conversion using static_cast followed by const_cast (I haven't delved into the deep corners of overload resolution (well I have, but not for this question)). But we could use brute force to repeatedly remove const/volatiles and check each type if we really wanted. So in theory, there is no ambiguity or underspecification; if there is some cast chain, it can be determined. In practice, the algorithm can be made very simple, because the "most cv-qualified" type we can construct from T is (assuredly) sufficient.

Determining what the compiler generates for a given C-style cast

My understanding of C-style casts is that the compiler runs through all sorts of increasingly complicated/dangerous permutations of C++-style casts until it finds one that works, then silently sticks it in. I'm in the process of working through a codebase and replacing these C-style casts with true, explicit C++ casts.
How can I determine which C++ casts the compiler generates for a given C-style cast?
If architecture and compiler are relevant, I'm using Visual Studio 2010 on Windows 7 (though I'd prefer answers that are cognizant of Linux and GCC as well).
Since you are modifying the code anyway, you might first see if there is a way to refactor the code to not require the cast. If that proves fruitless:
Most of the time, you should use static_cast. The most common use of this is turning a void * into a T *.
If you are trying to drop a const from an object, you need const_cast.
If the cast is truly between incompatible types (like, turning an int into a pointer), you will need reinterpret_cast.
As described C++11 §5.4/4, when the compiler encounters the cast notation, it should be attempting the conversion by following the recipe below until the first one that generates a valid result:
The conversions performed by
— a const_cast (5.2.11),
— a static_cast (5.2.9),
— a static_cast followed by a const_cast,
— a reinterpret_cast (5.2.10), or
— a reinterpret_cast followed by a const_cast,
can be performed using the cast notation of explicit type conversion.
...static_cast has some tweaks, see below...
If a conversion can be interpreted in more than one of the ways listed above, the interpretation that appears first in the list is used, even if a cast resulting from that interpretation is ill-formed. If a conversion can be interpreted in more than one way as a static_cast followed by a const_cast, the conversion is ill-formed.
The standard says that the static_cast that is performed on behalf of the cast notation is slightly tweaked to work even if there is non-public inheritance.
The same semantic restrictions and behaviors apply, with the exception that in performing a static_cast in the following situations the conversion is valid even if the base class is inaccessible:
— a pointer to an object of derived class type or an lvalue or rvalue of derived class type may be explicitly converted to a pointer or reference to an unambiguous base class type, respectively;
— a pointer to member of derived class type may be explicitly converted to a pointer to member of an
unambiguous non-virtual base class type;
— a pointer to an object of an unambiguous non-virtual base class type, a glvalue of an unambiguous
non-virtual base class type, or a pointer to member of an unambiguous non-virtual base class type
may be explicitly converted to a pointer, a reference, or a pointer to member of a derived class type, respectively.

How to restrict implicit conversion of typedef'ed types?

Suppose there're two types:
typedef unsigned short Altitude;
typedef double Time;
To detect some errors like passing time argument in position of altitude to functions at compile time I'd like to prohibit implicit conversion from Altitude to Time and vice-versa.
What I tried first was declaring an operator Altitude(Time) without an implementation, but the compiler said that it must be a member function, so I understood that it's not going to work for a typedefed type.
Next I've tried turning one of these types into a class, but it appeared that the project extensively uses lots of arithmetic including implicit conversions to double, int, bool etc., as well as passes them to and from streams via operator<< and operator>>. So despite this way allowed me to find the errors I was looking for, I didn't even try to make full implementation of the compatible class because it would take a lot of code.
So my question is: is there a more elegant way to prevent implicit conversions between two particular typedefed types, and if yes, then how?
A typedef does nothing more than establish another name for an existing type.
Therefore your question boils down to whether you can disable implicit conversions between unsigned short and double, which is not possible in general.
Two ways exist to deal with this problem:
First, you can make Altitude and Time into their own types (read: define classes instead of typedefs) and ensure that they can be easily converted to and from their underlying numeric types - but not each other.
Second, you can ensure that whatever you do is protected by language constructs, e.g. if you have a function f that should take an Altitude a.k.a. unsigned short, you can overload it with another function f that takes an Time a.k.a. double and causes an error. This provides a better overload match and would thus prevent the implicit conversion in that case.

What's the -complete- list of kinds of automatic type conversions a C++ compiler will do for a function argument?

Given a C++ function f(X x) where x is a variable of type X, and a variable y of type Y, what are all the automatic/implicit conversions the C++ compiler will perform on y so that the statement "f(y);" is legal code (no errors, no warnings)?
For example:
Pass Derived& to function taking Base& - ok
Pass Base& to function Derived& - not ok without a cast
Pass int to function taking long - ok, creates a temporary long
Pass int& to function taking long& - NOT ok, taking reference to temporary
Note how the built-in types have some quirks compared to classes: a Derived can be passed to function taking a Base (although it gets sliced), and an int can be passed to function taking a long, but you cannot pass an int& to a function taking a long&!!
What's the complete list of cases that are always "ok" (don't need to use any cast to do it)?
What it's for: I have a C++ script-binding library that lets you bind your C++ code and it will call C++ functions at runtime based on script expressions. Since expressions are evaluated at runtime, all the legal combinations of source types and function argument types that might need to be used in an expression have to be anticipated ahead of time and precompiled in the library so that they'll be usable at runtime. If I miss a legal combination, some reasonable expressions won't work in runtime expressions; if I accidently generate a combination that isn't legal C++, my library just won't compile.
Edit (narrowing the question):
Thanks, all of your answers are actually pretty helpful. I knew the answer was complicated, but it sounds like I've only seen the tip of the iceberg.
Let me rephrase the question a little then to limit its scope then:
I will let the user specify a list of "BaseClasses" and a list of "UserDefinedConversions". For Bases, I'll generate everything including reference and pointer conversions. But what cases (const/reference/pointer) can I safely do from the UserDefined Conversions list? (The user will give bare types, I will decorate with *, &, const, etc. in the template.)
C++ Standard gives the answer to your question in 13.3.3.1 Implicit conversion sequences, but it too large to post it here. I recommend you to read at least that part of C++ Standard.
Hope this link will help you.
Unfortunately the answer to your question is hugely complex, occupying at least 9 pages in the ISO C++ standard (specifically: ~6 pages in "3 Standard Conversions" and ~3 pages in "13.3.3.1 Implicit Conversion Sequences").
Brief summary: A conversion that does not require a cast is called an "implicit conversion sequence". C++ has "standard conversions", which are conversions between fundamental types (such as char being promoted to int) and things such as array-to-pointer decay; there can be several of these in a row, hence the term "sequences". C++ also permits user-defined conversions, which are defined by conversion functions and converting constructors. The important thing to note is that an implicit conversion sequence can have at most one user-defined conversion, with optionally a sequence of standard conversions on either side -- C++ will never "chain" more than one user-defined conversion together without a cast.
(If anyone would like to flesh this post out with the full details, please go ahead... But for me, that would just be too exhausting, sorry :-/)
Note how the built-in types have some
quirks compared to classes: a Derived
can be passed to function taking a
Base (although it gets sliced), and an
int can be passed to function taking a
long, but you cannot pass an int& to a
function taking a long&!!
That's not a quirk of built-in vs. class types. It's a quirk of inheritance.
If you had classes A and B, and B had a conversion to A (either because A has a constructor taking B, or because B has a conversion operator to A), then they'd behave just like int and long in this respect - conversion can occur where a function takes a value, but not where it takes a non-const reference. In both cases the problem is that there is no object to which the necessary non-const reference can be taken: a long& can't refer to an int, and an A& can't refer to a B, and no non-const reference can refer to a temporary.
The reason the base/derived example doesn't encounter this problem because a non-const Base reference can refer to a Derived object. The fact that the types are user-defined is a necessary but not a sufficient condition for the reference to be legal. Convertible user-defined classes where there is no inheritance behave just like built-ins.
This comment is way too long for comments, so I've used an answer. It doesn't actually answer your question, though, other than to distinguish between:
"Conversions" where a reference to a derived class is passed to a function taking a reference to a base class.
Conversions where a user-defined or built-in conversion actually creates an object, such as from int to long.