"Ambiguous conversion sequence" - what is the purpose of this concept? - c++

In N4659 16.3.3.1 Implicit conversion sequences says
10 If several different sequences of conversions exist that each convert the argument to the parameter type, the implicit conversion sequence associated with the parameter is defined to be the unique conversion sequence designated the ambiguous conversion sequence. For the purpose of ranking implicit conversion sequences as described in 16.3.3.2, the ambiguous conversion sequence is treated as a user-defined conversion sequence that is indistinguishable from any other user-defined conversion sequence [Note: This rule prevents a function from becoming non-viable because of an ambiguous conversion sequence for one of its parameters.] If a function that uses the ambiguous conversion sequence is selected as the best viable function, the call will be ill-formed because the conversion of one of the arguments in the call is ambiguous.
(The corresponding section of the current draft is 12.3.3.1)
What is the intended purpose of this rule and the concept of ambiguous conversion sequence it introduces?
The note supplied in the text states that the purpose of this rule is "to prevent a function from becoming non-viable because of an ambiguous conversion sequence for one of its parameters". Um... What does this actually refer to? The concept of a viable function is defined in the preceding sections of the document. It does not depend on ambiguity of conversions at all (conversions for each argument must exist, but they don't have to be unambiguous). And there seems to be no provision for a viable function to somehow "become non-viable" later (neither because of some ambiguity nor anything else). Viable functions are enumerated, they compete against each other for being "the best" in accordance with certain rules and if there's a single "winner", the resolution is successful. At no point in this process a viable function may (or needs) to turn into a non-viable one.
The example provided within the aforementioned paragraph is not very enlightening (i.e. it is not clear what role the above rule plays in that example).
The question originally popped up in connection with this simple example
struct S
{
operator int() const { return 0; };
operator long() const { return 0; };
};
void foo(int) {}
int main()
{
S s;
foo(s);
}
Let's just mechanically apply the above rule here. foo is a viable function. There are two implicit conversion sequences from argument type S to parameter type int: S -> int and S -> long -> int. This means that per the above rule we have to "pack" them into a single ambiguous conversion sequence. Then we conclude that foo is the best viable function. Then we discover that it uses our ambiguous conversion sequence. Consequently, per the above rule the code is ill-formed.
This seems to make no sense. The natural expectation here is that S -> int conversion should be chosen, since it is ranked higher than S -> long -> int conversion. All compilers I know follow that "natural" overload resolution.
So, what am I misunderstanding?

To be viable, there must be an implicit conversion sequence.
The standard could have permitted multiple implicit conversion sequences, but that might make the wording for determining which overload to select more complex.
So the standard ends up defining one and exactly one implicit conversion sequence for each argument to each parameter. In the case where there is ambiguity, the one it uses is the ambiguous conversion sequence.
Once this is done, it no longer has to deal with the possibility of multiple conversion sequences of one argument to one parameter.
Imagine if we where writing this in C++. We might have a few types:
namespace conversion_sequences {
struct standard;
struct user_defined;
struct ellipsis;
}
where each has plenty of stuff in them (skipped here).
As each conversion sequence is one of the above, we define:
using any_kind = std::variant< standard, user_defined, ellipsis >;
Now, we run into the case of there being more than one conversion sequence for a given argument and parameter. We have two choices at this point.
We could pass around a using any_kinds = std::vector<any_kind> for a given argument,parameter pair, and ensure all of the logic that handles picking a conversion sequence deals with this vector...
Or we could note that the handling of a 1 or more entry vector never looks at the elements in the vector, and it is treated exactly like a kind of user_defined conversion sequence, until the very end at which point we generate an error.
Storing that extra state, and having the extra logic, is a pain. We know we don't need that state, and we don't need code to handle the vector. So we just define a sub-type of conversion_sequence::user_defined, and the code after it finds the preferred overload it checks if the resulting overload chosen should generate errors.
While the C++ standard is not (always) implemented in C++ and the wording doesn't have to have a 1:1 relationship with its implementation, writing a robust standard document is a kind of coding, and some of the same concerns apply.

Related

Type conversion operators related to Compare for std::equal_range()

This question is a follow-up on
Invalid operand compiler error involving std::equal_range
In order to use the overloaded version of std::equal_range() taking the address of the binary comparison predicate as the fourth argument, it is also required to provide conversion operators that enable conversion between the involved types according to
http://en.cppreference.com/w/cpp/algorithm/equal_range
To do this type conversion, within the context of the problem described in the original question (see link at the beginning), conversion operators need to be provided that could convert between V const& and std::pair<K const&, V const&>, where K and V are the template arguments. My understanding is that a conversion operator can only be defined within the scope of a class/template class and the only possibility is to convert from the class/template class to a required type.
In the light of this restriction, is there a way that the conversion operators could be defined to address the prerequisite for invoking the overloaded version of std::equal_range(), as intended within the scope of the discussion presented in the original question (see link at the beginning)?

Incompatible operand types when using ternary conditional operator

This code:
bool contains = std::find(indexes.begin(), indexes.end(), i) != indexes.end();
CardAbility* cardAbility = contains ? new CardAbilityBurn(i) : new CardAbilityEmpty;
gives me the following error:
Incompatible operand types CardAbilityBurn and CardAbilityEmpty
However if I write the code like this:
if (contains)
{
cardAbility = new CardAbilityBurn(i);
}
else
{
cardAbility = new CardAbilityEmpty;
}
then the compiler doesn't mind. Why so? I want to use ternary conditional operator because it is just one line. What's wrong there?
I need to note (I think you might need this information) that CardAbilityEmpty and CardAbilityBurn both derive from CardAbility so they are so to say brothers.
Thanks
C++'s type system determines expressions' types from the inside out[1]. That means that the conditional expression's type is determined before the assignment to CardAbility* is made, and the compiler has to choose with only CardAbilityBurn* and CardAbilityEmpty*.
As C++ features multiple inheritance and some more possible conversion paths, since none of the types is a superclass of the other the compilation stops there.
To compile successfully, you need to provide the missing part : cast one or both operands to the base class type, so the conditional expression as a whole can take that type.
auto* cardAbility = contains
? static_cast<CardAbility*>(new CardAbilityBurn(i))
: static_cast<CardAbility*>(new CardAbilityEmpty );
(Note the use of auto, since you already provided the destination type in the right-side expression.)
It is however a bit convoluted, so in the end the if-else structure is better-suited in this case.
[1] There is one exception : overloaded function names have no definitive type until you convert them (implicitly or explicitly) to one of their versions.
There are several cases described for Microsoft compilers, how to handle operand types.
If both operands are of the same type, the result is of that type.
If both operands are of arithmetic or enumeration types, the usual
arithmetic conversions (covered in Arithmetic Conversions) are performed to
convert them to a common type.
If both operands are of pointer types or if one is a pointer type and the
other is a constant expression that evaluates to 0, pointer conversions are
performed to convert them to a common type.
If both operands are of reference types, reference conversions are
performed to convert them to a common type.
If both operands are of type void, the common type is type void.
If both operands are of the same user-defined type, the common type is
that type.
If the operands have different types and at least one of the operands
has user-defined type then the language rules are used to
determine the common type. (See warning below.)
And then there is a caution:
If the types of the second and third operands are not identical, then
complex type conversion rules, as specified in the C++ Standard, are
invoked. These conversions may lead to unexpected behavior including
construction and destruction of temporary objects. For this reason, we
strongly advise you to either (1) avoid using user-defined types as
operands with the conditional operator or (2) if you do use
user-defined types, then explicitly cast each operand to a common
type.
Probably, this is the reason, Apple deactivated this implicit conversion in LLVM.
So, if/else seems to be more appropriate in your case.

Why user-defined conversions are limited?

In C++, only one user-defined conversion is allowed in implicit conversion sequence. Are there any practical reasons (from language user point of view) for that limit?
Allowing only one user defined conversion limits the search scope when attempting to match the source and destination types. Only those two types need to be checked to determine whether they are convertible (non-user defined conversions aside).
Not having that limit might cause ambiguity in some cases and it might even require testing infinite conversion paths in others. Since the standard cannot require to do something unless it is impossible to do in your particular case, the simple rule is only one conversion.
Consider as a convoluted example:
template <typename T>
struct A {
operator A<A<T>> () const; // A<int> --> A<A<int>>
};
int x = A<int>();
Now, there can potentially be a specialization for A<A<A<....A<int>...>>> that might have a conversion to int, so the compiler would have to recursively instantiate infinite versions of A and check whether each one of them is convertible to int.
Conversely, with two types that are convertible from-to any other type, it would cause other issues:
struct A {
template <typename T>
A(T);
};
struct B {
template <typename T>
operator T() const;
};
struct C {
C(A);
operator B() const;
};
If multiple user-defined conversions where allowed, any type T can be converted to any type U by means of the conversion path: T -> A -> C -> B -> U. Just managing the search space would be a hard task for the compiler, and it would most probably cause confusion on the users.
IMHO It is a design choice based on the fact that constructors are not explicit by default. Which is a practical reason for setting a limit, to disallow the following expression to be valid.
objectn o = 5;
5-> object1(5)->object2(object1)->object3(object2)->...->objectn(objectn-1)
Each arrow above is an implicit conversion. One seems to be a reasonable choice.If more are allowed, you have several implicit conversion paths between an object0 and objectn . Each one leading to possibly different objectn o. Which one to choose??
If the allowed number were infinite, you could quite quickly end up with an attempted circular conversion sequence.
It's much easier to say "if you need more than one conversion, you have a design flaw" and use that rationale to emplace a bug-saving limit within the language.
In any case, implicit conversions are generally considered to be best when used sporadically. One conversion is very useful for the sort of heavy operator overloading used by the standard library, and similar code; beyond that, there's not much excuse to ever use implicit conversions. Certainly a language would go in the direction of seeking fewer, not more (e.g. C++11 adding explicit in this context, to help you enforce no implicit conversions at all).
If it allows more than one, then how many in the sequence? If infinite, then wouldn't it make too complicated for a large number of types in a large project? For example, everytime you add implicit conversion, you have to think really really hard as to what else it converts into and then what else that converts into? and the chain goes on. Very dangerous situation, IMHO.
Implicit conversions (what is currently allowed by the language) are considered bad, which is why C++11 has added contextual conversion which is implemented using explicit keyword.

C++ type conversion FAQ

Where I can find an excellently understandable article on C++ type conversion covering all of its types (promotion, implicit/explicit, etc.)?
I've been learning C++ for some time and, for example, virtual functions mechanism seems clearer to me than this topic. My opinion is that it is due to the textbook's authors who are complicating too much (see Stroustroup's book and so on).
(Props to Crazy Eddie for a first answer, but I feel it can be made clearer)
Type Conversion
Why does it happen?
Type conversion can happen for two main reasons. One is because you wrote an explicit expression, such as static_cast<int>(3.5). Another reason is that you used an expression at a place where the compiler needed another type, so it will insert the conversion for you. E.g. 2.5 + 1 will result in an implicit cast from 1 (an integer) to 1.0 (a double).
The explicit forms
There are only a limited number of explicit forms. First off, C++ has 4 named versions: static_cast, dynamic_cast, reinterpret_cast and const_cast. C++ also supports the C-style cast (Type) Expression. Finally, there is a "constructor-style" cast Type(Expression).
The 4 named forms are documented in any good introductory text. The C-style cast expands to a static_cast, const_cast or reinterpret_cast, and the "constructor-style" cast is a shorthand for a static_cast<Type>. However, due to parsing problems, the "constructor-style" cast requires a singe identifier for the name of the type; unsigned int(-5) or const float(5) are not legal.
The implicit forms
It's much harder to enumerate all the contexts in which an implicit conversion can happen. Since C++ is a typesafe OO language, there are many situations in which you have an object A in a context where you'd need a type B. Examples are the built-in operators, calling a function, or catching an exception by value.
The conversion sequence
In all cases, implicit and explicit, the compiler will try to find a conversion sequence. A conversion sequence is a series of steps that gets you from type A to type B. The exact conversion sequence chosen by the compiler depends on the type of cast. A dynamic_cast is used to do a checked Base-to-Derived conversion, so the steps are to check whether Derived inherits from Base, via which intermediate class(es). const_cast can remove both const and volatile. In the case of a static_cast, the possible steps are the most complex. It will do conversion between the built-in arithmetic types; it will convert Base pointers to Derived pointers and vice versa, it will consider class constructors (of the destination type) and class cast operators (of the source type), and it will add const and volatile. Obviously, quite a few of these step are orthogonal: an arithmetic type is never a pointer or class type. Also, the compiler will use each step at most once.
As we noted earlier, some type conversions are explicit and others are implicit. This matters to static_cast because it uses user-defined functions in the conversion sequence. Some of the conversion steps consiered by the compiler can be marked as explicit (In C++03, only constructors can). The compiler will skip (no error) any explicit conversion function for implicit conversion sequences. Of course, if there are no alternatives left, the compiler will still give an error.
The arithmetic conversions
Integer types such as char and short can be converted to "greater" types such as int and long, and smaller floating-point types can similarly be converted into greater types. Signed and unsigned integer types can be converted into each other. Integer and floating-point types can be changed into each other.
Base and Derived conversions
Since C++ is an OO language, there are a number of casts where the relation between Base and Derived matters. Here it is very important to understand the difference between actual objects, pointers, and references (especially if you're coming from .Net or Java). First, the actual objects. They have precisely one type, and you can convert them to any base type (ignoring private base classes for the moment). The conversion creates a new object of base type. We call this "slicing"; the derived parts are sliced off.
Another type of conversion exists when you have pointers to objects. You can always convert a Derived* to a Base*, because inside every Derived object there is a Base subobject. C++ will automatically apply the correct offset of Base with Derived to your pointer. This conversion will give you a new pointer, but not a new object. The new pointer will point to the existing sub-object. Therefore, the cast will never slice off the Derived part of your object.
The conversion the other way is trickier. In general, not every Base* will point to Base sub-object inside a Derived object. Base objects may also exist in other places. Therefore, it is possible that the conversion should fail. C++ gives you two options here. Either you tell the compiler that you're certain that you're pointing to a subobject inside a Derived via a static_cast<Derived*>(baseptr), or you ask the compiler to check with dynamic_cast<Derived*>(baseptr). In the latter case, the result will be nullptr if baseptr doesn't actually point to a Derived object.
For references to Base and Derived, the same applies except for dynamic_cast<Derived&>(baseref) : it will throw std::bad_cast instead of returning a null pointer. (There are no such things as null references).
User-defined conversions
There are two ways to define user conversions: via the source type and via the destination type. The first way involves defining a member operator DestinatonType() const in the source type. Note that it doesn't have an explicit return type (it's always DestinatonType), and that it's const. Conversions should never change the source object. A class may define several types to which it can be converted, simply by adding multiple operators.
The second type of conversion, via the destination type, relies on user-defined constructors. A constructor T::T which can be called with one argument of type U can be used to convert a U object into a T object. It doesn't matter if that constructor has additional default arguments, nor does it matter if the U argument is passed by value or by reference. However, as noted before, if T::T(U) is explicit, then it will not be considered in implicit conversion sequences.
it is possible that multiple conversion sequences between two types are possible, as a result of user-defined conversion sequences. Since these are essentially function calls (to user-defined operators or constructors), the conversion sequence is chosen via overload resolution of the different function calls.
Don't know of one so lets see if it can't be made here...hopefully I get it right.
First off, implicit/explicit:
Explicit "conversion" happens everywhere that you do a cast. More specifically, a static_cast. Other casts either fail to do any conversion or cover a different range of topics/conversions. Implicit conversion happens anywhere that conversion is happening without your specific say-so (no casting). Consider it thusly: Using a cast explicitly states your intent.
Promotion:
Promotion happens when you have two or more types interacting in an expression that are of different size. It is a special case of type "coercion", which I'll go over in a second. Promotion just takes the small type and expands it to the larger type. There is no standard set of sizes for numeric types but generally speaking, char < short < int < long < long long, and, float < double < long double.
Coercion:
Coercion happens any time types in an expression do not match. The compiler will "coerce" a lesser type into a greater type. In some cases, such as converting an integer to a double or an unsigned type into a signed type, information can be lost. Coercion includes promotion, so similar types of different size are resolved in that manner. If promotion is not enough then integral types are converted to floating types and unsigned types are converted to signed types. This happens until all components of an expression are of the same type.
These compiler actions only take place regarding raw, numeric types. Coercion and promotion do not happen to user defined classes. Generally speaking, explicit casting makes no real difference unless you are reversing promotion/coercion rules. It will, however, get rid of compiler warnings that coercion often causes.
User defined types can be converted though. This happens during overload resolution. The compiler will find the various entities that resemble a name you are using and then go through a process to resolve which of the entities should be used. The "identity" conversion is preferred above all; this means that a f(t) will resolve to f(typeof_t) over anything else (see Function with parameter type that has a copy-constructor with non-const ref chosen? for some confusion that can generate). If the identity conversion doesn't work the system then goes through this complex higherarchy of conversion attempts that include (hopefully in the right order) conversion to base type (slicing), user-defined constructors, user-defined conversion functions. There's some funky language about references which will generally be unimportant to you and that I don't fully understand without looking up anyway.
In the case of user type conversion explicit conversion makes a huge difference. The user that defined a type can declare a constructor as "explicit". This means that this constructor will never be considered in such a process as I described above. In order to call an entity in such a way that would use that constructor you must explicitly do so by casting (note that syntax such as std::string("hello") is not, strictly speaking, a call to the constructor but instead a "function-style" cast).
Because the compiler will silently look through constructors and type conversion overloads during name resolution, it is highly recommended that you declare the former as 'explicit' and avoid creating the latter. This is because any time the compiler silently does something there's room for bugs. People can't keep in mind every detail about the entire code tree, not even what's currently in scope (especially adding in koenig lookup), so they can easily forget about some detail that causes their code to do something unintentional due to conversions. Requiring explicit language for conversions makes such accidents much more difficult to make.
For integer types, check the book Secure Coding n C and C++ by Seacord, the chapter about integer overflows.
As for implicit type conversions, you will find the books Effective C++ and More Effective C++ to be very, very useful.
In fact, you shouldn't be a C++ developer without reading these.

What's the -complete- list of kinds of automatic type conversions a C++ compiler will do for a function argument?

Given a C++ function f(X x) where x is a variable of type X, and a variable y of type Y, what are all the automatic/implicit conversions the C++ compiler will perform on y so that the statement "f(y);" is legal code (no errors, no warnings)?
For example:
Pass Derived& to function taking Base& - ok
Pass Base& to function Derived& - not ok without a cast
Pass int to function taking long - ok, creates a temporary long
Pass int& to function taking long& - NOT ok, taking reference to temporary
Note how the built-in types have some quirks compared to classes: a Derived can be passed to function taking a Base (although it gets sliced), and an int can be passed to function taking a long, but you cannot pass an int& to a function taking a long&!!
What's the complete list of cases that are always "ok" (don't need to use any cast to do it)?
What it's for: I have a C++ script-binding library that lets you bind your C++ code and it will call C++ functions at runtime based on script expressions. Since expressions are evaluated at runtime, all the legal combinations of source types and function argument types that might need to be used in an expression have to be anticipated ahead of time and precompiled in the library so that they'll be usable at runtime. If I miss a legal combination, some reasonable expressions won't work in runtime expressions; if I accidently generate a combination that isn't legal C++, my library just won't compile.
Edit (narrowing the question):
Thanks, all of your answers are actually pretty helpful. I knew the answer was complicated, but it sounds like I've only seen the tip of the iceberg.
Let me rephrase the question a little then to limit its scope then:
I will let the user specify a list of "BaseClasses" and a list of "UserDefinedConversions". For Bases, I'll generate everything including reference and pointer conversions. But what cases (const/reference/pointer) can I safely do from the UserDefined Conversions list? (The user will give bare types, I will decorate with *, &, const, etc. in the template.)
C++ Standard gives the answer to your question in 13.3.3.1 Implicit conversion sequences, but it too large to post it here. I recommend you to read at least that part of C++ Standard.
Hope this link will help you.
Unfortunately the answer to your question is hugely complex, occupying at least 9 pages in the ISO C++ standard (specifically: ~6 pages in "3 Standard Conversions" and ~3 pages in "13.3.3.1 Implicit Conversion Sequences").
Brief summary: A conversion that does not require a cast is called an "implicit conversion sequence". C++ has "standard conversions", which are conversions between fundamental types (such as char being promoted to int) and things such as array-to-pointer decay; there can be several of these in a row, hence the term "sequences". C++ also permits user-defined conversions, which are defined by conversion functions and converting constructors. The important thing to note is that an implicit conversion sequence can have at most one user-defined conversion, with optionally a sequence of standard conversions on either side -- C++ will never "chain" more than one user-defined conversion together without a cast.
(If anyone would like to flesh this post out with the full details, please go ahead... But for me, that would just be too exhausting, sorry :-/)
Note how the built-in types have some
quirks compared to classes: a Derived
can be passed to function taking a
Base (although it gets sliced), and an
int can be passed to function taking a
long, but you cannot pass an int& to a
function taking a long&!!
That's not a quirk of built-in vs. class types. It's a quirk of inheritance.
If you had classes A and B, and B had a conversion to A (either because A has a constructor taking B, or because B has a conversion operator to A), then they'd behave just like int and long in this respect - conversion can occur where a function takes a value, but not where it takes a non-const reference. In both cases the problem is that there is no object to which the necessary non-const reference can be taken: a long& can't refer to an int, and an A& can't refer to a B, and no non-const reference can refer to a temporary.
The reason the base/derived example doesn't encounter this problem because a non-const Base reference can refer to a Derived object. The fact that the types are user-defined is a necessary but not a sufficient condition for the reference to be legal. Convertible user-defined classes where there is no inheritance behave just like built-ins.
This comment is way too long for comments, so I've used an answer. It doesn't actually answer your question, though, other than to distinguish between:
"Conversions" where a reference to a derived class is passed to a function taking a reference to a base class.
Conversions where a user-defined or built-in conversion actually creates an object, such as from int to long.