Is it valid to convert an enum to a different enum via an int conversion, as illustrated below?
It looks like gcc for x64 has no problem with it, but can the same be expected of other compilers and platforms?
What happens when a equals A_third, which has no equivalent in enum_B?
enum enum_A {
A_first = 0,
A_second,
A_third
};
enum enum_B {
B_first = 0,
B_second
};
enum_A a = A_first;
enum_B b;
b = enum_B(int(a));
You have to be careful when doing this, due to some edge cases:
From the C++11 standard (§7.2,6):
For an enumeration whose underlying type is not fixed, the underlying
type is an integral type that can represent all the enumerator values
defined in the enumeration. If no integral type can represent all the
enumerator values, the enumeration is ill-formed. It is
implementation-defined which integral type is used as the underlying
type except that the underlying type shall not be larger than int
unless the value of an enumerator cannot fit in an int or unsigned
int.
This means that an enum's underlying type may be larger than int, so the conversion from enum to int may not preserve the value (for an out-of-range value the result is implementation-defined before C++20, not undefined).
Subject to the above, it is possible to convert an int to an enum that results in an enumerator value that the enum does not specify explicitly. Informally, you can think of an enum as being an integral type with a few values explicitly defined with labels.
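One way to sidestep both edge cases is to avoid the cast through int altogether. The sketch below is only an illustration: to_enum_B and to_integer are hypothetical helper names, not anything from the standard library. It maps the enumerators explicitly and, where a number is really needed, converts through the enum's own underlying type. Note that for an enum without a fixed underlying type, casting a value outside the enum's range gives an unspecified result in C++11/14, and as far as I know it is undefined behaviour since C++17, which is what b = enum_B(int(A_third)) would run into.

```cpp
#include <cassert>
#include <type_traits>

enum enum_A { A_first = 0, A_second, A_third };
enum enum_B { B_first = 0, B_second };

// Hypothetical helper: maps enum_A to enum_B explicitly.
// Returns false (leaving `out` untouched) when there is no counterpart.
bool to_enum_B(enum_A a, enum_B& out)
{
    switch (a) {
    case A_first:  out = B_first;  return true;
    case A_second: out = B_second; return true;
    default:       return false;   // A_third, or any unnamed value
    }
}

// Hypothetical helper: converts through the enum's own underlying type,
// which can hold every enumerator even when int cannot.
template <typename E>
typename std::underlying_type<E>::type to_integer(E e)
{
    return static_cast<typename std::underlying_type<E>::type>(e);
}

void example_usage()
{
    enum_B b = B_first;
    assert(to_enum_B(A_second, b) && b == B_second);
    assert(!to_enum_B(A_third, b));    // no equivalent: reported, not cast
    assert(to_integer(A_third) == 2);
}
```

The explicit switch costs a few lines, but it keeps working if someone later reorders or renumbers one of the two enums.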
I was going through the C++ standard n4713.pdf. Consider below code:
#include <iostream>
#include <type_traits>
enum UEn
{
EN_0,
EN_1,
EN_L = 0x7FFFFFFFFFFFFFFF // EN_L has type "long int"
}; // UEn has underlying type "unsigned long int"
int main()
{
long lng = 0x7FFFFFFFFFFFFFFF;
std::cout << std::boolalpha;
std::cout << "typeof(unsigned long == UEn):" << std::is_same<unsigned long, std::underlying_type<UEn>::type>::value << std::endl; // Outputs "true"
std::cout << "sizeof(EN_L):" << sizeof(EN_L) << std::endl;
std::cout << "sizeof(unsigned):" << sizeof(unsigned) << std::endl;
std::cout << "sizeof(unsigned long):" << sizeof(unsigned long) << std::endl;
std::cout << "sizeof(unsigned long long):" << sizeof(unsigned long long) << std::endl;
lng = EN_L + 1; // Invokes UB as EN_L is 0x7FFFFFFFFFFFFFFF and has type "long int"
return 0;
}
The above code outputs (tested on g++-8.1, Clang):
typeof(unsigned long == UEn):true
sizeof(EN_L):8
sizeof(unsigned):4
sizeof(unsigned long):8
sizeof(unsigned long long):8
As per Section 10.2p5 (10.2 Enumeration declarations):
Following the closing brace of an enum-specifier, each enumerator has
the type of its enumeration...If the underlying type is not fixed, the
type of each enumerator prior to the closing brace is determined as
follows:
If an initializer is specified for an enumerator, the
constant-expression shall be an integral constant expression (8.6). If
the expression has unscoped enumeration type, the enumerator has the
underlying type of that enumeration type, otherwise it has the same
type as the expression.
If no initializer is specified for the first
enumerator, its type is an unspecified signed integral type.
Otherwise the type of the enumerator is the same as that of the
preceding enumerator unless the incremented value is not representable
in that type, in which case the type is an unspecified integral type
sufficient to contain the incremented value. If no such type exists,
the program is ill-formed.
Further, section 10.2p7 states:
For an enumeration whose underlying type is not fixed, the underlying
type is an integral type that can represent all the enumerator values
defined in the enumeration. If no integral type can represent all the
enumerator values, the enumeration is ill-formed. It is
implementation-defined which integral type is used as the underlying
type except that the underlying type shall not be larger than int
unless the value of an enumerator cannot fit in an int or unsigned
int.
Thus I have following questions:
Why is the underlying type of enum UEn an unsigned long when 0x7FFFFFFFFFFFFFFF is an integer constant of type long int, and thus the type of EN_L is also long int? Is this a compiler bug or well-defined behaviour?
When the standard says each enumerator has the type of its enumeration, shouldn't it imply that the integral types of enumerator and enumeration should also match? What could be the reason in having these two different from each other?
The underlying type is implementation-defined. It only has to be able to represent every enumerator, and it can't be larger than int unless required. There is no requirement on signedness (aside from the fact that the base type has to be able to represent every enumerator), per dcl.enum.7, as you already found. This limits the back-propagation of enumerators' types more than you appear to assume. Notably, it doesn't say anywhere that the base type of the enum has to be the type of any of the enumerators' initializers.
Clang prefers unsigned integers as enum bases over signed integers; that's all there is to it. Importantly, the type of the enum does not have to match any specific enumerator's type: it only has to be able to represent every enumerator. This is fairly normal and well-understood in other contexts. For instance, if you had EN_1 = 1, it wouldn't surprise you that the enum's base type isn't int or unsigned int, even though 1 is an int.
You are also correct in saying that the type of 0x7fffffffffffffff is long. Clang agrees with you; however, it implicitly casts the constant to unsigned long:
TranslationUnitDecl
`-EnumDecl <line:1:1, line:5:1> line:1:6 Foo
  |-EnumConstantDecl <line:2:5> col:5 Frob 'Foo'
  |-EnumConstantDecl <line:3:5> col:5 Bar 'Foo'
  `-EnumConstantDecl <line:4:5, col:11> col:5 Baz 'Foo'
    `-ImplicitCastExpr <col:11> 'unsigned long' <IntegralCast>
      `-IntegerLiteral <col:11> 'long' 576460752303423487
This is allowed, because as we said before, the enumeration's base type doesn't need to be the verbatim type of any enumerator.
When the standard says that each enumerator has the type of the enumeration, it means that the type of EN_1 is enum UEn after the enum's closing brace. Note the "after the closing brace" and "prior to the closing brace" mentions.
Prior to the closing brace, if the enum has no fixed underlying type, the type of each enumerator is the type of its initializing expression, but this is only temporary. This is what allows you, for instance, to write EN_2 = EN_1 + 1 without casting EN_1, even in the scope of an enum class. This is no longer true after the closing brace. You can trick the compiler into showing you by inspecting error messages or by looking at disassembly:
template<typename T>
T tell_me(const T&& value);
enum Foo {
Baz = 0x7ffffffffffffff,
Frob = tell_me(Baz)
// non-constexpr function 'tell_me<long>' cannot be used in a constant expression
};
Notice that in this case T was inferred to be long, but after the closing brace, it's inferred to be Foo:
template<typename T>
T tell_me(const T&& value);
enum Foo {
Baz = 0x7ffffffffffffff
};
int main() {
tell_me(Baz);
// call Foo tell_me<Foo>(Foo const&&)
}
If you want your enum type to be signed with Clang, you need to specify it using the : base_type syntax, or you need to have a negative enumerator.
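To make the last point concrete, here is a small compile-time sketch. The enum names are made up for illustration; only std::underlying_type, std::is_same and std::is_signed come from the standard library. It asserts just what the standard guarantees and deliberately asserts nothing about the signedness of the unfixed enum without a negative enumerator, since that choice is left to the implementation.

```cpp
#include <type_traits>

// Underlying type fixed with the ": base_type" syntax: no choice is left
// to the compiler.
enum FixedSigned : long { FS_0, FS_1, FS_L = 0x7FFFFFFF };
static_assert(
    std::is_same<std::underlying_type<FixedSigned>::type, long>::value,
    "a fixed underlying type is exactly the one that was written");

// Not fixed, but the negative enumerator can only be represented by a
// signed type, so the underlying type has to be signed.
enum WithNegative { WN_neg = -1, WN_L = 0x7FFFFFFFFFFFFFFF };
static_assert(
    std::is_signed<std::underlying_type<WithNegative>::type>::value,
    "a negative enumerator forces a signed underlying type");

// Not fixed and no negative enumerator: signed or unsigned is up to the
// implementation, so this constant merely reports what was chosen.
enum Unfixed { UF_0, UF_L = 0x7FFFFFFFFFFFFFFF };
constexpr bool unfixed_is_signed =
    std::is_signed<std::underlying_type<Unfixed>::type>::value;
```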
I believe the answer for this (admittedly unintuitive) warning is in 7.6 Integral promotions [conv.prom]:
A prvalue of an unscoped enumeration type whose underlying type is not
fixed (10.2) can be converted to a prvalue of the first of the
following types that can represent all the values of the enumeration
(i.e., the values in the range bmin to bmax as described in 10.2):
int, unsigned int, long int, unsigned long int, long long int, or
unsigned long long int.
I.e., if your underlying type is not fixed, and you use an enumeration member in an expression, it doesn't necessarily convert to the enumeration's underlying type. It instead converts to the first type in that list in which all members fit.
Don't ask me why, the rule seems nuts to me.
This section goes on to say:
A prvalue of an unscoped enumeration type whose underlying type is
fixed (10.2) can be converted to a prvalue of its underlying type.
I.e. if you fix the underlying type with unsigned long:
enum UEn : unsigned long
...
then the warning goes away.
Another way to get rid of the warning (and leave the underlying type not fixed) is to add a member which requires unsigned long storage:
EN_2 = 0x8000000000000000
Then again, the warning goes away.
Good question. I learned a lot in answering it.
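If you want to see which type an enumerator promotes to without provoking a warning, unary plus applies the integral promotion and decltype reports the result. The following is a sketch under the assumption that int is narrower than 64 bits; the enum names other than UEn are invented for the illustration.

```cpp
#include <type_traits>

enum Small    { S_0, S_1 };                                // fits in int
enum UEn      { EN_0, EN_1, EN_L = 0x7FFFFFFFFFFFFFFF };   // as in the question
enum UEnWide  { W_0, W_L = 0x7FFFFFFFFFFFFFFF, W_2 = 0x8000000000000000 };
enum UEnFixed : unsigned long { F_0, F_1 };

// Unfixed, small values: int is the first type in the list that fits.
static_assert(std::is_same<decltype(+S_0), int>::value,
              "small unfixed enums promote to int");

// Unfixed, largest value 2^63 - 1: the first type that fits is a signed
// 64-bit type (long or long long depending on the platform), which is why
// EN_L + 1 is a signed overflow.
static_assert(std::is_signed<decltype(+EN_L)>::value,
              "promotes to a signed type, not to the underlying type");

// Adding a member that needs the top bit leaves only unsigned candidates.
static_assert(std::is_unsigned<decltype(+W_2)>::value,
              "promotes to an unsigned type once 2^63 must be representable");

// Fixed underlying type: the enumerator converts to exactly that type.
static_assert(std::is_same<decltype(+F_0), unsigned long>::value,
              "fixed enums promote to their underlying type");
```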
The wording of section 10.2p5, which explicitly says "...prior to the closing brace...", suggests the following interpretation. The type of an enumerator within the definition of the enum type (before the closing brace) is chosen to be some integral type large enough to represent its value. This value may then be reused in the definition of subsequent enumerators in the same enum. When the closing brace of the enum type is encountered, the compiler chooses an integral type large enough to represent all enumerator values. After the definition of the enum type, all enumerator values have the same type (which is the enum type) and share the underlying type of the enum. For example:
#include <iostream>
#include <typeinfo>
#include <type_traits>
enum E1
{
e1 = 0, // type of the initializer (int), value = 0
e2 = e1 + 1U, // type of the initializer (unsigned = int + unsigned), value = 1U
e3 = e1 - 1, // type of the initializer (int = int - int), value = -1
}; // range of values [-1, 1], underlying type is int
int main()
{
std::cout << typeid(std::underlying_type<E1>::type).name() << '\n';
std::cout << typeid(e1).name() << '\n';
std::cout << typeid(e2).name() << '\n';
std::cout << typeid(e3).name() << '\n';
}
Run with clang 5 and gcc 8, it outputs:
i
2E1
2E1
2E1
Why does the code below compile without any errors?
enum class Enumeration;
void func()
{
auto enumeration = static_cast<Enumeration>(2);
auto value = static_cast<int>(enumeration);
}
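It compiles because the compiler knows at compile time the underlying type, and therefore the size, of Enumeration: a scoped enumeration declared without an explicit underlying type uses int. It does not matter that no enumerators have been listed yet.
You can state the underlying type explicitly using the following syntax: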
enum class Enumeration : short;
The compiler knows everything there is to know about the Enumeration.
Enumeration is an opaque-enum-declaration, which also means that the type is complete, i.e. you can use sizeof on it. If needed, you can specify the list of enumerators in a later redeclaration (unless the redeclaration comes with a different underlying type, obviously).
Note that since you are using enum class usage of static_cast is mandatory.
Strongly typed enum does not allow implicit conversion to int but you can safely use static_cast on them to retrieve their integral value.
They are still enums after all.
Quoting cppreference
There are no implicit conversions from the values of a scoped
enumerator to integral types, although static_cast may be used to
obtain the numeric value of the enumerator.
More on this topic here: How to automatically convert strongly typed enum into int?
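Putting the pieces of this answer together, a minimal sketch might look like the following. to_integral is a hypothetical helper name chosen for the example (C++23 provides std::to_underlying for the same job); everything else is standard C++11.

```cpp
#include <type_traits>

// Opaque declaration: no enumerators yet, but the type is already complete
// because a scoped enum without an explicit base has underlying type int.
enum class Enumeration;
static_assert(
    std::is_same<std::underlying_type<Enumeration>::type, int>::value,
    "scoped enums default to int");
static_assert(sizeof(Enumeration) == sizeof(int),
              "sizeof works on the opaque declaration");

// Hypothetical helper: explicit conversion to the underlying type.
template <typename E>
constexpr typename std::underlying_type<E>::type to_integral(E e) noexcept
{
    return static_cast<typename std::underlying_type<E>::type>(e);
}

// The enumerators can be supplied by a later redeclaration.
enum class Enumeration { First = 1, Second = 2 };

static_assert(to_integral(Enumeration::Second) == 2,
              "static_cast retrieves the numeric value");
static_assert(static_cast<Enumeration>(2) == Enumeration::Second,
              "and the cast works in the other direction too");
```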
I came across some code like the following in one of the CppCon 2014 talks that confused the heck out of me. The audience accepted it without comment, so I presume that it's legal:
enum class Foo { Bar };
Foo const v1 = Foo(5);
The question is: why does this compile? I would expect compilation to fail and complain that we can't convert an int to a Foo. The slightly modified line below fails with the expected error:
Foo const v1(5);
Scoped enumeration types have an implicit underlying type of int, assuming no other underlying type is specified. All possible values of type int can be represented.
7.2p5:
[...] For a scoped enumeration type, the underlying type is int if it is not explicitly specified. In both of these cases, the underlying type
is said to be fixed. [...]
7.2p8:
For an enumeration whose underlying type is fixed, the values of the enumeration are the values of the underlying type. [...]
And any integral value that can be represented by the enumeration can be explicitly converted to that enumeration type, as @Columbo had pointed out in his now-deleted answer:
5.2.9p10:
A value of integral or enumeration type can be explicitly converted to an enumeration type. The value is unchanged if the original value is within the range of the enumeration values (7.2). [...]
Since there is some confusion in the comments about what that means:
enum class Foo { Bar };
Foo const v1 = Foo(5);
is well-defined. Not undefined, not unspecified, not even implementation-defined. The parts of the standard I quote explain that:
The underlying type of Foo is int, and that the underlying type is fixed.
The values of Foo are the values of int.
Since 5 is in the range of the enumeration values, the value is unchanged by the conversion.
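The same reasoning can be checked at compile time. This is just a sketch that restates the points above as static_asserts; the variable names are arbitrary.

```cpp
enum class Foo { Bar };

// Explicit conversions from int: both spellings are allowed.
constexpr Foo v1 = Foo(5);               // functional-style cast
constexpr Foo v2 = static_cast<Foo>(5);  // equivalent named cast

// These are not explicit conversions and are rejected:
// Foo v3(5);    // error: cannot initialize a Foo from an int
// Foo v4 = 5;   // error: no implicit conversion from int

static_assert(static_cast<int>(v1) == 5,
              "5 is a value of int, so it survives the conversion");
static_assert(v1 == v2, "both casts produce the same value");
static_assert(v1 != Foo::Bar,
              "the value need not correspond to any enumerator");
```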
In C++11 we can cast a strongly-typed enum (enum class) to its underlying type. But it seems we cannot cast a pointer to the same:
enum class MyEnum : int {};
int main()
{
MyEnum me{}; // value-initialized so that reading it below is well-defined
int iv = static_cast<int>(me); // works
int* ip = static_cast<int*>(&me); // "invalid static_cast"
}
I'm trying to understand why this should be: is there something about the enum mechanism that makes it hard or nonsensical to support this? Is it a simple oversight in the standard? Something else?
It seems to me that if an enum type is truly built on top of an integral type as above, we should be able to cast not only the values but also the pointers. We can still use reinterpret_cast<int*> or a C-style cast but that's a bigger hammer than I thought we'd need.
TL;DR: The designers of C++ don't like type punning.
Others have pointed out why it's not allowed by the standard; I will try to address why the writers of the standard might have made it that way. According to this proposal, the primary motivation for strongly-typed enums was type safety. Unfortunately, type safety means many things to many people. It's fair to assume consistency was another goal of the standards committee, so let's examine type safety in other relevant contexts of C++.
C++ type safety
In C++ in general, types are unrelated unless explicitly specified to be related (through inheritance). Consider this example:
class A
{
double x;
int y;
};
class B
{
double x;
int y;
};
void foo(A* a)
{
B* b = static_cast<B*>(a); //error
}
Even though A and B have the exact same representation (the standard would even call them "standard-layout types"), you cannot convert between them without a reinterpret_cast. Similarly, this is also an error:
class C
{
public:
int x;
};
void foo(C* c)
{
int* intPtr = static_cast<int*>(c); //error
}
Even though we know the only thing in C is an int and you can freely access it, the static_cast fails. Why? It's not explicitly specified that these types are related. C++ was designed to support object-oriented programming, which provides a distinction between composition and inheritance. You can convert between types related by inheritance, but not those related by composition.
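For contrast, here is a sketch of the case where static_cast between pointers is accepted, next to the composition case where you simply name the member instead. The type and function names are invented for the example.

```cpp
#include <cassert>

struct Base    { int x = 1; };
struct Derived : Base { int y = 2; };  // related to Base by inheritance
struct Wrapper { int x = 3; };         // merely contains an int (composition)

// Allowed: the relationship between Base and Derived is declared.
// The caller must know that b really points into a Derived object.
inline Derived* as_derived(Base* b) { return static_cast<Derived*>(b); }

// Not allowed: Wrapper and int are unrelated as far as the type system goes.
//   int* p = static_cast<int*>(&w);   // error: invalid static_cast
// With composition you ask for the member explicitly instead.
inline int* member_of(Wrapper& w) { return &w.x; }

void example_usage()
{
    Derived d;
    Base* b = &d;                      // implicit derived-to-base conversion
    assert(as_derived(b) == &d);       // static_cast reverses it

    Wrapper w;
    assert(*member_of(w) == 3);
}
```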
Based on the behavior you've seen, it's clear strongly-typed enums are related by composition to their underlying types. Why might this have been the model the standard committee chose?
Composition vs Inheritance
There are many articles on this issue better written than anything I could fit here, but I'll attempt to summarize. When to use composition vs. when to use inheritance is certainly a grey area, but there are many points in favor of composition in this case.
Strongly-typed enums are not intended to be used as integral values. Thus the 'is-a' relationship indicated by inheritance does not fit.
On the highest level, enums are meant to represent a set of discrete values. The fact that this is implemented through assigning an id number to each value is generally not important (unfortunately C exposes and thus enforces this relationship).
Looking back at the proposal, the listed reason for allowing a specified underlying type is to specify the size and signedness of the enum. This is much more of an implementation detail than an essential part of the enum, again favoring composition.
You could argue for days about whether or not inheritance or composition is better in this case, but ultimately a decision had to be made and the behavior was modeled on composition.
Instead, look at it in a slightly different way. You can't static_cast a long* to int* even if int and long have identical underlying representations. For the same reason, an enum based on int is still treated as a unique type unrelated to int, and as such requires the reinterpret_cast.
An enumeration is a distinct type (3.9.2) with named constants. [...] Each enumeration defines a type that is different from all other types. [...] Two enumeration types are layout-compatible if they have the same underlying type.
[dcl.enum] (§7.2)
The underlying type specifies the layout of the enum in memory, not its relation to other types in the type system (as the standard says, it's a distinct type, a type of its own). A pointer to an enum : int {} can never implicitly convert to an int*, the same way that a pointer to a struct { int i; }; cannot, even though they all look the same in memory.
So why does the implicit conversion to int work in the first place?
For an enumeration whose underlying type is fixed, the values of the
enumeration are the values of the underlying type. [...] The value of
an enumerator or an object of an unscoped enumeration type is
converted to an integer by integral promotion (4.5).
[dcl.enum] (§7.2)
So we can assign values of an enum to an int because they are values of its underlying type, int. An object of enum type can be assigned to an int because of the rules of integral promotion. By the way, the standard here specifically points out that this is only true for C-style (unscoped) enums. This means that you still need the static_cast<int> in the first line of your example, but as soon as you turn the enum class : int into an enum : int it will work without the explicit cast. Still no luck with the pointer type though.
Integral promotions are defined in the standard at [conv.prom] (§4.5). I'll spare you the details of quoting the full section, but the important detail here is that all rules in there apply to prvalues of non-pointer types, so none of this applies to our little problem.
The final piece of the puzzle can be found in [expr.static.cast] (§5.2.9), which describes how static_cast works.
A value of a scoped enumeration type (7.2) can be explicitly converted
to an integral type.
That explains why your cast from enum class to int works.
But note that all of the static_casts allowed on pointer types (again, I won't quote the rather lengthy section) require some relationship between the types. If you remember the beginning of the answer, each enum is a distinct type, so there is no relationship to their underlying type or other enums of the same underlying type.
This ties in with @MarkB's answer: static-casting a pointer to an enum to a pointer to int is analogous to casting a pointer from one integral type to another. Even if both have the same memory layout underneath and values of one will implicitly convert to the other by the rules of integral promotion, they are still unrelated types, so static_cast will not work here.
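The relationships described in this answer can be spelled out with type traits. The sketch below is illustrative only: read_as_int is a hypothetical helper that copies the object representation with std::memcpy, which is one way to get at the underlying int without casting pointers at all.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

enum class MyEnum : int { Zero = 0, Seven = 7 };

static_assert(!std::is_same<MyEnum, int>::value,
              "the enum is a distinct type");
static_assert(sizeof(MyEnum) == sizeof(int),
              "but it has the size of its underlying type");
static_assert(!std::is_convertible<MyEnum, int>::value,
              "scoped enum values do not convert implicitly");
static_assert(!std::is_convertible<MyEnum*, int*>::value,
              "and neither do pointers to them");

// Hypothetical helper: copies the bytes of the enum object into an int
// instead of reinterpreting a pointer to it.
inline int read_as_int(const MyEnum& e)
{
    int out;
    std::memcpy(&out, &e, sizeof out);
    return out;
}

void example_usage()
{
    MyEnum me = MyEnum::Seven;
    assert(static_cast<int>(me) == 7);   // value conversion: explicit cast
    assert(read_as_int(me) == 7);        // same bits, no pointer cast
}
```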
I think the error of thinking is that
enum class MyEnum : int {};
is not really inheritance. Of course you can say MyEnum is an int. However, it is different from classic inheritance, inasmuch as not all operations that are available on ints are available for MyEnum also.
Let's compare this to the following: A circle is an ellipse. However, it would almost always be wrong to implement a CircleShape as inheriting from EllipseShape, since not all operations that are possible on ellipses are also possible for circles. A simple example would be scaling the shape in the x direction.
Hence, to think of enum classes as inheriting from an integer type leads to the confusion in your case. You cannot increment an instance of an enum class, but you can increment integers. Since it's not really inheritance, it makes sense to prohibit casting pointers to these types statically. The following line is not safe:
++*reinterpret_cast<int*>(&me);
This might be the reason why the committee prohibited static_cast in this case. In general reinterpret_cast is considered to be evil while static_cast is considered to be ok.
The answers to your questions can be found in the section 5.2.9 Static cast in the draft standard.
Support for allowing
int iv = static_cast<int>(me);
can be obtained from:
5.2.9/9 A value of a scoped enumeration type (7.2) can be explicitly converted to an integral type. The value is unchanged if the original value can be represented by the specified type. Otherwise, the resulting value is unspecified.
Support for allowing
me = static_cast<MyEnum>(100);
can be obtained from:
5.2.9/10 A value of integral or enumeration type can be explicitly converted to an enumeration type. The value is unchanged if the original value is within the range of the enumeration values (7.2). Otherwise, the resulting value is unspecified (and might not be in that range).
Support for not allowing
int* ip = static_cast<int*>(&me);
can be obtained from:
5.2.9/11 A prvalue of type “pointer to cv1 B,” where B is a class type, can be converted to a prvalue of type “pointer to cv2 D,” where D is a class derived (Clause 10) from B, if a valid standard conversion from “pointer to D” to “pointer to B” exists (4.10), cv2 is the same cv-qualification as, or greater cv-qualification than, cv1, and B is neither a virtual base class of D nor a base class of a virtual base class of D. The null pointer value (4.10) is converted to the null pointer value of the destination type. If the prvalue of type “pointer to cv1 B” points to a B that is actually a subobject of an object of type D, the resulting pointer points to the enclosing object of type D. Otherwise, the result of the cast is undefined.
static_cast cannot be used to cast &me to an int* since MyEnum and int are not related by inheritance.
I think the reason for the first static_cast is to be able to work with functions and libraries that expect old-style enums, or that even use a bunch of defined values for enumerations and directly expect an integral type. But there is no other logical relation between an enum type and an integral type, so you should use reinterpret_cast if you want that cast. If you have problems with reinterpret_cast, you can use your own helper:
#include <type_traits> // std::enable_if, std::is_enum, std::underlying_type
template< class EnumT >
typename std::enable_if<
std::is_enum<EnumT>::value,
typename std::underlying_type<EnumT>::type*
>::type enum_as_pointer(EnumT& e)
{
return reinterpret_cast<typename std::underlying_type<EnumT>::type*>(&e);
}
or
template< class IntT, class EnumT >
IntT* static_enum_cast(EnumT* e,
typename std::enable_if<
std::is_enum<EnumT>::value &&
std::is_convertible<
typename std::underlying_type<EnumT>::type*,
IntT*
>::value
>::type** = nullptr)
{
return reinterpret_cast<typename std::underlying_type<EnumT>::type*>(e); // e is already a pointer; &e would be an EnumT**
}
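While this answer may not satisfy you about the reason for prohibiting static_cast of enum pointers, it gives you a more constrained way to use reinterpret_cast with them: the helpers only accept enum types and only produce a pointer to the matching underlying type.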
If I have a strongly-typed enum, with say, underlying type int, is it ok to cast an int value that does not match any enumerator to the enum type?
enum e1 : int { x = 0, y = 1 };
enum class e2 : int { x = 0, y = 1 };
int main() {
e1 foo = static_cast<e1>(42); // is this UB?
e2 bar = static_cast<e2>(42);
}
From n3290, 5.2.9 Static cast [expr.static.cast]:
10 A value of integral or enumeration type can be explicitly converted
to an enumeration type. The value is unchanged if the original value
is within the range of the enumeration values (7.2). Otherwise, the
resulting value is unspecified (and might not be in that range). [...]
Enumeration type comprises both the types declared with enum and those declared with enum class or enum struct, which the Standard calls unscoped enumerations and scoped enumerations respectively. They are described in more detail in 7.2 Enumeration declarations [dcl.enum].
The values of an enumeration type are not to be confused with its enumerators. In your case, since the enumerations you declared all have int as their underlying type, their range of values is the same as that of int: from INT_MIN to INT_MAX (inclusive).
Since 42 has type int and is obviously a value of int the behaviour is defined.
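To illustrate the difference between values and enumerators, here is a short sketch. is_named is a hypothetical helper added for the example; it answers the separate question of whether a value corresponds to one of the declared enumerators, which the cast itself never checks.

```cpp
#include <cassert>

enum e1 : int { x = 0, y = 1 };
enum class e2 : int { x = 0, y = 1 };

// Hypothetical helper: true only for values that have a named enumerator.
inline bool is_named(e2 v)
{
    switch (v) {
    case e2::x:
    case e2::y:
        return true;
    default:
        return false;
    }
}

void example_usage()
{
    e1 foo = static_cast<e1>(42);  // fine: 42 is a value of int
    e2 bar = static_cast<e2>(42);  // fine for the same reason

    assert(foo == 42);                    // unscoped: promotes to int
    assert(static_cast<int>(bar) == 42);  // scoped: explicit cast needed
    assert(!is_named(bar));               // valid value, but no enumerator
    assert(is_named(e2::y));
}
```

If your code relies on the value being one of the enumerators, a check along these lines has to be done by hand at the point where the integer enters the program.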