static_cast on integer to enum conversion - C++

There is some function that takes an enum as an argument:
void myfunc(myEnum input);
As I understand it, if I have to pass an integer to this function, it is advised to explicitly cast it to the enum type, the reason being that not all integers are valid enum values.
As per MSDN
"The static_cast operator can explicitly convert an integral value to
an enumeration type. If the value of the integral type does not fall
within the range of enumeration values, the resulting enumeration
value is undefined."
and as per the C++ standard, 5.2.9 Static cast, paragraph 10:
"A value of integral or enumeration type can be explicitly converted
to an enumeration type. The value is unchanged if the original value
is within the range of the enumeration values (7.2). Otherwise, the
resulting value is unspecified (and might not be in that range)."
So what's the point of using static_cast in this scenario? Is there some option that would raise an exception on values outside the enum range (other than writing explicit code for that)?

As usual, the compiler is just trying to keep you from shooting yourself in the foot. That's why you cannot just pass an int to a function expecting an enum. The compiler will rightfully complain, because the int might not match any valid enum value.
By adding the cast you basically tell the compiler 'Shut up, I know what I am doing'. What you are communicating here is that you are sure that the value you pass in is 'within the range of the enumeration values'. And you better make sure that is the case, or you are on a one-way trip to undefined-behavior-land.
If this is so dangerous, then why doesn't the compiler add a runtime check for the integer value? The reason is, as so often with C++, performance. Maybe you just know from the surrounding program logic that the int value will always be valid and you absolutely cannot waste any time on superfluous runtime checks. From a language-design point of view, this might not be the most reasonable default to choose, especially when your goal is writing robust code. But that's just how C++ works: a developer should never have to pay for functionality that they might not want to use.
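There is no built-in option that throws; if you want a runtime check, you have to write it yourself. Below is a minimal sketch of such a helper. The enum values and the helper name toMyEnum are made up, and it assumes the valid values are contiguous from First to Last:

#include <stdexcept>

enum myEnum { First = 0, Second = 1, Last = 2 };  // hypothetical example values

myEnum toMyEnum(int value) {
    // Manual range check; neither the compiler nor static_cast does this for us.
    if (value < First || value > Last)
        throw std::out_of_range("value outside myEnum range");
    return static_cast<myEnum>(value);
}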

Related

C++ - What underlying type would an enum with more than 18,446,744,073,709,551,616 elements have?

This is more of a hypothetical question than a practical one. Of course there would be memory issues if we actually tried to compile a program with an enum that large.
Enums in C++ will take on an underlying type that fits the maximum element of the enum. Also, if I specify no integer values, then each element is always 1 more than the previous, starting at 0. So, for example, if I make an enum with 5 elements (or labels, whatever you call them), then its type could be something like int, since that can fit the values 0, 1, 2, 3, 4.
The largest integral type in C++ is long long, and an unsigned long long can take a value of up to 18,446,744,073,709,551,615. So what would happen if I made an enum with 18,446,744,073,709,551,616 elements? This would exceed the largest integral type, so it would need a type other than long long.
What does the C++ spec have to say about this loophole?
There's no loophole. You'll just get a diagnostic in a conforming compiler.
[dcl.enum]
7 ... If no integral type can represent all the enumerator values, the enumeration is ill-formed. ...
Or more practically, you'll get an error and compilation will halt.
It might even halt sooner, due to implementation-defined limits, because that's a lot of enumerators.
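As a side note, the underlying-type selection described in the question can be inspected with std::underlying_type. This is only a sketch, and the exact sizes printed are implementation-defined:

#include <iostream>
#include <type_traits>

enum Small { V0, V1, V2, V3, V4 };          // 0..4 fit easily in an int
enum Big   { BigVal = 0xFFFFFFFFFFFFULL };  // needs more than 32 bits

int main() {
    // The compiler picks an underlying type wide enough for every enumerator.
    std::cout << sizeof(std::underlying_type_t<Small>) << '\n';  // typically 4
    std::cout << sizeof(std::underlying_type_t<Big>)   << '\n';  // typically 8
}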

Is converting an integer to a pointer always well defined?

Is this valid C++?
int main() {
int *p;
p = reinterpret_cast<int*>(42);
}
Assuming I never dereference p.
Looking up the C++ standard, we have
C++17 §6.9.2/3 [basic.compound]
3 Every value of pointer type is one of the following:
a pointer to an object or function (the pointer is said to point to the object or function), or
a pointer past the end of an object ([expr.add]), or
the null pointer value ([conv.ptr]) for that type, or
an invalid pointer value.
A value of a pointer type that is a pointer to or past the end of an
object represents the address of the first byte in memory
([intro.memory]) occupied by the object or the first byte in memory
after the end of the storage occupied by the object, respectively. [
Note: A pointer past the end of an object ([expr.add]) is not
considered to point to an unrelated object of the object's type that
might be located at that address. A pointer value becomes invalid when
the storage it denotes reaches the end of its storage duration; see
[basic.stc]. — end note ] For purposes of pointer arithmetic
([expr.add]) and comparison ([expr.rel], [expr.eq]), a pointer past
the end of the last element of an array x of n elements is considered
to be equivalent to a pointer to a hypothetical array element n of x
and an object of type T that is not an array element is considered to
belong to an array with one element of type T.
p = reinterpret_cast<int*>(42); does not fit into the list of possible values. And:
C++17 §8.2.10/5 [expr.reinterpret.cast]
A value of integral type or enumeration type can be explicitly
converted to a pointer. A pointer converted to an integer of
sufficient size (if any such exists on the implementation) and back to
the same pointer type will have its original value; mappings between
pointers and integers are otherwise implementation-defined. [ Note:
Except as described in 6.7.4.3, the result of such a conversion will
not be a safely-derived pointer value. — end note ]
The C++ standard does not seem to say more about the integer-to-pointer conversion. Looking at the C17 standard:
C17 §6.3.2.3/5 (emphasis mine)
An integer may be converted to any pointer type. Except as
previously specified, the result is implementation-defined, might not
be correctly aligned, might not point to an entity of the referenced
type, and might be a trap representation.68)
and
C17 §6.2.6.1/5
Certain object representations need not represent a value of the
object type. If the stored value of an object has such a
representation and is read by an lvalue expression that does not have
character type, the behavior is undefined. If such a representation is
produced by a side effect that modifies all or any part of the object
by an lvalue expression that does not have character type, the
behavior is undefined.50) Such a representation is called a trap
representation.
To me, it seems like any value that does not fit into the list in [basic.compound] is a trap representation, thus p = reinterpret_cast<int*>(42); is UB. Am I correct? Is there something else making p = reinterpret_cast<int*>(42); undefined?
This is not UB, but implementation-defined, and you already cited why (§8.2.10/5 [expr.reinterpret.cast]). If a pointer has an invalid pointer value, it doesn't necessarily mean that it has a trap representation. It can have a trap representation, but then the compiler must document this. All you have here is a pointer that is not safely derived.
Note that we generate pointers with invalid pointer values all the time: if an object is freed by delete, all the pointers that pointed to this object have invalid pointer values.
Using the resulting pointer is implementation-defined as well (not UB):
[...] if the object to which the glvalue refers contains an invalid pointer value ([basic.stc.dynamic.deallocation], [basic.stc.dynamic.safety]), the behavior is implementation-defined.
The example shown is valid C++. On some platforms this is how you access "hardware resources" (and if it were not valid, you would have found a bug/mistake in the standard text).
See also this answer for a better explanation.
Update:
The first sentence of the reinterpret_cast passage, as you quoted yourself:
A value of integral type or enumeration type can be explicitly converted to a pointer.
I recommend you stop reading right there. The rest is just a lot of detail, including possible implementation-specified behavior, etc. None of that makes it UB or invalid.
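To illustrate the "hardware resources" point: on embedded platforms, code along the lines of the following sketch is common. The address is invented, and the behavior is defined by the platform's documentation, not by ISO C++:

#include <cstdint>

// Hypothetical memory-mapped status register; the address is made up.
// The int-to-pointer mapping is implementation-defined, so this only means
// something where the platform documents that mapping.
volatile std::uint32_t* const status_reg =
    reinterpret_cast<volatile std::uint32_t*>(0x40000000u);

std::uint32_t read_status() {
    return *status_reg;  // well-defined only per the platform's documentation
}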
Trap Representations
What: As covered by [C17 §6.2.6.1/5], a trap representation is a non-value. It is a bit pattern that fills the space allocated for an object of a given type, but this pattern does not correspond to a value of that type. It is a special pattern that can be recognized for the purpose of triggering behavior defined by the implementation. That is, the behavior is not covered by the standard, which means it falls under the banner of "undefined behavior". The standard sets out the possibilities for when a trap could be (not must be) triggered, but it makes no attempt to limit what a trap might do. For more information, see A: trap representation.
The undefined behavior associated with a trap representation is interesting in that an implementation has to check for it. The more common cases of undefined behavior were left undefined so that implementations do not need to check for them. The need to check for trap representations is a good reason to want few trap representations in an efficient implementation.
Who: The decision of which bit patterns (if any) constitute trap representations falls to the implementation. The standards do not force the existence of trap representations; when trap representations are mentioned, the wording is permissive, as in "might be", as opposed to demanding, as in "shall be". Trap representations are allowed, not required. In fact, N2091 came to the conclusion that trap representations are largely unused in practice, leading up to a proposal to remove them from the C standard. (It also proposes a backup plan if removal proves infeasible: explicitly call out that implementations must document which representations are trap representations, as there is no other way to know for sure whether or not a given bit pattern is a trap representation.)
Why: Theoretically, a trap representation could be used as a debugging aid. For example, an implementation could declare that 0xDDDD is a trap representation for pointer types, then choose to initialize all otherwise uninitialized pointers to this bit pattern. Reading this bit pattern could trigger a trap that alerts the programmer to the use of an uninitialized pointer. (Without the trap, a crash might not occur until later, complicating the debugging process. Sometimes early detection is the key.) In any event, a trap representation requires a trap of some sort to serve a purpose. An implementation would not define a trap representation without also defining its trap.
My point is that trap representations must be specified. They are deliberately removed from the set of values of a given type. They are not simply "everything else".
Pointer Values
C++17 §6.9.2/3 [basic.compound]
This section defines what an invalid pointer value is. It states "Every value of pointer type is one of the following" before listing four possibilities. That means that if you have a pointer value, then it is one of the four possibilities. The first three are fully specified (pointer to object or function, pointer past the end, and null pointer). The last possibility (invalid pointer value) is not fully specified elsewhere, so it becomes the catch-all "everything else" entry in the list (it is a "wild card", to borrow terminology from the comments). Hence this section defines "invalid pointer value" to mean a pointer value that does not point to something, does not point to the end of something, and is not null. If you have a pointer value that does not fit one of those three categories, then it is invalid.
In particular, if we agree that reinterpret_cast<int*>(42) does not point to something, does not point to the end of something, and is not null, then we must conclude that it is an invalid pointer value. (Admittedly, one could assume that the result of the cast is a trap representation for pointers in some implementation. In that case, yes, it does not fit into the list of possible pointer values because it would not be a pointer value, hence it's a trap representation. However, that is circular logic. Furthermore, based upon N2091, few implementations define any trap representations for pointers, so the assumption is likely groundless.)
[ Note: [...] A pointer value becomes invalid when the storage it denotes reaches the end of its storage duration; see [basic.stc]. — end note ]
I should first acknowledge that this is a note. It explains and clarifies without adding new substance. One should expect no definitions in a note.
This note gives an example of an invalid pointer value. It clarifies that a pointer can (perhaps surprisingly) change from "points to an object" to "invalid pointer value" without changing its value. Looking at this from a formal logic perspective, this note is an implication: "if [something] then [invalid pointer]". Viewing this as a definition of "invalid pointer" is a fallacy; it is merely an example of one of the ways one can get an invalid pointer.
Casting
C++17 §8.2.10/5 [expr.reinterpret.cast]
A value of integral type or enumeration type can be explicitly converted to a pointer.
This explicitly permits reinterpret_cast<int*>(42). Therefore, the behavior is defined.
To be thorough, one should make sure there is nothing in the standard that makes 42 "erroneous data" to the degree that undefined behavior results from the cast. The rest of [§8.2.10/5] does not do this, and:
C++ standard does not seem to say more about the integer to pointer conversion.
Is this valid C++?
Yes.

Why are C++ array index values signed and not built around the size_t type (or am I wrong in that)?

It's getting harder and harder for me to keep track of the ever-evolving C++ standard, but one thing that seems clear to me now is that array index values are meant to be integers (not long long or size_t or some other seemingly more appropriate choice for a size). I've surmised this both from the answer to this question (Type of array index in C++) and also from practices used by well-established C++ libraries (like Qt), which also use a simple integer for sizes and array index operators. The nail in the coffin for me is that I am now getting a plethora of compiler warnings from MSVC 2017 stating that my const unsigned long long (aka const size_t) variables are being implicitly converted to type const int when used as an array index.
The answer given by Mat in the question linked above quotes the ISO C++ standard draft n3290 as saying
it shall be an integral constant expression and its value shall be greater than zero.
I have no background in reading these specs and precisely interpreting their language, so maybe a few points of clarification:
Does an "integral constant expression" specifically forbid things like long long which to me is an integral type, just a larger sized one?
Does what they're saying specifically forbid a type that is tagged unsigned like size_t?
If all I am seeing here is true and array index values are meant to be signed int types, why? This seems counter-intuitive to me. The specs even state that the expression "shall be greater than zero", so we're wasting a bit if it is signed. Sure, we still might want to compare the index with 0 in some way, and this is dangerous with unsigned types, but there should be cheaper ways to solve that problem that waste only a single value, not an entire bit.
Also, with registers ever widening, a more future-proof solution would be to allow larger types for the index (like long long) rather than sticking with int, which is a problematic type historically anyway (it changed size when processors moved to 32 bits, then didn't when they moved to 64 bits). I even see some people talking about size_t anecdotally as if it were designed to be a more future-proof type for use with sizes (and not JUST the type returned in service of the sizeof operator). But of course, that might be apocryphal.
I just want to make sure my foundational programming understanding here is not flawed. When I see experts like the ISO C++ group doing something, or the engineers of Qt, I give them the benefit of the doubt that they have a good reason! For something like an array index, so fundamental to programming, I feel like I need to know what that reason is or I might be missing something important.
Looking at [expr.sub]/1 we have
A postfix expression followed by an expression in square brackets is a postfix expression. One of the expressions shall be a glvalue of type “array of T” or a prvalue of type “pointer to T” and the other shall be a prvalue of unscoped enumeration or integral type. The result is of type “T”. The type “T” shall be a completely-defined object type.67 The expression E1[E2] is identical (by definition) to *((E1)+(E2)), except that in the case of an array operand, the result is an lvalue if that operand is an lvalue and an xvalue otherwise. The expression E1 is sequenced before the expression E2.
emphasis mine
So the index of the subscript operator needs to be of unscoped enumeration or integral type. Looking at [basic.fundamental], we see that the standard integer types are signed char, short int, int, long int, and long long int, and their unsigned counterparts.
So any of the standard integer types will work, and any other integer type, like size_t, is a valid type to use as an array index. The value supplied to the subscript operator can even be negative, so long as it would access a valid element.
I would argue that the standard library API prefers that indexes be of an unsigned type. If you look at the documentation for std::size_t, it notes
When indexing C++ containers, such as std::string, std::vector, etc, the appropriate type is the member typedef size_type provided by such containers. It is usually defined as a synonym for std::size_t.
This is reinforced when looking at signatures for functions such as std::vector::at
reference at( size_type pos );
const_reference at( size_type pos ) const;
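For example, using the container's own size_type for an index variable keeps the types aligned and avoids the sign-conversion warnings mentioned in the question. A small sketch:

#include <vector>

int main() {
    std::vector<int> v{10, 20, 30};
    // size_type matches what size() returns and what at()/operator[] expect.
    for (std::vector<int>::size_type i = 0; i < v.size(); ++i)
        v.at(i) += 1;  // no signed/unsigned mismatch warning here
}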
I think you are confusing two types:
The first type is the type of object/value that can be used to define the size of an array. Unfortunately, the question that you link to uses index where it should have used array size. This must be an expression that can be evaluated at compile time, and its value must be greater than zero.
int array[SomeExpression]; // Valid as long as SomeExpression can be evaluated
// at compile time and the value is greater than zero.
The second type is the type of object/value that can be used to access an array. Given the above array,
array[i] = SomeValue; // i is an index to access the array
i does not need to be evaluated at compile time; i must be in the range [0, SomeExpression-1]. However, it is possible to use negative values as the index to access an array. Since array[i] is evaluated as *(array+i) (ignoring for the time being overloaded operator[] functions), i can be a negative value if array happens to point to the middle of an array. My answer to another SO post has more information on the subject.
Just as an aside, since array[i] is evaluated as *(array+i), it is legal to write i[array], and it means the same as array[i].
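A short sketch of both points; the array and offsets here are made up:

#include <cassert>

int main() {
    int data[5] = {10, 20, 30, 40, 50};
    int* mid = data + 2;       // points into the middle of the array

    // A negative index is fine as long as the element it names exists:
    assert(mid[-1] == 20);     // *(mid + (-1)) is data[1]

    // Since a[i] is *(a + i), i[a] names the same element as a[i]:
    assert(2[data] == data[2]);
}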

What does the C++ standard say about results of casting value of a type that lies outside the range of the target type?

Recently I had to perform some data type conversions from float to 16-bit integer. Essentially my code reduces to the following:
float f_val = 99999.0;
short int si_val = static_cast<short int>(f_val);
// si_val is now -32768
This input value was a problem, and in my code I had neglected to check the limits of the float value, so I can see my fault, but it made me wonder about the exact rules of the language when one has to do this kind of ungainly cast. I was slightly surprised to find that the value of the cast was -32768. Furthermore, this is the value I get whenever the float value exceeds the limits of a 16-bit integer. I have googled this but found a surprising lack of detailed info about it. The best I could find was the following from cplusplus.com:
Converting to int from some smaller integer type, or to double from
float is known as promotion, and is guaranteed to produce the exact
same value in the destination type. Other conversions between
arithmetic types may not always be able to represent the same value
exactly:
If the conversion is from a floating-point type to an integer type, the value
is truncated (the decimal part is removed).
The conversions from/to bool consider false equivalent to zero (for numeric
types) and to null pointer (for pointer types); and true equivalent to all
other values.
Otherwise, when the destination type cannot represent the value, the conversion
is valid between numerical types, but the value is
implementation-specific (and may not be portable).
This suggestion that the results are implementation-defined does not surprise me, but I have heard that cplusplus.com is not always reliable.
Finally, when performing the same cast from a 32-bit integer to a 16-bit integer (again with a value outside the 16-bit range), I saw results clearly indicating integer overflow. Although I was not surprised by this, it has added to my confusion due to the inconsistency with the cast from the float type.
I have no access to the C++ standard, but a lot of C++ people here do so I was wondering what the standard says on this issue? Just for completeness, I am using g++ version 4.6.3.
You're right to question what you've read. The conversion has no defined behaviour, which contradicts what you quoted in your question.
4.9 Floating-integral conversions [conv.fpint]
1 A prvalue of a floating point type can be converted to a prvalue of an integer type. The conversion truncates;
that is, the fractional part is discarded. The behavior is undefined if the truncated value cannot be
represented in the destination type. [ Note: If the destination type is bool, see 4.12. -- end note ]
One potentially useful permitted result that you might get is a crash.
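Since the behavior is undefined once the truncated value cannot be represented, any check has to happen before the cast. Below is a sketch of such a guard; the helper name is made up, and the check is slightly conservative near the bounds (it rejects values like 32767.5 whose truncation would still fit):

#include <cmath>
#include <limits>
#include <stdexcept>

short int to_short_checked(float f) {
    // Reject NaN and anything outside the target range before casting.
    // The limits of a 16-bit short (-32768 and 32767) are exactly
    // representable in float, so these comparisons are exact.
    if (std::isnan(f) ||
        f < static_cast<float>(std::numeric_limits<short int>::min()) ||
        f > static_cast<float>(std::numeric_limits<short int>::max()))
        throw std::out_of_range("float value outside short int range");
    return static_cast<short int>(f);
}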

difference between widening and narrowing in c++?

What is the difference between widening and narrowing in C++?
What is meant by casting, and what are the types of casting?
This is a general casting thing, not C++ specific.
A "widening" cast is a cast from one type to another, where the "destination" type has a larger range or precision than the "source" (e.g. int to long, float to double). A "narrowing" cast is the exact opposite (long to int). A narrowing cast introduces the possibility of overflow.
Widening casts between built-in primitives are implicit, meaning you do not have to specify the new type with the cast operator, unless you want the type to be treated as the wider type during a calculation. By default, types are cast to the widest actual type used on the variable's side of a binary expression or assignment (not counting any types on the other side).
Narrowing casts, on the other hand, must be made explicitly, and overflow exceptions must be handled unless the code is marked as not being checked for overflow (the keyword in C# is unchecked; I do not know if it's unique to that language).
A widening conversion is when you go from an integer to a double: you are increasing the precision of the cast.
A narrowing conversion is the inverse of that, when you go from a double to an integer: you are losing precision.
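A short illustration of both directions (the values are arbitrary):

#include <iostream>

int main() {
    int i = 123456789;
    double d = i;                     // widening: implicit, value preserved exactly

    float f = static_cast<float>(i);  // narrowing in precision: float cannot
                                      // hold all of int's significant digits

    long long big = 1000000000000LL;
    int n = static_cast<int>(big);    // narrowing in range: the value does
                                      // not fit in a 32-bit int

    std::cout << d << ' ' << f << ' ' << n << '\n';
}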
There are two types of casting: implicit and explicit casting. The page below will be helpful. Also, the entire website is pretty much the go-to for C/C++ needs.
Tutorial on casting and conversion
Take home exam? :-)
Let's take casting first. Every object in C or C++ has a type, which is nothing more than the name give to two kinds of information: how much memory the thing takes up, and what operations you can do on it.
So
int i;
just means that i refers to some location in memory, usually 32 bits wide, on which you can do +, -, *, /, %, ++, -- and some others.
C isn't really picky about it, though:
int * ip;
defines another type, called pointer to integer, which represents an address in memory. It has an additional operation, prefix *. On many machines, that also happens to be 32 bits wide.
A cast, or typecast, tells the compiler to treat memory identified as one type as if it were another type. Typecasts are written as (typename).
So
(int*) i;
means "treat i as if it were a pointer, and
(int) ip;
means "treat the pointer ip as just an integer number".
Now, in this context, widening and narrowing mean casting from one type to another that has more or fewer bits respectively.