Comparing two constexpr pointers is not constexpr? - c++

I am looking for a way to map types to numeric values at compile time, ideally without using a hash as proposed in this answer.
Since pointers can be constexpr, I tried this:
struct Base{};
template<typename T> struct instance : public Base{};
template<typename T>
constexpr auto type_instance = instance<T>{};
template<typename T>
constexpr const Base* type_pointer = &type_instance<T>;
constexpr auto x = type_pointer<int> - type_pointer<float>; // not a constant expression
Both gcc and clang reject this code because type_pointer<int> - type_pointer<float> is not a constant expression, see here, for instance.
Why though?
I can understand that the difference between both values is not going to be stable from one compilation to the next, but within one compilation, it should be constexpr, IMHO.

Subtraction of two non-null pointers which do not point into the same array or to the same object (including one-past-the-array/object) is undefined behavior, see [expr.add] (in particular paragraphs 5 and 7) of the C++17 standard (final draft).
Expressions which would have core undefined behavior[1] if evaluated are never constant expressions, see [expr.const]/2.6.
Therefore type_pointer<int> - type_pointer<float> cannot be a constant expression, because the two pointers are to unrelated objects.
Since type_pointer<int> - type_pointer<float> is not a constant expression, it cannot be used to initialize a constexpr variable such as
constexpr auto x = type_pointer<int> - type_pointer<float>;
Trying to use a non-constant expression as the initializer of a constexpr variable makes the program ill-formed and requires the compiler to print a diagnostic message. That diagnostic is the error message you are seeing.
Basically compilers are required to diagnose core undefined behavior when it appears in purely compile-time contexts.
You can see that there will be no error if the pointers are to the same object, e.g.:
constexpr auto x = type_pointer<int> - type_pointer<int>;
Here the subtraction is well-defined and the initializer is a constant expression. So the code will compile (and won't have undefined behavior). x will have a well-defined value of 0.
Be aware that if you make x non-constexpr the compiler won't be required to diagnose the undefined behavior and to print a diagnostic message anymore. It is therefore likely to compile.
Subtracting unrelated pointers is still undefined behavior though, not merely unspecified behavior. Therefore you will lose any guarantee on what the resulting program will do. The consequence is not just that you may get different values for x in each compilation/execution of the code.
[1] Core undefined behavior here refers to undefined behavior in the core language, in contrast with undefined behavior due to use of the standard library. It is unspecified whether undefined behavior as specified for the library causes an (otherwise constant) expression to not be a constant expression, see final sentence before the example in [expr.const]/2.

Related

Does implicit object creation apply in constant expressions?

#include <memory>
int main() {
constexpr auto v = [] {
std::allocator<char> a;
auto x = a.allocate(10);
x[2] = 1;
auto r = x[2];
a.deallocate(x, 10);
return r;
}();
return v;
}
Is the program ill-formed? Clang thinks so, GCC and MSVC don't: https://godbolt.org/z/o3bcbxKWz
Removing the constexpr I think the program is not ill-formed and has well-defined behavior:
By [allocator.members]/5 the call a.allocate(10) starts the lifetime of the char[10] array it allocates storage for.
According to [intro.object]/13 starting the lifetime of an array of type char implicitly creates objects in its storage.
Scalar types such as char are implicit-lifetime types ([basic.types.general]/9).
[intro.object]/10 then says that objects of type char are created in the storage of the char[10] array (and their lifetime started) if that can give the program defined behavior.
Without beginning the lifetime of the char object at x[2], the program without constexpr would have undefined behavior due to the write to x[2] outside its lifetime, but the char object can be implicitly created due to the arguments above, making the program behavior well-defined to exit with status 1.
With constexpr, I am wondering if the program is ill-formed or not. Does implicit object creation apply in constant expressions?
According to [intro.object]/10 objects are implicitly created to give the program defined behavior, but does being ill-formed count as defined behavior?
If not, then the program should not be ill-formed because of implicit creation of the char object for x[2].
If yes, then the next question would be if it is unspecified whether the program is ill-formed or not, because [intro.object]/10 also says that it is unspecified which objects are implicitly created if multiple sets can give the program defined behavior.
From a language design perspective I would expect that implicit object creation is not supposed to happen in constant expressions, because verifying the (non-)existence of a set of objects making the constant expression valid is probably infeasible for a compiler in general.
2469. Implicit object creation vs constant expressions
It is not intended that implicit object creation, as described in 6.7.2 [intro.object] paragraph 10, should occur during constant expression evaluation, but there is currently no wording prohibiting it.
Clang is incorrect here. You've already cited all the parts of the spec that make it well-formed. std::allocator<T>::allocate is constexpr; you get a char*; allocator<T>::allocate creates an array of T; creating an array of char implicitly creates objects; accessing a char would otherwise be UB, but IOC prevents that, so UB doesn't happen. Therefore the code isn't allowed to be ill-formed.
Clang claims full support for both IOC and constexpr allocators, so this code should work.
Does implicit object creation apply in constant expressions?
All expressions are core constant expressions unless [expr.const]/5 explicitly excludes them. Nothing there mentions operations whose UB status depends on which objects are implicitly created, so such operations must be included.
IOC prevents an expression from being UB.
I would expect that implicit object creation is not supposed to happen in constant expressions, because verifying the (non-)existence of a set of objects making the constant expression valid is probably infeasible for a compiler in general.
You have forgotten about the other restrictions on constexpr code. So long as [expr.const]/5 continues to explicitly forbid reinterpret_cast and conversions from void*, the number of ways you can abuse IOC is pretty limited. You cannot, for example, take the pointer returned by your allocate(10) call and convert it to an int*. So the compiler knows that the only objects that can be implicitly created in that storage are chars.
So at constant evaluation time, the compiler could just take the result of allocator<char>::allocate and create all the char members of that array immediately before returning it. There's no constexpr-valid way for you to take that storage and implicitly create anything other than chars.
And using allocator<T>::allocate when T isn't a byte-wise type will not implicitly create objects in that storage. So either you're just getting a pointer to an array of unformed elements, or you're getting a pointer to an array of byte-wise types.
I'd guess Clang forgot to check this particular case.

Is reading a variable outside its lifetime during constant evaluation diagnosable?

Should one expect constant evaluation to fail reliably if it reads a variable outside of its lifetime?
For example:
constexpr bool g() {
int * p = nullptr;
{
int c = 0;
p = &c;
}
return *p == 0;
};
int main() {
static_assert( g() );
}
Here Clang stops with the error
read of object outside its lifetime is not allowed in a constant expression
But GCC accepts the program silently (Demo).
Are both compilers within their rights, or must GCC fail the compilation as well?
GCC dropped the ball.
[expr.const]
5 An expression E is a core constant expression unless the
evaluation of E, following the rules of the abstract machine
([intro.execution]), would evaluate one of the following:
...
an operation that would have undefined behavior as specified in [intro] through [cpp];
...
Indirection via dangling pointer has undefined behavior.
[basic.stc.general]
4 When the end of the duration of a region of storage is reached, the values of all pointers representing the address of any part of that region of storage become invalid pointer values. Indirection through an invalid pointer value and passing an invalid pointer value to a deallocation function have undefined behavior. Any other use of an invalid pointer value has implementation-defined behavior.
So the invocation of g() may not be a constant expression, and may not appear in the condition of a static_assert which must be constant evaluated.
The program is ill-formed.
The above quotes are from the C++20 standard draft, but C++17 has them too.
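For contrast, a minimal sketch of a variant that performs the read while c is still alive; the name g_fixed is hypothetical. Since no pointer is dereferenced after the block ends, nothing in the quoted rules applies and the call should be accepted as a constant expression by any conforming C++14-or-later compiler.

```cpp
constexpr bool g_fixed() {
    bool result = false;
    {
        int c = 0;
        int* p = &c;
        result = (*p == 0);  // read happens inside the lifetime of c
    }                        // c (and the validity of p's value) ends here
    return result;           // only the copied bool leaves the block
}

static_assert(g_fixed(), "read inside the lifetime is a constant expression");
```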
Should one expect constant evaluation to fail reliably if it reads a variable outside of its lifetime?
Yes, but your example doesn't necessarily do this. Its behavior is implementation-defined.
When the block with the variable c exits ([basic.stc.auto]/1), the value of p becomes an invalid pointer value ([basic.stc.general]/4).
When *p is evaluated, the lvalue-to-rvalue conversion ([conv.lval]) is applied to p. And [conv.lval]/3 says:
The result of the conversion is determined according to the following rules:
...
— Otherwise, if the object to which the glvalue refers contains an invalid pointer value, the behavior is implementation-defined.
So.
Are both compilers within their rights, or must GCC fail the compilation as well?
AFAIK neither of the implementations defines its behavior here, but I think it could theoretically be defined in such a way that neither the conversion nor the rest of the evaluation would make g() not a constant expression.

Do all transient allocations have unique address?

While reading the comments on a C++ Weekly video about constexpr new support in C++20, I found a comment alleging that C++20 allows UB in a constexpr context.
At first I was convinced that the comment was right, but the more I thought about it, the more I began to suspect that the C++20 wording contains some clever language that makes this defined behavior.
Either all transient allocations return unique addresses, or there is some more general notion in C++ that makes two distinct allocation pointers always compare unequal (even in a non-constexpr context), even though at runtime the allocator could in reality hand back the same address (since the first allocation was deleted).
As a bonus weirdness: you can only use == for comparison, <, > fail...
Here is the program with alleged UB in constexpr:
#include <iostream>
static constexpr bool f()
{
auto p = new int(1);
delete p;
auto q = new int(2);
delete q;
return p == q;
}
int main()
{
constexpr bool res1 = f();
std::cout << res1 << std::endl; // May output 0 or 1
}
The result here is implementation-defined. res1 could be false or true, or the program could be ill-formed, depending on how the implementation chooses to define it. And this is just as true for equality comparison as it is for relational comparison.
Both [expr.eq] (for equality) and [expr.rel] (for relational) start by doing an lvalue-to-rvalue conversion on the pointers (because we have to actually read what the value is to do a comparison). [conv.lval]/3 says that the result of that conversion is:
Otherwise, if the object to which the glvalue refers contains an invalid pointer value ([basic.stc.dynamic.deallocation], [basic.stc.dynamic.safety]), the behavior is implementation-defined.
That is the case here: both pointers contain an invalid pointer value, as per [basic.stc.general]/4:
When the end of the duration of a region of storage is reached, the values of all pointers representing the address of any part of that region of storage become invalid pointer values. Indirection through an invalid pointer value and passing an invalid pointer value to a deallocation function have undefined behavior. Any other use of an invalid pointer value has implementation-defined behavior.
with a footnote reading:
Some implementations might define that copying an invalid pointer value causes a system-generated runtime fault.
So the value we get out of the lvalue-to-rvalue conversion is... implementation-defined. It could be implementation-defined in a way that causes those two pointers to compare equal. It could be implementation-defined in a way that causes those two pointers to compare not equal (as apparently all implementations do). Or it could even be implementation-defined in a way that causes the comparison between those two pointers to be unspecified or undefined behavior.
Notably, [expr.const]/5 (the main rule governing constant expressions), despite rejecting undefined behavior and explicitly rejecting any comparison whose result is unspecified ([expr.const]/5.23), says nothing about a comparison whose result is implementation-defined.
There's no undefined behavior here. Anything goes. Which is admittedly very weird during constant evaluation, where we'd expect to see a stricter set of rules.
Notably, with p < q, it appears that gcc and clang reject the comparison as being not a constant expression (which is... an allowed result) while msvc considers both p < q and p > q to be constant expressions whose value is false (which is... also an allowed result).

Is difference between two pointers legal c++17 constant expression?

According to the cppreference section "Core constant expressions", point 19), a subtraction operator between two pointers is not a legal constant expression until C++14. Can I assume that the following code is legal C++17 code, or is this interpretation an abuse?
int X, Y;
template <long long V>
struct S { };
int main() {
S<&X - &Y> s;
(void)s;
}
The question is moot. Pointer arithmetic is only defined on pointers into the same array, which is certainly not the case here. So the code above is not legal C++, and in fact it fails to compile with the compilers available to me.
The quoted cppref article says
A core constant expression is any expression that does not have any
one of the following ..
7) An expression whose evaluation leads to any form of core language
(since C++17) undefined behavior (including signed integer overflow,
division by zero, pointer arithmetic outside array bounds, etc).
Whether standard library undefined behavior is detected is
unspecified. (since C++17)
19) a subtraction operator between two pointers(until C++14)
Likely only array pointer arithmetic inside array bounds was 'legalized' since C++14, not all pointer arithmetic.
Actually, a demo shows that array pointer arithmetic compiles fine even with C++11 (not C++98).
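A minimal sketch of the well-defined case, reusing the S template from the question: both operands point into the same array, so the difference is a constant expression and can serve as a template argument. The array arr and the member value are hypothetical additions for illustration.

```cpp
int arr[4];

template <long long V>
struct S {
    static constexpr long long value = V;  // expose the argument for checking
};

// Both pointers refer to elements of arr, so the subtraction is defined.
static_assert(S<&arr[3] - &arr[1]>::value == 2, "difference within one array");
```

Swapping one operand for the address of an unrelated object, as in &X - &Y from the question, is expected to bring back the "not a constant expression" error.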

Endianness in constexpr

I want to create a constexpr function that returns the endianness of the system, like so:
constexpr bool IsBigEndian()
{
constexpr int32_t one = 1;
return (reinterpret_cast<const int8_t&>(one) == 0);
}
Now, since the function will get executed at compile time rather than on the actual target machine, what guarantee does the C++ spec give to make sure that the correct result is returned?
None. In fact, the program is ill-formed. From [expr.const]:
A conditional-expression e is a core constant expression unless the evaluation of e, following the rules of the
abstract machine (1.9), would evaluate one of the following expressions:
— [...]
— a reinterpret_cast.
— [...]
And, from [dcl.constexpr]:
For a constexpr function or constexpr constructor that is neither defaulted nor a template, if no argument
values exist such that an invocation of the function or constructor could be an evaluated subexpression of
a core constant expression (5.20), or, for a constructor, a constant initializer for some object (3.6.2), the
program is ill-formed; no diagnostic required.
The way to do this is just to hope that your compiler is nice enough to provide macros for the endianness of your machine. For instance, on gcc, I could use __BYTE_ORDER__:
constexpr bool IsBigEndian() {
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
return false;
#else
return true;
#endif
}
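A somewhat more defensive sketch of the same idea. It assumes either a C++20 standard library, where std::endian from <bit> reports the byte order directly, or a GCC/Clang-compatible toolchain that predefines __BYTE_ORDER__, __ORDER_BIG_ENDIAN__ and __ORDER_LITTLE_ENDIAN__. IsLittleEndian is a hypothetical companion function; having both lets mixed-endian targets report false for each instead of being silently treated as big-endian.

```cpp
// Pull in <version> when available so the feature-test macro is visible.
#if defined(__has_include)
#  if __has_include(<version>)
#    include <version>
#  endif
#endif

#if defined(__cpp_lib_endian)
#  include <bit>
// C++20 library route: no compiler-specific macros involved.
constexpr bool IsBigEndian()    { return std::endian::native == std::endian::big; }
constexpr bool IsLittleEndian() { return std::endian::native == std::endian::little; }
#elif defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
      defined(__ORDER_LITTLE_ENDIAN__)
// Assumption: GCC/Clang-style predefined macros.
constexpr bool IsBigEndian()    { return __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__; }
constexpr bool IsLittleEndian() { return __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__; }
#else
#  error "No known compile-time endianness query for this toolchain"
#endif
```

Failing with #error when neither source of information exists avoids guessing a byte order.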
As stated by Barry, your code is not legal C++. However, even if you took away the constexpr part, it would still not be legal C++. Your code violates strict aliasing rules and therefore represents undefined behavior.
Indeed, there is no way in C++ to detect the endian-ness of an object without invoking undefined behavior. Casting it to a char* doesn't work, because the standard doesn't require big or little endian order. So while you could read the data through a byte, you would not be able to legally infer anything from that value.
And type punning through a union fails because you're not allowed to type pun through a union in C++ at all. And even if you did... again, C++ does not restrict implementations to big or little endian order.
So as far as C++ as a standard is concerned, there is no way to detect this, whether at compile-time or runtime.