Why is 0 == ("abcde"+1) not a constant expression? - c++

Why doesn't the following code compile?
// source.cpp
int main()
{
    constexpr bool result = (0 == ("abcde"+1));
}
The compile command:
$ g++ -std=c++14 -c source.cpp
The output:
source.cpp: In function ‘int main()’:
source.cpp:4:32: error: ‘((((const char*)"abcde") + 1u) == 0u)’ is not a constant expression
constexpr bool result = (0 == ("abcde"+1));
~~~^~~~~~~~~~~~~~~
I'm using gcc6.4.

The restrictions on what can be used in a constant expression are defined mostly as a list of negatives. There's a bunch of things you're not allowed to evaluate ([expr.const]/2 in C++14) and certain things that values have to result in ([expr.const]/4 in C++14). This list changes from standard to standard, becoming more permissive with time.
In trying to evaluate:
constexpr bool result = (0 == ("abcde"+1));
there is nothing that we're not allowed to evaluate, and we don't have any results that we're not allowed to have. No undefined behavior, etc. It's a perfectly valid, if odd, expression. Just one that gcc 6.3 happens to disallow - which is a compiler bug. gcc 7+, clang 3.5+, msvc all compile it.
There seems to be a lot of confusion around this question, with many comments suggesting that since the value of a string literal like "abcde" is not known until runtime, you cannot do anything with such a pointer during constant evaluation. It's important to explain why this is not true.
Let's start with a declaration like:
constexpr char const* p = "abcde";
This pointer has some value. Let's say N. The crucial thing is - just about anything you can do to try to observe N during constant evaluation would be ill-formed. You cannot cast it to an integer to read the value. You cannot compare it to a different, unrelated string† (by way of [expr.rel]/4.3):
constexpr char const* q = "hello";
p > q; // ill-formed
p <= q; // ill-formed
p != q; // ok, true
We can say for sure that p != q because wherever it is they point, they are clearly different. But we cannot say which one goes first. The result of such a relational comparison is unspecified, and a comparison whose result is unspecified is disallowed in constant expressions.
You can really only compare to pointers within the same array:
constexpr char const* a = p + 1; // ok
constexpr char const* b = p + 17; // ill-formed
a > p; // ok, true
Wherever it is that p points to, we know that a points after it. But we don't need to know N to determine this.
As a result, the actual value N during constant evaluation is more or less immaterial.
"abcde" is... somewhere. "abcde"+1 points to one later than that, and has the value "bcde". Regardless of where it points, you can compare it to a null pointer (0 is a null pointer constant) and it is not a null pointer, hence that comparison evaluates as false.
This is a perfectly well-formed constant evaluation, which gcc 6.3 happens to reject.
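To make the distinction concrete, here is a minimal sketch of the operations discussed above, assuming a compiler that does not have the gcc 6 bug. The names p and a follow the answer, and result is the variable from the question. Each static_assert compiles only if its condition is a constant expression:

```cpp
// Pointers into a string literal during constant evaluation.
// The address itself is never observed; only relationships are.
constexpr char const* p = "abcde";
constexpr char const* a = p + 1;   // still inside the array

static_assert(a != nullptr, "a pointer into an object is never null");
static_assert(a > p, "ordering within one array is well-defined");
static_assert(*a == 'b', "reading through the pointer is allowed");

// The expression from the question: false, because "abcde"+1 is not null.
constexpr bool result = (0 == ("abcde" + 1));
static_assert(!result, "comparison against a null pointer constant");
```

None of these checks needs the actual address of the literal, which is why they can all be evaluated at compile time.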
†Although we simply state by fiat that std::less()(p, q) provides some value that gives a consistent total order at compile time and that it gives the same answer at runtime. Which is... an interesting conundrum.

Related

Do all transient allocations have unique address?

While reading the comments on a C++ Weekly video about constexpr new support in C++20, I found a comment alleging that C++20 allows UB in a constexpr context.
At first I was convinced that the comment was right, but the more I thought about it, the more I began to suspect that the C++20 wording contains some clever language that makes this defined behavior.
Either all transient allocations return unique addresses, or maybe there is some more general notion in C++ that makes pointers to two distinct allocations always compare unequal (even in a non-constexpr context), even if at runtime the allocator could in reality give you back the same address (since you deleted the first allocation).
As a bonus weirdness: you can only use == for comparison; < and > fail...
Here is the program with alleged UB in constexpr:
#include <iostream>
static constexpr bool f()
{
    auto p = new int(1);
    delete p;
    auto q = new int(2);
    delete q;
    return p == q;
}
int main()
{
    constexpr bool res1 = f();
    std::cout << res1 << std::endl; // May output 0 or 1
}
The result here is implementation-defined. res1 could be false, true, or ill-formed, based on how the implementation wants to define it. And this is just as true for equality comparison as it is for relational comparison.
Both [expr.eq] (for equality) and [expr.rel] (for relational) start by doing an lvalue-to-rvalue conversion on the pointers (because we have to actually read what the value is to do a comparison). [conv.lval]/3 says that the result of that conversion is:
Otherwise, if the object to which the glvalue refers contains an invalid pointer value ([basic.stc.dynamic.deallocation], [basic.stc.dynamic.safety]), the behavior is implementation-defined.
That is the case here: both pointers contain an invalid pointer value, as per [basic.stc.general]/4:
When the end of the duration of a region of storage is reached, the values of all pointers representing the address of any part of that region of storage become invalid pointer values. Indirection through an invalid pointer value and passing an invalid pointer value to a deallocation function have undefined behavior. Any other use of an invalid pointer value has implementation-defined behavior.
with a footnote reading:
Some implementations might define that copying an invalid pointer value causes a system-generated runtime fault.
So the value we get out of the lvalue-to-rvalue conversion is... implementation-defined. It could be implementation-defined in a way that causes those two pointers to compare equal. It could be implementation-defined in a way that causes those two pointers to compare not equal (as apparently all implementations do). Or it could even be implementation-defined in a way that causes the comparison between those two pointers to be unspecified or undefined behavior.
Notably, [expr.const]/5 (the main rule governing constant expressions), despite rejecting undefined behavior and explicitly rejecting any comparison whose result is unspecified ([expr.const]/5.23), says nothing about a comparison whose result is implementation-defined.
There's no undefined behavior here. Anything goes. Which is admittedly very weird during constant evaluation, where we'd expect to see a stricter set of rules.
Notably, with p < q, it appears that gcc and clang reject the comparison as being not a constant expression (which is... an allowed result) while msvc considers both p < q and p > q to be constant expressions whose value is false (which is... also an allowed result).

C++: int calculations during compile time (not during running time)

Simple question:
Why does the code below work?
int main() {
    const int a = 4;
    const int b = 16;
    const int size = b/a;
    int arr[size] = {0,1,2,3};
    return 0;
}
I thought the size of static arrays have to be defined during compile time, thus only with an "int" literal. In the code above, the code compiles although the size is a calculation. Is this calculation done during compile time?
If yes, maybe my understanding of compile and running is wrong: Compile just goes through the syntax and translates the code to machine code, but does not do any calculations...
Thank you!
Is this calculation done during compile time?
Yes. Compilers do a lot of optimizations and calculations for you. The initialization in your code is OK even without any optimization, because the compiler computes the value ahead of time.
Generally, compile-time calculation here covers constexpr, const declarations and so on, which are part of the language definition itself (see constant expression).
compile-time const and example
just see the output of the example.
compile-time constexpr and example
The constexpr specifier declares that it is possible to evaluate the value of the function or variable at compile time.
This is why the array can be declared this way. An array declaration has the form:
noptr-declarator [ expr(optional) ] attr(optional)
here, expr is:
an integral constant expression (until C++14) a converted constant expression of type std::size_t (since C++14), which evaluates to a value greater than zero
Both of these are constant expressions, and a constant expression is defined as:
an expression that can be evaluated at compile time.
So using a value that the compiler has pre-calculated as an array size is OK.
As a follow-up: there are many other ways to save run-time work by having calculations done at compile time; they are shown in the link above.
On a different note, regarding optimizations: you can see the difference between the assembly output at -O0 and at -O3 when calculating the sum from 1 to 100. It is striking: the result 5050 appears directly in the -O3 assembly. That is also a kind of compile-time calculation, but it is not guaranteed in every situation.
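The optimizer's folding is a courtesy, not a guarantee. When the result is required at compile time, one option is a constexpr function. The sketch below is only an illustration (the names sum_to and buffer are hypothetical, not from the answer) and uses C++11-style recursion:

```cpp
// Sum of 1..n, written so that it can be evaluated during compilation.
constexpr int sum_to(int n)
{
    return n <= 0 ? 0 : n + sum_to(n - 1);
}

// A static_assert forces compile-time evaluation,
// independent of the optimization level.
static_assert(sum_to(100) == 5050, "1 + 2 + ... + 100");

// The result is a constant expression, so it can also size an array.
int buffer[sum_to(4)];   // 10 elements
```

Unlike the -O3 case, this fails to compile if the value cannot be computed at compile time, rather than silently falling back to a run-time loop.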
I thought the size of static arrays have to be defined during compile time, thus only with an "int" literal.
The first part is true, but the second part was only true before c++11 (you could also have done const int i = 1 + 2). From c++11, the rules say that the initializing expression is evaluated to see if it yields a constant expression.
There is also the rule that a const integral type is implicitly constexpr (i.e. computed at compile time), so long as the initializing expression is computable at compile time.
So in this expression:
const int size = b/a;
the variable size is required to be computed at compile time, so long as the expression b/a is a constant expression. Since a and b are also const ints that are initialized with a literal, the initialization of size is a constant expression. (Note that removing const from any of the declarations will make size a non-constant expression).
So size is required to be computed at compile time (the compiler has no choice in the matter, regardless of optimizations), and so it can be used as an array dimension.
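A minimal sketch of that rule, using the names from the question. The static_assert and the array bound compile only because size is a constant expression:

```cpp
const int a = 4;
const int b = 16;
const int size = b / a;   // const ints with constant initializers

static_assert(size == 4, "size is usable in a constant expression");

int arr[size] = {0, 1, 2, 3};   // the array bound must be a constant expression
```

Removing const from a or b should make both the static_assert and the array declaration fail to compile.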
Here are the complete rules for how this works.

Why does a consteval function allow undefined behavior?

There is a very neat property of constant expressions in C++: their evaluation cannot have undefined behavior (7.7.4.7):
An expression e is a core constant expression unless the evaluation of e, following the rules of the abstract machine ([intro.execution]), would evaluate one of the following:
...
an operation that would have undefined behavior as specified in [intro] through [cpp] of this document [ Note: including, for example, signed integer overflow ([expr.prop]), certain pointer arithmetic ([expr.add]), division by zero, or certain shift operations — end note ] ;
Trying to store the value of 13! in a constexpr int indeed yields a nice compile error:
constexpr int f(int n)
{
    int r = n--;
    for (; n > 1; --n) r *= n;
    return r;
}
int main()
{
    constexpr int x = f(13);
    return x;
}
Output:
9:19: error: constexpr variable 'x' must be initialized by a constant expression
constexpr int x = f(13);
^ ~~~~~
4:26: note: value 3113510400 is outside the range of representable values of type 'int'
for (; n > 1; --n) r *= n;
^
9:23: note: in call to 'f(3)'
constexpr int x = f(13);
^
1 error generated.
(BTW why does the error say "call to 'f(3)'", while it is a call to f(13)?..)
Then, I remove constexpr from x, but make f a consteval. According to the docs:
consteval - specifies that a function is an immediate function, that is, every call to the function must produce a compile-time constant
I do expect that such a program would again cause a compile error. But instead, the program compiles and runs with UB.
Why is that?
UPD: Commenters suggested that this is a compiler bug. I reported it: https://bugs.llvm.org/show_bug.cgi?id=43714
This is a compiler bug. Or, to be more precise, this is an "underimplemented" feature (see the comment in bugzilla):
Yup - seems consteval isn't implemented yet, according to: https://clang.llvm.org/cxx_status.html
(the keyword's probably been added but not the actual implementation support)
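Independently of the consteval issue, the overflow itself can be avoided by computing in a wider type. The following is a sketch, not the asker's code: factorial is a hypothetical helper that returns long long, which is at least 64 bits wide, so 13! is representable and the evaluation stays a constant expression:

```cpp
// Factorial computed in long long so that 13! does not overflow.
constexpr long long factorial(int n)
{
    return n <= 1 ? 1 : n * factorial(n - 1);
}

static_assert(factorial(12) == 479001600LL, "12! still fits in a 32-bit int");
static_assert(factorial(13) == 6227020800LL, "13! needs more than 32 bits");

// With an int return type, factorial(13) would overflow, and a conforming
// compiler must reject it wherever a constant expression is required.
```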

const (but not constexpr) used as the built-in array size

I understand that the size of the built-in array must be a constant expression:
// Code 1
constexpr int n = 5;
double arr[n];
I do not understand why the following compiles:
// Code 2
const int n = 5;
double arr[n]; // n is not a constant expression type!
Furthermore, if the compiler is smart enough to see that n is initialized with 5, then why does the following not compile:
// Code 3
int n = 5;
double arr[n]; // n is initialized with 5, so how is this different from Code 2?
P.S. This post answers using quotes from the standard, which I do not understand. I will very much appreciate an answer that uses a simpler language.
n is not a constant expression type!
There is no such thing as a constant expression type. n in that example is an expression, and it is in fact a constant expression. And that is why it can be used as the array size.
It is not necessary for a variable to be declared constexpr in order for its name to be a constant expression. What constexpr does for a variable, is the enforcement of compile time constness. Examples:
int a = 42;
Even though 42 is a constant expression, a is not; its value may change at runtime.
const int b = 42;
b is a constant expression. Its value is known at compile time.
const int c = rand();
rand() is not a constant expression, and so c is neither. Its value is determined at runtime, but may not change after initialisation.
constexpr int d = 42;
d is a constant expression, just like b.
constexpr int f = rand();
Does not compile, because constexpr variables must be initialised with a constant expression.
then why does the following not compile:
Because the rules of the language don't allow it. The value of n is not compile time constant. The value of a non-const variable can change at runtime.
The language cannot have a rule saying that if some value doesn't change at runtime, then it is a constant expression. That would not be of any use to the programmer, since they cannot assume which compiler will be able to prove the constness of which variable.
The language has to exactly specify the cases where an expression is constant. It would also be infeasible to specify that a non-const variable is a constant expression if it hasn't been modified before its use, because it is impossible to prove in most cases, even though you've found one case where the proof happens to be easy.
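The cases listed above can be checked directly with array bounds. This is a minimal sketch: the variable names a to d follow the answer, while the array names from_b and from_d are hypothetical:

```cpp
#include <cstdlib>

int a = 42;                  // not const: not usable in constant expressions
const int b = 42;            // const int with a constant initializer: usable
const int c = std::rand();   // const, but initialized at run time: not usable
constexpr int d = 42;        // usable; the compiler checks the initializer

double from_b[b];            // OK
double from_d[d];            // OK
static_assert(b == d, "both are constant expressions");

// double from_a[a];         // error: a is not a constant expression
// double from_c[c];         // error: c is not a constant expression
```

The only difference between b and d is that constexpr makes the compiler reject a non-constant initializer, as with constexpr int f = rand(); above.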
// n is not a constant expression type!
But it is. Per [expr.const]/3
A variable is usable in constant expressions after its initializing declaration is encountered if it is a constexpr variable, or it is a constant-initialized variable of reference type or of const-qualified integral or enumeration type. An object or reference is usable in constant expressions if it is [...]
a complete temporary object of non-volatile const-qualified integral or enumeration type that is initialized with a constant expression.
So, if you have a const integer initialized with a constant expression then you still have a constant expression as nothing can change. This is a rule that existed before constexpr was ever a thing as it allowed programmers to initialize arrays with constant variables instead of using macros.
Furthermore, if the compiler is smart enough to see that n is initialized with 5, then why does the following not compile:
Because the integer is not const so it could be changed. Even though in your case you can prove it can't change, in general you can't so it is just not allowed.
A value declared constexpr means that this value does not change and is known during compile time.
A value declared const means that this value does not change after initialization, but it is not mandatory for it to be known during compile time. In other words, every constexpr variable is const, but a const variable is not necessarily constexpr.
Your "Code 3" example doesn't work because the size of a built-in array must be a constant known at compile time; a constexpr variable guarantees that, while a plain int does not.

compiler behaviour on return value of new

#include <iostream>
using namespace std;
int main(void){
    int size = -2;
    int* p = new int[size];
    cout<<p<<endl;
    return 0;
}
Above code compiles without any problem on visual studio 2010.
But
#include <iostream>
using namespace std;
int main(void){
    const int size = -2;
    int* p = new int[size];
    cout<<p<<endl;
    return 0;
}
This code (with the const keyword added) gives a compilation error (size of array cannot be negative).
Why the different results?
By making it a constant expression, you've given the compiler the ability to diagnose the problem at compile time. Since you've made it const, the compiler can easily figure out that the value will necessarily be negative when you pass it to new.
With a non-constant expression, you have the same problem, but it takes more intelligence on the part of the compiler to detect it. Specifically, the compiler has to detect that the value doesn't change between the time you initialize it and the time you pass it to new. With optimization turned on, chances are pretty good at least some compilers still could/would detect the problem, because for optimization purposes they detect data flow so they'd "realize" that in this case the value remains constant, even though you haven't specified it as const.
Because const int size = -2; can be replaced at compilation time by the compiler - whereas a non-const cannot - the compiler can tell that size is negative and doesn't allow the allocation.
There's no way for the compiler to tell whether int* p = new int[size]; is legal or not, if size is not const - size can be altered in the program or by a different thread. A const can't.
You'll get into undefined behavior running the first sample anyway, it's just not legal.
In the first, size is a variable whose value happens to be -2, but the compiler doesn't track it because it is a variable (at least for diagnostic purposes; I'm pretty sure the optimizing phase can track it). Execution should be problematic (I don't know if it is guaranteed to throw an exception or is just UB).
In the second, size is a constant and thus its value is known and verified at compile time.
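A hedged sketch of the run-time side, assuming a C++11 or later compiler: since C++11, a negative non-constant array size makes the new-expression throw std::bad_array_new_length (which derives from std::bad_alloc), whereas under the older C++03 rules the behavior is undefined. The helper name try_allocate is hypothetical:

```cpp
#include <new>

// Returns true if new int[n] succeeds, false if it throws.
// Assumes C++11 or later, where an invalid size throws
// std::bad_array_new_length, which derives from std::bad_alloc.
bool try_allocate(int n)
{
    try {
        // volatile keeps the optimizer from eliding the new/delete pair
        int* volatile p = new int[n];
        delete[] p;
        return true;
    } catch (const std::bad_alloc&) {
        return false;
    }
}
```

Because n is an ordinary function parameter here, the compiler has no constant to check, so any failure can only show up when the function runs.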
The const version yields a compilation error because the compiler can detect at compile time that a language rule is being violated.
The non-const version yields undefined behavior[1] because the compiler cannot detect the violation at compile time.
Rationale:
Once you make the variable const, the compiler knows that the value of size is not supposed to change at any time during the execution of the program. So the compiler simply replaces the const integral size with its constant value -2 wherever it is used at compile time. While doing so, it sees that the language rules are not being followed[1] and reports an error.
In the non-const case, the compiler cannot apply the substitution mentioned above, because it cannot be sure that size will never change during the execution of the program. So it cannot detect the violation at compile time at all. However, you end up with undefined behavior at run time.
[1]Reference:
C++03 5.3.4 New [expr.new]
noptr-new-declarator:
[ expression ] attribute-specifier-seqopt
noptr-new-declarator [ constant-expression ] attribute-specifier-seqopt
The constant-expression is further explained in 5.3.4/6 and 5.3.4/7:
Every constant-expression in a noptr-new-declarator shall be an integral constant expression (5.19) and evaluate to a strictly positive value. The expression in a direct-new-declarator shall have integral or enumeration type (3.9.1) with a non-negative value. [Example: if n is a variable of type int, then new float[n][5] is well-formed (because n is the expression of a direct-new-declarator), but new float[5][n] is ill-formed (because n is not a constant-expression). If n is negative, the effect of new float[n][5] is undefined. ]