C++: Why is this constexpr not a compile time constant - c++

In the following C++11 code, the last call to arraySize causes a compile error. Apparently this is because y is a runtime sized array, and the arraySize template parameter N cannot be deduced for y. I don't understand why x is a compile time sized array, but y ends up runtime sized. The arraySize template function is taken directly from Scott Meyers' "Effective Modern C++" Item 1.
#include <cstddef>
template<typename T, std::size_t N>
constexpr std::size_t arraySize(T(&)[N]) noexcept { return N; }
struct S
{
char c[10];
};
int main()
{
S s;
S* ps = &s;
char x[arraySize(s.c)];
char y[arraySize(ps->c)]; // why is y a runtime sized array?
arraySize(x);
arraySize(y); // error !?
return 0;
}

In C++, the error is not the call to arraySize(y), but the declaration of y itself.
The bounds in an array declaration must be a "converted constant expression".
If your compiler accepts the declaration of y and later tells you that y is an array of runtime bound, it is not a C++ compiler. There are no arrays of runtime bound in any ratified version of C++, nor the current draft.
The significant difference between arraySize(s.c) and arraySize(ps->c) is that ps->c is the same as (*ps).c and the * dereference operator requires lvalue-to-rvalue conversion on ps, which is not a constant expression (nor is &s, see below). The rest of the expression doesn't involve lvalue-to-rvalue conversion, the array lvalue is directly bound by the reference.
A constant expression is either a glvalue core constant expression whose value refers to an entity that is a permitted result of a constant expression (as defined below), or a prvalue core constant expression whose value is an object where, for that object and its subobjects:
each non-static data member of reference type refers to an entity that is a permitted result of a constant
expression, and
if the object or subobject is of pointer type, it contains the address of an object with static storage duration, the address past the end of such an object (5.7), the address of a function, or a null pointer value.
An entity is a permitted result of a constant expression if it is an object with static storage duration that is either not a temporary object or is a temporary object whose value satisfies the above constraints, or it is a
function.
Clearly ps contains the address of an object with automatic storage duration, so it cannot be declared constexpr. But everything should start working if you change S s; S* ps = &s; to static S s; constexpr S* ps = &s;
(On the other hand, you'd think that the parameter to arraySize(s.c) is not a constant expression either, since it is a reference and not to an object of static storage duration)

Related

const (but not constexpr) used as the built-in array size

I understand that the size of the built-in array must be a constant expression:
// Code 1
constexpr int n = 5;
double arr[n];
I do not understand why the following compiles:
// Code 2
const int n = 5;
double arr[n]; // n is not a constant expression type!
Furthermore, if the compiler is smart enough to see that n is initialized with 5, then why does the following not compile:
// Code 3
int n = 5;
double arr[n]; // n is initialized with 5, so how is this different from Code 2?
P.S. This post answers using quotes from the standard, which I do not understand. I will very much appreciate an answer that uses a simpler language.
n is not a constant expression type!
There is no such thing as a constant expression type. n in that example is a expression, and it is in fact a constant expression. And that is why it can be used as the array size.
It is not necessary for a variable to be declared constexpr in order for its name to be a constant expression. What constexpr does for a variable, is the enforcement of compile time constness. Examples:
int a = 42;
Even though 42 is a consant expression, a is not; Its value may change at runtime.
const int b = 42;
b is a constant expression. Its value is known at compile time
const int c = rand();
rand() is not a constant expression, and so c is neither. Its value is determined at runtime, but may not change after initialisation.
constexpr int d = 42;
d is a constant expression, just like b.
constexpr int f = rand();
Does not compile, because constexpr variables must be initialised with a constant expression.
then why does the following not compile:
Because the rules of the language don't allow it. The value of n is not compile time constant. The value of a non-const variable can change at runtime.
The language cannot have a rule that some value doesn't change at runtime, then it is a constant expression. That would not be of any use to the programmer since they cannot assume which compiler will be able to prove the constness of which variable.
The language has to exactly specify the cases where an expression is constant. It would also be infeasible to specify that a non-const variable is a constant expression if it hasn't been modified before its use, because it is impossible to prove in most cases, even though you've found one case where the proof happens to be easy.
// n is not a constant expression type!
But it is. Per [expr.const]/3
A variable is usable in constant expressions after its initializing declaration is encountered if it is a constexpr variable, or it is a constant-initialized variable of reference type or of const-qualified integral or enumeration type. An object or reference is usable in constant expressions if it is [...]
a complete temporary object of non-volatile const-qualified integral or enumeration type that is initialized with a constant expression.
So, if you have a const integer intialized with a constant expression then you still have a constant expression as nothing can change. This is a rule that existed before constexpr was ever a thing as it allowed programmers to initialize arrays with constant variables instead of using macros.
Furthermore, if the compiler is smart enough to see that n is initialized with 5, then why does the following not compile:
Because the integer is not const so it could be changed. Even though in your case you can prove it can't change, in general you can't so it is just not allowed.
A value declared constexpr means that this value does not change and is known during compile time.
A value declared const means that this value does not change after initialization, but it is not mandatory to be known during compile time. In other words, a constexpr is const, but a const is not constexpr.
Your "Code 3" example doesn't work because you need a constant known at compile time in order to allocate memory for a vector, so you need a constexpr.

Is reinterpret_cast-ing a memory buffer back and forth safe, and is the alignas necessary in placement new context? [duplicate]

(In reference to this question and answer.)
Before the C++17 standard, the following sentence was included in [basic.compound]/3:
If an object of type T is located at an address A, a pointer of type cv T* whose value is the address A is said to point to that object, regardless of how the value was obtained.
But since C++17, this sentence has been removed.
For example I believe that this sentence made this example code defined, and that since C++17 this is undefined behavior:
alignas(int) unsigned char buffer[2*sizeof(int)];
auto p1=new(buffer) int{};
auto p2=new(p1+1) int{};
*(p1+1)=10;
Before C++17, p1+1 holds the address to *p2 and has the right type, so *(p1+1) is a pointer to *p2. In C++17 p1+1 is a pointer past-the-end, so it is not a pointer to object and I believe it is not dereferencable.
Is this interpretation of this modification of the standard right or are there other rules that compensate the deletion of the cited sentence?
Is this interpretation of this modification of the standard right or are there other rules that compensate the deletion of this cited sentence?
Yes, this interpretation is correct. A pointer past the end isn't simply convertible to another pointer value that happens to point to that address.
The new [basic.compound]/3 says:
Every value of pointer type is one of the following:
(3.1)
a pointer to an object or function (the pointer is said to point to the object or function), or
(3.2)
a pointer past the end of an object ([expr.add]), or
Those are mutually exclusive. p1+1 is a pointer past the end, not a pointer to an object. p1+1 points to a hypothetical x[1] of a size-1 array at p1, not to p2. Those two objects are not pointer-interconvertible.
We also have the non-normative note:
[ Note: A pointer past the end of an object ([expr.add]) is not considered to point to an unrelated object of the object's type that might be located at that address. [...]
which clarifies the intent.
As T.C. points out in numerous comments (notably this one), this is really a special case of the problem that comes with trying to implement std::vector - which is that [v.data(), v.data() + v.size()) needs to be a valid range and yet vector doesn't create an array object, so the only defined pointer arithmetic would be going from any given object in the vector to past-the-end of its hypothetical one-size array. Fore more resources, see CWG 2182, this std discussion, and two revisions of a paper on the subject: P0593R0 and P0593R1 (section 1.3 specifically).
In your example, *(p1 + 1) = 10; should be UB, because it is one past the end of the array of size 1. But we are in a very special case here, because the array was dynamically constructed in a larger char array.
Dynamic object creation is described in 4.5 The C++ object model [intro.object], §3 of the n4659 draft of the C++ standard:
3 If a complete object is created (8.3.4) in storage associated with another object e of type “array of N
unsigned char” or of type “array of N std::byte” (21.2.1), that array provides storage for the created
object if:
(3.1) — the lifetime of e has begun and not ended, and
(3.2) — the storage for the new object fits entirely within e, and
(3.3) — there is no smaller array object that satisfies these constraints.
The 3.3 seems rather unclear, but the examples below make the intent more clear:
struct A { unsigned char a[32]; };
struct B { unsigned char b[16]; };
A a;
B *b = new (a.a + 8) B; // a.a provides storage for *b
int *p = new (b->b + 4) int; // b->b provides storage for *p
// a.a does not provide storage for *p (directly),
// but *p is nested within a (see below)
So in the example, the buffer array provides storage for both *p1 and *p2.
The following paragraphs prove that the complete object for both *p1 and *p2 is buffer:
4 An object a is nested within another object b if:
(4.1) — a is a subobject of b, or
(4.2) — b provides storage for a, or
(4.3) — there exists an object c where a is nested within c, and c is nested within b.
5 For every object x, there is some object called the complete object of x, determined as follows:
(5.1) — If x is a complete object, then the complete object of x is itself.
(5.2) — Otherwise, the complete object of x is the complete object of the (unique) object that contains x.
Once this is established, the other relevant part of draft n4659 for C++17 is [basic.coumpound] §3(emphasize mine):
3 ... Every
value of pointer type is one of the following:
(3.1) — a pointer to an object or function (the pointer is said to point to the object or function), or
(3.2) — a pointer past the end of an object (8.7), or
(3.3) — the null pointer value (7.11) for that type, or
(3.4) — an invalid pointer value.
A value of a pointer type that is a pointer to or past the end of an object represents the address of the
first byte in memory (4.4) occupied by the object or the first byte in memory after the end of the storage
occupied by the object, respectively. [ Note: A pointer past the end of an object (8.7) is not considered to
point to an unrelated object of the object’s type that might be located at that address. A pointer value
becomes invalid when the storage it denotes reaches the end of its storage duration; see 6.7. —end note ]
For purposes of pointer arithmetic (8.7) and comparison (8.9, 8.10), a pointer past the end of the last element
of an array x of n elements is considered to be equivalent to a pointer to a hypothetical element x[n]. The
value representation of pointer types is implementation-defined. Pointers to layout-compatible types shall
have the same value representation and alignment requirements (6.11)...
The note A pointer past the end... does not apply here because the objects pointed to by p1 and p2 and not unrelated, but are nested into the same complete object, so pointer arithmetics make sense inside the object that provide storage: p2 - p1 is defined and is (&buffer[sizeof(int)] - buffer]) / sizeof(int) that is 1.
So p1 + 1 is a pointer to *p2, and *(p1 + 1) = 10; has defined behaviour and sets the value of *p2.
I have also read the C4 annex on the compatibility between C++14 and current (C++17) standards. Removing the possibility to use pointer arithmetics between objects dynamically created in a single character array would be an important change that IMHO should be cited there, because it is a commonly used feature. As nothing about it exist in the compatibility pages, I think that it confirms that it was not the intent of the standard to forbid it.
In particular, it would defeat that common dynamic construction of an array of objects from a class with no default constructor:
class T {
...
public T(U initialization) {
...
}
};
...
unsigned char *mem = new unsigned char[N * sizeof(T)];
T * arr = reinterpret_cast<T*>(mem); // See the array as an array of N T
for (i=0; i<N; i++) {
U u(...);
new(arr + i) T(u);
}
arr can then be used as a pointer to the first element of an array...
To expand on the answers given here is an example of what I believe the revised wording is excluding:
Warning: Undefined Behaviour
#include <iostream>
int main() {
int A[1]{7};
int B[1]{10};
bool same{(B)==(A+1)};
std::cout<<B<< ' '<< A <<' '<<sizeof(*A)<<'\n';
std::cout<<(same?"same":"not same")<<'\n';
std::cout<<*(A+1)<<'\n';//!!!!!
return 0;
}
For entirely implementation dependent (and fragile) reasons possible output of this program is:
0x7fff1e4f2a64 0x7fff1e4f2a60 4
same
10
That output shows that the two arrays (in that case) happen to be stored in memory such that 'one past the end' of A happens to hold the value of the address of the first element of B.
The revised specification is ensuring that regardless A+1 is never a valid pointer to B. The old phrase 'regardless of how the value is obtained' says that if 'A+1' happens to point to 'B[0]' then it's a valid pointer to 'B[0]'.
That can't be good and surely never the intention.

Is a pointer with the right address and type still always a valid pointer since C++17?

(In reference to this question and answer.)
Before the C++17 standard, the following sentence was included in [basic.compound]/3:
If an object of type T is located at an address A, a pointer of type cv T* whose value is the address A is said to point to that object, regardless of how the value was obtained.
But since C++17, this sentence has been removed.
For example I believe that this sentence made this example code defined, and that since C++17 this is undefined behavior:
alignas(int) unsigned char buffer[2*sizeof(int)];
auto p1=new(buffer) int{};
auto p2=new(p1+1) int{};
*(p1+1)=10;
Before C++17, p1+1 holds the address to *p2 and has the right type, so *(p1+1) is a pointer to *p2. In C++17 p1+1 is a pointer past-the-end, so it is not a pointer to object and I believe it is not dereferencable.
Is this interpretation of this modification of the standard right or are there other rules that compensate the deletion of the cited sentence?
Is this interpretation of this modification of the standard right or are there other rules that compensate the deletion of this cited sentence?
Yes, this interpretation is correct. A pointer past the end isn't simply convertible to another pointer value that happens to point to that address.
The new [basic.compound]/3 says:
Every value of pointer type is one of the following:
(3.1)
a pointer to an object or function (the pointer is said to point to the object or function), or
(3.2)
a pointer past the end of an object ([expr.add]), or
Those are mutually exclusive. p1+1 is a pointer past the end, not a pointer to an object. p1+1 points to a hypothetical x[1] of a size-1 array at p1, not to p2. Those two objects are not pointer-interconvertible.
We also have the non-normative note:
[ Note: A pointer past the end of an object ([expr.add]) is not considered to point to an unrelated object of the object's type that might be located at that address. [...]
which clarifies the intent.
As T.C. points out in numerous comments (notably this one), this is really a special case of the problem that comes with trying to implement std::vector - which is that [v.data(), v.data() + v.size()) needs to be a valid range and yet vector doesn't create an array object, so the only defined pointer arithmetic would be going from any given object in the vector to past-the-end of its hypothetical one-size array. Fore more resources, see CWG 2182, this std discussion, and two revisions of a paper on the subject: P0593R0 and P0593R1 (section 1.3 specifically).
In your example, *(p1 + 1) = 10; should be UB, because it is one past the end of the array of size 1. But we are in a very special case here, because the array was dynamically constructed in a larger char array.
Dynamic object creation is described in 4.5 The C++ object model [intro.object], §3 of the n4659 draft of the C++ standard:
3 If a complete object is created (8.3.4) in storage associated with another object e of type “array of N
unsigned char” or of type “array of N std::byte” (21.2.1), that array provides storage for the created
object if:
(3.1) — the lifetime of e has begun and not ended, and
(3.2) — the storage for the new object fits entirely within e, and
(3.3) — there is no smaller array object that satisfies these constraints.
The 3.3 seems rather unclear, but the examples below make the intent more clear:
struct A { unsigned char a[32]; };
struct B { unsigned char b[16]; };
A a;
B *b = new (a.a + 8) B; // a.a provides storage for *b
int *p = new (b->b + 4) int; // b->b provides storage for *p
// a.a does not provide storage for *p (directly),
// but *p is nested within a (see below)
So in the example, the buffer array provides storage for both *p1 and *p2.
The following paragraphs prove that the complete object for both *p1 and *p2 is buffer:
4 An object a is nested within another object b if:
(4.1) — a is a subobject of b, or
(4.2) — b provides storage for a, or
(4.3) — there exists an object c where a is nested within c, and c is nested within b.
5 For every object x, there is some object called the complete object of x, determined as follows:
(5.1) — If x is a complete object, then the complete object of x is itself.
(5.2) — Otherwise, the complete object of x is the complete object of the (unique) object that contains x.
Once this is established, the other relevant part of draft n4659 for C++17 is [basic.coumpound] §3(emphasize mine):
3 ... Every
value of pointer type is one of the following:
(3.1) — a pointer to an object or function (the pointer is said to point to the object or function), or
(3.2) — a pointer past the end of an object (8.7), or
(3.3) — the null pointer value (7.11) for that type, or
(3.4) — an invalid pointer value.
A value of a pointer type that is a pointer to or past the end of an object represents the address of the
first byte in memory (4.4) occupied by the object or the first byte in memory after the end of the storage
occupied by the object, respectively. [ Note: A pointer past the end of an object (8.7) is not considered to
point to an unrelated object of the object’s type that might be located at that address. A pointer value
becomes invalid when the storage it denotes reaches the end of its storage duration; see 6.7. —end note ]
For purposes of pointer arithmetic (8.7) and comparison (8.9, 8.10), a pointer past the end of the last element
of an array x of n elements is considered to be equivalent to a pointer to a hypothetical element x[n]. The
value representation of pointer types is implementation-defined. Pointers to layout-compatible types shall
have the same value representation and alignment requirements (6.11)...
The note A pointer past the end... does not apply here because the objects pointed to by p1 and p2 and not unrelated, but are nested into the same complete object, so pointer arithmetics make sense inside the object that provide storage: p2 - p1 is defined and is (&buffer[sizeof(int)] - buffer]) / sizeof(int) that is 1.
So p1 + 1 is a pointer to *p2, and *(p1 + 1) = 10; has defined behaviour and sets the value of *p2.
I have also read the C4 annex on the compatibility between C++14 and current (C++17) standards. Removing the possibility to use pointer arithmetics between objects dynamically created in a single character array would be an important change that IMHO should be cited there, because it is a commonly used feature. As nothing about it exist in the compatibility pages, I think that it confirms that it was not the intent of the standard to forbid it.
In particular, it would defeat that common dynamic construction of an array of objects from a class with no default constructor:
class T {
...
public T(U initialization) {
...
}
};
...
unsigned char *mem = new unsigned char[N * sizeof(T)];
T * arr = reinterpret_cast<T*>(mem); // See the array as an array of N T
for (i=0; i<N; i++) {
U u(...);
new(arr + i) T(u);
}
arr can then be used as a pointer to the first element of an array...
To expand on the answers given here is an example of what I believe the revised wording is excluding:
Warning: Undefined Behaviour
#include <iostream>
int main() {
int A[1]{7};
int B[1]{10};
bool same{(B)==(A+1)};
std::cout<<B<< ' '<< A <<' '<<sizeof(*A)<<'\n';
std::cout<<(same?"same":"not same")<<'\n';
std::cout<<*(A+1)<<'\n';//!!!!!
return 0;
}
For entirely implementation dependent (and fragile) reasons possible output of this program is:
0x7fff1e4f2a64 0x7fff1e4f2a60 4
same
10
That output shows that the two arrays (in that case) happen to be stored in memory such that 'one past the end' of A happens to hold the value of the address of the first element of B.
The revised specification is ensuring that regardless A+1 is never a valid pointer to B. The old phrase 'regardless of how the value is obtained' says that if 'A+1' happens to point to 'B[0]' then it's a valid pointer to 'B[0]'.
That can't be good and surely never the intention.

Is the size of an array determined at compile time?

When I was reading about the array initialization in this tutorial. I found out this note.
type name [elements];
NOTE: The elements field within square brackets [], representing the number of elements in the array, must be a constant expression, since arrays are blocks of static memory whose size must be determined at compile time, before the program runs.*
As I know array allocate the memory in the run time. This should be a false note? or what does it means?
Please check if the following answers help in giving you clarity about this.
Static array vs. dynamic array in C++
Static arrays are created on the stack, and necessarily have a fixed size (the size of the stack needs to be known going into a function):
int foo[10];
Dynamic arrays are created on the heap. They can have any size, but you need to allocate and free them yourself since they're not part of the stack frame:
int* foo = new int[10];
delete[] foo;
You don't need to deal with the memory management of a static array, but they get destroyed when the function they're in ends
Array size at run time without dynamic allocation is allowed?
C99 standard (http://en.wikipedia.org/wiki/C99) supports variable sized arrays on the stack. Some of the compilers might implement these standards and support variable sized arrays.
The declaration T a[N] requires that N be a converted constant expression.
Converted constant expression is an expression implicitly converted to
prvalue of type T, where the converted expression is a core constant
expression. If the literal constant expression has class type, it is
contextually implicitly converted to the expected integral or unscoped
enumeration type with a constexpr user-defined conversion function.
An int literal such as 5 is a prvalue, so can be used in the declaration T a[5], but an lvalue, for example int n = 5 cannot be used in the declaration T a[n], unless the lvalue under-goes an implicit lvalue-to-rvalue conversion where the lvalue:
a) has integral or enumeration type, is non-volatile const, and is
initialized with a constant expression, or an array of such (including
a string literal)
b) has literal type and refers to a non-volatile object defined with constexpr or to its non-mutable subobject
c) has literal type and refers to a non-volatile temporary, initialized with a constant expression
Therefore the following are valid:
const int n = 5;
int a[n];
constexpr int n = 5;
int a[n];
You may use :
int array[42];
but not
int n;
std::cin >> n;
int array[n]; // Not standard C++
the later is supported as extension by some compiler as VLA (Variable length array)

Usable case of pointer to array with unspecified bounds in C++ (not in C)

Consider following code:
int main() {
int (*p)[]; // pointer to array with unspecified bounds
int a[] = {1};
int b[] = {1,2};
p = &a; // works in C but not in C++
p = &b; // works in C but not in C++
return 0;
}
In pure C you can assign a pointer to this type of address of an array of any dimension. But in C++ you can't. I found one case when compiler allows assign value to such pointer:
struct C
{
static int v[];
};
int main()
{
int (*p)[] = &C::v; // works in C++ if 'v' isn't defined (only declared)
return 0;
}
But could not find any useful case of this code.
Can anyone give an useful example (in C++) of pointer to array with unspecified bounds?
Or is it only vestige remaining from C?
Such a pointer cannot participate in pointer arithmetic, potentially useful things that still can be done are to get its type with decltype or reinterpret_cast it to another pointer type or intptr_t. This is because section 3.9p6 says:
A class type (such as "class X") might be incomplete at one point in a translation unit and complete later on; the type "class X" is the same type at both points. The declared type of an array object might be an array of incomplete class type and therefore incomplete; if the class type is completed later on in the translation unit, the array type becomes complete; the array type at those two points is the same type. The declared type of an array object might be an array of unknown size and therefore be incomplete at one point in a translation unit and complete later on; the array types at those two points ("array of unknown bound of T" and "array of N T") are different types. The type of a pointer to array of unknown size, or of a type defined by a typedef declaration to be an array of unknown size, cannot be completed.
5.3.1 says:
Note: indirection through a pointer to an incomplete type (other than cv void) is valid. The lvalue thus obtained can be used in limited ways (to initialize a reference, for example); this lvalue must not be converted to a prvalue, see 4.1.
Since array-to-pointer decay can be performed on array lvalues without prior conversion to rvalue, the code dyp left in a comment is correct:
(*p)[i]
Relevant rule, from 4.2:
An lvalue or rvalue of type 'array of N T" or "array of unknown bound of T" can be converted to a prvalue of type "pointer to T". The result is a pointer to the first element of the array.
I think a compiler should accept it (irrespective of the -O setting) because a definition of the static can be provided by another compilation unit. (This is perhaps a pragmatic deviation from the standard - I'm not an C++ expert.) The posted snippet can be compiled, but it is incomplete and cannot be brought to execution without a definition of the static member.
File c.h:
struct C {
static int v[];
};
File x.cpp
#include "c.h"
#include <iostream>
int main(){
int (*p)[] = &C::v; // works in C++ if 'v' isn't defined (only declared)
std::cout << (*p)[0] << std::endl;
return 0;
}
File y.cpp
#include "c.h"
int C::v[3] = {1,2,3};
Compiled and linked using (dont-tell-me-it's-old) g++ 4.3.3. Prints 1.