Why constexpr should be static? - c++

After reading this and this, I still feel confused about this kind of declaration:
static constexpr int x = 0;
AFAIK, in C++:
static ensures a memory address that stays valid for the whole execution, and safe initialization with concurrent threads
constexpr ensures compile-time evaluation as an rvalue, which means it shall have no memory address
They look contradictory to me. static ensures the variable has a long-lived memory address, whereas constexpr ensures the opposite. Surprisingly, the discussion in the first link mentions this:
constexpr int x = 3;
const int* p = &x;
How can we even obtain the memory address of x if it is an rvalue?
Could anyone explain it?

static has a number of meanings. In classes (per your comment), it means that the member is associated with the class, and not a specific instance (object) of that class.
For a constexpr, that makes a lot of sense. That's typically initialized by a value known to the compiler, and not from ctor arguments.
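To make this concrete, here is a minimal sketch (the names Buffer, capacity and read_through_pointer are made up for illustration, not taken from the question). It assumes C++11 or later and shows two things: a static constexpr data member belongs to the class rather than to each object, and a constexpr variable is still a named const object, so taking its address is allowed.

```cpp
#include <cassert>

struct Buffer {
    // One value shared by the whole class; usable as an array bound.
    static constexpr int capacity = 4;
    int data[capacity] = {};
};

// A constexpr variable names a const object. The name is an lvalue,
// so its address can be taken even though its value is known at
// compile time.
int read_through_pointer() {
    constexpr int x = 3;
    const int* p = &x;
    return *p;
}

static_assert(Buffer::capacity == 4, "usable in constant expressions");
static_assert(sizeof(Buffer::data) / sizeof(int) == 4, "array sized by the member");
```

Whether the compiler actually reserves memory for x is a separate matter; it only has to behave as if the object existed.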

May a compiler store function-scoped, non-static, const arrays in constant data and avoid per-call initialization?

In reading How are char arrays / strings stored in binary files (C/C++)?, I was thinking about the various ways in which the raw string involved, "Nancy", would appear intact in the resulting binary. That post's case was:
int main()
{
char temp[6] = "Nancy";
printf("%s", temp);
return 0;
}
and obviously, in the general case (where the compiler can't confirm if temp is unmutated), it must actually initialize a stack local array to allow for mutations in the future; the array itself must have space allocated (on the stack, or maybe using registers for truly weird architectures), and it must be populated on each call to the function (let's pretend this isn't main which is called only once in C++ and typically only once in C), to avoid reentrancy issues and the like. Whether it hardcodes the initialization into the assembly, or does a memcpy from the program's constant data section is irrelevant; there is definitely something that must be initialized per-call.
By contrast, if char temp[6] = "Nancy"; was replaced with any of:
const char *temp = "Nancy";
char *temp = "Nancy"; (C only; in C++ the literals are const char[], though in practice they're not mutable in C either)
static const char temp[6] = "Nancy";
static char temp[6] = "Nancy";
then the program need not allocate any array-length-based resources per call (just a pointer variable in cases #1 & #2), and in all but case #4, it can put the data in read-only memory baked into the binary's data constants (#4 would put it in the section for read-write memory, but it could still be baked into the binary and loaded copy-on-write).
My question: Does the standard provide leeway for const char temp[6] = "Nancy"; to behave equivalently to static const char temp[6] = "Nancy";? Both are immutable, and modifying them is against the rules. The only differences I'm aware of would be:
Without static, you'd expect the array's address to be colocated with other locals, not in some other part of program memory (could have effects on cache performance)
Without static, you're technically saying the variable is created and destroyed on each call
I don't see anything obviously broken in terms of observable behavior by the standard:
You can't watch the array exist and cease to exist except in terms of undefined behavior, e.g. returning a pointer to temp, where there are no guarantees
You can't legally compute ptrdiff_t for unrelated variables (only within a given array, plus the one-past-the-end virtual element of said array)
so I'd think the compiler could safely "treat as static" for this case by as-if rules; there's no way to observe the difference, so it can do whatever it feels best.
Is there anything I'm missing where either the C or C++ standard would require some sort of per-call initialization of the const but non-static function scoped array? If the C and C++ standards disagree, I'd like to know that too.
Edit: As Barmar points out in the comments, there are standards-legal ways to detect this behavior in a particular compiler, e.g.:
int myfunc() {
const char temp[6] = "Nancy";
const char temp2[6] = "Nancy";
return temp == temp2; // true if compiler implicitly made them static or combined them, false if not
}
or:
int otherfunc(const char *s) {
const char temp[6] = "Nancy";
return s == temp;
}
int myfunc() {
const char temp[6] = "Nancy";
return otherfunc(temp); // true if compiler implicitly made them shared statics, false if not
}
The standard does not prescribe how local variables are implemented. A stack is a common choice, because it makes recursive functions easy. But leaf functions are easy to detect, and the example is almost a leaf function, except for the side-effect-carrying printf.
For such leaf functions, a compiler might choose to implement local variables using statically allocated memory. As the question correctly states, the local variables still need to be constructed and destructed, since they're not static.
In this question, however, char temp[6] has no constructors or destructors. So a compiler which implements local variables in leaf functions as described would have a memcpy to initialize temp.
This memcpy would be visible to the optimizer - it would see the global address, the only use of the same address in printf, and it could then deduce that each memcpy can be moved to program startup. Repeated calls of that same memcpy are idempotent and can be optimized out.
This would cause the generated assembly to be identical to the static case. So the answer to the question is yes. A compiler can indeed generate the same code, and there's even a somewhat plausible way in which it could end up doing so.
Per C11, 6.2.2/6 temp has no linkage, because it is:
a block scope identifier for an object declared without the storage-class specifier extern
and per C11, 6.2.2/2:
each declaration of an identifier with no linkage denotes a unique entity
The "unique entity" implies (I guess) "unique address". Hence, the compiler is required to provide the uniqueness property.
However (speculating), if an optimizer proved that the uniqueness property is not used AND estimated that reading from memory is faster than writing and reading registers (the code generated for = "Nancy"), then (I guess) it can give temp static storage duration. Note that writing and reading registers is usually much faster than reading from memory.
Extra: temp has block scope, not function scope.
Below is the initial answer (which is "out of scope").
C11, 6.8 Statements and blocks, Semantics, 3 (emphasis added):
The initializers of objects that have automatic storage duration, and the variable length array declarators of ordinary identifiers with block scope, are evaluated and the values are stored in the objects (including storing an indeterminate value in objects without an initializer) each time the declaration is reached in the order of execution, as if it were a statement, and within each declaration in the order that declarators appear.
For C++, although I would expect the answer for C to be equivalent:
If the function with the declaration
const char temp[6] = "Nancy";
is entered recursively, then, in contrast to the variant with static, the declaration will cause multiple complete const char[6] objects with overlapping lifetimes to exist.
Applying [intro.object]/9, these objects may then not have overlapping memory and their addresses, as well as the addresses of their array elements, must be distinct. On the other hand with static, there would only be one instance of the array and so taking its address in multiple recursions must yield the same value. This is an observable difference between the version with and without static.
So, if the address of the array or one of its elements is taken or a reference to either formed and escapes the function body, and there are function calls which may potentially be recursive, then the compiler cannot generally treat the declaration with an additional static modifier.
If the compiler can be sure that either e.g. no pointer/reference to the array or its elements escapes the function or that the function cannot possibly be called recursively or that the behavior of the function doesn't depend on the addresses of the array copies, then it could under the as-if rule treat the array as static.
Because the array is a const-qualified automatic storage duration variable, it is impossible to modify values in it or to place new objects into its storage. As long as the addresses are not relevant to the behavior, there is therefore nothing else that could cause an observable difference in behavior.
I don't think anything here is specific to const char arrays. This applies to all const automatic storage duration constant-initialized variables with trivial destruction. constexpr instead of const would not change anything here either, since that doesn't affect the object identity.
Because of [intro.object]/9, both functions myfunc in your edit are also guaranteed to return 0. The two arrays have overlapping lifetimes and therefore may not share the same address. This is therefore not a method to "detect" this optimization. It causes it to become impossible.
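The recursion argument can be sketched as follows. This is an illustration only, with hypothetical function names; it assumes a conforming compiler and relies on the rule cited above that two complete objects with overlapping lifetimes have distinct addresses.

```cpp
#include <cassert>

// Each call passes the address of its own array to a nested call,
// which compares it with the address of its own array.
bool same_address_auto(const char* outer = nullptr) {
    const char temp[6] = "Nancy";          // one object per active call
    return outer ? outer == temp : same_address_auto(temp);
}

bool same_address_static(const char* outer = nullptr) {
    static const char temp[6] = "Nancy";   // a single object for all calls
    return outer ? outer == temp : same_address_static(temp);
}
```

By the argument above, same_address_auto() has to report distinct addresses while same_address_static() has to report identical ones, which is why the address escaping into a possibly recursive call blocks the "treat as static" transformation.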

Should I avoid static constexpr local variables? If so, why?

Consider code like the following, where the literal some_magic_int (e.g. 3) is given a name just to make a bit clearer what constant it represents:
void f() {
static constexpr int this_variable{some_magic_int};
do_something_with(this_variable);
}
int main() {
// ...
f();
// ...
}
I'm pretty sure constexpr has to be here: some_magic_int is a literal, so it never changes, and I'm giving it a name just for clarity, not to provide a means to change it, so it should be at least const; then why not constexpr, to have it at compile time?
But what about static? Is it just unnecessary? Or is it detrimental? If so, why? And also, does it have any observable effect, when paired with constexpr in the declaration of a local variable?
As regards the question to which this is marked as duplicate of, it is about static constexpr int x [] = {}, and not static constexpr int x {}. This highlights at least one difference between that case (attributes applying to x pointer vs attributes applied to *x pointee) and my case (there's no pointer).
Furthermore, once I add constexpr to the specifier of a local variable (where it makes sense, e.g. to a int), I'm saying that variable is compile-time known. Why in the world doesn't that imply that no run-time entity is needed whatsoever?
The standard doesn’t actually ever talk about compile-time anything except to say that types are checked and templates are instantiated before execution. That means that this program must be diagnosed (not “rejected”!) even though the non-constant array length and template argument are never “used” and might plausibly be ignored by an interpreter:
template<int> void f() {}
int main(int argc,char **argv) {
if(false) {
int buf[argc]; // accepted by a common extension
f<argc>();
}
}
Beyond that, the semantics are that every evaluation is part of the ordinary execution of the program and constant folding is just an as-if optimization like any other. (After all, we can optimize argc*2*3*4 even though it contains no non-literal subexpression that could be a constant expression.) For constant expressions, this is largely unobservable because constant evaluation can’t have side effects (which also avoids interactions among constant-initialized non-block variables). It can, however, make a difference via the address of the local variable:
bool introspect(const int *p=nullptr) {
constexpr int x=0;
return p ? p==&x : introspect(&x); // always false
}
That the compiler must make such variables occupy memory if their address escapes is much of the content of the answers to the previously marked duplicate. It therefore makes sense to prefer static except perhaps when the object is large and its address doesn’t matter (e.g., for use as a template argument or via recursion).
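As a rough illustration of where static makes an observable difference for a local constexpr variable, consider the sketch below. The names magic and via_lambda are hypothetical, and C++11 or later is assumed.

```cpp
#include <cassert>

// Returning a reference is safe only because the variable has static
// storage duration; without static the reference would dangle.
const int& magic() {
    static constexpr int value = 3;
    return value;
}

// A variable with static storage duration can be odr-used inside a
// lambda without being captured. Taking the address of a non-static
// constexpr local here would require a capture.
int via_lambda() {
    static constexpr int k = 3;
    auto f = [] { const int* p = &k; return *p; };
    return f();
}
```

In both functions the address of the variable matters, so these are the cases where the choice between static and automatic storage is more than a style question.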

Creating a variable like x = func(); function call and var creation order defined?

void f()
{
auto x = func();
int y = func();
auto z = f1() * f2() + static_cast<int>(f3());
}
I believe it should be defined that the call to func always happens first, before memory allocation for x, in the auto case, but I couldn't find information about it.
Is it so?
And is it defined for the case when type is explicitly written?
Evaluation of the initialization expression (func() or f1() * f2() + static_cast<int>(f3())) definitely happens only when the particular line of code is reached (see footnote 1).
Memory for the variable may be obtained (aka "allocation") at any time earlier... however there's no way to use that memory prior to the definition because until the definition is reached there's no way to name that memory. The variable name is introduced into scope by the definition.
The lifetime of the object living in the variable doesn't begin until the initializer is fully evaluated and placed (see footnote 2) into the new object. See [basic.life]:
The lifetime of an object of type T begins when:
storage with the proper alignment and size for type T is obtained, and
its initialization (if any) is complete (including vacuous initialization)
If the address of the variable is never taken, the variable might not need any memory at all (it could fit in a CPU register for its entire lifetime).
1 Well, under the as-if rule, the compiler can move it around so long as you can't tell the difference. Unless you have undefined behavior such as a data race, it will always act exactly like the computation is done when reaching that line of code.
2 For the copy-initialization syntax used in the question, old versions of C++ generally required a copy or move, while newer versions mandate in-place construction via copy-elision.
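A small sketch of the ordering described above: the initializer runs when the declaration is reached, whether the type is spelled auto or int. The trace string and the run function are assumptions added for illustration; func here is a stand-in that records when it is called.

```cpp
#include <cassert>
#include <string>

std::string trace;  // records the order in which things happen

int func() {
    trace += 'f';
    return 1;
}

std::string run() {
    trace.clear();
    trace += 'a';
    auto x = func();   // evaluated here, when the declaration is reached
    trace += 'b';
    int y = func();    // same for an explicitly written type
    trace += 'c';
    (void)x;
    (void)y;
    return trace;
}
```

When the storage for x and y is reserved is not visible in this trace at all, which matches the point that only the evaluation of the initializer is tied to the line of code.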

Why references to the same constant take distinct memory space in C++?

I'm new to the idea of reference in C++, I have a question concerning the memory allocation of reference to a pure number constant. (Another thing I want to check first is that I suspect const reference, which I frequently came across, means reference to const, but I'm not sure.)
Here is my testing on ideone.com:
#include <stdio.h>
int main() {
const int r0 = 123;
const int &r1 = 123;
const int &r2 = 123;
const int &r3 = r2;
printf("%p\n", (void *)&r0);
printf("%p\n", (void *)&r1);
printf("%p\n", (void *)&r2);
printf("%p\n", (void *)&r3);
return 0;
}
and the result:
0x7ffee3bd74c4
0x7ffee3bd74c8
0x7ffee3bd74cc
0x7ffee3bd74cc
The reason r2 is the same as r3 is clear from this answer - How does a C++ reference look, memory-wise?, which says it depends on the compiler. But I'm wondering why the compiler doesn't also make r0, r1 and r2 all the same, since they all have the same pure constant value 123 (which I believe is called a prvalue, if my search was right).
As a note: After some search on this site, I found a most related question - but in python. Although different language but I thought the idea should be the same/similar: from the link, if my program were written in python then there will be only one 123 is in the memory space for saving space.
Some other answers I've read:
C++ do references occupy memory: This answer suggests that if it's necessary then int &x is implemented as *(pointer_to_x).
How does a C++ reference look, memory-wise?: This answer suggests that compiler will try its best to save space.
Your 123 isn't a "constant". Rather, it is a literal. A literal forms an expression that is a prvalue (i.e. a temporary object initialized with the value given by the literal). When you bind that expression to a reference, the lifetime of that object is extended to that of the reference, but the important point here is that each such object is a distinct object, and thus has a distinct address.
If you will, the text string "123" provides a rule for how to create objects, but it is not by itself an object. You can rewrite your code to make this more explicit:
const int & r = int(123); // temporary of type "int" and value "123"
(There's no single such thing as "a constant" in C++. There are lots of things that are constant in one way or another, but they all need more detailed consideration.)
The literal is not an object. The references do not refer to the literal. When you initialise a reference using a literal, a temporary object will be created, and the lifetime of the temporary object is bound to the lifetime of the reference.
The objects (one local variable, two temporaries) are separate and distinct objects despite having the same value. Since they're separate, they occupy separate memory locations. The standard mandates this, and that makes it possible to identify and distinguish objects based on their memory address.
The three declaration statements:
const int &r1 = 123;
const int &r2 = 123;
const int &r3 = r2;
The first two declarations each initialize a temporary object whose lifetime is extended to match that of the reference bound to it; the third binds r3 to the same temporary as r2, so only two temporaries exist. Now, there is a language rule that says:
Any two objects with overlapping lifetimes (that are not bit fields)
are guaranteed to have different addresses unless one of them is a
subobject of another or provides storage for another, or if they are
subobjects of different type within the same complete object, and one
of them is a zero-size base.
Since r1 and r2 are bound to two distinct temporaries, you cannot observe these objects at overlapping addresses.
Interestingly, the as-if rule might permit the program to place r0 and both temporaries at the same address, but only if your compiler and linker can prove that your program can never observe these objects as allocated at the same address. In your example, this is infeasible since you print the addresses of the objects.
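The same experiment can be written as a check instead of printed addresses. This is only a sketch; the function name addresses_behave_as_described is invented for illustration.

```cpp
#include <cassert>

bool addresses_behave_as_described() {
    const int r0 = 123;    // a named object
    const int& r1 = 123;   // bound to a lifetime-extended temporary
    const int& r2 = 123;   // bound to a second, distinct temporary
    const int& r3 = r2;    // another name for the object r2 refers to

    // Three distinct objects with overlapping lifetimes must have
    // distinct addresses; r3 refers to the same object as r2.
    return &r0 != &r1 && &r1 != &r2 && &r0 != &r2 && &r2 == &r3;
}
```

Because the addresses are compared, the compiler cannot merge the three objects here, even though all of them hold the value 123.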

Is memory allocated for a static const variable whose address is never used?

If I never use the address of a static const variable, is memory allocated for it when using a reasonably modern compiler?
It depends on the type of the variable, and on whether "constant" also means "constant expression". Example:
static const Foo foo = get_foo(std::cin);
static const int q = argc * 3;
static const std::string s(gets());
These variables are const, but blatantly need an actual allocation.
On the other hand, the following constant expression may never have physical storage:
static const int N = 1000;
static const std::shared_ptr<void> vp; // constexpr constructor!
Most importantly, static constexpr member variables don't need a definition if you're careful:
struct Bar
{
int size() const { return N; }
static const int N = 8;
};
// does NOT need "const int Bar::N;"
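What "careful" means here can be sketched as follows, assuming the usual odr-use rules: the member is only ever read by value or used in constant expressions, so no out-of-class definition is required. The data member and the static_assert lines are additions for illustration.

```cpp
#include <cassert>

struct Bar {
    static const int N = 8;         // declaration with in-class initializer
    int data[N];                    // used as a constant expression
    int size() const { return N; }  // value is read immediately: not an odr-use
};

static_assert(Bar::N == 8, "usable in constant expressions");
static_assert(sizeof(Bar::data) / sizeof(int) == 8, "array sized by N");

// Binding a reference to Bar::N or taking its address would be an
// odr-use; before C++17 that requires the out-of-class definition
// "const int Bar::N;" in exactly one translation unit.
```

A typical way to trip over this is passing Bar::N to a function that takes its parameter by const reference.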
There is a chance that it isn't, but that doesn't matter. You can't rely on implementation details, only on the standard.
In practice, space for static storage can be allocated as part of the initial binary loading, or by the runtime during startup; but will always happen before user code is encountered.
In addition to the constraints that Kerrek SB mentions, the storage for a constant-expression value could be eliminated if the value itself is never used at runtime.
This doesn't necessarily mean the value is never evaluated: if a static constant expression were only used as a branch condition, that condition may be evaluated statically, and the other code paths may not be generated or may be removed by the optimiser.
Pretty much any storage with static duration may be eliminated if the implementation can guarantee behaviour as though the storage were present. One example is a comparison that can be evaluated at compile time: against a different constant expression, a pointer comparison where the right-hand side is known to alias a different variable, or perhaps a comparison with an incompatible type. Storage may also be eliminated if the value is only read into variables that are never read themselves, or where the value may be reduced to a constant expression.
struct Foo{};
static Foo bar; // static instance
Foo* func() {
if ( ! (&bar) ) { // always non-NULL
// this block may be eliminated
Foo* myCopy(new Foo(bar));
return myCopy;
}
// so 'bar' is never referred to, and we know it has no side-
// effects, so the static variable can be eliminated
return new Foo();
}
3.7.1 Static storage duration
2. If an object of static storage duration has initialization or a destructor with side effects, it shall not be eliminated even if it appears to be unused, except that a class object or its copy may be eliminated as specified in 12.8.
Memory for global variables is reserved by the linker, not the compiler. So the question is whether the linker is smart enough to not reserve space for global variables that are only used by value.
It depends on the type and use of such data; for example, floating point constants generally must be loaded from memory, so they must have storage even if you don't directly use the address.
Having said that, the standard does specify whether you can optimize out static storage (3.7.1.2: [basic.stc.static]):
If a variable with static storage duration has initialization or a
destructor with side effects, it shall not be eliminated even if it
appears to be unused, except that a class object or its copy/move may
be eliminated as specified in 12.8.
So if the static const variable has a constructor or destructor, it cannot be optimized out (although some compilers/linkers will do this anyway). If it doesn't, it can. Whether it will depends on the linker.
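The quoted rule can be illustrated with a small sketch. The names g_constructed, Logger and constructed_count are hypothetical; the point is only that the constructor's side effect must still happen even though the variable is never referenced.

```cpp
#include <cassert>

int g_constructed = 0;  // zero-initialized before any dynamic initialization

struct Logger {
    Logger() { ++g_constructed; }  // initialization with a side effect
};

// Never referenced anywhere else in the program. Because its
// initialization has a side effect, it may not be eliminated.
static Logger unused_logger;

int constructed_count() { return g_constructed; }
```

A static const int with a constant initializer has no such side effect, so nothing in the rule prevents the implementation from dropping its storage when the address is never needed.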