Is the address of a local variable a constexpr? - c++

In Bjarne Stroustrup's book "The C++ Programming Language (4th Edition)" on p. 267 (Section 10.4.5 Address Constant Expressions), he uses a code example where the address of a local variable is used to initialize a constexpr variable. I thought this looked odd, so I tried compiling the example with g++ version 7.3.0 and was unable to get the same results. Here is his code example verbatim (although slightly abridged):
extern char glob;
void f(char loc) {
constexpr const char* p0 = &glob; // OK: &glob's is a constant
constexpr const char* p2 = &loc; // OK: &loc is constant in its scope
}
When I compile this, I get:
error: ‘(const char*)(& loc)’ is not a constant expression
Is something happening with g++ that I'm not aware of, or is there something more to Bjarne's example?

An earlier printing of Bjarne Stroustrup's book "The C++ Programming Language (4th Edition)" on p. 267 has the error outlined in the OP's question. The current printing and electronic copies have been "corrected" but introduced another error described later. It now refers to the following code:
constexpr const char* p1="asdf";
This is OK because "asdf" is stored in a fixed memory location. In the earlier printing the book errs here:
void f(char loc) {
constexpr const char* p0 = &glob; // OK: &glob's is a constant
constexpr const char* p2 = &loc; // OK: &loc is constant in its scope
}
However, loc is not in a fixed memory location. It's on the stack and will have a different location depending on when f is called.
However, the current 4th edition printing has another error. This is the code verbatim from 10.4.5:
int main() {
constexpr const char* p1 = "asdf";
constexpr const char* p2 = p1; // OK
constexpr const char* p3 = p1+2; // error: the compiler does not know the value of p1
}
This is wrong. The compiler/linker does know the value of p1 and can determine the value of p1+2 at link time. It compiles just fine.
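As a minimal sketch of the claim above (assuming a C++11 or later compiler; the definition of glob and the function names are made up for illustration, since the book only declares glob), pointer arithmetic on a constexpr pointer into a string literal is accepted as a constant expression:

```cpp
#include <cassert>

char glob = 'g';  // hypothetical definition; the book's example only declares glob

// The literal "asdf" has static storage duration, so pointers into it
// can be used in constant expressions.
char third_char_of_literal() {
    constexpr const char* p1 = "asdf";
    constexpr const char* p2 = p1;      // OK
    constexpr const char* p3 = p1 + 2;  // also OK, contrary to the book's comment
    static_assert(*p3 == 'd', "p3 points at the third character of the literal");
    assert(*p2 == 'a');                 // p2 still points at the start of the literal
    return *p3;
}

// The address of a namespace-scope object is likewise a constant expression.
const char* address_of_glob() {
    constexpr const char* p0 = &glob;
    return p0;
}
```

Calling third_char_of_literal() returns 'd', and address_of_glob() returns the same pointer as &glob.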

It appears that the example from section 10.4.5 in my hard copy of "The C++ Programming Language (4th Edition)" is incorrect, so I've concluded that the address of a local variable is not a constexpr.
The example appears to have been updated in some PDF versions.

This answer tries to clarify why the address of a local variable can't be constexpr by analysing an example for the x86-64 architecture.
Consider the following toy function print_addr(), which displays the address of its local variable local_var and calls itself recursively n times:
void print_addr(int n) {
int local_var{};
std::cout << n << " " << &local_var << '\n';
if (!n)
return; // base case
print_addr(n-1); // recursive case
}
A call to print_addr(2) produced the following output on my x86-64 system:
2 0x7ffd89e2cd8c
1 0x7ffd89e2cd5c
0 0x7ffd89e2cd2c
As you can see, the corresponding addresses of local_var are different for each call to print_addr(). You can also see that the deeper the function call, the lower the address of the local variable local_var. This is because the stack grows downwards (i.e., from higher to lower addresses) on the x86-64 platform.
For the output above, the call stack would look like the following on the x86-64 platform:
                |     . . .     |
Highest address ----------------- <-- call to print_addr(2)
                | print_addr(2) |
                -----------------
                | print_addr(1) |
                -----------------
                | print_addr(0) | <-- base case, end of recursion
Lowest address  ----------------- Top of the stack
Each rectangle above represents the stack frame for each call to print_addr(). The local_var of each call is located in its corresponding stack frame. Since the local_var of each call to print_addr() is located in its own (different) stack frame, the addresses of local_var differ.
To conclude, since the address of a local variable in a function may not be the same for every call to the function (i.e., each call's stack frame may be located in a different position in memory), the address of such a variable can't be determined at compile time, and therefore can't be qualified as constexpr.
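The same observation can be checked without printing anything. The sketch below is only an illustration (frames_differ and its parameters are hypothetical names, not part of the example above): each call passes the address of its own local_var down to the next call, which compares it with the address of its own local_var while the caller's variable is still alive.

```cpp
#include <cassert>

// Returns true if, at every recursion depth, local_var sits at a different
// address than the caller's local_var (which is still alive at that point).
bool frames_differ(int n, const int* caller_local = nullptr) {
    assert(n >= 0);  // precondition of this sketch
    int local_var = n;
    const bool differs = (&local_var != caller_local);
    if (n == 0)
        return differs;  // base case
    return frames_differ(n - 1, &local_var) && differs;  // recursive case
}
```

Two objects whose lifetimes overlap cannot share an address, so frames_differ(2) returns true no matter how the compiler lays out the stack. One function body therefore has to work with several different addresses for local_var, which is why no single compile-time constant could serve.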

Just to add to the other answers that have pointed out the mistake: the C++ standard only allows constexpr pointers to objects of static storage duration, to one past the end of such an object, or nullptr. See [expr.const]/8, specifically 8.2.
It's worth noting that:
String literals have static storage duration.
Because of the constraints on declaring extern variables, they inherently have static storage duration or thread-local storage duration.
Hence this is valid:
#include <string>
extern char glob;
std::string boom = "Haha";
void f(char loc) {
constexpr const char* p1 = &glob;
constexpr std::string* p2 = nullptr;
constexpr std::string* p3 = &boom;
}
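The three permitted kinds of constexpr pointer can also be seen with a function-local static. This is a rough sketch (bump and counter are made-up names) assuming a C++11 or later compiler:

```cpp
#include <cassert>

// Increments a function-local static through a constexpr pointer to it.
int bump() {
    static int counter = 0;              // static storage duration
    constexpr int* p = &counter;         // OK: address of a static object
    constexpr int* past = &counter + 1;  // OK: one past the end of that object
    constexpr int* none = nullptr;       // OK: null pointer
    static_assert(none == nullptr, "a null pointer is a constant expression");
    assert(past != p);                   // past may be compared, but not dereferenced
    return ++*p;
}
```

The pointer p itself is a compile-time constant, while the value it points to still changes at run time: the first call returns 1, the second returns 2.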

Related

Local static variable not initialized in Release builds if used only in lambda

See code below. Correct output is "10 20 30 ", but in Release builds it's "0 0 0 ". Why does this happen?
std::vector<int> inValues = {1, 2, 3};
std::vector<int> outValues(inValues.size());
static const int mag = 10;
std::transform(inValues.cbegin(), inValues.cend(), outValues.begin(),
[](const auto value){
return value * mag;
});
for (const auto value: outValues)
std::cout << value << " ";
If the variable is mentioned anywhere else inside the function, or is declared in global scope, everything works as expected.
Speaking of capturing variables in lambdas, http://en.cppreference.com/w/cpp/language/lambda#Explanation says:
A variable can be used without being captured if it does not have automatic storage duration (i.e. it is not a local variable or it is static or thread local)
Your variable is a static local, so it can be used inside the lambda without being captured.
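For reference, here is a self-contained sketch of what a conforming compiler should do with such code (scale_by_ten is a hypothetical wrapper around the snippet from the question, with a non-generic lambda so that it also builds as C++11):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Multiplies every element by a static local constant used inside a lambda
// that has an empty capture list.
std::vector<int> scale_by_ten(const std::vector<int>& in) {
    static const int mag = 10;  // static local: no automatic storage duration
    std::vector<int> out(in.size());
    std::transform(in.cbegin(), in.cend(), out.begin(),
                   [](int value) { return value * mag; });  // mag is not captured
    assert(out.size() == in.size());
    return out;
}
```

With the input {1, 2, 3} this yields {10, 20, 30}.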
Additionally Microsoft gives this example of using a local static storage element:
void fillVector(vector<int>& v)
{
// A local static variable.
static int nextValue = 1;
// The lambda expression that appears in the following call to
// the generate function modifies and uses the local static
// variable nextValue.
generate(v.begin(), v.end(), [] { return nextValue++; });
//WARNING: this is not thread-safe and is shown for illustration only
}
This does work as intended although your example does not.
This is a Visual Studio 2015 bug. I can report that it has been fixed in Visual Studio 2017. So while you can feel free to report this as a bug, I'd suggest just upgrading to Visual Studio 2017. If you do choose to report it, I'd encourage you to link the bug report here.
Interestingly, it looks like mag is set to 0 on compilation; Visual Studio doesn't seem to realise that it's actually used in the lambda expression.
You could change mag to be a const int only; if you've defined it in a function, then there's arguably little benefit to be gained from defining an int as a static const compared to just a const.
Otherwise, if you're adamant about keeping the static qualifier, you could pretend to use mag right after you define it, just so its value isn't optimized away upon compilation:
static const int mag = 10;
(void)mag; // (pretend to use mag)

Why is it that my second snippet below shows undefined behavior?

Both clang and g++ seem to be compliant with the latest version of paragraph [expr.const]/5 in the C++ Standard. The following snippet prints 11 for both compilers. See live example:
#include <iostream>
#include <utility> // std::move
void f(void) {
static int n = 11;
static int* temp = &n;
static constexpr int *&&r = std::move(temp);
std::cout << *r << '\n';
}
int main()
{
f();
}
According to my understanding of this paragraph, both compilers should print 2016 for the code below. But they don't. Therefore, I must conclude that the code shows undefined behavior, as clang prints an arbitrary number and g++ prints 0. I'd like to know why it is UB, taking into consideration, for example, draft N4527 of the Standard. Live example.
#include <iostream>
#include <utility> // std::move
void f(void) {
static int n = 11;
static int m = 2016;
static int* temp = &n + 1;
static constexpr int *&&r = std::move(temp);
std::cout << *r << '\n';
}
int main()
{
f();
}
Edit
I have a habit of not being satisfied with an answer that just says the code is UB, or shows undefined behavior. I always like to investigate a little deeper, and sometimes, as now, I happen to be lucky enough to understand a little bit more, how compilers are built. And that's what I found out in this case:
Both clang and GCC seem to eliminate any unused variable, like m, from the code, for any optimization level greater than -O0. GCC seems to order local variables with static storage duration, the same way variables are placed on the stack, i.e., from higher to lower addresses.
Thus, in clang, if we change the optimization level to -O0 we get the number 2016 printed as expected.
In GCC, if in addition to that, we also change the definition of
static int* temp = &n + 1;
to
static int* temp = &n - 1;
we will also get the number 2016 printed by the code.
I don't think there's anything subtle here. &n + 1 points one past the end of the single-element array that n can be treated as for the purposes of pointer arithmetic, so it is a perfectly valid pointer but not a dereferenceable one. Thus temp is a perfectly fine pointer and r a perfectly fine constexpr reference to it.
You could use r like this:
for (int * p = &n; p != r; ++p) { /* ... */ }
This loop could even appear in a constexpr function.
The behaviour is of course undefined when you attempt to dereference r, but that has nothing to do with constant expressions.
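To make the distinction concrete, here is a small sketch (sum_single_object and the local names are invented for illustration) in which the one-past-the-end pointer is only ever compared, never dereferenced:

```cpp
#include <cassert>

// Walks the one-element "array" formed by n, using the past-the-end
// pointer purely as a loop bound.
int sum_single_object() {
    static int n = 11;
    constexpr int* begin = &n;
    constexpr int* r = &n + 1;  // one past the end: valid, but not dereferenceable
    int sum = 0;
    int steps = 0;
    for (int* p = begin; p != r; ++p) {  // r is compared against, never read through
        sum += *p;
        ++steps;
    }
    assert(steps == 1);  // exactly one object is visited
    return sum;
}
```

The function returns 11. Nothing here depends on what happens to be stored after n in memory.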
You've apparently expected that you can:
obtain a pointer to a static storage duration object
add one to it
get a pointer to the "next" static storage duration object (in declaration order)
This is nonsense.
You'd have to eschew all standard-backed guarantees, relying only on an unholy combination of UB and implementation documentation. Clearly you have crossed the UB threshold long before we ever even entertain discussions about constexpr and std::move, so I'm not sure what relevance they were intended to hold in this question.
Pointers are not "memory addresses" that you can use to navigate your declaration space.

References in c++ with function

Could anyone please explain the behaviour of the reference in this code, and why it prints 12 as the first value instead of 11?
Below is the code
http://ideone.com/l9qaBp
#include <cstdio>
using namespace std;
int &fun()
{
static int x = 10;
x++;
return x;
}
int main()
{
int *ptr=&fun();
int *ptr1=&fun();
printf("%p %p \t %d %d",(void*)ptr,(void*)ptr1,*ptr,*ptr1);
return 0;
}
The output of the code is
134519132 134519132 12 12
Please explain why 12 is printed for the first call and not 11. I understand that after the second call it should print 12.
ptr and ptr1 point to the same variable, static int x. The second call changed the value of x to 12, and only then do you print the value by dereferencing ptr and ptr1, so the same result is printed for both.
The int reference returned by fun() is the same for both calls (a reference to the static x), hence the address taken from that reference is the same for both calls. Dereferencing that identical address therefore yields the same current value.
Your error seems to be in thinking that printf() prints *ptr as soon as it becomes available. It does not; printf() is not called until both ptr and ptr1 have been computed. Since both point to the same memory location, a static variable that is updated by both the first and the second call to fun(), both dereferences yield the final value.
Static variables have static storage duration: they exist for the lifetime of the program and are stored in statically allocated memory. This means that the storage of a static local variable inside a function is not allocated and deallocated on the call stack.
Once x is initialized, its value persists between invocations of the function fun.
As C++ statements are executed sequentially, printf is executed after the two function calls in the lines
int *ptr=&fun();
int *ptr1=&fun();
and therefore the value of x is already 12 when the printf statement executes.
Keep in mind that
int *ptr=&fun();
int *ptr1=&fun();
is not equivalent to
int& (*ptr)() = &fun;
int& (*ptr1)() = &fun;
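In the second snippet, ptr and ptr1 both hold the address of the function fun. In this case you need to call the function, either directly or through these pointers:
int a = ptr();
int b = ptr1();
Here a and b are copies of x taken at the time of each call, so they hold successive values (11 and 12 if these are the first two calls to fun) rather than both being 12.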
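The difference between keeping a pointer to x and keeping a copy of x can be checked with a short sketch. The two helper functions are hypothetical; fun() is the function from the question:

```cpp
#include <cassert>

int& fun() {
    static int x = 10;
    x++;
    return x;
}

// Pointers obtained from two calls alias the same static object, so after
// the second call both see the latest value.
bool both_pointers_alias_x() {
    int* ptr = &fun();
    const int seen_after_first = *ptr;
    int* ptr1 = &fun();
    assert(ptr == ptr1);
    return *ptr == seen_after_first + 1 && *ptr == *ptr1;
}

// Copies taken by value keep the value x had when each call returned.
bool copies_keep_old_values() {
    const int a = fun();
    const int b = fun();
    return b == a + 1;
}
```

Both helpers return true regardless of how many times fun() has been called before.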

Pointer corrupted while returning from a function

TL;DR: When I run my C++ program on a Mac under OS X Yosemite, a pointer gets corrupted while a function is returning. How do I stop it from happening? (and why?)
In this sample program, I have a data structure of type category_map<T> which is effectively just a
map<string, list<pair<string, T> > >
The category_map class has a couple of methods, including get(string& name) which pulls the list stored under the given name and returns the T from the first element of that list. In my case, T is a pointer type. The pointer that the code retrieves from the first pair in the list - that'd be p in the code listing below - is valid. A debugger session shows that the value of p on the last line of the function - the closing brace, before destructors run - is a valid memory location like, say, 0x100809c00.
T& get(const string& name) const {
cerr << "searching for " << name << endl;
typename super::const_iterator map_iterator = super::find(name);
// the real code doesn't assume it will be found
list_type the_list = map_iterator->second;
T& p = the_list.front().second;
cerr << "found " << val_loc_string<T>(p) << endl;
return p;
}
However, when I compile and run the code on a Mac (OS X Yosemite), but not on Linux, somewhere in the process of cleaning up from this function, something writes to the same location in memory, so that the returned pointer - stored in the variable ip in the next code listing below - is corrupted. For example, it might become 0x3000100809c00 or 0x5000100809c00. The corrupted pointer is always the original pointer with one or a few extra bits set in the 2nd-most-significant byte of the 8-byte address.
int main(const int argc, const char** argv) {
category_map<int*> imap;
int a;
imap.add("Q1", "m", &a);
imap.add("Q1", "r", &a);
imap.add("Q2", "m", &a);
int* ip = imap.get("Q1");
cerr << "return value: " << val_loc_string<int*>(ip) << endl;
cout << *ip << endl;
}
Using GDB (installed through MacPorts) I've identified the specific instruction that writes the extra bits to the memory location.
0x00007fff93188279: cmp $0x2,%eax
0x00007fff9318827c: jb 0x7fff9318828d
0x00007fff9318827e: shl $0x4,%rax
=> 0x00007fff93188282: mov %r10w,-0x2(%rax,%rdx,1)
0x00007fff93188288: mov %r10w,0x10(%rdx)
0x00007fff9318828d: test %r10w,%r10w
0x00007fff93188291: jne 0x7fff93188299
(more context) but this is not much help because it's not part of a C/C++ function, I'm not fluent enough in assembly to understand what it's doing on a large scale, and the backtrace is garbage so I can't put the code in context. (I've also captured the values of the registers just prior to the instruction that corrupts the pointer, in case that helps for some reason.)
Since I instantiate category_map<T> only with pointer types, I could change the return type of get to T (instead of T&) and that does appear to solve (or at least work around) the problem. But it makes the data structure more generally useful if it can hold large objects and return them by reference, and I would think that should be possible. Plus, whatever error I've made in coding this, I would like to understand so I don't make it again. Can anyone point out what I did wrong, and the right way to fix it without changing the API?
With
list_type the_list = map_iterator->second;
you make a copy of map_iterator->second. the_list is a function-local object. Then
T& p = the_list.front().second;
return p;
returns a reference to something that lives as long as this function-local object and is destroyed when the function is left. The reference dangles.
It looks to me as though you didn't intend to make a copy of the list, so
// +------ const because get() is const-qualified
// v v-- reference
list_type const &the_list = map_iterator->second;
// v-- const because the_list is const
T const& p = the_list.front().second;
should fix it, if you can make get() const return a T const& [1]. Otherwise you have the problem of attempting to return a reference to non-const member from a const member function; this would break const-correctness and is therefore forbidden (if it were allowed, you would be able to change constant objects through that reference).
[1] You could also make get() const return a value rather than a reference, but there doesn't seem to be a reason to force that copy.
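Put together, a stripped-down version of the fix might look like the sketch below. It is an assumption-laden stand-in, not the asker's class: category_map_sketch uses a member map instead of inheriting from std::map, and the add() signature is guessed from the calls in main().

```cpp
#include <cassert>
#include <list>
#include <map>
#include <string>
#include <utility>

template <typename T>
class category_map_sketch {
    std::map<std::string, std::list<std::pair<std::string, T>>> data_;

public:
    void add(const std::string& category, const std::string& key, const T& value) {
        data_[category].emplace_back(key, value);
    }

    // Binds a reference to the list stored in the map instead of copying it,
    // so the returned reference stays valid after get() returns.
    const T& get(const std::string& category) const {
        const auto& the_list = data_.at(category);  // throws std::out_of_range if absent
        return the_list.front().second;
    }
};

// Usage check: the pointer that comes back is the one that was stored.
bool get_returns_stored_pointer() {
    category_map_sketch<int*> imap;
    int a = 42;
    imap.add("Q1", "m", &a);
    imap.add("Q1", "r", &a);
    int* const& ip = imap.get("Q1");  // refers to the element inside the map
    assert(ip == &a);
    return *ip == 42;
}
```

The returned reference remains valid only as long as the map element exists, which is the usual contract for container accessors.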

Why strange behavior with casting back pointer to the original class?

Assume that in my code I have to store a void* as a data member and cast it back to the original class pointer when needed. To test its reliability, I wrote a test program (linux ubuntu 4.4.1 g++ -O4 -Wall) and I was shocked to see the behavior.
struct A
{
int i;
static int c;
A () : i(c++) { cout<<"A() : i("<<i<<")\n"; }
};
int A::c;
int main ()
{
void *p = new A[3]; // good behavior for A* p = new A[3];
cout<<"p->i = "<<((A*)p)->i<<endl;
((A*&)p)++;
cout<<"p->i = "<<((A*)p)->i<<endl;
((A*&)p)++;
cout<<"p->i = "<<((A*)p)->i<<endl;
}
This is just a test program; in my actual case, it's mandatory to store any pointer as void* and then cast it back to the actual pointer (with the help of a template). So let's not worry about that part. The output of the above code is,
p->i = 0
p->i = 0 // ?? why not 1
p->i = 1
However, if you change void* p to A* p, it gives the expected behavior. Why?
Another question: I cannot do without the (A*&) cast, because otherwise I cannot use operator ++, but it also gives the warning "dereferencing type-punned pointer will break strict-aliasing rules". Is there any decent way to get rid of the warning?
Well, as the compiler warns you, you are violating the strict aliasing rule, which formally means that the results are undefined.
You can eliminate the strict aliasing violation by using a function template for the increment:
template<typename T>
void advance_pointer_as(void*& p, int n = 1) {
T* p_a(static_cast<T*>(p));
p_a += n;
p = p_a;
}
With this function template, the following definition of main() yields the expected results on the Ideone compiler (and emits no warnings):
int main()
{
void* p = new A[3];
std::cout << "p->i = " << static_cast<A*>(p)->i << std::endl;
advance_pointer_as<A>(p);
std::cout << "p->i = " << static_cast<A*>(p)->i << std::endl;
advance_pointer_as<A>(p);
std::cout << "p->i = " << static_cast<A*>(p)->i << std::endl;
}
You have already received the correct answer and it is indeed the violation of the strict aliasing rule that leads to the unpredictable behavior of the code. I'd just note that the title of your question makes reference to "casting back pointer to the original class". In reality your code does not have anything to do with casting anything "back". Your code performs reinterpretation of raw memory content occupied by a void * pointer as an A * pointer. This is not "casting back". This is reinterpretation. Not even remotely the same thing.
A good way to illustrate the difference would be to use an int and float example. A float value declared and initialized as
float f = 2.0;
can be cast (explicitly or implicitly converted) to int type
int i = (int) f;
with the expected result
assert(i == 2);
This is indeed a cast (a conversion).
Alternatively, the same float value can be also reinterpreted as an int value
int i = (int &) f;
However, in this case the value of i will be totally meaningless and generally unpredictable. I hope it is easy to see the difference between a conversion and a memory reinterpretation from these examples.
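If you want to see the difference without invoking undefined behavior, std::memcpy can copy the object representation into an integer. The sketch below assumes a 32-bit IEEE 754 float (checked by the static_assert); the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559 &&
                  sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit IEEE 754 float");

// Conversion: produces the int whose value is closest to f, truncating toward zero.
int converted(float f) { return static_cast<int>(f); }

// Reinterpretation: copies the bytes of f into an integer of the same size.
std::uint32_t object_bits(float f) {
    std::uint32_t bits = 0;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// The two operations give unrelated results for the same input.
bool conversion_differs_from_reinterpretation() {
    assert(converted(2.0f) == 2);
    return object_bits(2.0f) != 2u;
}
```

Under that assumption, converted(2.0f) is 2, whereas object_bits(2.0f) is 0x40000000, the bit pattern of 2.0f.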
Reinterpretation is exactly what you are doing in your code. The (A *&) p expression is nothing else than a reinterpretation of raw memory occupied by pointer void *p as pointer of type A *. The language does not guarantee that these two pointer types have the same representation or even the same size. So, expecting predictable behavior from your code is like expecting the above (int &) f expression to evaluate to 2.
The proper way to really "cast back" your void * pointer would be to do (A *) p, not (A *&) p. The result of (A *) p would indeed be the original pointer value, that can be safely manipulated by pointer arithmetic. The only proper way to obtain the original value as an lvalue would be to use an additional variable
A *pa = (A *) p;
...
pa++;
...
And there's no legal way to create an lvalue "in place", as you attempted to by your (A *&) p cast. The behavior of your code is an illustration of that.
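Spelled out as a complete function, the additional-variable approach might look like this sketch (Item is a hypothetical stand-in for A, with an automatic array instead of new[] to keep the example short):

```cpp
#include <cassert>

struct Item { int i; };  // hypothetical stand-in for A

// Stores a pointer as void*, then advances it twice by converting back
// to the real type, incrementing, and storing the result again.
int value_after_two_steps() {
    Item items[3] = {{0}, {1}, {2}};
    void* p = items;                   // stored as void*
    Item* pa = static_cast<Item*>(p);  // converted back to the original type
    pa++;                              // arithmetic on the typed pointer
    p = pa;                            // stored as void* again
    assert(static_cast<Item*>(p)->i == 1);
    pa++;
    p = pa;
    return static_cast<Item*>(p)->i;
}
```

The function returns 2, the value held by the third element.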
As others have commented, your code appears like it should work. Only once (in 17+ years of coding in C++) I ran across something where I was looking straight at the code and the behavior, like in your case, just didn't make sense. I ended up running the code through debugger and opening a disassembly window. I found what could only be explained as a bug in VS2003 compiler because it was missing exactly one instruction. Simply rearranging local variables at the top of the function (30 lines or so from the error) made the compiler put the correct instruction back in. So try debugger with disassembly and follow memory/registers to see what it's actually doing?
As far as advancing the pointer, you should be able to advance it by doing:
p = (char*)p + sizeof( A );
VS2003 through VS2010 never complained about that; I'm not sure about g++.