This question already has answers here:
Can a local variable's memory be accessed outside its scope?
(20 answers)
Closed 7 years ago.
EDIT: The question about why the code in this question works has been answered by the linked question in the duplicate marking. The question about string literal lifetime is answered in the answer to this question.
I am trying to understand how and when the string pointed to by const char * gets deallocated.
Consider:
const char **p = nullptr;
{
const char *t = "test";
p = &t;
}
cout << *p;
After leaving the inner scope I would expect p to be a dangling pointer to const char *. However in my tests it is not. That would imply that the value of t actually continues to be valid and accessible even after t gets out of scope.
It could be due to prolonging the lifetime of a temporary by binding it to a const reference. But I do no such thing, and even saving the address of t in a member variable and printing the value from a different function later still gives me the correct value.
class CStringTest
{
public:
void test1()
{
const char *t = "test";
m_P = &t;
test2();
}
void test2()
{
cout << *m_P;
}
private:
const char **m_P = nullptr;
};
So what is the lifetime of the t's value here? I would say I am invoking undefined behaviour by dereferencing a pointer to a value of a variable that went out of scope. But it works every time so I think that is not the case.
When trying some other type like QString:
QString *p = nullptr;
{
QString str = "test";
p = &str;
}
cout << *p;
the code always prints the value correctly too even though it should not. str went out of scope with its value and I have not prolonged its lifetime by binding it to const reference either.
Interestingly the class example with QString behaves as I would expect and test2() prints gibberish because the value indeed went out of scope and m_P became dangling pointer.
So what is the actual lifetime of const char *'s value?
The variables p and t are stack variables that you declared, so they have a lifetime that ends at the end of their enclosing block.
But the value of t is the address of the string literal "test", and that is not a variable you declared, it's not on the stack. It's a string literal, which is a constant defined in the program (similar to the integer literal 99 or the floating point literal 0.99). Literals don't go out of scope as you expect, because they are not created or destroyed, they just are.
The standard says:
Evaluating a string-literal results in a string literal object with static storage duration, initialized from the given characters as specified above.
So the object that the compiler creates to represent the literal "test" has static storage duration, which is the same duration as globals and static variables, meaning it doesn't go out of scope like a stack variable.
The value of p is the address of t, which does become an invalid pointer when t goes out of scope, but that doesn't mean that the value stored at that address suddenly becomes inaccessible or gets wiped. The expression *p is undefined behaviour, but it appears to work because nothing has reused that memory location yet so *p still contains the address of the string literal. For more details on that see the top answer to Can a local variable's memory be accessed outside its scope?
Compilers put literal strings into a statically allocated space which is loaded into a protected segment of the virtual memory, such that those strings can be shared over the entire lifetime of the process (the value is a constant, so no need to take the overhead of constantly springing them into existence). Looking for something like that to be deallocated is a waste of time, since it never actually happens.
The variables are stack-allocated. The string constant should be thought of as just that: a string constant...like the number 3.
String literals get allocated in the static storage.
If you mention a string literal anywhere in your program, it's as if you did:
static const char someUniqueIdentifier[]="the data";
in the global scope.
const char* str = "some string"; means make sure "some string" exists as a constant null-terminated array in the static section of the program and point str to it.
However, you are taking the address of your automatic (= on the stack) pointer, not of the static storage string. That pointer does have a lifetime limited to its scope, but in the class example the scope of test1() hasn't ended yet when you invoke test2().
This question already has answers here:
What are the differences between a pointer variable and a reference variable?
(44 answers)
Closed 1 year ago.
Maybe there is an answer to this, but I haven't found it, probably because I do not know what the correct title of my question is.
I'm starting to learn C++, and noticed that when initializing, modifying, and accessing, the behavior is the same in both of these lines.
int *p = &a;
and
int &p = a;
The only difference I see is that later, when I use p in the first case, I have to write *p every time, otherwise I get an address (probably that of a, since its value equals &a), whereas in the second case I can just write p without the asterisk.
Are those just different syntax for the same thing, or are they different but just happen to give me the same results (in my very basic tests)? Is the compiler doing the same thing in both cases?
Actually, that depends on your program!
For example:
int a = 123;
int b = 456;
int *p = &a;
// bunch of statements here
if (condition) { p = &b; }
do_stuff(*p);
You can't do this with a reference: Once it's set, it's set.
Another difference between pointers and references in C++ is that (const) references can extend the lifetime of temporary objects:
foo f() {
foo inner_foo;
return inner_foo;
}
const foo& foo_ref = f();
If the function were to return a pointer to inner_foo instead:
foo* f() {
foo inner_foo;
return &inner_foo;
}
const foo* p = f();
you would have a pointer to a destructed foo, in a place on the stack that may get used by other variables.
And there are other differences between pointers and references.
You may also want to read the following items in the C++ Super-FAQ:
Why does C++ have both pointers and references?
When should I use references, and when should I use pointers?
The first is a pointer to an object, in your example to an int named a. A pointer holds the memory address of an object in memory. You would have to dereference the pointer in order to access the object it is pointing at.
The second is a reference to an object, in your example to a. A reference is an alias 1, ie another name, for an object. So the two names are effectively referring to, and thus are treated as, the same object, which is why you can use p to access a without dereferencing p.
1: though, most compilers will implement a reference using a pointer, but this is an implementation detail and should not be relied on in code logic.
Another difference is that a pointer can be set to nullptr, and can also be re-assigned to point at a different memory address, eg:
int a, b;
int *p = nullptr;
...
p = &a;
...
p = &b;
Whereas a reference cannot do either of those things. Once initialized, a reference cannot be changed to refer to another object.
As such, taking the address of a reference with operator& will return the address of the object it refers to, whereas taking the address of a pointer will return the address of the pointer itself, not the object it points at.
Likewise, assigning a value to a reference will assign to the object it refers to, whereas assigning a value to a pointer will assign to the pointer itself, not the object it points at.
int main(){
char * ch = "hi";
char ** chh = &ch;
chh[100] = "hh";
}
What is chh[100] = "hh" doing?
Where is the address of "hh" being stored?
"hi" and "hh" are string literals, both of type const char[3], which decay into const char * pointers to their 1st char elements. The arrays are stored in the program's read-only static memory at compile-time.
At runtime, the address of "hi" in that static memory is being stored in the ch pointer (which is illegal in C++11 and later, since a non-const char* pointer can't point to const char data), then the address of ch is being stored in the chh pointer.
The statement chh[100]="hh" is undefined behavior. chh is pointing at ch, which is stored on the stack of the calling thread. chh is not a pointer to an array of 101+ char* pointers, so indexing chh at 100 is illegal (only index 0 is valid, as any single object can be treated as an array of 1 element). This code is reaching into stack memory that you don't own, and writing the address of "hh" at chh + 100 corrupts whatever happens to be at that location (if it doesn't crash outright with an AccessViolation/SegFault).
The expression:
char* ch = "hi";
Is ill-formed because C++ forbids assigning string literals to char*, the correct statement must be:
const char* ch = "hi";
The same goes for:
const char ** chh = &ch;
Both ch and chh must be const qualified.
What is chh[100]="hh" doing?
Nothing it should be doing, chh[100] is a pointer to memory that does not belong to chh, this leads to undefined behavior.
Where is the address of hh being stored?
"hh" is a string literal and is normally stored in read-only memory. That is the reason why C++ forbids its assignment to a non-const pointer, as attempts to change it will lead to, you guessed it, undefined behavior.
Don't confuse this with assignment to a char array, that is legal because a copy of a string literal is assigned instead.
It tries to write a pointer to the "hh" array to the stack location that chh + 100 points at. You should never do that, because you have no idea what will happen: that memory may be used by another part of your code, and you may trigger a segfault or a buffer overflow.
I know that "literals" (c strings, int or whatever) are stored somewhere (in a read only data section apparently .rodata) maybe this is not accurate...
I want to understand why this code causes a runtime error:
#include <iostream>
using namespace std;
const int& foo()
{
return 2;
}
const char* bar()
{
return "Hello !";
}
int main() {
cout << foo() << endl; // crash
cout << bar() << endl; // OK
return 0;
}
foo returns a const reference on a literal (2) why does this cause a crash ? is the integer 2 stored in the stack of foo() ?
See also : Why are string literals l-value while all other literals are r-value?
I see why this is confusing so I will try to break it down.
First case:
const int& foo()
{
return 2;
}
The return statement makes a temporary object which is a copy of the literal 2. So its address is either non-extant or different from the location of the literal 2 (assuming literal 2 has a location - not guaranteed).
It is that temporary whose reference is returned.
Second case:
const char* bar()
{
return "Hello !";
}
The return statement makes a temporary object which is a pointer to the first element of the literal char array. That pointer holds the actual address of the literal array, and that address is returned by copy to the caller.
So to sum up. The second one works because the return statement takes a copy of the literal's address and not a copy of the literal itself. It doesn't matter that the storage for the address is temporary because the address still points to the correct place after the temporary holding its value collapses.
That is indeed very confusing, and in order to understand what's happening, one has to dive very deep in the language specification.
But before we do this, let me remind you that compiler warnings are your friends. With a sufficient level of warnings, you should see following when compiling your example:
In function 'const int& foo()':
3 : warning: returning reference to temporary [-Wreturn-local-addr]
return 2;
^
Now, what is happening in your first example? One cannot really take the address of an integer literal, since integer literals do not exist as objects. However, one is allowed to bind const references to literals. How is that possible, when everybody knows that references are akin to pointers? The reason is that when you bind a const reference to the literal, you do not really bind it to the literal. Instead, the compiler creates a temporary variable and binds your reference to it. That variable is an object, albeit a short-lived one. Once your function returns, the temporary object is destroyed and you end up with a dangling reference. Using it is undefined behaviour, which in your case shows up as a crash.
In the second example, "Hello !" is a literal, but you are not returning the literal: you are returning a pointer to the string. And the pointer remains valid, because the string it points to remains valid.
I'm getting a weird problem and I want to know why it behaves like that. I have a class in which there is a member function that returns std::string. My goal to convert this string to const char*, so I did the following
const char* c;
c = robot.pose_Str().c_str(); // is this safe??????
udp_slave.sendData(c);
The problem is I'm getting a weird character in Master side. However, if I do the following
const char* c;
std::string data(robot.pose_Str());
c = data.c_str();
udp_slave.sendData(c);
I'm getting what I'm expecting. My question is what is the difference between the two aforementioned methods?
It's a matter of pointing to a temporary.
If you return by value but don't store the string, it is destroyed at the end of the full-expression that created it (in practice, at the semicolon).
If you store it in a variable, then the pointer is pointing to something that actually exists for the duration of your udp send.
Consider the following:
int f() { return 2; }
int*p = &f();
Now that seems silly on its face, doesn't it? You are pointing at a value that is being copied back from f. You have no idea how long it's going to live.
Your string is the same way.
.c_str() returns a char const* by value, which means you get a copy of the pointer. But after that statement, the temporary string is destroyed, and with it the character array that the pointer points to. That is why you get garbage. In the latter case you are creating a new string that copies the characters from their original location. Although the temporary's character array is still destroyed, the copy remains in the string object.
You can't use the data pointed to by c_str() past the lifetime of the std::string object from whence it came. Sometimes it's not clear what the lifetime is, such as the code below. The solution is also shown:
#include <string>
#include <cstddef>
#include <cstring>
std::string foo() { return "hello"; }
char *
make_copy(const char *s) {
std::size_t sz = std::strlen(s);
char *p = new char[sz + 1]; // +1 for the terminating '\0' that strcpy writes
std::strcpy(p, s);
return p;
}
int
main() {
const char *p1 = foo().c_str(); // Whoops, can't use p1 after this statement.
const char *p2 = make_copy(foo().c_str()); // Okay, but you have to delete [] when done.
}
From c_str():
The pointer obtained from c_str() may be invalidated by:
Passing a non-const reference to the string to any standard library function, or
Calling non-const member functions on the string, excluding operator[], at(), front(), back(), begin(), rbegin(), end() and rend().
Which means that, if the string returned by robot.pose_Str() is destroyed or changed by any non-const function, the pointer to the string will be invalidated. Since you may be returning a temporary copy from robot.pose_Str(), the pointer returned by c_str() on it becomes invalid right after that statement.
Yet, if you return a reference to the inner string you may be holding, instead of a temporary copy, you can either:
be sure it is going to work, in case your function udp_send is synchronous;
or rely on an invalid pointer, and thus experience undefined behavior if udp_send may finish after some possible modification on the inner contents of the original string.
Q
const char* c;
c = robot.pose_Str().c_str(); // is this safe??????
udp_slave.sendData(c);
A
This is potentially unsafe. It depends on what robot.pose_Str() returns. If the life of the returned std::string is longer than the life of c, then it is safe. Otherwise, it is not.
You are storing an address in c that is going to be invalid right after the statement is finished executing.
std::string s = robot.pose_Str();
const char* c = s.c_str(); // This is safe
udp_slave.sendData(c);
Here, you are storing an address in c that will be valid until you leave the scope in which s and c are defined.
This question already has answers here:
Is returning a pointer to a static local variable safe?
(7 answers)
Closed 8 years ago.
The following case confuses me. As far as I know, local variables should not be returned by pointer or reference. For example:
char * foo()
{
return "Hello world";
}
int* fooo() {
static int i = 100;
return &i;
}
What would happen in both cases ?
String literals are stored statically, and of course the static int i is static too. You can return pointers to static variables from functions because they are not automatic variables: they are not destroyed when you exit the function, as stack-allocated variables are. On the other hand, your first example should return a const char *.
From the C++ standard section lex.string:
A string literal ... has type "array of n const char" and static storage duration (basic.stc), where n is the size of the string as defined below, and is initialized with the given characters...
The first code is ill-formed in C++11 and later, because a string literal cannot be converted to char*. Return const char* instead, or copy the string into a char array and return that. The second code will compile, and because i is static it is not destroyed when the function returns, so the pointer stays valid. What is bad is returning the address of a non-static local variable. Is there any specific reason for wanting this behavior?
You can also new/malloc a char array or int and return it. The allocated object is guaranteed to stay alive as long as you do not manually free/delete the memory, so you can access it from other functions.
Hope this helps.
Your first function is not valid; it should return const char*. And yes, you can return addresses of static variables: they are not destroyed when the function returns, because they are allocated in the static data memory segment.