The following code uses the heap:
char* getResult(int length) {
char* result = new char[length];
// Fill result...
return result;
}
int main(void) {
char* result = getResult(100);
// Do something...
delete result;
}
So result has to be deleted somewhere, preferably by the owner.
The code below, from what I understand, use an extension called VLA, which is part of C99, and not part of the C++ standard (but supported by GCC, and other compilers):
char* getResult(int length) {
char result[length];
// Fill result...
return result;
}
int main(void) {
char* result = getResult(100);
// Do something...
}
Am I correct in assuming that result is still allocated on the stack in this case?
Is result a copy, or is it a reference to garbage memory? Is the above code safe?
Am I correct in assuming that result is still allocated on the stack in this case?
Correct. VLA have automatic storage duration.
Is result a copy, or is it a reference to garbage memory? Is the above code safe?
The code is not safe. The address returned by getResult is an invalid address. Dereferencing the pointer invokes undefined behavior.
You can not return it, in C it will have automatic storage duration(the object will not be valid once you leave the scope) and returning it will invoke undefined behavior, from the C99 draft standard section 6.2.4 Storage durations of objects paragraph 6:
For such an object that does have a variable length array type, its lifetime extends from the declaration of the object until execution of the program leaves the scope of the
declaration.27) If the scope is entered recursively, a new instance of the object is created each time. The initial value of the object is indeterminate.
In C++ we have to rely on the docs since it is extension in that case and the gcc docs on VLA says that it is deallocated when the scope ends:
These arrays are declared like any other automatic arrays, but with a length that is not a constant expression. The storage is allocated at the point of declaration and deallocated when the block scope containing the declaration exits.
When you return from getResult(), the char array result will go out of scope and be deallocated along with the stack frame for the function call. If you want to preserve the function structure, you'll have to call malloc and later free the memory.
Related
How do we know pointers are not initialized to NULL by default?
There is a similar questions directed at Why aren't pointers initialized with NULL by default?
Just for checking, here is a very simple code just to see if a pointer is set to NULL by default.
#include <iostream>
using namespace std;
int main()
{
int* x;
if(!x)
cout << "nullptr" << endl;
return 0;
}
and at the output, I received nullptr message. I appreciate if someone can clarify this.
How do we know pointers are not initialized to NULL by default?
Because we know that the standard says that default initialised pointer has an indeterminate value if it has automatic or dynamic storage. Quote from the standard (draft):
[dcl.init] If no initializer is specified for an object, the object is default-initialized. When storage for an object
with automatic or dynamic storage duration is obtained, the object has an indeterminate value, and if
no initialization is performed for the object, that object retains an indeterminate value until that value
is replaced. ...
And further:
[dcl.init] To default-initialize an object of type T means:
— If T is a (possibly cv-qualified) class type [pointer isn't a class, so we don't care]
— If T is an array type [pointer isn't an array, so we don't care]
— Otherwise, no initialization is performed.
I have declared a char (and also int) pointer without initializing it , and I got null pointers.
Reading an indeterminate value has undefined behaviour. Quote from the standard (draft):
[dcl.init] ... If an indeterminate value is produced by an evaluation, the behavior is undefined except in the
following cases: [cases which don't apply here]
The question you linked to handles variables with local storage duration exclusively, so I assume you refer to these as well.
Such variables are not initialised if you don't do so yourself, so they get the value of whatever was written in their memory location before (standard wording: their value is 'indeterminate') – nothing speaks against, though, that this memory already is zero – by pure accident!
You can try the following:
void test()
{
int* p; // uninitialized
std::cout << p << std::endl; // undefined behaviour!!!
// (that's what you most likely did already...)
// now something new: change the memory...
p = reinterpret_cast<int*>(static_cast<uintptr_t(0xaddadaad));
}
int main()
{
test();
// again something new: call it a SECOND time:
test();
}
As this is undefined behaviour there are no guarantees at all that you will get any meaningful output – chances are, though that the memory of first function call is reused in second one and you might get output ressembling to the following:
00000000
addadaad
So even if there just happened to be all zero memory at programme start, it might differ from that at some later point while your programme is running...
After reading many posts about this, I want to clarify the next point:
A* a = new A();
A* b = a;
delete a;
A* c = a; //illegal - I know it (in c++ 11)
A* d = b; //I suppose it's legal, is it true?
So the question is about using the value of copy of deleted pointer.
I've read, that in c++ 11 reading the value of a leads to undefined behaviour - but what about reading the value of b?
Trying to read the value of the pointer (note: this is different to
dereferencing it) causes implementation-defined behaviour since C++14,
which may include generating a runtime fault. (In C++11 it was
undefined behaviour)
What happens to the pointer itself after delete?
Both:
A* c = a;
A* d = b;
are undefined in C++11 and implementation defined in C++14. This is because a and b are both "invalid pointer values" (as they point to deallocated storage space), and "using an invalid pointer value" is either undefined or implementation defined, depending on the C++ version. ("Using" includes "copying the value of").
The relevant section ([basic.stc.dynamic.deallocation]/4) in C++11 reads (emphasis added):
If the argument given to a deallocation function in the standard library is a pointer that is not the null pointer value (4.10), the deallocation function shall deallocate the storage referenced by the pointer, rendering invalid all pointers referring to any part of the deallocated storage. The effect of using an invalid pointer value (including passing it to a deallocation function) is undefined.
with a non-normative note stating:
On some implementations, it causes a system-generated runtime
In C++14 the same section reads:
If the argument given to a deallocation function in the standard library is a pointer that is not the null pointer value (4.10), the deallocation function shall deallocate the storage referenced by the pointer, rendering invalid all pointers referring to any part of the deallocated storage. Indirection through an invalid pointer value and passing an invalid pointer value to a deallocation function have undefined behavior. Any other use of an invalid pointer value has implementation-defined behavior.
with a non-normative note stating:
Some implementations might define that copying an invalid pointer value causes a system-generated runtime fault
These 2 lines do not have any difference (meaning legality for C++):
A* c = a; //illegal - I know it (in c++ 11)
A* d = b; //I suppose it's legal, is it true?
Your mistake (and it is pretty common) to think if you call delete on a it makes it any different than b. You should remember that when you call delete on a pointer you pass argument by value, so memory, where a points to after delete is not usable anymore, but that call does not make a any different than b in your example.
You should not use the pointer after delete. My below example with acessing a is based on implementation-defined behaviour.
(thanks to for M.M and Mankarse for pointing this)
I feel that it is not the variable a (or b, c, d) that is important here, but that the value (=the memory address of a deallocated block) which in some implementations can trigger a runtime fault when used in some 'pointer context'.
This value may be an rvalue/expression, not necessarily the value stored in a variable - so I do not believe the value of a ever changes (I am using the loose 'pointer context' to distinguish from using the same value, i.e. the same set of bits, in non-pointer related expressions - which will not cause a runtime fault).
------------My original post is below.---------------
Well, you are almost there with your experiment. Just add some cout's like here:
class A {};
A* a = new A();
A* b = a;
std::cout << a << std::endl; // <--- added here
delete a;
std::cout << a << std::endl; // <--- added here. Note 'a' can still be used!
A* c = a;
A* d = b;
Calling delete a does not do anything to the variable a. This is just a library call. The library that manages dynamic memory allocation keeps a list of allocated memory blocks and uses the value passed by variable a to mark one of the previously allocated blocks as freed.
While it is true what Mankarse cites from C++ documentation, about: "rendering invalid all pointers referring to any part of the deallocated storage" - note that the value of variable a remains untouched (you did not pass it by reference, but by value !).
So to sum up and to try to answer your question:
Variable a still exists in the scope after delete. The variable a still contains the same value, which is the address of the beginning of the memory block allocated (and now already deallocated) for an object of class A. This value of a technically can be used - you can e.g. print it like in my above example – however it is hard to find a more reasonable use for it than printing/logging the past...
What you should not do is trying to de-reference this value (which you also keep in variables b, c, and d) – as this value is not a valid memory pointer any longer.
You should never rely on the object being in the deallocated storage (while it is quite probable that it will remain there for some while, as C++ does not require to clear the storage freed after use) - you have no guarantees and no safe way to check this).
I wonder why I would need the second version?
int* p; // version 1
int* p = new int; // version 2
In the first version, the pointer isn't pointing at anything, it is undefined. Version 2 allocated memory and points p to that new memory. You are not allocating space for the pointer itself but memory for the pointer to point at. (In both versions the pointer itself is on the stack)
Assuming that the code appears in a function:
The first one defines a local variable of type int* (that is, a pointer). The variable is not initialized, which means the pointer doesn't have a value. It doesn't point at anything. It's nearly useless, about the only thing you can do with it is assign a pointer value to it[*]. So you think to yourself, "can I hold off defining the variable until I have a value to assign to it?"
The second one defines a local variable of type int* (that is a pointer), and also dynamically allocates an object of type int and assigns the address of that object to the pointer variable. So the pointer points to the int.
Dynamically allocating one int is nearly always a bad idea. It's not useless in the sense that you do at least have an int and a means to access it. But you've created a problem for yourself in that you have to keep track of it and free it.
[*] other things you can do with an uninitialized int* variable: take the address of the variable; bind it to a reference of type int*&; convert the address of the variable to char* and examine the memory one byte at a time, just to see what your implementation has put in that uninitialized variable. Nothing exciting and, crucially, nothing involving any int objects because you have none.
The first pointer
The first pointer, declared as:
int* p;
only allocates the memory needed to to store a pointer to int. The actual size is implementation defined. What the p object contains is indeterminate as per 8.5/12:
If no initializer is specified for an object, the object is default-initialized. When storage for an object with automatic or dynamic storage duration is obtained, the object has an indeterminate value, and if no initialization is performed for the object, that object retains an indeterminate value until that value is replaced (5.17).
This means that dereferencing the pointer will lead to undefined behavior.
The second pointer
The second pointer, declared as:
int* p = new int;
dynamically allocates an int. This means that the lifetime of the object will terminate either at the exit of the program (not sure if the standard actually enforces this, but I'm pretty sure the underlying OS will take back the unused memory once the program is done executing) or when you free it.
This pointer can be dereferenced safely, unless operator new failed to allocate memory (in which case it will throw std::bad_alloc or, since C++11, another exception derived from std::bad_alloc).
Why the second pointer shouldn't be used in most cases
Memory management is an hard topic. The main tip that I can give you, is to avoid new and delete like a plague. Whenever you can do something in any other standard way, you should prefer it.
For example, in this case, the only reason I can come up with to justify such a technique is to have an optional parameter. You could, and should, std::optional instead.
My understanding is that in C and C++, creating a character array by calling:
char *s = "hello";
actually creates two objects: a read-only character array that is created in static space, meaning that it lives for the entire duration of the program, and a pointer to that memory. The pointer is a local variable to its scope then dies.
My question is what happens to the array when the pointer dies? If I execute the code above inside a function, does this mean I have a memory leak after I exit the function?
it lives for the entire duration of the program
Exactly, formally it has static storage duration.
what happens to the array when the pointer dies?
Nothing.
If I execute the code above inside a function, does this mean I have a memory leak after I exit the function?
No, because of (1). (The array is only "freed" when the program exits.)
No, there is no leak.
The literal string is stored in the program's data section, which is typically loaded into a read-only memory page. All equivalent string literals will typically point to the same memory location -- it's a singleton, of sorts.
char const *a = "hello";
char const *b = "hello";
printf("%p %p\n", a, b);
This should display identical values for the two pointers, and successive calls to the same function should print the same values too.
(Note that you should declare such variables as char const * -- pointer to constant character -- since the data is shared. Modifying a string literal via a pointer is undefined behavior. At best you will crash your program if the memory page is read-only, and at worst you will change the value of every occurrence of that string literal in the entire program.)
const char* s = "Hello"; is part of the code (program) - hence a constant never altered (unless you have some nasty mechanism altering code at runtime)
My question is what happens to the array when the pointer dies? If I
execute the code above inside a function, does this mean I have a
memory leak after I exit the function?
No there will be no memory leak and nothing happens to the array when the pointer dies.
A memory leak could be possible only with dynamic allocation, via malloc(). When you're malloc() something, you have to free() it later. If you don't, there will be a memory leak.
In your case, it's a "static allocation": the allocation and free of this memory space will be freed automatically and you don't have to handle that.
does this mean I have a memory leak after I exit the function?
No, there is no memory leak, string literals have static duration and will be freed when the program is done. Quote from the C++ draft standard section 2.14.5 String literals subsection 8:
Ordinary string literals and UTF-8 string literals are also referred to as narrow string literals. A narrow string literal has type “array of n const char”, where n is the size of the string as defined below, and has static storage duration
Section 3.7.1 Static storage duration says:
[...] The storage for these entities shall last for the duration of the program
Note in C++, this line:
char *s = "hello";
uses a deprecated conversion see C++ warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings] for more details.
The correct way would be as follows:
const char *s = "hello";
you only have to free if you use malloc or new
EDIT:
char* string = "a string"; memory allocation is static, and not good practice (if it will be constant the declaration should be a const char*)
because this is in the stack when the function ends it should be destroyed along with the rest of the local variables and arguments.
you need to use specific malloc/free and new/delete when you allocate the memory for your variable like:
char *string = new char[64]; --> delete string;
char *string = malloc(sizeof(char) * 64); --> free(string); //this is not best practice unless you have to use C
Is this always going to run as expected?
char *x;
if (...) {
int len = dynamic_function();
char x2[len];
sprintf(x2, "hello %s", ...);
x = x2;
}
printf("%s\n", x);
// prints hello
How does the compiler (GCC in my case) implement variably sized arrays, in each of C and C++?
No. x2 is local to the if statement's scope and you access it outside of it using a pointer. This results in undefined behaviour.
By the way, VLAs have been made optional in C11 and had never been part of C++. So it's better to avoid it.
The scope is explained here:
Jumping or breaking out of the scope of the array name deallocates the
storage. Jumping into the scope is not allowed; you get an error
message for it.
In your case the array is out of scope.
No, for two separate reasons:
C++: The code isn't valid C++. Arrays in C++ must have a compile-time constant size.
C: No, because the array only lives until the end of the block in which it was declared, and thus dereferencing x is undefined behaviour.
From C11, 6.2.4/2:
If an object is referred to outside of its lifetime, the behavior is undefined.
And 6.2.4/7 says that the variable-length array lives from its declaration until the end of its enclosing scope:
For such an object that does have a variable length array type, its lifetime extends from
the declaration of the object until execution of the program leaves the scope of the
declaration.