I am used to programming in C, and in that language I would just return a pointer to the data, and the caller would then be responsible for freeing it. However, from what I've read, the vector's destructor will be called as soon as it goes out of scope, causing its data to be deallocated.
Once the reference is returned, the size of the contents will not change, so if I could just have a pointer to the data that I could manually delete afterwards that would be ideal. What I really do not want to do is copy all of the data into a new container, since this vector is going to grow to be very large.
Any help would be appreciated. Every solution I have seen so far involves either copying by value and relying on the compiler to optimize it, or using additional classes to wrap the vector.
Edit: To be clear, the only part of the vector which I want to keep is a pointer to its data (i.e. the pointer you get using the vector.data() method). I don't need to keep any other information about the original vector.
Good (safe and fast) solution for the use case that you describe: Don't return a reference, return the vector to the outside of the scope instead.
Is it possible to return a reference to the data in an std::vector, and preserve that reference after the vector goes out of scope?
Yes... if you use static storage. The lifetime of objects with static storage duration extends until the end of the program. Therefore they stay alive when they go out of scope. Note that static storage is global state, which is often problematic. Avoid if you can and use with care.
With automatic storage: No.
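To illustrate the static-storage option, here is a minimal sketch. The names persistentValues and staticStorageDemo are made up for this example; the point is only that a function-local static vector outlives every scope that obtains a reference to it.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: `values` has static storage duration, so the
// returned reference stays valid until the program exits.
std::vector<int>& persistentValues() {
    static std::vector<int> values;  // constructed once, on the first call
    return values;
}

// Hypothetical demo: fills the vector in one scope, reads it in another.
void staticStorageDemo() {
    persistentValues().clear();
    {
        std::vector<int>& inner = persistentValues();
        inner.push_back(1);
        inner.push_back(2);
    }  // `inner` goes out of scope; the vector itself does not
    assert(persistentValues().size() == 2);
}
```

Every caller shares the same vector, which is exactly the global-state drawback mentioned above.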
I would just return a pointer to the data, and then the caller would be responsible for freeing the data
This is a problematic approach. How does the caller know that they are responsible for freeing the data? How does the caller know how to free the data? How does the caller know when they can free the data (in case there are other users of the data)? This all relies on the caller to read the documentation, understand it, and not make a mistake. This is a source of many memory leaks, accesses through invalid pointers and double free crashes.
std::vector solves those problems (to some degree; there are no interfaces in C or C++ that cannot be misused if one tries hard enough) by keeping the lifetime tied to the container object.
If I could just have a pointer to the data that I could manually delete afterwards that would be ideal.
You cannot have that. std::vector always destroys its data, and it is not possible to steal the data to the outside of the vector (except into another vector through move or swap).
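As a rough sketch of the "another vector through move or swap" route: the buffer can change owner without being copied, as long as the new owner is also a vector. The function name transferDemo and the variable names are invented for this example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical demo: the heap buffer changes owner without being copied.
void transferDemo() {
    std::vector<int> keeper;  // outlives the inner scope
    const int* buffer = nullptr;
    {
        std::vector<int> local(1000, 7);
        buffer = local.data();
        keeper.swap(local);  // keeper now owns what used to be local's buffer
    }  // local is destroyed here, holding only keeper's old (empty) state
    assert(keeper.size() == 1000);
    assert(keeper.data() == buffer);  // same allocation as before the swap
}
```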
The whole point of a reference is that it doesn't own the object. It's perfectly valid to use a std::unique_ptr here, which owns the object and can be returned to your caller.
Your function would be defined as follows (assuming a vector of integers and C++14 or newer):
#include <memory>
#include <vector>

std::unique_ptr<std::vector<int>> getVector() {
    auto vec = std::make_unique<std::vector<int>>(/* any ctor args you want */);
    // for example
    vec->push_back(1);
    vec->push_back(2);
    return vec;  // ownership moves to the caller
}
A caller can then do as follows:
int main() {
    auto vec = getVector();
    std::cout << vec->size() << std::endl;
}
and the vector will be safely deleted as the unique_ptr goes out of scope. Note that if you are using C++11, you won't have std::make_unique and will need to do something like:
std::unique_ptr<std::vector<int>> vec(new std::vector<int>(/* ctor args */));
Just write your code like this:
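std::vector<blah> my_function_that_returns_a_vector()
{
    std::vector<blah> v;
    // ... code to populate v ...
    return v;
}
NRVO (named return value optimization) will then typically eliminate the copy: the returned vector is constructed directly in the caller's stack frame. Even where NRVO is not applied, a C++11 compiler moves the vector out rather than copying its elements.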
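One way to convince yourself that no element copy happens is to compare the address of the data before and after the return. This is only a sketch: buildValues and returnByValueDemo are hypothetical names, and the check relies on the usual behaviour of mainstream standard libraries, where both NRVO and a vector move leave the heap buffer where it is.

```cpp
#include <cassert>
#include <vector>

// Hypothetical function: records where the elements live before returning.
std::vector<int> buildValues(const int*& whereBuilt) {
    std::vector<int> v(1000, 42);
    whereBuilt = v.data();
    return v;  // NRVO, or at worst a move; not an element-wise copy
}

void returnByValueDemo() {
    const int* whereBuilt = nullptr;
    std::vector<int> result = buildValues(whereBuilt);
    assert(result.data() == whereBuilt);  // the elements did not move
    assert(result.size() == 1000);
}
```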
It is not possible to preserve the reference to a vector after it goes out of scope because, as you mentioned, the vector will be destroyed. Once the vector is destroyed, the memory at that address must no longer be accessed. Alternatively, you can declare the variable outside of the scope and modify it within the scope. You mention returning by reference: could you instead pass the vector into the function by reference and simply return void? This way, you can modify the same vector in different scopes.
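A minimal sketch of that out-parameter approach, with made-up names (fillValues, outParameterDemo): the caller owns the vector, and the function only fills it.

```cpp
#include <cassert>
#include <vector>

// Hypothetical function: fills a vector owned by the caller.
void fillValues(std::vector<int>& out, int count) {
    out.clear();
    for (int i = 0; i < count; ++i) {
        out.push_back(i);
    }
}

void outParameterDemo() {
    std::vector<int> values;  // lives in the caller's scope
    fillValues(values, 5);
    assert(values.size() == 5);
    assert(values.back() == 4);
}
```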
I come from a C background. There I would allocate memory (an array, for example), pass the pointer somewhere else, and the pointer stayed valid even after the scope where it was allocated had ended.
Now, if I do the same with vectors: initialize one, pass it by reference, and then keep that reference saved somewhere. After the method that actually created the vector goes out of scope, would the vector be destroyed? Or, because a pointer to it is still saved, will it be kept?
std::vector frees the memory it holds when it is destroyed. Holding a reference to an object that gets destroyed is very bad. Accessing it is UB. General rule: Do not store references, only use them where you can be sure that the object exists for the entire scope, e.g. as parameter of a function.
If you want to keep a copy of the data, simply copy the std::vector. No reference needed.
If you want to be able to access the data from different locations, and have it live as long as at least one location still has a reference/pointer to it, don't use std::vector, use std::shared_ptr.
If you want to combine the benefits of std::vector with the benefits of shared memory that lives until the last place lets go of it, combine them: std::shared_ptr<std::vector<...>>.
If you want to have the std::vector live in one place for a bit, and then live in another place but not in the first place anymore, use move semantics:
std::vector<int> foo = {1, 2, 3};
std::vector<int> bar = std::move(foo); // bar now holds the data that foo held, foo is empty, no copy was performed
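For the shared-ownership option mentioned above, a small sketch may help. Holder and sharedOwnershipDemo are hypothetical names; the vector stays alive for as long as at least one std::shared_ptr refers to it.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical holder that keeps the data alive for as long as it exists.
struct Holder {
    std::shared_ptr<std::vector<int>> data;
};

void sharedOwnershipDemo() {
    Holder holder;
    {
        auto values = std::make_shared<std::vector<int>>();
        values->push_back(1);
        values->push_back(2);
        holder.data = values;  // second owner
        assert(values.use_count() == 2);
    }  // `values` is gone, but the vector is still owned by `holder`
    assert(holder.data->size() == 2);
    assert(holder.data.use_count() == 1);
}
```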
A pointer to an array on the stack will not keep the array alive; I suppose that's the same in C. Now, a vector is not an array. It is a wrapper around a heap-allocated array, and it frees the memory of that array when it goes out of scope.
... because a pointer to it is still saved it will still be kept?
No! A pointer or reference to a std::vector will not keep the vector alive. The canonical wrong example is:
std::vector<T>& foo() {
std::vector<T> x;
return x;
} // baaam !
The reference returned from the function is dangling. You cannot do anything with it. The correct way would be to return by value and rely on return value optimization:
std::vector<T> foo() {
std::vector<T> x;
return x;
}
If you do:
auto y = foo();
No copying is involved, thanks to NRVO.
PS: Compilers should warn you about returning a reference to a local variable.
Now if I do the same with vectors, I initialize a vector, pass it by reference, then keep the reference saved somewhere. After the initial method that actually created the vector goes out of scope, would it be destroyed?
Yes.
or because a pointer to it is still saved it will still be kept?
No.
You can have dangling pointers in C++ just like you can in C. It's exactly the same.
This is also often the case for references (though in some cases the lifetime of the object is extended for a little bit).
It is true that the vector's data is internally managed by the vector, but that's by-the-by since you're asking about the vector itself. Forget the vector and ask the same question about an int, then you'll realise the answer is just as you'd expect it to be in C.
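The lifetime-extension case mentioned above can be sketched like this (makeValues and lifetimeExtensionDemo are made-up names). It only applies when a temporary is bound directly to a const reference; it does not rescue a reference to a local variable returned from a function.

```cpp
#include <cassert>
#include <vector>

// Hypothetical function returning a vector by value.
std::vector<int> makeValues() {
    return std::vector<int>(3, 9);
}

void lifetimeExtensionDemo() {
    // Binding a temporary directly to a const reference extends the
    // temporary's lifetime to that of the reference.
    const std::vector<int>& values = makeValues();
    assert(values.size() == 3);
    assert(values[0] == 9);
}  // the temporary vector is destroyed here, together with `values`
```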
The existing answers are good, will add here:
block A
{
vector<int> my_vec(10, 0);
//...something...
}
block B
The my_vec vector will go out of scope and be destroyed at the closing bracket. That's how STL (Standard Template Library) vectors behave.
You can also use C-style arrays (but with different syntax). For very large arrays, I have found the allocation time with a dynamically-allocated array to be faster than STL vectors.
// automatic (stack) array: goes out of scope, and its memory is reclaimed, at the end of the code block
int arr0[10];
// dynamic array: not destroyed unless delete[] is called
int* arr1 = new int[10];
//...work with arr1...
delete [] arr1;
Just like in C, you need to take care of de-allocating any memory you create using new.
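If you do want a dynamically allocated array but not the manual delete[], one option (not what the snippet above uses) is std::unique_ptr<int[]> from C++11. A minimal sketch with hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical helper: returns a heap array whose owner is explicit.
std::unique_ptr<int[]> makeArray(std::size_t n) {
    std::unique_ptr<int[]> arr(new int[n]());  // elements value-initialized to 0
    return arr;
}

void ownedArrayDemo() {
    std::unique_ptr<int[]> arr = makeArray(10);
    arr[3] = 7;
    assert(arr[3] == 7);
    assert(arr[0] == 0);
}  // delete[] runs automatically here
```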
I was thinking about this situation, not for a real implementation but to better understand how pointers work.
class foo {
public:
    void doComplexThings(const std::vector<int*>& v) {
        int* copy;
        for (std::size_t i = 0; i < v.size(); i++) {
            copy = v[i];
            // do some stuff
        }
    }
};
int main() {
    std::vector<int*> myVector; // suppose we have 100 elements
    foo f;
    f.doComplexThings(myVector);
    for (std::size_t i = 0; i < myVector.size(); i++) {
        delete myVector[i];
    }
    myVector.clear();
}
OK, I know that it makes no sense to copy v[i] into another pointer, but I was wondering: does copy cause a memory leak?
After doComplexThings() has executed, will copy continue to exist and occupy space on the heap?
After all elements are deleted, will it continue to exist and point to deallocated memory?
So logically, if I do this with complex objects, will I keep occupying memory with an unreferenced object? Or is copy stored on the stack because I don't use new, and destroyed at the end of doComplexThings?
I'm a bit confused, thanks!
There is some confusion on the topic of pointers in the C++ community. While it is true that smart pointers have been added to the library to alleviate problems with dynamic memory allocation, raw pointers are not obsolete. In fact, whenever you want to inspect another object without owning it, you should use a reference or raw pointer, depending on which suits your needs. If the concept of ownership is unclear to you, think of an object as being owned by another object if the latter is responsible for cleaning up afterwards (deleting the former).
For example, most uses of new and delete should be replaced with the following (omitting std for brevity):
{
    auto ptr_to_T = make_unique<T>(/* constructor params */);
    do_stuff_with_smart_ptr(ptr_to_T);
    do_stuff_with_T(*ptr_to_T);
    do_stuff_with_raw_ptr(ptr_to_T.get());
} // automatic release of memory allocated with make_unique()
Notice how a function that takes a T* doesn't need a smart pointer if it doesn't keep a copy of the T* it is given, because it doesn't affect the lifetime of the object. The object is guaranteed to be alive past the return point of do_stuff_with_raw_ptr(), and the function's signature signals that it doesn't own the object by taking a raw pointer.
On the other hand, if you need to pass the pointer to an object that is allowed to keep the pointer and reference it later, it is unclear when the object will need to be destroyed and most importantly by whom. This is solved via a shared pointer.
ClassThatNeedsSharedOwnership shared_owner;
{
    auto ptr_to_T = make_shared<T>(/* constructor params */);
    shared_owner.set_T(ptr_to_T);
    // do a lot of stuff
}
// At this point ptr_to_T is destroyed, but shared_owner might keep the object alive
So how does the above factor into your code? First of all, if the vector is supposed to own (keep alive) the ints it points to, it needs to hold unique_ptr<int> or shared_ptr<int>. If it is just pointing to ints held by something else, and they are guaranteed to be alive until after the vector is destroyed, you are fine with int*. In this case, it should be evident that a delete is never necessary, because by definition your vector and the function working on the vector are not responsible for cleaning up!
Finally, you can make your code more readable by changing the loop to this (C++11 which you've tagged in the post):
for (auto copy : v){
// equivalent to your i-indexed loop with copy = v[i];
// as long as you don't need the value of i
do_stuff_to_int_ptr(copy);
// no delete, we don't own the pointee
}
Again this is only true if some other object holds the ints and releases them, or they are on the stack but guaranteed to be alive for the whole lifetime of vector<int*> that points to them.
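For the other case, where the vector is meant to own the ints, a sketch could look like the following. The names sumAll and owningVectorDemo are invented here; the function still works through plain non-owning pointers, and nobody writes a delete loop.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Non-owning access: the function only reads through the pointers.
int sumAll(const std::vector<std::unique_ptr<int>>& v) {
    int total = 0;
    for (const auto& item : v) {
        const int* view = item.get();  // raw, non-owning pointer; never deleted
        total += *view;
    }
    return total;
}

void owningVectorDemo() {
    std::vector<std::unique_ptr<int>> owned;
    for (int i = 1; i <= 3; ++i) {
        owned.push_back(std::unique_ptr<int>(new int(i)));
    }
    assert(sumAll(owned) == 6);
}  // each int is deleted when `owned` is destroyed; no delete loop needed
```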
No additional memory is allocated on the heap when you do this:
copy = v[i];
variable copy points to the same address as v[i], but no additional array is allocated, so there would be no memory leak.
A better way of dealing with the situation is to avoid raw pointers in favor of C++ smart pointers or containers:
std::vector<std::vector<int>> myVector;
Now you can remove the deletion loop, which is an incorrect way of doing it for arrays allocated with new int[length] - it should use delete[] instead:
delete[] myVector[i];
Basically you're illustrating the problem with C pointers which led to the introduction of C++ unique and shared pointers. If you pass a vector of allocated pointers to an opaque member function, you have no way of knowing whether that function hangs onto them or not, so you don't know whether to delete the pointers. In fact, in your example the function does not hang onto them: "copy" simply goes out of scope.
The real answer is that you should only seldom use allocated pointers in C++ at all. The stl vector will serve as a safer, easier to use version of malloc / new. Then you should pass them about as const & to prevent functions from changing them. If you do need an allocated pointer, make one unique_ptr() and then you know that the unique_ptr() is the "owner" of the memory.
It is the first time I am using STL and I am confused about how I should deallocate the memory used by these containers. For example:
class X {
private:
    map<int, int> a;
public:
    X();
    //some functions
};
Now let us say I define the constructor as:
X::X() {
    for(int i=0; i<10; ++i) {
        a[i]=i;
    }
}
Now my question is should I write the destructor for this class or the default C++ destructor will take care of deallocating the memory(completely)?
Now consider the modification to above class
class X {
private:
    map<int, int*> a;
public:
    X();
    ~X();
    //some functions
};
Now let us say I define the constructor as:
X::X() {
    for(int i=0; i<10; ++i) {
        int *k= new int;
        a[i]=k;
    }
}
Now I understand that for such a class I need to write a destructor, as the memory allocated by new cannot be released by the default destructor of the map container (it calls the destructor of the contained objects, which in this case are pointers). So I attempted to write the following destructor:
X::~X() {
    for(int i=0; i<10; ++i) {
        delete a[i];
    }
    //to delete the memory occupied by the map.
}
I do not know how to delete the memory occupied by the map. There is a clear function, but it only claims to bring the size of the container down to 0, not necessarily to deallocate the memory underneath. The same is true of vectors (and, I guess, of other STL containers, but I have not checked them).
Any help appreciated.
should I write the destructor for this class or the default C++ destructor will take care of deallocating the memory(completely)?
Yes it will. All the standard containers follow the principle of RAII, and manage their own dynamic resources. They will automatically free any memory they allocated when they are destroyed.
I do not know how to delete the memory occupied by the map.
You don't. You must delete something if and only if you created it with new. Most objects have their memory allocated and freed automatically.
The map itself is embedded in the X object being destroyed, so it will be destroyed automatically, and its memory will be freed along with the object's, once the destructor has finished.
Any memory allocated by the map is the responsibility of the map; it will deallocate it in its destructor, which is called automatically.
You are only responsible for deleting the dynamically allocated int objects. Since it is difficult to ensure you delete these correctly, you should always use RAII types (such as smart pointers, or the map itself) to manage memory for you. (For example, you have a memory leak in your constructor if a use of new throws an exception; that's easily fixed by storing objects or smart pointers rather than raw pointers.)
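A rough sketch of the "store smart pointers rather than raw pointers" fix. Tracked is a hypothetical element type that counts its live instances, used here only to make the cleanup observable; the class X below needs no destructor at all.

```cpp
#include <cassert>
#include <map>
#include <memory>

// Hypothetical element type that counts live instances.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

class X {
public:
    X() {
        for (int i = 0; i < 10; ++i) {
            a[i] = std::unique_ptr<Tracked>(new Tracked);
        }
    }
    // No user-written destructor: the map destroys each unique_ptr,
    // and each unique_ptr deletes its Tracked object.
private:
    std::map<int, std::unique_ptr<Tracked>> a;
};

void mapOwnershipDemo() {
    {
        X x;
        assert(Tracked::alive == 10);
    }
    assert(Tracked::alive == 0);
}
```

If one of the allocations in the constructor throws, the entries already stored in the map are released as well, because the fully constructed member a is destroyed during stack unwinding.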
When an STL collection is destroyed, the destructor of each contained object is called.
This means that if you have
class YourObject {
public:
    YourObject() { }
    ~YourObject() { }
};
map<int, YourObject> data;
Then the destructor of YourObject is called.
On the other hand, if you are storing pointers to objects, as in
map<int, YourObject*> data;
then only the pointer itself is destroyed; the destructor of the pointed-to object is not called.
The solution is to use something that can hold your object, like a shared_ptr: a special object that takes care of deleting the held object when there are no more references to it.
Example:
map<int, shared_ptr<YourObject>>
If you ignore the type of container you're dealing with and just think of it as a container, you'll notice that anything you put in the container is owned by whoever owns the container. This also means that it's up to the owner to delete that memory. Your approach is sufficient to deallocate the memory that you allocated. Because the map itself is not allocated with new (it is a member of X), its destructor will be called automatically.
Alternatively, a best practice for this type of situation is to use shared_ptr or unique_ptr, rather than a raw pointer. These wrapper classes will deallocate the memory for you, automatically.
map<int, shared_ptr<int>> a;
See http://en.cppreference.com/w/cpp/memory
The short answer is that the container will normally take care of deleting its contents when the container itself is destroyed.
It does that by destroying the objects in the container. As such, if you wanted to badly enough, you could create a type that mismanaged its memory by allocating memory (e.g., in its ctor) but doesn't release it properly. That should obviously be fixed by correcting the design of those objects though (e.g., adding a dtor that releases the memory they own). Alternatively, you could get the same effect by just storing a raw pointer.
Likewise, you could create an Allocator that didn't work correctly -- that allocated memory but did nothing when asked to release memory.
In every one of these cases, the real answer is "just don't do that".
If you have to write a destructor (or cctor or op=) it indicates you likely do something wrong. If you do it to deallocate a resource more likely so.
The exception is the RAII handler for resources, that does nothing else.
In regular classes you use proper members and base classes, so your dtor has no work of its own.
STL classes all handle themselves, so having a map you have no obligations. Unless you filled it with dumb pointers to allocated memory or something like that -- where the first observation kicks in.
Your second X::X() sample is broken in many ways: if an exception is thrown on the 5th new, you leak the first 4. And if you want to handle that situation by hand, you end up with a mess of code.
That is all avoided if you use a proper smart thing, like unique_ptr or shared_ptr instead of int*.
How does container object like vector in stl get destroyed even though they are created in heap?
EDIT
If the container holds pointers then how to destroy those pointer objects
An STL container of pointers will NOT clean up the data pointed at. It will only clean up the space holding the pointers. If you want the vector to clean up pointer data you need to use some kind of smart pointer implementation:
{
std::vector<SomeClass*> v1;
v1.push_back(new SomeClass());
std::vector<boost::shared_ptr<SomeClass> > v2;
boost::shared_ptr<SomeClass> obj(new SomeClass);
v2.push_back(obj);
}
When that scope ends both vectors will free their internal arrays. v1 will leak the SomeClass that was created since only the pointer to it is in the array. v2 will not leak any data.
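The same comparison can be made observable with an instance counter. This sketch uses std::shared_ptr (C++11) in place of boost::shared_ptr, and the counting SomeClass below is a stand-in written for this example.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Stand-in class that counts live instances.
struct SomeClass {
    static int alive;
    SomeClass() { ++alive; }
    ~SomeClass() { --alive; }
};
int SomeClass::alive = 0;

void leakDemo() {
    SomeClass* leaked = nullptr;
    {
        std::vector<SomeClass*> v1;
        v1.push_back(new SomeClass());
        leaked = v1.back();  // kept only so this demo can clean up afterwards

        std::vector<std::shared_ptr<SomeClass>> v2;
        v2.push_back(std::make_shared<SomeClass>());
        assert(SomeClass::alive == 2);
    }
    assert(SomeClass::alive == 1);  // v2's object is gone, v1's is not
    delete leaked;                  // the manual cleanup that v1 never did
    assert(SomeClass::alive == 0);
}
```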
If you have a vector<T*>, your code needs to delete those pointers before delete'ing the vector: otherwise, that memory is leaked.
Know that C++ doesn't do garbage collection; here is an example of why (apologies for syntax errors, it has been a while since I've written C++):
typedef vector<T*> vt;
⋮
vt *vt1 = new vt, *vt2 = new vt;
T* t = new T;
vt1->push_back(t);
vt2->push_back(t);
⋮
delete vt1;
The last line (delete vt1;) clearly should not delete the pointer it contains; after all, it's also in vt2. So it doesn't. And neither will the delete of vt2.
(If you want a vector type that deletes pointers on destroy, such a type can of course be written. Probably has been. But beware of delete'ing pointers that someone else is still holding a copy of.)
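With std::shared_ptr as the element type, the "someone else is still holding a copy" problem is handled by reference counting. A minimal sketch, where T is a placeholder struct defined just for this example:

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Placeholder element type for this sketch.
struct T {
    int value = 0;
};

void sharedElementDemo() {
    std::vector<std::shared_ptr<T>> vt2;
    {
        std::vector<std::shared_ptr<T>> vt1;
        auto t = std::make_shared<T>();
        t->value = 5;
        vt1.push_back(t);
        vt2.push_back(t);
        assert(t.use_count() == 3);  // t, vt1[0], vt2[0]
    }  // vt1 and t are destroyed; the T object survives in vt2
    assert(vt2[0].use_count() == 1);
    assert(vt2[0]->value == 5);
}  // the T object is deleted when vt2 goes away
```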
When a vector goes out of scope, the compiler issues a call to its destructor which in turn frees the allocated memory on the heap.
This is somewhat of a misnomer. A vector, as with most STL containers, consists of 2 logical parts.
the vector instance
the actual underlying array implementation
While configurable, #2 almost always lives on the heap. #1 however can live on either the stack or heap, it just depends on how it's allocated. For instance
void foo() {
vector<int> v;
v.push_back(42);
}
In this case part #1 lives on the stack.
Now how does #2 get destroyed? When the first part of a vector is destroyed, it will destroy the second part as well. This is done by deleting the underlying array inside the destructor of the vector class.
If you store pointers in STL container classes you need to manually delete them before the object gets destroyed. This can be done by looping through the whole container and deleting each item, or by using some kind of smart pointer class. However do not use auto_ptr as that just does not work with containers at all.
A good side effect of this is that you can keep multiple containers of pointers in your program but only have those objects owned by one of those containers, and you only need to clean up that one container.
The easiest way to delete the pointers would be to do:
for (ContainerType::iterator it(container.begin()); it != container.end(); ++it)
{
delete (*it);
}
Use either smart pointers inside of the vector, or use boost's ptr_vector. It will automatically free up the allocated objects inside of it. There are also maps, sets, etc.
http://www.boost.org/doc/libs/1_37_0/libs/ptr_container/doc/ptr_vector.html
and the main site:
http://www.boost.org/doc/libs/1_37_0/libs/ptr_container/doc/ptr_container.html
As with any other object in the heap, it must be destroyed manually (with delete).
To answer your first question:
There's nothing special about STL classes (I hope). They function exactly like other template classes. Thus, they are not automatically destroyed if allocated on the heap, because C++ has no garbage collection on them (unless you tell it to with some fancy autoptr business or something). If you allocate it on the stack (without new) it will most likely be managed by C++ automatically.
For your second question, here's a very simple ArrayOfTen class to demonstrate the basics of typical memory management in C++:
/* Holds ten Objects. */
class ArrayOfTen {
public:
    ArrayOfTen() {
        m_data = new Object[10];
    }
    ~ArrayOfTen() {
        delete[] m_data;
    }
    Object &operator[](int index) {
        /* TODO Range checking */
        return m_data[index];
    }
private:
    Object *m_data;
    /* Copying is disabled: declared private and never defined. */
    ArrayOfTen(const ArrayOfTen &);
    ArrayOfTen &operator=(const ArrayOfTen &);
};
ArrayOfTen myArray;
myArray[0] = Object("hello world"); // bleh
Basically, the ArrayOfTen class keeps an internal array of ten Object elements on the heap. When new[] is called in the constructor, space for ten Objects is allocated on the heap, and ten Objects are constructed. Similarly, when delete[] is called in the destructor, the ten Objects are destroyed and then the memory previously allocated is freed.
For most (all?) STL types, resizing is done behind the scenes to make sure there's enough memory set aside to fit your elements. The above class only supports arrays of ten Objects. It's basically a very limiting typedef of Object[10].
To delete the elements pointed at, I wrote a simple functor:
template<typename T>
struct Delete {
    void operator()( T* p ) const { delete p; }
};
std::vector< MyType* > v;
// ....
std::for_each( v.begin(), v.end(), Delete<MyType>() );
But you should fall back on shared pointers when the vector's contents are to be ... ehm... shared. Yes.
A functor that deletes pointers from STL sequence containers
The standard STL containers place a copy of the original object into the container, using the copy constructor. When the container is destroyed the destructor of each object in the container is also called to safely destroy the object.
Pointers are handled the same way.
The thing is pointers are POD data. The copy constructor for a pointer is just to copy the address and POD data has no destructor. If you want the container to manage a pointer you need to:
Use a container of smart pointers. (eg shared pointer).
Use a boost ptr container.
I prefer the pointer container:
The pointer containers are the same as the STL containers except you put pointers into them, but the container then takes ownership of the object the pointer points at and will thus deallocate the object (usually by calling delete) when the container is destroyed.
When you access members of a ptr container they are returned via reference so they behave just like a standard container for use in the standard algorithms.
int main()
{
boost::ptr_vector<int> data;
data.push_back(new int(5));
data.push_back(new int(6));
std::cout << data[0] << "\n"; // Prints 5.
std::cout << data[1] << "\n"; // Prints 6.
} // data deallocated.
// This will also de-allocate all pointers that it contains.
// by calling delete on the pointers. Therefore this will not leak.
One should also point out that smart pointers in a container are a valid alternative. Unfortunately, std::auto_ptr<> is not a valid choice of smart pointer for this situation.
This is because the STL containers assume that the objects they contain are copyable. std::auto_ptr<> is not copyable in the traditional sense: it destroys the original value on copy, and thus the source of the copy cannot be const.
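Since C++11, std::unique_ptr fills the role that std::auto_ptr could not: it is move-only, so the transfer of ownership has to be spelled out and the containers can handle it safely. A small sketch (moveOnlyElementDemo is a made-up name):

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

void moveOnlyElementDemo() {
    std::vector<std::unique_ptr<int>> data;
    std::unique_ptr<int> five(new int(5));
    // data.push_back(five);          // would not compile: unique_ptr is not copyable
    data.push_back(std::move(five));  // ownership is transferred explicitly
    assert(five == nullptr);          // the source no longer owns anything
    assert(*data[0] == 5);
}  // the int is deleted when `data` is destroyed
```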
STL containers are like any other objects: if you declare one as a local variable, it is created on the stack:
std::vector<int> vec(10);
Just like any other stack variable, it only lives in the scope of the function it is defined in, and doesn't need to be manually deleted. The destructor of STL containers will call the destructor of all elements in the container.
Keeping pointers in a container is a dicey issue. Since pointers don't have destructors, I would say you would never want to put raw pointers into an STL container. Doing this in an exception-safe way will be very difficult: you'd have to litter your code with try/catch blocks (C++ has no finally) to ensure that the contained pointers are always deallocated.
So what should you put into containers instead of raw pointers? +1 jmucchiello for bringing up boost::shared_ptr. boost::shared_ptr is safe to use in STL containers (unlike std::auto_ptr). It uses a simple reference counting mechanism, and is safe to use for data structures that don't contain cycles.
What would you need for data structures that contain cycles? In that case you probably want to graduate to garbage collection, which essentially means using a different language like Java. But that's another discussion. ;)
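For what it's worth, within C++ a common way to cope with cycles under reference counting is to make one direction of the link non-owning with std::weak_ptr (C++11; Boost has an equivalent). This is a different technique from garbage collection, and the Node type below is a hypothetical example, not something from the answer above.

```cpp
#include <cassert>
#include <memory>

// Hypothetical doubly linked node that counts live instances.
struct Node {
    static int alive;
    std::shared_ptr<Node> next;  // owning link
    std::weak_ptr<Node> prev;    // non-owning back link, breaks the cycle
    Node() { ++alive; }
    ~Node() { --alive; }
};
int Node::alive = 0;

void cycleDemo() {
    {
        auto first = std::make_shared<Node>();
        auto second = std::make_shared<Node>();
        first->next = second;
        second->prev = first;  // does not increase first's use count
        assert(first.use_count() == 1);
        assert(Node::alive == 2);
    }
    assert(Node::alive == 0);  // both nodes were released
}
```

Had prev been a shared_ptr as well, the two nodes would keep each other alive and the final assertion would fail.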