Use unique_ptr and appropriate container to do memory management

First of all, my motivation is to do efficient memory management on top of a C-like computational kernel. I tried to use std::unique_ptr and std::vector; my code looks like this:
// my data container
typedef std::unique_ptr<double> my_type;
std::vector<my_type> my_storage;
// when I need some memory for computation kernel
my_storage.push_back(my_type());
my_storage.back().reset(new double[some_length]);
// get pointer to do computational stuff
double *p_data=my_storage.back().get();
Note that in practice p_data may be stored in some other container (e.g. a map) to index each allocated array according to the domain problem. Nevertheless, my main questions are:
Is std::vector a good choice here? What about other containers such as std::list or std::set?
Is there a fundamental problem with my allocation method?
Suppose that after using p_data for some operations, I want to release the memory chunk pointed to by the raw pointer p_data. What is the best practice here?

First of all, if you are allocating an array you need to use the specialization std::unique_ptr<T[]> or you won't get a delete [] on memory release but a simple delete.
std::vector is a good choice unless you have an explicit reason to use something different. For example, if you are going to insert or remove many elements in the middle of the container, a std::list could perform better (fewer move operations to shift things around).
Regarding how to manage memory, it depends mainly on the pattern of utilization. If my_storage is mainly responsible for everything (which in your specification it is, since unique_ptr expresses ownership), it means that it will be the only one who can release memory. Which could be done simply by calling my_storage[i].reset().
Mind that storing raw pointers of managed objects inside other collections leads to dangling pointers if memory is released, for example:
using my_type = std::unique_ptr<double[]>;
using my_storage = std::vector<my_type>;
my_storage data;
data.push_back(my_type(new double[100]));
std::vector<double*> rawData;
rawData.push_back(data[0].get());
data.clear(); // delete [] is called on array and memory is released
*rawData[0] = 1.2; // accessing a dangling pointer -> bad
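Whether this is a problem depends on the order of destruction: if data is released last, there is no problem. Otherwise you could store const references to the std::unique_ptr objects, so that you can at least check whether the memory is still valid (as long as the std::unique_ptr objects themselves are neither destroyed nor relocated by the vector), e.g.: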
using my_type = std::unique_ptr<double[]>;
using my_managed_type = std::reference_wrapper<const my_type>;
std::vector<my_managed_type> rawData;

Using std::unique_ptr with any STL container, including std::vector, is fine in general. But you are not using std::unique_ptr correctly (you are not using its array specialization), and you don't need to resort to back().reset() at all. Try this instead:
// my data container
typedef std::unique_ptr<double[]> my_type;
// or: using my_type = std::unique_ptr<double[]>;
std::vector<my_type> my_storage;
my_type ptr(new double[some_length]);
my_storage.push_back(std::move(ptr));
// or: my_storage.push_back(my_type(new double[some_length]));
// or: my_storage.emplace_back(new double[some_length]);
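To tie this back to the third question (releasing one chunk later), here is a minimal sketch of one way to do it. It assumes callers keep an index into the vector instead of a raw pointer; the class name BufferStore and its member functions are made up for illustration and are not from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical owner of all kernel buffers; callers keep handles (indices).
class BufferStore {
public:
    // Allocate a zero-initialized array of `length` doubles and return its handle.
    std::size_t allocate(std::size_t length) {
        buffers_.push_back(std::unique_ptr<double[]>(new double[length]()));
        return buffers_.size() - 1;
    }

    // Raw pointer for the C-style kernel; nullptr once the buffer was released.
    double* data(std::size_t handle) const {
        assert(handle < buffers_.size());
        return buffers_[handle].get();
    }

    // Free one buffer (delete[] runs); the slot stays, so other handles remain valid.
    void release(std::size_t handle) {
        assert(handle < buffers_.size());
        buffers_[handle].reset();
    }

private:
    std::vector<std::unique_ptr<double[]>> buffers_;
};
```

Because a released slot holds an empty unique_ptr, data() returns nullptr for it, so a stale handle can be detected instead of dereferencing freed memory. The trade-off is that released slots are never reused in this sketch.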

Related

Returning a vector in C++

I just read this post on SO that discusses where in memory STL vectors are stored. According to the accepted answer, for
vector<int> temp;
the header info of the vector is on the stack but the contents are on the heap.
In that case, would the following code be erroneous?
vector<int> some_function() {
vector<int> some_vector;
some_vector.push_back(10);
some_vector.push_back(20);
return some_vector;
}
Should I have used vector<int> *some_vector = new vector<int> instead? Would the above code result in some kind of memory allocation issue? Would this change if I used instances of a custom class instead of int?

Your code is precisely fine.
Vectors manage all the memory they allocate for you.
It doesn't matter whether they store all their internal data using dynamic allocations, or hold some metadata as direct members (with automatic storage duration). Any dynamic allocations performed internally will be safely cleaned-up in the vector's destructor, copy constructor, and other similar special functions.
You do not need to do anything as all of that is abstracted away from your code. Your code has no visibility into that mechanism, and dynamically allocating the vector itself will not have any effect on it.
That is the purpose of them!

If you decide to allocate the vector dynamically, you will have a really hard time destroying it correctly even in very simple cases (do not forget about exceptions!). Avoid dynamic allocation whenever possible.
In other words, your code is perfectly correct. I would not worry about copying the returned vector. In simple cases like this, compilers (in release builds) should apply return value optimization / RVO (http://en.wikipedia.org/wiki/Return_value_optimization) and construct some_vector directly in the memory of the returned object. In C++11, if the copy is not elided, the vector is moved rather than copied.
But if you really do not trust the compiler to use RVO, you can always pass a reference to a vector and fill it inside the function.
//function definition
void some_function(vector<int> &v) { v.push_back(10); v.push_back(20); }
//function usage
vector<int> vec;
some_function(vec);
And back to dynamic allocation: if you really need it, use the pattern called RAII, for example via smart pointers.
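As a rough sketch of both options (assuming a C++11 compiler; the function names make_values, make_heap_values and demo are invented for illustration): returning by value needs no cleanup at all, and if the vector itself really has to live on the heap, a std::unique_ptr can own it.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Returned by value: the buffer is moved or elided, never leaked.
std::vector<int> make_values() {
    std::vector<int> values;
    values.push_back(10);
    values.push_back(20);
    return values;
}

// Only if the vector itself must live on the heap: let unique_ptr own it.
std::unique_ptr<std::vector<int>> make_heap_values() {
    std::unique_ptr<std::vector<int>> values(new std::vector<int>());
    values->push_back(10);
    values->push_back(20);
    return values;  // ownership moves to the caller; no manual delete
}

// Both variants hold the same contents and clean up when they go out of scope.
int demo() {
    std::vector<int> a = make_values();
    std::unique_ptr<std::vector<int>> b = make_heap_values();
    assert(a == *b);
    return a[0] + (*b)[1];
}
```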

It does not matter where vectors store their data internally, because you return the vector by value. It is the same as returning an integer:
int some_function()
{
int x = 10;
return x;
}

Storing pointer to heap objects in an STL container for later deallocation

How can one store an arbitrary number of dynamically created instances (of different types) in an STL container so that the memory can be freed later only having the container?
It should work like this:
std::vector< void * > vec;
vec.push_back( new int(10) );
vec.push_back( new float(1.) );
Now, if vec goes out of scope, the pointers to the instances are destroyed, but the memory for the int and the float is not freed. And obviously I can't do:
for( auto i : vec )
delete i;
because void* is not a pointer-to-object type.
You could object that this isn't a good idea because one cannot access the elements of the vector. That is right, and I don't access them myself. The NVIDIA driver will access them, as it just needs addresses (void* is fine) for its parameters to a kernel call.
I guess the problem here is that values of different types are stored. I wonder whether a union can do the trick in case one wants to pass these as arguments to a CUDA kernel.
The kernel takes parameters of different types, which are collected by traversing an expression tree (expression templates) where you don't know the types beforehand. So upon visiting a leaf you store the parameter. It can only be void* or a built-in type such as int or float.
The vector can be deleted right after the kernel launch (the launch is asynchronous, but the driver copies the parameters first and then continues the host thread). Second question: each argument is passed to the driver as a void*, regardless of whether it is an int, a float or even a void*. So I guess one can allocate more memory than needed. I think the union approach might be worth looking at.

You can use one vector of each type you want to support.
But while that's a great improvement on the idea of a vector of void*, it's still quite smelly.
This does sound like an XY problem: you have a problem X, you envision a solution Y, but Y obviously doesn't work without some kind of ingenious adaptation, so you ask about Y, when instead you should be asking about the real problem X. Which is?

Ok, FWIW
I would recommend using placement new combined with malloc. This allows you to store the pointers as void* in your vector. When the vector is no longer needed, you can simply iterate over it and call free() on each element (for built-in types such as int and float, no destructor call is needed).
I.E.
void* ptr = malloc(sizeof(int));
int* myNiceInt = new (ptr) int(myNiceValue);
vec.push_back(ptr);
//at some point later iterate over vec
free( *iter );
I believe that this will be the simplest solution to the problem in this case but do accept that this is a "C" like answer.
Just sayin' ;)

"NVIDIA driver" sounds like a C interface anyway, so malloc is not a crazy suggestion.
Another alternative, as you suggest, is to use a union... But you will also need to store "tags" in a parallel vector to record the actual type of the element, so that you can cast to the appropriate type on deletion.
In short, you must cast void * to an appropriate type before you can delete it. The "C++ way" would be to have a base class with a virtual destructor; you can call delete on that when it points to an instance of any sub-class. But if the library you are using has already determined the types, then that is not an option.

If you have control over the types you can create an abstract base class for them. Give that class a virtual destructor. Then you can have your std::vector<Object*> and iterate over it to delete anything which inherits from Object.
You probably need to have a second std::vector<void*> with pointers to the actual values, since the Object* probably hits the vtable first. A second virtual function like virtual void* ptr() { return &value; } would be useful here. And if it needs the size of the object you can add that too.
You could use the template pattern like this:
template<typename T>
class ObjVal : public Object {
public:
T val;
virtual void* ptr() { return &this->val; }
virtual size_t size() { return sizeof(this->val); }
};
Then you only have to type it once.
This is not particularly memory efficient because every Object picks up at least one extra pointer for the vtable.
However, new int(3) is not very memory efficient either because your allocator probably uses more than 4 bytes for it. Adding that vtable pointer may be essentially free.
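Putting the pieces of this answer together, a minimal sketch could look like the following. The Object base class is only implied above, so its exact shape here is an assumption, and ArgList with its add() member is a made-up helper. A C++11 compiler is assumed for std::unique_ptr and override.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Assumed base class: the virtual destructor lets us delete through Object*.
struct Object {
    virtual ~Object() {}
    virtual void* ptr() = 0;
    virtual std::size_t size() const = 0;
};

template <typename T>
struct ObjVal : Object {
    explicit ObjVal(const T& v) : val(v) {}
    void* ptr() override { return &val; }
    std::size_t size() const override { return sizeof(val); }
    T val;
};

// Hypothetical helper: owns the values and exposes the void* list for a C-style API.
struct ArgList {
    template <typename T>
    void add(const T& value) {
        owned.push_back(std::unique_ptr<Object>(new ObjVal<T>(value)));
        raw.push_back(owned.back()->ptr());
        assert(raw.size() == owned.size());
    }
    std::vector<std::unique_ptr<Object>> owned;  // frees everything on destruction
    std::vector<void*> raw;                      // non-owning view for the API
};
```

The values live in separate heap objects, so the addresses in raw stay valid even when the vectors reallocate. Everything is freed when the ArgList goes out of scope.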

Use more than 1 vector. Keep the vector<void*> around to talk to the API (which I'm guessing requires a contiguous block of void*s of non-uniform types?), but also have a vector<std::unique_ptr<int>> and vector<std::unique_ptr<float>> which owns the data. When you create a new int, push a unique_ptr that owns the memory into your vector of ints, and then stick it on the API-compatible vector as a void*. Bundle the three vectors into one struct so that their lifetimes are tied together if possible (and it probably is).
You can also do this with a single vector that stores the ownership of the variables: a vector of roll-your-own RAII pseudo-unique_ptr, or shared_ptr with custom deleters, or a vector of std::function<void()> that your bundling struct's destructor invokes, or what have you. But I wouldn't recommend these options.
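A rough sketch of the multi-vector bundle described above, assuming only int and float arguments are needed. The struct name KernelArgs and its add() overloads are invented for illustration and are not part of any NVIDIA API.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical bundle: typed vectors own the values, `args` is what the API sees.
struct KernelArgs {
    void* add(int value) {
        ints.push_back(std::unique_ptr<int>(new int(value)));
        args.push_back(ints.back().get());
        assert(args.size() == ints.size() + floats.size());
        return args.back();
    }

    void* add(float value) {
        floats.push_back(std::unique_ptr<float>(new float(value)));
        args.push_back(floats.back().get());
        assert(args.size() == ints.size() + floats.size());
        return args.back();
    }

    std::vector<std::unique_ptr<int>> ints;      // owns the int arguments
    std::vector<std::unique_ptr<float>> floats;  // owns the float arguments
    std::vector<void*> args;                     // contiguous void* block, call order
};
```

Since all three vectors are members of one struct, they are destroyed together, and the memory is released as soon as the KernelArgs object goes out of scope after the launch.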

Storing object references in a simple container

I am looking for a way to insert multiple objects of type A inside a container object, without making copies of each A object during insertion. One way would be to pass the A objects by reference to the container, but, unfortunately, as far as I've read, the STL containers only accept passing objects by value for insertions (for many good reasons). Normally, this would not be a problem, but in my case, I DO NOT want the copy constructor to be called and the original object to get destroyed, because A is a wrapper for a C library, with some C-style pointers to structs inside, which will get deleted along with the original object...
I only require a container that can return one of its objects, given a particular index, and store a certain number of items which is determined at runtime, so I thought that maybe I could write my own container class, but I have no idea how to do this properly.
Another approach would be to store pointers to A inside the container, but since I don't have a lot of knowledge on this subject, what would be a proper way to insert pointers to objects in an STL container? For example this:
std::vector<A *> myVector;
for (unsigned int i = 0; i < n; ++i)
{
    A *myObj = new A();
    myVector.push_back(myObj);
}
might work, but I'm not sure how to handle it properly and how to dispose of it in a clean way. Should I rely solely on the destructor of the class which contains myVector as a member to dispose of it? What happens if this destructor throws an exception while deleting one of the contained objects?
Also, some people suggest using stuff like shared_ptr or auto_ptr or unique_ptr, but I am getting confused with so many options. Which one would be the best choice for my scenario?

You can use boost or std reference_wrapper.
#include <boost/ref.hpp>
#include <vector>
struct A {};
int main()
{
A a, b, c, d;
std::vector< boost::reference_wrapper<A> > v;
v.push_back(boost::ref(a)); v.push_back(boost::ref(b));
v.push_back(boost::ref(c)); v.push_back(boost::ref(d));
return 0;
}
You need to be aware of object lifetimes when using reference_wrapper, so that you do not end up with dangling references.
int main()
{
std::vector< boost::reference_wrapper<A> > v;
{
A a, b, c, d;
v.push_back(boost::ref(a)); v.push_back(boost::ref(b));
v.push_back(boost::ref(c)); v.push_back(boost::ref(d));
// a, b, c, d get destroyed by the end of the scope
}
// now you have a vector full of dangling references, which is a very bad situation
return 0;
}
If you need to handle such situations you need a smart pointer.
Smart pointers are also an option, but it is crucial to know which one to use. If your data is actually shared, use shared_ptr; if the container owns the data, use unique_ptr.
Anyway, I don't see what the wrapper part of A would change. If it contains pointers internally and obeys the rule of three, nothing can go wrong. The destructor will take care of cleaning up. This is the typical way to handle resources in C++: acquire them when your object is initialized, delete them when the lifetime of your object ends.
If you purely want to avoid the overhead of construction and deletion, you might want to use vector::emplace_back.

In C++11, you can construct container elements in place using emplace functions, avoiding the costs and hassle of managing a container of pointers to allocated objects:
std::vector<A> myVector;
for (unsigned int i = 0; i < n; ++i)
{
myVector.emplace_back();
}
If the objects' constructor takes arguments, then pass them to the emplace function, which will forward them.
However, objects can only be stored in a vector if they are either copyable or movable, since they have to be moved when the vector's storage is reallocated. You might consider making your objects movable, transferring ownership of the managed resources, or using a container like deque or list that doesn't move objects as it grows.
UPDATE: Since this won't work on your compiler, the best option is probably std::unique_ptr - that has no overhead compared to a normal pointer, will automatically delete the objects when erased from the vector, and allows you to move ownership out of the vector if you want.
If that's not available, then std::shared_ptr (or std::tr1::shared_ptr or boost::shared_ptr, if that's not available) will also give you automatic deletion, for a (probably small) cost in efficiency.
Whatever you do, don't try to store std::auto_ptr in a standard container. Its destructive copying behaviour makes it easy to accidentally delete the objects when you don't expect it.
If none of these are available, then use a pointer as in your example, and make sure you remember to delete the objects once you've finished with them.
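For the std::unique_ptr option, a minimal sketch might look like this. The class A below is only a stand-in for the real wrapper: the malloc'd handle and the live counter are assumptions made so the example is self-contained, and make_objects is a made-up helper.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>
#include <vector>

// Stand-in for a wrapper around a C library handle (hypothetical).
class A {
public:
    A() : handle_(std::malloc(16)) { ++live; }
    ~A() { std::free(handle_); --live; }
    A(const A&) = delete;             // copying would free handle_ twice
    A& operator=(const A&) = delete;
    static int live;                  // number of A objects currently alive
private:
    void* handle_;
};
int A::live = 0;

// The vector owns the objects; each A is created once and never copied.
std::vector<std::unique_ptr<A>> make_objects(unsigned n) {
    std::vector<std::unique_ptr<A>> objects;
    objects.reserve(n);
    for (unsigned i = 0; i < n; ++i) {
        objects.push_back(std::unique_ptr<A>(new A()));
    }
    assert(objects.size() == n);
    return objects;
}
```

Erasing an element or destroying the vector runs ~A() exactly once per object, so no explicit delete loop is needed in the owning class's destructor.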

C++ New vs Malloc for dynamic memory array of Objects

I have a class Bullet that takes several arguments for its construction. However, I am using a dynamic memory array to store them. I am using C++, so I want to conform to its standard by using the new operator to allocate the memory. The problem is that the new operator asks for the constructor arguments when I'm allocating the array, which I don't have at that time. I can accomplish this using malloc to get the right size and then fill it in from there, but that's not what I want to use :) Any ideas?
pBulletArray = (Bullet*) malloc(iBulletArraySize * sizeof(Bullet)); // Works
pBulletArray = new Bullet[iBulletArraySize]; // Requires constructor arguments
Thanks.

You can't.
And if you truly want to conform to C++ standards, you should use std::vector.
FYI, it would probably be even more expensive than what you're trying to achieve. If you could do this, new would call a constructor. But since you'll modify the object later on anyway, the initial construction is useless.

1) std::vector
A std::vector really is the proper C++ way to do this.
std::vector<Bullet> bullets;
bullets.reserve(10); // allocate memory for bullets without constructing any
bullets.push_back(Bullet(10.2,"Bang")); // put a Bullet in the vector.
bullets.emplace_back(10.2,"Bang"); // (C++11 only) construct a Bullet in the vector without copying.
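To make the "without constructing any" part concrete, here is a small sketch that counts constructor calls. The Bullet members and the static counter are assumptions added for the demonstration, and make_bullets is a made-up helper.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Illustrative Bullet; the static counter exists only for this demonstration.
struct Bullet {
    Bullet(double v, const std::string& s) : speed(v), sound(s) { ++constructed; }
    double speed;
    std::string sound;
    static int constructed;  // counts calls to the two-argument constructor
};
int Bullet::constructed = 0;

std::vector<Bullet> make_bullets() {
    std::vector<Bullet> bullets;
    bullets.reserve(10);                 // raw capacity only
    assert(bullets.capacity() >= 10);
    assert(Bullet::constructed == 0);    // nothing has been constructed yet
    bullets.emplace_back(10.2, "Bang");  // constructed in place with real arguments
    bullets.emplace_back(20.3, "Boom");
    return bullets;
}
```

This is the behaviour the question wanted from malloc: the memory is reserved up front, and each Bullet is constructed only when its arguments are known.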
2) new [] operator
It is also possible to do this with new, but you really shouldn't. Manually managing resources with new/delete is an advanced task, similar to template meta-programming in that it's best left to library builders, who'll use these features to build efficient, high level libraries for you. In fact to do this correctly you'll basically be implementing the internals of std::vector.
When you use the new operator to allocate an array, every element in the array is default initialized. Your code could work if you added a default constructor to Bullet:
class Bullet {
public:
Bullet() {} // default constructor
Bullet(double,std::string const &) {}
};
std::unique_ptr<Bullet[]> b(new Bullet[10]); // default construct 10 bullets
Then, when you have the real data for a Bullet you can assign it to one of the elements of the array:
b[3] = Bullet(20.3,"Bang");
Note the use of unique_ptr to ensure that proper clean-up occurs, and that it's exception safe. Doing these things manually is difficult and error prone.
3) operator new
The new operator initializes its objects in addition to allocating space for them. If you want to simply allocate space, you can use operator new.
std::unique_ptr<Bullet,void(*)(Bullet*)> bullets(
static_cast<Bullet*>(::operator new(10 * sizeof(Bullet))),
[](Bullet *b){::operator delete(b);});
(Note that the unique_ptr ensures that the storage will be deallocated but no more. Specifically, if we construct any objects in this storage we have to manually destruct them and do so in an exception safe way.)
bullets now points to storage sufficient for an array of Bullets. You can construct an array in this storage:
new (bullets.get()) Bullet[10];
However the array construction again uses default initialization for each element, which we're trying to avoid.
AFAIK C++ doesn't specify any well defined method of constructing an array without constructing the elements. I imagine this is largely because doing so would be a no-op for most (all?) C++ implementations. So while the following is technically undefined, in practice it's pretty well defined.
bool constructed[10] = {}; // a place to mark which elements are constructed
// construct some elements of the array
for(int i=0;i<10;i+=2) {
try {
// pretend bullets points to the first element of a valid array. Otherwise 'bullets.get()+i' is undefined
new (bullets.get()+i) Bullet(10.2,"Bang");
constructed[i] = true;
} catch(...) {}
}
That will construct elements of the array without using the default constructor. You don't have to construct every element, just the ones you want to use. However when destroying the elements you have to remember to destroy only the elements that were constructed.
// destruct the elements of the array that we constructed before
for(int i=0;i<10;++i) {
if(constructed[i]) {
bullets.get()[i].~Bullet(); // unique_ptr<Bullet, D> has no operator[], so go through get()
}
}
// unique_ptr destructor will take care of deallocating the storage
The above is a pretty simple case. Making non-trivial uses of this method exception safe without wrapping it all up in a class is more difficult. Wrapping it up in a class basically amounts to implementing std::vector.
4) std::vector
So just use std::vector.

It's possible to do what you want -- search for "operator new" if you really want to know how. But it's almost certainly a bad idea. Instead, use std::vector, which will take care of all the annoying details for you. You can use std::vector::reserve to allocate all the memory you'll use ahead of time.

Bullet** pBulletArray = new Bullet*[iBulletArraySize];
Then populate pBulletArray:
for(int i = 0; i < iBulletArraySize; i++)
{
pBulletArray[i] = new Bullet(arg0, arg1);
}
Just don't forget to free the memory afterwards: delete each Bullet, then delete[] the pointer array.

The way C++ new normally works is allocating the memory for the class instance and then calling the constructor for that instance. You basically have already allocated the memory for your instances.
You can construct just the first instance in that memory with placement new:
new (pBulletArray) Bullet(foo);
Constructing the second one looks like this (and so on):
new (pBulletArray + 1) Bullet(bar);
assuming the Bullet constructor takes an int and foo and bar are int values.
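For completeness, a sketch of the whole round trip, including the cleanup this approach requires. It assumes a Bullet whose constructor takes a single int and does not throw; create_bullets and destroy_bullets are made-up helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Illustrative Bullet with a single int constructor argument.
struct Bullet {
    explicit Bullet(int d) : damage(d) {}
    int damage;
};

// Allocate raw storage, then construct each Bullet in place with its own argument.
Bullet* create_bullets(std::size_t count) {
    assert(count > 0);
    Bullet* storage = static_cast<Bullet*>(std::malloc(count * sizeof(Bullet)));
    if (storage == nullptr) throw std::bad_alloc();
    for (std::size_t i = 0; i < count; ++i) {
        new (storage + i) Bullet(static_cast<int>(i) * 10);
    }
    return storage;
}

// Destroy in reverse order of construction, then release the raw storage.
void destroy_bullets(Bullet* storage, std::size_t count) {
    for (std::size_t i = count; i > 0; --i) {
        storage[i - 1].~Bullet();
    }
    std::free(storage);
}
```

Memory obtained from malloc has to go back through free, never delete[], and every object constructed with placement new needs an explicit destructor call first.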

If what you're really after here is just fast allocation/deallocation, then you should look into "memory pools." I'd recommend using boost's implementation, rather than trying to roll your own. In particular, you would probably want to use an "object_pool".

preventing data from being freed when vector goes out of scope

Is there a way to transfer ownership of the data contained in a std::vector (pointed to by, say T*data) into another construct, preventing having "data" become a dangling pointer after the vector goes out of scope?
EDIT: I DON'T WANT TO COPY THE DATA (which would be an easy but inefficient solution).
Specifically, I'd like to have something like:
template<typename T>
T* transfer_ownership(vector<T>&v){
T*data=&v[0];
v.clear();
...//<--I'd like to make v's capacity 0 without freeing data
}
int main(){
double*data=NULL;
{
vector<double>v;
...//grow v dynamically
data=transfer_ownership<double>(v);
}
...//do something useful with data (user responsible for freeing it later)
// for example mxSetData(mxArray*A,double*data) from matlab's C interface
}
The only thing that comes to my mind to emulate this is:
{
vector<double>*v=new vector<double>();
//grow *v...
data=&(*v)[0];
}
and then data will later either be freed or (in my case) used as mxSetData(mxArray*A,double*data). However, this results in a small memory leak (the data structure that tracks v's capacity, size, etc., but not the data itself, of course).
Is it possible without leaking ?

A simple workaround would be swapping the vector with one you own:
vector<double> myown;
vector<double> someoneelses = foo();
std::swap( myown, someoneelses );
A tougher but maybe better approach is write your own allocator for the vector, and let it allocate out of a pool you maintain. No personal experience, but it's not too complicated.
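The swap workaround can be checked with a short sketch (the function name take_ownership is made up). After the swap, the new vector holds the very same buffer and the original is left empty.

```cpp
#include <cassert>
#include <vector>

// Take over `source`'s buffer; `source` is left empty and no element is copied.
std::vector<double> take_ownership(std::vector<double>& source) {
    std::vector<double> owner;
    owner.swap(source);  // exchanges the internal buffers in constant time
    assert(source.empty());
    return owner;
}
```

Note that the data is still owned by a vector, just a longer-lived one. This does not hand a bare T* to code that wants to free the memory itself.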

The point of using a std::vector is not to have to worry about the data in it:
Keep your vector all along your application;
Pass it by const-ref to other functions (to avoid unnecessary copies);
And feed functions expecting a pointer-to-T with &v[0].
If you really don't want to keep your vector, you will have to copy your data -- you can't transfer ownership because std::vector guarantees it will destroy its content when going out-of-scope. In that case, use the std::copy() algorithm.

If your vector contains values, you can only copy them (which is what happens when you call std::copy, for example). If you keep non-primitive objects in a vector and don't want to copy them (because you want to use them in another data structure), consider storing pointers.

Does something like this work for you?
int main()
{
double *data = 0;
{
vector<double> foo;
// insert some elements to foo
data = new double[foo.size()];
std::copy(foo.begin(), foo.end(), &data[0]);
}
// Pass data to Matlab function.
delete [] data;
return 0;
}
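A variation on the same copy, sketched with std::unique_ptr<double[]> so that the delete [] cannot be forgotten (assumes C++11; copy_out and first_after_scope are made-up names):

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <vector>

// Copy the vector's contents into a heap array that outlives the vector.
std::unique_ptr<double[]> copy_out(const std::vector<double>& source) {
    std::unique_ptr<double[]> data(new double[source.size()]);
    std::copy(source.begin(), source.end(), data.get());
    return data;
}

// The copy stays valid after the vector it came from is gone.
double first_after_scope() {
    std::unique_ptr<double[]> data;
    {
        std::vector<double> foo(3, 2.5);
        data = copy_out(foo);
    }  // foo is destroyed here; data still owns its copy
    assert(data != nullptr);
    return data[0];  // delete[] runs automatically when data goes out of scope
}
```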

Since you don't want to copy data between containers, but want to transfer ownership of data between containers, I suggest using a container of smart pointers as follows.
void f()
{
std::vector<boost::shared_ptr<double> > doubles;
InitVector(doubles);
std::vector<boost::shared_ptr<double> > newDoubles(doubles);
}
You really can't transfer ownership of data between standard containers without making a copy of it, since standard containers always copy the data they encapsulate. If you want to minimize the overhead of copying expensive objects, then it is a good idea to use a reference-counted smart pointer to wrap your expensive data structure. boost::shared_ptr is suitable for this task since it is fairly cheap to make a copy of it.
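The sharing behaviour can be sketched as follows. This uses std::shared_ptr as a stand-in for boost::shared_ptr (an assumption: C++11 is available), and make_doubles and share are made-up helpers. Copying the container copies only the pointers, so both vectors refer to the same doubles.

```cpp
#include <cassert>
#include <memory>
#include <vector>

typedef std::vector<std::shared_ptr<double>> SharedDoubles;

// Fill a vector with two shared values.
SharedDoubles make_doubles() {
    SharedDoubles doubles;
    doubles.push_back(std::make_shared<double>(1.5));
    doubles.push_back(std::make_shared<double>(2.5));
    return doubles;
}

// Copying the container copies only the pointers; the doubles themselves are shared.
SharedDoubles share(const SharedDoubles& original) {
    SharedDoubles copy(original);
    assert(copy.size() == original.size());
    return copy;
}
```

Each double is freed only when the last shared_ptr to it goes away, so clearing the first vector leaves the second one fully usable.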
