A std::vector of pointers? - c++

Here is what I'm trying to do. I have a std::vector with a certain number of elements, it can grow but not shrink. The thing is that its sort of cell based so there may not be anything at that position. Instead of creating an empty object and wasting memory, I thought of instead just NULLing that cell in the std::vector. The issue is that how do I get pointers in there without needing to manage my memory? How can I take advantage of not having to do new and keep track of the pointers?

How large are the objects and how sparse do you anticipate the vector will be? If the objects are not large or if there aren't many holes, the cost of having a few "empty" objects may be lower than the cost of having to dynamically allocate your objects and manage pointers to them.
That said, if you do want to store pointers in the vector, you'll want to use a vector of smart pointers (e.g., a vector<shared_ptr<T>>) or a container designed to own pointers (e.g., Boost's ptr_vector<T>).

If you're going to use pointers something will need to manage the memory.
It sounds like the best solution for you would be to use boost::optional. I believe it has exactly the semantics that you are looking for. (http://www.boost.org/doc/libs/1_39_0/libs/optional/doc/html/index.html).
Actually, after I wrote this, I realized that your use case(e.g. expensive default constructor) is used by the boost::optional docs: http://www.boost.org/doc/libs/1_39_0/libs/optional/doc/html/boost_optional/examples.html#boost_optional.examples.bypassing_expensive_unnecessary_default_construction

You can use a deque to hold an ever-increasing number of objects, and use your vector to hold pointers to the objects. The deque won't invalidate pointers to existing objects it holds if you only add new objects to the end of it. This is far less overhead than allocating each object separately. Just ensure that the deque is destroyed after or at the same time as the vector so you don't create dangling pointers.
However, based on the size of the 3-D array you mentioned in another answer's comment, you may have difficulty storing that many pointers. You might want to look into a sparse array implementation so that you mainly use memory for the portions of the array where you have non-null pointers.

You could use a smart pointer. For example boost::shared_ptr.

The issue is that how do I get pointers in there without needing to manage my memory?
You can do certainly do this using the shared_ptr or other similar techniques mentioned here. But in near future you will come across some problem where you will have to manage your own memory. So please get yourself comfortable with the pointer concept.
Normally if you see in big servers the memory management of object itself is considered as a responsibility and specially for this purpose you will create a class. This is known as pool. Whenever you need an object you ask the pool to give you the object and whenever you are done with the object you tell the pool that I am done. Now it is the responsibility of the pool to see what can be done with that object.
But the basic idea is your main program still deals with pointers but do not care about the memory. There is some other object who cares about it.

Related

Allocation and fragmentation on the heap

I have a complex structure I need to allocate on the heap. It's made of some basic types and custom objects. Those custom objects are made out of some basic types and some other custom objects. Those custom objects are made out of some basic types and some other custom objects, et cetara, et cetera...
The way I've been doing it is storing the basic types as automatic variables while making the custom objects (smart) pointers.
But, since the main object is created as a (smart) pointer, all of this is allocated on the heap anyway, no? But every time I use another (smart) pointer, it does another allocation and fragments the memory, right?
So I shouldn't really be using pointers, save for that initial one to put it on a heap, no? All of the objects that have changeable sizes have the changeable parts stored in a map or a vector, which do allocate stuff on their own, but at this point, this is a necessity anyway since I don't know how many, if any, there will be.
Anyway, am I right to think this way?
The less the pointers are used, the better?
Use whatever makes more sense from a design point of view.
If it turns out to be to slow (it probably won't, anyway), profile and optimize the bottlenecks.

RAII approach for a container of containers?

Assuming that my T is a vector typedef std::vector<ofSomething> T; ( it's usually a vector around 4-5 MB, it's expensive to recreate and store as it is in a data structure )
so, considering :
pointers
references
smart pointers
I have to create a container of vectors, or I have to put all this vectors together somehow, I'm wondering what is the best approach according to the RAII philosophy .
std::container<T*>
or
std::container<T&>
or
std::container<unique_ptr<T>>
with pointers I need to call the destructor explicitly, and this doesn't really look and sound like RAII at all.
with references it's basically the same as with pointers.
with smart pointers I get what I want if I delete or just "drop" the object representing the smart pointer.
Is a collection of smart pointers a really good idea for a container of containers ? I don't know, they are here to express ownership not for automatic memory management, it sounds like I'm doing something wrong with the wrong philosophy, at the same time I have multiple big containers to handle until they "expire" or they are not needed anymore.
What you suggest ?
If you want a vector of vectors, and you want RAII, then the answer is so simple:
std::vector<std::vector<T>> v;
… no references or pointers in sight.
If you're concerned about the inner containers being moved around as the outer vector grows, flatten it:
std::vector<T>
and wrap the indexing, so that i = x*W+y.
As you say, with a raw pointer you'll need to do your own memory management - scratch that. You can't store a reference in a standard container because allocators are not defined for reference types, and you'll still need to allocate your objects on the heap somehow - scratch that. The std::unique_ptr will perform memory management for you and actually compile - this wins... from your choices.
But what about std::container<T>? This will also work fine and not have any issues with manual memory management. It will also benefit from move semantics, so you don't need to worry about the vectors being copied. You also avoid an extra level of indirection.
Obviously using a std::unique_ptr restricts what you can do with your container. If you need to be able to copy it and/or its items, you'll want std::shared_ptr instead.

The "this" pointer and containers

For my game I have built a small framework which among other things has:
Entities that own components.
Systems that hold pointers to the entities.
An Engine that owns the systems.
An EntityManager that owns the entities.
Every time I add a Component, the Entity passes it's "this" pointer to the Systems through an Engine pointer that it holds and they decide whether to register it or ignore it.
Now, since the Entities are elements of the EntityManager's container, am I right in assuming that if an insert operation to it causes shifts or reallocation, the systems won't hold valid pointers any more?
If so, what's a good container that can be used to prevent this from happening? If I understand things correctly this is similar to what happens with iterators and the same rules should apply when requiring non-invalidation with insertion.
If you store a vector of entities and then just store their iterators to access them: yes, a reallocation might invalidate all your data.
The suggested way is to store a vector of pointers (if you need memory collection capabilities you might want to go for a vector of smart pointers). This way you will be sure that the pointers are valid (assuming nothing else touched the objects) at every insertion/deletion regardless of the reallocation of the container's space.
From the question isn't clear but a word of advice if you're just storing objects in your containers instead of pointers: when inserting elements into a container like with
std::vector<T>::push_back()
you're storing a copy of the object. This is usually undesirable since brings additional copy overhead and might create problems if things aren't properly set up. See "shallow copies" and "deep copies" to learn more about this problem.
Your pointer value will only change if a relocation of the actual value happens.
This is the case where you manipulate arrays of objects instead of arrays of pointers to these objects. You should definitely not do the former.
I would suggest using standard collections like std::array or std::vector to manage the objects. With those, and provided you have instanciated the objects on the heap (read: with new), you won't have to worry about the value of this.

Memory allocation of values in a std::map

I've had some experience in C++ from school works. I've learned, among other things, that objects should be stored in a container (vector, map, etc) as pointers. The main reason being that we need the use of the new-operator, along with a copy constructor, in order to create a copy on the heap (otherwise called dynamic memory) of the object. This method also necessitates defining a destructor.
However, from what I've read since then, it seems that STL containers already store the values they contain on the heap. Thus, if I were to store my objects as values, a copy (using the copy constructor) would be made on the heap anyway, and there would be no need to define a destructor. All in all, a copy on the heap would be made anyway???
Also, if(true), then the only other reason I can think of for storing objects using pointers would be to alleviate resource needs for copying the container, as pointers are easier to copy than whole objects. However, this would require the use of std::shared_ptr instead of regular pointers, since you don't want elements in the copied container to be deleted when the original container is destroyed. This method would also alleviate the need for defining a destructor, wouldn't it?
Edit : The destructor to be defined would be for the class using the container, not for the class of the objects stored.
Edit 2 : I guess a more precise question would be : "Does it make a difference to store objects as pointers using the new-operator, as opposed to plain values, on a memory and resources used standpoint?"
The main reason to avoid storing full objects in containers (rather than pointers) is because copying or moving those objects is expensive. In that case, the recommended alternative is to store smart pointers in the container.
So...
vector<something_t> ................. Usually perfectly OK
vector<shared_ptr<something_t>> ..... Preferred if you want pointers
vector<something_t*> ................ Usually best avoided
The problem with raw pointers is that, when a raw pointer disappears, the object it points to hangs around causing memory and resource leaks - unless you've explicitly deleted it. C++ doesn't have garbage collection, and when a pointer is discarded, there's no way to know if other pointers may still be pointing to that object.
Raw pointers are a low-level tool - mostly used to write libraries such as vector and shared_ptr. Smart pointers are a high-level tool.
However, particularly with C++11 move semantics, the costs of moving items around in a vector is normally very small even for huge objects. For example, a vector<string> is fine even if all the strings are megabytes long. You mostly worry about the cost of moving objects if sizeof(classname) is big - if the object holds lots of data inside itself rather than in separate heap-allocated memory.
Even then, you don't always worry about the cost of moving objects. It doesn't matter that moving an object is expensive if you never move it. For example, a map doesn't need to move items around much. When you insert and delete items, the nodes (and contained items) stay where they are, it's just the pointers that link the nodes that change.

Should I store entire objects, or pointers to objects in containers?

Designing a new system from scratch. I'll be using the STL to store lists and maps of certain long-live objects.
Question: Should I ensure my objects have copy constructors and store copies of objects within my STL containers, or is it generally better to manage the life & scope myself and just store the pointers to those objects in my STL containers?
I realize this is somewhat short on details, but I'm looking for the "theoretical" better answer if it exists, since I know both of these solutions are possible.
Two very obvious disadvantage to playing with pointers:
1) I must manage allocation/deallocation of these objects myself in a scope beyond the STL.
2) I cannot create a temp object on the stack and add it to my containers.
Is there anything else I'm missing?
Since people are chiming in on the efficency of using pointers.
If you're considering using a std::vector and if updates are few and you often iterate over your collection and it's a non polymorphic type storing object "copies" will be more efficent since you'll get better locality of reference.
Otoh, if updates are common storing pointers will save the copy/relocation costs.
This really depends upon your situation.
If your objects are small, and doing a copy of the object is lightweight, then storing the data inside an stl container is straightforward and easier to manage in my opinion because you don't have to worry about lifetime management.
If you objects are large, and having a default constructor doesn't make sense, or copies of objects are expensive, then storing with pointers is probably the way to go.
If you decide to use pointers to objects, take a look at the Boost Pointer Container Library. This boost library wraps all the STL containers for use with dynamically allocated objects.
Each pointer container (for example ptr_vector) takes ownership of an object when it is added to the container, and manages the lifetime of those objects for you. You also access all the elements in a ptr_ container by reference. This lets you do things like
class BigExpensive { ... }
// create a pointer vector
ptr_vector<BigExpensive> bigVector;
bigVector.push_back( new BigExpensive( "Lexus", 57700 ) );
bigVector.push_back( new BigExpensive( "House", 15000000 );
// get a reference to the first element
MyClass& expensiveItem = bigList[0];
expensiveItem.sell();
These classes wrap the STL containers and work with all of the STL algorithms, which is really handy.
There are also facilities for transferring ownership of a pointer in the container to the caller (via the release function in most of the containers).
If you're storing polymporhic objects you always need to use a collection of base class pointers.
That is if you plan on storing different derived types in your collection you must store pointers or get eaten by the slicing deamon.
Sorry to jump in 3 years after the event, but a cautionary note here...
On my last big project, my central data structure was a set of fairly straightforward objects. About a year into the project, as the requirements evolved, I realised that the object actually needed to be polymorphic. It took a few weeks of difficult and nasty brain surgery to fix the data structure to be a set of base class pointers, and to handle all the collateral damage in object storage, casting, and so on. It took me a couple of months to convince myself that the new code was working. Incidentally, this made me think hard about how well-designed C++'s object model is.
On my current big project, my central data structure is a set of fairly straightforward objects. About a year into the project (which happens to be today), I realised that the object actually needs to be polymorphic. Back to the net, found this thread, and found Nick's link to the the Boost pointer container library. This is exactly what I had to write last time to fix everything, so I'll give it a go this time around.
The moral, for me, anyway: if your spec isn't 100% cast in stone, go for pointers, and you may potentially save yourself a lot of work later.
Why not get the best of both worlds: do a container of smart pointers (such as boost::shared_ptr or std::shared_ptr). You don't have to manage the memory, and you don't have to deal with large copy operations.
Generally storing the objects directly in the STL container is best as it is simplest, most efficient, and is easiest for using the object.
If your object itself has non-copyable syntax or is an abstract base type you will need to store pointers (easiest is to use shared_ptr)
You seem to have a good grasp of the difference. If the objects are small and easy to copy, then by all means store them.
If not, I would think about storing smart pointers (not auto_ptr, a ref counting smart pointer) to ones you allocate on the heap. Obviously, if you opt for smart pointers, then you can't store temp stack allocated objects (as you have said).
#Torbjörn makes a good point about slicing.
Using pointers will be more efficient since the containers will be only copying pointers around instead of full objects.
There's some useful information here about STL containers and smart pointers:
Why is it wrong to use std::auto_ptr<> with standard containers?
If the objects are to be referred to elsewhere in the code, store in a vector of boost::shared_ptr. This ensures that pointers to the object will remain valid if you resize the vector.
Ie:
std::vector<boost::shared_ptr<protocol> > protocols;
...
connection c(protocols[0].get()); // pointer to protocol stays valid even if resized
If noone else stores pointers to the objects, or the list doesn't grow and shrink, just store as plain-old objects:
std::vector<protocol> protocols;
connection c(protocols[0]); // value-semantics, takes a copy of the protocol
This question has been bugging me for a while.
I lean to storing pointers, but I have some additional requirements (SWIG lua wrappers) that might not apply to you.
The most important point in this post is to test it yourself, using your objects
I did this today to test the speed of calling a member function on a collection of 10 million objects, 500 times.
The function updates x and y based on xdir and ydir (all float member variables).
I used a std::list to hold both types of objects, and I found that storing the object in the list is slightly faster than using a pointer. On the other hand, the performance was very close, so it comes down to how they will be used in your application.
For reference, with -O3 on my hardware the pointers took 41 seconds to complete and the raw objects took 30 seconds to complete.