I have some experience with C++ from schoolwork. I've learned, among other things, that objects should be stored in a container (vector, map, etc.) as pointers. The main reason given was that we need the new operator, along with a copy constructor, in order to create a copy of the object on the heap (otherwise called dynamic memory). This method also necessitates defining a destructor.
However, from what I've read since then, it seems that STL containers already store the values they contain on the heap. Thus, if I were to store my objects as values, a copy (using the copy constructor) would be made on the heap anyway, and there would be no need to define a destructor. So a copy on the heap would be made either way?
Also, if(true), then the only other reason I can think of for storing objects using pointers would be to alleviate resource needs for copying the container, as pointers are easier to copy than whole objects. However, this would require the use of std::shared_ptr instead of regular pointers, since you don't want elements in the copied container to be deleted when the original container is destroyed. This method would also alleviate the need for defining a destructor, wouldn't it?
Edit : The destructor to be defined would be for the class using the container, not for the class of the objects stored.
Edit 2 : I guess a more precise question would be : "Does it make a difference to store objects as pointers using the new-operator, as opposed to plain values, on a memory and resources used standpoint?"
The main reason to avoid storing full objects in containers (rather than pointers) is that copying or moving those objects is expensive. In that case, the recommended alternative is to store smart pointers in the container.
So...
vector<something_t> ................. Usually perfectly OK
vector<shared_ptr<something_t>> ..... Preferred if you want pointers
vector<something_t*> ................ Usually best avoided
The problem with raw pointers is that, when a raw pointer disappears, the object it points to hangs around causing memory and resource leaks - unless you've explicitly deleted it. C++ doesn't have garbage collection, and when a pointer is discarded, there's no way to know if other pointers may still be pointing to that object.
Raw pointers are a low-level tool - mostly used to write libraries such as vector and shared_ptr. Smart pointers are a high-level tool.
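To make the shared_ptr point concrete, here is a minimal sketch (Item and useCountsAcrossCopy are hypothetical names, not from the question) of what happens when a container of shared_ptr is copied and the original is then destroyed. Only the pointers are copied, and the element stays alive as long as either container refers to it.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical element type, used only for illustration.
struct Item {
    std::string name;
    explicit Item(std::string n) : name(std::move(n)) {}
};

// Copies a vector of shared_ptr, lets the original vector die, and reports
// the use count of the first element while shared and afterwards.
std::pair<long, long> useCountsAcrossCopy() {
    std::vector<std::shared_ptr<Item>> copy;
    long whileShared = 0;
    {
        std::vector<std::shared_ptr<Item>> original;
        original.push_back(std::make_shared<Item>("a"));
        copy = original;                          // copies pointers, not Items
        whileShared = copy.front().use_count();   // both vectors own the Item
    }
    // `original` is gone; the Item is still alive, owned by `copy` alone.
    return {whileShared, copy.front().use_count()};
}
```

No destructor has to be written in the class that holds the container; the last shared_ptr to go away deletes the Item.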
However, particularly with C++11 move semantics, the cost of moving items around in a vector is normally very small even for huge objects. For example, a vector<string> is fine even if all the strings are megabytes long. You mostly worry about the cost of moving objects if sizeof(classname) is big, that is, if the object holds lots of data inside itself rather than in separate heap-allocated memory.
Even then, you don't always worry about the cost of moving objects. It doesn't matter that moving an object is expensive if you never move it. For example, a map doesn't need to move items around much. When you insert and delete items, the nodes (and contained items) stay where they are, it's just the pointers that link the nodes that change.
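The map behaviour described above can be checked with a small sketch (mapElementStaysPut is a hypothetical helper name). It relies on the standard guarantee that inserting into a std::map does not invalidate references to existing elements, and that erasing only invalidates references to the erased element.

```cpp
#include <cassert>
#include <map>
#include <string>

// Returns true if the address of an element stored in a std::map is
// unchanged after many further insertions and an unrelated erase.
bool mapElementStaysPut() {
    std::map<int, std::string> m;
    m[0] = "first";
    const std::string* before = &m.at(0);

    for (int i = 1; i <= 1000; ++i) {
        m[i] = "x";          // new nodes are linked in; old nodes do not move
    }
    m.erase(500);            // only the erased node is affected

    const std::string* after = &m.at(0);
    return before == after && *after == "first";
}
```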
I'm starting with the assumption that, generally, it is a good idea to allocate small objects in the stack, and big objects in dynamic memory. Another assumption is that I'm possibly confused while trying to learn about memory, STL containers and smart pointers.
Consider the following example, where I have an object that is necessarily allocated in the free store through a smart pointer, and I can rely on clients getting said object from a factory, for instance. This object contains some data that is specifically allocated using an STL container, which happens to be a std::vector. In one case, this data vector itself is dynamically allocated using some smart pointer, and in the other situation I just don't use a smart pointer.
Is there any practical difference between design A and design B, described below?
Situation A:
class SomeClass{
public:
SomeClass(){ /* initialize some potentially big STL container */ }
private:
std::vector<double> dataVector_;
};
Situation B:
class SomeOtherClass{
public:
SomeOtherClass() { /* initialize some potentially big STL container,
but is it allocated in any different way? */ }
private:
std::unique_ptr<std::vector<double>> pDataVector_;
};
Some factory functions.
std::unique_ptr<SomeClass> someClassFactory(){
return std::make_unique<SomeClass>();
}
std::unique_ptr<SomeOtherClass> someOtherClassFactory(){
return std::make_unique<SomeOtherClass>();
}
Use case:
int main(){
//in my case I can reliably assume that objects themselves
//are going to always be allocated in dynamic memory
auto pSomeClassObject(someClassFactory());
auto pSomeOtherClassObject(someOtherClassFactory());
return 0;
}
I would expect that both design choices have the same outcome, but do they?
Is there any advantage or disadvantage for choosing A or B? Specifically, should I generally choose design A because it's simpler or are there more considerations? Is B morally wrong because it can dangle for a std::vector?
tl;dr : Is it wrong to have a smart pointer pointing to a STL container?
edit:
The related answers pointed to useful additional information for someone as confused as myself.
Usage of objects or pointers to objects as class members and memory allocation
and Class members that are objects - Pointers or not? C++
And changing some Google keywords led me to When vectors are allocated, do they use memory on the heap or the stack?
std::unique_ptr<std::vector<double>> is slower, takes more memory, and its only advantage is that it adds one more possible state: "vector doesn't exist". However, if you care about that state, use boost::optional<std::vector<double>> instead. You should almost never heap-allocate a container, and definitely never hold one in a unique_ptr. It does work fine, with no "dangling"; it's just pointlessly slow.
Using std::unique_ptr here is just wasteful unless your goal is a compiler firewall (basically hiding the compile-time dependency to vector, but then you'd need a forward declaration to standard containers).
You're adding an indirection but, more importantly, the full contents of SomeClass turn into 3 separate memory blocks to load when accessing the contents (SomeClass merged with/containing unique_ptr's block, pointing to std::vector's block, pointing to its element array). In addition, you're paying for one extra, superfluous level of heap overhead.
Now you might start imagining scenarios where an indirection is helpful to the vector, like maybe you can shallow move/swap the unique_ptrs between two SomeClass instances. Yes, but vector already provides that without a unique_ptr wrapper on top. And it already has states like empty that you can reuse for some concept of validity/nilness.
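As a rough sketch of that point, the class below mirrors design A from the question; buffer(), size(), swapData() and swapExchangesBuffers() are hypothetical additions made only so the effect can be observed. Swapping the vectors exchanges their heap buffers without touching the elements, so no unique_ptr wrapper is needed to get a cheap exchange.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Mirrors design A; the accessor and swap members are illustrative additions.
class SomeClass {
public:
    explicit SomeClass(std::size_t n) : dataVector_(n, 0.0) {}
    const double* buffer() const { return dataVector_.data(); }
    std::size_t size() const { return dataVector_.size(); }
    // Constant-time exchange of the two element arrays.
    void swapData(SomeClass& other) noexcept { dataVector_.swap(other.dataVector_); }
private:
    std::vector<double> dataVector_;
};

// Returns true if swapping simply exchanged which object owns which buffer.
bool swapExchangesBuffers() {
    SomeClass a(1000), b(2000);
    const double* bufferA = a.buffer();
    const double* bufferB = b.buffer();
    a.swapData(b);
    return a.buffer() == bufferB && b.buffer() == bufferA
        && a.size() == 2000 && b.size() == 1000;
}
```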
Remember that variable-sized containers themselves are small objects, not big ones, pointing to potentially big blocks. vector isn't big, its dynamic contents can be. The idea of adding indirections for big objects isn't a bad rule of thumb, but vector is not a big object. With move semantics in place, it's worth thinking of it more like a little memory block pointing to a big one that can be shallow copied and swapped cheaply. Before move semantics, there were more reasons to think of something like std::vector as one indivisibly large object (though its contents were always swappable), but now it's worth thinking of it more like a little handle pointing to big, dynamic contents.
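The "little handle pointing to big contents" picture can be illustrated with a short sketch (both function names are hypothetical). The first check compares the size of the vector object with the size of what it manages; the second shows that move construction hands over the existing buffer instead of copying a million doubles.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// The vector object itself is far smaller than the block it points to.
bool handleIsSmallerThanContents() {
    std::vector<double> big(1000000);
    return sizeof(big) < big.size() * sizeof(double);
}

// Move construction transfers ownership of the buffer; elements are not copied.
bool moveStealsBuffer() {
    std::vector<double> source(1000000, 1.0);
    const double* buffer = source.data();
    std::vector<double> target(std::move(source));
    return target.data() == buffer && target.size() == 1000000;
}
```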
Some common reasons to introduce an indirection through something like unique_ptr are:
Abstraction & hiding. If you're trying to abstract or hide the concrete definition of some type/subtype, Foo, then this is where you need the indirection so that its handle can be captured (or potentially even used with abstraction) by those who don't know exactly what Foo is.
To allow a big, contiguous 1-block-type object to be passed around from owner to owner without invoking a copy or invalidating references/pointers (iterators included) to it or its contents.
A hasty kind of reason that's wasteful but sometimes useful in a deadline rush is to simply introduce a validity/null state to something that doesn't inherently have it.
Occasionally it's useful as an optimization to hoist out certain less frequently-accessed, larger members of an object so that its commonly-accessed elements fit more snugly (and perhaps with adjacent objects) in a cache line. There unique_ptr can let you split apart that object's memory layout while still conforming to RAII.
Now wrapping a shared_ptr on top of a standard container might have more legitimate applications if you have a container that can actually be owned (sensibly) by more than one owner. With unique_ptr, only one owner can possess the object at a time, and standard containers already let you swap and move each other's internal guts (the big, dynamic parts). So there's very little reason I can think of to wrap a standard container directly with a unique_ptr, as it's already somewhat like a smart pointer to a dynamic array (but with more functionality to work with that dynamic data, including deep copying it if desired).
And if we talk about non-standard containers, like say you're working with a third party library that provides some data structures whose contents can get very large but they fail to provide those cheap, non-invalidating move/swap semantics, then you might superficially wrap it around a unique_ptr, exchanging some creation/access/destruction overhead to get those cheap move/swap semantics back as a workaround. For the standard containers, no such workaround is needed.
I agree with @MooingDuck; I don't think using std::unique_ptr has any compelling advantages. However, I could see a use case for std::shared_ptr if the member data is very large and the class is going to support COW (copy-on-write) semantics (or any other use case where the data is shared across multiple instances).
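A minimal, single-threaded sketch of the copy-on-write idea might look like the following. CowVector and all of its members are hypothetical names, and relying on use_count() to decide when to detach is an assumption that only holds without concurrent access.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical copy-on-write wrapper: copies share one vector until a write.
class CowVector {
public:
    explicit CowVector(std::size_t n)
        : data_(std::make_shared<std::vector<double>>(n, 0.0)) {}

    double get(std::size_t i) const { return (*data_)[i]; }

    void set(std::size_t i, double value) {
        if (data_.use_count() > 1) {
            // Someone else still shares the data: take a private deep copy.
            data_ = std::make_shared<std::vector<double>>(*data_);
        }
        (*data_)[i] = value;
    }

    bool sharesWith(const CowVector& other) const { return data_ == other.data_; }

private:
    std::shared_ptr<std::vector<double>> data_;
};

// Returns true if a copy shares storage until it is written to.
bool cowDetachesOnWrite() {
    CowVector a(3);
    CowVector b = a;                       // cheap: copies one shared_ptr
    bool sharedBefore = a.sharesWith(b);
    b.set(0, 5.0);                         // triggers the deep copy
    return sharedBefore && !a.sharesWith(b)
        && a.get(0) == 0.0 && b.get(0) == 5.0;
}
```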
For my game I have built a small framework which among other things has:
Entities that own components.
Systems that hold pointers to the entities.
An Engine that owns the systems.
An EntityManager that owns the entities.
Every time I add a Component, the Entity passes its "this" pointer to the Systems through an Engine pointer that it holds, and they decide whether to register it or ignore it.
Now, since the Entities are elements of the EntityManager's container, am I right in assuming that if an insert operation to it causes shifts or reallocation, the systems won't hold valid pointers any more?
If so, what's a good container that can be used to prevent this from happening? If I understand things correctly this is similar to what happens with iterators and the same rules should apply when requiring non-invalidation with insertion.
If you store a vector of entities and then just store their iterators to access them: yes, a reallocation might invalidate all your data.
The suggested way is to store a vector of pointers (if you need memory collection capabilities you might want to go for a vector of smart pointers). This way you will be sure that the pointers are valid (assuming nothing else touched the objects) at every insertion/deletion regardless of the reallocation of the container's space.
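Here is a small sketch of that suggestion, assuming a stand-in Entity type and std::unique_ptr as the smart pointer (both are illustrative choices, not taken from the question). The vector may reallocate its array of pointers many times, but the Entity objects themselves never move, so a pointer handed to a System earlier stays valid.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for the framework's Entity.
struct Entity {
    int id;
    explicit Entity(int i) : id(i) {}
};

// Returns true if a pointer taken before many insertions still refers to
// the same, intact Entity afterwards.
bool pointeesSurviveReallocation() {
    std::vector<std::unique_ptr<Entity>> entities;
    entities.push_back(std::make_unique<Entity>(0));
    Entity* registered = entities.front().get();   // what a System would keep

    for (int i = 1; i < 10000; ++i) {
        // The vector's internal array of pointers is reallocated as it grows,
        // but each Entity stays where it was allocated.
        entities.push_back(std::make_unique<Entity>(i));
    }
    return registered == entities.front().get() && registered->id == 0;
}
```

With a plain std::vector<Entity>, the same loop would move the Entity objects on every reallocation, and `registered` would be left dangling.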
It isn't clear from the question, but a word of advice if you're storing objects in your containers instead of pointers: when inserting elements into a container with, for example,
std::vector<T>::push_back()
you're storing a copy of the object. This is usually undesirable since it brings additional copy overhead and might create problems if things aren't properly set up. See "shallow copies" and "deep copies" to learn more about this problem.
Your pointer value will only change if a relocation of the actual value happens.
This is the case where you manipulate arrays of objects instead of arrays of pointers to these objects. You should definitely not do the former.
I would suggest using standard collections like std::array or std::vector to manage the objects. With those, and provided you have instantiated the objects on the heap (read: with new), you won't have to worry about the value of this.
I am starting to learn C++ and have a simple question.
Say I have a std::vector which should hold some custom objects.
Is it better to create those objects with the new operator, or should I just instantiate the objects normally and pass them to the vector? I am just wondering, because in Java I do not need to care about this.
For example, I am creating a bunch of objects in the main class. Then I pass these objects to a vector which is contained in another class. Is it okay to instantiate the objects on the stack? Or should I always do it with the new operator (then I have to take care that the objects get deleted somewhere)? Or is the simple answer: it depends on your program?
cheers
Simple is better. Just store the values directly in the vector, unless the values are (a) huge, (b) non-copyable (even then we have std::move in C++11), or (c) owned by another object (use shared_ptr or raw pointers).
Are the objects polymorphic (i.e. are you making nontrivial use of inheritance)? Do you plan to share object references? Are they noncopyable? If the answer to all of these questions is no, you're likely best off storing them by value (use emplace).
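To illustrate the "store by value, use emplace" advice, here is a sketch with a hypothetical Tracked type that counts how often its copy constructor runs. emplace_back constructs each element directly inside the vector, while push_back of a named object has to copy it.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type that counts copy constructions.
struct Tracked {
    static int copies;
    std::string name;
    int value;

    Tracked(std::string n, int v) : name(std::move(n)), value(v) {}
    Tracked(const Tracked& other) : name(other.name), value(other.value) { ++copies; }
    Tracked(Tracked&&) noexcept = default;   // lets the vector move on growth
};
int Tracked::copies = 0;

// Elements are built in place from the constructor arguments.
int copiesMadeByEmplace() {
    Tracked::copies = 0;
    std::vector<Tracked> v;
    v.emplace_back("a", 1);
    v.emplace_back("b", 2);
    v.emplace_back("c", 3);
    return Tracked::copies;
}

// Pushing back a named object copies it into the vector.
int copiesMadeByPushBackOfLvalue() {
    Tracked::copies = 0;
    std::vector<Tracked> v;
    Tracked t("a", 1);
    v.push_back(t);
    return Tracked::copies;
}
```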
If you are storing them by reference, you should be using some variety of smart pointers.
It really depends on the objects. If they are just small structs holding a couple of simple values then you don't really need to use new. But if your objects are very large or have other classes inside of them, then I would pass pointers to them to your vector.
The thing to realize is that std::vector stores copies of the objects, so if they are smaller then you don't care about copying. Also if these objects have any dynamic memory usage inside them, then you should probably overload the copy-constructor, because this is what is used to make these copies.
I have a set of C++ classes and raw pointers to their objects. A pointer to an object gets passed down to a function. The problem is that the underlying function might sometimes need to store the pointer in an STL container, so that the pointer can be used later on. If I am not using shared_ptr, I am thinking of adding a bool flag to the class which indicates whether the caller of the function is responsible for deleting the object's memory. Would that be fine?
Thanks.
Messy. And rife with many potential bugs that will keep you at work well past midnight on a Saturday.
Be clear and consistent about resource ownership. Either the vector owns the pointers, or some specific function owns the pointers, or smart pointers own pointers. Any mixing of these semantics will have the ultimate result of you tearing your hair out late at night.
The best solution is usually to use a reference-counted smart pointer. (As you probably already know all too well, you can't use std::auto_ptr.) Barring that, create a class whose sole purpose in life is to allocate, deallocate and grant access to the vector's contained pointers. Any function that needs the contained objects would go through your manager class to get to them.
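As a sketch of the reference-counted approach, the example below replaces the proposed bool flag with std::shared_ptr. Widget, Registry and maybeStore are hypothetical names. The callee keeps a copy of the shared_ptr only when it needs the object later, and nobody has to track who is responsible for deleting it.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical object type passed down to the function.
struct Widget {
    int id;
    explicit Widget(int i) : id(i) {}
};

// Hypothetical callee that sometimes keeps the object for later use.
class Registry {
public:
    void maybeStore(const std::shared_ptr<Widget>& w, bool keep) {
        if (keep) {
            stored_.push_back(w);   // shares ownership with the caller
        }
    }
private:
    std::vector<std::shared_ptr<Widget>> stored_;
};

// Returns true if a stored object outlives the caller's own pointer while
// an object that was not stored is destroyed when the caller lets go.
bool storedObjectOutlivesCaller() {
    Registry registry;
    std::weak_ptr<Widget> keptWatch, droppedWatch;   // observe without owning
    {
        auto kept = std::make_shared<Widget>(1);
        auto dropped = std::make_shared<Widget>(2);
        keptWatch = kept;
        droppedWatch = dropped;
        registry.maybeStore(kept, true);
        registry.maybeStore(dropped, false);
    }   // the caller's shared_ptrs go out of scope here
    return !keptWatch.expired() && droppedWatch.expired();
}
```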
STL containers will almost certainly take a copy of the object which you insert into it. When the object is removed from the container, the object will be destroyed.
If the default copy constructor is not sufficient (i.e. you need to do a deep copy of the object), you need to ensure you implement your own version which does the copy properly.
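For a class that owns raw memory, a deep-copying copy constructor could be sketched as follows. Buffer is a hypothetical class written for illustration; in new code a std::vector member would make all of this unnecessary.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical class owning a raw array, so the default copy would be shallow.
class Buffer {
public:
    explicit Buffer(std::size_t n) : size_(n), data_(new int[n]()) {}

    // Deep copy: allocate a separate array and copy the elements over.
    Buffer(const Buffer& other) : size_(other.size_), data_(new int[other.size_]) {
        std::copy(other.data_, other.data_ + size_, data_);
    }

    // Copy-and-swap keeps assignment consistent with the copy constructor.
    Buffer& operator=(const Buffer& other) {
        if (this != &other) {
            Buffer tmp(other);
            std::swap(size_, tmp.size_);
            std::swap(data_, tmp.data_);
        }
        return *this;
    }

    ~Buffer() { delete[] data_; }

    int& operator[](std::size_t i) { return data_[i]; }
    const int* raw() const { return data_; }

private:
    std::size_t size_;
    int* data_;
};

// Returns true if the element stored in the vector is independent of the
// object it was copied from.
bool containerHoldsIndependentCopy() {
    Buffer original(4);
    original[0] = 7;
    std::vector<Buffer> v;
    v.push_back(original);   // the copy constructor runs here
    v[0][0] = 99;            // modifies only the stored copy
    return original[0] == 7 && v[0][0] == 99 && original.raw() != v[0].raw();
}
```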
Designing a new system from scratch. I'll be using the STL to store lists and maps of certain long-lived objects.
Question: Should I ensure my objects have copy constructors and store copies of objects within my STL containers, or is it generally better to manage the life & scope myself and just store the pointers to those objects in my STL containers?
I realize this is somewhat short on details, but I'm looking for the "theoretical" better answer if it exists, since I know both of these solutions are possible.
Two very obvious disadvantages to playing with pointers:
1) I must manage allocation/deallocation of these objects myself in a scope beyond the STL.
2) I cannot create a temp object on the stack and add it to my containers.
Is there anything else I'm missing?
Since people are chiming in on the efficiency of using pointers:
If you're considering using a std::vector, updates are few, you often iterate over your collection, and it's a non-polymorphic type, then storing object "copies" will be more efficient since you'll get better locality of reference.
Otoh, if updates are common storing pointers will save the copy/relocation costs.
This really depends upon your situation.
If your objects are small, and doing a copy of the object is lightweight, then storing the data inside an stl container is straightforward and easier to manage in my opinion because you don't have to worry about lifetime management.
If your objects are large, and having a default constructor doesn't make sense, or copies of objects are expensive, then storing with pointers is probably the way to go.
If you decide to use pointers to objects, take a look at the Boost Pointer Container Library. This boost library wraps all the STL containers for use with dynamically allocated objects.
Each pointer container (for example ptr_vector) takes ownership of an object when it is added to the container, and manages the lifetime of those objects for you. You also access all the elements in a ptr_ container by reference. This lets you do things like
#include <boost/ptr_container/ptr_vector.hpp>
class BigExpensive { ... };
// create a pointer vector
boost::ptr_vector<BigExpensive> bigVector;
bigVector.push_back( new BigExpensive( "Lexus", 57700 ) );
bigVector.push_back( new BigExpensive( "House", 15000000 ) );
// get a reference to the first element
BigExpensive& expensiveItem = bigVector[0];
expensiveItem.sell();
These classes wrap the STL containers and work with all of the STL algorithms, which is really handy.
There are also facilities for transferring ownership of a pointer in the container to the caller (via the release function in most of the containers).
If you're storing polymorphic objects you always need to use a collection of base class pointers.
That is, if you plan on storing different derived types in your collection, you must store pointers or get eaten by the slicing demon.
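A short sketch of the slicing problem, using hypothetical Shape and Circle types: storing a Circle in a vector of Shape values keeps only the Shape part, while a vector of base class pointers (std::unique_ptr here, as one possible choice) preserves the dynamic type.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical class hierarchy for illustration.
struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const { return "shape"; }
};

struct Circle : Shape {
    std::string name() const override { return "circle"; }
};

// Storing by value copies only the Shape part of the Circle (slicing).
std::string nameFromValueContainer() {
    std::vector<Shape> shapes;
    shapes.push_back(Circle{});
    return shapes.front().name();
}

// Storing base class pointers keeps the whole Circle object.
std::string nameFromPointerContainer() {
    std::vector<std::unique_ptr<Shape>> shapes;
    shapes.push_back(std::make_unique<Circle>());
    return shapes.front()->name();
}
```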
Sorry to jump in 3 years after the event, but a cautionary note here...
On my last big project, my central data structure was a set of fairly straightforward objects. About a year into the project, as the requirements evolved, I realised that the object actually needed to be polymorphic. It took a few weeks of difficult and nasty brain surgery to fix the data structure to be a set of base class pointers, and to handle all the collateral damage in object storage, casting, and so on. It took me a couple of months to convince myself that the new code was working. Incidentally, this made me think hard about how well-designed C++'s object model is.
On my current big project, my central data structure is a set of fairly straightforward objects. About a year into the project (which happens to be today), I realised that the object actually needs to be polymorphic. Back to the net, found this thread, and found Nick's link to the Boost pointer container library. This is exactly what I had to write last time to fix everything, so I'll give it a go this time around.
The moral, for me, anyway: if your spec isn't 100% cast in stone, go for pointers, and you may potentially save yourself a lot of work later.
Why not get the best of both worlds: do a container of smart pointers (such as boost::shared_ptr or std::shared_ptr). You don't have to manage the memory, and you don't have to deal with large copy operations.
Generally storing the objects directly in the STL container is best as it is simplest, most efficient, and is easiest for using the object.
If your object itself has non-copyable semantics or is an abstract base type, you will need to store pointers (easiest is to use shared_ptr).
You seem to have a good grasp of the difference. If the objects are small and easy to copy, then by all means store them.
If not, I would think about storing smart pointers (not auto_ptr, a ref counting smart pointer) to ones you allocate on the heap. Obviously, if you opt for smart pointers, then you can't store temp stack allocated objects (as you have said).
@Torbjörn makes a good point about slicing.
Using pointers will be more efficient since the containers will be only copying pointers around instead of full objects.
There's some useful information here about STL containers and smart pointers:
Why is it wrong to use std::auto_ptr<> with standard containers?
If the objects are to be referred to elsewhere in the code, store in a vector of boost::shared_ptr. This ensures that pointers to the object will remain valid if you resize the vector.
Ie:
std::vector<boost::shared_ptr<protocol> > protocols;
...
connection c(protocols[0].get()); // pointer to protocol stays valid even if resized
If no one else stores pointers to the objects, or the list doesn't grow and shrink, just store as plain-old objects:
std::vector<protocol> protocols;
connection c(protocols[0]); // value-semantics, takes a copy of the protocol
This question has been bugging me for a while.
I lean to storing pointers, but I have some additional requirements (SWIG lua wrappers) that might not apply to you.
The most important point in this post is to test it yourself, using your own objects.
I did this today to test the speed of calling a member function on a collection of 10 million objects, 500 times.
The function updates x and y based on xdir and ydir (all float member variables).
I used a std::list to hold both types of objects, and I found that storing the object in the list is slightly faster than using a pointer. On the other hand, the performance was very close, so it comes down to how they will be used in your application.
For reference, with -O3 on my hardware the pointers took 41 seconds to complete and the raw objects took 30 seconds to complete.
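A scaled-down sketch of such a test could look like this. It is not the code used for the numbers above: Particle, Result, runByValue and runByPointer are hypothetical names, std::unique_ptr stands in for whatever pointer type was used, and std::chrono::steady_clock is one reasonable way to time the loops. Pass your own element count and number of rounds.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <list>
#include <memory>

// Hypothetical object with the four float members described above.
struct Particle {
    float x, y, xdir, ydir;
    void update() { x += xdir; y += ydir; }
};

struct Result {
    double seconds;   // wall-clock time spent in the update loops
    float finalX;     // x of the first particle, as a sanity check
};

// Objects stored directly in the list nodes.
Result runByValue(std::size_t count, int rounds) {
    std::list<Particle> particles(count, Particle{0.f, 0.f, 1.f, 2.f});
    auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < rounds; ++r) {
        for (auto& p : particles) p.update();
    }
    std::chrono::duration<double> elapsed = std::chrono::steady_clock::now() - start;
    return {elapsed.count(), particles.front().x};
}

// List nodes hold pointers to separately allocated objects.
Result runByPointer(std::size_t count, int rounds) {
    std::list<std::unique_ptr<Particle>> particles;
    for (std::size_t i = 0; i < count; ++i) {
        particles.push_back(std::make_unique<Particle>(Particle{0.f, 0.f, 1.f, 2.f}));
    }
    auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < rounds; ++r) {
        for (auto& p : particles) p->update();
    }
    std::chrono::duration<double> elapsed = std::chrono::steady_clock::now() - start;
    return {elapsed.count(), particles.front()->x};
}
```

Both variants should produce the same particle state; only the elapsed time is expected to differ, and by how much will depend on the compiler, flags and hardware.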