C++ - shared_ptr<vector<T>> vs. vector<shared_ptr<T>>

I see a lot of cases where people use vector<shared_ptr<T>>. When and why would you use shared_ptr<vector<T>> instead? For me, the latter seems more efficient both in performance and memory-usage. Is it wrong to share a single vector of objects across the application?

Using vector<shared_ptr<T>> allows you to hand instances of T from the vector to other parts of the code without fear that they will be freed while still in use, even if the vector itself no longer exists.
shared_ptr<vector<T>>, on the other hand, protects only the vector; its elements of type T get no protection against memory leaks. I'm assuming here that T is a pointer type; if T is a non-pointer type then of course there is no leak to worry about. (Someone could, of course, make T itself a shared_ptr.)
It's actually more common to use vector<shared_ptr<T>>; I don't really remember ever using shared_ptr<vector<T>>.
The point is to never keep bare pointers to allocated memory in your code; always keep them in some kind of smart pointer. It's also perfectly fine to implement your own allocate/deallocate mechanism, e.g. using RAII.
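A minimal sketch of the difference in what each layout actually protects, using a hypothetical Widget type (C++11):

#include <memory>
#include <vector>

struct Widget { int value = 0; };

// vector<shared_ptr<T>>: each element's lifetime is independent of the vector
std::shared_ptr<Widget> grabOne() {
    std::vector<std::shared_ptr<Widget>> v;
    v.push_back(std::make_shared<Widget>());
    return v[0];              // safe: the Widget outlives v
}

// shared_ptr<vector<T>>: only the vector's lifetime is shared
void useShared() {
    auto pv = std::make_shared<std::vector<Widget>>(3);
    Widget* w = &(*pv)[0];    // raw pointer into the vector: dangles once pv
    (void)w;                  // (or any reallocation) releases the storage
}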

Related

Realloc creates dangling pointers?

I wrote some array code that allocates memory, where each value in the array is of some type. I then have another array that consists of pointers into the first array, for references.
Both arrays can grow; it uses realloc. Because the 2nd array contains pointers into the first, they are surely not updated when the first array changes (I don't do it manually and there is no GC). Surely all the pointers in the 2nd array are invalid! (They point to memory that was freed by realloc.)
This is the case right?
This seems like it would make persistent pointers to blocks of memory that may move very dangerous?
What is the standard solution? Don't use "global" pointers? Using pointers to pointers to pointers? I think I could make the 2nd array use **'s and could probably get things to work.
In an MT environment, things are even worse. A local pointer access may be interrupted in the middle, the memory moved, and the local pointer is now wrong. (Which, of course, might be solved by preventing the moves with a lock, etc...)
Go with functional programming?
Yes, realloc can invalidate your pointers. If there is no contiguous space to grow into in place, your array will be moved.
Consider using a container such as std::deque.
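A short sketch of both points (error handling omitted): the realloc half shows exactly the dangling pointer the question describes, while std::deque::push_back never moves existing elements:

#include <cstdio>
#include <cstdlib>
#include <deque>

int main() {
    int* a = (int*)std::malloc(4 * sizeof(int));
    int* ref = &a[0];                      // pointer into the first array
    a = (int*)std::realloc(a, 4096 * sizeof(int));
    // if the block moved, ref now dangles; dereferencing it is undefined
    std::free(a);

    std::deque<int> d(4);
    int* stable = &d[0];
    d.push_back(42);                       // never moves existing deque elements
    std::printf("%d\n", *stable);          // still valid
}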
1) This is the case right?
Yes.
2) This seems like it would make persistent pointers to blocks of memory that may move very dangerous?
Yes.
3) What is the standard solution?
You design your application so that the lifetimes of your objects are well defined and you do not refer to them after they are no longer required.
4) In an MT environment, things are even worse. A local pointer access may be interrupted in the middle, the memory moved, and the local pointer is now wrong. (Which, of course, might be solved by preventing the moves with a lock, etc...)
Obviously you should never use a pointer that no longer points to its resource. Managing shared resources in an MT environment is non-trivial and there is a whole bunch of tools and techniques to achieve it.
5) Go with functional programming?
It is always advisable to avoid pointers if you can.
Without a specific problem it is hard to give a specific solution. But in order to achieve "not pointing at disappeared resources" we have various tools to employ: smart pointers, containers, and value semantics. We need to understand how to use all of these, but we also need to design with object lifetime in mind as a major consideration.
Object lifetime should always be an important factor. Some languages (like Java, for instance) mitigate bad design by providing a "safer" environment; C++, on the other hand, is rather less forgiving. It does, however, have a whole bunch of sophisticated tools for the task. That means a steeper learning curve, but more efficiency and better control.
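As one concrete instance of those tools, applied to the original realloc question: a common fix is to store indices instead of pointers, since an index names a position rather than an address and survives reallocation. A minimal sketch:

#include <cstddef>
#include <vector>

struct Table {
    std::vector<double> data;
    std::vector<std::size_t> refs;   // indices into data, not pointers

    double& lookup(std::size_t r) { return data[refs[r]]; }
};

// data.push_back(...) may reallocate and move every element, but each
// index in refs still names the same logical element afterwards.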

For a data member, is there any difference between dynamically allocating this variable (or not) if the containing object is already in dynamic memory?

I'm starting with the assumption that, generally, it is a good idea to allocate small objects on the stack, and big objects in dynamic memory. Another assumption is that I'm possibly confused while trying to learn about memory, STL containers and smart pointers.
Consider the following example, where I have an object that is necessarily allocated in the free store through a smart pointer, and I can rely on clients getting said object from a factory, for instance. This object contains some data that is specifically allocated using an STL container, which happens to be a std::vector. In one case, this data vector itself is dynamically allocated using some smart pointer, and in the other situation I just don't use a smart pointer.
Is there any practical difference between design A and design B, described below?
Situation A:
class SomeClass{
public:
SomeClass(){ /* initialize some potentially big STL container */ }
private:
std::vector<double> dataVector_;
};
Situation B:
class SomeOtherClass{
public:
SomeOtherClass() { /* initialize some potentially big STL container,
but is it allocated in any different way? */ }
private:
std::unique_ptr<std::vector<double>> pDataVector_;
};
Some factory functions.
std::unique_ptr<SomeClass> someClassFactory(){
return std::make_unique<SomeClass>();
}
std::unique_ptr<SomeOtherClass> someOtherClassFactory(){
return std::make_unique<SomeOtherClass>();
}
Use case:
int main(){
//in my case I can reliably assume that objects themselves
//are going to always be allocated in dynamic memory
auto pSomeClassObject(someClassFactory());
auto pSomeOtherClassObject(someOtherClassFactory());
return 0;
}
I would expect that both design choices have the same outcome, but do they?
Is there any advantage or disadvantage for choosing A or B? Specifically, should I generally choose design A because it's simpler or are there more considerations? Is B morally wrong because it can dangle for a std::vector?
tl;dr: Is it wrong to have a smart pointer pointing to an STL container?
edit:
The related answers pointed to useful additional information for someone as confused as myself.
Usage of objects or pointers to objects as class members and memory allocation
and Class members that are objects - Pointers or not? C++
And changing some Google keywords led me to When vectors are allocated, do they use memory on the heap or the stack?
std::unique_ptr<std::vector<double>> is slower, takes more memory, and its only advantage is that it adds an additional possible state: "vector doesn't exist". However, if you care about that state, use boost::optional<std::vector<double>> instead. You should almost never have a heap-allocated container, and definitely never via unique_ptr. It actually works fine (no "dangling"); it's just pointlessly slow.
Using std::unique_ptr here is just wasteful unless your goal is a compiler firewall (basically hiding the compile-time dependency on vector, but then you'd need a forward declaration of it).
You're adding an indirection but, more importantly, the full contents of SomeClass turn into 3 separate memory blocks to load when accessing the contents (SomeClass merged with/containing unique_ptr's block, pointing to std::vector's block, pointing to its element array). In addition you're paying one extra, superfluous level of heap overhead.
Now you might start imagining scenarios where an indirection is helpful to the vector, like maybe you can shallow move/swap the unique_ptrs between two SomeClass instances. Yes, but vector already provides that without a unique_ptr wrapper on top. And it already has states like empty that you can reuse for some concept of validity/nilness.
Remember that variable-sized containers themselves are small objects, not big ones; they point to potentially big blocks. vector isn't big; its dynamic contents can be. The idea of adding indirections for big objects isn't a bad rule of thumb, but vector is not a big object. Before move semantics there were more reasons to think of something like std::vector as one indivisibly large object (though its contents were always swappable), but with move semantics in place it's worth thinking of it as a little handle pointing to big, dynamic contents that can be shallow-moved and swapped cheaply.
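To make the "small handle to big contents" point concrete, a quick sketch (the 24-byte figure is typical of 64-bit implementations, not guaranteed):

#include <cstdio>
#include <utility>
#include <vector>

int main() {
    std::vector<double> big(1000000);           // ~8 MB of dynamic contents
    std::printf("%zu\n", sizeof big);           // typically 24: just the handle

    std::vector<double> taken = std::move(big); // O(1): only the handle moves
    std::vector<double> a(10), b(20);
    a.swap(b);                                  // likewise O(1), no element copies
}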
Some common reasons to introduce an indirection through something like unique_ptr are:
Abstraction & hiding. If you're trying to abstract or hide the concrete definition of some type/subtype, Foo, then this is where you need the indirection so that its handle can be captured (or potentially even used with abstraction) by those who don't know exactly what Foo is.
To allow a big, contiguous 1-block-type object to be passed around from owner to owner without invoking a copy or invalidating references/pointers (iterators included) to it or its contents.
A hasty kind of reason that's wasteful but sometimes useful in a deadline rush is to simply introduce a validity/null state to something that doesn't inherently have it.
Occasionally it's useful as an optimization to hoist out certain less frequently-accessed, larger members of an object so that its commonly-accessed elements fit more snugly (and perhaps with adjacent objects) in a cache line. There unique_ptr can let you split apart that object's memory layout while still conforming to RAII.
Now wrapping a shared_ptr on top of a standard container might have more legitimate applications if you have a container that can actually be owned (sensibly) by more than one owner. With unique_ptr, only one owner can possess the object at a time, and standard containers already let you swap and move each other's internal guts (the big, dynamic parts). So there's very little reason I can think of to wrap a standard container directly with a unique_ptr, as it's already somewhat like a smart pointer to a dynamic array (but with more functionality to work with that dynamic data, including deep copying it if desired).
And if we talk about non-standard containers, say you're working with a third-party library that provides data structures whose contents can get very large but which fails to provide those cheap, non-invalidating move/swap semantics, then you might superficially wrap one in a unique_ptr, exchanging some creation/access/destruction overhead to get those cheap move/swap semantics back as a workaround. For the standard containers, no such workaround is needed.
I agree with @MooingDuck; I don't think using std::unique_ptr here has any compelling advantages. However, I could see a use case for std::shared_ptr if the member data is very large and the class is going to support COW (copy-on-write) semantics (or any other use case where the data is shared across multiple instances).
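A minimal sketch of that COW idea (hypothetical CowData name; single-threaded only, since use_count is merely a hint under concurrency):

#include <cstddef>
#include <memory>
#include <vector>

class CowData {
    std::shared_ptr<std::vector<double>> data_;
public:
    explicit CowData(std::size_t n)
        : data_(std::make_shared<std::vector<double>>(n)) {}
    // copies of CowData share the payload until someone writes
    double read(std::size_t i) const { return (*data_)[i]; }
    void write(std::size_t i, double v) {
        if (data_.use_count() > 1)   // shared: detach with a deep copy first
            data_ = std::make_shared<std::vector<double>>(*data_);
        (*data_)[i] = v;
    }
};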

Usage of Smart Pointers as a Programming Standard?

More and more I hear that I should use smart pointers instead of naked pointers, even though I have an effective memory-leak detection system implemented.
What is the correct programming approach to using smart pointers? Should they really be used, even if I check for memory leaks on allocated memory blocks? Is it still up to me? If I do not use them, can this be considered a programming weakness?
If smart pointers (e.g. std::auto_ptr) are strongly recommended, should I use them instead of every naked pointer?
You should use RAII to handle all resource allocations.
Smart pointers are just one common special case of that rule.
And smart pointers are more than just shared_ptr. There are different smart pointers with different ownership semantics. Use the one that suits your needs. (The main ones are scoped_ptr, shared_ptr, weak_ptr and auto_ptr/unique_ptr; prefer the latter where available.) Depending on your compiler, they may be available in the standard library, as part of TR1, or not at all, in which case you can get them through the Boost libraries.
And yes, you should absolutely use these. It costs you nothing (done correctly, you lose zero performance) and it gains you a lot: memory and other resources are automatically freed, you don't have to remember to handle it manually, and the code using the resource gets shorter and more concise.
Note that not every pointer usage represents some kind of resource ownership, and so not all raw pointer usage is wrong. If you simply need to point to an object owned by someone else, a raw pointer is perfectly suitable. But if you own the object, then you should take proper ownership of it, either by giving the class itself RAII semantics, or by wrapping it in a smart pointer.
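To illustrate that owning/non-owning split, a sketch with hypothetical Renderer/Texture names (make_unique is C++14; unique_ptr(new ...) works in C++11):

#include <memory>

struct Texture { /* ... */ };

struct Renderer {
    std::unique_ptr<Texture> owned;           // Renderer owns the Texture (RAII)
};

void draw(const Texture* t) { (void)t; }      // non-owning: observes, never deletes

void frame(Renderer& r) {
    r.owned = std::make_unique<Texture>();
    draw(r.owned.get());                      // lend the raw pointer for the call only
}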
You can't just blindly substitute std::auto_ptr for every raw pointer. In particular, auto_ptr transfers ownership on assignment, which is great for some purposes but definitely not for others.
There is a real reason there are several varieties of smart pointers (e.g., shared_ptr, weak_ptr, auto_ptr/unique_ptr, etc.). Each fulfills a different purpose. One major weakness of a "raw" pointer is that it has so many different uses (and has that versatility largely because it does little or nothing to assist in any one purpose). Smart pointers tend to be more specialized, which means they can be more intelligent about doing one thing well, but also means you have to pick the right one for the job or it'll end up doing the wrong thing entirely.
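A sketch of why blind substitution fails: auto_ptr silently transfers ownership on copy, where unique_ptr makes the same transfer explicit (auto_ptr is deprecated in C++11 and removed in C++17):

#include <memory>
#include <utility>

void transfer_demo() {
    std::auto_ptr<int> a(new int(1));
    std::auto_ptr<int> b = a;        // ownership moves silently: a is now null
    // *a;                           // would be undefined behavior

    std::unique_ptr<int> c(new int(2));
    std::unique_ptr<int> d = std::move(c);   // the transfer must be spelled out
}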
Smart pointers allow you to automatically define the lifetime of the objects they refer to. That's the main thing to understand.
So, no, you shouldn't use smart pointers everywhere; only when you want to automate the lifetime of your objects instead of having, for example, an object manage those objects internally from birth to death. It's like any tool: it solves a specific kind of problem, not all problems.
For each object, you should think about the life cycle it will go through, then choose one of the simplest correct and efficient solution. Sometimes it will be shared_ptr because you want the object to be used by several components and to be automatically destroyed once not used anymore. Sometimes you need the object only in the current scope/parent-object, so scoped_ptr might be more appropriate. Sometimes you need only one owner of the instance, so unique_ptr is appropriate. Maybe you'll find cases where you know an algorithm that might define/automate the lifetime of an object, so you'll write your own smart pointer for it.
As an example of the opposite case: using pools prevents you from using smart pointers, and naked pointers might be the more welcome, simple and efficient solution in this particular case (which is common in embedded software).
See this answer (from me) for more explanations: https://softwareengineering.stackexchange.com/questions/57581/in-c-is-it-a-reflection-of-poor-software-design-if-objects-are-deleted-manuall/57611#57611
Should they really be used, even if I check for memory leaks on allocated memory blocks?
YES
The whole purpose of smart pointers is to help you implement RAII (SBRM), which basically lets the resource itself take responsibility for its deallocation, so that you don't have to remember to deallocate it explicitly.
If I do not use them, can this be considered a programming weakness?
NO.
It is not a weakness, but it is an inconvenience and unnecessary hassle to manage resources explicitly by yourself when you are not using smart pointers (RAII). The purpose of smart pointers implementing RAII is to provide an efficient, hassle-free way of handling resources, and by not using them you are simply not taking advantage of it. Using them is highly recommended purely for the numerous advantages they provide.
If smart pointers (e.g. std::auto_ptr) are strongly recommended, should I use them instead of every naked pointer?
YES
You should use smart pointers wherever possible, simply because there is no drawback to using them and numerous advantages.
Don't use auto_ptr, though, because it is already deprecated! There are various other smart pointers available that you can use depending on the requirement. You can refer to the link above to learn more about them.
It's a tricky question, and the fact that there is currently a fashion for using smart pointers everywhere doesn't make things any easier. Smart pointers can help in certain situations, but you certainly can't just use them everywhere, without thinking. There are many different types of smart pointers, and you have to think about which one is appropriate in every case; and even then, most of your pointers (at least in typical applications in the domains I've worked in) should be raw pointers.
Regardless of the approach, several points are worth mentioning:
Don't use dynamic allocation unless you have to. In many applications, the only things that need to be allocated dynamically are objects with specific lifetimes, determined by the application logic. Don't use dynamic allocation for objects with value semantics.
With regard to entity objects, those which model something in the application domain: these should be created and destructed according to the program logic, regardless of whether there are pointers to them or not. If their destruction causes a problem, then you have an error in your program logic somewhere (not handling an event correctly, etc.), and using smart pointers won't change anything.
A typical example of an entity object might be a client connection in a server, created when the client connects and destructed when the client disconnects. In many such cases, the most appropriate management will be a delete this, since it is the connection which will receive the disconnection event. (Objects which hold pointers to such an object will have to register with it in order to be informed of its destruction. But such pointers are purely for navigation, and shouldn't be smart pointers.)
What you'll usually find when people try to use smart pointers everywhere is that memory still leaks; typical reference counters don't handle cycles, and of course, typical applications are full of cycles: a Connection will point to the Client which is connected to it, and the Client will contain a list of the Connections where it is connected. And if the smart pointer is boost::shared_ptr, there's also a definite risk of dangling pointers: it's far too easy to create two boost::shared_ptr to the same address (which results in two counters for the references).
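A sketch of that Client/Connection cycle, and the usual fix of breaking one edge with weak_ptr:

#include <memory>
#include <vector>

struct Connection;

struct Client {
    std::vector<std::shared_ptr<Connection>> connections;
};

struct Connection {
    // a shared_ptr<Client> here would close the cycle and leak both objects;
    // weak_ptr observes the Client without keeping it alive
    std::weak_ptr<Client> client;
};

int main() {
    auto cl = std::make_shared<Client>();
    auto co = std::make_shared<Connection>();
    co->client = cl;
    cl->connections.push_back(co);
}   // both freed here; with shared_ptr in Connection, neither would be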
If smart pointers (e.g. std::auto_ptr) are strongly recommended, should I use them instead of every naked pointer?
In my opinion, yes, you should use them for every pointer that you own.
Here are my ideas on resource management in C++ (feel free to disagree):
Good resource management requires thinking in terms of ownership.
Resources should be managed by objects (RAII).
Usually single ownership is preferred over shared ownership.
Ideally the creator is also the owner of the object. (However, there are situations where ownership transfer is in order.)
This leads to the following practices:
Make boost::scoped_ptr the default choice for local and member variables. Do keep in mind that using scoped_ptr for member variables will make your class non-copyable. If you don't want this, see the next point.
Use boost::shared_ptr for containers or to enable shared ownership:
// Container of MyClass* pointers:
typedef boost::shared_ptr<MyClass> MyClassPtr;
std::vector<MyClassPtr> vec;
The std::auto_ptr (C++03) can be used for ownership transfer. For example as the return value of factory or clone methods:
// Factory method returns auto_ptr
std::auto_ptr<Button> button = Button::Create(...);
// Clone method returns auto_ptr
std::auto_ptr<MyClass> copy = obj->clone();
// Use release() to transfer the ownership to a scoped_ptr or shared_ptr
boost::scoped_ptr<MyClass> copy(obj->clone().release());
If you need to store a pointer that you don't own then you can use a raw pointer:
this->parent = inParentObject;
In certain situations a boost::weak_ptr is required. See the documentation for more information.
In general you should prefer smart pointers, but there are a couple of exceptions.
If you need to recast a pointer, for example to provide a const version, that becomes nearly impossible with smart pointers.
Smart pointers are used to control object lifetime. Often when you are passing a pointer to a function, the function will not affect the lifetime; the function does not try to delete the object, and it does not store a copy of the pointer. The calling code cannot delete the object until the function returns. In that case a dumb pointer is perfectly acceptable.
Yes. Assuming you have C++0x available to you, use unique_ptr or shared_ptr (as appropriate) to wrap all the raw pointers you new up. With the help of make_shared, shared_ptr is highly performant. If you don't need reference counting then unique_ptr will get you better perf. Both of them behave properly in collections and other circumstances where auto_ptr was a dumb pointer.
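The make_shared point in a nutshell, as a sketch:

#include <memory>

struct Foo { int x = 0; };

void alloc_demo() {
    std::shared_ptr<Foo> p1(new Foo);   // two allocations: object + control block
    auto p2 = std::make_shared<Foo>();  // one combined allocation, better locality
}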
Using smart pointers (shared_ptr or otherwise) EVERYWHERE is a bad idea. It's good to use shared_ptr to manage the lifetime of objects/resources, but it's not a good idea to pass them as parameters to functions etc. That increases the likelihood of circular references and other extremely hard-to-track bugs. (Personal experience: try figuring out who should not be holding onto a resource in 2 million lines of code when every function invocation changes the reference count; you will end up thinking the guys who do this kind of thing are m***ns.) Better to pass a raw pointer or a reference.
The situation is even worse when combined with lazy instantiation.
I would suggest that developers should know the lifecycle of the objects they write and use shared_ptr to control that (RAII) but not extend shared_ptr use beyond that.

Should I use smart pointers on everything and forget about classic normal pointers?

I've been using C++ for a long time and know very well the care needed in allocating and deallocating memory, especially not forgetting to delete unused instances.
Now, I've recently started using boost, and for a problem I am facing I'm forced to use smart pointers (specifically shared_ptr). So, if I am going to use shared_ptr for this problem, should I apply smart pointers across all of my normal-pointer codebase?
You should use smart pointers carefully. They are not a silver bullet for memory management; circular references are still an issue.
When making the class design, always think who has the ownership of an object (has the responsibility to destroy that object). Complement that with smart pointers, if necessary, but don't forget about the ownership.
Yes, you should greatly prefer smart pointers over bare pointers for almost everything. But that does not mean you should be using ::boost::shared_ptr for most of those cases. In fact I think you should use shared_ptr sparingly and carefully.
But for those pointers you don't use shared_ptr for, you should be using ::std::auto_ptr or, if you have C++0x, ::std::unique_ptr. And if those aren't appropriate, you should find a smart pointer type that is.
Now this isn't always the right answer, just almost always. Smart pointers are invaluable for tracking and freeing memory resources appropriately even in the face of exceptions or other such oddities of control flow.
There are cases in which you will need to either write your own smart pointer class or not use one. These cases are very rare, but they exist.
For example, using a smart pointer inside your own smart pointer class is probably not the right thing to do. Pointing at something allocated by a C library would probably require a custom smart pointer that calls free instead of delete. Maybe you want a stretchy buffer you can call realloc on and don't want to use a ::std::vector for some reason.
But generally, a smart pointer (though not usually ::boost::shared_ptr (or in C++0x ::std::shared_ptr)) is the right answer to your resource management problem.
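Since C++11, the "calls free instead of delete" case usually doesn't even need a hand-written class: unique_ptr takes a custom deleter. A sketch with C stdio (hypothetical file name):

#include <cstdio>
#include <memory>

struct FileCloser {
    void operator()(std::FILE* f) const { if (f) std::fclose(f); }
};

using FilePtr = std::unique_ptr<std::FILE, FileCloser>;

void read_config() {
    FilePtr f(std::fopen("config.txt", "r"));
    if (!f) return;
    // ... use f.get() with the C stdio API ...
}   // fclose runs automatically, even on early return or exception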
Yes. You should be using the Boost smart pointers for most everything if you have them available. Remember: while you can (and, judging from your comments, do) use pointers effectively, the people who come after you may not. Using the smart pointers can and will (hopefully; you can't guard against bad code) mitigate some of these issues.
I'd also recommend using scoped_ptr inside your classes even if your classes are designed well. It'll help guard against issues the next guy could/might/most likely will introduce when faced with a naked pointer. It'll encourage them to use them too, by example. The last thing you want is to have to track down a memory issue because someone forgot to initialize the pointer to NULL and it's passing the if statement check.
No, you should not use smart pointers for everything.
What you should consider whenever typing new:
What is the lifetime of this object?
Which object owns it?
How is that object going to manage its lifetime?
Sometimes the solution is a smart pointer, but it is also sometimes the lazy answer: "I don't want to be bothered to work out which object owns this pointer and what its lifetime should be, hence I'll make it a shared_ptr!"
The important question is: what is managing the lifetime of this object? The answer can be a smart pointer, but oftentimes it doesn't need to be.
No. It depends on what you're doing.
Smart pointers have a performance overhead. In desktop applications, this is typically not a concern, but depending on what you do, it might be.
Smart pointers will not work right if you have reference cycles, i.e. A pointing to B and B pointing to A, or even something pointing to itself.

Once you've adopted boost's smart pointers, is there any case where you use raw pointers?

I'm curious: as I begin to adopt more of the boost idioms, and what appear to be best practices, I wonder at what point my C++ will even remotely resemble the C++ of yesteryear, the kind often found in typical examples and in the minds of those who've not been introduced to "Modern C++".
I don't use shared_ptr almost at all, because I avoid shared ownership in general. Therefore, I use something like boost::scoped_ptr to "own" an object, but all other references to it will be raw pointers. Example:
boost::scoped_ptr<SomeType> my_object(new SomeType);
some_function(my_object.get());
But some_function will deal with a raw pointer:
void some_function(SomeType* some_obj)
{
assert (some_obj);
some_obj->whatever();
}
Just a few off the top of my head:
Navigating around in memory-mapped files.
Windows API calls where you have to over-allocate (like a LPBITMAPINFOHEADER).
Any code where you're munging around in arbitrary memory (VirtualQuery() and the like).
Just about any time you're using reinterpret_cast<> on a pointer.
Any time you use placement-new.
The common thread here is "any situation in which you need to treat a piece of memory as something other than a resource over which you have allocation control".
These days I've pretty much abandoned all use of raw pointers. I've even started looking through our code base for places where raw pointers were used and switched them to a smart pointer variant. It's amazing how much code I've been able to delete by doing this simple act. There is so much code wasted on lifetime management of raw C++ pointers.
The only places where I don't use smart pointers are a couple of interop scenarios with other code bases I don't have control over.
I find the primary difference between 'modern' C++ and the old* stuff is careful use of class invariants and encapsulation. Well organised code tends naturally to have fewer pointers flying around. I'm almost as nervous swimming in shared_ptrs as I would be in news and deletes.
I'm looking forward to unique_ptr in C++0x. I think that will tidy away the few (smart) pointers that do still roam the wild.
*still unfortunately very common
Certainly any time you're dealing with a legacy library or API you'll need to pass a raw pointer, although you'll probably just extract it from your smart pointer temporarily.
In fact it is always safe to pass a raw pointer to a function, as long as the function does not try to keep a copy of the pointer in a global or member variable, or try to delete it. With these restrictions in place, the function cannot affect the lifetime of the object, and the only reason for a smart pointer is to manage the object lifetime.
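A sketch of such a call, with a hypothetical legacy_process stand-in that neither stores nor deletes the pointer:

#include <memory>

// stand-in for the real library function; it only reads the buffer
extern "C" void legacy_process(int* data, int n) { (void)data; (void)n; }

void call_legacy() {
    std::unique_ptr<int[]> buf(new int[16]());
    legacy_process(buf.get(), 16);   // borrow the raw pointer for the call only
}                                    // ownership stays with buf; delete[] runs here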
I still use regular pointers in resource-sensitive code or other code that needs a tiny footprint, such as certain exceptions, where I cannot assume that any data is valid and must also assume that I am running out of memory.
Managed memory is almost always superior to raw otherwise, because it means that you don't have to deal with deleting it at the right place, but still have great control over the construction and destruction points of your pointers.
Oh, and there's one other place to use raw pointers:
boost::shared_ptr<int> ptr(new int);
I still use raw pointers on devices that have memory mapped IO, such as embedded systems, where having a smart pointer doesn't really make sense because you will never need or be able to delete it.
If you have circular data structures, e.g. A points to B and B points back to A, you can't naively use smart pointers for both A and B, since then the objects will not be freed without extra work. To free the memory, you have to manually clear the smart pointers, which is about as bad as the deletes the smart pointers were supposed to get rid of.
You might think this doesn't happen very often, but suppose you have a Parent object that has smart pointers to a bunch of Child objects. Somewhere along the way someone needs to look up the Parent for a Child, so they add a smart-pointer member to Child that points back to the parent. Silently, memory is no longer freed.
Some care is required. Smart pointers are not equivalent to garbage collection.
I'm writing C++ that has to co-exist with Objective C (using Objective C++ to bridge).
Because C++ objects declared as members of Objective-C++ classes don't have constructors or destructors called, you can't really hold them there in smart pointers.
So I tend to use raw pointers, although often with boost::intrusive_ptr and an internal ref count.
Not that I would do it, but you need raw pointers to implement, say, a linked list or a graph. But it would be much smarter to use std::list<> or boost::graph<>.
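For what it's worth, even a hand-rolled list can keep ownership in smart pointers and reserve raw pointers for the non-owning links; a sketch:

#include <memory>

struct Node {
    int value;
    std::unique_ptr<Node> next;   // each node owns its successor
};

struct List {
    std::unique_ptr<Node> head;
    Node* tail = nullptr;         // non-owning convenience pointer

    void push_back(int v) {
        auto n = std::unique_ptr<Node>(new Node{v, nullptr});
        Node* raw = n.get();
        if (tail) tail->next = std::move(n);
        else      head = std::move(n);
        tail = raw;
    }
};  // caveat: destruction recurses down next; very long lists need an iterative clear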