Raw Pointer Management in C++

Raw Pointer Management in C++ - c++

I have a piece of performance critical code. The class and objects are fairly large, hence, it would be stored as pointers in the STL container. Problems arise when the pointers to objects need to be stored in multiple different containers based on some logic. It is very messy in handling the ownership of the object as I couldn't isolate the ownership of the object to a single containers(which I could just delete from the single container). Other than using smart pointer (since it is performance critical and smart pointer might affects the performance), what could I do?
Thanks.

You are asking for impossible - from one point, you are asking for exceptional performance, such that smart pointers you claim cannot provide, and you also happen to ask for safety and tidiness. Well, one comes at the cost of another, actually. You could, of course try to write your own shared pointer, which would be more lightweight than boost's, but still provide the basic functionality. Incidentally, have you actually tried boost::shared_ptr? Did it actually slow down the performance?

Your question is very awkward: you ask for performance with a messy logic ?
shared_ptr have incredible performance, really, and though you can get better it is probably your best choice: it works.
You could look up another Boost smart pointer though: boost::intrusive_ptr.
This is done at the cost of foregoing: weak_ptr and in exchange allows to have a single memory allocation for both the counter and the object. Packing up the two yields a SMALL increase in performance.
If you do not have cyclic reference, try and check it out, it might be what you were looking for.

Intrusive smart pointers are typically more efficient than normal smart pointers, yet easier than dumb pointers. For instance, check out boost::intrusive_ptr<T>

If the objects don't refer to one another, manual reference counting might be worth trying. There would be more cost when adding to or removing from a list, but no overhead to the actual object access. (This can be painful to diagnose when you get it wrong, so I recommend being careful to get it right.)
If there is dead time between operations, consider a sort of garbage collect. Maintain the list of all objects (an intrusive list would probably do). When you have time to spare, cross-reference this against the other lists; any objects that aren't in a list can get deleted. You don't need an extra array to do this (just a global counter and a last-seen counter on each object) so it could be reasonably efficient.
Another option would be to use a smart pointer that provides access to the underlying pointer. If you're trying to avoid the overhead of calling an overloaded operator-> a lot then this might be worth trying. Store the smart pointers in the lists (this does the lifetime management for you) and then when running through the objects you can retrieve the raw pointer for each and operate using that (so you don't incur the overhead of any overloaded operator-> etc.). For example:
std::vector<smart_ptr<T> > objects;
if(!objects.empty()) {
smart_ptr<T> *objects_raw=&objects[0];
for(size_t n=objects.size(),i=0;i<n;++i) {
T *object=objects_raw[i].get_ptr();
// do stuff
}
}
This is the sort of approach I prefer personally. Long-term storage gets a smart pointer, short-term storage gets a plain pointer. The object lifetimes are easy to manage, and you don't fall foul of 1,000,000 tiny overheads (more important for keeping the debug build runnable than it is for the release build, but it can easily add to wasted time nonetheless).

Related

Realloc creates dangling pointers?

I wrote some array code that allocates memory and each value in the array is a type. I then have another array that consists of in to the first array for references.
Both arrays can grow. It uses realloc. Because the 2nd array contains pointers in to the first, they are surely not updated when the first array changes(I don't do it manually and there is no GC). Surely all the pointers in the 2nd array are invalid! (they point to memory that was free'ed by realloc).
This is the case right?
This seems like it would make persistent pointers to blocks of memory that may move very dangerous?
What is the standard solution? Don't use "global" pointers? Using pointers to pointers to pointers? I think I could make the 2nd array use **'s and could probably get things to work.
In a MT environment, things are even worse. Local pointers access may be moved in the middle, then the memory changed, and the local pointer is now wrong. (Which, of course might be solved by preventing the moves by lock, etc...)
Go with functional programming?

Yes, the realloc can invalidate your references. If there is no continuous space for relocating your array will be moved.
Consider using a container as the std::deque.

1) This is the case right?
Yes.
2) This seems like it would make persistent pointers to blocks of memory that may move very dangerous?
Yes.
3) What is the standard solution?
You design your application such that the life-times of your objects is well defined so that you do not refer to them after they are no longer required.
4) In a MT environment, things are even worse. Local pointers access may be moved in the middle, then the memory changed, and the local pointer is now wrong. (Which, of course might be solved by preventing the moves by lock, etc...)
Obviously you should never use a pointer that no-longer pointers to its resource. Managing shared resources in a MT environment is non trivial and there are a whole bunch of tools and techniques to achieve it.
5) Go with functional programming?
It is always advisable to avoid pointers if you can.
Without a specific problem it is hard to give a specific solution. But in order to achieve "not pointing at disappeared resources" we have various tools to employ. We have smart pointers, we have containers and we have value semantics. We need to understand how to use all of those but also we need to design with object lifetime in mind as a major consideration.
Object life-time should always be an important factor. However some languages (like Java for instance) mitigate against bad-design by providing a "safer" environment. C++, on the other hand, is rather less forgiving. However it does have a whole bunch of sophisticated tools for the task. That means a steeper learning curve but more efficiency and better control.

Storing boost_shared pointers in a vector - Is it expensive

I know that vectors tend to make a copy of all objects pushed into them. My question is whether it would make sense to store a pointer to a boost::shared_ptr in a vector rather than the shared ptr itself

The only 100% correct answer to any performance questions is "profile and see."
However, in this particular case, you should just keep the shared_ptrs in there directly (by value). The overhead of copying a shared pointer is low. And if your compiler supports moves, it will even be moved instead - although then you'd probably be using std::shared_ptr, right?
How would you do it to store just pointers in there, anyway? Allocate the shared_ptrs dynamically? How would you control their own ownership?

shared_ptr are meant to manage the life-time of raw pointers. Storing and accessing them thru containers is hardly an overhead. If you store a pointer to shared_ptr into a container then you would defeat the whole purpose of shared_ptr.

Why would you ask about "expensive"? Your first concern apart from correctness should be "is it safe". Your second concern should be "is it readable and maintainable". Cost only is in the top 5 of your concerns if you know (by profiling) that you are working in a performance sensitive area. And even then there's "safety first" in almost every case.
So:
don't use pointers to shared_ptr, because it's hard to get that safe and correct.
don't use pointers to shared_ptr, because it's brittle, hard to descipher and makes everyone wonder why you did that (maintainability)
don't use pointers to shared_ptr unless you really know that copying the shared_ptrs is a performance issue, and you know by profiling and you don't have other, safer, more maintainable and equally performant solutions
tl;dr don't use pointers to shared_ptr.

Usage of Smart Pointers as a Programming Standard?

More and more I hear, that I should use smart pointers instead of naked pointers, despite I have effective memory leak system implemented.
What is the correct programming approach on using smart pointers please? Should they really be used, even if I check memory leaks on allocated memory blocks? Is it still up to me? If I do not use them, can this be considered as programming weakness?
If the smart pointers(ex: std::auto_ptr) are strongly recommended, should I use them instead of every naked pointer?

You should use RAII to handle all resource allocations.
Smart pointers are just one common special case of that rule.
And smart pointers are more than just shared_ptr. There are different smart pointers with different ownership semantics. Use the one that suits your needs. (The main ones are scoped_ptr, shared_ptr, weak_ptr and auto_ptr/unique_ptr (prefer the latter where available). Depending on your compiler, they may be available in the standard library, as part of TR1, or not at all, in which case you can get them through the Boost libraries.
And yes, you should absolutely use these. It costs you nothing (if done correctly, you lose zero performance), and it gains you a lot (memory and other resources are automatically freed, and you don't have to remember to handle it manually, and your code using the resource gets shorter and more concise)
Note that not every pointer usage represents some kind of resource ownership, and so not all raw pointer usage is wrong. If you simply need to point to an object owned by someone else, a raw pointer is perfectly suitable. But if you own the object, then you should take proper ownership of it, either by giving the class itself RAII semantics, or by wrapping it in a smart pointer.

You can't just blindly substitute std::auto_ptr for every raw pointer. In particular, auto_ptr transfers ownership on assignment, which is great for some purposes but definitely not for others.
There is a real reason there are several varieties of smart pointers (e.g., shared_ptr, weak_ptr, auto_ptr/unique_ptr, etc.) Each fulfills a different purpose. One major weakness of a "raw" pointer is that it has so many different uses (and has that versatility largely because it does little or nothing to assist in any one purpose). Smart pointers tend to be more specialized, which means they can be more intelligent about doing one thing well, but also means you have to pick the right one for the job or it'll end up dong the wrong things entirely.

Smart pointers allows to define automatically the life-time of objects it refers to. That's the main thing to understand.
So, no, you shouldn't use smart pointers everywhere, only when you want to automate life-time of your objects instead of having, for example, an object managing those objects inside from birth to death. It's like any tool : it solves specific kind of problems, not all problems.
For each object, you should think about the life cycle it will go through, then choose one of the simplest correct and efficient solution. Sometimes it will be shared_ptr because you want the object to be used by several components and to be automatically destroyed once not used anymore. Sometimes you need the object only in the current scope/parent-object, so scoped_ptr might be more appropriate. Sometimes you need only one owner of the instance, so unique_ptr is appropriate. Maybe you'll find cases where you know an algorithm that might define/automate the lifetime of an object, so you'll write your own smart pointer for it.
For example of opposite case, using pools forbids you to use smart_ptr. Naked pointers might be a more welcome simple and efficient solution in this particular (but common in embedded software) case.
See this answer (from me) for more explainations : https://softwareengineering.stackexchange.com/questions/57581/in-c-is-it-a-reflection-of-poor-software-design-if-objects-are-deleted-manuall/57611#57611

Should they really be used, even if I check memory leaks on allocated memory blocks?
YES
The whole purpose of smart pointers is, it help you implement RAII(SBRM), which basically lets the resource itself take the responsibility of its deallocation and the resource doesn't have to rely on you explicitly remembering to deallocate it.
If I do not use them, can this be considered as programming weakness?
NO,
It is not a weakness but a inconvenience or unnecessary hassle to explicitly manage the resources by yourself if you are not using Smart pointers(RAII). The purpose of smart pointers to implement RAII is to provide efficient and hassle free way of handling resources and you would just not be making use of it if you are not using it. It is highly recommended to use it purely for the numerous advantages it provides.
If the smart pointers(ex: std::auto_ptr)are strongly recommended, should I use them instead of every naked pointer?
YES
You should use smart pointers wherever possible because simply there is no drawback of using them and just numerous advantages to use them.
Don't use auto_ptr though because it is already deprecated!! There are various other smart pointers available that you can use depending on the requirement. You can refer the link above to know more about them.

It's a tricky question, and the fact that there is currently a mode to
use smart pointers everywhere doesn't make things any easier. Smart
pointers can help in certain situations, but you certainly can't just
use them everywhere, without thinking. There are many different types
of smart pointers, and you have to think about which one is appropriate
in every case; and even then, most of your pointers (at least in typical
applications in the domains I've worked in) should be raw pointers.
Regardless of the approach, several points are worth mentionning:
Don't use dynamic allocation unless you have to. In many
applications, the only things that need to be allocated dynamically
are objects with specific lifetimes, determined by the application
logic. Don't use dynamic allocation for objects with value semantics.
With regards to entity object, those which model something in the
application domain: these should be created and destructed according
to the program logic. Irregardless of whether there are pointers to
them or not. If their destruction causes a problem, then you have an
error in your program logic somewhere (not handling an event correctly,
etc.), and using smart pointers won't change anything.
A typical example of an entity object might be client connection in a
server, is created when the client connects, and destructed when the
client disconnects. In many such cases, the most appropriate management
will be a delete this, since it is the connection which will receive
the disconnection event. (Objects which hold pointers to such an object
will have to register with it, in order to be informed of its
destruction. But such pointers are purely for navigation, and shouldn't
be smart pointers.)
What you'll usually find when people try to use smart pointers
everywhere is that memory leaks; typical reference counters don't
handle cycles, and of course, typical applications are full of cycles: a
Connection will point to the Client which is connected to it, and
the Client will contain a list of Connection where it is connected.
And if the smart pointer is boost::shared_ptr, there's also a definite
risk of dangling pointers: it's far to easy to create two
boost::shared_ptr to the same address (which results in two counters
for the references).

If the smart pointers(ex: std::auto_ptr) are strongly recommended, should I use them instead of every naked pointer?
In my opinion, yes, you should it for every pointer that you own.
Here are my ideas on resource management in C++ (feel free to disagree):
Good resource management requires thinking in terms of ownership.
Resources should be managed managed by objects (RAII).
Usually single ownership is preferred over shared ownership.
Ideally the creator is also the owner of the object. (However, there are situations where ownership transfer is in order.)
This leads to the following practices:
Make boost::scoped_ptr the default choice for local and member variables. Do keep in mind that using scoped_ptr for member variables will make your class non-copyable. If you don't want this see next point.
Use boost::shared_ptr for containers or to enable shared ownership:
// Container of MyClass* pointers:
typedef boost::shared_ptr<MyClass> MyClassPtr;
std::vector<MyClassPtr> vec;
The std::auto_ptr (C++03) can be used for ownership transfer. For example as the return value of factory or clone methods:
// Factory method returns auto_ptr
std::auto_ptr<Button> button = Button::Create(...);
// Clone method returns auto_ptr
std::auto_ptr<MyClass> copy = obj->clone();
// Use release() to transfer the ownership to a scoped_ptr or shared_ptr
boost::scoped_ptr<MyClass> copy(obj->clone().release());
If you need to store a pointer that you don't own then you can use a raw pointer:
this->parent = inParentObject;
In certain situations a boost::weak_pointer is required. See the documentation for more information.

In general you should prefer smart pointers, but there are a couple of exceptions.
If you need to recast a pointer, for example to provide a const version, that becomes nearly impossible with smart pointers.
Smart pointers are used to control object lifetime. Often when you are passing a pointer to a function, the function will not affect the lifetime; the function does not try to delete the object, and it does not store a copy of the pointer. The calling code cannot delete the object until the function returns. In that case a dumb pointer is perfectly acceptable.

Yes. Assuming you have C++0x available to you, use unique_ptr or shared_ptr (as appropriate) to wrap all the raw pointers you new up. With the help of make_shared, shared_ptr is highly performant. If you don't need reference counting then unique_ptr will get you better perf. Both of them behave properly in collections and other circumstances where auto_ptr was a dumb pointer.

Using smart pointers (shared_ptr or otherwise) EVERYWHERE is a bad idea. It's good to use shared_ptr to manage the lifetime of objects/resources but it's not a good idea to pass them as parameters to functions etc. That increases the likelihood of circular references and other extremely hard to track bugs (Personal experience: Try figuring out who should not be holding onto a resource in 2 millions lines of code if every function invocation changes the reference count - you will end up thinking the guys who do this kind of thing are m***ns). Better to pass a raw pointer or a reference.
The situation is even worse when combined with lazy instantiation.
I would suggest that developers should know the lifecycle of the objects they write and use shared_ptr to control that (RAII) but not extend shared_ptr use beyond that.

Getting started with smart pointers in C++

I have a C++ application which makes extensively use of pointers to maintain quite complex data structures. The application performs mathematical simulations on huge data sets (which could take several GB of memory), and is compiled using Microsoft's Visual Studio 2010.
I am now reworking an important part of the application. To reduce errors (dangling pointers, memory leaks, ...) I would want to start using smart pointers. Sacrificing memory or performance is acceptible as long as it is limited.
In practice most of the classes are maintained in big pools (one pool per class) and although the classes can refer to each other, you could consider the pool as owner of all the instances of that class. However, if the pool decides to delete an instance, I don't want any of the other classes that still refers to the deleted instance to have a dangling pointer.
In another part I keep a collection of pointers to instances that are delivered by other modules in the application. In practice the other modules maintain ownership of the passed instance, but in some cases, modules don't want to take care of the ownership and just want to pass the instance to the collection, telling it "it's yours now, manage it".
What is the best way to start introducing smart pointers? Just replacing pointers [at random] with smart pointers doesn't seem a correct way, and probably doesn't deliver all the (or any of the) advantages of smart pointers. But what is a better method?
Which types of smart pointers should I further investigate? I sometimes use std::auto_ptr for the deallocation of locally allocated memory, but this seems to be deprected in C++0x. Is std::unique_ptr a better alternative? Or should I go straight to shared pointers or other types of smart pointers?
The question Replacing existing raw pointers with smart pointers seems similar but instead of asking how easy it is, I am asking what the best approach would be, and which kind of smart pointers are suited best.
Thanks in advance for your ideas and suggestions.

I recommend using unique_ptr when possible (this may require some program analysis) and shared_ptr when this is impossible. When in doubt, use a shared_ptr to maximize safety: when handing off control to a container, the reference count will simply go to two and then back to one and the container will eventually delete the associated object automatically. When performance becomes an issue, consider using boost::intrusive_ptr.

Here are the 3 varieties found in the new C++11 standard (unique_ptr replaces auto_ptr)
http://www.stroustrup.com/C++11FAQ.html#std-unique_ptr
http://www.stroustrup.com/C++11FAQ.html#std-shared_ptr
http://www.stroustrup.com/C++11FAQ.html#std-weak_ptr
You can read the text for each pointer and there is an explanation of when to use which in there. For local memory management unique_ptr is the choice. It is non-copyable but movable so as you move it around the receiver takes ownership of it.
Shared_ptr is used if you want to share an object instance around with no one really owning the object and to make sure it doesn't get deleted while someone still has a reference to it. Once the last user of an object destroys the shared_ptr container, the contained object will be deleted.
weak_ptr is used in conjunction with shared_ptr. It enables one to "lock" to see if the reference shared_ptr object still exists before trying to access the internal object.

Are there any reasons not to use Boost::shared_ptrs?

I've asked a couple questions (here and here) about memory management, and invariably someone suggests that I use boost::shared_ptrs.
Given how useful they seem to be, I'm seriously considering switching over my entire application to use boost::shared_ptrs.
However, before I jump in with both feet and do this, I wanted to ask -- Has anyone had any bad experiences with boost::shared_ptrs? Is there some pitfall to using them that I need to watch out for?
Right now, they seem almost too good to be true - taking care of most of my garbage collection concerns automatically. What's the downside?

The downside is they're not free. You especially shouldn't use shared_ptr/shared_array when scoped_ptr/scoped_array (or plain old stack allocation) will do. You'll need to manually break cycles with weak_ptr if you have any. The vector question you link to is one case where I might reach for a shared_ptr, the second question I would not. Not copying is a premature optimization, especially if the string class does it for you already. If the string class is reference counted, it will also be able to implement COW properly, which can't really be done with the shared_ptr<string> approach. Using shared_ptr willy-nilly will also introduce "interface friction" with external libraries/apis.

Boost shared pointers or any other technique of memory management in C++ is not a panacea. There is no substitution for careful coding. If you dive into using boost::shared_ptr be aware of object ownership and avoid circular references. You are going to need to explicitly break cycles or use boost::weak_ptr where necessary.
Also be careful to always use boost::shared_ptr for an instance from the moment it is allocated. That way you are sure you won't have dangling references. One way you can ensure that is to use factory methods that return your newly created object in a shared_ptr.
typedef boost::shared_ptr<Widget> WidgetPtr;
WidgetPtr myWidget = Widget::Create();

I use shared_ptr's often.
Since Shared_ptr's are copied by-value, you can incur the cost of copying both the pointer value and a reference count, but if boost::intrusive_ptr is used, the reference count must be added to your class, and there is no additional overhead above that of using a raw pointer.
However, in my experience, more than 99% of the time, the overhead of copying boost::shared_ptr instances throughout your code is insignificant. Usually, as C. A. R. Hoare noted, premature optimization is pointless - most of the time other code will use significantly more time than the time to copy small objects. Your mileage may vary. If profiling show the copying is an issue, you can switch to intrusive pointers.
As already noted above, cycles must be broken by using a weak_ptr, or there will be a memory leak. This will happen with data structures such as some graphs, but if, for example, you are making a tree structure where the leaves never point backwards, you can just use shared_pointers for nodes of the tree without any issues.
Using shared_ptr's properly greatly simplifies code, makes it easier to read, and easier to maintain. In many cases using them is the right choice.
Of course, as already mentioned, in some cases, using scoped_ptr (or scoped_array) is the right choice. If the pointee isn't being shared, don't use shared pointers!
Finally, the most recent C++ standard provides the std::tr1::shared_ptr template, which is now on most platforms, although I don't think there is an intrusive pointer type for tr1 (or rather, there might be, but I have not heard of it myself).

Dynamic memory overhead (i.e., extra allocations) plus all the overhead associated with reference counted smart pointers.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js