Related
Is it allowed to delete this; if the delete-statement is the last statement that will be executed on that instance of the class? Of course I'm sure that the object represented by the this-pointer is newly-created.
I'm thinking about something like this:
void SomeModule::doStuff()
{
// in the controller, "this" object of SomeModule is the "current module"
// now, if I want to switch over to a new Module, eg:
controller->setWorkingModule(new OtherModule());
// since the new "OtherModule" object will take the lead,
// I want to get rid of this "SomeModule" object:
delete this;
}
Can I do this?
The C++ FAQ Lite has a entry specifically for this
https://isocpp.org/wiki/faq/freestore-mgmt#delete-this
I think this quote sums it up nicely
As long as you're careful, it's OK for an object to commit suicide (delete this).
Yes, delete this; has defined results, as long as (as you've noted) you assure the object was allocated dynamically, and (of course) never attempt to use the object after it's destroyed. Over the years, many questions have been asked about what the standard says specifically about delete this;, as opposed to deleting some other pointer. The answer to that is fairly short and simple: it doesn't say much of anything. It just says that delete's operand must be an expression that designates a pointer to an object, or an array of objects. It goes into quite a bit of detail about things like how it figures out what (if any) deallocation function to call to release the memory, but the entire section on delete (§[expr.delete]) doesn't mention delete this; specifically at all. The section on destructors does mention delete this in one place (§[class.dtor]/13):
At the point of definition of a virtual destructor (including an implicit definition (15.8)), the non-array deallocation function is determined as if for the expression delete this appearing in a non-virtual destructor of the destructor’s class (see 8.3.5).
That tends to support the idea that the standard considers delete this; to be valid -- if it was invalid, its type wouldn't be meaningful. That's the only place the standard mentions delete this; at all, as far as I know.
Anyway, some consider delete this a nasty hack, and tell anybody who will listen that it should be avoided. One commonly cited problem is the difficulty of ensuring that objects of the class are only ever allocated dynamically. Others consider it a perfectly reasonable idiom, and use it all the time. Personally, I'm somewhere in the middle: I rarely use it, but don't hesitate to do so when it seems to be the right tool for the job.
The primary time you use this technique is with an object that has a life that's almost entirely its own. One example James Kanze has cited was a billing/tracking system he worked on for a phone company. When you start to make a phone call, something takes note of that and creates a phone_call object. From that point onward, the phone_call object handles the details of the phone call (making a connection when you dial, adding an entry to the database to say when the call started, possibly connect more people if you do a conference call, etc.) When the last people on the call hang up, the phone_call object does its final book-keeping (e.g., adds an entry to the database to say when you hung up, so they can compute how long your call was) and then destroys itself. The lifetime of the phone_call object is based on when the first person starts the call and when the last people leave the call -- from the viewpoint of the rest of the system, it's basically entirely arbitrary, so you can't tie it to any lexical scope in the code, or anything on that order.
For anybody who might care about how dependable this kind of coding can be: if you make a phone call to, from, or through almost any part of Europe, there's a pretty good chance that it's being handled (at least in part) by code that does exactly this.
If it scares you, there's a perfectly legal hack:
void myclass::delete_me()
{
std::unique_ptr<myclass> bye_bye(this);
}
I think delete this is idiomatic C++ though, and I only present this as a curiosity.
There is a case where this construct is actually useful - you can delete the object after throwing an exception that needs member data from the object. The object remains valid until after the throw takes place.
void myclass::throw_error()
{
std::unique_ptr<myclass> bye_bye(this);
throw std::runtime_exception(this->error_msg);
}
Note: if you're using a compiler older than C++11 you can use std::auto_ptr instead of std::unique_ptr, it will do the same thing.
One of the reasons that C++ was designed was to make it easy to reuse code. In general, C++ should be written so that it works whether the class is instantiated on the heap, in an array, or on the stack. "Delete this" is a very bad coding practice because it will only work if a single instance is defined on the heap; and there had better not be another delete statement, which is typically used by most developers to clean up the heap. Doing this also assumes that no maintenance programmer in the future will cure a falsely perceived memory leak by adding a delete statement.
Even if you know in advance that your current plan is to only allocate a single instance on the heap, what if some happy-go-lucky developer comes along in the future and decides to create an instance on the stack? Or, what if he cuts and pastes certain portions of the class to a new class that he intends to use on the stack? When the code reaches "delete this" it will go off and delete it, but then when the object goes out of scope, it will call the destructor. The destructor will then try to delete it again and then you are hosed. In the past, doing something like this would screw up not only the program but the operating system and the computer would need to be rebooted. In any case, this is highly NOT recommended and should almost always be avoided. I would have to be desperate, seriously plastered, or really hate the company I worked for to write code that did this.
It is allowed (just do not use the object after that), but I wouldn't write such code on practice. I think that delete this should appear only in functions that called release or Release and looks like: void release() { ref--; if (ref<1) delete this; }.
Well, in Component Object Model (COM) delete this construction can be a part of Release method that is called whenever you want to release aquisited object:
void IMyInterface::Release()
{
--instanceCount;
if(instanceCount == 0)
delete this;
}
This is the core idiom for reference-counted objects.
Reference-counting is a strong form of deterministic garbage collection- it ensures objects manage their OWN lifetime instead of relying on 'smart' pointers, etc. to do it for them. The underlying object is only ever accessed via "Reference" smart pointers, designed so that the pointers increment and decrement a member integer (the reference count) in the actual object.
When the last reference drops off the stack or is deleted, the reference count will go to zero. Your object's default behavior will then be a call to "delete this" to garbage collect- the libraries I write provide a protected virtual "CountIsZero" call in the base class so that you can override this behavior for things like caching.
The key to making this safe is not allowing users access to the CONSTRUCTOR of the object in question (make it protected), but instead making them call some static member- the FACTORY- like "static Reference CreateT(...)". That way you KNOW for sure that they're always built with ordinary "new" and that no raw pointer is ever available, so "delete this" won't ever blow up.
You can do so. However, you can't assign to this. Thus the reason you state for doing this, "I want to change the view," seems very questionable. The better method, in my opinion, would be for the object that holds the view to replace that view.
Of course, you're using RAII objects and so you don't actually need to call delete at all...right?
This is an old, answered, question, but #Alexandre asked "Why would anyone want to do this?", and I thought that I might provide an example usage that I am considering this afternoon.
Legacy code. Uses naked pointers Obj*obj with a delete obj at the end.
Unfortunately I need sometimes, not often, to keep the object alive longer.
I am considering making it a reference counted smart pointer. But there would be lots of code to change, if I was to use ref_cnt_ptr<Obj> everywhere. And if you mix naked Obj* and ref_cnt_ptr, you can get the object implicitly deleted when the last ref_cnt_ptr goes away, even though there are Obj* still alive.
So I am thinking about creating an explicit_delete_ref_cnt_ptr. I.e. a reference counted pointer where the delete is only done in an explicit delete routine. Using it in the one place where the existing code knows the lifetime of the object, as well as in my new code that keeps the object alive longer.
Incrementing and decrementing the reference count as explicit_delete_ref_cnt_ptr get manipulated.
But NOT freeing when the reference count is seen to be zero in the explicit_delete_ref_cnt_ptr destructor.
Only freeing when the reference count is seen to be zero in an explicit delete-like operation. E.g. in something like:
template<typename T> class explicit_delete_ref_cnt_ptr {
private:
T* ptr;
int rc;
...
public:
void delete_if_rc0() {
if( this->ptr ) {
this->rc--;
if( this->rc == 0 ) {
delete this->ptr;
}
this->ptr = 0;
}
}
};
OK, something like that. It's a bit unusual to have a reference counted pointer type not automatically delete the object pointed to in the rc'ed ptr destructor. But it seems like this might make mixing naked pointers and rc'ed pointers a bit safer.
But so far no need for delete this.
But then it occurred to me: if the object pointed to, the pointee, knows that it is being reference counted, e.g. if the count is inside the object (or in some other table), then the routine delete_if_rc0 could be a method of the pointee object, not the (smart) pointer.
class Pointee {
private:
int rc;
...
public:
void delete_if_rc0() {
this->rc--;
if( this->rc == 0 ) {
delete this;
}
}
}
};
Actually, it doesn't need to be a member method at all, but could be a free function:
map<void*,int> keepalive_map;
template<typename T>
void delete_if_rc0(T*ptr) {
void* tptr = (void*)ptr;
if( keepalive_map[tptr] == 1 ) {
delete ptr;
}
};
(BTW, I know the code is not quite right - it becomes less readable if I add all the details, so I am leaving it like this.)
Delete this is legal as long as object is in heap.
You would need to require object to be heap only.
The only way to do that is to make the destructor protected - this way delete may be called ONLY from class , so you would need a method that would ensure deletion
Is it allowed to delete this; if the delete-statement is the last statement that will be executed on that instance of the class? Of course I'm sure that the object represented by the this-pointer is newly-created.
I'm thinking about something like this:
void SomeModule::doStuff()
{
// in the controller, "this" object of SomeModule is the "current module"
// now, if I want to switch over to a new Module, eg:
controller->setWorkingModule(new OtherModule());
// since the new "OtherModule" object will take the lead,
// I want to get rid of this "SomeModule" object:
delete this;
}
Can I do this?
The C++ FAQ Lite has a entry specifically for this
https://isocpp.org/wiki/faq/freestore-mgmt#delete-this
I think this quote sums it up nicely
As long as you're careful, it's OK for an object to commit suicide (delete this).
Yes, delete this; has defined results, as long as (as you've noted) you assure the object was allocated dynamically, and (of course) never attempt to use the object after it's destroyed. Over the years, many questions have been asked about what the standard says specifically about delete this;, as opposed to deleting some other pointer. The answer to that is fairly short and simple: it doesn't say much of anything. It just says that delete's operand must be an expression that designates a pointer to an object, or an array of objects. It goes into quite a bit of detail about things like how it figures out what (if any) deallocation function to call to release the memory, but the entire section on delete (§[expr.delete]) doesn't mention delete this; specifically at all. The section on destructors does mention delete this in one place (§[class.dtor]/13):
At the point of definition of a virtual destructor (including an implicit definition (15.8)), the non-array deallocation function is determined as if for the expression delete this appearing in a non-virtual destructor of the destructor’s class (see 8.3.5).
That tends to support the idea that the standard considers delete this; to be valid -- if it was invalid, its type wouldn't be meaningful. That's the only place the standard mentions delete this; at all, as far as I know.
Anyway, some consider delete this a nasty hack, and tell anybody who will listen that it should be avoided. One commonly cited problem is the difficulty of ensuring that objects of the class are only ever allocated dynamically. Others consider it a perfectly reasonable idiom, and use it all the time. Personally, I'm somewhere in the middle: I rarely use it, but don't hesitate to do so when it seems to be the right tool for the job.
The primary time you use this technique is with an object that has a life that's almost entirely its own. One example James Kanze has cited was a billing/tracking system he worked on for a phone company. When you start to make a phone call, something takes note of that and creates a phone_call object. From that point onward, the phone_call object handles the details of the phone call (making a connection when you dial, adding an entry to the database to say when the call started, possibly connect more people if you do a conference call, etc.) When the last people on the call hang up, the phone_call object does its final book-keeping (e.g., adds an entry to the database to say when you hung up, so they can compute how long your call was) and then destroys itself. The lifetime of the phone_call object is based on when the first person starts the call and when the last people leave the call -- from the viewpoint of the rest of the system, it's basically entirely arbitrary, so you can't tie it to any lexical scope in the code, or anything on that order.
For anybody who might care about how dependable this kind of coding can be: if you make a phone call to, from, or through almost any part of Europe, there's a pretty good chance that it's being handled (at least in part) by code that does exactly this.
If it scares you, there's a perfectly legal hack:
void myclass::delete_me()
{
std::unique_ptr<myclass> bye_bye(this);
}
I think delete this is idiomatic C++ though, and I only present this as a curiosity.
There is a case where this construct is actually useful - you can delete the object after throwing an exception that needs member data from the object. The object remains valid until after the throw takes place.
void myclass::throw_error()
{
std::unique_ptr<myclass> bye_bye(this);
throw std::runtime_exception(this->error_msg);
}
Note: if you're using a compiler older than C++11 you can use std::auto_ptr instead of std::unique_ptr, it will do the same thing.
One of the reasons that C++ was designed was to make it easy to reuse code. In general, C++ should be written so that it works whether the class is instantiated on the heap, in an array, or on the stack. "Delete this" is a very bad coding practice because it will only work if a single instance is defined on the heap; and there had better not be another delete statement, which is typically used by most developers to clean up the heap. Doing this also assumes that no maintenance programmer in the future will cure a falsely perceived memory leak by adding a delete statement.
Even if you know in advance that your current plan is to only allocate a single instance on the heap, what if some happy-go-lucky developer comes along in the future and decides to create an instance on the stack? Or, what if he cuts and pastes certain portions of the class to a new class that he intends to use on the stack? When the code reaches "delete this" it will go off and delete it, but then when the object goes out of scope, it will call the destructor. The destructor will then try to delete it again and then you are hosed. In the past, doing something like this would screw up not only the program but the operating system and the computer would need to be rebooted. In any case, this is highly NOT recommended and should almost always be avoided. I would have to be desperate, seriously plastered, or really hate the company I worked for to write code that did this.
It is allowed (just do not use the object after that), but I wouldn't write such code on practice. I think that delete this should appear only in functions that called release or Release and looks like: void release() { ref--; if (ref<1) delete this; }.
Well, in Component Object Model (COM) delete this construction can be a part of Release method that is called whenever you want to release aquisited object:
void IMyInterface::Release()
{
--instanceCount;
if(instanceCount == 0)
delete this;
}
This is the core idiom for reference-counted objects.
Reference-counting is a strong form of deterministic garbage collection- it ensures objects manage their OWN lifetime instead of relying on 'smart' pointers, etc. to do it for them. The underlying object is only ever accessed via "Reference" smart pointers, designed so that the pointers increment and decrement a member integer (the reference count) in the actual object.
When the last reference drops off the stack or is deleted, the reference count will go to zero. Your object's default behavior will then be a call to "delete this" to garbage collect- the libraries I write provide a protected virtual "CountIsZero" call in the base class so that you can override this behavior for things like caching.
The key to making this safe is not allowing users access to the CONSTRUCTOR of the object in question (make it protected), but instead making them call some static member- the FACTORY- like "static Reference CreateT(...)". That way you KNOW for sure that they're always built with ordinary "new" and that no raw pointer is ever available, so "delete this" won't ever blow up.
You can do so. However, you can't assign to this. Thus the reason you state for doing this, "I want to change the view," seems very questionable. The better method, in my opinion, would be for the object that holds the view to replace that view.
Of course, you're using RAII objects and so you don't actually need to call delete at all...right?
This is an old, answered, question, but #Alexandre asked "Why would anyone want to do this?", and I thought that I might provide an example usage that I am considering this afternoon.
Legacy code. Uses naked pointers Obj*obj with a delete obj at the end.
Unfortunately I need sometimes, not often, to keep the object alive longer.
I am considering making it a reference counted smart pointer. But there would be lots of code to change, if I was to use ref_cnt_ptr<Obj> everywhere. And if you mix naked Obj* and ref_cnt_ptr, you can get the object implicitly deleted when the last ref_cnt_ptr goes away, even though there are Obj* still alive.
So I am thinking about creating an explicit_delete_ref_cnt_ptr. I.e. a reference counted pointer where the delete is only done in an explicit delete routine. Using it in the one place where the existing code knows the lifetime of the object, as well as in my new code that keeps the object alive longer.
Incrementing and decrementing the reference count as explicit_delete_ref_cnt_ptr get manipulated.
But NOT freeing when the reference count is seen to be zero in the explicit_delete_ref_cnt_ptr destructor.
Only freeing when the reference count is seen to be zero in an explicit delete-like operation. E.g. in something like:
template<typename T> class explicit_delete_ref_cnt_ptr {
private:
T* ptr;
int rc;
...
public:
void delete_if_rc0() {
if( this->ptr ) {
this->rc--;
if( this->rc == 0 ) {
delete this->ptr;
}
this->ptr = 0;
}
}
};
OK, something like that. It's a bit unusual to have a reference counted pointer type not automatically delete the object pointed to in the rc'ed ptr destructor. But it seems like this might make mixing naked pointers and rc'ed pointers a bit safer.
But so far no need for delete this.
But then it occurred to me: if the object pointed to, the pointee, knows that it is being reference counted, e.g. if the count is inside the object (or in some other table), then the routine delete_if_rc0 could be a method of the pointee object, not the (smart) pointer.
class Pointee {
private:
int rc;
...
public:
void delete_if_rc0() {
this->rc--;
if( this->rc == 0 ) {
delete this;
}
}
}
};
Actually, it doesn't need to be a member method at all, but could be a free function:
map<void*,int> keepalive_map;
template<typename T>
void delete_if_rc0(T*ptr) {
void* tptr = (void*)ptr;
if( keepalive_map[tptr] == 1 ) {
delete ptr;
}
};
(BTW, I know the code is not quite right - it becomes less readable if I add all the details, so I am leaving it like this.)
Delete this is legal as long as object is in heap.
You would need to require object to be heap only.
The only way to do that is to make the destructor protected - this way delete may be called ONLY from class , so you would need a method that would ensure deletion
I recently came across some C++ code that looked like this:
class SomeObject
{
private:
// NOT a pointer
BigObject foobar;
public:
BigObject * getFoobar() const
{
return &foobar;
}
};
I asked the programmer why he didn't just make foobar a pointer, and he said that this way he didn't have to worry about allocating/deallocating memory. I asked if he considered using some smart pointer, he said this worked just as well.
Is this bad practice? It seems very hackish.
That's perfectly reasonable, and not "hackish" in any way; although it might be considered better to return a reference to indicate that the object definitely exists. A pointer might be null, and might lead some to think that they should delete it after use.
The object has to exist somewhere, and existing as a member of an object is usually as good as existing anywhere else. Adding an extra level of indirection by dynamically allocating it separately from the object that owns it makes the code less efficient, and adds the burden of making sure it's correctly deallocated.
Of course, the member function can't be const if it returns a non-const reference or pointer to a member. That's another advantage of making it a member: a const qualifier on SomeObject applies to its members too, but doesn't apply to any objects it merely has a pointer to.
The only danger is that the object might be destroyed while someone still has a pointer or reference to it; but that danger is still present however you manage it. Smart pointers can help here, if the object lifetimes are too complex to manage otherwise.
You are returning a pointer to a member variable not a reference. This is bad design.
Your class manages the lifetime of foobar object and by returning a pointer to its members you enable the consumers of your class to keep using the pointer beyond the lifetime of SomeObject object. And also it enables the users to change the state of SomeObject object as they wish.
Instead you should refactor your class to include the operations that would be done on the foobar in SomeObject class as methods.
ps. Consider naming your classes properly. When you define it is a class. When you instantiate, then you have an object of that class.
It's generally considered less than ideal to return pointers to internal data at all; it prevents the class from managing access to its own data. But if you want to do that anyway I see no great problem here; it simplifies the management of memory.
Is this bad practice? It seems very hackish.
It is. If the class goes out of scope before the pointer does, the member variable will no longer exist, yet a pointer to it still exists. Any attempt to dereference that pointer post class destruction will result in undefined behaviour - this could result in a crash, or it could result in hard to find bugs where arbitrary memory is read and treated as a BigObject.
if he considered using some smart pointer
Using smart pointers, specifically std::shared_ptr<T> or the boost version, would technically work here and avoid the potential crash (if you allocate via the shared pointer constructor) - however, it also confuses who owns that pointer - the class, or the caller? Furthermore, I'm not sure you can just add a pointer to an object to a smart pointer.
Both of these two points deal with the technical issue of getting a pointer out of a class, but the real question should be "why?" as in "why are you returning a pointer from a class?" There are cases where this is the only way, but more often than not you don't need to return a pointer. For example, suppose that variable needs to be passed to a C API which takes a pointer to that type. In this case, you would probably be better encapsulating that C call in the class.
As long as the caller knows that the pointer returned from getFoobar() becomes invalid when the SomeObject object destructs, it's fine. Such provisos and caveats are common in older C++ programs and frameworks.
Even current libraries have to do this for historical reasons. e.g. std::string::c_str, which returns a pointer to an internal buffer in the string, which becomes unusable when the string destructs.
Of course, that is difficult to ensure in a large or complex program. In modern C++ the preferred approach is to give everything simple "value semantics" as far as possible, so that every object's life time is controlled by the code that uses it in a trivial way. So there are no naked pointers, no explicit new or delete calls scattered around your code, etc., and so no need to require programmers to manually ensure they are following the rules.
(And then you can resort to smart pointers in cases where you are totally unable to avoid shared responsibility for object lifetimes.)
Two unrelated issues here:
1) How would you like your instance of SomeObject to manage the instance of BigObject that it needs? If each instance of SomeObject needs its own BigObject, then a BigObject data member is totally reasonable. There are situations where you'd want to do something different, but unless that situation arises stick with the simple solution.
2) Do you want to give users of SomeObject direct access to its BigObject? By default the answer here would be "no", on the basis of good encapsulation. But if you do want to, then that doesn't change the assessment of (1). Also if you do want to, you don't necessarily need to do so via a pointer -- it could be via a reference or even a public data member.
A third possible issue might arise that does change the assessment of (1):
3) Do you want to give users of SomeObject direct access to an instance of BigObject that they continue using beyond the lifetime of the instance of SomeObject that they got it from? If so then of course a data member is no good. The proper solution might be shared_ptr, or for SomeObject::getFooBar to be a factory that returns a different BigObject each time it's called.
In summary:
Other than the fact it doesn't compile (getFooBar() needs to return const BigObject*), there is no reason so far to suppose that this code is wrong. Other issues could arise that make it wrong.
It might be better style to return const & rather than const *. Which you return has no bearing on whether foobar should be a BigObject data member.
There is certainly no "just" about making foobar a pointer or a smart pointer -- either one would necessitate extra code to create an instance of BigObject to point to.
I am having my first attempt at using C++11 unique_ptr; I am replacing a polymorphic raw pointer inside a project of mine, which is owned by one class, but passed around quite frequently.
I used to have functions like:
bool func(BaseClass* ptr, int other_arg) {
bool val;
// plain ordinary function that does something...
return val;
}
But I soon realized that I wouldn't be able to switch to:
bool func(std::unique_ptr<BaseClass> ptr, int other_arg);
Because the caller would have to handle the pointer ownership to the function, what I don't want to. So, what is the best solution to my problem?
I though of passing the pointer as reference, like this:
bool func(const std::unique_ptr<BaseClass>& ptr, int other_arg);
But I feel very uncomfortable in doing so, firstly because it seems non instinctive to pass something already typed as _ptr as reference, what would be a reference of a reference. Secondly because the function signature gets even bigger. Thirdly, because in the generated code, it would be necessary two consecutive pointer indirections to reach my variable.
If you want the function to use the pointee, pass a reference to it. There's no reason to tie the function to work only with some kind of smart pointer:
bool func(BaseClass& base, int other_arg);
And at the call site use operator*:
func(*some_unique_ptr, 42);
Alternatively, if the base argument is allowed to be null, keep the signature as is, and use the get() member function:
bool func(BaseClass* base, int other_arg);
func(some_unique_ptr.get(), 42);
The advantage of using std::unique_ptr<T> (aside from not having to remember to call delete or delete[] explicitly) is that it guarantees that a pointer is either nullptr or it points to a valid instance of the (base) object. I will come back to this after I answer your question, but the first message is DO use smart pointers to manage the lifetime of dynamically allocated objects.
Now, your problem is actually how to use this with your old code.
My suggestion is that if you don't want to transfer or share ownership, you should always pass references to the object. Declare your function like this (with or without const qualifiers, as needed):
bool func(BaseClass& ref, int other_arg) { ... }
Then the caller, which has a std::shared_ptr<BaseClass> ptr will either handle the nullptr case or it will ask bool func(...) to compute the result:
if (ptr) {
result = func(*ptr, some_int);
} else {
/* the object was, for some reason, either not created or destroyed */
}
This means that any caller has to promise that the reference is valid and that it will continue to be valid throughout the execution of the function body.
Here is the reason why I strongly believe you should not pass raw pointers or references to smart pointers.
A raw pointer is only a memory address. Can have one of (at least) 4 meanings:
The address of a block of memory where your desired object is located. (the good)
The address 0x0 which you can be certain is not dereferencable and might have the semantics of "nothing" or "no object". (the bad)
The address of a block of memory which is outside of the addressable space of your process (dereferencing it will hopefully cause your program to crash). (the ugly)
The address of a block of memory which can be dereferenced but which doesn't contain what you expect. Maybe the pointer was accidentally modified and now it points to another writable address (of a completely other variable within your process). Writing to this memory location will cause lots of fun to happen, at times, during the execution, because the OS will not complain as long as you are allowed to write there. (Zoinks!)
Correctly using smart pointers alleviates the rather scary cases 3 and 4, which are usually not detectable at compile time and which you generally only experience at runtime when your program crashes or does unexpected things.
Passing smart pointers as arguments has two disadvantages: you cannot change the const-ness of the pointed object without making a copy (which adds overhead for shared_ptr and is not possible for unique_ptr), and you are still left with the second (nullptr) meaning.
I marked the second case as (the bad) from a design perspective. This is a more subtle argument about responsibility.
Imagine what it means when a function receives a nullptr as its parameter. It first has to decide what to do with it: use a "magical" value in place of the missing object? change behavior completely and compute something else (which doesn't require the object)? panic and throw an exception? Moreover, what happens when the function takes 2, or 3 or even more arguments by raw pointer? It has to check each of them and adapt its behavior accordingly. This adds a whole new level on top of input validation for no real reason.
The caller should be the one with enough contextual information to make these decisions, or, in other words, the bad is less frightening the more you know. The function, on the other hand, should just take the caller's promise that the memory it is pointed to is safe to work with as intended. (References are still memory addresses, but conceptually represent a promise of validity.)
I agree with Martinho, but I think it is important to point out the ownership semantics of a pass-by-reference. I think the correct solution is to use a simple pass-by-reference here:
bool func(BaseClass& base, int other_arg);
The commonly accepted meaning of a pass-by-reference in C++ is like as if the caller of the function tells the function "here, you can borrow this object, use it, and modify it (if not const), but only for the duration of the function body." This is, in no way, in conflict with the ownership rules of the unique_ptr because the object is merely being borrowed for a short period of time, there is no actual ownership transfer happening (if you lend your car to someone, do you sign the title over to him?).
So, even though it might seem bad (design-wise, coding practices, etc.) to pull the reference (or even the raw pointer) out of the unique_ptr, it actually is not because it is perfectly in accordance with the ownership rules set by the unique_ptr. And then, of course, there are other nice advantages, like clean syntax, no restriction to only objects owned by a unique_ptr, and so.
Personally, I avoid pulling a reference from a pointer/smart pointer. Because what happens if the pointer is nullptr? If you change the signature to this:
bool func(BaseClass& base, int other_arg);
You might have to protect your code from null pointer dereferences:
if (the_unique_ptr)
func(*the_unique_ptr, 10);
If the class is the sole owner of the pointer, the second of Martinho's alternative seems more reasonable:
func(the_unique_ptr.get(), 10);
Alternatively, you can use std::shared_ptr. However, if there's one single entity responsible for delete, the std::shared_ptr overhead does not pay off.
If I have a function that needs to work with a shared_ptr, wouldn't it be more efficient to pass it a reference to it (so to avoid copying the shared_ptr object)?
What are the possible bad side effects?
I envision two possible cases:
1) inside the function a copy is made of the argument, like in
ClassA::take_copy_of_sp(boost::shared_ptr<foo> &sp)
{
...
m_sp_member=sp; //This will copy the object, incrementing refcount
...
}
2) inside the function the argument is only used, like in
Class::only_work_with_sp(boost::shared_ptr<foo> &sp) //Again, no copy here
{
...
sp->do_something();
...
}
I can't see in both cases a good reason to pass the boost::shared_ptr<foo> by value instead of by reference. Passing by value would only "temporarily" increment the reference count due to the copying, and then decrement it when exiting the function scope.
Am I overlooking something?
Just to clarify, after reading several answers: I perfectly agree on the premature-optimization concerns, and I always try to first-profile-then-work-on-the-hotspots. My question was more from a purely technical code-point-of-view, if you know what I mean.
I found myself disagreeing with the highest-voted answer, so I went looking for expert opinons and here they are.
From http://channel9.msdn.com/Shows/Going+Deep/C-and-Beyond-2011-Scott-Andrei-and-Herb-Ask-Us-Anything
Herb Sutter: "when you pass shared_ptrs, copies are expensive"
Scott Meyers: "There's nothing special about shared_ptr when it comes to whether you pass it by value, or pass it by reference. Use exactly the same analysis you use for any other user defined type. People seem to have this perception that shared_ptr somehow solves all management problems, and that because it's small, it's necessarily inexpensive to pass by value. It has to be copied, and there is a cost associated with that... it's expensive to pass it by value, so if I can get away with it with proper semantics in my program, I'm gonna pass it by reference to const or reference instead"
Herb Sutter: "always pass them by reference to const, and very occasionally maybe because you know what you called might modify the thing you got a reference from, maybe then you might pass by value... if you copy them as parameters, oh my goodness you almost never need to bump that reference count because it's being held alive anyway, and you should be passing it by reference, so please do that"
Update: Herb has expanded on this here: http://herbsutter.com/2013/06/05/gotw-91-solution-smart-pointer-parameters/, although the moral of the story is that you shouldn't be passing shared_ptrs at all "unless you want to use or manipulate the smart pointer itself, such as to share or transfer ownership."
The point of a distinct shared_ptr instance is to guarantee (as far as possible) that as long as this shared_ptr is in scope, the object it points to will still exist, because its reference count will be at least 1.
Class::only_work_with_sp(boost::shared_ptr<foo> sp)
{
// sp points to an object that cannot be destroyed during this function
}
So by using a reference to a shared_ptr, you disable that guarantee. So in your second case:
Class::only_work_with_sp(boost::shared_ptr<foo> &sp) //Again, no copy here
{
...
sp->do_something();
...
}
How do you know that sp->do_something() will not blow up due to a null pointer?
It all depends what is in those '...' sections of the code. What if you call something during the first '...' that has the side-effect (somewhere in another part of the code) of clearing a shared_ptr to that same object? And what if it happens to be the only remaining distinct shared_ptr to that object? Bye bye object, just where you're about to try and use it.
So there are two ways to answer that question:
Examine the source of your entire program very carefully until you are sure the object won't die during the function body.
Change the parameter back to be a distinct object instead of a reference.
General bit of advice that applies here: don't bother making risky changes to your code for the sake of performance until you've timed your product in a realistic situation in a profiler and conclusively measured that the change you want to make will make a significant difference to performance.
Update for commenter JQ
Here's a contrived example. It's deliberately simple, so the mistake will be obvious. In real examples, the mistake is not so obvious because it is hidden in layers of real detail.
We have a function that will send a message somewhere. It may be a large message so rather than using a std::string that likely gets copied as it is passed around to multiple places, we use a shared_ptr to a string:
void send_message(std::shared_ptr<std::string> msg)
{
std::cout << (*msg.get()) << std::endl;
}
(We just "send" it to the console for this example).
Now we want to add a facility to remember the previous message. We want the following behaviour: a variable must exist that contains the most recently sent message, but while a message is currently being sent then there must be no previous message (the variable should be reset before sending). So we declare the new variable:
std::shared_ptr<std::string> previous_message;
Then we amend our function according to the rules we specified:
void send_message(std::shared_ptr<std::string> msg)
{
previous_message = 0;
std::cout << *msg << std::endl;
previous_message = msg;
}
So, before we start sending we discard the current previous message, and then after the send is complete we can store the new previous message. All good. Here's some test code:
send_message(std::shared_ptr<std::string>(new std::string("Hi")));
send_message(previous_message);
And as expected, this prints Hi! twice.
Now along comes Mr Maintainer, who looks at the code and thinks: Hey, that parameter to send_message is a shared_ptr:
void send_message(std::shared_ptr<std::string> msg)
Obviously that can be changed to:
void send_message(const std::shared_ptr<std::string> &msg)
Think of the performance enhancement this will bring! (Never mind that we're about to send a typically large message over some channel, so the performance enhancement will be so small as to be unmeasureable).
But the real problem is that now the test code will exhibit undefined behaviour (in Visual C++ 2010 debug builds, it crashes).
Mr Maintainer is surprised by this, but adds a defensive check to send_message in an attempt to stop the problem happening:
void send_message(const std::shared_ptr<std::string> &msg)
{
if (msg == 0)
return;
But of course it still goes ahead and crashes, because msg is never null when send_message is called.
As I say, with all the code so close together in a trivial example, it's easy to find the mistake. But in real programs, with more complex relationships between mutable objects that hold pointers to each other, it is easy to make the mistake, and hard to construct the necessary test cases to detect the mistake.
The easy solution, where you want a function to be able to rely on a shared_ptr continuing to be non-null throughout, is for the function to allocate its own true shared_ptr, rather than relying on a reference to an existing shared_ptr.
The downside is that copied a shared_ptr is not free: even "lock-free" implementations have to use an interlocked operation to honour threading guarantees. So there may be situations where a program can be significantly sped up by changing a shared_ptr into a shared_ptr &. But it this is not a change that can be safely made to all programs. It changes the logical meaning of the program.
Note that a similar bug would occur if we used std::string throughout instead of std::shared_ptr<std::string>, and instead of:
previous_message = 0;
to clear the message, we said:
previous_message.clear();
Then the symptom would be the accidental sending of an empty message, instead of undefined behaviour. The cost of an extra copy of a very large string may be a lot more significant than the cost of copying a shared_ptr, so the trade-off may be different.
I would advise against this practice unless you and the other programmers you work with really, really know what you are all doing.
First, you have no idea how the interface to your class might evolve and you want to prevent other programmers from doing bad things. Passing a shared_ptr by reference isn't something a programmer should expect to see, because it isn't idiomatic, and that makes it easy to use it incorrectly. Program defensively: make the interface hard to use incorrectly. Passing by reference is just going to invite problems later on.
Second, don't optimize until you know this particular class is going to be a problem. Profile first, and then if your program really needs the boost given by passing by reference, then maybe. Otherwise, don't sweat the small stuff (i.e. the extra N instructions it takes to pass by value) instead worry about design, data structures, algorithms, and long-term maintainability.
Yes, taking a reference is fine there. You don't intend to give the method shared ownership; it only wants to work with it. You could take a reference for the first case too, since you copy it anyway. But for first case, it takes ownership. There is this trick to still copy it only once:
void ClassA::take_copy_of_sp(boost::shared_ptr<foo> sp) {
m_sp_member.swap(sp);
}
You should also copy when you return it (i.e not return a reference). Because your class doesn't know what the client is doing with it (it could store a pointer to it and then big bang happens). If it later turns out it's a bottleneck (first profile!), then you can still return a reference.
Edit: Of course, as others point out, this only is true if you know your code and know that you don't reset the passed shared pointer in some way. If in doubt, just pass by value.
It is sensible to pass shared_ptrs by const&. It will not likely cause trouble (except in the unlikely case that the referenced shared_ptr is deleted during the function call, as detailed by Earwicker) and it will likely be faster if you pass a lot of these around. Remember; the default boost::shared_ptr is thread safe, so copying it includes a thread safe increment.
Try to use const& rather than just &, because temporary objects may not be passed by non-const reference. (Even though a language extension in MSVC allows you to do it anyway)
In the second case, doing this is simpler:
Class::only_work_with_sp(foo &sp)
{
...
sp.do_something();
...
}
You can call it as
only_work_with_sp(*sp);
I would avoid a "plain" reference unless the function explicitely may modify the pointer.
A const & may be a sensible micro-optimization when calling small functions - e.g. to enable further optimizations, like inlining away some conditions. Also, the increment/decrement - since it's thread safe - is a synchronization point. I would not expect this to make a big difference in most scenarios, though.
Generally, you should use the simpler style unless you have reason not to. Then, either use the const & consistently, or add a comment as to why if you use it just in a few places.
I would advocate passing shared pointer by const reference - a semantics that the function being passed with the pointer does NOT own the pointer, which is a clean idiom for developers.
The only pitfall is in multiple thread programs the object being pointed by the shared pointer gets destroyed in another thread. So it is safe to say using const reference of shared pointer is safe in single threaded program.
Passing shared pointer by non-const reference is sometimes dangerous - the reason is the swap and reset functions the function may invoke inside so as to destroy the object which is still considered valid after the function returns.
It is not about premature optimization, I guess - it is about avoiding unnecessary waste of CPU cycles when you are clear what you want to do and the coding idiom has firmly been adopted by your fellow developers.
Just my 2 cents :-)
It seems that all the pros and cons here can actually be generalised to ANY type passed by reference not just shared_ptr. In my opinion, you should know the semantic of passing by reference, const reference and value and use it correctly. But there is absolutely nothing inherently wrong with passing shared_ptr by reference, unless you think that all references are bad...
To go back to the example:
Class::only_work_with_sp( foo &sp ) //Again, no copy here
{
...
sp.do_something();
...
}
How do you know that sp.do_something() will not blow up due to a dangling pointer?
The truth is that, shared_ptr or not, const or not, this could happen if you have a design flaw, like directly or indirectly sharing the ownership of sp between threads, missusing an object that do delete this, you have a circular ownership or other ownership errors.
One thing that I haven't seen mentioned yet is that when you pass shared pointers by reference, you lose the implicit conversion that you get if you want to pass a derived class shared pointer through a reference to a base class shared pointer.
For example, this code will produce an error, but it will work if you change test() so that the shared pointer is not passed by reference.
#include <boost/shared_ptr.hpp>
class Base { };
class Derived: public Base { };
// ONLY instances of Base can be passed by reference. If you have a shared_ptr
// to a derived type, you have to cast it manually. If you remove the reference
// and pass the shared_ptr by value, then the cast is implicit so you don't have
// to worry about it.
void test(boost::shared_ptr<Base>& b)
{
return;
}
int main(void)
{
boost::shared_ptr<Derived> d(new Derived);
test(d);
// If you want the above call to work with references, you will have to manually cast
// pointers like this, EVERY time you call the function. Since you are creating a new
// shared pointer, you lose the benefit of passing by reference.
boost::shared_ptr<Base> b = boost::dynamic_pointer_cast<Base>(d);
test(b);
return 0;
}
I'll assume that you are familiar with premature optimization and are asking this either for academic purposes or because you have isolated some pre-existing code that is under-performing.
Passing by reference is okay
Passing by const reference is better, and can usually be used, as it does not force const-ness on the object pointed to.
You are not at risk of losing the pointer due to using a reference. That reference is evidence that you have a copy of the smart pointer earlier in the stack and only one thread owns a call stack, so that pre-existing copy isn't going away.
Using references is often more efficient for the reasons you mention, but not guaranteed. Remember that dereferencing an object can take work too. Your ideal reference-usage scenario would be if your coding style involves many small functions, where the pointer would get passed from function to function to function before being used.
You should always avoid storing your smart pointer as a reference. Your Class::take_copy_of_sp(&sp) example shows correct usage for that.
Assuming we are not concerned with const correctness (or more, you mean to allow the functions to be able to modify or share ownership of the data being passed in), passing a boost::shared_ptr by value is safer than passing it by reference as we allow the original boost::shared_ptr to control it's own lifetime. Consider the results of the following code...
void FooTakesReference( boost::shared_ptr< int > & ptr )
{
ptr.reset(); // We reset, and so does sharedA, memory is deleted.
}
void FooTakesValue( boost::shared_ptr< int > ptr )
{
ptr.reset(); // Our temporary is reset, however sharedB hasn't.
}
void main()
{
boost::shared_ptr< int > sharedA( new int( 13 ) );
boost::shared_ptr< int > sharedB( new int( 14 ) );
FooTakesReference( sharedA );
FooTakesValue( sharedB );
}
From the example above we see that passing sharedA by reference allows FooTakesReference to reset the original pointer, which reduces it's use count to 0, destroying it's data. FooTakesValue, however, can't reset the original pointer, guaranteeing sharedB's data is still usable. When another developer inevitably comes along and attempts to piggyback on sharedA's fragile existence, chaos ensues. The lucky sharedB developer, however, goes home early as all is right in his world.
The code safety, in this case, far outweighs any speed improvement copying creates. At the same time, the boost::shared_ptr is meant to improve code safety. It will be far easier to go from a copy to a reference, if something requires this kind of niche optimization.
Sandy wrote: "It seems that all the pros and cons here can actually be generalised to ANY type passed by reference not just shared_ptr."
True to some extent, but the point of using shared_ptr is to eliminate concerns regarding object lifetimes and to let the compiler handle that for you. If you're going to pass a shared pointer by reference and allow clients of your reference-counted-object call non-const methods that might free the object data, then using a shared pointer is almost pointless.
I wrote "almost" in that previous sentence because performance can be a concern, and it 'might' be justified in rare cases, but I would also avoid this scenario myself and look for all possible other optimization solutions myself, such as to seriously look at adding another level of indirection, lazy evaluation, etc..
Code that exists past it's author, or even post it's author's memory, that requires implicit assumptions about behavior, in particular behavior about object lifetimes, requires clear, concise, readable documentation, and then many clients won't read it anyway! Simplicity almost always trumps efficiency, and there are almost always other ways to be efficient. If you really need to pass values by reference to avoid deep copying by copy constructors of your reference-counted-objects (and the equals operator), then perhaps you should consider ways to make the deep-copied data be reference counted pointers that can be copied quickly. (Of course, that's just one design scenario that might not apply to your situation).
I used to work in a project that the principle was very strong about passing smart pointers by value. When I was asked to do some performance analysis - I found that for increment and decrement of the reference counters of the smart pointers the application spends between 4-6% of the utilized processor time.
If you want to pass the smart pointers by value just to avoid having issues in weird cases as described from Daniel Earwicker make sure you understand the price you paying for it.
If you decide to go with a reference the main reason to use const reference is to make it possible to have implicit upcasting when you need to pass shared pointer to object from class that inherits the class you use in the interface.
In addition to what litb said, I'd like to point out that it's probably to pass by const reference in the second example, that way you are sure you don't accidentally modify it.
struct A {
shared_ptr<Message> msg;
shared_ptr<Message> * ptr_msg;
}
pass by value:
void set(shared_ptr<Message> msg) {
this->msg = msg; /// create a new shared_ptr, reference count will be added;
} /// out of method, new created shared_ptr will be deleted, of course, reference count also be reduced;
pass by reference:
void set(shared_ptr<Message>& msg) {
this->msg = msg; /// reference count will be added, because reference is just an alias.
}
pass by pointer:
void set(shared_ptr<Message>* msg) {
this->ptr_msg = msg; /// reference count will not be added;
}
Every code piece must carry some sense. If you pass a shared pointer by value everywhere in the application, this means "I am unsure about what's going on elsewhere, hence I favour raw safety". This is not what I call a good confidence sign to other programmers who could consult the code.
Anyway, even if a function gets a const reference and you are "unsure", you can still create a copy of the shared pointer at the head of the function, to add a strong reference to the pointer. This could also be seen as a hint about the design ("the pointer could be modified elsewhere").
So yes, IMO, the default should be "pass by const reference".