Smart-pointers are generally tiny so passing by value isn't a problem, but is there any problem passing references to them; or rather are there specific cases where this mustn't be done?
I'm writing a wrapper library and several of my classes wrap smart-pointer objects in the underlying library... my classes are not smart-pointers but the APIs currently pass smart-pointer objects by value.
e.g. current code:
void class::method(const AnimalPtr pAnimal) { ... }
becomes
void class::method(const MyAnimal &animal){...}
where MyAnimal is my new wrapper class encapsulating AnimalPtr.
There is no guarantee the Wrapper classes won't one day grow beyond wrapping a smart-pointer, so passing by value makes me nervous.
You should pass shared pointers by reference, not value, in most cases. While the size of a std::shared_ptr is small, the cost of copying involves an atomic operation (conceptually an atomic increment and an atomic decrement on destruction of the copy, although I believe that some implementations manage to do a non-atomic increment).
In other cases, for example std::unique_ptr you might prefer to pass by value, as the copy will have to be a move and it clearly documents that ownership of the object is transferred to the function (if you don't want to transfer ownership, then pass a reference to the real object, not the std::unique_ptr).
In other cases your mileage might vary. You need to be aware of what the semantics of copy are for your smart pointer, and whether you need to pay for the cost or not.
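To make the copy cost visible, here is a minimal sketch that observes the reference count from inside the callee. The Animal type and the function names are made up for illustration; only the std::shared_ptr calls come from the standard library.

```cpp
#include <cassert>
#include <memory>

struct Animal { int legs = 4; };  // hypothetical pointee type

// By value: the parameter is a copy, so the reference count is bumped
// (atomically) for the duration of the call.
long count_by_value(std::shared_ptr<Animal> p) { return p.use_count(); }

// By const reference: no copy is made, so the count is unchanged.
long count_by_const_ref(const std::shared_ptr<Animal>& p) { return p.use_count(); }

// No ownership involved at all: take the real object instead of the pointer.
int inspect(const Animal& a) { return a.legs; }
```

With a single owner in the caller, the by-value version sees a count of 2 and the by-reference version sees 1.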
It's ok to pass a smart pointer by reference, except if it's to a constructor. In a constructor, it's possible to store a reference to the original object, which violates the contract of the smart pointers. You would likely get memory corruption if you did that. Even if your constructor does not today store the reference, I would still be wary because code changes and it's an easy thing to miss if you decide later you need to hold the variable longer.
In a normal function, you cannot store a function parameter as a reference anywhere because references must be set during their initialization. You could assign the reference to some longer-living non-reference variable, but that would be a copy and so would increase its lifetime appropriately. So in either case, you could not hold onto it past when the calling function might have freed it. In this case, you might get a small performance boost with a reference, but I wouldn't plan on noticing it in most cases.
So I would say - constructor, always pass by value; other functions, pass by reference if you want.
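A small sketch of the constructor case, assuming a hypothetical Keeper class that needs the Animal to outlive the caller's pointer. Taking the parameter by value and moving it into the member means the class always holds its own owning copy.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Animal { int legs = 4; };  // hypothetical pointee type

class Keeper {
public:
    // By value: the member is an owning copy, never a reference to the
    // caller's smart pointer, so it cannot dangle.
    explicit Keeper(std::shared_ptr<Animal> animal) : animal_(std::move(animal)) {}

    int legs() const { return animal_->legs; }
    long owners() const { return animal_.use_count(); }

private:
    std::shared_ptr<Animal> animal_;
};
```

The caller can reset its own pointer afterwards and the Keeper still has a valid Animal.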
I've been reading quite a number of discussions about performance issues when smart pointers are involved in an application. One of the frequent recommendations is to pass a smart pointer as const& instead of a copy, like this:
void doSomething(std::shared_ptr<T> o) {}
versus
void doSomething(const std::shared_ptr<T> &o) {}
However, doesn't the second variant actually defeat the purpose of a shared pointer? We are actually sharing the shared pointer here, so if for some reason the pointer is released in the calling code (think of reentrancy or side effects), that const reference becomes invalid, which is exactly the situation a shared pointer should prevent. I understand that const& saves some time as there is no copying involved and no locking to manage the ref count. But the price is making the code less safe, right?
The advantage of passing the shared_ptr by const& is that the reference count doesn't have to be increased and then decreased. Because these operations have to be thread-safe, they can be expensive.
You are quite right that there is a risk that you can have a chain of passes by reference that later invalidates the head of the chain. This happened to me once in a real-world project with real-world consequences. One function found a shared_ptr in a container and passed a reference to it down a call stack. A function deep in the call stack removed the object from the container, causing all the references to suddenly refer to an object that no longer existed.
So when you pass something by reference, the caller must ensure it survives for the life of the function call. Don't use a pass by reference if this is an issue.
(I'm assuming you have a use case where there's some specific reason to pass by shared_ptr rather than by reference. The most common such reason would be that the function called may need to extend the life of the object.)
Update: Some more details on the bug for those interested: This program had objects that were shared and implemented internal thread safety. They were held in containers and it was common for functions to extend their lifetimes.
This particular type of object could live in two containers. One when it was active and one when it was inactive. Some operations worked on active objects, some on inactive objects. The error case occurred when a command was received on an inactive object that made it active while the only shared_ptr to the object was held by the container of inactive objects.
The inactive object was located in its container. A reference to the shared_ptr in the container was passed, by reference, to the command handler. Through a chain of references, this shared_ptr ultimately got to the code that realized this was an inactive object that had to be made active. The object was removed from the inactive container (which destroyed the inactive container's shared_ptr) and added to the active container (which added another reference to the shared_ptr passed to the "add" routine).
At this point, it was possible that the only shared_ptr to the object that existed was the one in the inactive container. Every other function in the call stack just had a reference to it. When the object was removed from the inactive container, the object could be destroyed, and all those references then referred to a shared_ptr that no longer existed.
It took about a month to untangle this.
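The shape of that bug, and one way to avoid it, can be sketched as follows. This is not the original program; Session, Registry and the function names are invented for illustration. The idea is to copy the shared_ptr out of the container once at the top of the call chain, so that a live owner exists while references are passed further down.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <vector>

struct Session { int id; };  // hypothetical shared object
using SessionPtr = std::shared_ptr<Session>;

struct Registry {
    std::vector<SessionPtr> inactive;
    std::vector<SessionPtr> active;

    // Moves a session from `inactive` to `active`.
    // Hazard: if `s` refers to the element stored inside `inactive`, the
    // erase destroys that element and `s` dangles before the push_back.
    void activate(const SessionPtr& s) {
        auto it = std::find(inactive.begin(), inactive.end(), s);
        if (it != inactive.end()) inactive.erase(it);
        active.push_back(s);
    }

    // Safe entry point: take a local copy first, then pass a reference to
    // the copy (not to the container element) down the chain.
    bool activate_first() {
        if (inactive.empty()) return false;
        SessionPtr keep = inactive.front();  // second owner for this call
        activate(keep);
        return true;
    }
};
```

The single copy at the entry point costs one reference-count update, and every function below it can keep taking const references safely.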
First of all, don't pass a shared_ptr down a call chain unless there is a possibility that one of the called functions will store a copy of it. Pass a reference to the referred object, or a raw pointer to that object, or possibly a box, depending on whether it can be optional or not.
But when you do pass a shared_ptr, then preferably pass it by reference to const, because copying a shared_ptr has additional overhead. The copying must update the shared reference count, and this update must be thread safe. Hence there is a little inefficiency that can be (safely) avoided.
Regarding
"the price is making the code less safe, right?"
No. The price is an extra indirection in naïvely generated machine code, but the compiler manages that. So it's all about just avoiding a minor but totally needless overhead that the compiler can't optimize away for you, unless it's super-smart.
As David Schwarz showed in his answer, passing by reference to const makes the aliasing problem possible: the function you call may change the original object, or call another function that changes it. And by Murphy's law it will happen at the most inconvenient time, at maximum cost, and with the most convoluted impenetrable code. But this is so regardless of whether the argument is a string or a shared_ptr or whatever. Happily it's a very rare problem. But do keep it in mind, also for passing shared_ptr instances.
First of all there is a semantic difference between the two:
Passing a shared pointer by value indicates that your function is going to take its share of ownership of the underlying object.
Passing a shared_ptr as a const reference does not indicate any intent beyond passing the underlying object by const reference (or raw pointer), apart from forcing users of this function to use shared_ptr. So mostly rubbish.
Comparing performance implications of those is irrelevant as long as they are semantically different.
from https://herbsutter.com/2013/06/05/gotw-91-solution-smart-pointer-parameters/
Don’t pass a smart pointer as a function parameter unless you want to
use or manipulate the smart pointer itself, such as to share or
transfer ownership.
and this time I totally agree with Herb :)
And another quote from the same, which answers the question more directly
Guideline: Use a non-const shared_ptr& parameter only to modify the shared_ptr. Use a const shared_ptr& as a parameter only if you’re not sure whether or not you’ll take a copy and share ownership; otherwise use * instead (or if not nullable, a &)
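One way to read those guidelines as function signatures is sketched below. Widget, Holder and the function names are hypothetical; the point is only which parameter type goes with which intent.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Widget { int value = 0; };                  // hypothetical pointee
struct Holder { std::shared_ptr<Widget> kept; };   // hypothetical long-lived owner

// Only uses the object, never null: plain reference.
int read_value(const Widget& w) { return w.value; }

// Only uses the object, may be absent: raw pointer.
int read_value_or(const Widget* w, int fallback) { return w ? w->value : fallback; }

// Shares ownership: shared_ptr by value, moved into its final home.
void share_with(Holder& h, std::shared_ptr<Widget> w) { h.kept = std::move(w); }

// Modifies the caller's pointer itself: non-const shared_ptr&.
void reseat(std::shared_ptr<Widget>& w, int v) {
    w = std::make_shared<Widget>();
    w->value = v;
}
```

Callers of the first two functions write read_value(*sp) or read_value_or(sp.get(), 0), so those functions work with any ownership scheme.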
As pointed out in C++ - shared_ptr: horrible speed, copying a shared_ptr takes time. The construction involves an atomic increment and the destruction an atomic decrement. An atomic update (whether increment or decrement) may prevent a number of compiler optimizations (memory loads/stores cannot migrate across the operation), and at the hardware level it involves the CPU cache coherency protocol to ensure that the whole cache line is owned (exclusive mode) by the core doing the modification.
So, you are right, std::shared_ptr<T> const& may be used as a performance improvement over just std::shared_ptr<T>.
You are also right that there is a theoretical risk for the pointer/reference to become dangling because of some aliasing.
That being said, the risk is latent in any C++ program already: any single use of a pointer or reference is a risk. I would argue that the few occurrences of std::shared_ptr<T> const& should be a drop in the water compared to the total number of uses of T&, T const&, T*, ...
Lastly, I would like to point that passing a shared_ptr<T> const& is weird. The following cases are common:
shared_ptr<T>: I need a copy of the shared_ptr
T*/T const*/T&/T const&: I need a (possibly null) handle to T
The next case is much less common:
shared_ptr<T>&: I may reseat the shared_ptr
But passing shared_ptr<T> const&? Legitimate uses are very very rare.
Passing shared_ptr<T> const& where all you want is a reference to T is an anti-pattern: you force the user to use shared_ptr when they could be allocating T another way! Most of the time (99,99..%), you should not care how T is allocated.
The only case where you would pass a shared_ptr<T> const& is if you are not sure whether you will need a copy or not, and because you have profiled the program and showed that this atomic increment/decrement was a bottleneck you have decided to defer the creation of the copy to only the cases where it is needed.
This is such an edge case that any use of shared_ptr<T> const& should be viewed with the highest degree of suspicion.
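For completeness, the deferred-copy edge case might look like the sketch below. Job and Queue are hypothetical names; the copy (and its reference-count update) happens only on the path that actually keeps the pointer.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Job { int priority = 0; };  // hypothetical pointee

class Queue {
public:
    // const&: no reference-count traffic unless the job is accepted.
    bool offer(const std::shared_ptr<Job>& job) {
        if (!job || job->priority < 1) return false;  // rejected: no copy made
        jobs_.push_back(job);                         // accepted: copy, count + 1
        return true;
    }

private:
    std::vector<std::shared_ptr<Job>> jobs_;
};
```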
If no modification of ownership is involved in your method, there's no benefit in taking a shared_ptr by copy or by const reference: it pollutes the API and potentially incurs overhead (if passing by copy).
The clean way is to pass the underlying type by const ref or ref, depending on your use case:
void doSomething(const T& o) {}
auto s = std::make_shared<T>(...);
// ...
doSomething(*s);
The underlying pointer can't be released during the method call
I think it's perfectly reasonable to pass by const & if the target function is synchronous and only makes use of the parameter during execution, and has no further need of it upon return. Here it is reasonable to save on the cost of increasing the reference count - as you don't really need the extra safety in these limited circumstances - provided you understand the implications and are sure the code is safe.
This is as opposed to when the function needs to save the parameter (for example in a class member) for later re-reference.
Someone asked the question "should I pass shared_ptr by reference" and got this reply, which has plenty of upvotes. https://stackoverflow.com/a/8385731/5543597
It makes me wonder why he has so many upvotes, and if it's true what he is saying:
That depends on what you want. Should the callee share ownership of
the object? Then it needs its own copy of the shared_ptr. So pass it
by value.
If a function simply needs to access an object owned by the caller, go
ahead and pass by (const) reference, to avoid the overhead of copying
the shared_ptr.
The best practice in C++ is always to have clearly defined ownership
semantics for your objects. There is no universal "always do this" to
replace actual thought.
If you always pass shared pointers by value, it gets costly (because
they're a lot more expensive to copy than a raw pointer). If you never
do it, then there's no point in using a shared pointer in the first
place.
Copy the shared pointer when a new function or object needs to share
ownership of the pointee.
Especially here:
Should the callee share ownership of the object? Then it needs its own
copy of the shared_ptr. So pass it by value.
Why create a copy of the shared_ptr by passing by value, when it could be a reference and the callee could just make a copy of the shared_ptr from the reference it received, once it decides to store it in its data?
Or this:
If you never do it, then there's no point in using a shared pointer in
the first place.
When passing it by reference to functions, it still exists in the parent function. And once the callee decides to store it, it can be stored without problems.
Are these statements correct?
Yes that answer is correct.
Why create a copy of the shared_ptr by passing by value, when it could be a reference and the callee could just make a copy of the shared_ptr from the reference it received, once it decides to store it in its data?
You could do that. But passing by value serves as self-documentation that the callee may make a copy. Also, passing by value has the advantage that the function can be called with an rvalue and moved out of, e.g. func( make_shared<T>() );
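The rvalue advantage can be sketched like this, with Item and Store as made-up names. An lvalue argument is copied into the parameter, an rvalue argument is moved in, and the function then moves the parameter into its member either way.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Item { int x = 1; };  // hypothetical pointee

class Store {
public:
    // By value: one signature serves both lvalue and rvalue callers.
    void set(std::shared_ptr<Item> p) { held_ = std::move(p); }
    long owners() const { return held_.use_count(); }

private:
    std::shared_ptr<Item> held_;
};
```

Calling set(std::make_shared<Item>()) or set(std::move(mine)) leaves the Store as the only owner, with no reference-count increment followed by a decrement.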
I have a collection (currently boost::ptr_vector) of objects (lets call this vec) that needs to be passed to a few functors. I want all of the functors to have a reference/pointer to the same vec which is essentially a cache so that each functor has the same data cache. There are three ways that I can think of doing this:
Passing a boost::ptr_vector<object>& to the constructor of Functor and having a boost::ptr_vector<object>& member in the Functor class
Passing a boost::ptr_vector<object>* to the constructor of Functor and having a boost::ptr_vector<object>* member in the Functor class
avoid the use of boost::ptr_vector and directly pass an array (object*) to the constructor
I have tried to use method 3, but have been told constantly that I should use a vector instead of a raw pointer. So, I tried method 2 but this added latency to my program due to the extra level of indirection added by the pointer. I am using method 1 at the moment, however I may need to reassign the cache during the lifetime of the functor (as the data cache may change) so this may not be an appropriate alternative.
Which I don't fully understand. I assume somewhere along the way the functor is being copied (although these are all stored in a ptr_vector themselves).
Is method 3 the best for my case? method 2, is too slow (latency is very crucial), and as for method 1, I have been advised time and again to use vectors instead.
Any advice is much appreciated
A reference in C++ can only be initialized ('bound') to a variable.
After that point, a reference cannot be "reseated" (made to refer to a different variable) during its lifetime.
This is why a default copy constructor could conceivably be generated, but never the assignment operator, since that would require the reference to be 'changed'.
My recommended approach here is to use a smart pointer instead of a reference.
std::unique_ptr (simplest, takes care of allocation/deallocation)
std::shared_ptr (more involved, allows sharing of the ownership)
In this case:
std::shared_ptr<boost::ptr_vector<object> > m_coll;
would seem to be a good fit
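A rough sketch of that suggestion follows. To keep it self-contained, std::vector<int> stands in for boost::ptr_vector<object>, and the Functor class and set_cache are assumptions about what the real functor might look like.

```cpp
#include <cassert>
#include <memory>
#include <numeric>
#include <utility>
#include <vector>

using Cache = std::vector<int>;  // stand-in for boost::ptr_vector<object>

class Functor {
public:
    explicit Functor(std::shared_ptr<Cache> cache) : m_coll(std::move(cache)) {}

    // Unlike a reference member, the pointer member can be reseated,
    // and the class keeps its compiler-generated assignment operator.
    void set_cache(std::shared_ptr<Cache> cache) { m_coll = std::move(cache); }

    int operator()() const {
        return std::accumulate(m_coll->begin(), m_coll->end(), 0);
    }

private:
    std::shared_ptr<Cache> m_coll;
};
```

All functors constructed from the same shared_ptr see the same cache, and each one can later be pointed at a different cache independently.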
When passing objects into functions, do the same rules apply to smart pointers as to other objects that contain dynamic memory?
When I pass, for example, a std::vector<std::string> into a function I always consider the following options:
I'm going to change the state of the vector object, but I do not want those changes reflected after the function has finished, AKA make a copy.
void function(std::vector<std::string> vec);
I'm going to change the state of the vector object, and I do want those changes reflected after the function has finished, AKA make a reference.
void function(std::vector<std::string> & vec);
This object is pretty big, so I'd better pass a reference, but tell the compiler not to let me change it.
void function(std::vector<std::string> const& vec);
Now is this the same logic with smart pointers? And when should I consider move semantics? Some guidelines on how I should pass smart pointers is what I desire most.
Smart pointers have pointer semantics, not value semantics (well, not the way you mean it). Think of shared_ptr<T> as a T*; treat it as such (well, except for the reference counting and automatic deletion). Copying a smart pointer does not copy the object it points to, just like copying a T* does not copy the T it points to.
You can't copy a unique_ptr at all. The whole point of the class is that it cannot be copied; if it could, then it wouldn't be a unique (ie: singular) pointer to an object. You have to either pass it by some form of reference or by moving it.
Smart pointers are all about ownership of what they point to. Who owns this memory and who will be responsible for deleting it. unique_ptr represents unique ownership: exactly one piece of code owns this memory. You can transfer ownership (via move), but in so doing, you lose ownership of the memory. shared_ptr represents shared ownership.
In all cases, the use of a smart pointer in a parameter list represents transferring ownership. Therefore, if a function takes a smart pointer, then it is going to claim ownership of that object. If a function isn't supposed to take ownership, then it shouldn't be taking a smart pointer at all; use a reference (T&) or if you have need of nullability, a pointer but never store it.
If you are passing someone a unique_ptr, you are giving them ownership. Which means, by the nature of unique ownership, you are losing ownership of the memory. Thus, there's almost no reason to ever pass a unique_ptr by anything except by value.
Similarly, if you want to share ownership of some object, you pass in a shared_ptr. Whether you do it by reference or by value is up to you. Since you're sharing ownership, it's going to make a copy anyway (presumably), so you might as well take it by value. The function can use std::move to move it into class members or the like.
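As an illustration of unique ownership being handed over, here is a minimal sketch with a hypothetical Document class. The caller has to write std::move, and is left with a null pointer afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

class Document {
public:
    // Sink parameter: by value, so the caller must give up ownership.
    explicit Document(std::unique_ptr<std::string> text) : text_(std::move(text)) {}

    std::size_t size() const { return text_->size(); }

private:
    std::unique_ptr<std::string> text_;
};
```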
If the function isn't going to modify or make a copy of the pointer, just use a dumb pointer instead. Smart pointers are used to control the lifetime of an object, but the function isn't going to change the lifetime so it doesn't need a smart pointer, and using a dumb pointer gives you some flexibility in the type used by the caller.
void function(std::string * ptr);
function(my_unique_ptr.get());
function(my_shared_ptr.get());
function(my_dumb_ptr);
unique_ptr can't be copied without invalidating the original, so if you must pass it you must pass a reference.
For a more in-depth look at this recommendation by someone a lot smarter than me, see Herb Sutter's GotW #91 Solution: Smart Pointer Parameters. He goes beyond this recommendation and suggests that if the pointer can't be null, you should pass by reference instead of pointer. This requires dereferencing the pointer at the site of the call.
void function(std::string & val);
assert(my_unique_ptr != nullptr && my_shared_ptr != nullptr && my_dumb_ptr != nullptr);
function(*my_unique_ptr);
function(*my_shared_ptr);
function(*my_dumb_ptr);
A smart pointer is an object that refers to another object and manages its lifetime.
Passing a smart pointer requires respecting the semantics the smart pointer supports:
Passing as const smartptr<T>& always works (you cannot change the pointer, but you can change the state of what it points to).
Passing as smartptr<T>& always works (and you can change the pointer as well).
Passing as smartptr<T> (by copy) works only if smartptr is copyable. It works with std::shared_ptr, but not with std::unique_ptr, unless you "move" it on call, as in func(std::move(myptr)), which moves the pointer into the parameter and leaves myptr null. (Note that the move is implicit if myptr is a temporary.)
Passing as smartptr<T>&& (by move) requires the pointer to be moved on call, by forcing you to explicitly use std::move (but requires "move" to make sense for the particular pointer).
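One detail worth checking in code is the difference between the by-value and the && forms for std::unique_ptr. In the sketch below (function names are made up), both calls require std::move at the call site, but only the by-value parameter is guaranteed to take the pointer; an rvalue-reference parameter merely binds to it unless the callee actually moves from it.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// By value: the parameter is move-constructed, so the caller's pointer
// is null once the call has been made.
int consume(std::unique_ptr<int> p) { return *p; }

// By rvalue reference: the parameter only binds to the caller's pointer.
// This function never moves from it, so the caller keeps ownership.
int peek(std::unique_ptr<int>&& p) { return *p; }
```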
I know the person who asked this question is more knowledgeable than me in C++, and there are some perfect answers to this question, but I believe it is better answered in a way that doesn't scare people away from C++, although it could get a little complicated. This is my try:
Consider there is only one type of smart pointer in C++ and it is shared_ptr, so we have these options to pass it to a function:
1 - by value : void f(std::shared_ptr<Object>);
2 - by reference : void f(std::shared_ptr<Object>&);
The biggest difference is that the first one lends you the ownership and the second one lets you manipulate the ownership.
Further reading and details can be found at this link, which helped me before.
So, as we're all hopefully aware, in object-oriented programming, when you need to access an instance of a class in another class's method, you pass that instance as an argument.
I'm curious, what's the difference in terms of good practice / less prone to breaking things when it comes to either passing an Object, or a Pointer to that object?
Get into the habit of passing objects by reference.
void DoStuff(const vector<int>& values)
If you need to modify the original object, omit the const qualifier.
void DoStuff(vector<int>& values)
If you want to be able to accept an empty/nothing answer, pass it by pointer.
void DoStuff(vector<int>* values)
If you want to do stuff to a local copy, pass it by value.
void DoStuff(vector<int> values)
Problems will only really pop up when you introduce tons of concurrency. By that time, you will know enough to know when to not use certain passing techniques.
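The four DoStuff variants can be put side by side in a small sketch. The function names and bodies are invented to give each signature something to do.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Read only: const reference.
int sum(const std::vector<int>& values) {
    int total = 0;
    for (int v : values) total += v;
    return total;
}

// Modify the caller's object: non-const reference.
void double_all(std::vector<int>& values) {
    for (int& v : values) v *= 2;
}

// "Nothing" is a valid argument: pointer, checked before use.
std::size_t count_or_zero(const std::vector<int>* values) {
    return values ? values->size() : 0;
}

// Work on a private copy: by value. Precondition: values is not empty.
int largest(std::vector<int> values) {
    std::sort(values.begin(), values.end());  // sorts the copy only
    return values.back();
}
```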
Pass a pointer to the object if you want to be able to indicate nonexistence (by passing a NULL).
Try not to pass by value for objects, as that invokes a copy constructor to create a local version of the object within the scope of the call function. Instead, pass by reference. However, there are two modes here. In order to get the exact same effective behavior of passing by value (immutable "copy") without the overhead, pass by const reference. If you feel you will need to alter the passed object, pass by (non-const) reference.
I choose const reference as a default. Of course, non-const if you must mutate the object for the client. Deviation from using references is rarely required.
Pointers are not very C++-like, since references are available. References are nice because they are forbidden to refer to nothing. Update: To clarify, proper containers for types and arrays are preferred, but for some internal implementations, you will need to pass a pointer.
Objects/values, are completely different in semantics. If I need a copy, I will typically just create it inside the function where needed:
void method(const std::string& str) {
std::string myCopy(str);
...
In fact, what you can pass to a method is a pointer to the object, a reference to the object, or a copy of the object, and all of these can also be constant. You should choose the one that best suits your needs.
The first decision you can make is whether the thing you pass should be able to change in your method or not. If you do not intend to change it, then a const reference is probably the best alternative (by not changing I also mean you do not intend to call any non-const methods of that object). What are the advantages of that? You save the time needed to copy the object, and the method signature itself will say "I will not change that parameter".
If you have to change this object, you can pass either a reference or a pointer to it. Neither is mandatory, so you can go for either. The only difference I can think of is that a pointer can be NULL (i.e. not pointing to any object at all) while a reference always points to an existing object.
If what you need in your method is a copy of your object, then you should pass a copy of the object (not a reference and not a pointer). For instance, if your method looks like
void Foo(const A& a) {
A temp = a;
}
Then that is a clear indication that passing a copy is a better alternative.
Hope this makes things a bit clearer.
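To see why by-value is the better fit when the function needs a copy anyway, the sketch below counts copy-constructor calls. The counter and the function names foo_ref and foo_val are additions for illustration, not part of the answer's code.

```cpp
#include <cassert>

struct A {
    static int copies;           // counts copy-constructor calls
    A() = default;
    A(const A&) { ++copies; }
    A(A&&) noexcept {}
};
int A::copies = 0;

// Const reference plus an internal copy: always exactly one copy.
void foo_ref(const A& a) {
    A temp = a;
    (void)temp;
}

// By value: one copy for lvalue arguments, none for temporaries.
void foo_val(A a) { (void)a; }
```

Both versions copy an lvalue argument once, but only the by-value version avoids the copy when the caller passes a temporary.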
Actually, there's really no good reason for passing a pointer to an object, unless you want to somehow indicate that no object exists.
If you want to change the object, pass a reference to it. If you want to protect it from change within the function, pass it by value or at least const reference.
Some people pass by reference for the speed improvements (passing only an address of a large structure rather than the structure itself for example) but I don't agree with that. In most cases, I'd prefer my software to be safe than fast, a corollary of the saying: "you can't get any less optimised than wrong". :-)
Object-oriented programming is about polymorphism, Liskov Substitution Principle, old code calling new code, you name it. Pass a concrete (derived) object to a routine that works with more abstract (base) objects. If you are not doing that, you are not doing OOP.
This is only achievable when passing references or pointers. Passing by value is best reserved for, um, values.
It is useful to distinguish between values and objects. Values are always concrete, there's no polymorphism. They are often immutable. 5 is 5 and "abc" is "abc". You can pass them by value or by (const) reference.
Objects are always abstract to some degree. Given an object, one can almost always refine it to a more concrete object. A RectangularArea could be a Drawable which could be a Window which could be a ToplevelWindow which could be a ManagedWindow which could be... These must be passed by reference.
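A short sketch of what goes wrong when such an object is passed by value: the derived part is sliced off and the base-class behaviour is used. Drawable and Window are borrowed from the example above; their members are invented.

```cpp
#include <cassert>
#include <string>

struct Drawable {
    virtual ~Drawable() = default;
    virtual std::string name() const { return "Drawable"; }
};

struct Window : Drawable {
    std::string name() const override { return "Window"; }
};

// By value: only the Drawable part of the argument is copied (slicing).
std::string by_value(Drawable d) { return d.name(); }

// By reference: the dynamic type is preserved.
std::string by_reference(const Drawable& d) { return d.name(); }
```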
Pointers are a wholly separate can of worms. In my experience, naked pointers are best avoided. Use a smart pointer that cannot be NULL. If you need an optional argument, use an explicit optional class template such as boost::optional.
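As a sketch of the optional-argument idea, the example below uses std::optional from C++17 in place of boost::optional (a substitution made here to avoid the Boost dependency; the two are used the same way in this case). The greet function is hypothetical.

```cpp
#include <cassert>
#include <optional>
#include <string>

// The optional parameter states in the signature that the name may be absent,
// without resorting to a nullable pointer.
std::string greet(const std::optional<std::string>& name = std::nullopt) {
    return "hello " + name.value_or("stranger");
}
```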