I think it's best if I describe the situation using a code example:
int MyFuncB( MyClass* instance );  // declared before its first use

int MyFuncA()
{
    MyClass someInstance;
    //<Work with and fill someInstance...>
    return MyFuncB( &someInstance );
}
int MyFuncB( MyClass* instance )
{
    //Do anything you could imagine with instance, *except*:
    //* Allowing references to it or any of its data members to escape this function
    //* Freeing anything the class will free in its destructor, including itself
    instance->DoThis();
    instance->ModifyThat();
    return 0;
}
And here come my straightforward questions:
Is the above concept guaranteed, by C and C++ standards, to work as expected? Why (not)?
Is doing this, sparingly and with care, considered bad practice?
Is the above concept guaranteed, by C and C++ standards, to work as expected? Why (not)?
Yes, it will work as expected. someInstance remains alive throughout the scope of MyFuncA, and the call to MyFuncB is within that scope.
Is doing this, sparingly and with care, considered bad practice?
Don't see why.
I don't see any problem in actually using the pointer you were passed to call functions on the object. As long as you call public methods of MyClass, everything remains valid C/C++.
The actual instance you create at the beginning of MyFuncA() will get destroyed at the end of MyFuncA(), and you are guaranteed that the instance will remain valid for the whole execution of MyFuncB() because someInstance is still valid in the scope of MyFuncA().
Yes, it will work. It does not matter whether the object that the pointer passed to MyFuncB points to lives on the stack or on the heap (in this specific case).
As for the bad practice part, you can probably argue both ways. In general I think it's bad, because if for any reason an object living outside of MyFuncA gets hold of a reference to the instance, that reference will dangle once MyFuncA returns and cause bugs that are sometimes very hard to track down. It really depends on how extensive the usage of the object becomes in MyFuncB. Especially when it starts involving a third class, it can get messy.
Others have answered the basic question, with "yeah, that's legal". And in the absence of greater architecture it is hard to call it good or bad practice. But I'll try and wax philosophical on the broader question you seem to be picking up about pointers, object lifetimes, and expectations across function calls...
In the C++ language, there's no built-in way to pass a pointer to a function and "enforce" that it won't stow that away after the call is complete. And since C++ pointers are "weak references" by default, the objects pointed to may disappear out from under someone you pass it to.
But explicitly weak pointer abstractions do exist, for instance in Qt:
http://doc.qt.nokia.com/latest/qweakpointer.html
These are designed to specifically encode the "paranoia" to the recipient that the object it is holding onto can disappear out from under it. Anyone dereferencing one sort of realizes something is up, and they have to take the proper cautions under the design contract.
Additionally, abstractions like shared pointer exist which signal a different understanding to the recipient. Passing them one of those gives them the right to keep the object alive as long as they want, giving you something like garbage collection:
http://doc.qt.nokia.com/4.7-snapshot/qsharedpointer.html
These are only some options. But in the most general sense, if you come up with any interesting invariant for the lifetimes of your object...consider not passing raw pointers. Instead pass some pointer-wrapping class that embodies and documents the rules of the "game" in your architecture.
(One of the major reasons to use C++ instead of other languages is the wealth of tools you have to do cool things like that, without too much runtime cost!)
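As a rough illustration of the "pointer-wrapping class" idea, here is a minimal sketch that uses the C++11 standard library types std::shared_ptr and std::weak_ptr instead of the Qt classes linked above. Widget and describe are hypothetical names invented for this example; nothing here comes from the original question.

```cpp
#include <memory>
#include <string>

// Hypothetical type used only for this illustration.
struct Widget {
    std::string name;
};

// The weak_ptr parameter documents that the callee does not own the object
// and must check whether it still exists before touching it.
std::string describe(const std::weak_ptr<Widget>& observer) {
    // lock() returns an owning shared_ptr, or an empty one if the object is gone.
    if (std::shared_ptr<Widget> alive = observer.lock()) {
        return "alive: " + alive->name;
    }
    return "expired";
}
```

A caller that holds the owning std::shared_ptr can hand out std::weak_ptr copies freely; once the last owner is reset, describe reports the object as expired instead of dereferencing a dangling pointer.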
I don't think there should be any problem with that barring, as you say, something that frees the object or otherwise trashes its state. I think whatever unexpected things happen would not have anything to do with using the class this way. (Nothing in life is guaranteed of course, but classes are intended to be passed around and operated on; whether it's a local variable or otherwise is, I believe, not relevant.)
The one thing you would not be able to do is keep a reference to the object after it goes out of scope when MyFuncA() returns, but that's just the nature of the scoping rules.
I have an object which implements reference counting mechanism. If the number of references to it becomes zero, the object is deleted.
I found that my object is never deleted, even when I am done with it. This is leading to memory overuse. All I have is the number of references to the object and I want to know the places which reference it so that I can write appropriate cleanup code.
Is there some way to accomplish this without having to grep in the source files? (That would be very cumbersome.)
A huge part of getting reference counting (refcounting) right in C++ is to use Resource Acquisition Is Initialization (RAII), so that it's much harder to accidentally leak references. However, this doesn't solve everything with refcounts.
That said, you can implement a debug feature in your refcounting which tracks what is holding references. You can then analyze this information when necessary, and remove it from release builds. (Use a configuration macro similar in purpose to how DEBUG macros are used.)
Exactly how you should implement it is going to depend on all your requirements, but there are two main ways to do this (with a brief overview of differences):
store the information on the referenced object itself:
    accessible from your debugger
    easier to implement
output to a special trace file every time a reference is acquired or released:
    still available after the program exits (even abnormally)
    possible to use while the program is running, without running in your debugger
    can be used even in special release builds and sent back to you for analysis
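The first option (keeping the information on the referenced object) could look roughly like the sketch below. Everything in it is an assumption made for illustration: the class name Tracked, the REFCOUNT_DEBUG macro, and the idea that each holder identifies itself with a string label when it acquires or releases a reference.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical configuration macro; set it to 0 in release builds.
#define REFCOUNT_DEBUG 1

class Tracked {
public:
    // Each holder passes a label (for example "Cache") when acquiring.
    void addRef(const std::string& holder) {
        ++count_;
#if REFCOUNT_DEBUG
        ++holders_[holder];
#endif
    }

    // Returns true when the last reference was dropped, i.e. when the
    // owner of this object should now delete it.
    bool release(const std::string& holder) {
#if REFCOUNT_DEBUG
        auto it = holders_.find(holder);
        if (it != holders_.end() && --it->second == 0) holders_.erase(it);
#endif
        return --count_ == 0;
    }

    std::size_t count() const { return count_; }

#if REFCOUNT_DEBUG
    // Inspect this in the debugger, or dump it to a log, to see who still
    // holds references and how many each holder has.
    const std::map<std::string, int>& holders() const { return holders_; }
#endif

private:
    std::size_t count_ = 0;
#if REFCOUNT_DEBUG
    std::map<std::string, int> holders_;
#endif
};
```

When an object refuses to die, the remaining entries in holders() name the code that forgot to release it. The trace-file variant would write the same label to a log inside addRef and release instead of storing it.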
The basic problem, of knowing what is referencing a given object, is hard to solve in general, and will require some work. Compare: can you tell me every person and business that knows your postal address or phone number?
One known weakness of reference counting is that it does not work when there are cyclic references, i.e. (in the simplest case) when one object has a reference to another object which in turn has a reference to the former object. This sounds like a non-issue, but in data structures such as binary trees with back-references to parent nodes, there you are.
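For the tree-with-parent-pointers case, a common way out is to make the back-reference non-owning. The sketch below uses std::shared_ptr and std::weak_ptr from the C++11 standard library as a stand-in for whatever refcounting scheme is actually in use; Node and build_and_drop are hypothetical names.

```cpp
#include <memory>

// Hypothetical tree node for illustration.
struct Node {
    std::shared_ptr<Node> child;   // owning reference down the tree
    std::weak_ptr<Node> parent;    // non-owning back-reference, does not bump the count
};

// Builds a parent/child pair and returns a weak handle to the parent.
// Because the back-reference is weak, there is no ownership cycle and
// both nodes are destroyed when the local owning pointer goes away.
std::weak_ptr<Node> build_and_drop() {
    std::shared_ptr<Node> root = std::make_shared<Node>();
    root->child = std::make_shared<Node>();
    root->child->parent = root;
    return std::weak_ptr<Node>(root);
}
```

If parent were a std::shared_ptr<Node> instead, the two nodes would keep each other alive and the returned handle would never expire.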
If you don't explicitly provide for a list of "reverse" references in the referenced (un-freed) object, I don't see a way to figure out who is referencing it.
In the following suggestions, I assume that you don't want to modify your source, or if so, just a little.
You could of course walk the whole heap / freestore and search for the memory address of your un-freed object, but if its address turns up, it's not guaranteed to actually be a memory address reference; it could just as well be any random floating point number, or anything else. However, if the found value lies inside a block of memory that your application allocated for an object, chances improve a little that it's indeed a pointer to another object.
One possible improvement over this approach would be to modify the memory allocator you use -- e.g. your global operator new -- so that it keeps a list of all allocated memory blocks and their sizes. (In a complete implementation of this, operator delete would have to remove the list entry for the freed block of memory.) Now, at the end of your program, you have a clue where to search for the un-freed object's memory address, since you have a list of memory blocks that your program actually used.
The above suggestions don't sound very reliable to me, to be honest; but maybe defining a custom global operator new and operator delete that does some logging / tracing goes in the right direction to solve your problem.
I am assuming you have some class with say addRef() and release() member functions, and you call these when you need to increase and decrease the reference count on each instance, and that the instances that cause problems are on the heap and referred to with raw pointers. The simplest fix may be to replace all pointers to the controlled object with boost::shared_ptr. This is surprisingly easy to do and should enable you to dispense with your own reference counting - you can just make those functions I mentioned do nothing. The main change required in your code is in the signatures of functions that pass or return your pointers. Other places to change are in initializer lists (if you initialize pointers to null) and if()-statements (if you compare pointers with null). The compiler will find all such places after you change the declarations of the pointers.
If you do not want to use the shared_ptr - maybe you want to keep the reference count intrinsic to the class - you can craft your own simple smart pointer just to deal with your class. Then use it to control the lifetime of your class objects. So for example, instead of pointer assignment being done with raw pointers and you "manually" calling addRef(), you just do an assignment of your smart pointer class which includes the addRef() automatically.
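A hand-rolled smart pointer of the kind described could be sketched as follows. Counted stands in for the class that already has addRef() and release(); both class names and the exact member signatures are assumptions, since the question does not show the real class.

```cpp
// Hypothetical intrusively counted class, standing in for the asker's object.
class Counted {
public:
    void addRef() { ++refs_; }
    void release() { if (--refs_ == 0) delete this; }
    int refs() const { return refs_; }
private:
    ~Counted() {}       // only release() may destroy the object
    int refs_ = 0;
};

// Minimal handle: every copy calls addRef(), every destruction calls release().
class CountedPtr {
public:
    explicit CountedPtr(Counted* p = nullptr) : p_(p) { if (p_) p_->addRef(); }
    CountedPtr(const CountedPtr& other) : p_(other.p_) { if (p_) p_->addRef(); }
    CountedPtr& operator=(CountedPtr other) {   // copy-and-swap
        Counted* tmp = p_;
        p_ = other.p_;
        other.p_ = tmp;                         // old target is released by other's destructor
        return *this;
    }
    ~CountedPtr() { if (p_) p_->release(); }
    Counted* operator->() const { return p_; }
    Counted& operator*() const { return *p_; }
private:
    Counted* p_;
};
```

With this in place, code never calls addRef() or release() directly, so a forgotten release() can no longer keep the object alive. Boost also ships a ready-made equivalent, boost::intrusive_ptr, which may be preferable to writing your own.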
I don't think it's possible to do something without code change. With code change you can for example remember the pointers of the objects which increase reference count, and then see what pointer is left and examine it in the debugger. If possible - store more verbose information, such as object name.
I have created one for my needs. You can compare your code with this one and see what's missing. It's not perfect but it should work in most of the cases.
http://sites.google.com/site/grayasm/autopointer
when I use it I do:
util::autopointer<A> aptr=new A();
I never do it like this:
A* ptr = new A();
util::autopointer<A> aptr = ptr;
and then later start fooling around with ptr; that's not allowed.
From then on I use only aptr to refer to this object.
If I am wrong, I now have the chance to be corrected. :) See ya!
I'm currently doing my first real project in C++ and so, fairly new to pointers. I know what they are and have read some basic usage rules. Probably not enough since I still do not really understand when to use them, and when not.
The problem is that most places just mention that most people either overuse them or underuse them. My question is: when should I use them, and when not?
Currently, in many cases I'm asking myself whether I should use a pointer here or just pass the variable itself to the function.
For instance, I know that you can send a pointer to a function so the function can actually alter the variable itself instead of a copy of it. But when you just need to get some information from the object once (for instance, the method only needs to call something like getValue()), are pointers useful in that case?
I would love to see replies, but also links that might be helpful. Since this is my first time using C++, I do not yet have a good C++ book (I was thinking about buying one if I keep on using C++, which I probably will).
For the do's and dont's of C++:
Effective C++ and More Effective C++ by Scott Meyers.
For pointers (and references):
use pass by value if the type fits into 4 bytes and you don't want to have it changed after the return of the call.
use pass by reference to const if the type is larger and you don't want to have it changed after the return of the call.
use pass by reference if the parameter can't be NULL
use a pointer otherwise.
don't use raw pointers if you don't need to. Most of the time, a smart pointer (see Boost) is the better option.
From the c++ faq:
Use references when you can, and
pointers when you have to.
https://isocpp.org/wiki/faq/references#refs-vs-ptrs
1) I tend to use member variables scoped with the class. They are constructed in the initializer of the class, and I don't need to worry about pointers.
2) You can pass by reference to a function, and not worry about passing pointers. This effectively will pass a pointer to the method / function that can be used as if you passed the class, but without the overhead of copying the class itself.
3) If I need to control the lifetime of an object that is independent of my main application architecture's classes... then I will use an auto_ptr from the standard library, which automatically destroys the object when the owning auto_ptr goes out of scope. Check it out - it's the way to go. (In C++11 and later, use std::unique_ptr instead: auto_ptr was deprecated in C++11 and removed in C++17.)
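Since std::auto_ptr no longer exists in current C++ (it was deprecated in C++11 and removed in C++17), here is a minimal sketch of point 3 using its replacement, std::unique_ptr. Resource and use_resource are hypothetical names; the bool flag exists only so the destruction can be observed.

```cpp
#include <memory>

// Hypothetical resource that records when it is destroyed.
struct Resource {
    explicit Resource(bool& destroyed_flag) : destroyed(destroyed_flag) {}
    ~Resource() { destroyed = true; }
    bool& destroyed;
};

// The unique_ptr owns the Resource and deletes it when the function returns,
// whether that happens normally or through an exception.
void use_resource(bool& destroyed_flag) {
    std::unique_ptr<Resource> r(new Resource(destroyed_flag));
    // ... work with *r here; no explicit delete is needed ...
}
```

Note that unique_ptr models a single owner. It does not count references, so if several parts of the program need to share the object, std::shared_ptr is the closer fit.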
Use it whenever you are dealing with allocated memory or passing arguments by reference to a method; I don't think there is a rule for not using pointers.
My rules of thumb:
Always pass function parameters as const references,
unless they are built-in types, in which case they are copied (and const/non-const becomes a question of style as the caller isn't affected) or
unless they are meant to be changed inside the function so that the changes reflect at the caller's, in which case they are passed by non-const reference or
unless the function should be callable even if callers don't have an object to pass, then they are passed as pointers, so that callers can pass in NULL pointers instead (apply #1 and #3 to decide whether to pass per const T* or per T*)
Streams must always be passed around as non-const references.
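These rules of thumb translate into function signatures roughly like the following sketch. All function names are made up for the example.

```cpp
#include <cstddef>
#include <sstream>   // provides std::ostream (and std::ostringstream for trying it out)
#include <string>
#include <vector>

// Built-in type: pass by value.
int twice(int x) { return 2 * x; }

// Read-only access to a larger object: pass by const reference.
std::size_t total_length(const std::vector<std::string>& words) {
    std::size_t n = 0;
    for (const std::string& w : words) n += w.size();
    return n;
}

// The caller should see the change: pass by non-const reference.
void append_bang(std::string& s) { s += '!'; }

// The argument is optional: pass a pointer so callers can pass a null pointer.
std::size_t length_or_zero(const std::string* s) { return s ? s->size() : 0; }

// Writing to a stream modifies it, so streams travel as non-const references.
void greet(std::ostream& out, const std::string& name) { out << "Hello, " << name; }
```

At the call site the difference shows up only for the pointer version, where the caller writes length_or_zero(&s) or length_or_zero(nullptr); the reference versions are called exactly like pass-by-value functions.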
Generally, when you can use references instead of pointers it is a good idea. A reference must have a target (no NULL pointer violations), they allow the same semantics as pointers when being passed as arguments to a function, and they are generally nicer to use for beginners (or those not coming from a C background).
Pointers are required when you want to do dynamic allocation of memory; when you need to deal with an unknown amount of things that will be later specified. In this case the interface to access memory is through new and delete which deal in pointers.
My philosophy is to always pass by value, unless you need to modify the variable passed or copying the object is expensive. In both these cases, consider using a reference instead of a pointer first: if you don't need to change which object you're referencing, nor do you need a possible extremal value (NULL pointer), you can use a reference.
Don't forget about iterators either.
All good answers above. Additionally, if you are performing some processor-intensive work, it's important to realize that dereferencing a pointer can easily cause a cache miss on your processor. It's a good idea to keep your data accessible with minimal pointer dereferences.
Class attributes: use pointers.
Variables declared in methods: no pointers, so we avoid memory leaks.
This way you prevent memory leaks and keep control over the attributes' consistency.
Regards.
I've been programming C, mainly in an embedded environment, for years now and have a perfectly good mental model of pointers - I don't have to explicitly think about how to use them, am 100% comfortable with pointer arithmetic, arrays of pointers, pointers-to-pointers etc.
I've written very little C++ and really don't have a good way of thinking about references. I've been advised in the past to "think of them as pointers that can't be NULL" but this question shows that that is far from the full story.
So for more experienced C++ programmers - how do you think of references? Do you think of them as a special sort of pointer, or as their own thing entirely? What's a good way for a C programmer to get their head round the concept?
I've gotten used to thinking of a reference as an alias for the original object.
EDIT (due to a request in the comments):
I think of a reference as a kind of alias because it behaves in exactly the same way as the original variable, without any extra manipulation needed in order to affect the variable referenced.
For me, when I see a pointer in code (as a local variable in a function or a member on a class), I have to think about
Is the pointer null, or is it valid
Who created the object it points to (is it me?, have I done it yet?)
Who is responsible for deleting the object
Does it always point to the same object
I don't have to think about any of that stuff if it's a reference, it's somebody else's problem (i.e. think of a reference as an SEP Field for a pointer)
P.S. Yes, it's probably still my problem, just not right now
I'm not all too fond of the "ever-valid" view, as references can become invalid, e.g.
int* p = new int(100);
int& ref = *p;
delete p; // oops - ref now references garbage
So, I think of references as non-rebindable (that is, you can't change the target of a reference once it's initialized) pointers with syntactic sugar to help me get rid of the "->" pointer syntax.
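The "non-rebindable" part is the one that most often surprises people coming from C, so here is a small sketch of it. The function names are invented for the example.

```cpp
// Returns the value of a after "assigning" through a reference bound to it.
int assign_through_reference() {
    int a = 1;
    int b = 2;
    int& ref = a;   // ref is bound to a for its whole lifetime
    ref = b;        // does NOT rebind ref; copies the value of b into a
    b = 3;          // has no effect on a or ref
    return a;
}

// The pointer equivalent: a T* const cannot be reseated either.
int assign_through_const_pointer() {
    int a = 1;
    int b = 2;
    int* const p = &a;  // p always points to a
    *p = b;             // explicit dereference where the reference needed none
    return a;
}
```

Both functions return 2: the assignment writes through to a, and there is no syntax at all for making ref refer to b afterwards.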
In general you just don't think about references. You use references in every function unless you have a specific need for calling by value or pointer magic.
References are essentially pointers that always point to the same thing. A reference doesn't need to be dereferenced, and can instead be accessed as a normal variable. That's pretty much all that there is to it. You use pointers when you need to do pointer arithmetic or change what the pointer points to, and references for just about everything else.
References are pointer-consts with different syntax. ie. the reference
T&
is pretty much
T * const
as in, the pointer cannot be changed. The content of both is identical - a memory address of a T - and neither can be changed.
Then apart from that pretty much the only difference is the syntax: . for references and -> and * for pointer.
That's it really - references ARE pointers, just with different syntax (and they're const).
How about "pointers that can't be NULL and can't be changed after initialisation". Also, they have no size by themselves (because they have no identity of themselves).
I think of the reference as being the object it refers to. You access the object using . semantics (as opposed to ->), reinforcing this idea for me.
I think your mental model of pointers, and then a list of all the edge cases you've encountered, is the best way.
Those who don't get pointers are going to fare far worse.
Incidentally, they can be NULL or any other non-accessible memory location (it just takes effort):
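const char* test = "aha";   // string literals are const in C++
const char& ok = *test;
test = NULL;
const char& bad = *test;    // undefined behaviour: dereferences a null pointer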
One way to think about them is as importing another name for an object from a possibly different scope.
For instance : Obj o; Obj& r = o;
There is really little difference between semantics of o and r.
The major one seems that the compiler watches the scope of o for calling the destructor.
I think of it as a pointer container.
If you use linux, you can think of references as hard links and pointers as symbolic links (symlinks).
Hard link is just another name for a file. The file gets "deleted" when all hard links to this file are removed.
The same goes for references. Just substitute "hard link" with "reference" and "file" with "value" (or probably "memory location"?).
A variable gets destroyed when all references are gone out of scope.
You can't create a hard link to a nonexistent file. Similarly, it's not possible to create a reference to nothing.
However you can create a symlink to a nonexistent file. Much like an uninitialized pointer. Actually uninitialized pointers do point to some random locations (correct me if I'm wrong). But what I mean is that you are not supposed to use them :)
From a syntactic POV, a reference is an alias for an existing object. From a semantic POV, a reference behaves like a pointer with a few problems (invalidation, ownership etc.) removed and an object-like syntax added. From a practical POV, prefer references unless you have the need to say "no object". (Resource ownership isn't a reason to prefer pointers, as this should be done using smart pointers.)
Update: Here's one additional difference between references and pointers which I forgot about: A temporary object (an rvalue) bound to a const reference will have its life-time extended to the life of the reference:
const std::string& result = function_returning_a_string();
Here, the temporary returned by the function is bound to result and will not cease to exist at the end of the expression, but will exist until result dies. This is nice, because in the absence of rvalue references and overloading based on them (as in C++11), this allows you to get rid of one unnecessary copy in the above example.
This is a rule introduced especially for const references and there's no way to achieve this with pointers.
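A self-contained version of that one-liner might look like the sketch below. make_greeting is a hypothetical stand-in for function_returning_a_string().

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for function_returning_a_string().
std::string make_greeting() { return std::string("hello"); }

std::size_t greeting_length() {
    // The temporary returned by make_greeting() is bound directly to the
    // reference, so its lifetime is extended to that of `result`.
    const std::string& result = make_greeting();
    return result.size();   // safe: the temporary has not been destroyed yet
}
```

The extension applies only when the temporary is bound directly to a local reference like this. A function that returns a reference to one of its own locals or temporaries still hands back a dangling reference.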
If I have a function that needs to work with a shared_ptr, wouldn't it be more efficient to pass it a reference to it (so to avoid copying the shared_ptr object)?
What are the possible bad side effects?
I envision two possible cases:
1) inside the function a copy is made of the argument, like in
void ClassA::take_copy_of_sp(boost::shared_ptr<foo> &sp)
{
...
m_sp_member=sp; //This will copy the object, incrementing refcount
...
}
2) inside the function the argument is only used, like in
void Class::only_work_with_sp(boost::shared_ptr<foo> &sp) //Again, no copy here
{
...
sp->do_something();
...
}
In neither case can I see a good reason to pass the boost::shared_ptr<foo> by value instead of by reference. Passing by value would only "temporarily" increment the reference count due to the copying, and then decrement it when exiting the function scope.
Am I overlooking something?
Just to clarify, after reading several answers: I perfectly agree on the premature-optimization concerns, and I always try to first-profile-then-work-on-the-hotspots. My question was more from a purely technical code-point-of-view, if you know what I mean.
I found myself disagreeing with the highest-voted answer, so I went looking for expert opinions and here they are.
From http://channel9.msdn.com/Shows/Going+Deep/C-and-Beyond-2011-Scott-Andrei-and-Herb-Ask-Us-Anything
Herb Sutter: "when you pass shared_ptrs, copies are expensive"
Scott Meyers: "There's nothing special about shared_ptr when it comes to whether you pass it by value, or pass it by reference. Use exactly the same analysis you use for any other user defined type. People seem to have this perception that shared_ptr somehow solves all management problems, and that because it's small, it's necessarily inexpensive to pass by value. It has to be copied, and there is a cost associated with that... it's expensive to pass it by value, so if I can get away with it with proper semantics in my program, I'm gonna pass it by reference to const or reference instead"
Herb Sutter: "always pass them by reference to const, and very occasionally maybe because you know what you called might modify the thing you got a reference from, maybe then you might pass by value... if you copy them as parameters, oh my goodness you almost never need to bump that reference count because it's being held alive anyway, and you should be passing it by reference, so please do that"
Update: Herb has expanded on this here: http://herbsutter.com/2013/06/05/gotw-91-solution-smart-pointer-parameters/, although the moral of the story is that you shouldn't be passing shared_ptrs at all "unless you want to use or manipulate the smart pointer itself, such as to share or transfer ownership."
The point of a distinct shared_ptr instance is to guarantee (as far as possible) that as long as this shared_ptr is in scope, the object it points to will still exist, because its reference count will be at least 1.
void Class::only_work_with_sp(boost::shared_ptr<foo> sp)
{
// sp points to an object that cannot be destroyed during this function
}
So by using a reference to a shared_ptr, you disable that guarantee. So in your second case:
void Class::only_work_with_sp(boost::shared_ptr<foo> &sp) //Again, no copy here
{
...
sp->do_something();
...
}
How do you know that sp->do_something() will not blow up due to a null pointer?
It all depends what is in those '...' sections of the code. What if you call something during the first '...' that has the side-effect (somewhere in another part of the code) of clearing a shared_ptr to that same object? And what if it happens to be the only remaining distinct shared_ptr to that object? Bye bye object, just where you're about to try and use it.
So there are two ways to answer that question:
Examine the source of your entire program very carefully until you are sure the object won't die during the function body.
Change the parameter back to be a distinct object instead of a reference.
General bit of advice that applies here: don't bother making risky changes to your code for the sake of performance until you've timed your product in a realistic situation in a profiler and conclusively measured that the change you want to make will make a significant difference to performance.
Update for commenter JQ
Here's a contrived example. It's deliberately simple, so the mistake will be obvious. In real examples, the mistake is not so obvious because it is hidden in layers of real detail.
We have a function that will send a message somewhere. It may be a large message so rather than using a std::string that likely gets copied as it is passed around to multiple places, we use a shared_ptr to a string:
void send_message(std::shared_ptr<std::string> msg)
{
std::cout << (*msg.get()) << std::endl;
}
(We just "send" it to the console for this example).
Now we want to add a facility to remember the previous message. We want the following behaviour: a variable must exist that contains the most recently sent message, but while a message is currently being sent then there must be no previous message (the variable should be reset before sending). So we declare the new variable:
std::shared_ptr<std::string> previous_message;
Then we amend our function according to the rules we specified:
void send_message(std::shared_ptr<std::string> msg)
{
previous_message = 0;
std::cout << *msg << std::endl;
previous_message = msg;
}
So, before we start sending we discard the current previous message, and then after the send is complete we can store the new previous message. All good. Here's some test code:
send_message(std::shared_ptr<std::string>(new std::string("Hi")));
send_message(previous_message);
And as expected, this prints Hi twice.
Now along comes Mr Maintainer, who looks at the code and thinks: Hey, that parameter to send_message is a shared_ptr:
void send_message(std::shared_ptr<std::string> msg)
Obviously that can be changed to:
void send_message(const std::shared_ptr<std::string> &msg)
Think of the performance enhancement this will bring! (Never mind that we're about to send a typically large message over some channel, so the performance enhancement will be so small as to be unmeasurable).
But the real problem is that now the test code will exhibit undefined behaviour (in Visual C++ 2010 debug builds, it crashes).
Mr Maintainer is surprised by this, but adds a defensive check to send_message in an attempt to stop the problem happening:
void send_message(const std::shared_ptr<std::string> &msg)
{
if (msg == 0)
return;
But of course it still goes ahead and crashes, because msg is never null when send_message is called.
As I say, with all the code so close together in a trivial example, it's easy to find the mistake. But in real programs, with more complex relationships between mutable objects that hold pointers to each other, it is easy to make the mistake, and hard to construct the necessary test cases to detect the mistake.
The easy solution, where you want a function to be able to rely on a shared_ptr continuing to be non-null throughout, is for the function to allocate its own true shared_ptr, rather than relying on a reference to an existing shared_ptr.
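One way to apply that advice while keeping the const reference in the signature is to take the owning copy inside the function, before anything else runs. The sketch below adapts the send_message example; returning the sent text instead of printing it is a change made here purely so the behaviour is easy to check, and the variable name pinned is invented.

```cpp
#include <memory>
#include <string>

std::shared_ptr<std::string> previous_message;

// Returns the text that was "sent"; the original example wrote it to
// std::cout instead.
std::string send_message(const std::shared_ptr<std::string>& msg) {
    // Take a local owning copy first. Even if msg aliases previous_message,
    // the string now stays alive until this function returns.
    std::shared_ptr<std::string> pinned = msg;

    previous_message.reset();      // may empty msg itself, but not pinned
    std::string sent = *pinned;
    previous_message = pinned;
    return sent;
}
```

With this version, send_message(previous_message) is well defined again. The cost is the same reference-count increment that pass-by-value would have paid, so this is mainly useful when the copy can be skipped on some code paths.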
The downside is that copying a shared_ptr is not free: even "lock-free" implementations have to use an interlocked operation to honour threading guarantees. So there may be situations where a program can be significantly sped up by changing a shared_ptr into a shared_ptr &. But this is not a change that can be safely made to all programs. It changes the logical meaning of the program.
Note that a similar bug would occur if we used std::string throughout instead of std::shared_ptr<std::string>, and instead of:
previous_message = 0;
to clear the message, we said:
previous_message.clear();
Then the symptom would be the accidental sending of an empty message, instead of undefined behaviour. The cost of an extra copy of a very large string may be a lot more significant than the cost of copying a shared_ptr, so the trade-off may be different.
I would advise against this practice unless you and the other programmers you work with really, really know what you are all doing.
First, you have no idea how the interface to your class might evolve and you want to prevent other programmers from doing bad things. Passing a shared_ptr by reference isn't something a programmer should expect to see, because it isn't idiomatic, and that makes it easy to use it incorrectly. Program defensively: make the interface hard to use incorrectly. Passing by reference is just going to invite problems later on.
Second, don't optimize until you know this particular class is going to be a problem. Profile first, and then if your program really needs the boost given by passing by reference, then maybe. Otherwise, don't sweat the small stuff (i.e. the extra N instructions it takes to pass by value); instead, worry about design, data structures, algorithms, and long-term maintainability.
Yes, taking a reference is fine there. You don't intend to give the method shared ownership; it only wants to work with it. You could take a reference for the first case too, since you copy it anyway. But for first case, it takes ownership. There is this trick to still copy it only once:
void ClassA::take_copy_of_sp(boost::shared_ptr<foo> sp) {
m_sp_member.swap(sp);
}
You should also copy when you return it (i.e. not return a reference), because your class doesn't know what the client is doing with it (it could store a pointer to it, and then the big bang happens). If it later turns out it's a bottleneck (profile first!), then you can still return a reference.
Edit: Of course, as others point out, this only is true if you know your code and know that you don't reset the passed shared pointer in some way. If in doubt, just pass by value.
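With C++11 the same "copy only once" effect is usually written with std::move rather than swap. The following is a sketch using std::shared_ptr; Foo, Holder and owners are hypothetical names, and the member name is borrowed from the question.

```cpp
#include <memory>
#include <utility>

struct Foo {};  // hypothetical payload type

class Holder {
public:
    // Sink parameter: the caller's argument is copied (or moved) into sp once,
    // then moved into the member without touching the reference count again.
    void take_copy_of_sp(std::shared_ptr<Foo> sp) { m_sp_member = std::move(sp); }

    // Number of shared_ptr instances currently owning the stored object.
    long owners() const { return m_sp_member.use_count(); }

private:
    std::shared_ptr<Foo> m_sp_member;
};
```

A caller that no longer needs its own pointer can pass std::move(p) and avoid the reference-count increment entirely; a caller that passes p by name pays for exactly one copy.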
It is sensible to pass shared_ptrs by const&. It will not likely cause trouble (except in the unlikely case that the referenced shared_ptr is deleted during the function call, as detailed by Earwicker) and it will likely be faster if you pass a lot of these around. Remember; the default boost::shared_ptr is thread safe, so copying it includes a thread safe increment.
Try to use const& rather than just &, because temporary objects may not be passed by non-const reference. (Even though a language extension in MSVC allows you to do it anyway)
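To make the point about temporaries concrete, here is a small sketch with std::shared_ptr. Base, Derived and id_of are hypothetical names chosen for the example.

```cpp
#include <memory>

struct Base {
    virtual ~Base() {}
    virtual int id() const { return 1; }
};
struct Derived : Base {
    int id() const override { return 2; }
};

// A const& parameter can bind to a temporary, including the temporary
// shared_ptr<Base> created when a shared_ptr<Derived> is converted.
int id_of(const std::shared_ptr<Base>& p) { return p->id(); }

// With a non-const reference parameter, for example
//     int id_of(std::shared_ptr<Base>& p);
// standard C++ rejects both a temporary argument and a shared_ptr<Derived>.
```

Calling id_of(std::make_shared<Base>()) passes a temporary directly, and calling it with a std::shared_ptr<Derived> works through an implicit conversion. In the converting case a temporary shared_ptr<Base> is created, so that particular call does pay for a reference-count increment after all.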
In the second case, doing this is simpler:
void Class::only_work_with_sp(foo &sp)
{
...
sp.do_something();
...
}
You can call it as
only_work_with_sp(*sp);
I would avoid a "plain" reference unless the function may explicitly modify the pointer.
A const & may be a sensible micro-optimization when calling small functions - e.g. to enable further optimizations, like inlining away some conditions. Also, the increment/decrement - since it's thread safe - is a synchronization point. I would not expect this to make a big difference in most scenarios, though.
Generally, you should use the simpler style unless you have reason not to. Then, either use the const & consistently, or add a comment as to why if you use it just in a few places.
I would advocate passing a shared pointer by const reference: it signals that the function receiving the pointer does NOT own it, which is a clean idiom for developers.
The only pitfall is that in multithreaded programs the object pointed to by the shared pointer may be destroyed by another thread. So passing a shared pointer by const reference can be considered safe in a single-threaded program.
Passing a shared pointer by non-const reference is sometimes dangerous: the function may call swap or reset on it, destroying an object that the caller still considers valid after the function returns.
It is not about premature optimization, I guess - it is about avoiding unnecessary waste of CPU cycles when you are clear what you want to do and the coding idiom has firmly been adopted by your fellow developers.
Just my 2 cents :-)
It seems that all the pros and cons here can actually be generalised to ANY type passed by reference, not just shared_ptr. In my opinion, you should know the semantics of passing by reference, const reference and value, and use them correctly. But there is absolutely nothing inherently wrong with passing shared_ptr by reference, unless you think that all references are bad...
To go back to the example:
void Class::only_work_with_sp( foo &sp ) //Again, no copy here
{
...
sp.do_something();
...
}
How do you know that sp.do_something() will not blow up due to a dangling pointer?
The truth is that, shared_ptr or not, const or not, this can happen if you have a design flaw, such as directly or indirectly sharing the ownership of sp between threads, misusing an object that does delete this, circular ownership, or other ownership errors.
One thing that I haven't seen mentioned yet is that when you pass shared pointers by non-const reference, you lose the implicit conversion that you would otherwise get when passing a shared pointer to a derived class where a shared pointer to the base class is expected.
For example, this code will produce an error, but it will work if you change test() so that the shared pointer is not passed by reference.
#include <boost/shared_ptr.hpp>
class Base { };
class Derived: public Base { };
// ONLY instances of Base can be passed by reference. If you have a shared_ptr
// to a derived type, you have to cast it manually. If you remove the reference
// and pass the shared_ptr by value, then the cast is implicit so you don't have
// to worry about it.
void test(boost::shared_ptr<Base>& b)
{
return;
}
int main(void)
{
boost::shared_ptr<Derived> d(new Derived);
test(d);
// If you want the above call to work with references, you will have to manually cast
// pointers like this, EVERY time you call the function. Since you are creating a new
// shared pointer, you lose the benefit of passing by reference.
boost::shared_ptr<Base> b = boost::dynamic_pointer_cast<Base>(d);
test(b);
return 0;
}
I'll assume that you are familiar with premature optimization and are asking this either for academic purposes or because you have isolated some pre-existing code that is under-performing.
Passing by reference is okay
Passing by const reference is better, and can usually be used, as it does not force const-ness on the object pointed to.
You are not at risk of losing the pointer due to using a reference. That reference is evidence that you have a copy of the smart pointer earlier in the stack and only one thread owns a call stack, so that pre-existing copy isn't going away.
Using references is often more efficient for the reasons you mention, but not guaranteed. Remember that dereferencing an object can take work too. Your ideal reference-usage scenario would be if your coding style involves many small functions, where the pointer would get passed from function to function to function before being used.
You should always avoid storing your smart pointer as a reference. Your Class::take_copy_of_sp(&sp) example shows correct usage for that.
Assuming we are not concerned with const correctness (or rather, that you mean to allow the functions to modify or share ownership of the data being passed in), passing a boost::shared_ptr by value is safer than passing it by reference, as we allow the original boost::shared_ptr to control its own lifetime. Consider the results of the following code...
void FooTakesReference( boost::shared_ptr< int > & ptr )
{
ptr.reset(); // We reset, and so does sharedA, memory is deleted.
}
void FooTakesValue( boost::shared_ptr< int > ptr )
{
ptr.reset(); // Our temporary is reset, however sharedB hasn't.
}
int main()
{
boost::shared_ptr< int > sharedA( new int( 13 ) );
boost::shared_ptr< int > sharedB( new int( 14 ) );
FooTakesReference( sharedA );
FooTakesValue( sharedB );
}
From the example above we see that passing sharedA by reference allows FooTakesReference to reset the original pointer, which reduces its use count to 0, destroying its data. FooTakesValue, however, can't reset the original pointer, guaranteeing sharedB's data is still usable. When another developer inevitably comes along and attempts to piggyback on sharedA's fragile existence, chaos ensues. The lucky sharedB developer, however, goes home early as all is right in his world.
The code safety, in this case, far outweighs any speed improvement that avoiding the copy provides. After all, boost::shared_ptr is meant to improve code safety. It will be far easier to go from a copy to a reference later, if something requires this kind of niche optimization.
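The difference described above can be checked directly. The following is a small sketch, assuming std::shared_ptr in place of boost::shared_ptr; the helper names (survives_reference_call, survives_value_call) are hypothetical and exist only to make the outcome observable.

```cpp
#include <cassert>
#include <memory>

// Same shape as FooTakesReference / FooTakesValue above.
void foo_takes_reference(std::shared_ptr<int>& ptr) { ptr.reset(); }
void foo_takes_value(std::shared_ptr<int> ptr) { ptr.reset(); }

// Returns whether the caller's pointer still owns an object after the call.
bool survives_reference_call()
{
    std::shared_ptr<int> shared_a = std::make_shared<int>(13);
    foo_takes_reference(shared_a);       // resets the caller's own pointer
    return static_cast<bool>(shared_a);  // now empty
}

bool survives_value_call()
{
    std::shared_ptr<int> shared_b = std::make_shared<int>(14);
    foo_takes_value(shared_b);           // resets only the callee's copy
    return shared_b && *shared_b == 14;  // caller's pointer is untouched
}
```

Only the by-value call leaves the caller with a usable pointer; after the by-reference call, dereferencing the caller's pointer would be undefined behaviour.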
Sandy wrote: "It seems that all the pros and cons here can actually be generalised to ANY type passed by reference not just shared_ptr."
True to some extent, but the point of using shared_ptr is to eliminate concerns regarding object lifetimes and to let the compiler handle that for you. If you're going to pass a shared pointer by reference and allow clients of your reference-counted object to call non-const methods that might free the object data, then using a shared pointer is almost pointless.
I wrote "almost" in the previous sentence because performance can be a concern, and it 'might' be justified in rare cases. Even so, I would avoid this scenario and first look at every other possible optimization, such as adding another level of indirection, lazy evaluation, etc.
Code that outlives its author, or even its author's memory, and that relies on implicit assumptions about behavior, in particular about object lifetimes, requires clear, concise, readable documentation, and then many clients won't read it anyway! Simplicity almost always trumps efficiency, and there are almost always other ways to be efficient. If you really need to pass values by reference to avoid deep copying by the copy constructors (and assignment operators) of your reference-counted objects, then perhaps you should consider ways to make the deep-copied data be reference-counted pointers that can be copied quickly. (Of course, that's just one design scenario that might not apply to your situation.)
I used to work on a project with a very strong principle of passing smart pointers by value. When I was asked to do some performance analysis, I found that the application spent between 4% and 6% of its utilized processor time incrementing and decrementing the smart pointers' reference counters.
If you want to pass smart pointers by value just to avoid issues in weird cases like those described by Daniel Earwicker, make sure you understand the price you are paying for it.
If you decide to go with a reference, the main reason to use a const reference is to allow implicit upcasting when you need to pass a shared pointer to an object of a class that inherits from the class used in the interface.
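A short sketch of that upcasting behaviour, assuming std::shared_ptr rather than boost::shared_ptr and using made-up names (Base, Derived, call_id, observed_count). It also shows that the conversion goes through a temporary shared_ptr<Base>, so the reference count is still bumped for the duration of the call.

```cpp
#include <cassert>
#include <memory>

struct Base {
    virtual ~Base() = default;
    virtual int id() const { return 1; }
};
struct Derived : Base {
    int id() const override { return 2; }
};

// const& parameter: a shared_ptr<Derived> argument is converted to a
// temporary shared_ptr<Base>, which then binds to the reference.
int call_id(const std::shared_ptr<Base>& b) { return b->id(); }

// Reports the use count as seen from inside the callee.
long observed_count(const std::shared_ptr<Base>& b) { return b.use_count(); }

// A parameter of type std::shared_ptr<Base>& (non-const) would reject a
// shared_ptr<Derived> argument, because the temporary cannot bind to it.

int demo_upcast()
{
    std::shared_ptr<Derived> d = std::make_shared<Derived>();
    std::shared_ptr<Base> b = std::make_shared<Base>();

    assert(observed_count(b) == 1);  // exact type match: no copy is made
    assert(observed_count(d) == 2);  // conversion: a temporary co-owns the object
    assert(d.use_count() == 1);      // the temporary is gone after the call
    return call_id(d);               // virtual dispatch reaches Derived::id
}
```

So the const reference keeps the call site convenient, but when a conversion is involved it does not save the increment and decrement.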
In addition to what litb said, I'd like to point out that it's probably better to pass by const reference in the second example; that way you are sure you don't accidentally modify it.
struct A {
shared_ptr<Message> msg;
shared_ptr<Message> * ptr_msg;
};
pass by value:
void set(shared_ptr<Message> msg) {
this->msg = msg; /// msg is already a copy (the call incremented the count); this assignment increments it again
} /// on return the parameter copy is destroyed, so the count is decremented once
pass by reference:
void set(shared_ptr<Message>& msg) {
this->msg = msg; /// only this assignment increments the count; the reference itself is just an alias
}
pass by pointer:
void set(shared_ptr<Message>* msg) {
this->ptr_msg = msg; /// stores only the address of the caller's shared_ptr; the count is unchanged
}
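The three cases above can be observed with use_count(). This is a sketch that assumes std::shared_ptr instead of boost::shared_ptr; the count_by_* functions and demo_counts are hypothetical helpers, and Message is an empty placeholder type.

```cpp
#include <cassert>
#include <memory>

struct Message {};

// Each helper reports the use count as seen from inside the callee.
long count_by_value(std::shared_ptr<Message> msg) { return msg.use_count(); }
long count_by_reference(std::shared_ptr<Message>& msg) { return msg.use_count(); }
long count_by_pointer(std::shared_ptr<Message>* msg) { return msg->use_count(); }

struct A {
    std::shared_ptr<Message> msg;
    // Taking a reference avoids the parameter copy, but storing the
    // pointer in the member still adds one owner.
    void set(std::shared_ptr<Message>& m) { msg = m; }
};

long demo_counts()
{
    std::shared_ptr<Message> owner = std::make_shared<Message>();

    assert(count_by_value(owner) == 2);      // the parameter is a second owner
    assert(count_by_reference(owner) == 1);  // alias only, no new owner
    assert(count_by_pointer(&owner) == 1);   // address only, no new owner

    A a;
    a.set(owner);                            // member copy adds an owner
    return owner.use_count();
}
```

Note that use_count() is only reliable like this in single-threaded code; with other threads copying the pointer the value is approximate.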
Every code piece must carry some sense. If you pass a shared pointer by value everywhere in the application, this means "I am unsure about what's going on elsewhere, hence I favour raw safety". This is not what I call a good confidence sign to other programmers who could consult the code.
Anyway, even if a function gets a const reference and you are "unsure", you can still create a copy of the shared pointer at the head of the function, to add a strong reference to the pointer. This could also be seen as a hint about the design ("the pointer could be modified elsewhere").
So yes, IMO, the default should be "pass by const reference".
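The defensive copy mentioned above could look like the following sketch. It assumes std::shared_ptr in place of boost::shared_ptr, and every name in it (Widget, g_widget, drop_global, use_with_local_copy) is invented for the illustration; the global stands in for "an owner that something else may modify during the call".

```cpp
#include <cassert>
#include <memory>

struct Widget { int value = 5; };

// Hypothetical owner that lives outside the function being called.
std::shared_ptr<Widget> g_widget;

// Stands in for any code path that resets the original pointer mid-call.
void drop_global() { g_widget.reset(); }

int use_with_local_copy(const std::shared_ptr<Widget>& w)
{
    // Take a strong reference at the head of the function.
    std::shared_ptr<Widget> keep_alive = w;

    drop_global();  // w aliases g_widget, so w is now empty...
    assert(!w);

    // ...but the local copy still owns the Widget, so this access is valid.
    return keep_alive->value;
}

int demo_local_copy()
{
    g_widget = std::make_shared<Widget>();
    return use_with_local_copy(g_widget);
}
```

Without keep_alive, dereferencing w after drop_global() would be undefined behaviour, which is the case the by-value camp is worried about. The copy costs one increment and decrement, paid only in the functions that actually need it.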