Provided, I want to pass a modifiable parameter to a function, what should I choose: to pass it by pointer or to pass it by reference?
bool GetFoo ( Foo& whereToPlaceResult );
bool GetFoo ( Foo* whereToPlaceResult );
I am asking this because I always considered it the best practice to pass parameter by reference (1), but after examining some local code database, I came to a conclusion, that the most common way is (2). Moreover, the man himself (Bjarne Stroustrup) recommends using (2). What are the [dis]advantages of (1) and (2), or is it just a matter of personal taste?
I prefer a reference instead of a pointer when:
It can't be null
It can't be changed (to point to something else)
It mustn't be deleted (by whoever receives the pointer)
Some people say though that the difference between a reference and a const reference is too subtle for many people, and is invisible in the code which calls the method (i.e., if you read the calling code which passes a parameter by reference, you can't see whether it's a const or a non-const reference), and that therefore you should make it a pointer (to make it explicit in the calling code that you're giving away the address of your variable, and that therefore the value of your variable may be altered by the callee).
I personally prefer a reference, for the following reason:
I think that a routine should know what subroutine it's calling
A subroutine shouldn't assume anything about what routine it's being called from.
[1.] implies that making the mutability visible to the caller doesn't matter much, because the caller should already (by other means) understand what the subroutine does (including the fact that it will modify the parameter).
[2.] implies that if it's a pointer then the subroutine should handle the possibility of the parameter's being a null pointer, which may be extra and IMO useless code.
Furthermore, whenever I see a pointer I think, "who's going to delete this, and when?", so whenever/wherever ownership/lifetime/deletion isn't an issue I prefer to use a reference.
For what it's worth I'm in the habit of writing const-correct code: so if I declare that a method has a non-const reference parameter, the fact that it's non-const is significant. If people weren't writing const-correct code then maybe it would be harder to tell whether a parameter will be modified in a subroutine, and the argument for another mechanism (e.g. a pointer instead of a reference) would be a bit stronger.
Advantages to passing by reference:
Forces user to supply a value.
Less error-prone: Handles pointer dereferencing itself. Don't have to check for null inside.
Makes the calling code look much cleaner.
Advantages to passing pointer by value:
Allows null to be passed for "optional" parameters. Kinda an ugly hack, but sometimes useful.
Forces caller to know what is being done w/ the parameter.
Gives the reader half a clue of what might be being done w/ the parameter without having to read the API.
Since reference passing is in the language, any non-pointer parameters might be getting modified too, and you don't know that pointer values are being changed. I've seen APIs where they are treated as constants. So pointer passing doesn't really give readers any info that they can count on. For some people that might be good enough, but for me it isn't.
Really, pointer passing is just an error-prone messy hack leftover from C which had no other way to pass values by reference. C++ has a way, so the hack is no longer needed.
One advantage to passing by reference is that they cannot be null (unlike pointers), obviating the need to null-check every out parameter.
I'd recommend that you consider (may not be best for every situation) returning Foo from the function rather than modifying a parameter. Your function prototype would look like this:
Foo GetFoo() // const (if a member function)
As you appear to be returning a success/failure flag, using an exception might be a better strategy.
Advantages:
You avoid all of the pointer/reference issues
Simplifies life for the caller. Can pass the return value to other functions without using a local variable, for example.
Caller cannot ignore error status if you throw an exception.
Return value optimization means that it may be as efficient as modifying a parameter.
I choose #2 because it obvious at the point of call that the parameter will be changed.
GetFoo(&var) rather than GetFoo(var)
I prefer pass by reference for just const references, where I am trying to avoid a copy constructor call.
Pass by reference, and avoid the whole NULL pointer problem.
I seem to recall that in c++ references where not null and pointers could be. Now I've not done c++ for a long time so my memory could be rusty.
The difference here is relatively minor.
A reference cannot be NULL.
A nullpointer may be passed.
Thus you can check if that happens and react accordingly.
I personally can't think of a real advantage of one of the two possibilities.
I find this a matter of personal taste. I actually prefer to pass by reference because pointers give more freedom but they also tend to cause a lot of problems.
The benefit to a pointer is that you can pass nothing, ie. use it as if the parameter was completely optional and not have a variable the caller passes in.
References otherwise are safer, if you have one its guaranteed to exist and be writeable (unless const of course)
I think its a matter of preference otherwise, but I don't like mixing the two as I think it makes maintainace and readability of your code harder to do (especially as your 2 functions look the same to the caller)
These days I use const references for input parameters and pointers for out parameters. FWIIW, Google C++ Style Guide recommends the same approach (not that I always agree with their style guide - for instance they don't use exceptions, which usually does not make much sense)
My preference is a reference. First, because it rhymes. :) Also because of the issues pointed out by other answers: no need to dereference, and no possibility of a reference being NULL. Another reason, which I have not seen mentioned, is that when you see a pointer you cannot be sure whether or not it points to dynamically allocated memory, and you may be tempted to call delete on it. A reference, on the other hand, dispenses with any ambiguity regarding memory management.
Having said that, there are of course many cases when passing a pointer is preferable, or even necessary. If you know in advance that the parameter is optional, then allowing it to be NULL is very useful. Similarly, you may know in advance that the parameter is always dynamically allocated and have the memory management all worked out.
Related
I understand the syntax and general semantics of pointers versus references, but how should I decide when it is more-or-less appropriate to use references or pointers in an API?
Naturally some situations need one or the other (operator++ needs a reference argument), but in general I'm finding I prefer to use pointers (and const pointers) as the syntax is clear that the variables are being passed destructively.
E.g. in the following code:
void add_one(int& n) { n += 1; }
void add_one(int* const n) { *n += 1; }
int main() {
int a = 0;
add_one(a); // Not clear that a may be modified
add_one(&a); // 'a' is clearly being passed destructively
}
With the pointer, it's always (more) obvious what's going on, so for APIs and the like where clarity is a big concern are pointers not more appropriate than references? Does that mean references should only be used when necessary (e.g. operator++)? Are there any performance concerns with one or the other?
EDIT (OUTDATED):
Besides allowing NULL values and dealing with raw arrays, it seems the choice comes down to personal preference. I've accepted the answer below that references Google's C++ Style Guide, as they present the view that "References can be confusing, as they have value syntax but pointer semantics.".
Due to the additional work required to sanitise pointer arguments that should not be NULL (e.g. add_one(0) will call the pointer version and break during runtime), it makes sense from a maintainability perspective to use references where an object MUST be present, though it is a shame to lose the syntactic clarity.
Use reference wherever you can, pointers wherever you must.
Avoid pointers until you can't.
The reason is that pointers make things harder to follow/read, less safe and far more dangerous manipulations than any other constructs.
So the rule of thumb is to use pointers only if there is no other choice.
For example, returning a pointer to an object is a valid option when the function can return nullptr in some cases and it is assumed it will. That said, a better option would be to use something similar to std::optional (requires C++17; before that, there's boost::optional).
Another example is to use pointers to raw memory for specific memory manipulations. That should be hidden and localized in very narrow parts of the code, to help limit the dangerous parts of the whole code base.
In your example, there is no point in using a pointer as argument because:
if you provide nullptr as the argument, you're going in undefined-behaviour-land;
the reference attribute version doesn't allow (without easy to spot tricks) the problem with 1.
the reference attribute version is simpler to understand for the user: you have to provide a valid object, not something that could be null.
If the behaviour of the function would have to work with or without a given object, then using a pointer as attribute suggests that you can pass nullptr as the argument and it is fine for the function. That's kind of a contract between the user and the implementation.
The performances are exactly the same, as references are implemented internally as pointers. Thus you do not need to worry about that.
There is no generally accepted convention regarding when to use references and pointers. In a few cases you have to return or accept references (copy constructor, for instance), but other than that you are free to do as you wish. A rather common convention I've encountered is to use references when the parameter must refer an existing object and pointers when a NULL value is ok.
Some coding convention (like Google's) prescribe that one should always use pointers, or const references, because references have a bit of unclear-syntax: they have reference behaviour but value syntax.
From C++ FAQ Lite -
Use references when you can, and pointers when you have to.
References are usually preferred over pointers whenever you don't need
"reseating". This usually means that references are most useful in a
class's public interface. References typically appear on the skin of
an object, and pointers on the inside.
The exception to the above is where a function's parameter or return
value needs a "sentinel" reference — a reference that does not refer
to an object. This is usually best done by returning/taking a pointer,
and giving the NULL pointer this special significance (references must
always alias objects, not a dereferenced NULL pointer).
Note: Old line C programmers sometimes don't like references since
they provide reference semantics that isn't explicit in the caller's
code. After some C++ experience, however, one quickly realizes this is
a form of information hiding, which is an asset rather than a
liability. E.g., programmers should write code in the language of the
problem rather than the language of the machine.
My rule of thumb is:
Use pointers for outgoing or in/out parameters. So it can be seen that the value is going to be changed. (You must use &)
Use pointers if NULL parameter is acceptable value. (Make sure it's const if it's an incoming parameter)
Use references for incoming parameter if it cannot be NULL and is not a primitive type (const T&).
Use pointers or smart pointers when returning a newly created object.
Use pointers or smart pointers as struct or class members instead of references.
Use references for aliasing (eg. int ¤t = someArray[i])
Regardless which one you use, don't forget to document your functions and the meaning of their parameters if they are not obvious.
Disclaimer: other than the fact that references cannot be NULL nor "rebound" (meaning thay can't change the object they're the alias of), it really comes down to a matter of taste, so I'm not going to say "this is better".
That said, I disagree with your last statement in the post, in that I don't think the code loses clarity with references. In your example,
add_one(&a);
might be clearer than
add_one(a);
since you know that most likely the value of a is going to change. On the other hand though, the signature of the function
void add_one(int* const n);
is somewhat not clear either: is n going to be a single integer or an array? Sometimes you only have access to (poorly documentated) headers, and signatures like
foo(int* const a, int b);
are not easy to interpret at first sight.
Imho, references are as good as pointers when no (re)allocation nor rebinding (in the sense explained before) is needed. Moreover, if a developer only uses pointers for arrays, functions signatures are somewhat less ambiguous. Not to mention the fact that operators syntax is way more readable with references.
Like others already answered: Always use references, unless the variable being NULL/nullptr is really a valid state.
John Carmack's viewpoint on the subject is similar:
NULL pointers are the biggest problem in C/C++, at least in our code. The dual use of a single value as both a flag and an address causes an incredible number of fatal issues. C++ references should be favored over pointers whenever possible; while a reference is “really” just a pointer, it has the implicit contract of being not-NULL. Perform NULL checks when pointers are turned into references, then you can ignore the issue thereafter.
http://www.altdevblogaday.com/2011/12/24/static-code-analysis/
Edit 2012-03-13
User Bret Kuhns rightly remarks:
The C++11 standard has been finalized. I think it's time in this thread to mention that most code should do perfectly fine with a combination of references, shared_ptr, and unique_ptr.
True enough, but the question still remains, even when replacing raw pointers with smart pointers.
For example, both std::unique_ptr and std::shared_ptr can be constructed as "empty" pointers through their default constructor:
http://en.cppreference.com/w/cpp/memory/unique_ptr/unique_ptr
http://en.cppreference.com/w/cpp/memory/shared_ptr/shared_ptr
... meaning that using them without verifying they are not empty risks a crash, which is exactly what J. Carmack's discussion is all about.
And then, we have the amusing problem of "how do we pass a smart pointer as a function parameter?"
Jon's answer for the question C++ - passing references to boost::shared_ptr, and the following comments show that even then, passing a smart pointer by copy or by reference is not as clear cut as one would like (I favor myself the "by-reference" by default, but I could be wrong).
It is not a matter of taste. Here are some definitive rules.
If you want to refer to a statically declared variable within the scope in which it was declared then use a C++ reference, and it will be perfectly safe. The same applies to a statically declared smart pointer. Passing parameters by reference is an example of this usage.
If you want to refer to anything from a scope that is wider than the scope in which it is declared then you should use a reference counted smart pointer for it to be perfectly safe.
You can refer to an element of a collection with a reference for syntactic convenience, but it is not safe; the element can be deleted at anytime.
To safely hold a reference to an element of a collection you must use a reference counted smart pointer.
There is problem with "use references wherever possible" rule and it arises if you want to keep reference for further use. To illustrate this with example, imagine you have following classes.
class SimCard
{
public:
explicit SimCard(int id):
m_id(id)
{
}
int getId() const
{
return m_id;
}
private:
int m_id;
};
class RefPhone
{
public:
explicit RefPhone(const SimCard & card):
m_card(card)
{
}
int getSimId()
{
return m_card.getId();
}
private:
const SimCard & m_card;
};
At first it may seem to be a good idea to have parameter in RefPhone(const SimCard & card) constructor passed by a reference, because it prevents passing wrong/null pointers to the constructor. It somehow encourages allocation of variables on stack and taking benefits from RAII.
PtrPhone nullPhone(0); //this will not happen that easily
SimCard * cardPtr = new SimCard(666); //evil pointer
delete cardPtr; //muahaha
PtrPhone uninitPhone(cardPtr); //this will not happen that easily
But then temporaries come to destroy your happy world.
RefPhone tempPhone(SimCard(666)); //evil temporary
//function referring to destroyed object
tempPhone.getSimId(); //this can happen
So if you blindly stick to references you trade off possibility of passing invalid pointers for the possibility of storing references to destroyed objects, which has basically same effect.
edit: Note that I sticked to the rule "Use reference wherever you can, pointers wherever you must. Avoid pointers until you can't." from the most upvoted and accepted answer (other answers also suggest so). Though it should be obvious, example is not to show that references as such are bad. They can be misused however, just like pointers and they can bring their own threats to the code.
There are following differences between pointers and references.
When it comes to passing variables, pass by reference looks like pass by value, but has pointer semantics (acts like pointer).
Reference can not be directly initialized to 0 (null).
Reference (reference, not referenced object) can not be modified (equivalent to "* const" pointer).
const reference can accept temporary parameter.
Local const references prolong the lifetime of temporary objects
Taking those into account my current rules are as follows.
Use references for parameters that will be used locally within a function scope.
Use pointers when 0 (null) is acceptable parameter value or you need to store parameter for further use. If 0 (null) is acceptable I am adding "_n" suffix to parameter, use guarded pointer (like QPointer in Qt) or just document it. You can also use smart pointers. You have to be even more careful with shared pointers than with normal pointers (otherwise you can end up with by design memory leaks and responsibility mess).
Any performance difference would be so small that it wouldn't justify using the approach that's less clear.
First, one case that wasn't mentioned where references are generally superior is const references. For non-simple types, passing a const reference avoids creating a temporary and doesn't cause the confusion you're concerned about (because the value isn't modified). Here, forcing a person to pass a pointer causes the very confusion you're worried about, as seeing the address taken and passed to a function might make you think the value changed.
In any event, I basically agree with you. I don't like functions taking references to modify their value when it's not very obvious that this is what the function is doing. I too prefer to use pointers in that case.
When you need to return a value in a complex type, I tend to prefer references. For example:
bool GetFooArray(array &foo); // my preference
bool GetFooArray(array *foo); // alternative
Here, the function name makes it clear that you're getting information back in an array. So there's no confusion.
The main advantages of references are that they always contain a valid value, are cleaner than pointers, and support polymorphism without needing any extra syntax. If none of these advantages apply, there is no reason to prefer a reference over a pointer.
Copied from wiki-
A consequence of this is that in many implementations, operating on a variable with automatic or static lifetime through a reference, although syntactically similar to accessing it directly, can involve hidden dereference operations that are costly. References are a syntactically controversial feature of C++ because they obscure an identifier's level of indirection; that is, unlike C code where pointers usually stand out syntactically, in a large block of C++ code it may not be immediately obvious if the object being accessed is defined as a local or global variable or whether it is a reference (implicit pointer) to some other location, especially if the code mixes references and pointers. This aspect can make poorly written C++ code harder to read and debug (see Aliasing).
I agree 100% with this, and this is why I believe that you should only use a reference when you a have very good reason for doing so.
Points to keep in mind:
Pointers can be NULL, references cannot be NULL.
References are easier to use, const can be used for a reference when we don't want to change value and just need a reference in a function.
Pointer used with a * while references used with a &.
Use pointers when pointer arithmetic operation are required.
You can have pointers to a void type int a=5; void *p = &a; but cannot have a reference to a void type.
Pointer Vs Reference
void fun(int *a)
{
cout<<a<<'\n'; // address of a = 0x7fff79f83eac
cout<<*a<<'\n'; // value at a = 5
cout<<a+1<<'\n'; // address of a increment by 4 bytes(int) = 0x7fff79f83eb0
cout<<*(a+1)<<'\n'; // value here is by default = 0
}
void fun(int &a)
{
cout<<a<<'\n'; // reference of original a passed a = 5
}
int a=5;
fun(&a);
fun(a);
Verdict when to use what
Pointer: For array, linklist, tree implementations and pointer arithmetic.
Reference: In function parameters and return types.
The following are some guidelines.
A function uses passed data without modifying it:
If the data object is small, such as a built-in data type or a small structure, pass it by value.
If the data object is an array, use a pointer because that’s your only choice. Make the pointer a pointer to const.
If the data object is a good-sized structure, use a const pointer or a const
reference to increase program efficiency.You save the time and space needed to
copy a structure or a class design. Make the pointer or reference const.
If the data object is a class object, use a const reference.The semantics of class design often require using a reference, which is the main reason C++ added
this feature.Thus, the standard way to pass class object arguments is by reference.
A function modifies data in the calling function:
1.If the data object is a built-in data type, use a pointer. If you spot code
like fixit(&x), where x is an int, it’s pretty clear that this function intends to modify x.
2.If the data object is an array, use your only choice: a pointer.
3.If the data object is a structure, use a reference or a pointer.
4.If the data object is a class object, use a reference.
Of course, these are just guidelines, and there might be reasons for making different
choices. For example, cin uses references for basic types so that you can use cin >> n
instead of cin >> &n.
Your properly written example should look like
void add_one(int& n) { n += 1; }
void add_one(int* const n)
{
if (n)
*n += 1;
}
That's why references are preferable if possible
...
References are cleaner and easier to use, and they do a better job of hiding information.
References cannot be reassigned, however.
If you need to point first to one object and then to another, you must use a pointer. References cannot be null, so if any chance exists that the object in question might be null, you must not use a reference. You must use a pointer.
If you want to handle object manipulation on your own i.e if you want to allocate memory space for an object on the Heap rather on the Stack you must use Pointer
int *pInt = new int; // allocates *pInt on the Heap
In my practice I personally settled down with one simple rule - Use references for primitives and values that are copyable/movable and pointers for objects with long life cycle.
For Node example I would definitely use
AddChild(Node* pNode);
Just putting my dime in. I just performed a test. A sneeky one at that. I just let g++ create the assembly files of the same mini-program using pointers compared to using references.
When looking at the output they are exactly the same. Other than the symbolnaming. So looking at performance (in a simple example) there is no issue.
Now on the topic of pointers vs references. IMHO I think clearity stands above all. As soon as I read implicit behaviour my toes start to curl. I agree that it is nice implicit behaviour that a reference cannot be NULL.
Dereferencing a NULL pointer is not the problem. it will crash your application and will be easy to debug. A bigger problem is uninitialized pointers containing invalid values. This will most likely result in memory corruption causing undefined behaviour without a clear origin.
This is where I think references are much safer than pointers. And I agree with a previous statement, that the interface (which should be clearly documented, see design by contract, Bertrand Meyer) defines the result of the parameters to a function. Now taking this all into consideration my preferences go to
using references wherever/whenever possible.
For pointers, you need them to point to something, so pointers cost memory space.
For example a function that takes an integer pointer will not take the integer variable. So you will need to create a pointer for that first to pass on to the function.
As for a reference, it will not cost memory. You have an integer variable, and you can pass it as a reference variable. That's it. You don't need to create a reference variable specially for it.
With C++ how do i decide if i should pass an argument by value or by reference/pointer? (tell me the answer for both 32 and 64bits) Lets take A. Is 2 32bit values more less or equal work as a pointer to a 32bit value?
B to me seems like i always should pass by value. C i think i should pass by value but someone told me (however i haven't seen proof) that processors don't handle values not their bitsize and so it is more work. So if i were passing them around would it be more work to pass by value thus byref is faster? Finally i threw in an enum. I think enums should always be by value
Note: When i say by ref i mean a const reference or pointer (can't forget the const...)
struct A { int a, b; }
struct B { int a; }
struct C { char a, b; }
enum D { a,b,c }
void fn(T a);
Now tell me the answer if i were pushing the parameters many times and the code doesn't use a tail call? (lets say the values isnt used until 4 or so calls deep)
Forget the stack size. You should pass by reference if you want to change it, otherwise you should pass by value.
Preventing the sort of bugs introduced by allowing functions to change your data unexpectedly is far more important than a few bytes of wasted stack space.
If stack space becomes a problem, stop using so many levels (such as replacing a recursive solution with an iterative one) or expand your stack. Four levels of recursion isn't usually that onerous, unless your structures are massive or you're operating in the embedded world.
If performance becomes a problem, find a faster algorithm :-) If that's not possible, then you can look at passing by reference, but you need to understand that it's breaking the contract between caller and callee. If you can live with that, that's okay. I generally can't :-)
The intent of the value/reference dichotomy is to control what happens to the thing you pass as a parameter at the language level, not to fiddle with the way an implementation of the language works.
I pass all parameters by reference for consistency, including builtins (of course, const is used where possible).
I did test this in performance critical domains -- worst case loss compared to builtins was marginal. Reference can be quite a bit faster, for non-builtins, and when the calls are deep (as a generalization). This was important for me as I was doing quite a bit of deep TMP, where function bodies were tiny.
You might consider breaking that convention if you're counting instructions, the hardware is register-starved (e.g. embedded), or if the function is not a good candidate for inlining.
Unfortunately, the question you ask is more complex than it appears -- the answer may vary greatly by your platform, ABI, calling conventions, register counts, etc.
A lot depends on your requirement but best practice is to pass by reference as it reduces the memory foot print.
If you pass large objects by value, a copy of it is made in memory andthe copy constructor is called for making a copy of this.
So it will take more machine cycles and also, if you pass by value, changes are not reflected in the original object.
So try passing them by reference.
Hope this has been helpful to you.
Regards, Ken
First, reference and pointers aren't the same.
Pass by pointer
Pass parameters by pointers if any/some of these apply:
The passed element could be null.
The resource is allocated inside the called function and the caller is responsible should be responsible for freeing such a resource. Remember in this case to provide a free() function for that resource.
The value is of a variable type, like for example void*. When it's type is determined at runtime or depending on the usage pattern (or hiding implementation - i.e Win32 HANDLE), such as a thread procedure argument. (Here favor c++ templates and std::function, and use pointers for this purpose only if your environment does not permit otherwise.
Pass by reference
Pass parameters by reference if any/some of these apply:
Most of the time. (prefer passing by const reference)
If you want the modifications to the passed arguments to be visible to the caller. (unless const reference is used).
If the passed argument is never null.
If you know what is the passed argument type and you have control over function's signature.
Pass by copy
Pass a copy if any/some of these apply:
Generally try to avoid this.
If you want to operate on a copy of the passed argument. i.e you know that the called function would create a copy anyway.
With primitive types smaller than the system's pointer size - as it makes no performance/memory difference compared to a const ref.
This is tricky - when you know that the type implements a move constructor (such as std::string in C++11). It then looks as if you're passing by copy.
Any of these three lists can go more longer, but these are - I would say - the basic rules of thumb.
Your complete question is a bit unclear to me, but I can answer when you would use passing by value or by reference.
When passing by value, you have a complete copy of the parameter into the call stack. It's like you're making a local variable in the function call initialized with whatever you passed into it.
When passing by reference, you... well, pass by reference. The main difference is that you can modify the external object.
There is the benefit of reducing memory load for large objects passing by reference. For basic data types (32-bit or 64-bit integers, for example), the performance is negligible.
Generally, if you're going to work in C/C++ you should learn to use pointers. Passing objects as parameters will almost always be passed via a pointer (vs reference). The few instances you absolutely must use references is in the copy constructor. You'll want to use it in the operators as well, but it's not required.
Copying objects by value is usually a bad idea - more CPU to do the constructor function; more memory for the actual object. Use const to prevent the function modifying the object. The function signature should tell the caller what might happen to the referenced object.
Things like int, char, pointers are usually passed by value.
As to the structures you outlined, passing by value will not really matter. You need to do profiling to find out, but on the grand scheme of a program you be better off looking elsewhere for increasing performance in terms of CPU and/or memory.
I would consider whether you want value or reference semantics before you go worrying about optimizations. Generally you would pass by reference if you want the method you are calling to be able to modify the parameter. You can pass a pointer in this case, like you would in C, but idiomatic C++ tends to use references.
There is no rule that says that small types or enums should always be passed by value. There is plenty of code that passes int& parameters, because they rely on the semantics of passing by reference. Also, you should keep in mind that for any relatively small data type, you won't notice a difference in speed between passing by reference and by value.
That said, if you have a very large structure, you probably don't want to make lots of copies of it. This is where const references are handy. Do keep in mind though that const in C++ is not strictly enforced (even if it's considered bad practice, you can always const_cast it away). There is no reason to pass a const int& over an int, although there is a reason to pass a const ClassWithManyMembers& over a ClassWithManyMembers.
All of the structs that you listed I would say are fine to pass by value if you are intending them to be treated as values. Consider that if you call a function that takes one parameter of type struct Rectangle{int x, y, w, h}, this is the same as passing those 4 parameters independently, which is really not a big deal. Generally you should be more worried about the work that the copy constructor has to do - for example, passing a vector by value is probably not such a good idea, because it will have to dynamically allocate memory and iterate through a list whose size you don't know, and invoke many more copy constructors.
While you should keep all this in mind, a good general rule is: if you want refence semantics, pass by refence. Otherwise, pass intrinsics by value, and other things by const reference.
Also, C++11 introduced r-value references which complicate things even further. But that's a different topic.
These are the rules that I use:
for native types:
by value when they are input arguments
by non-const reference when they are mandatory output arguments
for structs or classes:
by const reference when they are input arguments
by non-const reference when they are output arguments
for arrays:
by const pointer when they are input arguments (const applies to the data, not the pointer here, i.e. const TYPE *)
by pointer when they are output arguments (const applies to the data, not the pointer)
I've found that there are very few times that require making an exception to the above rules. The one exception that comes to mind is for a struct or class argument that is optional, in which case a reference would not work. In that case I use a const pointer (input) or a non-const pointer (output), so that you can also pass 0.
If you want a copy, then pass by value. If you want to change it and you want those changes to be seen outside the function, then pass by reference. If you want speed and don't want to change it, pass by const reference.
So, as we're all hopefully aware, in Object-oriented programming when the occasion comes when you need somehow access an instance of a class in another class's method, you turn to passing that instance through arguments.
I'm curious, what's the difference in terms of good practice / less prone to breaking things when it comes to either passing an Object, or a Pointer to that object?
Get into the habit of passing objects by reference.
void DoStuff(const vector<int>& values)
If you need to modify the original object, omit the const qualifier.
void DoStuff(vector<int>& values)
If you want to be able to accept an empty/nothing answer, pass it by pointer.
void DoStuff(vector<int>* values)
If you want to do stuff to a local copy, pass it by value.
void DoStuff(vector<int> values)
Problems will only really pop up when you introduce tons of concurrency. By that time, you will know enough to know when to not use certain passing techniques.
Pass a pointer to the object if you want to be able to indicate nonexistence (by passing a NULL).
Try not to pass by value for objects, as that invokes a copy constructor to create a local version of the object within the scope of the call function. Instead, pass by reference. However, there are two modes here. In order to get the exact same effective behavior of passing by value (immutable "copy") without the overhead, pass by const reference. If you feel you will need to alter the passed object, pass by (non-const) reference.
I choose const reference as a default. Of course, non-const if you must mutate the object for the client. Deviation from using references is rarely required.
Pointers are not very C++ - like, since references are available. References are nice because they are forbidden to refer to nothing. Update: To clarify, proper containers for types and arrays are preferred, but for some internal implementations, you will need to pass a pointer.
Objects/values, are completely different in semantics. If I need a copy, I will typically just create it inside the function where needed:
void method(const std::string& str) {
std::string myCopy(str);
...
In fact what you can pass to a method is a pointer to object, a reference to the object and a copy of the object and all of these can also be constant. Depending on your needs you should choose the one that best suits your needs.
First descision you can make is whether the thing you pass should be able to change in your method or not. If you do not intend to change it then a const reference in probably the best alternative(by not changing I also mean you do not intend to call any non-const methods of that object). What are the advantages to that? You safe time for compying the object and also the method signature itself will say "I will not change that parameter".
If you have to change this object you can pass either a reference or a pointer to it. It is not very obligatory to choose just one of these options so you can go for either. The only difference I can think of is that pointer can be NULL(i.e. not pointing to any object at all) while a reference is always pointing to an existent object.
If what you need in your method is a copy of your object, then what you should pass a copy of the object(not a reference and not a pointer). For instance if your method looks like
void Foo(const A& a) {
A temp = a;
}
Then that is a clear indication that passing a copy is a better alternative.
Hope this makes things a bit clearer.
Actually, there's really no good reason for passing a pointer to an object, unless you want to somehow indicate that no object exists.
If you want to change the object, pass a reference to it. If you want to protect it from change within the function, pass it by value or at least const reference.
Some people pass by reference for the speed improvements (passing only an address of a large structure rather than the structure itself for example) but I don't agree with that. In most cases, I'd prefer my software to be safe than fast, a corollary of the saying: "you can't get any less optimised than wrong". :-)
Object-oriented programming is about polymorphism, Liskov Substitution Principle, old code calling new code, you name it. Pass a concrete (derived) object to a routine that works with more abstract (base) objects. If you are not doing that, you are not doing OOP.
This is only achievable when passing references or pointers. Passing by value is best reserved for, um, values.
It is useful to distinguish between values and objects. Values are always concrete, there's no polymorphism. They are often immutable. 5 is 5 and "abc" is "abc". You can pass them by value or by (const) reference.
Objects are always abstract to some degree. Given an object, one can almost always refine it to a more concrete object. A RectangularArea could be a Drawable which could be a Window which could be a ToplevelWindow which could be a ManagedWindow which could be... These must be passed by reference.
Pointers are a wholly separate can of worms. In my experience, naked pointers are best avoided. Use a smart pointer that cannot be NULL. If you need an optional argument, use an explicit optional class template such as boost::optional.
I'm moving from Java to C++ and am a bit confused of the language's flexibility. One point is that there are three ways to store objects: A pointer, a reference and a scalar (storing the object itself if I understand it correctly).
I tend to use references where possible, because that is as close to Java as possible. In some cases, e.g. getters for derived attributes, this is not possible:
MyType &MyClass::getSomeAttribute() {
MyType t;
return t;
}
This does not compile, because t exists only within the scope of getSomeAttribute() and if I return a reference to it, it would point nowhere before the client can use it.
Therefore I'm left with two options:
Return a pointer
Return a scalar
Returning a pointer would look like this:
MyType *MyClass::getSomeAttribute() {
MyType *t = new MyType;
return t;
}
This'd work, but the client would have to check this pointer for NULL in order to be really sure, something that's not necessary with references. Another problem is that the caller would have to make sure that t is deallocated, I'd rather not deal with that if I can avoid it.
The alternative would be to return the object itself (scalar):
MyType MyClass::getSomeAttribute() {
MyType t;
return t;
}
That's pretty straightforward and just what I want in this case: It feels like a reference and it can't be null. If the object is out of scope in the client's code, it is deleted. Pretty handy. However, I rarely see anyone doing that, is there a reason for that? Is there some kind of performance problem if I return a scalar instead of a pointer or reference?
What is the most common/elegant approach to handle this problem?
Return by value. The compiler can optimize away the copy, so the end result is what you want. An object is created, and returned to the caller.
I think the reason why you rarely see people do this is because you're looking at the wrong C++ code. ;)
Most people coming from Java feel uncomfortable doing something like this, so they call new all over the place. And then they get memory leaks all over the place, have to check for NULL and all the other problems that can cause. :)
It might also be worth pointing out that C++ references have very little in common with Java references.
A reference in Java is much more similar to a pointer (it can be reseated, or set to NULL).
In fact the only real differences are that a pointer can point to a garbage value as well (if it is uninitialized, or it points to an object that has gone out of scope), and that you can do pointer arithmetics on a pointer into an array.
A C++ references is an alias for an object. A Java reference doesn't behave like that.
Quite simply, avoid using pointers and dynamic allocation by new wherever possible. Use values, references and automatically allocated objects instead. Of course you can't always avoid dynamic allocation, but it should be a last resort, not a first.
Returning by value can introduce performance penalties because this means the object needs to be copied. If it is a large object, like a list, that operation might be very expensive.
But modern compilers are very good about making this not happen. The C++ standards explicitly states that the compiler is allowed to elide copies in certain circumstances. The particular instance that would be relevant in the example code you gave is called the 'return value optimization'.
Personally, I return by (usually const) reference when I'm returning a member variable, and return some sort of smart pointer object of some kind (frequently ::std::auto_ptr) when I need to dynamically allocate something. Otherwise I return by value.
I also very frequently have const reference parameters, and this is very common in C++. This is a way of passing a parameter and saying "the function is not allowed to touch this". Basically a read-only parameter. It should only be used for objects that are more complex than a single integer or pointer though.
I think one big change from Java is that const is important and used very frequently. Learn to understand it and make it your friend.
I also think Neil's answer is correct in stating that avoiding dynamic allocation whenever possible is a good idea. You should not contort your design too much to make that happen, but you should definitely prefer design choices in which it doesn't have to happen.
Returning by value is a common thing practised in C++. However, when you are passing an object, you pass by reference.
Example
main()
{
equity trader;
isTraderAllowed(trader);
....
}
bool isTraderAllowed(const equity& trdobj)
{
... // Perform your function routine here.
}
The above is a simple example of passing an object by reference. In reality, you would have a method called isTraderAllowed for the class equity, but I was showing you a real use of passing by reference.
A point regarding passing by value or reference:
Considering optimizations, assuming a function is inline, if its parameter is declared as "const DataType objectName" that DataType could be anything even primitives, no object copy will be involved; and if its parameter is declared as "const DataType & objectName" or "DataType & objectName" that again DataType could be anything even primitives, no address taking or pointer will be involved. In both previous cases input arguments are used directly in assembly code.
A point regarding references:
A reference is not always a pointer, as instance when you have following code in the body of a function, the reference is not a pointer:
int adad=5;
int & reference=adad;
A point regarding returning by value:
as some people have mentioned, using good compilers with capability of optimizations, returning by value of any type will not cause an extra copy.
A point regarding return by reference:
In case of inline functions and optimizations, returning by reference will not involve address taking or pointer.
I'm moving from Java to C++ and am a bit confused of the language's flexibility. One point is that there are three ways to store objects: A pointer, a reference and a scalar (storing the object itself if I understand it correctly).
I tend to use references where possible, because that is as close to Java as possible. In some cases, e.g. getters for derived attributes, this is not possible:
MyType &MyClass::getSomeAttribute() {
MyType t;
return t;
}
This does not compile, because t exists only within the scope of getSomeAttribute() and if I return a reference to it, it would point nowhere before the client can use it.
Therefore I'm left with two options:
Return a pointer
Return a scalar
Returning a pointer would look like this:
MyType *MyClass::getSomeAttribute() {
MyType *t = new MyType;
return t;
}
This'd work, but the client would have to check this pointer for NULL in order to be really sure, something that's not necessary with references. Another problem is that the caller would have to make sure that t is deallocated, I'd rather not deal with that if I can avoid it.
The alternative would be to return the object itself (scalar):
MyType MyClass::getSomeAttribute() {
MyType t;
return t;
}
That's pretty straightforward and just what I want in this case: It feels like a reference and it can't be null. If the object is out of scope in the client's code, it is deleted. Pretty handy. However, I rarely see anyone doing that, is there a reason for that? Is there some kind of performance problem if I return a scalar instead of a pointer or reference?
What is the most common/elegant approach to handle this problem?
Return by value. The compiler can optimize away the copy, so the end result is what you want. An object is created, and returned to the caller.
I think the reason why you rarely see people do this is because you're looking at the wrong C++ code. ;)
Most people coming from Java feel uncomfortable doing something like this, so they call new all over the place. And then they get memory leaks all over the place, have to check for NULL and all the other problems that can cause. :)
It might also be worth pointing out that C++ references have very little in common with Java references.
A reference in Java is much more similar to a pointer (it can be reseated, or set to NULL).
In fact the only real differences are that a pointer can point to a garbage value as well (if it is uninitialized, or it points to an object that has gone out of scope), and that you can do pointer arithmetics on a pointer into an array.
A C++ references is an alias for an object. A Java reference doesn't behave like that.
Quite simply, avoid using pointers and dynamic allocation by new wherever possible. Use values, references and automatically allocated objects instead. Of course you can't always avoid dynamic allocation, but it should be a last resort, not a first.
Returning by value can introduce performance penalties because this means the object needs to be copied. If it is a large object, like a list, that operation might be very expensive.
But modern compilers are very good about making this not happen. The C++ standards explicitly states that the compiler is allowed to elide copies in certain circumstances. The particular instance that would be relevant in the example code you gave is called the 'return value optimization'.
Personally, I return by (usually const) reference when I'm returning a member variable, and return some sort of smart pointer object of some kind (frequently ::std::auto_ptr) when I need to dynamically allocate something. Otherwise I return by value.
I also very frequently have const reference parameters, and this is very common in C++. This is a way of passing a parameter and saying "the function is not allowed to touch this". Basically a read-only parameter. It should only be used for objects that are more complex than a single integer or pointer though.
I think one big change from Java is that const is important and used very frequently. Learn to understand it and make it your friend.
I also think Neil's answer is correct in stating that avoiding dynamic allocation whenever possible is a good idea. You should not contort your design too much to make that happen, but you should definitely prefer design choices in which it doesn't have to happen.
Returning by value is a common thing practised in C++. However, when you are passing an object, you pass by reference.
Example
main()
{
equity trader;
isTraderAllowed(trader);
....
}
bool isTraderAllowed(const equity& trdobj)
{
... // Perform your function routine here.
}
The above is a simple example of passing an object by reference. In reality, you would have a method called isTraderAllowed for the class equity, but I was showing you a real use of passing by reference.
A point regarding passing by value or reference:
Considering optimizations, assuming a function is inline, if its parameter is declared as "const DataType objectName" that DataType could be anything even primitives, no object copy will be involved; and if its parameter is declared as "const DataType & objectName" or "DataType & objectName" that again DataType could be anything even primitives, no address taking or pointer will be involved. In both previous cases input arguments are used directly in assembly code.
A point regarding references:
A reference is not always a pointer, as instance when you have following code in the body of a function, the reference is not a pointer:
int adad=5;
int & reference=adad;
A point regarding returning by value:
as some people have mentioned, using good compilers with capability of optimizations, returning by value of any type will not cause an extra copy.
A point regarding return by reference:
In case of inline functions and optimizations, returning by reference will not involve address taking or pointer.