So, as we're all hopefully aware, in object-oriented programming, when you need to access an instance of one class inside another class's method, you pass that instance as an argument.
I'm curious, what's the difference in terms of good practice / less prone to breaking things when it comes to either passing an Object, or a Pointer to that object?
Get into the habit of passing objects by reference.
void DoStuff(const vector<int>& values)
If you need to modify the original object, omit the const qualifier.
void DoStuff(vector<int>& values)
If you want to be able to accept an empty/nothing answer, pass it by pointer.
void DoStuff(vector<int>* values)
If you want to do stuff to a local copy, pass it by value.
void DoStuff(vector<int> values)
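The four signatures above can be put side by side in a minimal sketch. The function names (sum, appendZero, sizeOrZero, sortedCopy) are hypothetical and only illustrate which passing style fits which intent.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Read-only access: const reference, no copy is made.
int sum(const std::vector<int>& values) {
    return std::accumulate(values.begin(), values.end(), 0);
}

// Modifies the caller's object: non-const reference.
void appendZero(std::vector<int>& values) {
    values.push_back(0);
}

// "No object" is a valid argument: pointer, checked before use.
std::size_t sizeOrZero(const std::vector<int>* values) {
    return values ? values->size() : 0;
}

// Works on a private copy: by value, the caller's vector is untouched.
std::vector<int> sortedCopy(std::vector<int> values) {
    std::sort(values.begin(), values.end());
    return values;
}
```

Only appendZero can change what the caller sees; sortedCopy sorts its own copy and hands the result back.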
Problems will only really pop up when you introduce tons of concurrency. By that time, you will know enough to know when to not use certain passing techniques.
Pass a pointer to the object if you want to be able to indicate nonexistence (by passing a NULL).
Try not to pass objects by value, as that invokes a copy constructor to create a local version of the object within the scope of the called function. Instead, pass by reference. However, there are two modes here. To get the same protection that passing by value gives the caller (the original object cannot be modified) without the copying overhead, pass by const reference. If you feel you will need to alter the passed object, pass by (non-const) reference.
I choose const reference as a default. Of course, non-const if you must mutate the object for the client. Deviation from using references is rarely required.
Pointers are not very C++-like, since references are available. References are nice because they are forbidden to refer to nothing. Update: To clarify, proper containers for types and arrays are preferred, but for some internal implementations, you will need to pass a pointer.
Objects/values, are completely different in semantics. If I need a copy, I will typically just create it inside the function where needed:
void method(const std::string& str) {
    std::string myCopy(str);  // explicit local copy
    ...
}
In fact, what you can pass to a method is a pointer to the object, a reference to the object, or a copy of the object, and each of these can also be const. Choose the one that best suits your needs.
The first decision you can make is whether the thing you pass should be able to change in your method or not. If you do not intend to change it, then a const reference is probably the best alternative (by not changing I also mean you do not intend to call any non-const methods of that object). What are the advantages of that? You save the time needed to copy the object, and the method signature itself says "I will not change that parameter".
If you have to change this object, you can pass either a reference or a pointer to it. Nothing forces you to pick one over the other, so you can go for either. The only difference I can think of is that a pointer can be NULL (i.e. not pointing to any object at all) while a reference always refers to an existing object.
If what you need in your method is a copy of your object, then you should pass a copy of the object (not a reference and not a pointer). For instance, if your method looks like
void Foo(const A& a) {
A temp = a;
}
Then that is a clear indication that passing a copy is a better alternative.
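A minimal sketch of that point, using std::string in place of the A class above. Both function names are hypothetical. The two versions return the same result, but the second states in its signature that it works on its own copy.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Takes a const reference and then copies by hand inside the body.
std::string upperViaConstRef(const std::string& text) {
    std::string temp = text;  // explicit copy
    std::transform(temp.begin(), temp.end(), temp.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    return temp;
}

// Takes the parameter by value: the copy is made when the function is called,
// and the parameter itself is the working copy.
std::string upperViaValue(std::string text) {
    std::transform(text.begin(), text.end(), text.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    return text;
}
```

In both cases the caller's string is left unchanged.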
Hope this makes things a bit clearer.
Actually, there's really no good reason for passing a pointer to an object, unless you want to somehow indicate that no object exists.
If you want to change the object, pass a reference to it. If you want to protect it from change within the function, pass it by value or at least const reference.
Some people pass by reference for the speed improvements (passing only an address of a large structure rather than the structure itself for example) but I don't agree with that. In most cases, I'd prefer my software to be safe than fast, a corollary of the saying: "you can't get any less optimised than wrong". :-)
Object-oriented programming is about polymorphism, Liskov Substitution Principle, old code calling new code, you name it. Pass a concrete (derived) object to a routine that works with more abstract (base) objects. If you are not doing that, you are not doing OOP.
This is only achievable when passing references or pointers. Passing by value is best reserved for, um, values.
It is useful to distinguish between values and objects. Values are always concrete, there's no polymorphism. They are often immutable. 5 is 5 and "abc" is "abc". You can pass them by value or by (const) reference.
Objects are always abstract to some degree. Given an object, one can almost always refine it to a more concrete object. A RectangularArea could be a Drawable which could be a Window which could be a ToplevelWindow which could be a ManagedWindow which could be... These must be passed by reference.
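A short sketch of why polymorphic objects need to be passed by reference. The Drawable and Window types borrow their names from the example above, but their members are assumptions made for illustration. Passing by value copies only the base part (slicing), so the derived override is lost.

```cpp
#include <string>

struct Drawable {
    virtual ~Drawable() = default;
    virtual std::string name() const { return "Drawable"; }
};

struct Window : Drawable {
    std::string name() const override { return "Window"; }
};

// By reference: the dynamic type is preserved, so Window::name() runs.
std::string describeByRef(const Drawable& d) { return d.name(); }

// By value: only the Drawable part is copied (slicing), so Drawable::name() runs.
std::string describeByValue(Drawable d) { return d.name(); }
```

Given a Window w, describeByRef(w) yields "Window" while describeByValue(w) yields "Drawable".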
Pointers are a wholly separate can of worms. In my experience, naked pointers are best avoided. Use a smart pointer that cannot be NULL. If you need an optional argument, use an explicit optional class template such as boost::optional.
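As a sketch of the optional-argument idea, the example below uses std::optional from C++17 rather than boost::optional; that substitution is an assumption, and the greet function is hypothetical.

```cpp
#include <optional>
#include <string>

// The greeting is optional; std::nullopt stands for "no argument given".
std::string greet(const std::string& name,
                  const std::optional<std::string>& greeting = std::nullopt) {
    // value_or supplies a fallback when no greeting was passed.
    return greeting.value_or("Hello") + ", " + name;
}
```

The signature says outright that the second argument may be absent, which a raw pointer parameter can only imply.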
I'd like to work out conventions on passing parameters to functions/methods. I know it's a common issue and it has been answered many times, but I searched a lot and found nothing that fully satisfies me.
Passing by value is obvious and I won't mention this. What I came up with is:
Passing by non-const reference means, that object is MODIFIED
Passing by const reference means, that object is USED
Passing by pointer means, that a reference to object is going to be STORED. Whether ownership is passed or not will depend on the context.
It seems to be consistent, but when I want to pick heap-allocated object and pass it to 2. case parameter, it'd look like this:
void use(const Object &object) { ... }
//...
Object *obj = getOrCreateObject();
use(*obj);
or
Object &obj = *getOrCreateObject();
use(obj);
Both look weird to me. What would you advise?
PS I know that one should avoid raw pointers and use smart pointers instead (easier memory management and expressiveness in ownership), and that can be the next step in refactoring the project I work on.
You can use these conventions if you like. But keep in mind that you cannot assume conventions when dealing with code written by other people. You also cannot assume that people reading your code are aware of your conventions. You should document an interface with comments when it might be ambiguous.
Passing by pointer means, that object is going to be STORED. Who's its owner will depend on the context.
I can think of only one context where the ownership of a pointer argument should transfer to the callee: Constructor of a smart pointer.
Besides possible intention of storing, a pointer argument can alternatively have the same meaning as a reference argument, with the addition that the argument is optional. You typically cannot represent an optional argument with a reference since they cannot be null - although with custom types you could use a reference to a sentinel value.
Both look weird to me. What would you advise?
Neither looks weird to me, so my advice is to get accustomed to them.
The main problem with your conventions is that you make no allowance for the possibility of interfacing to code (e.g. written by someone else) that doesn't follow your conventions.
Generally speaking, I use a different set of conventions, and rarely find a need to work around them. (The main exception will be if there is a need to use a pointer to a pointer, but I rarely need to do that directly).
Passing by non-const reference is appropriate if ANY of the following MAY be true;
The object may be changed;
The object may be passed to another function by a non-const reference [relevant when using third party code by developers who choose to omit the const - which is actually something a lot of beginners or lazy developers do];
The object may be passed to another function by a non-const pointer [relevant when using third party code by developers who choose to omit the const, or when using legacy APIs];
Non-const member functions of the object are called (regardless of whether they change the object or not) [also often a consideration when using third-party code by developers who prefer to avoid using const].
Conversely, const references may be passed if ALL of the following are true;
No non-mutable members of the object are changed;
The object is only passed to other functions by const reference, by const pointer, or by value;
Only const member functions of the object are called (even if those members are able to change mutable members).
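The remark about mutable members can be illustrated with a small sketch. The Samples class and the report and extend functions are hypothetical. A const member function updates a mutable cache, so the object can still be passed by const reference, while a call to a non-const member function requires a non-const reference.

```cpp
#include <numeric>
#include <utility>
#include <vector>

class Samples {
public:
    explicit Samples(std::vector<int> data) : data_(std::move(data)) {}

    // const member function: it may still update the mutable cache members.
    int total() const {
        if (!cached_) {
            total_ = std::accumulate(data_.begin(), data_.end(), 0);
            cached_ = true;
        }
        return total_;
    }

    // Non-const member function: changes the observable state.
    void add(int value) {
        data_.push_back(value);
        cached_ = false;
    }

private:
    std::vector<int> data_;
    mutable int total_ = 0;
    mutable bool cached_ = false;
};

// Only const members are called, so a const reference is enough.
int report(const Samples& samples) { return samples.total(); }

// A non-const member is called, so a non-const reference is required.
void extend(Samples& samples) { samples.add(10); }
```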
I'll pass by value instead of by const reference in cases where the function would copy the object anyway. (e.g. I won't pass by const reference, and then construct a copy of the passed object within the function).
Passing non-const pointers is relevant if it is appropriate to pass a non-const reference but there is also a possibility of passing no object (e.g. a nullptr).
Passing const pointers is relevant if it is appropriate to pass a const reference but there is also a possibility of passing no object (e.g. a nullptr).
I would not change the convention for either of the following
Storing a reference or pointer to the object within the function for later use - it is possible to convert a pointer to a reference or vice versa. And either one can be stored (a pointer can be assigned, a reference can be used to construct an object);
Distinguishing between dynamically allocated and other objects - since I mostly either avoid using dynamic memory allocation at all (e.g. use standard containers, and pass them around by reference or simply pass iterators from them around) or - if I must use a new expression directly - store the pointer in another object that becomes responsible for deallocation (e.g. a smart pointer such as std::unique_ptr or std::shared_ptr) and then pass the containing object around.
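A minimal sketch of that last point, assuming C++14 for std::make_unique. The Object type, makeObject and use are hypothetical stand-ins. The function that only uses the object takes a const reference and works the same for heap-allocated and local objects.

```cpp
#include <cstddef>
#include <memory>
#include <string>

struct Object {
    std::string label;
};

// Hypothetical consumer: it only uses the object, so it takes a const
// reference and does not care how the object was allocated.
std::size_t use(const Object& object) { return object.label.size(); }

// Hypothetical factory: the unique_ptr owns the allocation and frees it.
std::unique_ptr<Object> makeObject(const std::string& label) {
    return std::make_unique<Object>(Object{label});
}
```

A caller holding the smart pointer writes use(*owner); a caller with a local Object writes use(local).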
In my opinion, they are the same. In the first part of your post, you are talking about the signature, but your example is about the function call.
With C++, how do I decide whether to pass an argument by value or by reference/pointer? (Tell me the answer for both 32-bit and 64-bit.) Let's take A. Is passing two 32-bit values more, less, or equal work compared to passing a pointer to a 32-bit value?
B, to me, seems like I should always pass by value. C I think I should pass by value, but someone told me (though I haven't seen proof) that processors don't handle values that aren't their native bit size well, so it is more work. So if I were passing them around, would it be more work to pass by value, making by ref faster? Finally, I threw in an enum. I think enums should always be passed by value.
Note: When i say by ref i mean a const reference or pointer (can't forget the const...)
struct A { int a, b; };
struct B { int a; };
struct C { char a, b; };
enum D { a, b, c };
void fn(T a);
Now tell me the answer if I were pushing the parameters many times and the code doesn't use a tail call. (Let's say the values aren't used until 4 or so calls deep.)
Forget the stack size. You should pass by reference if you want to change it, otherwise you should pass by value.
Preventing the sort of bugs introduced by allowing functions to change your data unexpectedly is far more important than a few bytes of wasted stack space.
If stack space becomes a problem, stop using so many levels (such as replacing a recursive solution with an iterative one) or expand your stack. Four levels of recursion isn't usually that onerous, unless your structures are massive or you're operating in the embedded world.
If performance becomes a problem, find a faster algorithm :-) If that's not possible, then you can look at passing by reference, but you need to understand that it's breaking the contract between caller and callee. If you can live with that, that's okay. I generally can't :-)
The intent of the value/reference dichotomy is to control what happens to the thing you pass as a parameter at the language level, not to fiddle with the way an implementation of the language works.
I pass all parameters by reference for consistency, including builtins (of course, const is used where possible).
I did test this in performance critical domains -- worst case loss compared to builtins was marginal. Reference can be quite a bit faster, for non-builtins, and when the calls are deep (as a generalization). This was important for me as I was doing quite a bit of deep TMP, where function bodies were tiny.
You might consider breaking that convention if you're counting instructions, the hardware is register-starved (e.g. embedded), or if the function is not a good candidate for inlining.
Unfortunately, the question you ask is more complex than it appears -- the answer may vary greatly by your platform, ABI, calling conventions, register counts, etc.
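Since the answer depends on the platform, one way to reason about the structs from the question is to look at their size and copy cost at compile time. The sketch below is only a rule of thumb: the cheapToPassByValue helper and its two-pointer threshold are assumptions, not something any ABI guarantees.

```cpp
#include <type_traits>
#include <vector>

struct A { int a, b; };
struct B { int a; };
struct C { char a, b; };
enum D { a, b, c };

// Hypothetical rule of thumb: a type is "cheap to pass by value" when it is
// trivially copyable and no larger than two pointers. The threshold is an
// assumption; the real cutoff depends on the ABI and calling convention.
template <typename T>
constexpr bool cheapToPassByValue() {
    return std::is_trivially_copyable<T>::value &&
           sizeof(T) <= 2 * sizeof(void*);
}
```

Under this assumed rule, A, B, C and D all qualify on typical 32-bit and 64-bit targets, while std::vector<int> does not because copying it is not trivial.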
A lot depends on your requirement but best practice is to pass by reference as it reduces the memory foot print.
If you pass large objects by value, a copy is made in memory and the copy constructor is called to make that copy.
So it will take more machine cycles and also, if you pass by value, changes are not reflected in the original object.
So try passing them by reference.
Hope this has been helpful to you.
Regards, Ken
First, reference and pointers aren't the same.
Pass by pointer
Pass parameters by pointers if any/some of these apply:
The passed element could be null.
The resource is allocated inside the called function and the caller should be responsible for freeing it. Remember in this case to provide a free() function for that resource.
The value is of a variable type, for example void*, when its type is determined at runtime or depends on the usage pattern (or hides the implementation, e.g. a Win32 HANDLE), such as a thread procedure argument. (Here favor C++ templates and std::function, and use pointers for this purpose only if your environment does not permit otherwise.)
Pass by reference
Pass parameters by reference if any/some of these apply:
Most of the time. (prefer passing by const reference)
If you want the modifications to the passed arguments to be visible to the caller. (unless const reference is used).
If the passed argument is never null.
If you know what is the passed argument type and you have control over function's signature.
Pass by copy
Pass a copy if any/some of these apply:
Generally try to avoid this.
If you want to operate on a copy of the passed argument. i.e you know that the called function would create a copy anyway.
With primitive types smaller than the system's pointer size - as it makes no performance/memory difference compared to a const ref.
This is tricky - when you know that the type implements a move constructor (such as std::string in C++11). It then looks as if you're passing by copy.
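The move-constructor case can be sketched as follows, assuming C++11 or later. The Person class is hypothetical. The parameter is taken by value and then moved into the member, so an lvalue argument costs one copy plus one move, and a temporary costs only moves.

```cpp
#include <string>
#include <utility>

class Person {
public:
    // Sink parameter: taken by value, then moved into the member.
    explicit Person(std::string name) : name_(std::move(name)) {}

    const std::string& name() const { return name_; }

private:
    std::string name_;
};
```

The caller's own string is never modified: a named string is copied into the parameter, and only that parameter is moved from.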
Any of these three lists could go on longer, but these are, I would say, the basic rules of thumb.
Your complete question is a bit unclear to me, but I can answer when you would use passing by value or by reference.
When passing by value, you have a complete copy of the parameter into the call stack. It's like you're making a local variable in the function call initialized with whatever you passed into it.
When passing by reference, you... well, pass by reference. The main difference is that you can modify the external object.
Passing large objects by reference has the benefit of reducing memory load. For basic data types (32-bit or 64-bit integers, for example), the performance difference is negligible.
Generally, if you're going to work in C/C++ you should learn to use pointers. Objects passed as parameters will almost always be passed via a pointer (vs a reference). The one place where you absolutely must use a reference is the copy constructor. You'll want to use references in the operators as well, but it's not required.
Copying objects by value is usually a bad idea - more CPU to do the constructor function; more memory for the actual object. Use const to prevent the function modifying the object. The function signature should tell the caller what might happen to the referenced object.
Things like int, char, pointers are usually passed by value.
As to the structures you outlined, passing by value will not really matter. You need to do profiling to find out, but in the grand scheme of a program you'd be better off looking elsewhere to improve performance in terms of CPU and/or memory.
I would consider whether you want value or reference semantics before you go worrying about optimizations. Generally you would pass by reference if you want the method you are calling to be able to modify the parameter. You can pass a pointer in this case, like you would in C, but idiomatic C++ tends to use references.
There is no rule that says that small types or enums should always be passed by value. There is plenty of code that passes int& parameters, because they rely on the semantics of passing by reference. Also, you should keep in mind that for any relatively small data type, you won't notice a difference in speed between passing by reference and by value.
That said, if you have a very large structure, you probably don't want to make lots of copies of it. This is where const references are handy. Do keep in mind though that const in C++ is not strictly enforced (even if it's considered bad practice, you can always const_cast it away). There is no reason to pass a const int& over an int, although there is a reason to pass a const ClassWithManyMembers& over a ClassWithManyMembers.
All of the structs that you listed I would say are fine to pass by value if you are intending them to be treated as values. Consider that if you call a function that takes one parameter of type struct Rectangle{int x, y, w, h}, this is the same as passing those 4 parameters independently, which is really not a big deal. Generally you should be more worried about the work that the copy constructor has to do - for example, passing a vector by value is probably not such a good idea, because it will have to dynamically allocate memory and iterate through a list whose size you don't know, and invoke many more copy constructors.
While you should keep all this in mind, a good general rule is: if you want reference semantics, pass by reference. Otherwise, pass intrinsics by value, and other things by const reference.
Also, C++11 introduced r-value references which complicate things even further. But that's a different topic.
These are the rules that I use:
for native types:
by value when they are input arguments
by non-const reference when they are mandatory output arguments
for structs or classes:
by const reference when they are input arguments
by non-const reference when they are output arguments
for arrays:
by const pointer when they are input arguments (const applies to the data, not the pointer here, i.e. const TYPE *)
by pointer when they are output arguments (const applies to the data, not the pointer)
I've found that there are very few times that require making an exception to the above rules. The one exception that comes to mind is for a struct or class argument that is optional, in which case a reference would not work. In that case I use a const pointer (input) or a non-const pointer (output), so that you can also pass 0.
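These rules can be sketched with a few hypothetical functions (divide, sumArray, fillArray, scaled) and a hypothetical Options struct, one for each case in the list.

```cpp
#include <cstddef>

// Native types: inputs by value, mandatory outputs by non-const reference.
void divide(int numerator, int denominator, int& quotient, int& remainder) {
    quotient = numerator / denominator;
    remainder = numerator % denominator;
}

// Input array: pointer to const data plus a length.
int sumArray(const int* data, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) total += data[i];
    return total;
}

// Output array: pointer to non-const data plus a length.
void fillArray(int* out, std::size_t count, int value) {
    for (std::size_t i = 0; i < count; ++i) out[i] = value;
}

struct Options { int scale; };

// Optional struct input: const pointer, where null means "not provided".
int scaled(int x, const Options* options) {
    return options ? x * options->scale : x;
}
```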
If you want a copy, then pass by value. If you want to change it and you want those changes to be seen outside the function, then pass by reference. If you want speed and don't want to change it, pass by const reference.
Is it generally considered better to pass parameters as pointers rather than as value when you can? Obviously it largely depends on the situation, but when there is a choice, is it better to use pointers?
Is this simply for reasons of memory?
And what is better to pass through if it is true, a pointer or a reference?
Some general rules of thumb:
If you need to modify it, pass a pointer or a reference. If the value might be null, pass a pointer, otherwise pass a reference.
If it's large, pass a const pointer or const reference, depending on whether null is a legal value.
If using pointers, prefer smart pointers to bare pointers.
Otherwise, pass by value.
In C++, when you pass by value, it calls the copy constructor for custom classes. This can be really expensive if you are passing vectors or large data structures.
You should use const and reference to not copy it and still protect the value. Otherwise, using value for smaller things like ints is typically reasonable.
It is best practice to NEVER pass a pointer.
Pass by const reference to avoid the cost of copying.
Pass by reference if you want the function to modify the original.
Otherwise pass by value.
Pointers should never be passed in the RAW (no ownership semantics)
Pointers should never be held outside a smart pointer or a container class.
Only if the object can potentially be NULL and ownership is explicitly not passed (via lots of documentation) should you ever pass a pointer.
The only time I expect to see a pointer is when I can not use a reference (ie it could be NULL) and that is practically never (or wrapped deep inside a container method).
It depends on which kind of parameter you are passing. If the parameter size is reasonably (again, decided by you) low then you may want to pass by value.
For large size structs/arrays it is always a good practice to pass by pointer or reference. On top of this, if the parameter is not supposed to be modifiable then you may also add const.
I usually pass by reference (possibly a const one) for input parameters, and by pointers for output parameters.
This way it is immediately visible which parameters are inputs and which are outputs (the callee modifies their content).
Of course, for small types like int, long etc. I don't bother with references.
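A small sketch of this convention. The parsePositive function is hypothetical, and overflow handling is left out to keep it short. The & at the call site signals that the second argument may be written.

```cpp
#include <string>

// Input by const reference, output by pointer.
// Returns false and leaves *out untouched if text is not a run of digits.
bool parsePositive(const std::string& text, int* out) {
    if (text.empty() || out == nullptr) return false;
    int value = 0;
    for (char ch : text) {
        if (ch < '0' || ch > '9') return false;
        value = value * 10 + (ch - '0');
    }
    *out = value;
    return true;
}
```

A call reads parsePositive("42", &n), which makes it visible in the calling code that n is an output.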
I'm having a problem with a class like this:
class Sprite {
...
bool checkCollision(Sprite &spr);
...
};
So, if I have that class, I can do this:
ball.checkCollision(bar1);
But if I change the class to this:
class Sprite {
...
bool checkCollision(Sprite* spr);
...
};
I have to do this:
ball.checkCollision(&bar1);
So, what's the difference? Is one way better than the other?
Thank you.
In both cases you are actually passing the address of bar1 (and you're not copying the value), since both pointers (Sprite *) and references (Sprite &) have reference semantics, in the first case explicit (you have to explicitly dereference the pointer to manipulate the pointed object, and you have to explicitly pass the address of the object to a pointer parameter), in the second case implicit (when you manipulate a reference it's as if you're manipulating the object itself, so they have value syntax, and the caller's code doesn't explicitly pass a pointer using the & operator).
So, the big difference between pointers and references is on what you can do on the pointer/reference variable: pointer variables themselves can be modified, so they may be changed to point to something else, can be NULLed, incremented, decremented, etc, so there's a strong separation between activities on the pointer (that you access directly with the variable name) and on the object that it points to (that you access with the * operator - or, if you want to access to the members, with the -> shortcut).
References, instead, aim to be just an alias to the object they point to, and do not allow changes to the reference itself: you initialize them with the object they refer to, and then they act as if they were such object for their whole life.
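The difference can be seen in a minimal sketch; viaPointer and viaReference are hypothetical names. A pointer variable can be redirected to another object, while assigning to a reference writes through to the object it was bound to.

```cpp
// A pointer can be reseated to point at a different object.
void viaPointer(int& first, int& second) {
    int* p = &first;
    *p = 1;       // writes to first
    p = &second;  // p now points at second
    *p = 2;       // writes to second
}

// A reference stays bound to its initial object for its whole life.
void viaReference(int& first, int& second) {
    int& r = first;
    r = second;   // copies second's value into first; r still refers to first
    r += 1;       // changes first, not second
}
```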
In general, in C++ references are preferred over pointers, for the motivations I said and for some other that you can find in the appropriate section of C++ FAQ.
In terms of performance, they should be the same, because a reference is actually a pointer in disguise; still, there may be some corner case in which the compiler may optimize more when the code uses a reference instead of a pointer, because references are guaranteed not to change the address they hide (i.e., from the beginning to the end of their life they always point to the same object), so in some strange case you may gain something in performance using references, but, again, the point of using references is about good programming style and readability, not performance.
A reference cannot be null. A pointer can.
If you don't want to allow passing null pointers into your function then use a reference.
With the pointer you need to tell the compiler explicitly that you want to pass the address of the object; with a reference, the compiler already knows you want the address. Both are OK, it's a matter of taste. I personally don't like references because I like to see what's going on, but that's just me.
They both do the (essentially) same thing - they pass an object to a function by reference so that only the address of the object is copied. This is efficient and means the function can change the object.
In the simple case you give they are equivalent.
Main differences are that the reference cannot be null, so you don't have to test for null in the function - but you also cannot pass a null object if the case of no object is valid.
Some people also dislike the pass by reference version because it is not obvious in the calling code that the object you pass in might be modified. Some coding standards recommend you only pass const references to functions.
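To make the null-handling difference concrete, here is a sketch of both variants on one type. The Sprite members (x, width) and the overlap test are assumptions made for illustration, since the original class body is elided.

```cpp
struct Sprite {
    int x;
    int width;

    // Reference version: the argument always refers to a sprite.
    bool checkCollision(const Sprite& other) const {
        return x < other.x + other.width && other.x < x + width;
    }

    // Pointer version: a null argument is possible and must be handled.
    bool checkCollision(const Sprite* other) const {
        return other != nullptr && checkCollision(*other);
    }
};
```

The pointer overload has to decide what a null argument means; here it is treated as "no collision", which is itself a design choice the reference version never has to make.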
All,
I recently posted this question on DAL design. From that it would seem that passing a reference to an object into a function, with the function then populating that object, would be a good interface for a C++ Data Access Layer, e.g.
bool DAL::loadCar(int id, Car& car) {}
I'm now wondering if using a reference to a boost::shared_ptr would be better, e.g.
bool DAL::loadCar(int id, boost::shared_ptr<Car> &car)
Any thoughts? Does one offer advantages over the other?
What would be the implications of applying const correctness to both calls?
Thanks in advance.
As sbi says, "It depends on what the function does. "
However, I think the most important aspect of the above is not whether NULL is allowed or not, but whether the function stores a pointer to the object for later use. If the function just fills in some data then I would use reference for the following reasons:
the function can still be used by clients who do not use shared_ptr, used for stack objects, etc.
using the function with shared_ptr is still trivial - shared_ptr has dereferencing operator that returns a reference
passing NULL is not possible
less typing
I don't like using "stuff" when I don't have to
If the function needs to store pointer for later use or you anticipate the function might change in such a way that will require storing a pointer, then use shared_ptr.
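A sketch of the first two reasons, using std::shared_ptr instead of boost::shared_ptr (an assumption) and a free function as a hypothetical stand-in for DAL::loadCar. The Car fields and the id check are made up for illustration.

```cpp
#include <memory>
#include <string>

struct Car {
    int id = 0;
    std::string model;
};

// Hypothetical stand-in for DAL::loadCar: fills the caller's Car and
// reports success. Non-positive ids are treated as "not found".
bool loadCar(int id, Car& car) {
    if (id <= 0) return false;
    car.id = id;
    car.model = "model-" + std::to_string(id);
    return true;
}
```

The same function serves a stack object (loadCar(7, local)) and a shared one (loadCar(8, *shared)), because dereferencing the smart pointer yields a Car&.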
It depends on what the function does.
In general, a function taking a pointer indicates that callers might call this function even if they don't have an object to pass to it - they can always pass NULL. If that fits the function's spec, then use a (smart) pointer. Passing reference counting smart pointers by references instead copying them is an optimization (and not a premature one, I should add), because it avoids needlessly increasing and decreasing the reference count, which can, in MT environments, be a noticeable performance hit.
A function taking a non-const reference as an argument expects to be passed a valid object that it might change. Callers cannot (legally) call that function unless they have a valid object and they will not call it unless they are willing to have the function change the object's state. If that better fits the function's spec, use a reference.
If you must receive a valid object (i.e. you don't want the caller to pass NULL), then by all means, do not use boost::shared_ptr. Your second example passes a reference to a "smart pointer"... ignoring the details, it is a "pointer to pointer to Car". Because it's a reference, the shared_ptr object itself always exists, but that doesn't mean it can't hold a NULL value (i.e. point to no object).
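The sketch below shows that situation with std::shared_ptr (assumed here in place of boost::shared_ptr) and a hypothetical free-function loadCar. The reference always refers to a shared_ptr object, but that shared_ptr may be empty, so the function has to check for it.

```cpp
#include <memory>

struct Car {
    int id = 0;
};

// The reference is always bound to a shared_ptr, but the shared_ptr itself
// may hold no Car, so it is checked and allocated on demand.
bool loadCar(int id, std::shared_ptr<Car>& car) {
    if (!car) car = std::make_shared<Car>();
    car->id = id;
    return true;
}
```

Allocating on demand is only one possible policy; returning false for an empty pointer would be another.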
I don't understand exactly why you would think that a reference to a smart pointer would be "better" - does the caller function use smart pointer already?
As for the implications of "const"... do you mean something like
bool DAL::loadCar(int id, const Car& car) {}
?
If yes, it would be counter-productive, you communicate to the compiler the fact that "car" doesn't change (but presumably you want it to change!).
Or do you mean to make the function "const", something like
class DAL {
    bool loadCar(int id, Car& car) const;
};
?
In the latter case, you communicate to the compiler/API user that the method "loadCar" does not modify the DAL object. It's a good idea to do so if this is true. Not only may it enable some compiler optimizations, but it is generally a good thing to specify in the "contract" (function signature) that the function makes no modifications to DAL, especially if you make this implicit assumption in your code (this way you make sure that this stays true, and that in the future nobody will modify the "loadCar" function in a way that will change the "DAL" object).
In the first case you simply pass a Car and "fill it" with information. For example, you may create a "default" Car and then fill it. I see one inconvenience in this: it's not very OO to have two classes of Cars: one poor, default, useless, "empty" Car, and one truly filled in Car after it comes from the function. To me, a Car is a Car, so it should be a valid Car (one I can drive from location A to B, for example; one that I can accelerate, brake, start, stop) before and after your function.
I typically work with traditional pointers, not boost (without a problem, by the way) so I really can't comment on the latter alternative.