Can I get the compiler to check that my function which expects a pointer argument has been called with &someValidVariable rather than NULL, some variable, or some literal address?
I'd like to use pointer arguments over reference arguments because those ampersands make, IMO, code easier to understand, but I'm too lazy to write non-NULL checks.
Can I get the best of both worlds?
For nullptr you can add an overload that takes a dummy std::nullptr_t as argument.
But other than that it's not really possible, with the exception of arrays, which you can take by reference, e.g.
template<std::size_t N>
void your_function(int (&array)[N]) { ... }
instead of letting it decay to a pointer.
In C++11, you can declare an overload of the function that accepts a std::nullptr_t, and not define that function. Passing a literal nullptr will then typically cause a linker error (as distinct from a compiler error). That won't stop the caller from doing something like your_function((YourVariable *)nullptr), though, which will call your function with a NULL pointer.
Other than that (in any version of C++) it is not possible, apart from some special cases like passing a reference to an array (such a function will not be passed NULL or (in C++11) nullptr). The reason is that a basic property of pointers is that they are passed by value, and the compiler permits that if the type (or permitted type conversions) is valid. Once the value is passed, the only way to check is at run-time within the function. The only exception is passing the value of an uninitialised pointer which, in itself, causes undefined behaviour (so anything can happen, and all bets are off).
But, really, the real solution to your problem is to pass references. One of the purposes is providing a guarantee that they reference an actual object (dereferencing a NULL to create a reference gives undefined behaviour, as does using a dangling reference to an object that has been destroyed). So you really need to work to better understand what references are and how to use them properly, rather than trying to avoid using them.
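As a rough sketch of the two ideas above: the names read_value, sum and can_call_read_value are made up for illustration, and the overload is marked = delete rather than left undefined, which (as an assumption about what is wanted here) moves the failure from link time to compile time. The detection helper assumes C++17 for std::void_t.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical function that expects the address of a real int.
int read_value(const int* p) { return *p; }

// Deleted overload: passing a literal nullptr no longer compiles.
int read_value(std::nullptr_t) = delete;

// Array version: the parameter binds to an actual array, so no null is possible.
template <std::size_t N>
int sum(const int (&array)[N]) {
    int total = 0;
    for (int v : array) total += v;
    return total;
}

// Detection helper (illustrative): true when read_value(T) is a valid call.
template <typename T, typename = void>
struct can_call_read_value : std::false_type {};

template <typename T>
struct can_call_read_value<T, std::void_t<decltype(read_value(std::declval<T>()))>>
    : std::true_type {};

static_assert(can_call_read_value<int*>::value, "a real pointer is accepted");
static_assert(!can_call_read_value<std::nullptr_t>::value, "nullptr is rejected");

// Small demonstration of the calls that do compile.
void demo_nullptr_overload() {
    int x = 7;
    int values[3] = {1, 2, 3};
    assert(read_value(&x) == 7);
    assert(sum(values) == 6);
}
```

A null pointer that arrives through a cast or through a pointer variable still gets past the deleted overload, so this only catches the literal case.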
In what circumstances should I prefer pass-by-reference? Pass-by-value?
There are four main cases where you should use pass-by-reference over pass-by-value:
If you are calling a function that needs to modify its arguments, use pass-by-reference or pass-by-pointer. Otherwise, you’ll get a copy of the argument.
If you're calling a function that needs to take a large object as a parameter, pass it by const reference to avoid making an unnecessary copy of that object and taking a large efficiency hit.
If you're writing a copy or move constructor which by definition must take a reference, use pass by reference.
If you're writing a function that wants to operate on a polymorphic class, use pass by reference or pass by pointer to avoid slicing.
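A minimal sketch of three of those cases (modification, large read-only input, and slicing). The Shape and Circle types and all function names are hypothetical, chosen only to make the behavior visible.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const { return "shape"; }
};

struct Circle : Shape {
    std::string name() const override { return "circle"; }
};

// The function modifies the caller's object, so it takes a reference.
void append_zero(std::vector<int>& v) { v.push_back(0); }

// A potentially large object is only read, so it is taken by const reference.
std::size_t count_items(const std::vector<int>& v) { return v.size(); }

// By value, only the Shape part of a Circle is copied (slicing)...
std::string name_by_value(Shape s) { return s.name(); }

// ...while a reference keeps the dynamic type.
std::string name_by_reference(const Shape& s) { return s.name(); }

void demo_pass_by_reference() {
    std::vector<int> v{1, 2};
    append_zero(v);
    assert(count_items(v) == 3);

    Circle c;
    assert(name_by_value(c) == "shape");
    assert(name_by_reference(c) == "circle");
}
```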
There are several considerations, including:
Performance
Passing by value copies the data, so passing large data structures by value can inhibit performance. Passing by reference passes only a reference (basically the address) to the data. For large data structures, this can greatly improve performance. For smaller data structures (like an int), passing by reference can inhibit performance.
Modifications
Passing by value copies the data so if the target code modifies that copy, it will not affect the original. Passing by reference passes only the address of the data, so modifications made against that reference will be "visible" to the calling code.
Yes.
Pass by value for things like native types that are small enough that passing them directly is efficient. Otherwise use pass by (const) reference.
The hard part is writing a template that could apply to either (in which case, you usually want to use pass by reference -- the potential penalty for passing a large object by value is much worse than the potential penalty for passing by reference when passing by value would have been preferred).
Edit: this, of course, is assuming a situation where the required semantics would allow either one -- obviously if you're working with something like polymorphic objects, there's no real "preference" involved, because you must use a pointer or reference to get correct behavior.
As others already have replied to your question sufficiently well, I would like to add an important point:
If the class does not have a public copy constructor, then you don't have the choice of passing by value; you have to pass by reference (or you can pass a pointer).
The following program would not compile:
class A
{
public:
A(){}
private:
A(const A&) {}
};
//source of error : pass by value
void f(A ) {}
int main() {
A a;
f(a);
return 0;
}
Error:
prog.cpp: In function ‘int main()’:
prog.cpp:10: error: ‘A::A(const A&)’ is private
prog.cpp:18: error: within this context
prog.cpp:18: error: initializing argument 1 of ‘void f(A)’
See for yourself at ideone: http://www.ideone.com/b2WLi
But once you make function f pass by reference, it compiles fine: http://www.ideone.com/i6XXB
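For comparison, here is a sketch of the by-reference version using the C++11 spelling (= delete) instead of a private copy constructor. The value() member is a hypothetical addition so the function has something to return; it is not part of the original program.

```cpp
#include <cassert>
#include <type_traits>

class A {
public:
    A() = default;
    A(const A&) = delete;             // C++11 way to say "not copyable"
    A& operator=(const A&) = delete;
    int value() const { return 42; }  // hypothetical member for the demo
};

// Taking A by reference needs no copy, so this compiles.
int f(const A& a) { return a.value(); }

// void f_by_value(A) could be declared, but f_by_value(a) would not compile.
static_assert(!std::is_copy_constructible<A>::value,
              "A cannot be passed by value");

void demo_noncopyable() {
    A a;
    assert(f(a) == 42);
}
```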
Here's the simple rule:
Pass by reference when the value is large.
The other answers are amazing. Just trying to make this as simple as possible.
You have tagged your question with both C and C++.
Therefore, I suggest that you consider using pass by reference in C++ which supports this feature and that you do not consider using it in C which does not support this feature.
Pass by reference is worth considering under the conditions below:
Pass-by-reference can be more efficient than pass-by-value for large objects, because it does not copy the arguments. The formal parameter is an alias for the argument. When the called function reads or writes the formal parameter, it actually reads or writes the argument itself.
The difference between pass-by-reference and pass-by-value is that modifications made to arguments passed in by reference in the called function have effect in the calling function, whereas modifications made to arguments passed in by value in the called function can not affect the calling function.
Use pass-by-reference if you want to modify the argument value in the calling function. Otherwise, use pass-by-value to pass arguments.
The difference between pass-by-reference and pass-by-pointer is
that pointers can be NULL or reassigned whereas references cannot.
Use pass-by-pointer if NULL is a valid parameter value or if you want to reassign the pointer.
Otherwise, use constant or non-constant references to pass arguments.
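A small sketch of that rule of thumb, assuming a hypothetical divide function whose remainder output is optional: the optional part is a pointer that may be null, while the required input is a reference.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Required input: taken by const reference, the caller must supply an object.
std::size_t length_of(const std::string& s) { return s.size(); }

// Optional output: nullptr means "the caller does not want the remainder".
int divide(int a, int b, int* remainder = nullptr) {
    if (remainder != nullptr) {
        *remainder = a % b;
    }
    return a / b;
}

void demo_optional_pointer() {
    assert(length_of("abc") == 3);

    assert(divide(7, 2) == 3);      // remainder not requested
    int r = 0;
    assert(divide(7, 2, &r) == 3);  // remainder requested
    assert(r == 1);
}
```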
While pointers are references, "reference" in C++ usually refers to the practice of declaring a parameter as SomeType&.
Which you should never do. The only place it is appropriate is as a magic syntax required to implement the various pre-defined operators. Otherwise:
You should never pass out parameters by reference - pass by pointer, otherwise you make code reviews all but impossible. Pass by reference makes it impossible to tell by examining a call which parameters can be expected to be changed.
You should never pass in parameters by reference either. Again, this means you are performing a meta optimization. You should always just pass-by-value, otherwise you are guilty of peeking inside an object, examining its implementation and deciding that pass-by-reference is preferred for some reason.
Any c++ class should implement all the copy and assignment constructors and overloads necessary to be passed around by value. Otherwise it has not done its job, of abstracting the programmer from the implementation details of the class.
Q: Is pass-by-value/reference defined strictly by behavior or implementation wise in C++, and can you provide an authoritative citation?
I had a conversation with a friend about pass-by-value/reference in C++. We came to a disagreement on the definition of pass-by-value/reference. I understand that passing a pointer to a function is still pass-by-value since the value of the pointer is copied, and this copy is used in the function. Subsequently, dereferencing the pointer in the function and mutating the pointee will modify the original variable. This is where the disagreement appears.
His stance: Even though the pointer value is copied when it is passed to the function, operations on the dereferenced pointer can affect the original variable, so passing a pointer to a function has the behavior of pass-by-reference.
My stance: Passing a pointer to a function does copy the value of the pointer, and operations in the function may affect the original variable; however, the fact that it may affect the original does not make it pass-by-reference, since it is the implementation of the language that defines these terms, pass-by-value/reference.
Quoting from the definition given by the highest voted answer here: Language Agnostic
Pass by Reference
When a parameter is passed by reference, the caller and the callee use the same variable for the parameter. If the callee modifies the parameter variable, the effect is visible to the caller's variable.
Pass by Value
When a parameter is passed by value, the caller and callee have two independent variables with the same value. If the callee modifies the parameter variable, the effect is not visible to the caller.
I still have an ambiguous feeling after reading these. For example, the pass by value/reference quotes can support either of our claims. Can anyone clear up whether these definitions stem from behavior or implementation and provide a citation? Thanks!
Edit: I should be a little more careful with my vocabulary. Let me extend my question with a clarification. When I ask about pass-by-reference, I am not talking purely about the C++ & reference, but also about the theory. In C++, is & pass-by-reference true PBR because the function can modify not only the original value but also the memory address of the value? If so, does the following example with pointers also count as PBR?
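#include <iostream>
using namespace std;
void foo(int ** bar){
    // Moves the caller's pointer; the new address is only printed, never dereferenced.
    *bar = *bar+(sizeof(int*));
    cout<<"Inside:"<<*bar<<endl;
}
int main(){
    int a = 42;
    int* ptrA = &a;
    cout<<"Before:"<<ptrA<<endl;
    foo(&ptrA);
    cout<<"After:"<<ptrA<<endl;
}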
The output shows that ptrA after the call equals the value printed inside foo, meaning that the function can modify not only a but also ptrA. Does this define call-by-reference in theory: being able to modify not only the value, but also the memory address of the value? Sorry for the convoluted example.
You talk a lot about pointers here, which are indeed passed by value most of the time, but you don't mention actual C++ references, which are actual references.
int a{};
int& b = a;
// Prints true
std::cout << std::boolalpha << (&b == &a) << std::endl;
Here, as you can see, both variables have the same address. Put simply, especially in this case, a reference acts as another name for a variable.
References in C++ are special. They are not objects, unlike pointers. You cannot have an array of references, because that would require references to have a size. References are not required to have storage at all.
What about actually passing a variable by reference then?
Take a look at this code:
void foo(int& i) {
i++;
}
int main() {
int i{};
foo(i);
// prints 1
std::cout << i << std::endl;
}
In that particular case, the compiler must have a way to communicate which variable the reference is bound to. References are not required to have any storage, but they are not required to lack it either. In this case, if optimizations are disabled, the compiler most likely implements the reference using a pointer.
Of course, if optimizations are enabled, it may skip the passing and completely inline the function. In that case, the reference doesn't exist, or doesn't have any storage, because the original variable is used directly.
Similar optimizations happen with pointers too, but that's not the point. The point is that the way references are implemented is left to the implementation. They are most likely implemented in terms of pointers, but they are not required to be, and the way a reference is implemented may vary from case to case. The behavior of references is defined by the standard, and really is pass-by-reference.
What about pointers? Do they count as passing by reference?
I would say no. Pointers are objects, just like int, or std::string. You can even pass a reference to a pointer, allowing you to change the original pointer.
However, pointers do have reference semantics. They are not references, just as std::reference_wrapper is not a reference either, but they have reference semantics. I wouldn't call passing a pointer "passing by reference", because you don't have an actual reference, but you do have reference semantics.
A lot of things have reference semantics: pointers, std::reference_wrapper, a handle to a resource, even a GLuint used as a handle to an OpenGL object. None of them are references. You don't have a reference to the actual object, but you can change the referred-to object through these handles.
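To make the "reference semantics without being a reference" point concrete, here is a small sketch with std::reference_wrapper. The function name bump_all_and_sum is made up for illustration.

```cpp
#include <cassert>
#include <functional>
#include <vector>

int bump_all_and_sum() {
    int a = 1;
    int b = 2;

    // std::vector<int&> would not compile, because references are not objects.
    // std::reference_wrapper<int> is an object, so it can live in a container.
    std::vector<std::reference_wrapper<int>> handles{a, b};

    // Each wrapper converts to int&, so the originals are modified.
    for (int& value : handles) {
        ++value;
    }
    return a + b;  // a is now 2 and b is now 3
}

void demo_reference_semantics() {
    assert(bump_all_and_sum() == 5);
}
```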
There are other good articles and answers you can read about. They are all very informative about value and reference semantics.
isocpp.org: Reference and Value Semantics
Andrzej's C++ blog: Value semantics
Stack Overflow: What are the differences between a pointer variable and a reference variable in C++?
Passing by value/reference (you forgot one which is passing the address to the location in memory by using a pointer) is part of the implementation of C++.
There is one more way to pass variables to functions, and that is by address. Passing an argument by address involves passing the address of the argument variable (using a pointer) rather than the argument variable itself. Because the argument is an address, the function parameter must be a pointer. The function can then dereference the pointer to access or change the value being pointed to.
Take a look here at what I have always thought to be an authoritative Source: Passing Arguments by Address.
You're correct in regards to a value being copied when passing by value. This is the default behavior in C++. The advantage of passing by value into a function is that the original value cannot be changed by the function when the value is passed into it and this prevents any unwanted bugs and/or side effects when changing the value of an argument.
The problem with passing by value is that you can incur a large performance penalty if you pass an entire struct or class into your function many times, because each call copies the whole value. And in the case of a mutator method in a class, you will not be able to change the original values, so you end up creating multiple copies of the data you are trying to modify: you are forced to return the new value from the function instead of changing the data structure where it resides in memory. This is inefficient.
You only want to pass by value when you don't have to change the value of the argument.
Here is a good source on the topic of Passing Arguments by Value.
Now, you will want to use "pass by reference" when you do need to change the value of an argument such as an array, class, or struct. It is more efficient to change a data structure by passing the function a reference to the location in memory where the data structure resides. The benefit is that you do not have to return the new value from the function; the function can change the referenced value directly where it resides in memory.
Take a look here to read more about Passing an Argument by Reference.
EDIT: As to whether you are passing a non-const by reference or by value when using a pointer, the answer seems clear to me: when using a pointer to a non-const, it is neither. When passing a pointer as an argument to a function, you are in fact passing the value of the address into the function. Since it is a copy of the address of the location in memory where the non-const resides, you are able to change the value of the data at that location, but not the value of the caller's pointer itself. If you do not want to change the data located at the address held by the pointer, it is good form to make the parameter a pointer to const, since the function will not be changing the data.
Hope that makes sense.
References are different from pointers. The main reason references were introduced is to support Operator Overloading. C++ is derived from C and during the process, Pointers were inherited from C. As Stroustrup says:
C++ inherited pointers from C, so I couldn't remove them without causing serious compatibility problems.
So, effectively there are three different ways of parameters passing:
Pass by value
Pass by reference &
Pass by pointers.
Now, pass by pointer has the same effect as pass by reference. So how to decide on what you want to use? Going back to what Stroustrup said:
That depends on what you are trying to achieve:
If you want to change the object passed, call by reference or use a pointer; e.g. void f(X&); or void f(X*);
If you don't want to change the object passed and it is big, call by const reference; e.g. void f(const X&);
Otherwise, call by value; e.g. void f(X);
Ref: http://www.stroustrup.com/bs_faq2.html#pointers-and-references
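The three guidelines can be sketched as follows. X here is a hypothetical "big" type, and the function names are made up; only the parameter shapes come from the quoted FAQ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct X { std::vector<int> data; };  // hypothetical stand-in for a big object

// Changes the object passed: by reference or by pointer.
void f_modify(X& x) { x.data.clear(); }
void f_modify(X* x) { if (x != nullptr) x->data.clear(); }

// Does not change the object and it is big: by const reference.
std::size_t f_read(const X& x) { return x.data.size(); }

// Otherwise: by value. The caller's variable is untouched.
int f_small(int n) { n *= 2; return n; }

void demo_guidelines() {
    X x{{1, 2, 3}};
    assert(f_read(x) == 3);

    f_modify(x);            // call site looks like pass-by-value
    assert(f_read(x) == 0);

    x.data = {4, 5};
    f_modify(&x);           // call site shows the address being taken
    assert(x.data.empty());

    int n = 21;
    assert(f_small(n) == 42);
    assert(n == 21);
}
```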
Those terms are about the variable that is passed, in this case the pointer. If you pass a pointer to a function, then the variable that is passed is the pointer (which holds the address of an object) and not the object it points to.
If you pass a pointer by value, then changing the object it is pointing to in the function does not affect the pointer that was passed to the function.
If you pass the pointer by reference, then the function can change where the pointer points, and that modifies the pointer that was passed to the function.
That's how it is defined. Otherwise you could argue that if you have a global std::map<int,SomeObject> and you pass an int as the key to an object, that would also be pass by reference, because you can modify the objects in that global map and the caller would see those changes. After all, this int is also just a "pointer" to an object.
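A short sketch of that distinction, with hypothetical function names: a pointer passed by value lets the function change the pointee but not the caller's pointer, while a pointer passed by reference lets it change the pointer itself.

```cpp
#include <cassert>

// Pointer passed by value: the pointee can be changed through it.
void set_target(int* p, int value) { *p = value; }

// Pointer passed by value: reassigning it only changes the local copy.
void retarget_copy(int* p, int* other) {
    p = other;
    (void)p;  // silence "set but not used" warnings
}

// Pointer passed by reference: reassigning it changes the caller's pointer.
void retarget(int*& p, int* other) { p = other; }

void demo_pointer_passing() {
    int a = 1;
    int b = 2;
    int* ptr = &a;

    set_target(ptr, 10);
    assert(a == 10);       // the object was modified

    retarget_copy(ptr, &b);
    assert(ptr == &a);     // the caller's pointer is unchanged

    retarget(ptr, &b);
    assert(ptr == &b);     // the caller's pointer now points at b
}
```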
If I have a function that takes an Eigen matrix as an argument, what would be the difference between:
void foo(Eigen::MatrixXd& container){
    // i and j are declared in the loops; the original left them undeclared
    for(int i=0;i<container.rows();i++){
        for(int j=0;j<container.cols();j++){
            container(i,j)=47;
        }
    }
}
and
void foo(Eigen::MatrixXd* container){
    for(int i=0;i<container->rows();i++){
        for(int j=0;j<container->cols();j++){
            container->coeffRef(i,j)=47;
        }
    }
}
In Eigen documentation, they only present the first method - does that mean that there are any advantages to that approach? And what are the drawbacks of not using const when passing the Matrix reference in the first case?
References are nice because there is no such thing as a null reference, so using a reference parameter reduces the risk of someone calling your function with an invalid value.
On the other hand some coding standards recommend making parameters you intend to modify pointers instead of non-const references. This forces the caller to explicitly take the address of any value they pass in making it more obvious the value will be modified. The choice of pointer vs. non-const reference is up to you.
However, if you do not intend to modify the parameter then making it a const reference is definitely the way to go. It avoids the problem of passing invalid pointers, allows you to pass in temporaries, and the caller doesn't care if the parameter is taken by reference since it isn't going to be modified.
With C++ code, there is the expectation that if a parameter is passed as pointer rather than reference, then the null pointer is a valid argument.
That is, by default you should use reference parameters. Only use pointers if the parameter is in some way "optional" and you want the caller to be able to pass the null pointer to signify "no value".
See the line:
container(i,j)=47;
That assignment modifies the matrix, so you're not going to be able to make the parameter const.
One way a reference is different than a pointer is that your container reference can't be null. Pass by reference is a good way to avoid some errors while getting the benefits of not copying.
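To illustrate the const versus non-const reference trade-off without depending on Eigen, here is a sketch that uses a nested std::vector as a stand-in for the matrix. Grid, fill_grid and sum_grid are hypothetical names, and std::is_invocable assumes C++17.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

using Grid = std::vector<std::vector<double>>;  // stand-in, not an Eigen type

// Modifies its argument, so the parameter cannot be const.
void fill_grid(Grid& grid, double value) {
    for (auto& row : grid) {
        for (auto& cell : row) cell = value;
    }
}

// Only reads its argument, so it takes a const reference.
double sum_grid(const Grid& grid) {
    double sum = 0.0;
    for (const auto& row : grid) {
        for (double cell : row) sum += cell;
    }
    return sum;
}

// A temporary can bind to a const reference but not to a non-const one.
static_assert(std::is_invocable<decltype(&sum_grid), Grid>::value,
              "temporaries are accepted by const reference");
static_assert(!std::is_invocable<decltype(&fill_grid), Grid, double>::value,
              "temporaries are rejected by non-const reference");

void demo_grid() {
    Grid g(2, std::vector<double>(3, 0.0));
    fill_grid(g, 47.0);
    assert(sum_grid(g) == 282.0);  // 6 cells * 47
    assert(sum_grid(Grid(1, std::vector<double>(2, 1.5))) == 3.0);
}
```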
I have a function defined like this:
void doSomethingWithCustomer (const Customer &customer);
One of my fellow developers called it like this:
Customer *customer = order.getCustomer();
doSomethingWithCustomer (*customer);
Unfortunately, the getCustomer method can return a nullptr, if the order is not tied to a customer.
If getCustomer returns a nullptr, then the application does not crash at the time of the call to doSomethingWithCustomer but rather within the function, where the customer reference is used.
Of course the correct way to write this is to check for customer not being a nullptr first, then call the function if we have a valid customer.
Normally we expect that if a function/method has a reference argument, that the caller checks the validity of it (which wasn't the case here), instead of the function itself checking the argument.
I know that Visual Studio 2010 (and earlier versions) passes references by actually passing the pointer, but I wonder if this is indicated somewhere in the C++ standard. Can we assume that a reference is always passed as a pointer (personally, I wouldn't rely on this, but it's interesting to know it)?
Is it possible to tell Visual Studio that when passing a reference, it should automatically dereference it first and crash at the time of the call rather than somewhere much deeper (doing this in a debug version might be sufficient)?
Is it valid to assume that a reference is passed as a pointer?
No, it is not.
The standard does not mandate that the reference should be implemented in terms of pointer.
How to actually implement a reference is an implementation detail which the standard leaves for implementations to decide. It only describes the expected behavior of a reference, and one such behavior is that a reference can never be NULL in a standard-conformant program.
If your function parameter is expected to be NULL sometimes, then you should pass it as a pointer.
No. It's completely implementation defined.
For diagnostic purposes, I've created a little container type which validates the parameter. You would then declare the function/method:
void doSomethingWithCustomer(const t_nonnull<const Customer>& pCustomer);
where the t_nonnull type validates the parameter at construction. However, I've found it more useful to just use references more and more frequently (IOW, don't return a pointer in this case -- just consider it an error to access the customer when the customer does not exist).
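The answer does not show t_nonnull itself, so the following is only a guess at what such a wrapper could look like. NonNull, the Customer struct and customer_id are all hypothetical, and throwing std::invalid_argument is one possible validation policy among several.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical wrapper: checks the pointer once, at construction.
template <typename T>
class NonNull {
public:
    NonNull(T* p) : ptr_(p) {
        if (p == nullptr) {
            throw std::invalid_argument("NonNull: null pointer");
        }
    }
    NonNull(std::nullptr_t) = delete;  // a literal nullptr fails at compile time

    T& operator*() const { return *ptr_; }
    T* operator->() const { return ptr_; }

private:
    T* ptr_;
};

struct Customer { int id; };  // stand-in for the real Customer class

// The callee can use the pointer without re-checking it.
int customer_id(NonNull<const Customer> customer) { return customer->id; }

void demo_non_null() {
    Customer c{5};
    assert(customer_id(&c) == 5);

    const Customer* missing = nullptr;
    bool threw = false;
    try {
        customer_id(missing);  // fails at the call, not deep inside the callee
    } catch (const std::invalid_argument&) {
        threw = true;
    }
    assert(threw);
}
```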
The behavior is undefined. That means that you cannot rely on any particular way of discovering the error. A good compiler might be able to warn you at compile time, while another one might mask the error completely, depending on how you use the variable. It is your responsibility to make sure that references are never NULL.
It's not observable, and it's irrelevant. A program using this code has undefined behavior because it dereferences a null pointer.
A reference might be implemented using a pointer internally (like what happens in Java). But the two differences below are crucial:
By default a pointer can be rebound to another object. However, a reference has a permanent binding to the object it was initialized with.
A pointer can be assigned 0, but a reference cannot.
By analogy, T& behaves like T* const.
You don't need to tell the compiler to dereference a variable for you, because you write the dereference yourself.
void foo(T&);
T t, *p = 0;
foo(t); // passing an lvalue is the only way to pass a reference
foo(*p); // same for a pointer: you dereference it yourself (undefined behavior here, since p is null)
As others have said, the standard does not impose how references should be implemented. To catch "dereferenced NULL passed as reference" errors early, you could do this inside your functions(s):
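#include <cassert>
void doSomethingWithCustomer (const Customer &customer)
{
// Debug-only check; a compiler may assume a reference is never null and optimize this away.
assert(&customer);
//... rest of function
}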
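The check the question describes as the correct approach, done by the caller before dereferencing, could look roughly like this. The Order and Customer structs are minimal stand-ins, doSomethingWithCustomer returns an int here only so the sketch has something to verify, and the -1 sentinel is an arbitrary choice.

```cpp
#include <cassert>

struct Customer { int id; };  // stand-in for the real class

struct Order {
    Customer* customer = nullptr;  // hypothetical storage; null means "no customer"
    Customer* getCustomer() const { return customer; }
};

// Takes a reference, so it may assume a valid customer.
int doSomethingWithCustomer(const Customer& customer) { return customer.id; }

// The caller checks the pointer before turning it into a reference.
int processOrder(const Order& order) {
    if (const Customer* customer = order.getCustomer()) {
        return doSomethingWithCustomer(*customer);
    }
    return -1;  // no customer attached to this order
}

void demo_caller_check() {
    Order empty;
    assert(processOrder(empty) == -1);

    Customer c{7};
    Order withCustomer;
    withCustomer.customer = &c;
    assert(processOrder(withCustomer) == 7);
}
```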